Network Coding Fundamentals
1 École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, christina.fragouli@epfl.ch
2 Bell Laboratories, Alcatel-Lucent, USA, emina@research.bell-labs.com
Abstract
Network coding is an elegant and novel technique introduced at the turn
of the millennium to improve network throughput and performance. It
is expected to be a critical technology for networks of the future. This
tutorial addresses the first questions one would naturally ask about
this new technique: how network coding works and what its benefits
are, how network codes are designed and how much it costs to deploy
networks implementing such codes, and, finally, whether there are
methods to deal with the cycles and delay that are present in all real
networks.
A companion issue deals primarily with applications of network coding.
1
Introduction
ments, and methods to deal with cycles and delay. A companion volume
is concerned with application areas of network coding, which include
wireless and peer-to-peer networks.
In order to provide a meaningful selection of literature for the
novice reader, we reference a limited number of papers representing
the topics we cover. We refer the more interested reader to the webpage
www.networkcoding.info for a detailed literature listing. An excellent
tutorial focused on the information theoretic aspects of network coding
is provided in [49].
1.1.1 Benefits
Network coding promises to offer benefits along very diverse dimensions
of communication networks, such as throughput, wireless resources,
security, complexity, and resilience to link failures.
1.1 Introductory Examples 5
Throughput
The first demonstrated benefits of network coding were in terms of
throughput when multicasting. We discuss throughput benefits in
Chapter 4.
Fig. 1.2 The Butterfly Network. Sources S1 and S2 multicast their information to receivers
R1 and R2 .
Wireless Resources
In a wireless environment, network coding can be used to offer benefits
in terms of battery life, wireless bandwidth, and delay.
Fig. 1.3 Nodes A and C exchange information via relay B. The network coding approach
uses one broadcast transmission less.
The benefits in the previous example arise from the fact that broadcast
transmissions are made maximally useful to all their receivers. Network
coding for wireless is examined in the second part of this review.
Security
Sending linear combinations of packets instead of uncoded data offers
a natural way to take advantage of multipath diversity for security
against wiretapping attacks. Thus systems that only require protection
against such simple attacks can get it “for free” without additional
security mechanisms.
Similar ideas can also help to identify malicious traffic and to protect
against Byzantine attacks, as we will discuss in the second part of this
review.
Fig. 1.4 Mixing information streams offers a natural protection against wiretapping.
1.1.2 Challenges
The deployment of network coding is challenged by a number of issues
that will also be discussed in more detail throughout the review. Here
we briefly outline some major concerns.
Complexity
Employing network coding requires nodes in the network to have addi-
tional functionalities.
Security
Networks where security is an important requirement, such as net-
works for banking transactions, need to guarantee protection against
sophisticated attacks. The current mechanisms in place are designed
around the assumption that the only eligible entities to tamper with
the data are the source and the destination. Network coding on the
other hand requires intermediate routers to perform operations on
the data packets. Thus deployment of network coding in such net-
works would require to put in place mechanisms that allow network
coding operations without affecting the authenticity of the data. Ini-
tial efforts toward this goal are discussed in the second part of the
review.
2
The Main Theorem of Network Multicast
For unit capacity edges, the value of a cut equals the number of edges
in the cut, and it is sometimes referred to as the size of the cut. We will
use the term min-cut to refer to both the set of edges and to their total
number. Note that there exists a unique min-cut value, but possibly
several min-cuts; see, for example, Figure 2.1.
One can think about a min-cut as being a bottleneck for information
transmission between source S and receiver R. Indeed, the celebrated
max-flow, min-cut theorem, which we discuss below, claims that the
maximum information rate we can send from S to R is equal to the
min-cut value.
Fig. 2.1 A unicast connection over a network with unit capacity edges. The min-cut between
S and R equals three. There exist three edge disjoint paths between S and R that bring
symbols x1 , x2 , and x3 to the receiver.
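The max-flow value (and hence the min-cut) can be computed by the path-augmenting algorithm the text describes. Below is a minimal Python sketch of this idea; the unit-capacity graph is our own illustration (not the network of Figure 2.1), chosen so that it contains three edge-disjoint S–R paths:

```python
from collections import deque

def max_flow(edges, s, t):
    """Edmonds-Karp: repeatedly augment along a shortest residual path,
    found by BFS, in the spirit of the path-augmenting argument above."""
    cap, nodes = {}, set()
    for u, v in edges:                     # unit-capacity (parallel edges ok)
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)          # residual back-edge
        nodes.update((u, v))
    flow = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:       # BFS in the residual graph
            u = q.popleft()
            for v in nodes:
                if v not in parent and cap.get((u, v), 0) > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                    # no augmenting path left
        v = t
        while parent[v] is not None:       # push one unit along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# an illustrative unit-capacity graph with three edge-disjoint S-R paths
edges = [("S", "A"), ("S", "B"), ("S", "C"), ("A", "D"), ("B", "D"),
         ("B", "E"), ("C", "E"), ("D", "R"), ("D", "R"), ("E", "R")]
print(max_flow(edges, "S", "R"))   # 3 = min-cut value
```

Each iteration increases the number of edge-disjoint unit paths by one, mirroring the step-by-step argument in the text.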
2.1 The Min-Cut Max-Flow Theorem
Note that each step of the algorithm increases the number of edge
disjoint paths that connect the source to the receiver by one, and thus
at the end of step k we have identified k edge disjoint paths.
To prove that the algorithm works, we need to prove that, at each
step k, with 1 ≤ k ≤ h, there will exist a path whose edges satisfy the
condition in (2.1). We will prove this claim by contradiction.
Fig. 2.2 The paths to the three receivers overlap for example over edges BD and GH.
Definition 2.2. The local coding vector c^ℓ(e) associated with an edge e
is the vector of coefficients over Fq with which we multiply the incoming
symbols to edge e. The dimension of c^ℓ(e) is 1 × |In(e)|, where In(e) is
the set of incoming edges to the parent node of e.
Since we do not know what values the coefficients in the local coding
vectors should take, we assume that each used coefficient is an unknown
variable, whose value will be determined later. For the example in
Figure 2.2, the linear combination of information is shown in Figure 2.3.
The local coding vectors associated with edges BD and GH are
Fig. 2.3 The linear network coding solution sends over edges BD and GH linear combina-
tions of their incoming flows.
We next discuss how this linear combining can enable the receivers to
get the information at rate h.
Note that, since we start from the source symbols and then at inter-
mediate nodes only perform linear combining of the incoming symbols,
through each edge of G flows a linear combination of the source sym-
bols. Namely, the symbol flowing through some edge e of G is given by
$$c_1(e)\sigma_1 + c_2(e)\sigma_2 + \cdots + c_h(e)\sigma_h = \underbrace{[c_1(e)\ c_2(e)\ \cdots\ c_h(e)]}_{c(e)} \begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \vdots \\ \sigma_h \end{bmatrix},$$
where the vector c(e) = [c1 (e) c2 (e) . . . ch (e)] belongs to an h-
dimensional vector space over Fq . We shall refer to the vector c(e)
as the global coding vector of edge e, or for simplicity as the coding
vector.
Definition 2.3. The global coding vector c(e) associated with an edge
e is the vector of coefficients of the source symbols that flow (linearly
combined) through edge e. The dimension of c(e) is 1 × h.
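To make Definitions 2.2 and 2.3 concrete, the sketch below (our own illustration) propagates global coding vectors through the butterfly network of Figure 1.2 over F2, taking all local coding coefficients equal to 1; each receiver can recover the source symbols exactly when its matrix of observed coding vectors is full rank:

```python
# Butterfly network over F2: each edge carries a global coding vector
# (coefficients of sigma1, sigma2); combining is addition modulo 2,
# i.e., all local coding coefficients are taken equal to 1.
def combine(vectors):
    return tuple(sum(col) % 2 for col in zip(*vectors))

e_s1 = (1, 0)                    # edge out of source S1: carries sigma1
e_s2 = (0, 1)                    # edge out of source S2: carries sigma2
e_mid = combine([e_s1, e_s2])    # bottleneck edge: sigma1 + sigma2

def det2(M):                     # 2x2 determinant over F2
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % 2

A1 = [e_s1, e_mid]               # what receiver R1 observes
A2 = [e_s2, e_mid]               # what receiver R2 observes
print(e_mid, det2(A1), det2(A2))   # (1, 1) 1 1 -> both receivers decode
```

Both determinants are nonzero over F2, so both receivers can invert their matrices and recover σ1 and σ2 at rate h = 2.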
2.2 The Main Network Coding Theorem
The network code design problem is to select values for the coefficients
{αk } = {α1 , . . . , α4 }, so that all matrices Aj , 1 ≤ j ≤ 3, are full rank.
The main multicast theorem can be expressed in algebraic language as
follows:
When dealing with polynomials over a finite field, one has to keep
in mind that even a polynomial not identically equal to zero over that
field can evaluate to zero on all of its elements. Consider, for example,
the polynomial x(x + 1) over F2 , or the following set of matrices:
$$A_1 = \begin{bmatrix} x^2 & x(1+x) \\ x(1+x) & x^2 \end{bmatrix} \quad \text{and} \quad A_2 = \begin{bmatrix} 1 & x \\ 1 & 1 \end{bmatrix}. \qquad (2.4)$$
Over F2 , at least one of the determinants of these matrices is equal to
zero (namely, for x = 0, det(A1) = 0, and for x = 1, det(A2) = 0).
The following lemma tells us that we can find values for the coef-
ficients {αk} over a large enough finite field such that the condition
f({αk}) ≠ 0 in (2.3) holds, and thus concludes our proof. For exam-
ple, for the matrices in (2.4) we can use x = 2 over F3 so that both
det(A1) ≠ 0 and det(A2) ≠ 0.
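This claim can be checked numerically. A short sketch of our own evaluates the two determinants from (2.4) over F2 and F3 by reducing modulo the field size:

```python
# det(A1) = x^4 - x^2 (1 + x)^2 and det(A2) = 1 - x for the matrices
# in (2.4), evaluated over F_q by reducing modulo q.
def det_a1(x, q):
    return (x ** 4 - x ** 2 * (1 + x) ** 2) % q

def det_a2(x, q):
    return (1 - x) % q

# Over F2, no choice of x keeps both determinants nonzero:
print([(x, det_a1(x, 2), det_a2(x, 2)) for x in (0, 1)])
# [(0, 0, 1), (1, 1, 0)]

# Over the larger field F3, x = 2 keeps both nonzero:
print(det_a1(2, 3), det_a2(2, 3))   # 1 2
```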
Note that we have given a simple proof that network codes exist
under certain assumptions, rather than a constructive proof. A number
of questions naturally arise at this point, and we will discuss them in the
next section. In Chapter 3, we will learn how to calculate the parameterized
matrices Aj for a given graph and multicast configuration, and
give an alternative information-theoretical proof of the main theorem.
In Chapter 5, we will look into the existing methods for network code
design, i.e., methods to efficiently choose the linear coefficients.
2.2.4 Discussion
Theorem 2.2 claims the existence of network codes for multicast, and
immediately triggers a number of important questions. Can we con-
struct network codes efficiently? How large does the operating field Fq
need to be? What are the complexity requirements of network cod-
ing? Do we expect to get significant benefits, i.e., is coding/decoding
worth the effort? Do network codes exist for other types of network
scenarios, such as networks with noisy links, or wireless networks, or
traffic patterns other than multicast? What is the possible impact on
applications? We will address some of these issues in later chapters.
In the rest of this chapter, we check how restrictive the modeling
assumptions we made are, that is, whether the main theorem would hold
or could be easily extended if we removed or relaxed these assumptions.
Unit capacity edges – not restrictive, provided that the capacities of
the edges are rational numbers and we allow parallel edges.
Collocated sources – not restrictive. We can add to the graph an
artificial common source vertex, and connect it to the h source vertices
through unit capacity edges.
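This reduction is easy to express in code. A sketch of our own (the node names are illustrative, not taken from the text's figures):

```python
# Reduction from h separate source vertices to a single one: add an
# artificial common source "S*" and a unit-capacity edge to each
# original source vertex.
def add_super_source(edges, sources, super_source="S*"):
    return [(super_source, s) for s in sources] + list(edges)

edges = [("S1", "A"), ("S2", "A"), ("A", "R")]
print(add_super_source(edges, ["S1", "S2"]))
# [('S*', 'S1'), ('S*', 'S2'), ('S1', 'A'), ('S2', 'A'), ('A', 'R')]
```

Since each added edge has unit capacity, the min-cut from S* to each receiver is unchanged, so the multicast theorem applies verbatim to the augmented graph.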
Fig. 2.4 An undirected graph where the main theorem does not hold.
min-cuts, but cannot always transmit to each receiver at the rate equal
to its min-cut. For example, consider again the butterfly network in
Figure 1.2 and assume that each vertex of the graph is a receiver.
We will then have receivers with min-cut two, and receivers with min-
cut one. Receivers with min-cut two require that linear combining is
performed somewhere in the network, while receivers with min-cut one
require that only routing is employed. These two requirements are not
always compatible.
Notes
The min-cut max-flow theorem was proven in 1927 by Menger [40]. It
was also proven in 1956 by Ford and Fulkerson [22] and independently
the same year by Elias et al. [20]. There exist many different variations
of this theorem, see for example [18]. The proof we gave follows the
approach in [8], and it is specific to unit capacity undirected graphs.
The proof for arbitrary capacity edges and/or directed graphs is a direct
extension of the same ideas.
The main theorem in network coding was proved by Ahlswede et al.
[2, 36] in a series of two papers in 2000. The first origin of these ideas
can be traced back to Yeung’s work in 1995 [48]. Associating a random
variable with the unknown coefficients when performing linear network
coding was proposed by Koetter and Médard [32]. The sparse zeros
lemma was published by Yeung et al. in [49] and by Harvey in [25],
and is a variation of a result proved in 1980 by Schwartz [45]. The
proof approach of the main theorem also follows the proof approach
in [25, 49].
3
Theoretical Frameworks for Network Coding
Network coding can and has been studied within a number of different
theoretical frameworks, in several research communities. The choice of
framework a researcher makes most frequently depends on his back-
ground and preferences. However, one may also argue that each net-
work coding issue (e.g., code design, throughput benefits, complexity)
should be put in the framework in which it can be studied the most
naturally and efficiently.
We here present tools from the algebraic, combinatorial, information
theoretic, and linear programming frameworks, and point out some of
the important results obtained within each framework. At the end, we
discuss different types of routing and coding. Multicasting is the only
traffic scenario examined in this chapter. However, the presented tools
and techniques can be extended to and are useful for studying other
more general types of traffic as well.
Definition 3.4. Coding points are the edges of the graph G′ where we
need to perform network coding operations.
Example 3.1. Consider the network with two sources and two
receivers shown in Figure 3.1(a) that is created by putting one butterfly
network on top of another. A choice of two sets of edge-disjoint
paths (corresponding to the two receivers) is shown in Figures 3.1(b)
and 3.1(c). This choice results in using two coding points, i.e., linearly
combining flows at edges AB and CD. Notice, however, that nodes
A and B in Figure 3.1(a) and their incident edges can be removed
without affecting the multicast condition. The resulting graph is then
minimal, has a single coding point, and is identical to the butterfly
network shown in Figure 1.2.
Fig. 3.1 A network with two sources and two receivers: (a) the original graph, (b) two edge-
disjoint paths from the sources to the receiver R1 , and (c) two edge-disjoint paths from the
sources to the receiver R2 .
3.1 A Network Multicast Model
where L(Si , Rj ) denotes the line graph of the path (Si , Rj ), that is, each
vertex of L(Si , Rj ) represents an edge of (Si , Rj ), and any two vertices
of L(Si , Rj ) are adjacent if and only if their corresponding edges share
a common vertex in (Si , Rj ). Figure 3.2 shows the network with two
sources and three receivers which we studied in Chapter 2 together
with its line graph.
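The line-graph construction is mechanical. A minimal sketch of our own for directed edges, where an edge e1 is followed by e2 whenever e1 ends at the vertex where e2 starts (the natural reading, along a directed path, of "share a common vertex"):

```python
# Line graph of a directed graph: one vertex per edge; vertex e1 points
# to e2 whenever edge e1 ends at the vertex where edge e2 starts.
def line_graph(edges):
    verts = [u + v for u, v in edges]
    adj = [(u1 + v1, u2 + v2)
           for (u1, v1) in edges
           for (u2, v2) in edges
           if v1 == u2]
    return verts, adj

# the S -> A -> {B, C} fragment below is illustrative, not Figure 3.2
verts, adj = line_graph([("S", "A"), ("A", "B"), ("B", "D"), ("A", "C")])
print(verts)   # ['SA', 'AB', 'BD', 'AC']
print(adj)     # [('SA', 'AB'), ('SA', 'AC'), ('AB', 'BD')]
```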
Fig. 3.2 A network with 2 sources and 3 receivers, and its line graph.
Definition 3.6. Source nodes {S1e , S2e , . . . , She } are the nodes of γ cor-
responding to the h sources. Equivalently, they are the h edges in G′
emanating from the source vertex S.
Each node in γ with a single input edge merely forwards its input
symbol to its output edges. Each node with two or more input edges
performs a coding operation (linear combining) on its input symbols,
and forwards the result to all of its output edges. These nodes are the
coding points in Definition 3.4. Using the line graph notation makes
the definition of coding points transparent. In Figure 3.2(b), BD and
GH are coding points.
Finally, we refer to the node corresponding to the last edge of the
path (Si , Rj ) as the receiver node for receiver Rj and source Si . For a
configuration with h sources and N receivers, there exist hN receiver
nodes. In Figure 3.2(b), AF , HF , HK, DK, DE, and CE are receiver
nodes. Our definitions for feasible and valid network codes directly
translate for line graphs, as well as our definition for minimality.
Note that vertices corresponding to the edges of In(e) are parent
nodes of the vertex corresponding to e. The edges coming into the
node e are labeled by the coefficients of the local coding vector c^ℓ(e). Thus
designing a network code amounts to choosing the values {αk } for the
labels of edges in the line graph.
$$A_j = D_j + C_j (I - A)^{-1} B. \qquad (3.4)$$
In (3.2) matrix A is common for all receivers and reflects the way
the memory elements (states) are connected (network topology). Its
elements are indexed by the states (nodes of the line graph), and an
element of A is nonzero if and only if there is an edge in the line graph
between the indexing states. The nonzero elements equal either to 1
or to an unknown variable in {αk }. Network code design amounts to
selecting values for the variable entries in A.
Matrix B is also common for all receivers and reflects the way the
inputs (sources) are connected to our graph. Matrices Cj and Dj ,
since permuting columns of a matrix only affects the sign of its determi-
nant. Now, using a result known as Schur’s formula for the determinant
of block matrices, we further obtain
$$\pm \det N_j = \det \begin{bmatrix} 0 & C_j \\ B & I - A \end{bmatrix} = \det(I - A)\,\det\!\big(C_j (I - A)^{-1} B\big).$$
The claim (3.6) follows directly from the above equation by noticing
that, since A is strictly upper triangular, then det(I − A) = 1.
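The fact that det(I − A) = 1 rests on A being strictly upper triangular, hence nilpotent, so that (I − A)^{-1} = I + A + A² + ··· is a finite sum. A small self-contained check on a toy 3 × 3 matrix of our own:

```python
# In a DAG, states can be ordered so that the adjacency matrix A of the
# line graph is strictly upper triangular, hence nilpotent (A^n = 0).
# Then (I - A)^{-1} = I + A + A^2 + ... terminates and det(I - A) = 1.
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 3
I = [[int(i == j) for j in range(n)] for i in range(n)]
A = [[0, 1, 1],     # a toy strictly upper triangular A (our own example)
     [0, 0, 1],
     [0, 0, 0]]

A2 = mat_mul(A, A)  # A^3 = 0 for n = 3, so the series stops here
inv = [[I[i][j] + A[i][j] + A2[i][j] for j in range(n)] for i in range(n)]

I_minus_A = [[I[i][j] - A[i][j] for j in range(n)] for i in range(n)]
print(mat_mul(I_minus_A, inv))   # identity: [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Since I − A is triangular with unit diagonal, its determinant is the product of the diagonal entries, namely 1, as used in the derivation above.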
For the network code design problem, we only need to know how
the subtrees are connected and which receiver nodes are in each Ti .
Thus we can contract each subtree to a node and retain only the edges
that connect the subtrees. The resulting combinatorial object, which
we will refer to as the subtree graph Γ, is defined by its underlying
topology (VΓ , EΓ ) and the association of the receivers with nodes VΓ .
Figure 3.3(b) shows the subtree graph for the network in Figure 3.2.
The network code design problem is now reduced to assigning an
h-dimensional coding vector c(Ti ) = [c1 (Ti ) · · · ch (Ti )] to each sub-
tree Ti . Receiver j observes h coding vectors from h distinct subtrees
to form the rows of the matrix Aj .
Example 3.2. A valid code for the network in Figure 3.2(a) can be
obtained by assigning the following coding vectors to the subtrees in
Figure 3.3(b): c(T1) = [1 0], c(T2) = [0 1], c(T3) = [1 1], and c(T4) = [0 1].
For this code, the field with two elements is sufficient. Nodes B and G
in the network (corresponding to coding points BD and GH) perform
binary addition of their inputs and forward the result of the operation.
Fig. 3.3 (a) Line graph with coding points BD and GH for the network in Figure 3.2, and
(b) the associated subtree graph.
The rest of the nodes in the network merely forward the information
they receive. The matrices for receivers R1 , R2 , and R3 are
$$A_1 = \begin{bmatrix} c(T_1) \\ c(T_4) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} c(T_2) \\ c(T_3) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad \text{and} \quad A_3 = \begin{bmatrix} c(T_3) \\ c(T_4) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.$$
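Validity of such a subtree code is a rank condition, which is easy to verify mechanically. A sketch of our own that checks each receiver's matrix over F2; the subtree-to-receiver assignment below follows Example 3.2:

```python
# Check validity of the subtree code of Example 3.2: every receiver's
# matrix of observed coding vectors must be full rank over F2.
def rank_gf2(rows):
    """Rank over F2 via Gaussian elimination on rows packed as bitmasks."""
    masks = [sum(bit << i for i, bit in enumerate(r)) for r in rows]
    rank = 0
    for col in range(max(len(r) for r in rows)):
        pivot = next((m for m in masks if (m >> col) & 1), None)
        if pivot is None:
            continue
        masks.remove(pivot)
        masks = [m ^ pivot if (m >> col) & 1 else m for m in masks]
        rank += 1
    return rank

c = {"T1": (1, 0), "T2": (0, 1), "T3": (1, 1), "T4": (0, 1)}
receivers = {"R1": ["T1", "T4"], "R2": ["T2", "T3"], "R3": ["T3", "T4"]}
ranks = {r: rank_gf2([c[t] for t in ts]) for r, ts in receivers.items()}
print(ranks)   # {'R1': 2, 'R2': 2, 'R3': 2} -> the code is valid
```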
(1) Each Vi! contains exactly one source node (see Definition 3.6)
or a coding point (see Definition 3.4), and
(2) Each node that is neither a source node nor a coding point
belongs to the Vi! which contains its closest ancestral coding
point or source node.
(5) This claim holds because there are N receivers and, by the
previous claim, all receiver nodes contained in the same sub-
tree are distinct.
(1) There does not exist a valid network code where a subtree is
assigned the same coding vector as one of its parents.
(2) There does not exist a valid network code where the vec-
tors assigned to the parents of any given subtree are linearly
dependent.
(3) There does not exist a valid network code where the coding
vector assigned to a child belongs to a subspace spanned by
a proper subset of the vectors assigned to its parents.
(4) Each coding subtree has at most h parents.
(5) If a coding subtree has 2 ≤ P ≤ h parents, then there exist P
vertex-disjoint paths from the source nodes to the subtree.
Proof.
(1) Suppose that the minimal subtree graph has two source subtrees and
m coding subtrees. Consider a network code which assigns
coding vectors [0 1] and [1 0] to the source subtrees, and a
different vector [1 α^k], 0 < k ≤ m, to each of the coding sub-
trees, where α is a primitive element of Fq and q > m. As
discussed in the Appendix, any two different coding vectors form
a basis for Fq², and thus this code is feasible and valid. Let
Ti and Tj denote a parent and a child subtree, respectively,
with no child or receiver in common. Now, alter the code
by assigning the coding vector of Ti to Tj , and keep the
rest of coding vectors unchanged. This assignment is fea-
sible because Ti and Tj do not share a child, which would
have to be assigned a scalar multiple of the coding vector
of Ti and Tj . Since Ti and Tj do not share a receiver, the
code is still valid. Therefore, by claim (1) of Theorem 3.5,
the configuration is not minimal, which contradicts the
assumption.
(2) Consider a coding subtree T . Let P1 and P2 be its parents.
By claim (1), a parent and a child have either a receiver or
a child in common. If T has a receiver in common with each
of its two parents, then T has two receiver nodes. If T and
one of the parents, say P1 , do not have a receiver in common,
then they have a child in common, say C1 . Similarly, if T and
C1 do not have a receiver in common, then they have a child
in common. And so forth, following the descendants of P1 ,
one eventually reaches a child of T that is a terminal node of
the subtree graph, and thus has no children. Consequently, T
has to have a receiver in common with this terminal subtree.
Similarly, if T and P2 do not have a child in common, there
exists a descendant of P2 and child of T which must have a
receiver in common with T .
3.3 Combinatorial Framework
(3) If the two source subtrees have no children, then each must
contain N receiver nodes. If the source subtree has a child,
say T1 , then by claim (1) it will have a receiver or a child
in common with T1 . Following the same reasoning as in the
previous claim, we conclude that each source subtree contains
at least one receiver node.
Lemma 3.7. For a network with two sources and two receivers, there
exist exactly two minimal subtree graphs shown in Figure 3.4.
Proof. Figure 3.4(a) shows the network scenario in which each receiver
has access to both source subtrees, and thus no network coding
is required, i.e., there are no coding subtrees. If network coding is
required, then by Theorem 3.6, there can only exist one coding sub-
tree containing two receiver nodes. Figure 3.4(b) shows the network
scenario in which each receiver has access to a different source subtree
and a common coding subtree. This case corresponds to the familiar
butterfly example of a network with two sources and two receivers.
Therefore, all instances {G, S, R} with two sources and two receivers
where network coding is required, “contain” the butterfly network in
Figure 1.2, and can be reduced to it in polynomial time by removing
edges from the graph G.
Fig. 3.4 Two possible subtree graphs for networks with two sources and two receivers: (a)
no coding required, and (b) network coding necessary.
Continuing along these lines, it is easy to show that, for example,
there exist exactly three minimal configurations with two sources and
three receivers, and seven minimal configurations with two sources and
four receivers. In fact, one can enumerate all the minimal configurations
with a given number of sources and receivers.
$$f_i^S : 2^{n\omega_S} \to 2^n \quad \text{(the output is a packet of length } n\text{)},$$
$$f_i^v : 2^{n|{\rm In}(v)|} \to 2^n.$$
the information rate. That is, each intermediate node performs the
same operation, no matter what the min-cut value is, or where the
node is located in the network. This is not the case with routing,
where an intermediate node receiving linearly independent information
streams needs, for example, to know which information stream to forward
where and at which rate. This property of network coding proves
to be very useful in dynamically changing networks, as we will dis-
cuss in the second part of the review. Second, the packet length n
can be thought of as the “alphabet size” required to achieve a cer-
tain probability of error when randomly selecting the coding oper-
ations. Third, although the proof assumed equal alphabet sizes, the
same arguments go through for arbitrary alphabet sizes. Finally, note
that the same proof would apply if, instead of unit capacity edges, we
had arbitrary capacities, and more generally, if each edge were a
Discrete Memoryless Channel (DMC), using the information theoretical
min-cut.1
Max-Flow LP:
$$\begin{aligned} \text{maximize} \quad & f_{RS} \\ \text{subject to} \quad & \sum_{(v,u)\in E} f_{vu} = \sum_{(u,w)\in E} f_{uw}, \quad \forall\, u \in V \quad \text{(flow conservation)} \end{aligned}$$
Fig. 3.5 The butterfly network with a Steiner tree rooted at S2 and a partial Steiner tree
rooted at S1 .
This is a hard problem to solve, as the set of all possible Steiner trees
can be exponentially large in the number of vertices of the graph.
It is now this flow that needs to satisfy the capacity constraints. Note
that the conceptual flow f does not need to satisfy flow conservation.
Again we add virtual edges (Ri , S) of infinite capacity that connect
each receiver Ri to the source vertex.
The first constraint ensures that each receiver has min-cut at least χ.
Once we solve this program and find the optimal χ, we can apply the
main theorem in network coding to prove the existence of a network
code that can actually deliver this rate to the receivers, no matter how
the different flows fi overlap. The last missing piece is to show that we
can design these network codes in polynomial time, and this will be the
subject of Chapter 5. Note that, over directed graphs, instead of solving
the LP, we could simply identify the min-cut from the source to each
receiver, and use the minimum such value. The merit of this program
is that it can easily be extended to apply over undirected graphs as
well, and that it introduces an elegant way of thinking about network
coded flows through conceptual flows that can be applied to problems
with similar flavor, such as the one we describe next.
In many information flow instances, instead of maximizing the
rate, we may be interested in minimizing a cost that is a function of
the information rate and may be different for each network edge. For
example, the cost of power consumption in a wireless environment
can be modeled in this manner. Consider an instance {G, S, R} where
the min-cut to each receiver is larger than or equal to h, and assume
3.6 Types of Routing and Coding
that the cost of using edge (v, u) is proportional to the flow fvu
with a weight wvu ≥ 0 (i.e., the cost equals wvu fvu ). The problem of
minimizing the overall cost in the network is also computationally hard
if network coding is not allowed. On the other hand, by using network
coding, we can solve it in polynomial time by slightly modifying the
previous LP, where we associate zero cost with the virtual edges (Ri , S):
$$\begin{aligned} \text{minimize} \quad & \sum_{(v,u)\in E} w_{vu} f_{vu} \\ \text{subject to} \quad & f_{R_i S}^{\,i} \ge h, \quad \forall\, i \\ & \sum_{(v,u)\in E} f_{vu}^{\,i} = \sum_{(u,w)\in E} f_{uw}^{\,i}, \quad \forall\, u \in V,\ \forall\, i \quad \text{(flow conservation)} \\ & f_{vu}^{\,i} \le f_{vu}, \quad \forall\, (v,u) \in E,\ \forall\, i \quad \text{(conceptual flow constraints)} \\ & f_{vu} \le c_{vu}, \quad \forall\, (v,u) \in E \quad \text{(capacity constraints)} \\ & f_{vu}^{\,i} \ge 0 \ \text{ and } \ f_{vu} \ge 0, \quad \forall\, (v,u) \in E,\ \forall\, i \end{aligned}$$
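As an illustration, the min-cost LP above can be handed to an off-the-shelf solver. The sketch below assumes `scipy` is available; the butterfly-like instance, node names, and unit weights are our own choice, with h = 2 and zero-cost virtual return edges (Ri, S) closing each conceptual flow into a circulation:

```python
import numpy as np
from scipy.optimize import linprog

nodes = ["S", "A", "B", "C", "D", "R1", "R2"]
real = [("S", "A"), ("S", "B"), ("A", "R1"), ("B", "R2"), ("A", "C"),
        ("B", "C"), ("C", "D"), ("D", "R1"), ("D", "R2")]
virtual = [("R1", "S"), ("R2", "S")]       # zero-cost return edges
edges = real + virtual
E, h = len(edges), 2

# variables: [f^1 (E), f^2 (E), f (E)]; minimize sum of w_vu * f_vu
cost = np.zeros(3 * E)
cost[2 * E:2 * E + len(real)] = 1.0        # unit weight on real edges only

A_eq, b_eq = [], []                        # flow conservation per flow i
for i in range(2):
    for node in nodes:
        row = np.zeros(3 * E)
        for k, (u, v) in enumerate(edges):
            if v == node:
                row[i * E + k] += 1
            if u == node:
                row[i * E + k] -= 1
        A_eq.append(row)
        b_eq.append(0.0)

A_ub, b_ub = [], []                        # f^i_vu <= f_vu
for i in range(2):
    for k in range(E):
        row = np.zeros(3 * E)
        row[i * E + k], row[2 * E + k] = 1, -1
        A_ub.append(row)
        b_ub.append(0.0)
for i, k in [(0, E - 2), (1, E - 1)]:      # rate: f^i on (R_i, S) >= h
    row = np.zeros(3 * E)
    row[i * E + k] = -1
    A_ub.append(row)
    b_ub.append(-h)

bounds = [(0, None)] * (3 * E)
for k in range(len(real)):                 # capacity constraints f_vu <= 1
    bounds[2 * E + k] = (0, 1)

res = linprog(cost, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=np.array(A_eq), b_eq=b_eq, bounds=bounds)
print(round(res.fun, 6))   # all nine unit-weight edges carry flow 1
```

On this instance the optimum cost is 9: delivering rate 2 to both receivers forces every real edge to carry flow 1, which is exactly the network-coded butterfly solution.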
3.6.1 Routing
Routing refers to the case where flows are sent uncoded from the source
to the receivers, and intermediate nodes simply forward their incoming
(1) Integral routing, which requires that through each unit capac-
ity edge we route at most one unit rate source.
(2) Fractional routing, which allows multiple fractional rates
from different sources to share the same edge as long as they
add up to at most the capacity of the edge.
(3) Vector routing, which can be either integral or fractional but
is allowed to change at the beginning of each time slot.
Given a solution $y^*$ to this linear program, let $m = \sum_k \sum_{t\in\tau} y^*(t,k)$.
We can use an erasure code that employs m coded symbols to convey
the same f information symbols to all receivers, i.e., to achieve rate
$T_i = f/m$. (To be precise, we need f/m to be a rational and not a real
number.) For integer edge capacities $c_e$, there is an optimum solution
with rational coordinates.
This scheme is closely related to the rate-less codes approach over lossy
networks, such as LT codes and Raptor codes. In lossy networks, packet
erasures can be thought of as randomly pruning trees. At each time slot,
the erasures that occur correspond to a particular packing of partial
Steiner trees. Rate-less codes allow the receivers to decode as soon as
they collect a sufficient number of coded symbols.
Fig. 3.6 A network with unit capacity edges where, if we are restricted to integral routing
over one time-slot, we can pack (Case 1) one unit rate Steiner tree, or (Case 2) three unit
rate partial Steiner trees.
Notes
The algebraic framework for network coding was developed by Koetter
and Médard in [32], who translated the network code design to an alge-
braic problem which depends on the structure of the underlying graph.
Lemma 3.1 is proved by Ho et al. in [26] and by Harvey in [25]. The pre-
sented combinatorial framework follows the approach of Fragouli and
Soljanin in [24]. The information theoretic framework comes from the
original paper by Ahlswede et al. [2]. Introduction to LP can be found
in numerous books, see for example [10, 44]. The network coding LP
4
Throughput Benefits of Network Coding
• $T_i^j$ and $T_f^j$ denote the rate that receiver $R_j$ experiences with
integral and fractional routing, respectively, under some routing strategy.
• $T_i$ and $T_f$ denote the maximum integral and fractional rate
we can route to all receivers, where the maximization is over
all possible routing strategies. We refer to this throughput as
symmetric or common.
• $T_i^{av} = \frac{1}{N} \max \sum_{j=1}^{N} T_i^j$ and $T_f^{av} = \frac{1}{N} \max \sum_{j=1}^{N} T_f^j$ denote the
maximum integral and fractional average throughput.
Tc = h,
from the main network multicast theorem. Also, because there exists a
tree spanning the source and the receiver nodes, the uncoded through-
put is at least 1. We, therefore, have
$$1 \le T_i^{av} \le T_f^{av} \le h,$$
and thus
$$\frac{1}{h} \le \frac{T_i^{av}}{T_c} \le \frac{T_f^{av}}{T_c} \le 1. \qquad (4.1)$$
4.2 Linear Programming Approach
h Tc Tc
does not use an edge. To state this problem, we need the notion of a
separating set of vertices. We call a set of vertices D ⊂ V separating
if D contains the source vertex S and V \ D contains at least one of
the terminals in R. Let δ(D) denote the set of edges from D to V \ D,
that is, $\delta(D) = \{(u,v) \in E : u \in D,\ v \notin D\}$. We consider the following
integer programming (IP) formulation for the minimum Steiner tree
problem:
xe ≥ 0, ∀ e∈E
Let lp(G, w, S, R) denote the optimum value of this linear program for
the given instance. The value lp(G, w, S, R) lower bounds the cost of
the integer program solution opt(G, w, S, R). The integrality gap of the
relaxation on G is defined as
$$\alpha(G, S, R) = \max_{w \ge 0} \frac{{\rm opt}(G, w, S, R)}{{\rm lp}(G, w, S, R)},$$
where the maximization is over all possible edge weights. Note that
α(G, S, R) is invariant to scaling of the optimum achieving weights.
The theorem tells us that, for a given configuration {G, S, R}, the
maximum throughput benefits we may hope to get with network coding
equal the largest integrality gap of the Steiner tree problem possible
on the same graph. This result refers to fractional routing; if we restrict
our problem to integral routing on the graph, we may get even larger
throughput benefits from coding.
(1) For the instance {G, S, R}, find c = c* such that the left-hand
side ratio is maximized, i.e.,
$$\frac{T_c(G, S, R, c^*)}{T_f(G, S, R, c^*)} = \max_c \frac{T_c(G, S, R, c)}{T_f(G, S, R, c)}.$$
(2) Now solve the dual program in (4.3). From strong duality,1
we get that
$$T_f(G, S, R, c^*) = \sum_t y_t^* = \sum_e c_e z_e^*. \qquad (4.8)$$
opt(G, w = z ∗ , S, R) = 1. (4.9)
$$\frac{T_c(G, S, R, c)}{T_f(G, S, R, c)} \ge \max_w \frac{{\rm opt}(G, w, S, R)}{{\rm lp}(G, w, S, R)}. \qquad (4.11)$$
1 Strong duality holds since there exists a feasible solution in the primal.
2 See [10, 44] for an overview of LP.
(1) We start from the Steiner tree problem in (4.4) and (4.5).
Let w* be a weight vector such that
$$\alpha(G, S, R) = \frac{{\rm opt}(G, w^*, S, R)}{{\rm lp}(G, w^*, S, R)} = \max_w \frac{{\rm opt}(G, w, S, R)}{{\rm lp}(G, w, S, R)}.$$
Let x* be an optimum solution for the LP on the instance
(G, w*, S, R). Hence
$${\rm lp}(G, w^*, S, R) = \sum_e w_e^* x_e^*.$$
Note that vectors with positive components, and thus the
vectors x = {x_e, e ∈ E} satisfying the constraints of (4.5), can
be interpreted as a set of capacities for the edges of G. In
particular, we can then think of the optimum solution x* as
associating a capacity $c_e = x_e^*$ with each edge e so that the
min-cut to each receiver is greater than or equal to one, and the
cost $\sum_e w_e^* x_e^*$ is minimized.
(2) Consider now the instance with these capacities, namely,
{G, c = x∗ , S, R} and examine the coding throughput ben-
efits we can get. Since the min-cut to each receiver is at least
one, with network coding we can achieve throughput
Tc (G, c = x∗ , S, R) ≥ 1. (4.12)
With fractional routing we can achieve
$$T_f(G, c = x^*, S, R) = \sum_t y_t^* \qquad (4.13)$$
for some feasible $y_t^*$.
(3) We now need to bound the cost opt(G, w∗, S, R) that the solution of the IP Steiner tree problem will give us. We will use the following averaging argument. We will show that the average cost per Steiner tree is less than or equal to
$$\frac{\sum_e w_e^* x_e^*}{T_f(G, c = x^*, S, R)}.$$
Therefore, there exists a Steiner tree with weight smaller than the average, which upper bounds the solution of (4.5):
$$\mathrm{opt}(G, w^*, S, R) \le \frac{\sum_e w_e^* x_e^*}{T_f(G, x^*, S, R)} = \frac{\mathrm{lp}(G, w^*, S, R)}{T_f(G, x^*, S, R)}.$$
$$\sum_{t\in\tau} w_t y_t^* \ge \min_{t\in\tau}\{w_t\} \sum_{t\in\tau} y_t^*.$$
Let $T_f^{av}(G, S, R, c)$ denote the value of the above linear program on a given instance. The coding advantage for average throughput on G is given by the ratio
$$\beta(G, S, R) = \max_c \frac{T_c(G, c, S, R)}{T_f^{av}(G, c, S, R)}.$$
$$\beta(G, S, R^*) = \max_{R' \subseteq R} \beta(G, S, R').$$
Note that the maximum value of $T_f$ and $T_f^{av}$ is not necessarily achieved for the same capacity vector c, or for the same number of receivers N. What this theorem tells us is that, for a given {G, S, R} with |R| = N, the maximum common rate we can guarantee to all receivers will be at most $H_N$ times smaller than the maximum average rate we can send from S to any subset of the receivers R. The theorem quantitatively bounds the advantage in going from the stricter measure α(G, S, R) to the weaker measure β(G, S, R∗). Furthermore, it is often the case that for particular instances (G, S, R), either α(G, S, R) or β(G, S, R∗) is easier to analyze, and the theorem can be useful to get an estimate of the other quantity.
We now comment on the tightness of the bounds in the theorem. There are instances in which β(G, S, R∗) = 1; take, for example, the case when G is a tree rooted at S. On the other hand, there are instances in which $\beta(G, S, R^*) = O(1/\ln N)\,\alpha(G, S, R)$. Examples include the network graphs discussed in the next section. In general, the ratio $\alpha(G, S, R)/\beta(G, S, R^*)$ can take a value in the range $[1, H_N]$.
Fig. 4.1 The combination B(h, k) network has h sources and $N = \binom{k}{h}$ receivers, each receiver observing a distinct subset of h B-nodes. Edges have unit capacity and sources have unit rate.
in-degree of the receiver nodes is $\binom{N-1}{p-1}$. In fact, it is easy to show that there exist exactly $\binom{N-1}{p-1}$ edge-disjoint paths between the source and each receiver, and thus the min-cut to each receiver equals $\binom{N-1}{p-1}$.

Theorem 4.3. In a zk(N, p) network (Figure 4.2) with $h = \binom{N-1}{p-1}$ sources and N receivers,
$$\frac{T_f^{av}}{T_c} \le \frac{p-1}{N-p+1} + \frac{1}{p}. \qquad (4.18)$$
count the number of the receivers spanned by the tree as follows: Let $n_t(A_j)$ be the number of C-nodes connected to $A_j$, the jth A-node in the tree. Note that
$$\sum_{j=1}^{a_t} n_t(A_j) = c_t.$$
$$\frac{T_i^{av}}{T_c} \le \frac{p-1}{N-p+1} + \frac{1}{p}. \qquad (4.20)$$
We can apply the exact same arguments to upper bound Tfav , by allow-
ing at and ct to take fractional values, and interpreting these values as
the fractional rate of the corresponding trees.
Example 4.3. For a network with two sources the bounds in (4.1) give
$$\frac{1}{2} \le \frac{T_i^{av}}{T_c} \le 1$$
by setting h = 2. We can achieve the lower bound by simply finding a spanning tree. In general, a tighter bound also holds:
$$\frac{T_i^{av}}{T_c} \ge \frac{1}{2} + \frac{1}{2N},$$
and there are networks for which this bound holds with equality.
There are networks with two sources with even smaller coding throughput advantage.
Example 4.4. In any combination network with two unit rate sources (network coding throughput Tc = 2), a simple routing scheme can achieve an average throughput of at least 3Tc/4 as follows: We route S1 through one half of the k coding points $A_iB_i$ and S2 through the other half. Therefore, the average routing throughput, for even k, is given by
$$T_i^{av} = \frac{1}{\binom{k}{2}}\left[2\binom{k/2}{2} + 2\left(\binom{k}{2} - 2\binom{k/2}{2}\right)\right] = 2\left(\frac{3}{4} + \frac{1}{4(k-1)}\right) > \frac{3}{4}\,T_c.$$
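The count behind Example 4.4 can be checked numerically. The sketch below is our illustration, not code from the text; `avg_routing_throughput` is a hypothetical helper that enumerates the $\binom{k}{2}$ receivers of the combination network, applies the half-split routing, and compares the resulting average with the closed form above using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def avg_routing_throughput(k):
    """Average routing throughput of the half-split scheme in a combination
    network with two unit-rate sources and k coding points (k even):
    S1 is routed through points 0..k/2-1, S2 through the rest."""
    receivers = list(combinations(range(k), 2))
    total = Fraction(0)
    for a, b in receivers:
        # A receiver observing one point from each half recovers both
        # sources (rate 2); otherwise it recovers a single source (rate 1).
        total += 2 if (a < k // 2) != (b < k // 2) else 1
    return total / len(receivers)

# matches T_i^av = 2(3/4 + 1/(4(k-1))) and always exceeds (3/4)Tc = 3/2
for k in (4, 6, 10):
    assert avg_routing_throughput(k) == 2 * (Fraction(3, 4) + Fraction(1, 4 * (k - 1)))
    assert avg_routing_throughput(k) > Fraction(3, 2)
```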
$$T_c \le 2T_f.$$
To prove the theorem, we use the following result for directed graphs.
h/2 of them. In other words, these trees span all receivers, and thus
the routing throughput can be made to be at least h/2. We conclude
that Tc ≤ 2Tf .
Notes
The first multicast throughput benefits in network coding referred to the symmetric integral throughput in directed networks, and were reported by Sanders et al. in [43]. Theorem 4.1 was subsequently proved by Agarwal and Charikar in [1], and it was extended to Theorem 4.2 by Chekuri et al. in [15]. The interested reader is referred to [44] for details on integer and linear programs, and to [4, Ch. 9] for details on the Steiner tree problem formulations. The zk networks were introduced by Zosin and Khuller in [50], and were used to demonstrate network coding throughput benefits in [1, 15]. Theorem 4.5 was proved by Bang-Jensen et al. in [5]. Throughput benefits over undirected graphs were examined by Li et al. in [37], and by Chekuri et al. in [14]. Information theoretic rate bounds over undirected networks with two-way channels were provided by Kramer and Savari in [33]. Experimental results by Wu et al. reported in [47] showed small throughput benefits.
5
Network Code Design Methods for Multicasting
5.1 Common Initial Procedure
∀e ∈ E′: if (V′, E′ \ {e}) satisfies the multicast property
    then E′ ← E′ \ {e}
G′ ← (V′, E′)
return G′
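The pruning step above can be sketched as follows, with the multicast property checked as "min-cut at least h from the source to every receiver." This is our illustrative rendering, not the authors' code; `max_flow`, `prune`, the integer node indices, and the unit-capacity edge encoding are all assumptions of the sketch.

```python
from collections import deque

def max_flow(n, edges, s, t):
    """Edmonds-Karp max flow; edges is a list of unit-capacity arcs (u, v)."""
    cap, adj = {}, [[] for _ in range(n)]
    for u, v in edges:
        if (u, v) not in cap:
            adj[u].append(v)
            adj[v].append(u)
            cap[(u, v)] = cap[(v, u)] = 0
        cap[(u, v)] += 1
    flow = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:          # BFS for an augmenting path
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:          # augment one unit along the path
            cap[(parent[v], v)] -= 1
            cap[(v, parent[v])] += 1
            v = parent[v]
        flow += 1

def prune(n, edges, source, receivers, h):
    """Greedy edge removal: an edge is deleted whenever the min-cut from the
    source to every receiver stays >= h (the multicast property)."""
    E, i = list(edges), 0
    while i < len(E):
        trial = E[:i] + E[i + 1:]
        if all(max_flow(n, trial, source, r) >= h for r in receivers):
            E = trial
        else:
            i += 1
    return E
```

On a butterfly network with a duplicated source edge, for example, the procedure removes exactly the redundant copy and leaves a subgraph in which every remaining edge is needed to keep the min-cut to both receivers at 2.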
(Si , Rj ).
For each receiver Rj, the algorithm maintains a set Cj of h coding points, and a set $B_j = \{c_1^j, \ldots, c_h^j\}$ of h coding vectors. The set Cj keeps track of the most recently visited coding point in each of the h edge-disjoint paths from the source to Rj. The set Bj keeps the associated coding vectors.
Initially, for all Rj, the set Cj contains the source edges $\{S_1^e, S_2^e, \ldots, S_h^e\}$, and the set Bj contains the orthonormal basis $\{e_1, e_2, \ldots, e_h\}$, where the vector $e_i$ has one in position i and zero elsewhere. Preserving the multicast property amounts to ensuring that the
set Bj forms a basis of the h-dimensional space $\mathbb{F}_q^h$ at all steps of the algorithm and for all receivers Rj. At step k, the algorithm assigns a coding vector c(δk) to the coding point δk, and replaces, for all receivers Rj ∈ R(δk):

• the point $f_j^{\leftarrow}(\delta_k)$ in Cj with the point δk,
• the associated vector $c(f_j^{\leftarrow}(\delta_k))$ in Bj with c(δk).

The algorithm selects the vector c(δk) so that for every receiver Rj ∈ R(δk), the set $(B_j \setminus \{c(f_j^{\leftarrow}(\delta_k))\}) \cup \{c(\delta_k)\}$ forms a basis of the h-dimensional space. Such a vector c(δk) always exists, provided that the field $\mathbb{F}_q$ has size larger than N, as Lemma 5.2 proves. When the algorithm terminates, the set Bj contains the set of linear equations the receiver Rj needs to solve to retrieve the source symbols.
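The vector-selection step can be illustrated with a brute-force search over the span of the parents' vectors (the actual LIF algorithm is more efficient, but since a valid choice always exists for q > N, exhaustive search suffices on toy examples). The function names are our own, and q is assumed prime so that inverses come from Fermat's little theorem.

```python
from itertools import product

def rank_fq(vectors, q, h):
    """Rank of a list of length-h vectors over F_q (q prime)."""
    M, r = [list(v) for v in vectors], 0
    for col in range(h):
        piv = next((i for i in range(r, len(M)) if M[i][col] % q), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][col], q - 2, q)          # Fermat inverse, q prime
        M[r] = [x * inv % q for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][col] % q:
                M[i] = [(a - M[i][col] * b) % q for a, b in zip(M[i], M[r])]
        r += 1
    return r

def pick_coding_vector(parents, partial_bases, q, h):
    """Return c(delta): a nonzero vector in the span of the parents' coding
    vectors that completes every partial basis B_j \\ {c(f_j(delta))}."""
    for coeffs in product(range(q), repeat=len(parents)):
        c = [sum(a * v[i] for a, v in zip(coeffs, parents)) % q
             for i in range(h)]
        if any(c) and all(rank_fq(B + [c], q, h) == h for B in partial_bases):
            return c
    return None  # never reached when q > N (Lemma 5.2)
```

On a butterfly-like coding point with parent vectors e1 and e2 and two receivers holding partial bases {e1} and {e2}, the search over F_3 returns the familiar choice [1, 1].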
$c(f_j^{\leftarrow}(\delta))$. Then we can find a valid value for c(δ). Applying the same argument successively to all coding points concludes the proof.
Note that this bound may not be tight, since it was obtained by assuming that the only intersection of the |R(δ)| (h − 1)-dimensional spaces $V(R_j, \delta)$, $R_j \in R(\delta)$, is the all-zero vector, which was counted
3 Note that $B_j \setminus \{c(f_j^{\leftarrow}(\delta_k))\}$ forms a hyperplane, i.e., an (h − 1)-dimensional subspace of the h-dimensional space, that can be uniquely described by the associated orthogonal vector.
left nodes corresponding to the rows and the right nodes to the columns of matrix A. For a non-zero entry in position (i, j) of A, place an
edge between left node i and right node j. A perfect matching is a sub-
set of the edges such that each node of the constructed bipartite graph
is adjacent to exactly one of the edges. Identifying a perfect matching
can be accomplished in polynomial time. It is also easy to see that the
identified matching corresponds to an assignment of binary values to
the matrix entries (one for an edge used, zero otherwise) such that the
matrix is full rank.
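A minimal sketch of this matching-based completion step: build the bipartite graph of nonzero positions, find a perfect matching with Kuhn's augmenting-path algorithm, and verify that the resulting 0/1 assignment is full rank over GF(2). The function names and the input encoding (a dict mapping each row to its candidate columns) are our choices, not from the text.

```python
def perfect_matching(rows, support):
    """Kuhn's augmenting-path algorithm on the bipartite graph whose edges
    are the nonzero positions (i, j); support maps each row i to the
    columns where a nonzero entry may be placed."""
    match_col = {}                                  # column -> matched row

    def try_row(i, seen):
        for j in support[i]:
            if j not in seen:
                seen.add(j)
                if j not in match_col or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(rows):
        if not try_row(i, set()):
            return None                             # no perfect matching
    return {i: j for j, i in match_col.items()}     # row -> column

def gf2_rank(rows):
    """Rank over GF(2); each row is an int bit mask."""
    basis = {}                                      # leading bit -> row
    for r in rows:
        while r:
            b = r.bit_length()
            if b not in basis:
                basis[b] = r
                break
            r ^= basis[b]
    return len(basis)

# sparsity pattern of a 3x3 matrix; set matched entries to 1, the rest to 0
support = {0: [0, 1], 1: [1, 2], 2: [0, 2]}
match = perfect_matching(3, support)
rows = [1 << match[i] for i in range(3)]
assert gf2_rank(rows) == 3            # the binary assignment is full rank
```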
We omit the proof of this lemma, and instead present a simple proof of a slightly weaker but more often used result which, in Theorem 5.4, replaces η′ by the total number of local coding coefficients η. For this result, the following lemma is sufficient.
size of the generation affects how well packets are “mixed,” and thus
it is desirable to have a fairly large generation size. Indeed, if we use a
large number of small-size generations, intermediate nodes may receive
packets destined to the same receivers but belonging to different gener-
ations. Experimental studies have started investigating this trade-off.
Theorem 5.7. Let h be the binary min-cut capacity. For any ε > 0, the permute-and-add code achieves rate R = h − (|E| + 1)ε with probability greater than
$$1 - 2^{-L\varepsilon + \log\frac{N(|E|+h)}{L+1}}.$$
We here only give a high-level outline of the proof. The basic idea is an
extension of the polynomial time algorithms presented in Section 5.2.1.
To show that the transfer matrix from the source to each receiver is
invertible, it is sufficient to make sure that the information crossing
specific cut-sets allows to infer the source information. These cut-sets
can be partially ordered, so that no cut-set is visited before its “ances-
tor” cut-sets. The problem is then reduced to ensuring that the transfer
matrix between successive cut-sets is, with high probability, full rank.
Codes for networks with two sources are also simple to construct.
Recall that for a valid network code, it is sufficient and necessary that
the coding vector associated with a subtree lies in the linear span of
the coding vectors associated with its parent subtrees, and the coding
vectors of any two subtrees having a receiver in common are linearly
independent. Since any two different points of (5.2) are linearly independent in $\mathbb{F}_q^2$, and thus each point is in the span of any two different
points of (5.2), both coding conditions for a valid network code are
satisfied if each node in a subtree graph of a network is assigned a
unique point of the projective line PG(1, q). We here present two algo-
rithms which assign distinct coding vectors to the nodes in the subtree
graph.
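For a prime q, the pool of coding vectors used here can be generated and checked directly; `projective_line` and `independent` are hypothetical names for this sketch.

```python
def projective_line(q):
    """The q + 1 points of PG(1, q) for a prime q: [0 1] and [1 x], x in F_q."""
    return [(0, 1)] + [(1, x) for x in range(q)]

def independent(u, v, q):
    """Two vectors of F_q^2 are linearly independent iff their 2x2
    determinant is nonzero mod q."""
    return (u[0] * v[1] - u[1] * v[0]) % q != 0

pts = projective_line(5)
assert len(pts) == 6
# any two distinct points are independent, so assigning distinct points to
# the subtrees automatically satisfies both coding conditions
assert all(independent(u, v, 5) for i, u in enumerate(pts) for v in pts[i + 1:])
```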
In Chapter 7, we will discuss a connection between the problem
of designing codes for networks with two sources and the problem of
We now discuss how the code design problem can be simplified at the
cost of using some additional network resources. These algorithms are
well suited to applications such as overlay and ad-hoc networks, where
Notes
LIF, proposed by Sanders et al. in [43], and independently by Jaggi et al. in [29], was the first polynomial-time algorithm for network code design (see also [30]). These algorithms were later extended to include procedures that attempt to minimize the required field size by Barbero and Ytrehus in [6]. Randomized algorithms were proposed by Ho et al. in [26], and also by Sanders et al. in [43]; their asynchronous implementation over practical networks using generations was proposed by Chou et al. in [16]. Lemma 5.5, instrumental for Theorem 5.4, was proved by Ho et al.
in [26]. Codes that use the algebraic structure were designed by Koetter
and Médard [32], while the matrix completion codes were investigated
by Harvey in [25]. Permute-and-add codes were recently proposed by
Jaggi et al. in [28]. Decentralized deterministic code design was intro-
duced by Fragouli and Soljanin in [24]. Minimal configurations and the
brute force algorithm to identify them were also introduced in [24].
Improved algorithms for identifying minimal configurations were con-
structed in [34].
6
Networks with Delay and Cycles
For most of this review we have assumed that all nodes in the network
simultaneously receive all their inputs and produce their outputs, and
that networks have no cycles. We will now relax these assumptions.
We first look at how to deal with delay over acyclic graphs. We then
formally define networks with cycles, and discuss networks that have
both cycles and delay. In fact, we will see that one method to deal with
cycles is by artificially introducing delay.
only need to know at which rate to inject coded packets into their out-
going links. This approach is ideally suited to dynamically changing
networks, where large variations in delay may occur.
In contrast, the approach discussed in this chapter deals with the
lack of synchronization by using time-windows at intermediate network
nodes to absorb the variability of the network links delay. This second
approach is better suited for networks with small variability in delay.
Intermediate nodes wait for a predetermined time interval (larger than
the expected delay of the network links) to receive incoming packets
before combining them. Under this scenario, we can model the network
operation by associating one unit delay D (or more) either with each
edge or only with edges that are coding points. We next discuss how to
design network codes while taking into account the introduced (fixed)
delay.
6.1.1 Algorithms
There are basically two ways to approach network code design for networks with delay:
Fig. 6.1 Configuration with 2 sources and 5 receivers: (a) the subtree graph; (b) the corre-
sponding convolutional encoder.
$$\frac{2(\alpha-1) + \frac{m-2(\alpha-1)}{2}}{m} = \frac{1}{2} + \frac{\alpha-1}{m}$$

1 For simplicity, we ignore the fact that the network coding solution does not preserve the ordering of the bits.
timeslots. Indeed, for the first α − 1 timeslots each receiver gets one
uncoded bit through AD and BF , respectively, then for (m − 2(α −
1))/2 timeslots each receiver can successfully decode two bits per time-
slot, and finally for α − 1 timeslots the receivers only receive and decode
one bit per timeslot through CE.
On the other hand, by using routing along AD and BF and time-sharing along CE, the source can deliver the information in
$$\frac{\alpha-1+\frac{2(m-\alpha+1)}{3}}{m} = \frac{2}{3} + \frac{1}{3}\cdot\frac{\alpha-1}{m}$$
timeslots. Depending on the value of α and m, the network coding solu-
tion might lead to either less or more delay than the routing solution.
Finally, if a receiver exclusively uses the network resources, the resulting delay would be
$$\frac{1}{2} + \frac{1}{2}\cdot\frac{\alpha-1}{m},$$
which outperforms both previous approaches.
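The three normalized delays derived above can be compared numerically. The helper names below are ours, and exact rational arithmetic sidesteps any divisibility concerns about α and m.

```python
from fractions import Fraction

def coding_delay(alpha, m):
    # (2(alpha-1) + (m - 2(alpha-1))/2) / m  =  1/2 + (alpha-1)/m
    return Fraction(1, 2) + Fraction(alpha - 1, m)

def routing_delay(alpha, m):
    # (alpha-1 + 2(m-alpha+1)/3) / m  =  2/3 + (1/3)(alpha-1)/m
    return Fraction(2, 3) + Fraction(alpha - 1, 3 * m)

def exclusive_delay(alpha, m):
    # a receiver using the network alone: 1/2 + (1/2)(alpha-1)/m
    return Fraction(1, 2) + Fraction(alpha - 1, 2 * m)

# small alpha favors coding, large alpha favors routing;
# exclusive use beats both throughout this range
assert coding_delay(1, 12) < routing_delay(1, 12)
assert routing_delay(7, 12) < coding_delay(7, 12)
assert all(exclusive_delay(a, 12) <= min(coding_delay(a, 12), routing_delay(a, 12))
           for a in range(1, 7))
```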
Fig. 6.2 Two networks with a cycle ABCD: (a) paths (S1 , R1 ) = S1 A → AB → BC →
CD → DR1 and (S2 , R2 ) = S2 B → BC → CD → DA → AR2 impose consistent partial
orders to the edges of the cycle; (b) paths (S1 , R1 ) = S1 A → AB → BC → CD → DR1
and (S2 , R2 ) = S2 C → CD → DA → AB → BR2 impose inconsistent partial orders to the
edges of the cycle.
When the partial orders imposed by the different paths are consis-
tent, although the underlying network topology contains cycles, the line
graph associated with the information flows does not. Thus, from the
network code design point of view, we can treat the network as acyclic.
When the partial orders imposed by the different paths are not consis-
tent, then the network code design needs to be changed accordingly. It
is this set of networks we call networks with cycles.
Fig. 6.3 A network with a cycle ABA. Unless we introduce delay in the cycle, the information flow is not well defined.
which is not consistent with the assumption that xA and xB are inde-
pendent. It is also easy to see that local and global coding vectors are
no longer well defined.
that decoding can still be performed using the trellis diagram of the
recursive convolutional code, a trellis decoding algorithm like Viterbi,
and trace-back depth of one. The decoding complexity mainly depends
on the size of the trellis, exactly as in the case of feed-forward convo-
lutional codes. As a result, cycles per se do not increase the number of
states and the complexity of trellis decoding.
The second network code design approach attempts to remove the
cycles. This approach applies easily to networks that have simple cycles,
i.e., cycles that do not share edges with other cycles. Observe that an
information source needs to be transmitted through the edges of a cycle
at most once, and afterwards can be removed from the circulation by
the node that introduced it. We illustrate this approach through the
following example.
Example 6.5. Consider the cycle in Figure 6.2(b), and for simplicity
assume that each edge corresponds to one memory element. Then the
flows through the edges of the cycle are
AB : σ1 (t − 1) + σ2 (t − 3)
BC : σ1 (t − 2) + σ2 (t − 4)
(6.1)
CD : σ1 (t − 3) + σ2 (t − 1)
DA : σ1 (t − 4) + σ2 (t − 2),
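A bit-level simulation (our sketch, with one delay per edge as assumed in the example) confirms these flows: node A injects σ1 and cancels its own contribution after one lap around the cycle, and node C does the same for σ2.

```python
import random

def simulate(T, seed=7):
    """Track the cycle flows of (6.1), one delay per edge. Node A injects
    sigma1 and removes its own term after one lap (4 slots); node C does
    the same for sigma2. Out-of-range indices count as 0."""
    rng = random.Random(seed)
    s1 = [rng.randrange(2) for _ in range(T)]
    s2 = [rng.randrange(2) for _ in range(T)]
    g = lambda s, i: s[i] if i >= 0 else 0
    AB, BC, CD, DA = [0] * T, [0] * T, [0] * T, [0] * T
    for t in range(T):
        AB[t] = (DA[t - 1] if t else 0) ^ g(s1, t - 1) ^ g(s1, t - 5)
        BC[t] = AB[t - 1] if t else 0
        CD[t] = (BC[t - 1] if t else 0) ^ g(s2, t - 1) ^ g(s2, t - 5)
        DA[t] = CD[t - 1] if t else 0
    for t in range(5, T):                    # matches the closed forms (6.1)
        assert AB[t] == s1[t - 1] ^ s2[t - 3]
        assert BC[t] == s1[t - 2] ^ s2[t - 4]
        assert CD[t] == s1[t - 3] ^ s2[t - 1]
        assert DA[t] == s1[t - 4] ^ s2[t - 2]
    return True

assert simulate(64)
```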
Example 6.6. Consider the network depicted in Figure 6.5 that has
a knot, that is, an edge (in this case r2 r3 ) shared by multiple cycles.
There are three receivers t1 , t2 , and t3 and four sources A, B, C, and D.
All four information streams σA , σB , σC , and σD have to flow through
the edge r2r3.

Fig. 6.5 A network configuration with multiple cycles sharing the common edge r2r3 (knot).

Fig. 6.6 Convolutional code corresponding to the network configuration in Figure 6.5 if we associate unit delay with the edges r1r2, r4r2, r3r1, and r3r4.

Consider the flow σA: to reach receiver t1, it has to circulate inside the cycle (r2, r3, r4). Since none of the nodes r2, r3, and r4 has
σA , σA cannot be removed from the cycle. Thus, the second approach
would not work. To apply the first approach, assume we associate a
unit delay element with each of the edges r1 r2 , r4 r2 , r3 r1 , and r3 r4 .
The resulting convolutional code is depicted in Figure 6.6.
Notes
Coding in networks with cycles and delay was first considered in the
seminal papers on network coding by Ahlswede et al. [2, 36]. Using poly-
nomials and rational functions over the delay operator D to take delay
Now that we have learned how to design network codes and how much throughput increase to expect in networks using network coding, it is natural to ask how much it costs to operate such networks. We focus our discussion on the resources required for linear network coding for multicasting, as this is the best understood case today. We can distinguish the following components of the complexity of deploying network coding:
Proof. We prove the claim separately for source and coding subtrees.
Fig. 7.1 A subtree graph Γ and its associated graph Ω. The receiver edges in Ω are labeled
by the corresponding receivers.
Lemma 7.2. ( [9, Ch. 9]) Every k-chromatic graph has at least k
vertices of degree at least k − 1.
Proof. Assume that our graph Ω has n vertices and chromatic number χ(Ω) = k ≤ n. Let n = k + m, where m is a nonnegative integer. We are going to count the number of edges in Ω in two different ways:
(1) From Lemmas 7.1 and 7.2, we know that each vertex has degree at least 2, and at least k vertices have degree at least k − 1. Consequently, we can lower bound the number of edges of Ω as
$$E(\Omega) \ge \frac{k(k-1)+2m}{2}. \qquad (7.2)$$
(2) Since there are N receivers and n − 2 coding subtrees, we
have at most N receiver edges and at most n − 2 distinct
flow edges (we count parallel edges only once). Thus,
E(Ω) ≤ N + n − 2 = N + k + m − 2. (7.3)
This proves the first claim of the theorem: for any minimal configuration with N receivers, an alphabet of size $\left\lfloor\sqrt{2N - 7/4} + 1/2\right\rfloor$ is sufficient.
To show that there exist configurations for which an alphabet of this size is necessary, we are going to construct a subtree graph where (7.4) becomes equality, i.e., we will construct a subtree graph that has $N = \frac{k(k-1)}{2} - k + 2$ receivers and whose corresponding graph Ω has chromatic number k. We start with a minimal subtree graph Γ that has k vertices and k − 1 receivers. Such a graph can be constructed as depicted in Figure 7.2. The corresponding graph Ω has k − 2 flow edges and k − 1 receiver edges. Add $\frac{k(k-1)}{2} - [(k-2)+(k-1)]$ receivers, so that Ω becomes a complete graph with $E(\Omega) = \frac{k(k-1)}{2}$ edges. Thus Ω cannot be colored with fewer than k colors. The corresponding subtree graph has $N = \frac{k(k-1)}{2} - k + 2$ receivers, and requires an alphabet of size q = k − 1.
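Solving N = k(k−1)/2 − k + 2 for k shows that on this worst-case family the sufficient bound is attained exactly: the required alphabet size q = k − 1 equals $\lfloor\sqrt{2N-7/4}+1/2\rfloor$. The sketch below (hypothetical helper names) checks this identity.

```python
import math

def alphabet_bound(N):
    """Sufficient alphabet size floor(sqrt(2N - 7/4) + 1/2) for a minimal
    configuration with two sources and N receivers."""
    return math.floor(math.sqrt(2 * N - 7 / 4) + 1 / 2)

def worst_case_receivers(k):
    """The construction above: N = k(k-1)/2 - k + 2 receivers force a
    complete graph Omega, hence chromatic number k and alphabet q = k - 1."""
    return k * (k - 1) // 2 - k + 2

# the bound is tight on the worst-case family: 2N - 7/4 = (k - 3/2)^2 exactly
for k in range(3, 30):
    assert alphabet_bound(worst_case_receivers(k)) == k - 1
```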
Fig. 7.2 A minimal subtree graph for a network with two sources, N receivers, and N − 1
coding subtrees.
We have already given two different proofs of this result, in Theorem 3.2
and Lemma 5.2. Here we discuss whether this bound is tight or not.
The proof of Theorem 3.2 does not use the fact that we only care
about minimal configurations, where paths cannot overlap more than
a certain number of times, and thus the entries of the matrices Aj are
not arbitrary. For example, there does not exist a minimal configuration
with two sources and two receivers corresponding to the matrices
$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & x \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 1 & 1 \\ 1 & x \end{pmatrix}. \qquad (7.5)$$
In fact, as we saw in Theorem 3.5, there exist exactly two minimal configurations with two sources and two receivers. Therefore, the proof of Lemma 3.1 is more general than it needs to be, and does not necessarily give a tight bound. Similarly, recall that in the proof of Lemma 5.2, we overestimated the required alphabet size.
7.1.3 Complexity
It is well known that the problem of finding the chromatic number of a
graph is NP-hard. We saw in Section 7.1.1 that the problem of finding
the minimum alphabet size can be reduced to the chromatic number
problem. Here we show that the reverse reduction also holds: we can
reduce the problem of coloring an arbitrary graph G with the minimum
number of colors, to the problem of finding a valid network code with
the smallest alphabet size, for an appropriately chosen network (more
precisely, an appropriately chosen subtree graph Γ, that corresponds
to a family of networks). This instance will have two sources, and the
coding vectors (corresponding to colors) will be the vectors in (7.1).
Let G = (V, E) be the graph whose chromatic number we are seek-
ing. Create a subtree graph Γ = (Vγ = V, Eγ ), that has as subtrees the
vertices V of G. We will first select the subtrees (vertices) in Γ that act
as source subtrees. Select an edge e ∈ E that connects vertices v1 and
v2 , with v1 , v2 ∈ V . Let v1 and v2 in Γ act as source subtrees, and the
remaining vertices as coding subtrees, that is, Eγ = {(v1 , v), (v2 , v)|v ∈
V \ {v1 , v2 }}. For each e = (v1 , v2 ) ∈ E, create a receiver that observes
v1 and v2 in Γ. It is clear that finding a coding scheme for Γ is equivalent
to coloring G. We thus conclude that finding the minimum alphabet
size is also an NP-hard problem.
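The reduction can be sketched concretely. Assigning each subtree a point of PG(1, q) is a valid code exactly when every receiver's two observed subtrees get linearly independent points, i.e., when the assignment properly colors G with the q + 1 available colors; the span constraint is vacuous here because every point of PG(1, q) lies in the span of the two source vectors. The function names below are ours.

```python
def coloring_instance(vertices, edges):
    """Reduction sketch: pick one edge (v1, v2) of G, let v1, v2 act as the
    source subtrees, every other vertex as a coding subtree with both sources
    as parents, and add one receiver per edge of G observing its endpoints."""
    v1, v2 = edges[0]
    parents = {v: (v1, v2) for v in vertices if v not in (v1, v2)}
    return (v1, v2), parents, list(edges)

def valid_code(assignment, receivers, q):
    """An assignment of PG(1, q) points to subtrees is a valid code iff the
    two subtrees observed by each receiver get independent vectors, i.e.,
    iff the assignment is a proper coloring of G with q + 1 colors."""
    det = lambda u, v: (u[0] * v[1] - u[1] * v[0]) % q
    return all(det(assignment[u], assignment[w]) != 0 for u, w in receivers)

# a triangle needs 3 colors, so the 3 points of PG(1, 2) are exactly enough
verts, G_edges = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
_, parents, recs = coloring_instance(verts, G_edges)
p = [(0, 1), (1, 0), (1, 1)]
assert valid_code({0: p[0], 1: p[1], 2: p[2]}, recs, 2)       # proper coloring
assert not valid_code({0: p[0], 1: p[0], 2: p[1]}, recs, 2)   # improper one
```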
if we have h sources and N receivers, the paths from the sources to the
receivers can only intersect in a fixed number of ways. The bounds we
give below for minimal configurations depend only on h and N .
The problem of finding the minimum number of coding points is NP-hard for the majority of cases. However, it is polynomial time provided that h = O(1), N = O(1), and the underlying graph is acyclic.
Proof. Recall that there are exactly 2N receiver nodes. The first part
of the claim then follows directly from Theorem 3.6. Figure 7.2 demon-
strates a minimal subtree graph for a network with two sources and N
receivers that achieves the upper bound on the maximum number of
subtrees.
7.2.3 Complexity
The problem of minimizing the number of coding points is NP-hard
for the majority of cases. It is polynomial time in the case where the
number of sources and receivers is constant, and the underlying graph
is acyclic. The following theorem deals with the case of integral network
coding over cyclic graphs (graphs that are allowed to have cycles).
1 We underline that this reduction applies for the case of integral routing.
expensive and may cause delay) and (ii) the required code alphabet size, which determines the complexity of the electronic modules performing finite field arithmetic. We now examine whether we can quantify the trade-offs between alphabet size, number of coding points, throughput, and min-cut. Up to now we only have preliminary results in these directions, which apply to special configurations such as the following examples.
2 Note however that several of these choices may turn out to correspond to the same minimal
configuration.
In our case, X is the set of subtrees, m is the min-cut from the sources
to each receiver, each element of F corresponds to the set of m sub-
trees observed by a receiver, and the set of colors are the points on
the projective line. Therefore, F is a family of sets each of size m, and
|F| = N . Suppose that we can use an alphabet size k − 1 (which gives
k colors). Note that each receiver can observe both sources if F is col-
orable, since that means that no set of subtrees observed by a receiver
is monochromatic. From Theorem 7.9, this holds as long as
N < k m−1 .
The above inequality shows a trade-off between the min-cut m to each
user and the alphabet size k − 1 required to accommodate N receivers
for the special class of configurations we examine. We expect a similar
tradeoff in the case where the graph is not bipartite as well. However,
Theorem 7.9 cannot be directly applied, because in this case there are
additional constraints on coloring of the elements of X coming from the
requirement that each child subtree has to be assigned a vector lying
in the linear span of its parents’ coding vectors.
Theorem 7.10. For every family F all of whose members have size exactly m, there exists a k-coloring of its points that colors at most $|F|\,k^{1-m}$ of the sets of F monochromatically.
$$P_{Nd} \ge \left(1 - \frac{N}{q}\right)^{\binom{N}{p}} \approx e^{-N\binom{N}{p}/q}.$$
Thus, if we want this bound to be no smaller than, say, $e^{-1}$, we need to choose $q \ge N\binom{N}{p}$.
For arbitrary values of p and N, network coding using a binary alphabet can be achieved as follows: We first remove the edges going out of S into those A-nodes whose labels contain N. There are $\binom{N-1}{p-2}$ such edges. Since the number of edges going out of S into A-nodes is $\binom{N}{p-1}$, the number of remaining edges is $\binom{N}{p-1} - \binom{N-1}{p-2} = \binom{N-1}{p-1}$. We label these edges by the $h = \binom{N-1}{p-1}$ different basis elements of $\mathbb{F}_2^h$. We further remove all A-nodes which have lost their connection with the source S, as well as their outgoing edges. The B-nodes merely sum their inputs over $\mathbb{F}_2^h$, and forward the result to the C-nodes.
Consider a C-node that the N th receiver is connected to. Its label,
say ω, is a p-element subset of I containing N . Because of our edge
removal, the only A-node that this C-node is connected to is the one
with the label ω \ {N }. Therefore, all C-nodes that the N th receiver
is connected to have a single input, and all those inputs are different.
Consequently, the N th receiver observes all the sources directly.
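The structural claims of this construction, namely that exactly $\binom{N-1}{p-1}$ source edges survive (Pascal's identity) and that every C-node of the Nth receiver keeps a single, distinct A-node input, can be verified by direct enumeration; `check_receiver_N` is a hypothetical helper name.

```python
from itertools import combinations

def check_receiver_N(N, p):
    """Verify the edge-removal construction: A-node labels containing N are
    disconnected from S; each C-node of receiver N (p-subsets containing N)
    then keeps the single input omega \\ {N}, and all such inputs differ."""
    I = range(1, N + 1)
    surviving = {a for a in combinations(I, p - 1) if N not in a}
    # Pascal's identity: C(N, p-1) - C(N-1, p-2) = C(N-1, p-1) labels survive
    assert len(surviving) == len(list(combinations(range(1, N), p - 1)))
    inputs = []
    for omega in combinations(I, p):
        if N not in omega:
            continue                        # not a C-node of receiver N
        feeds = [a for a in combinations(omega, p - 1) if a in surviving]
        assert feeds == [omega[:-1]]        # single input: the label omega \ {N}
        inputs.append(feeds[0])
    assert len(set(inputs)) == len(inputs)  # receiver N sees distinct sources
    return True

assert check_receiver_N(6, 3)
```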
$$T(\mathcal{A}_{zk}^k) = \frac{h}{N}\left(p + \frac{N-p}{p} + k\,\frac{p-1}{h}\right). \qquad (7.6)$$
Proof. If only k out of the $\binom{N}{p-1}$ coding points are allowed to code, we get that
$$T_k^{av} = \frac{1}{N}\left[\binom{N-1}{p-1} + (N-1)\binom{N-2}{p-2} + \left(\binom{N-1}{p} - k\right) + kp\right]. \qquad (7.7)$$
Notes
The very first paper on network coding by Ahlswede et al. required codes over arbitrarily large sequences. Alphabets of size N · h were shown to be sufficient for designing codes based on the algebraic approach of Koetter and Médard [32]. The polynomial-time code design algorithms of Jaggi et al. in [30] operate over alphabets of size N. That codes for all networks with two sources require no more than $O(\sqrt{N})$-size alphabets was proved by Fragouli and Soljanin in [24] by observing a connection between network code design and vertex coloring. The connection with coloring was independently observed by Rasala-Lehman and Lehman in [35]. It is not known whether an alphabet of size O(N) is necessary for general networks; only examples of networks for which alphabets of size $O(\sqrt{N})$ are necessary were demonstrated in [24, 35].
Algorithms for fast finite field operations can be found in [46].
The encoding complexity in terms of the number of coding points was examined in [24], which showed that the maximum number of coding points a network with two sources and N receivers can have equals N − 1. Bounds on the number of coding points for general networks were derived by Langberg et al. in [34], who also showed that not only determining the number of coding points, but even determining whether coding is required or not for a network, is in most cases NP-hard.
Coding with limited resources was studied by Chekuri et al. in [15],
Cannons and Zeger in [12], and Kim et al. in [31].
Appendix
Points in General Position
For networks with h sources and linear network coding over a field
Fq , the coding vectors lie in Fhq , the h-dimensional vector space over
the field Fq . Since in network coding we only need to ensure linear
independence conditions, we are interested in many cases in sets of
vectors in general position:
$$f(\alpha) = [1\ \ \alpha\ \ \alpha^2\ \ \ldots\ \ \alpha^{h-1}].$$
The vectors on the moment curve are in general position. Indeed, take any h of these q vectors, say those corresponding to $x_1, \ldots, x_h$; the determinant
$$\det\begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{h-1}\\
1 & x_2 & x_2^2 & \cdots & x_2^{h-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & x_h & x_h^2 & \cdots & x_h^{h-1}
\end{pmatrix}$$
is a Vandermonde determinant, which is nonzero whenever the $x_i$ are distinct.
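For a prime q, general position of the moment-curve vectors can also be verified exhaustively by computing all $\binom{q}{h}$ determinants over $\mathbb{F}_q$; `det_mod` is our helper, and the modular inverse via Fermat's little theorem assumes q prime.

```python
from itertools import combinations

def det_mod(M, q):
    """Determinant over F_q (q prime) by Gaussian elimination."""
    M = [row[:] for row in M]
    n, d = len(M), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % q), None)
        if piv is None:
            return 0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d                            # row swap flips the sign
        d = d * M[c][c] % q
        inv = pow(M[c][c], q - 2, q)          # Fermat inverse, q prime
        for r in range(c + 1, n):
            f = M[r][c] * inv % q
            M[r] = [(a - f * b) % q for a, b in zip(M[r], M[c])]
    return d % q

q, h = 7, 3
curve = [[pow(a, i, q) for i in range(h)] for a in range(q)]  # [1, a, a^2]
# general position: every choice of h of the q curve vectors is independent
assert all(det_mod(list(S), q) != 0 for S in combinations(curve, h))
```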
Example A.2. For networks with two sources, we often use the q + 1 vectors in the set
$$\{[0\ \ 1]\} \cup \{[1\ \ \alpha] : \alpha \in \mathbb{F}_q\}. \qquad (A.1)$$
Any two distinct vectors in this set are linearly independent, and no larger set of pairwise linearly independent vectors exists in $\mathbb{F}_q^2$: any such set of size g must satisfy
$$g(q-1) + 1 \le q^2 \;\Rightarrow\; g \le q + 1.$$
This follows from a simple counting argument. For each of the g vectors, we cannot reuse any of its q − 1 nonzero multiples. We also cannot use the all-zero vector. Hence the left-hand side. The right-hand side counts all possible vectors in $\mathbb{F}_q^2$.
Thus, for networks with two sources, we can, without loss of generality, restrict our selection of coding vectors to the set (A.1).
Notes
Although the problem of identifying g(h, q) looks combinatorial in
nature, most of the harder results on the maximum size of arcs have
been obtained by using algebraic geometry which is also a natural tool
to use for understanding the structure (i.e., geometry) of arcs. The
moment curve was discovered by Carathéodory in 1907 and then inde-
pendently by Gale in 1956. A good survey on the size of arcs in pro-
jective spaces can be found in [3].
References
[1] A. Agarwal and M. Charikar, “On the advantage of network coding for improv-
ing network throughput,” IEEE Information Theory Workshop, San Antonio,
Texas, 2004.
[2] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, “Network information
flow,” IEEE Transactions on Information Theory, vol. 46, pp. 1204–1216, July
2000.
[3] A. H. Ali, J. W. P. Hirschfeld, and H. Kaneta, “On the size of arcs in projective spaces,” IEEE Transactions on Information Theory, vol. 41, pp. 1649–1656, September 1995.
[4] M. Ball, T. L. Magnanti, C. L. Monma, and G. L. Nemhauser, “Network Mod-
els,” in Handbooks in Operations Research and Management Science, North
Holland, 1994.
[5] J. Bang-Jensen, A. Frank, and B. Jackson, “Preserving and increasing local
edge-connectivity in mixed graphs,” SIAM Journal on Discrete Mathematics,
vol. 8, pp. 155–178, 1995.
[6] Á. M. Barbero and Ø. Ytrehus, “Heuristic algorithms for small field mul-
ticast encoding,” 2006 IEEE International Symposium Information Theory
(ISIT’06), Chengdu, China, October 2006.
[7] A. M. Barbero and Ø. Ytrehus, “Cycle-logical treatment for cyclopathic networks,” Joint special issue of the IEEE Transactions on Information Theory and the IEEE/ACM Transactions on Networking, vol. 52, pp. 2795–2804, June 2006.
[8] B. Bollobás, Modern Graph Theory. Springer-Verlag, 2002.
[9] J. A. Bondy and U. S. R. Murty, Graph Theory with Applications. Amsterdam:
North-Holland, 1979.
[44] A. Schrijver, Theory of Linear and Integer Programming. John Wiley & Sons,
June 1998.
[45] J. T. Schwartz, “Fast probabilistic algorithms for verification of polynomial
identities,” Journal of the ACM, vol. 27, pp. 701–717, 1980.
[46] J. von zur Gathen and J. Gerhard, Modern Computer Algebra. Cambridge
Univ. Press, Second Edition, September 2003.
[47] Y. Wu, P. A. Chou, and K. Jain, “A comparison of network coding and tree
packing,” ISIT 2004, 2004.
[48] R. W. Yeung, “Multilevel diversity coding with distortion,” IEEE Transactions on Information Theory, vol. 41, pp. 412–422, 1995.
[49] R. W. Yeung, S.-Y. R. Li, N. Cai, and Z. Zhang, “Network coding theory:
A tutorial,” Foundation and Trends in Communications and Information The-
ory, vol. 2, pp. 241–381, 2006.
[50] L. Zosin and S. Khuller, “On directed Steiner trees,” in Proceedings of the 13th
Annual ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 59–63,
2002.