On the Graph Fourier Transform for Directed Graphs
Abstract—The analysis of signals defined over a graph is relevant in many applications, such as social and economic networks, big data or biological networks, and so on. A key tool for analyzing these signals is the so-called Graph Fourier Transform (GFT). Alternative definitions of the GFT have been suggested in the literature, based on the eigen-decomposition of either the graph Laplacian or the adjacency matrix. In this paper, we address the general case of directed graphs and we propose an alternative approach that builds the graph Fourier basis as the set of orthonormal vectors that minimize a continuous extension of the graph cut size, known as the Lovász extension. To cope with the non-convexity of the problem, we propose two alternative iterative optimization methods, properly devised for handling orthogonality constraints. Finally, we extend the method to minimize a continuous relaxation of the balanced cut size. The formulated problem is again non-convex, and we propose an efficient solution method based on an explicit-implicit gradient algorithm.

Index Terms—Graph signal processing, Graph Fourier Transform, total variation, clustering.

S. Sardellitti and S. Barbarossa are with Sapienza University of Rome, DIET Dept., Via Eudossiana 18, 00184 Rome, Italy (e-mail: [email protected], [email protected]). P. Di Lorenzo is with the Dept. of Engineering, University of Perugia, Via G. Duranti 93, 06125 Perugia, Italy (e-mail: [email protected]). This work has been supported by the TROPIC Project, Nr. ICT-318784. The work of P. Di Lorenzo was funded by the "Fondazione Cassa di Risparmio di Perugia". Matlab code implementing the algorithms proposed in this paper is available at https://sites.google.com/site/stefaniasardellitti/code-supplement

I. INTRODUCTION

Graph signal processing (GSP) has attracted a lot of interest in recent years because of its many potential applications, from social and economic networks to smart grids, gene regulatory networks, and so on. GSP represents a promising tool for the representation, processing and analysis of complex networks, where discrete signals are defined on the vertices of a (possibly weighted) graph. Many works in the recent literature attempt to extend classical discrete signal processing (DSP) theory from time signals or images to signals defined over the vertices of a graph by introducing the basic concepts of graph-based filtering [1]–[3], graph-based transforms [4]–[7], sampling and uncertainty principles [8]–[12]. A central role in GSP is played by the spectral analysis of graph signals, which is based on the introduction of the so-called Graph Fourier Transform (GFT). Alternative definitions of the GFT have been introduced, see, e.g., [4], [5], [8], [13], [14], each of them coming from different motivations, like building a basis with minimal variation, filtering signals defined over graphs, etc. Two basic approaches have been suggested. The first one is rooted in spectral graph theory and uses the graph Laplacian as the central unit, see e.g. [5] and the references therein. This approach applies to undirected graphs, and the Fourier basis is constituted by the eigenvectors of the graph Laplacian, which represent the basis that minimizes the ℓ2-norm graph total variation. This approach is well motivated on undirected graphs, where the minimization of the ℓ2-norm total variation is equivalent to minimizing the quadratic form built on the Laplacian matrix. Hence, an orthonormal basis minimizing the ℓ2-norm total variation leads to the eigenvectors of the Laplacian matrix. However, these properties do not hold anymore in the directed graph case. An alternative approach, valid for the more general and challenging case of directed graphs, was proposed in [1], [4]. That method builds on the Jordan decomposition of the adjacency matrix and defines the associated generalized eigenvectors as the GFT basis. This second method is rooted in the association of the graph adjacency matrix with the signal shift operator, which is at the basis of all shift-invariant linear filtering methods for graph signals [15], [16]. This approach paved the way to the algebraic signal processing framework. However, the GFT definition proposed in [4] raises some important issues requiring further investigation. First, the basis vectors are linearly independent, but in general they are not orthogonal, so that the resulting transform is not unitary and hence does not preserve scalar products. Second, the total variation introduced in [4] does not respect some desirable properties; for example, it does not guarantee that a constant graph signal has zero total variation [17], [18]. Finally, the numerical computation of the Jordan decomposition often incurs well-known numerical instabilities, even for moderate-size matrices [19], although alternative decomposition methods have been recently suggested to tackle these instability issues [20].

In some applications, one of the major motivations for using the GFT is the analysis of graph signals that exhibit clustering properties, i.e. signals that are smooth within subsets of highly interconnected nodes (clusters), while they can vary arbitrarily across different clusters. In such cases, the GFT of these signals is typically sparse, and its sparsity carries relevant information on the data under analysis. These signals are said to be band-limited, in analogy with what happens to smooth time signals. Within the machine learning context, GSP can play a key role in unsupervised and semi-supervised learning, as suggested in [21], [22]. In these applications, the input is a point cloud and the goal is to detect clusters, either without or with limited supervision. Graph-based methods tackle these problems by associating a graph to the point cloud, where the vertices are the points themselves, whereas edges between pairs of points are established if two points are sufficiently close. The goal of clustering/classification is to associate a different label to each cluster. If we look at these labels as a signal defined over the points (vertices), this signal is band-limited by construction [21], [22].
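To fix ideas, the Laplacian-based GFT recalled above (and formalized in (5) below) can be sketched in a few lines of Python. This is our own minimal illustration based on numpy, not code from the paper's Matlab package; the function name is hypothetical.

    import numpy as np

    def laplacian_gft(A, s):
        # A: symmetric (undirected) adjacency matrix; s: graph signal.
        L = np.diag(A.sum(axis=1)) - A      # graph Laplacian L = D - A
        lam, U = np.linalg.eigh(L)          # eigenvalues in increasing order
        return lam, U, U.T @ s              # graph frequencies, basis, GFT of s

For a signal that is (nearly) constant within clusters, the transformed vector U.T @ s is concentrated on the eigenvectors associated with small eigenvalues, i.e. the signal is band-limited in the sense discussed above.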
In this paper, we propose a novel alternative approach to build the GFT basis for the general case of directed graphs. Rather than starting from the decomposition of one of the graph matrix descriptors, either adjacency or Laplacian, we start by identifying an objective function to be minimized, and then we build an orthogonal matrix that minimizes that objective function. More specifically, we choose as objective function the graph cut size, as its minimization leads to identifying clusters. We consider the general case of directed graphs, which subsumes undirected graphs as a particular case. The cut function is a set function and its minimization is NP-hard; however, exploiting the sub-modularity property of the cut size, it has been shown that there exists a lossless convex relaxation of the cut size, named its Lovász extension [23], [24], whose minimization preserves the optimality of the solution of the original non-convex problem. Interestingly, the Lovász extension of the cut size gives rise to an alternative definition of total variation of a graph signal that captures the edges' directivity. Furthermore, in the case of undirected graphs, the Lovász extension reduces to the ℓ1-norm total variation of a graph signal, which represents the discrete counterpart of the total variation of continuous-time signals, a quantity that plays a fundamental role in the continuous-time Fourier Transform, see, e.g., [13], [17]. We define the GFT basis as the set of orthonormal vectors that minimize the Lovász extension of the cut size. Unfortunately, even though the objective function is convex, the resulting problem is non-convex because of the orthogonality constraint imposed on the basis vectors. Thus, to find a (possibly local) solution of the problem in an efficient manner, we exploit two recently developed methods that are specifically tailored to handle non-convex orthogonality constraints, namely, the splitting orthogonality constraints (SOC) method [25] and the proximal alternating minimized augmented Lagrangian (PAMAL) method [26]. The SOC method is quite simple to implement and, even if no convergence proof has been provided yet, extensive numerical results validate the effectiveness and robustness of such a strategy. Conversely, the PAMAL algorithm, which hybridizes the augmented Lagrangian method and the proximal minimization scheme, is known to guarantee convergence. Furthermore, any limit point of each sequence generated by the PAMAL method satisfies the Karush-Kuhn-Tucker conditions of the original non-convex problem [26]. Finally, to prevent the resulting basis vectors from being excessively sparse, we consider the minimization of a continuous relaxation of the balanced cut size. To solve the corresponding non-convex fractional problem, we adopt an efficient and convergent algorithm based on the explicit-implicit gradient method [27].

The paper is organized as follows. Sec. II introduces the graph signal variations as the continuous Lovász extension of the min-cut size. In Sec. III, we define the GFT as the set of optimal orthonormal vectors minimizing the graph signal variation, and in Sec. IV we illustrate the optimization methods used for solving the resulting non-convex problem. Then, in Sec. V we conceive the GFT as the solution of a balanced min-cut problem, while Sec. VI illustrates some numerical examples validating the effectiveness of the proposed approaches. Finally, Sec. VII draws some conclusions.

II. MIN-CUT SIZE AND ITS LOVÁSZ EXTENSION

In this section, we recall the definitions of cut size and Lovász extension, as they will form the basic tools for our definition of the GFT. We consider a graph G = {V, E} consisting of a set of N vertices (or nodes) V = {1, . . . , N} along with a set of edges E = {a_{ij}}_{i,j∈V}, such that a_{ij} > 0 if there is a direct link from node j to node i, or a_{ij} = 0 otherwise. We denote by |V| the cardinality of V, i.e. the number of elements of V. A signal s on a graph G is defined as a mapping from the vertex set to a real vector of size N = |V|, i.e. s : V → R. Let A denote the N × N adjacency matrix with entries given by the edge weights a_{ij} for i, j = 1, . . . , N. The graph Laplacian is defined as L := D − A, where the in-degree matrix D is a diagonal matrix whose i-th diagonal entry is d_i = \sum_j a_{ij}.

One of the basic operations over graphs is clustering, i.e. the partition of the graph into disjoint subgraphs, such that the vertices within each subgraph (cluster) are highly interconnected, whereas there are only a few links between different clusters. Finding a good partition can be formulated as the minimization of the cut size [28], whose definition is reported here below. Let us consider a subset of vertices S ⊂ V, and its complement set in V, denoted by S̄. The edge boundary of S is defined as the set of edges with one end in S and the other end in S̄. The cut size between S and S̄ is defined as the sum of the weights over the boundary [28], i.e.

cut(S, S̄) := \sum_{i∈S, j∈S̄} a_{ji}.   (1)

Finding the partition that minimizes the cut size in (1) is an NP-hard problem. To overcome this difficulty, we exploit the sub-modularity property of the cut size [24], which ensures that its Lovász extension is a convex function [24]. We briefly recall some of the main definitions and properties here below. Given the set V and its power set 2^V, i.e. the set of all its subsets, let us consider a real-valued set function F : 2^V → R. The cut size in (1) is an example of a set function, with F(S) := cut(S, S̄). Every element of the power set 2^V may be associated to a vertex of the hyper-cube {0, 1}^N. Namely, a set S ⊆ V can be uniquely identified with the indicator vector 1_S, i.e. the vector which is 1 at entry j if j ∈ S, and 0 otherwise. Then, a set function F can be defined on the vertices of the hyper-cube {0, 1}^N. The Lovász extension of a set function F [23], [24] allows the extension of a set function defined on the vertices of the hyper-cube {0, 1}^N to the full hypercube [0, 1]^N, and hence to the entire space R^N. We recall its definition hereafter.

Definition 1: Let F : 2^V → R be a set function with F(∅) = 0. Let x ∈ R^N be ordered w.l.o.g. in increasing order, such that x_1 ≤ x_2 ≤ . . . ≤ x_N. Define C_0 ≜ V and C_i ≜ {j ∈ V : x_j > x_i} for i > 0. Then, the Lovász extension f : R^N → R of F, evaluated at x, is given by:

f(x) = \sum_{i=1}^{N} x_i (F(C_{i−1}) − F(C_i)) = \sum_{i=1}^{N−1} F(C_i)(x_{i+1} − x_i) + x_1 F(V).   (2)
Note that f(x) is piecewise affine w.r.t. x, and F(S) = f(1_S) for all S ⊆ V. An interesting class of set functions is given by the submodular set functions, whose definition follows next.

Definition 2: A set function F : 2^V → R is submodular if and only if, ∀ A, B ⊆ V, it satisfies the following inequality:

F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B).

A fundamental property of a submodular set function is that its Lovász extension is a convex function. This is formally stated in the following proposition [24, p. 23].

Proposition 1: Let F : 2^V → R be a submodular function and f be its Lovász extension. Then, it holds

\min_{S⊆V} F(S) = \min_{x∈{0,1}^N} f(x) = \min_{x∈[0,1]^N} f(x).

Moreover, the set of minimizers of f(x) on [0, 1]^N is the convex hull of the minimizers of f(x) on {0, 1}^N.

The cut size function in (1) is known for being submodular, see, e.g., [24], [29]. More specifically, as shown in [24, p. 54], the cut function is equal to a positive linear combination of the functions G_{ij} : S ↦ (1_S)_i [1 − (1_S)_j], i.e.

cut(S) = \sum_{i,j∈V} a_{ji} G_{ij}.

The function G_{ij} is the extension to V of a function G̃_{ij} defined only on the power set of {i, j}, where G̃_{ij}({i}) = 1 and all other values are zero, so that, from (2), its Lovász extension is G̃_{ij}(x_i, x_j) = [x_i − x_j]_+ with [y]_+ := max{y, 0}. Therefore, the Lovász extension of the cut size function, in the general case of directed graphs, is given by:

f(x) = \sum_{i,j=1}^{N} a_{ji} [x_i − x_j]_+ := GDV(x).   (3)

We term this function the Graph Directed Variation (GDV), as it captures the edges' directivity. For undirected graphs, imposing a_{ij} = a_{ji}, the Lovász extension of the cut size boils down to

f(x) = \sum_{i,j=1, i>j}^{N} a_{ji} |x_i − x_j| := GAV(x).   (4)

Interestingly, this function, which we call the Graph Absolute Variation (GAV), represents the discrete counterpart of the ℓ1-norm total variation, which plays a key role in the classical Fourier Transform of continuous-time signals [13], [17].

It is easy to show that the directed variation GDV satisfies the following properties:
i) GDV(x) ≥ 0, ∀ x ∈ R^N;
ii) GDV(x) = 0, ∀ x = c1 with c ≥ 0;
iii) GDV(α x) = α GDV(x), ∀ α ≥ 0, i.e. it is positively homogeneous;
iv) GDV(x + y) ≤ GDV(x) + GDV(y), ∀ x, y ∈ R^N.

GDV is neither a proper norm nor a semi-norm since, in the latter case, it should be absolutely homogeneous. However, it meets the desired property ii), ensuring that a constant graph signal has zero total variation.

III. GRAPH FOURIER BASIS AND DIRECTED TOTAL VARIATION

Alternative definitions of the GFT have been proposed in the literature, depending on the different perspectives used to emphasize specific signal features. In the case of undirected graphs, the GFT of a vector s was defined as [5]

ŝ = U^T s,   (5)

where the columns of the matrix U are the eigenvectors of the Laplacian matrix L, i.e. L = U Λ U^T. This definition is basically rooted in the clustering properties of these eigenvectors, see, e.g., [30]. In fact, by definition of eigenvector, the Fourier basis used in (5) can be thought of as the solution of the following sequence of optimization problems:

u_k = \arg\min_{u_k ∈ R^N} u_k^T L u_k := \arg\min_{u_k ∈ R^N} GQV(u_k)
s.t. u_k^T u_ℓ = δ_{kℓ}, ℓ = 1, . . . , k,   (6)

for k = 2, . . . , N, where δ_{kℓ} is the Kronecker delta, and we used the property that the quadratic form built on the Laplacian equals the ℓ2-norm total variation, or graph quadratic variation (GQV), i.e.

GQV(x) := \sum_{i,j=1, j>i}^{N} a_{ji} (x_i − x_j)^2.

Thus, the Fourier basis obtained from (6) coincides with the set of orthonormal vectors that minimize the ℓ2-norm total variation. In all applications where the graph signals exhibit a cluster behavior, meaning that the signal is relatively smooth within each cluster, whereas it can vary arbitrarily from cluster to cluster, the GFT defined as in (5) helps emphasizing the presence of clusters [30]. However, the identification of the Laplacian eigenvectors as the orthonormal vectors that minimize the GQV is only valid for undirected graphs, for which the quadratic form built on the Laplacian reduces to the GQV. For directed graphs, the quadratic form in (6) captures only properties associated to the symmetrized Laplacian (i.e., L_s = (L + L^T)/2), and hence it cannot capture the edges' directivity. The generalization to directed graphs was proposed in [4] as

ŝ = V^{−1} s,   (7)

where V comes from the Jordan decomposition of the non-symmetric adjacency matrix A, i.e. A = V J V^{−1}. To estimate variations of the graph Fourier basis and to identify an order among frequencies, the total variation of a vector was defined in [4] as

TV_A(s) = ‖s − A_{norm} s‖_1,   (8)

where A_{norm} := A/|λ_{max}(A)|. The previous definition leads to the elegant theory of algebraic signal processing over graphs [1], [4], [15], [16]. However, there are some critical issues associated to that definition that need to be further explored. First, the definition of total variation as given in (8) does not ensure that a constant graph signal has zero total variation, and this collides with the common meaning of total variation [13], [17], [18]. Second, the columns of V are linearly independent complex generalized eigenvectors, but in general they are not orthogonal. This gives rise to a GFT that does not preserve
inner products when passing from the observation to the transformed domain. Furthermore, the computation of the Jordan decomposition incurs serious and intractable numerical instabilities when the graph size exceeds even moderate values [19], and more stable matrix decomposition methods have to be adopted to tackle its instability issues [20]. To overcome some of these criticalities, very recently the authors of [14] proposed a shift operator based on the directed Laplacian of a graph. Using the Jordan decomposition, the graph Laplacian is decomposed as

L = V_L J_L V_L^{−1}   (9)

and the GFT is defined in [14] as

ŝ = V_L^{−1} s.   (10)

To quantify oscillations in the graph harmonics and to order the frequencies, the total variation was defined in [14] as

TV_L(s) = ‖L s‖_1.   (11)

This definition of total variation ensures a zero value for constant graph signals. Furthermore, the eigenvalues with small absolute value correspond to low frequencies. Nevertheless, the GFT given by F = V_L^{−1} is still a non-unitary transform, and its computation is affected by the numerical instabilities associated to the Jordan decomposition.

In this paper, we propose a novel method to build the graph Fourier basis as the set of N orthonormal vectors x_i, i = 1, . . . , N, that minimize the total variation defined in (3), which represents the continuous convex Lovász extension of the graph cut size in (1). The first vector is certainly the constant vector, i.e. x_1 = b1, with b = 1/√N, as this (unit-norm) vector yields a total variation equal to zero. Let us introduce the matrix X := (x_1, . . . , x_N) ∈ R^{N×N} containing all the basis vectors. Thus, the search for the GFT basis can be formally stated as the search for the orthonormal vectors that minimize the directed total variation in (3), i.e.

\min_{X ∈ R^{N×N}} GDV(X) := \sum_{k=1}^{N} GDV(x_k)   (P)
s.t. X^T X = I, x_1 = b1.

The constraints are used to find an orthonormal basis and to prevent the trivial null solution. Although the objective function is convex, problem P is non-convex due to the orthogonality constraint. In the next section, we present two alternative optimization strategies aimed at solving the non-convex, non-differentiable problem P in an efficient manner.

IV. OPTIMIZATION ALGORITHMS

To avoid handling the non-convex orthogonality constraints directly, several methods have been proposed in the literature based on the solution of a sequence of unconstrained problems approaching the feasibility condition, such as the penalty methods [31], [32] and the augmented Lagrangian based methods [33], [34]. The penalty method is generally simple, but it suffers from slow convergence and ill-conditioning. On the other hand, the standard augmented Lagrangian method solves a sequence of sub-problems that usually have no analytical solutions, and the choice of initial points ensuring a fast convergence rate is usually nontrivial. To cope with these issues, in this section we present two alternative iterative algorithms to solve the non-convex, non-smooth problem P, hinging on some recently developed methods for solving non-differentiable problems with non-convex constraints [25], [26]. The first method, introduced in [25], called the splitting orthogonality constraints (SOC) method, is based on the alternating direction method of multipliers (ADMM) [35], [36] and the split Bregman method [37], [38]. The SOC method leads to some important benefits, as it is simple to implement and the resulting non-convex sub-problem with orthonormality constraint admits a closed-form solution. Although no convergence proof of the SOC method has been provided yet, numerical results validate its value and robustness.

An alternative optimization method that tackles the non-convex minimization problem P and guarantees convergence is the PAMAL algorithm recently developed in [26]. The algorithm combines the augmented Lagrangian method with proximal alternating minimization. A convergence proof was provided in [26]. More specifically, this method has the so-called sub-sequence convergence property, i.e. there exists at least one convergent sub-sequence, and any limit point satisfies the Karush-Kuhn-Tucker (KKT) conditions of the original nonconvex problem. Building on these algorithms, in the sequel we introduce two efficient optimization strategies that build the basis for the Graph Fourier Transform as the solution of problem P.

A. SOC method

The SOC algorithm was developed in [25] and tackles orthogonality constrained problems by iteratively solving a convex problem and a quadratic problem that admits a closed-form solution. More specifically, introducing an auxiliary variable P = X to split the orthogonality constraint, problem P is equivalent to

\min_{X,P ∈ R^{N×N}} GDV(X)
s.t. X = P, x_1 = b1, P^T P = I.   (12)

The first constraint is linear and, as discussed in [25], it can be handled using Bregman iteration. Therefore, by adding the Bregman penalty function [37], problem (12) is equivalent to the following simple two-step procedure:

(X^k, P^k) ≜ \arg\min_{X,P ∈ R^{N×N}} GDV(X) + (β/2) ‖X − P + B^{k−1}‖_F^2
s.t. x_1 = b1, P^T P = I;
B^k = B^{k−1} + X^k − P^k,

where β is a strictly positive constant. Similarly to ADMM and split Bregman iteration [39], the above problem can be
solved by iteratively minimizing with respect to X and P:

1. X^k ≜ \arg\min_{X ∈ R^{N×N}} GDV(X) + (β/2) ‖X − P^{k−1} + B^{k−1}‖_F^2
   s.t. x_1 = b1   (P^k)
2. P^k ≜ \arg\min_{P ∈ R^{N×N}} ‖P − (X^k + B^{k−1})‖_F^2
   s.t. P^T P = I   (Q^k)
3. B^k = B^{k−1} + X^k − P^k.   (13)

The interesting aspect of this formulation is that subproblem P^k is convex and the second constrained quadratic problem Q^k has a closed-form solution, as illustrated in the following proposition.

Proposition 2: Define Y^k = X^k + B^{k−1} and let

Y^k = Q̄ S R̄^T

be its SVD decomposition, where Q̄, R̄ ∈ R^{N×N} are unitary matrices, and S ∈ R^{N×N} is the diagonal matrix whose entries are the singular values of Y^k. Then, the optimal solution of the quadratic non-convex problem Q^k in (13) is P^k = Q̄ R̄^T.

Proof. See the proof of Theorem 2.1 in [25].

Combining (13) and Proposition 2, the main steps of the SOC method are summarized in Algorithm 1.

Algorithm 1: SOC method
Set β > 0, X^0 ∈ R^{N×N}, X^{0T} X^0 = I, x_1^0 = b1, P^0 = X^0, B^0 = 0, k = 1.
Repeat
  Find X^k as the solution of P^k in (13),
  Y^k = X^k + B^{k−1},
  Compute the SVD decomposition Y^k = Q̄ S R̄^T,
  P^k = Q̄ R̄^T,
  B^k = B^{k−1} + X^k − P^k,
  k = k + 1,
until convergence.

It is important to remark that the choice of the coefficient β strongly affects the convergence behavior of the algorithm: a large value of β will force a stronger equality constraint, while a too small β might not be able to guarantee that the solution satisfies the orthogonality constraint. Hence, a proper tuning of the coefficient β is important to ensure fast convergence of the algorithm. Although, as remarked in [25], the convergence analysis of the SOC algorithm is still an open problem, we will show next that the numerical results testify to the validity and robustness of this method when applied to our case.

B. PAMAL method

As an alternative efficient method to tackle the non-convexity of problem P, we propose here an approach based on the PAMAL algorithm [26]. The method solves the orthogonality constrained problem by iteratively updating the primal variables and the multiplier estimates. To this end, let us reformulate the problem as follows. Let us introduce the sets S_1, defined as S_1 ≜ {x = ±b1}, and St ≜ {P ∈ R^{N×N} : P^T P = I}, which represents the Stiefel manifold [40]. For any set S, its indicator function is defined as

δ_S(X) = 0, if X ∈ S; +∞, otherwise.   (14)

Given these symbols, problem (12) is equivalent to the following one:

\min_{X,P ∈ R^{N×N}} f(X, P) ≜ GDV(X) + δ_{S_1}(x_1) + δ_{St}(P)   (P̃)
s.t. H(X, P) ≜ P − X = 0.

The basic idea to solve a problem in the form of P̃ was proposed in [26], and combines the augmented Lagrangian method [33], [41] with the alternating proximal minimization algorithm. The result is known as the PAM method [42], which deals with non-smooth, non-convex optimization. According to the augmented Lagrangian method, we add a penalty term to the objective function in order to associate a high cost to unfeasible points. In particular, the augmented Lagrangian function associated to the non-smooth problem P̃ is

L(X, P, Λ) = f(X, P) + ⟨Λ, H(X, P)⟩ + (ρ/2) ‖H(X, P)‖_F^2,

where ρ is a positive penalty coefficient, Λ ∈ R^{N×N} represents the multipliers matrix, while the matrix inner product is defined as ⟨A, B⟩ ≜ tr(A^T B). The proposed augmented Lagrangian method reduces problem P̃ to a sequence of problems that alternately update, at each iteration k, the following three steps:

1. Compute the critical point (X^k, P^k) of the function L(X, P, Λ^k; ρ^k) by solving
   (X^k, P^k) ≜ \min_{X,P ∈ R^{N×N}} L(X, P, Λ^k; ρ^k);   (15)
2. Update the multiplier estimates Λ^k;
3. Update the penalty parameter ρ^k.

We will show next how to implement the previous steps, which are described in detail in Algorithm 2.

Computation of the critical points (X^k, P^k). The optimal solution (X^k, P^k) of problem (15) is computed using an approximate algorithm, i.e. finding a subgradient point Θ^k ∈ ∂L(X^k, P^k, Λ^k; ρ^k) satisfying, with a prescribed tolerance value ε_k, the following inequality

‖Θ^k‖_∞ ≤ ε_k   (16)

with P^k ∈ St. To evaluate such a point, we exploit a coordinate-descent method with proximal regularization based on the PAM method proposed in [43]. More specifically, at the k-th outer iteration of the algorithm, we compute (X^k, P^k) by iteratively solving, at each inner iteration n, the following proximal regularization of a two-block Gauss-Seidel method:

X^{k,n} = \arg\min_{X ∈ R^{N×N}, x_1 = b1} L(X, P^{k,n−1}, Λ^k; ρ^k) + (c_1^{k,n−1}/2) ‖X − X^{k,n−1}‖_F^2   (P̃^{k,n})
P^{k,n} = \arg\min_{P ∈ R^{N×N}} L(X^{k,n−1}, P, Λ^k; ρ^k) + (c_2^{k,n−1}/2) ‖P − P^{k,n−1}‖_F^2   (Q̃^{k,n})
The inner iterations are stopped as soon as ‖Θ^{k,n}‖_∞ ≤ ε_k, with P^{k,n} ∈ St, where Θ^{k,n} ≜ (Θ_1^{k,n}, Θ_2^{k,n}) and the subgradients are given by

Θ_1^{k,n} = c_1^{k,n−1} (X^{k,n−1} − X^{k,n}) + ρ^k (P^{k,n−1} − P^{k,n})
Θ_2^{k,n} = c_2^{k,n−1} (P^{k,n−1} − P^{k,n}).   (18)

Algorithm 3 (closing steps): Step 3: Set (X^k, P^k) = (X^{k,n}, P^{k,n}), Θ^k = Θ^{k,n}, until ‖Θ^{k,n}‖_∞ ≤ ε_k.

Update of the multipliers and penalty coefficients. The rule for updating the multipliers matrix in Step 2 of Algorithm 2 needs some further discussion. We adopt the classical first-order approximation by imposing that the estimates of the multipliers must be bounded. Then, we explicitly project the multipliers matrix onto the compact box set T ≜ {Λ : Λ_min ≤ Λ ≤ Λ_max}, with −∞ < [Λ_min]_{i,j} ≤ [Λ_max]_{i,j} < ∞, ∀ i, j. The boundedness of the multipliers is a fundamental assumption needed to preserve the property that global minimizers of the original problem are obtained if each outer iteration of the penalty method computes a global minimum of the subproblem. Unfortunately, assumptions that imply boundedness of the multipliers tend to be very strong and often hard to verify. Nevertheless, following [26], [41], [44], we also impose the boundedness of the multipliers. This implies that, in the convergence proofs, we will assume that the true multipliers fall within the bounds imposed by the algorithm, see, e.g., [26]. Regarding the setting of the remaining parameters of the proposed algorithm, we will assume that: i) the sequence of positive tolerance parameters {ε_k}_{k∈N} is chosen such that lim_{k→∞} ε_k = 0; ii) the penalty parameter ρ_k is updated according to the infeasibility degree, following the rule described in step 3 of Algorithm 2 [26], [33].

Convergence Analysis. We now discuss in detail the convergence properties of the proposed PAMAL method. Assume that: i) the proximal parameters {c_i^{k,n}}_{∀k,n} are arbitrarily chosen as long as they satisfy (17); ii) the sequence {ε_k}_{k∈N} is chosen such that lim_{k→∞} ε_k = 0; iii) the penalty parameter ρ_k is updated according to the rule described in Algorithm 2. The PAM method, as given in Algorithm 3, guarantees global convergence to a critical point [43, Th. 6.2], provided that the penalty parameters {ρ_k}_{k∈N} in Algorithm 2 satisfy some mild conditions, as stated in the following theorem.

Theorem 1: Denote by {(X^{k,n}, P^{k,n})}_{n∈N} the sequence generated by Algorithm 3. The function L_k in (15) satisfies the Kurdyka-Łojasiewicz (K-Ł) property¹. Then Θ^{k,n} defined by (18) satisfies

Θ^{k,n} ∈ ∂L(X^{k,n}, P^{k,n}, Λ^k; ρ^k), ∀ n ∈ N.   (19)

Also, if γ > 1, ρ_1 > 0, for each k ∈ N it holds

‖Θ^{k,n}‖_∞ → 0, as n → ∞.   (20)

Proof. See Appendix B.

¹ The reader can refer to Appendix B for a definition of the Kurdyka-Łojasiewicz (K-Ł) property.

The convergence claim for Algorithm 2 to a stationary solution of problem P̃ is stated in the following theorem.

Theorem 2: Let {(X^k, P^k)}_{k∈N} be the sequence generated by Algorithm 2. Suppose ρ_1 > 0 and γ > 1. Then, the set of limit points of {(X^k, P^k)}_{k∈N} is non-empty, and every limit point satisfies the KKT conditions of the original problem P̃.

Proof. The proof follows similar arguments as in [26, Th. 3.1-3.5], and thus is omitted due to space limitations.

Remark 1. Note that both Algorithms 1 and 3 have to compute, at each step of their loops, the SVD of an N × N matrix. Therefore, at each iteration their computational cost is proportional to O(N³). So, clearly, there is a complexity issue that deserves further investigation to enable the application to large-size graphs. In this paper, we have not investigated methods to reduce the complexity of the approach by exploiting, for instance, the sparsity of the graphs under analysis. Also, we have not optimized the selection of the parameters involved in both SOC and PAMAL methods. However, even if complexity is an issue, the proposed approach is more numerically stable than the only method available today for the analysis of directed graphs, based on the Jordan decomposition.

Remark 2. The two alternative methods proposed above to solve the non-convex problem P are robust to random initializations, as testified also by the numerical results presented in the sequel. In terms of implementation complexity, the SOC algorithm is easier to code even though, to the best of our

Algorithm 4: Balanced graph signal variation
For k = 2, . . . , N
  Set n = 0, x_k^n = x^0, a nonzero vector with m(x_k^0) = 0, α > 0, 0 < ε ≪ 1.
  Repeat
    w^n ∈ sign(x_k^n),
    v^n = w^n − mean(w^n) 1,
    h^n = x_k^n + α v^n,
    x̂_k^{n+1} = \arg\min_{x_k ∈ X̂_k} f(x_k) + (E(x_k^n)/(2α)) ‖x_k − h^n‖_2^2,
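Reading the recoverable steps of Algorithm 4 as one explicit-implicit (forward-backward) iteration, a hedged Python sketch follows. The functions gdv_prox and E stand in for the proximal step of the variation f and for the balanced-cut energy defined in Sec. V; both are assumptions of this sketch, not code from the paper.

    import numpy as np

    def balanced_variation_step(x, gdv_prox, E, alpha):
        # One inner iteration of Algorithm 4 (sketch). `E(x)` is the balanced-cut
        # energy (its definition is in Sec. V) and `gdv_prox(h, t)` is assumed to
        # return arg min_x f(x) + (1/(2t)) ||x - h||_2^2 over the feasible set.
        w = np.sign(x)          # an element of sign(x_k^n); np.sign(0) = 0 here,
                                # whereas any value in [-1, 1] would be admissible
        v = w - np.mean(w)      # v^n = w^n - mean(w^n) 1
        h = x + alpha * v       # explicit (gradient) half-step
        return gdv_prox(h, alpha / E(x))   # implicit (proximal) half-step

With the identification 1/(2t) = E(x_k^n)/(2α), i.e. t = α/E(x_k^n), the last line matches the update for x̂_k^{n+1} displayed in the box above.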
Fig. 2: Optimal basis vectors x_k, k = 1, . . . , 15, for Algorithm 2 and the directed graph in Fig. 1a (panel titles: GDV(x_4) = 2.24, GDV(x_5) = 2.46, GDV(x_6) = 3, GDV(x_7) = 3.37, GDV(x_8) = 3.46, GDV(x_9) = 3.53, GDV(x_10) = 3.67, GDV(x_11) = 4.08, GDV(x_12) = 4.25, GDV(x_13) = 4.6, GDV(x_14) = 4.6, GDV(x_15) = 5).

Fig. 3: Optimal basis vectors x_k, k = 1, . . . , 15, for Algorithm 2 and the graph in Fig. 1b (panel titles: GDV(x_4) = 2.7, GDV(x_5) = 2.73, GDV(x_6) = 3, GDV(x_7) = 3.2, GDV(x_8) = 3.5, GDV(x_9) = 3.5, GDV(x_10) = 3.5, GDV(x_11) = 4.1, GDV(x_12) = 4.2, GDV(x_13) = 4.2, GDV(x_14) = 4.5, GDV(x_15) = 5).
unique and an interesting consequence of the edge directivity. In fact, as can be observed from Fig. 5, the optimal bases for the corresponding undirected graph (obtained by simply removing edge directivity) have only one vector with zero variation, the constant vector. Conversely, in the case shown before, we had three, two, and one vectors yielding zero variation.

Fig. 4: Optimal basis vectors x_k, k = 1, . . . , 15, for Algorithm 2 and the graph in Fig. 1c (panel titles: GDV(x_1) = 0, GDV(x_2) = 0.54, GDV(x_3) = 1.3, GDV(x_4) = 2.4, GDV(x_5) = 3, GDV(x_6) = 3.3, GDV(x_7) = 3.3, GDV(x_8) = 3.4, GDV(x_9) = 3.4, GDV(x_10) = 3.5, GDV(x_11) = 4, GDV(x_12) = 4.2, GDV(x_13) = 4.5, GDV(x_14) = 5, GDV(x_15) = 5.2).

Fig. 5: Optimal basis vectors x_k, k = 1, . . . , 15, for Algorithm 2 and the undirected counterpart of the graph in Fig. 1c (panel titles: GAV(x_1) = GQV(x_1) = 0; GAV(x_2) = 2.68, GQV(x_2) = 2.6; GAV(x_3) = 2.8, GQV(x_3) = 1.6; GAV(x_4) = 2.88, GQV(x_4) = 2.39; GAV(x_5) = 3.65, GQV(x_5) = 2.93; GAV(x_6) = 3.88, GQV(x_6) = 3.39; GAV(x_7) = 3.88, GQV(x_7) = 3.39; GAV(x_8) = 3.88, GQV(x_8) = 3.39; GAV(x_9) = 4, GQV(x_9) = 2.93; GAV(x_10) = 4.08, GQV(x_10) = 3.66; GAV(x_11) = 4.11, GQV(x_11) = 3.61; GAV(x_12) = 4.49, GQV(x_12) = 4.16; GAV(x_13) = 4.61, GQV(x_13) = 4.33; GAV(x_14) = 4.94, GQV(x_14) = 4.5; GAV(x_15) = 5.65, GQV(x_15) = 4.9).

Convergence test. Since the optimization problem P is non-convex, there is of course the possibility that the proposed methods fall into a local minimum. Furthermore, while the PAMAL method guarantees convergence, the SOC algorithm might also fail to converge because, theoretically speaking, there is no convergence analysis. To test what happens, we considered several independent initializations of both SOC and PAMAL algorithms in the search for a basis for the graph of Fig. 1a. In Fig. 6, we report the average behavior (± the standard deviation) of the directed variation versus the iteration index m, which counts the overall number of (outer and inner) iterations for Algorithms 1 and 2. The curves refer to 200 independent initializations of the SOC and PAMAL algorithms, using the same initialization for both. We can observe that in all cases the algorithms converge, but indeed there is a spread in the final variation, meaning that both methods can incur local minima. Nonetheless, the spread is quite limited, which suggests that bases associated to different local minima behave similarly in terms of total variation. Additionally, since the PAMAL algorithm solves the orthogonality constrained, non-convex problem by iteratively updating the primal variables and the multipliers, the objective function evaluated at each (inner and outer) iteration does not necessarily follow a monotonic decay, as can be noticed in the lower subplot of Fig. 6.

Comparison with alternative GFT bases. We compare now the GFT basis found with our methods with the bases associated to either the Laplacian or the adjacency matrix, as proposed in [4], [5] and references therein. To compare the results, we applied all algorithms to several independent realizations of random graphs. We chose as family of random graphs the so-called scale-free graphs, as they are known to fit many situations of practical interest [51]. In the generation of random scale-free graphs, it is possible to set the minimum degree d_min of each node. To compare our method with the GFT definition proposed in [1], since the eigenvectors of an asymmetric matrix can be complex and the directed total variation GDV, as defined in (3), does not represent a valid metric for complex vectors, we restricted the comparison to undirected scale-free graphs, in which case the adjacency and Laplacian matrices are real and symmetric, so that their eigenvectors are real. In the sequel, we will use the notations GAV(X) := \sum_{k=1}^{N} GAV(x_k) and GQV(X) := \sum_{k=1}^{N} GQV(x_k) to denote, respectively, the total graph absolute and quadratic variation of a matrix X. In Fig. 7, we compare the following metrics: a) GAV(X*), derived by solving problem P through the SOC and PAMAL methods; b) GAV(V), where V are the eigenvectors of the adjacency matrix according to the GFT defined in (7); c) GAV(U), where U are the eigenvectors of the Laplacian matrix, assuming the GFT as in (5), which for undirected graphs is equivalent to the GFT defined in (10). More specifically, Fig. 7 shows the previous metrics vs. the minimum degree of the graph, averaged over 100 independent realizations of scale-free graphs of N = 20 nodes. As we can notice from Fig. 7, the bases built using the SOC and PAMAL algorithms yield a significantly lower total variation than the conventional bases built with either adjacency or Laplacian eigenvectors. This is primarily due to the fact that our optimization methods tend to assign constant values within each cluster. Finally, in Fig. 8 we compare the alternative basis vectors using the GQV as performance metric. So, in Fig. 8 we report the GQV(X*) metric derived from the SOC and PAMAL methods, together with GQV(V) and GQV(U) obtained, respectively, from the eigenvectors of the adjacency and the Laplacian matrix. Again, the results are averaged over 100 independent realizations of scale-free graphs, vs. the average minimum degree, under the same settings of Fig. 7. Interestingly, even if our basis vectors X* do not coincide with V or U, they provide the same GQV, within negligible numerical inaccuracies. Indeed, the invariance of the metric GQV(X) for any square, orthogonal matrix X can be easily proved from the equality GQV(X) = \sum_{k=1}^{N} x_k^T L x_k = trace(X^T L X), by observing that trace(X^T L X) = trace(L) for any orthogonal matrix X. Interestingly, this implies that, for undirected graphs, our orthogonal matrix X* can be obtained by applying an orthogonal transform to the Laplacian eigenvector basis.

Complexity issues. Clearly, looking at both SOC and PAMAL methods, complexity is a non-trivial issue which deserves further investigation, especially when the size of the graph increases. To get an idea of the computing time, in Fig. 9 we report the execution time of both SOC and PAMAL algorithms as a function of the number of vertices in the graph. The results have been obtained by running a non-compiled Matlab program, with no optimization of the parameters involved, setting ρ_1 = β = 20. The program ran on a laptop with an Intel Core i7-4500 processor (CPU 1.8-2.4 GHz). The graphs under test were generated as geometric random graphs with an equal percentage of directed links as N increases.
Fig. 6: Average directed variation (± the standard deviation) for the SOC and PAMAL methods vs. the iteration index m for the graph of Fig. 1a, averaging over 200 random initializations of the algorithms (two subplots: SOC algorithm and PAMAL algorithm; vertical axis GDV_m ± σ_m, horizontal axis iteration index m).

Fig. 7: Average absolute total variation versus the average minimum degree according to alternative GFT definitions for undirected scale-free graphs with N = 20 nodes (curves: GAV(X*) SOC, GAV(X*) PAMAL, GAV(V), GAV(U)).

Fig. 8: Average GQV versus the average minimum degree according to alternative GFT definitions for undirected scale-free graphs with N = 20 nodes (curves: GQV(X*) SOC, GQV(X*) PAMAL, GQV(V), GQV(U)).

Fig. 9: Execution time vs. the number of nodes for RGGs with 25% of directed links and β = ρ_1 = 20 (curves: SOC algorithm and PAMAL algorithm; execution time in minutes, log scale).
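For reference, a random geometric graph with a prescribed fraction of one-directional links, of the kind referred to in the caption of Fig. 9, can be generated along the following lines. The exact construction used for the experiments is not spelled out in the text, so the radius and the edge-orientation mechanism below are illustrative assumptions.

    import numpy as np

    def random_geometric_digraph(N, radius=0.3, directed_fraction=0.25, seed=0):
        # Nodes uniform in the unit square; undirected edges between nodes closer
        # than `radius`; then a fraction of the edges is made one-directional by
        # dropping one of the two orientations at random (illustrative choice).
        rng = np.random.default_rng(seed)
        pts = rng.random((N, 2))
        D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        A = ((D < radius) & (D > 0)).astype(float)
        iu, ju = np.triu_indices(N, 1)
        edges = np.array([(i, j) for i, j in zip(iu, ju) if A[i, j] > 0])
        k = int(directed_fraction * len(edges))
        for i, j in rng.permutation(edges)[:k]:
            if rng.random() < 0.5:
                A[i, j] = 0.0     # keep only the link from node i to node j
            else:
                A[j, i] = 0.0     # keep only the link from node j to node i
        return A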
Examples with real networks. As an application to real graphs, in Fig. 10 we consider the directed graph obtained from the street map of Rome, incorporating the true directions of the traffic lanes in the area around Mazzini square. The graph is composed of 239 nodes. Even though the scope of this paper is to propose a method to build a GFT basis, so that we do not dig further into applications, this is an example with interesting applications of GSP. The problem in this case is to build a map of vehicular traffic in a city, starting from a subset of measurements collected by road-side units or sent by cars equipped with ad hoc equipment. The problem can be interpreted as the reconstruction of the entire graph signal from a subset of samples, and then it builds on graph sampling theory [10]. In Fig. 11 we report some basis vectors obtained by using Algorithm 2 with ρ_1 = 10. We can observe that the basis vectors highlight clusters, while capturing the edges' directivity.

Fig. 10: Directed graph associated to the street map of Rome (Piazza Mazzini).

Balanced total variation. In some cases, the solution of the total variation problem in (12) can cut the graph into subsets of very different cardinality. As an extreme case, it is not uncommon to have a subset composed of only one node, with the other set containing all the rest of the network. To prevent such a behavior, Algorithm 4 aims at minimizing the balanced total variation. An example of its application to the graph of Fig. 10 is reported in Fig. 12, where we show some basis vectors computed using Algorithm 4. Comparing these vectors with the corresponding ones obtained with the PAMAL algorithm, see, e.g., Fig. 11, we can see how clusters of single nodes are now avoided.

VII. CONCLUSION

In this paper we have proposed an alternative approach to build an orthonormal basis for the Graph Fourier Transform (GFT). The approach considers the general case of a directed graph, and then it includes the undirected case as a particular example. The search method starts from the identification of an objective function and then looks for an orthonormal basis that minimizes that function. More specifically, motivated by the need to detect clustering behaviors in graph signals, we chose the cut size as objective function. We showed that this approach leads, without loss of optimality, to the minimization of a function that represents a directed total variation of graph signals, as it captures the edges' directivity. Interestingly, in the case of undirected graphs, this function converts into an ℓ1-norm total variation, which represents the graph (discrete) counterpart of the ℓ1-norm total variation that plays a key role in the classical Fourier Transform of continuous-time signals [17]. We compared our basis vectors with the eigenvectors of either the Laplacian or the adjacency matrix, assuming as performance metric either our graph absolute variation or the graph quadratic variation. As expected, our method outperforms the other methods when using the absolute variation, as it is built by minimizing that metric. However, what has been interesting to see is that our basis performs as well as the alternative bases when we assume the graph quadratic variation as performance metric. Before concluding, we wish to point out that, as always, our alternative approach to build a GFT basis has its own merits and shortcomings when compared to alternative approaches. For example, having restricted the search to the real domain, differently from available methods, our method fails to find the complex exponentials as the GFT basis in the case of circular graphs. Furthermore, other methods, like the ones in [1] starting from the identification of the adjacency matrix as the shift operator, are more suitable than our approach to devise a filtering theory over graphs.

APPENDIX

A. Closed-form solution for problem Q̃^{k,n}

In this section we provide a closed-form solution for the non-convex problem Q̃^{k,n}. This problem can be equivalently written as

P^{k,n} = \arg\min_{P ∈ R^{N×N}} g_{k,n−1}(P)
s.t. P^T P = I   (31)

where g_{k,n−1}(P) ≜ ⟨Λ^k, P − X^{k,n−1}⟩ + (ρ^k/2) ‖P − X^{k,n−1}‖_F^2 + (c_2^{k,n−1}/2) ‖P − P^{k,n−1}‖_F^2. Our proof consists of two steps: i) first, we find the stationary solutions by solving the KKT necessary conditions; ii) then, we prove that the resulting closed-form solution is a global minimum of the non-convex problem (31). The Lagrangian function L_P associated to (31) can be written as

L_P = ⟨Λ^k, P − X^{k,n−1}⟩ + (ρ^k/2) ‖P − X^{k,n−1}‖_F^2 + (c_2^{k,n−1}/2) ‖P − P^{k,n−1}‖_F^2 + ⟨Λ_1, P^T P − I⟩   (32)

where Λ_1 ∈ R^{N×N} is the multipliers matrix associated to the orthogonality constraint. The KKT conditions then become

a) ∇_P L_P = P[I(ρ^k + c_2^{k,n−1}) + 2Λ_1] − c_2^{k,n−1} P^{k,n−1} − ρ^k X^{k,n−1} + Λ^k = 0,
b) Λ_1 ⊥ P^T P − I = 0   (33)

where we chose Λ_1 = Λ_1^T. Hence, defining B ≜ I + 2Λ_1/(ρ^k + c_2^{k,n−1}), from equation a) one gets:

P B = F   (34)

with F ≜ (c_2^{k,n−1} P^{k,n−1} + ρ^k X^{k,n−1} − Λ^k)/(ρ^k + c_2^{k,n−1}). Let Q Σ T^T be the SVD decomposition of F. From (34), it turns out that

P B = Q Σ T^T   (35)

and, using the orthogonality condition b) in (33), it holds

B^T B = T Σ² T^T ⇒ B = T Σ T^T.   (36)

Therefore, replacing B in (35), we get

P T Σ T^T = Q Σ T^T ⇒ P = Q T^T.   (37)

It remains to prove that P* = P^{k,n} = Q T^T is a global minimum for problem (31). To this end, it is sufficient to show that

g_{k,n−1}(P*) ≤ g_{k,n−1}(P), ∀ P : P^T P = I   (38)

i.e., using the equalities ‖P*‖_F^2 = ‖P‖_F^2 = N, we have to prove that, ∀ P : P^T P = I, it results

trace(P*^T (Λ^k − ρ^k X^{k,n−1} − c_2^{k,n−1} P^{k,n−1})) ≤ trace(P^T (Λ^k − ρ^k X^{k,n−1} − c_2^{k,n−1} P^{k,n−1})).   (39)

Using the above definition of F, (39) reduces to

trace(P*^T F) ≥ trace(P^T F), ∀ P : P^T P = I   (40)

and since P* = Q T^T, the final inequality to hold true is

trace(Σ) ≥ trace(T^T P^T Q Σ), ∀ P : P^T P = I.   (41)
Fig. 11: Optimal basis vectors x_k, k = 3, 5, 17, 27, 29, 63, for Algorithm 2 and the graph in Fig. 10 (among the panel titles: GDV(x_3) = 0, GDV(x_5) = 0).
Fig. 12: Optimal basis vectors x_k, k = 2, . . . , 7, for Algorithm 4 and the graph in Fig. 10 (among the panel titles: GDV(x_2) = 0, GDV(x_3) = 0).
Define Z^T := T^T P^T Q, so that Z^T Z = I. Then, from (41) we get

trace(Σ) ≥ trace(Z^T Σ), ∀ Z : Z^T Z = I.   (42)

This last inequality holds because Σ_ii > 0 and Z_ii ≤ |Z_ii| ≤ 1, ∀ i, where the latter is implied by Z^T Z = I [40]. Additionally, Z_ii = 1, ∀ i, if and only if Z = I, so that the equality in (42) holds if and only if Z = I, or P* = Q T^T.

B. Proof of Theorem 1

For lack of space, we omit here the details of the proof, which proceeds using similar arguments as in the proof of Proposition 2.5 in [26]. However, to invoke this correspondence, we need to prove that the following properties hold true: i) the function L_k in (15) satisfies the Kurdyka-Łojasiewicz (K-Ł) property; ii) L_k is a coercive function. To prove point i), let us first introduce some definitions [52].

Definition 3: A semi-algebraic subset of R^n is a finite union of sets of the form

{x ∈ R^n : P_1(x) = 0, . . . , P_k(x) = 0, Q_1(x) > 0, . . . , Q_l(x) > 0}   (43)

where P_1, . . . , P_k and Q_1, . . . , Q_l are polynomials in n variables.

Definition 4: A function f : R^n → R is said to be semi-algebraic if its graph, defined as gph f := {(x, f(x)) | x ∈ R^n}, is a semi-algebraic set.

It is shown [cf. [42], Th. 3] that semi-algebraic functions satisfy the K-Ł property.

Definition 5: A function φ(x) satisfies the Kurdyka-Łojasiewicz (K-Ł) property at a point x̄ ∈ dom(∂φ) if there exists θ ∈ [0, 1) such that

|φ(x) − φ(x̄)|^θ / dist(0, ∂φ(x))   (44)

is bounded around x̄.

The global convergence of the PAM method established in [43] requires the objective function to satisfy the K-Ł property. Define W := (X, P) and consider the function L_k in (15), i.e.

L_k(W) = L(X, P, Λ^k; ρ^k) = f_1(X) + f_2(P) + g_k(X, P)   (45)

where f_1(X) = GDV(X), f_2(P) = δ_{St}(P) and g_k(X, P) = ⟨Λ^k, P − X⟩ + (ρ^k/2) ‖P − X‖_F^2. Observe that f_1(X) = \sum_{i,j=1}^{N} a_{ji} max(x_i − x_j, 0) is the weighted sum of the functions f_{ij}(x_i, x_j) = max(x_i − x_j, 0). Since a finite sum of semi-algebraic functions is also a semi-algebraic function, it is sufficient to show that f_{ij} is semi-algebraic. Assume, w.l.o.g., y_{ij} = x_i − x_j, so that z = f_{ij}(y_{ij}) = max(y_{ij}, 0). The graph of f_{ij} becomes

gph f_{ij} = {(y_{ij}, z) : z = y_{ij}, y_{ij} ≥ 0} ∪ {(y_{ij}, z) : z = 0, y_{ij} ≤ 0}

and, according to Definition 3, it is a semi-algebraic set. Then f_1(X), as a sum of semi-algebraic functions, is also semi-algebraic. Since f_2(P) and g_k(X, P) are semi-algebraic functions, it follows that L_k(W) is also semi-algebraic. It remains to prove point ii), i.e. to assess that L_k is a coercive function, that is, L_k(W) → ∞ when ‖W‖_∞ → ∞. Clearly, the term f_2(P) is coercive. The remaining terms in (45) can be written as

f_1(X) + g_k(X, P) = GDV(X) + (ρ^k/2) ⟨X, X⟩ − ⟨ρ^k P + Λ^k, X⟩ + ⟨Λ^k, P⟩ + (ρ^k/2) ‖P‖_F^2.

Since P ∈ St, it holds ‖P‖_F^2 = N. Thus, from the inequalities ⟨A, B⟩ ≥ −‖A‖_F ‖B‖_F and ‖B‖_F ≤ ‖B‖_1, it holds ⟨Λ^k, P⟩ ≥ −√N ‖Λ^k‖_1, so that one gets

f_1(X) + g_k(X, P) ≥ GDV(X) + (ρ^k/2) ⟨X, X⟩ − ρ^k ‖X‖_1 − ⟨Λ^k, X⟩ − √N ‖Λ^k‖_1 + (ρ^k N)/2

where we used the inequality ⟨ρ^k P, X⟩ ≤ ρ^k ‖X‖_1. Observe that the sequence {ρ^k}_{k∈N} is non-decreasing when γ > 1, so that ρ^k > ρ^1. Then the function f_1(X) + g_k(X, P) is coercive, GDV(X) + (ρ^k/2) ⟨X, X⟩ being a positive function.

REFERENCES

[1] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs," IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, Apr. 2013.
[2] S. K. Narang and A. Ortega, "Perfect reconstruction two-channel wavelet filterbanks for graph structured data," IEEE Trans. Signal Process., vol. 60, no. 6, pp. 2786–2799, 2012.
[3] ——, "Compact support biorthogonal wavelet filter banks for arbitrary undirected graphs," IEEE Trans. Signal Process., vol. 61, no. 19, pp. 4673–4685, 2013.
[4] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs: Frequency analysis," IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3042–3054, Jun. 2014.
[5] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, May 2013.
[6] D. K. Hammond, P. Vandergheynst, and R. Gribonval, "Wavelets on graphs via spectral graph theory," Appl. Comput. Harmon. Anal., vol. 30, pp. 129–150, 2011.
[7] S. K. Narang, G. Shen, and A. Ortega, "Unidirectional graph-based wavelet transforms for efficient data gathering in sensor networks," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Mar. 2010, pp. 2902–2905.
[8] I. Pesenson, "Sampling in Paley-Wiener spaces on combinatorial graphs," Trans. of the American Math. Society, vol. 360, no. 10, pp. 5603–5627, Oct. 2008.
[9] A. Agaskar and Y. M. Lu, "A spectral graph uncertainty principle," IEEE Trans. Inform. Theory, vol. 59, no. 7, pp. 4338–4356, Jul. 2013.
[10] M. Tsitsvero, S. Barbarossa, and P. Di Lorenzo, "Signals on graphs: Uncertainty principle and sampling," IEEE Trans. Signal Process., vol. 64, no. 18, pp. 4845–4860, Sep. 2016.
[11] M. Tsitsvero and S. Barbarossa, "On the degree of freedom of signals on graphs," in Proc. European Signal Process. Conf., Nice, Sep. 2015, pp. 1521–1525.
[12] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, "Discrete signal processing on graphs: Sampling theory," IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6510–6523, Dec. 2015.
[13] X. Zhu and M. Rabbat, "Approximating signals supported on graphs," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Mar. 2012, pp. 3921–3924.
[14] R. Singh, A. Chakraborty, and B. S. Manoj, "Graph Fourier transform based on directed Laplacian," in Proc. Int. Conf. Signal Process. Commun. (SPCOM), Jun. 2016, pp. 1–5.
[15] M. Püschel and J. M. F. Moura, "Algebraic signal processing theory: Foundation and 1-D time," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3572–3585, Aug. 2008.
[16] ——, "Algebraic signal processing theory: 1-D space," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3586–3599, Aug. 2008.
[17] S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way. Academic Press, 2009.
[18] F. Lozes, A. Elmoataz, and O. Lézoray, "Partial difference operators on weighted graphs for image processing on surfaces and point clouds," IEEE Trans. Image Process., vol. 23, no. 9, pp. 3896–3909, Sep. 2014.
[19] G. H. Golub and J. H. Wilkinson, "Ill-conditioned eigensystems and the computation of the Jordan canonical form," SIAM Review, vol. 18, no. 4, pp. 578–619, Oct. 1976.
[20] B. Girault, "Signal Processing on Graphs - Contributions to an Emerging Field," Theses, Ecole normale supérieure de Lyon - ENS LYON, Dec. 2015. [Online]. Available: https://tel.archives-ouvertes.fr/tel-01256044
[21] A. Gadde, A. Anis, and A. Ortega, "Active semi-supervised learning using sampling theory for graph signals," in Proc. 20th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, ser. KDD '14. New York, NY, USA: ACM, 2014, pp. 492–501.
[22] A. Anis, A. E. Gamal, S. Avestimehr, and A. Ortega, "Asymptotic justification of bandlimited interpolation of graph signals for semi-supervised learning," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Apr. 2015, pp. 5461–5465.
[23] L. Lovász, "Submodular functions and convexity," in A. Bachem et al. (eds.), Math. Program. The State of the Art, Springer Berlin Heidelberg, pp. 235–257, 1983.
[24] F. Bach, "Learning with submodular functions: A convex optimization perspective," Foundations and Trends in Machine Learning, vol. 6, no. 2–3, pp. 145–373, 2013.
[25] R. Lai and S. Osher, "A splitting method for orthogonality constrained problems," J. Scientific Computing, vol. 58, no. 2, pp. 431–449, Feb. 2014.
[26] W. Chen, H. Ji, and Y. You, "An augmented Lagrangian method for ℓ1-regularized optimization problems with orthogonality constraints," SIAM J. Scientific Computing, vol. 38, no. 4, pp. B570–B592, 2016.
[27] X. Bresson, T. Laurent, D. Uminsky, and J. H. von Brecht, "Convergence and energy landscape for Cheeger cut clustering," in Advances in Neural Inform. Process. Systems (NIPS), 2012, pp. 1394–1402.
[28] M. Newman, Networks: An Introduction. New York, NY, USA: Oxford Univ. Press, 2010.
[29] L. Jost, S. Setzer, and M. Hein, "Nonlinear eigenproblems in data analysis: Balanced graph cuts and the ratioDCA-Prox," in Extraction of Quantifiable Information from Complex Systems, Springer Intern. Publishing, vol. 102, pp. 263–279, 2014.
[30] F. R. K. Chung, Spectral Graph Theory. American Math. Soc., 1997.
[31] J. Nocedal and S. J. Wright, Numerical Optimization. Springer, 2006.
[32] F. Bethuel, H. Brezis, and F. Hélein, "Asymptotics for the minimization of a Ginzburg-Landau functional," Calculus of Variations and Partial Differential Equations, vol. 1, no. 2, pp. 123–148, 1993.
[33] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. Belmont, Massachusetts: Athena Scientific, 1999.
[34] M. Fortin and R. Glowinski, Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems. North Holland, 2000, vol. 15.
[35] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2010, vol. 3, no. 1.
[36] R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics. SIAM, 1989.
[37] W. Yin, S. Osher, D. Goldfarb, and J. Darbon, "Bregman iterative algorithms for ℓ1-minimization with application to compressed sensing," SIAM J. Imag. Sciences, vol. 1, pp. 143–168, 2008.
[38] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, "An iterative regularization method for total variation-based image restoration," Multiscale Model. Simul., vol. 4, no. 2, pp. 460–489, 2005.
[39] T. Goldstein and S. Osher, "The split Bregman method for ℓ1-regularized problems," SIAM J. Imag. Sciences, vol. 2, no. 2, pp. 323–343, 2009.
[40] J. H. Manton, "Optimization algorithms exploiting unitary constraints," IEEE Trans. Signal Process., vol. 50, no. 3, pp. 635–650, Mar. 2002.
[41] R. Andreani, E. G. Birgin, J. M. Martínez, and M. L. Schuverdt, "On augmented Lagrangian methods with general lower-level constraints," SIAM J. Optimiz., vol. 18, no. 4, pp. 1286–1309, 2007.
[42] J. Bolte, S. Sabach, and M. Teboulle, "Proximal alternating linearized minimization for nonconvex and nonsmooth problems," Math. Program., vol. 146, no. 1–2, pp. 459–494, Aug. 2014.
[43] H. Attouch, J. Bolte, and B. F. Svaiter, "Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods," Math. Program., vol. 137, no. 1–2, pp. 91–129, Feb. 2013.
[44] E. G. Birgin, D. Fernández, and J. M. Martínez, "On the boundedness of penalty parameters in an augmented Lagrangian method with constrained subproblems," Optimization Methods and Software, vol. 27, pp. 1001–1024, 2012.
[45] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, Aug. 2000.
[46] M. Hein and S. Setzer, "Beyond spectral clustering - tight relaxations of balanced graph cuts," in Advances in Neural Inform. Process. Systems (NIPS), 2011, pp. 2366–2374.
[47] J. Cheeger, "A lower bound for the smallest eigenvalue of the Laplacian," Problems in Analysis, R. C. Gunning, ed., Princeton Univ. Press, pp. 195–199, 1970.
[48] M. Hein and T. Bühler, "An inverse power method for nonlinear eigenproblems with applications in 1-spectral clustering and sparse PCA," in Advances in Neural Inform. Process. Systems (NIPS), 2010, pp. 847–855.
[49] A. Szlam and X. Bresson, "Total variation and Cheeger cuts," in Proc. 27th Int. Conf. on Machine Learning (ICML), 2010, pp. 1039–1046.
[50] S. Boyd and N. Parikh, Proximal Algorithms. Foundations and Trends in Optimization, 2013, vol. 1, no. 3.
[51] R. Albert and A.-L. Barabási, "Statistical mechanics of complex networks," Rev. Mod. Phys., pp. 47–97, 2002.
[52] J. Bochnak, M. Coste, and M. F. Roy, Real Algebraic Geometry. Springer-Verlag, Berlin, 1998.