Understanding the basis of graph signal processing via an intuitive example-driven approach
language between graph signals, defined on irregular signal domains, and some of the most fundamental paradigms in DSP, such as spectral analysis of multichannel signals, system transfer function, digital filter design, parameter estimation,

date structured but often incomplete information,
• New physically meaningful frameworks, specifically tailored for heterogeneous data sources, are required, and
• Trade-offs between performance and numerical require-
y = x + Ax, (4)
Sidebar 1: Graph Topology (Edges and Weights)
While in classic graph theory the graphs are typically given (e.g., in various computer, social, road, transportation, and power networks), oftentimes the first step in graph signal processing is to employ background knowledge of signal generating mechanisms in order to define the graph as a signal domain. This poses a number of challenges, e.g., while the data sensing points (graph vertices) are usually well defined in advance, their connectivity (graph edges) is often not available. In other words, the data domain definition within the graph signal paradigm represents a part of the problem itself, and has to be determined based on the properties of the sensing positions or features of the acquired set of data. All in all, the definition of an appropriate graph structure is a prerequisite for physically meaningful and computationally efficient graph signal processing applications.
Three important classes of problems regarding the definition of graph edges are:
• Geometry of the vertex positions: The distances between vertex positions play a crucial role in establishing relations between the sensed data. In many physical processes, the presence of edges and their associated connecting weights is defined based on the vertex distances. An exponential function of the Euclidean distance between vertices, $r_{mn}$, may be used, where for a given distance threshold, $\tau$,
$$W_{mn} = e^{-r_{mn}^2/\alpha} \quad \text{or} \quad W_{mn} = e^{-r_{mn}/\alpha}$$
if $r_{mn} < \tau$ and $W_{mn} = 0$ for $r_{mn} \geq \tau$. This form has been used in the graph in Fig. 2, whereby the altitude difference, $h_{mn}$, was accounted for as $W_{mn} = e^{-r_{mn}/\alpha} e^{-h_{mn}/\beta}$.
• Physically well defined relations among the sensing positions: Examples include electric circuits, linear heat transfer systems, spring-mass systems, and various forms of networks like social, computer or power networks. In these cases, the edge weights are given as a part of problem definition.
• Data similarity dictates the underlying graph topology: This scenario is the most common in image and biomedical signal processing (see Sidebar 5). Various approaches and metrics can be used to define data similarity, including the correlation matrix between the signals at various vertices or the corresponding inverse covariance (precision) matrix, combined with the signal smoothness and the edge sparsity conditions. Learning a graph (its edges) based on the set of the available data is an interesting and currently extensively studied research area.
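The geometry-based weighting described in this sidebar can be sketched as follows. This is a minimal illustration and not the procedure used for Fig. 2: the vertex coordinates, the decay constant `alpha`, the distance threshold `tau`, the exclusion of self-loops, and the function name `distance_weights` are all assumptions made for the example.

```python
import numpy as np

def distance_weights(positions, alpha, tau, squared=False):
    """Weight matrix from vertex geometry.

    W[m, n] = exp(-r_mn / alpha), or exp(-r_mn**2 / alpha) if squared=True,
    when the Euclidean distance r_mn is below the threshold tau, else 0.
    """
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)            # pairwise distances r_mn
    W = np.exp(-(r**2 if squared else r) / alpha)
    W[r >= tau] = 0.0                             # no edge at or beyond the threshold
    np.fill_diagonal(W, 0.0)                      # no self-loops (an assumption)
    return W

# Hypothetical sensor coordinates (not taken from the source).
positions = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
W = distance_weights(positions, alpha=1.0, tau=2.0)
L = np.diag(W.sum(axis=1)) - W                    # graph Laplacian L = D - W
```

With these coordinates the fourth vertex lies farther than `tau` from all others, so it ends up isolated, which shows how the threshold controls edge sparsity.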
where, by definition $S^0 = I$, while $h_0, h_1, \ldots, h_{M-1}$ are the system coefficients to be found (see Section IX). Notice that for the directed and unweighted line graph in Fig. 1b) (bottom), the system on a graph in (8) reduces to the well known standard Finite Impulse Response (FIR) filter, given by
$$y(n) = h_0 x(n) + h_1 x(n-1) + \cdots + h_{M-1} x(n-M+1). \tag{9}$$
Remark 4: The above established link between the classical transfer function of a physical system and its graph-theoretic counterpart may serve to promote new algorithmic approaches, which stem from signal processing, into many application scenarios that are directly considered as graphs.
Observe that the Laplacian operator applied on a signal, $Lx$, can be considered as a combination of the scaled original signal, $Dx$, and its weighted shifted version, $Wx$, since $Lx = Dx - Wx$. A system defined using the graph Laplacian is obtained from (8) by replacing $S = L$, and has the form
$$y = L^0 x + h_1 L^1 x + \cdots + h_{M-1} L^{M-1} x \tag{10}$$

For an unweighted graph, the adjacency matrix, $A$, is commonly used as a shift matrix, $S$, while the Laplacian matrix, $L = D - W$, is used to define a shift on a weighted graph.
Properties of a system on a graph: Following the above discussion, it is now possible to link the properties of linear systems with those of systems on a graph. From equations (8)-(12) the system on a graph is said to be:
• Linear, if
$$H(S)(a_1 x_1 + a_2 x_2) = a_1 y_1 + a_2 y_2.$$
• Shift invariant, if
$$H(S)(Sx) = S(H(S)x).$$
Remark 6: A system on a graph, defined by
$$H(S) = h_0 S^0 + h_1 S^1 + \cdots + h_{M-1} S^{M-1} \tag{13}$$
is linear and shift invariant, since the matrix multiplication of the square weighting matrices is associative, $S(SS) = (SS)S$, that is, $S S^m = S^m S$.
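The properties above are easy to check numerically. The sketch below builds $H(S) = h_0 S^0 + \cdots + h_{M-1} S^{M-1}$ for a directed, unweighted line graph and tests linearity, shift invariance, and the reduction to the FIR form in (9). The graph size, the coefficient values, and the helper name `graph_system` are illustrative assumptions.

```python
import numpy as np

def graph_system(S, h):
    """Return H(S) = h[0] S^0 + h[1] S^1 + ... + h[M-1] S^(M-1)."""
    H = np.zeros_like(S, dtype=float)
    S_power = np.eye(S.shape[0])
    for hm in h:
        H += hm * S_power
        S_power = S_power @ S
    return H

N = 6
# Directed, unweighted line graph: (A x)(n) = x(n - 1), with x(-1) taken as 0.
A = np.diag(np.ones(N - 1), k=-1)
h = np.array([0.5, 0.3, 0.2])       # hypothetical system coefficients
H = graph_system(A, h)

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(N), rng.standard_normal(N)

# Linearity: H(a1 x1 + a2 x2) = a1 H x1 + a2 H x2.
linear = np.allclose(H @ (2.0 * x1 - 3.0 * x2), 2.0 * (H @ x1) - 3.0 * (H @ x2))
# Shift invariance: H(S x) = S(H x).
shift_invariant = np.allclose(H @ (A @ x1), A @ (H @ x1))
# On the line graph the output equals the FIR filter y(n) = sum_m h_m x(n - m).
matches_fir = np.allclose(H @ x1, np.convolve(h, x1)[:N])
```

All three flags evaluate to `True`; replacing `A` with any other square shift matrix keeps the first two checks valid, while the FIR equivalence is specific to the line graph.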
therefore allows us to always produce an unbiased estimate
of a constant c, that is, if x = c then y = c, since Lc = 0.
$$E_x = \mathbf{x}\mathbf{L}\mathbf{x}^T = \frac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N} W_{nm}\big(x(n) - x(m)\big)^2$$
and can be used to define signal smoothness since small
graph signals.
Since the minimum of the quadratic form $\mathbf{x}\mathbf{L}\mathbf{x}^T$ corre-
VII. GRAPH FOURIER TRANSFORM
While classic spectral analysis is performed in the Fourier domain, spectral representations of graph signals employ either the adjacency/weighting matrix or the graph Laplacian eigenvalue decomposition. For the latter case we have
$$\mathbf{L} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{-1},$$

signal, $\mathbf{x}$, onto the $k$-th eigenvector, $\mathbf{u}_k \in \mathbf{U}$, that is
$$X(k) = \sum_{n=1}^{N} x(n) u_k(n). \tag{15}$$
The inverse graph Fourier transform is then straightforwardly obtained as
$$\mathbf{x} = \mathbf{U}\mathbf{X} \tag{16}$$
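The transform pair in (15) and (16) can be illustrated on a small undirected graph, for which the Laplacian is symmetric and $\mathbf{U}^{-1} = \mathbf{U}^T$. The graph, the signal values, and the helper names `gft` and `igft` are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical undirected, weighted graph with 4 vertices.
W = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.5, 1.0, 0.0, 2.0],
              [0.0, 0.0, 2.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W            # L = D - W

# Eigendecomposition L = U diag(lam) U^T (eigh: symmetric input, ascending eigenvalues).
lam, U = np.linalg.eigh(L)

def gft(x):
    """X(k) = sum_n x(n) u_k(n), i.e. X = U^T x for an orthonormal U."""
    return U.T @ x

def igft(X):
    """x = U X."""
    return U @ X

x = np.array([1.0, 2.0, 0.0, -1.0])
X = gft(x)
x_rec = igft(X)                           # recovers x up to rounding
```

Because the graph is connected, the smallest eigenvalue is zero and its eigenvector is constant, so `X[0]` carries (up to sign) the scaled mean of the signal.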
In the case of a circular graph, the graph Fourier transform reduces to the standard discrete Fourier transform (DFT). For this reason, the transform in (15) is referred to as the Graph Fourier transform (GFT).
Classic spectral analysis can thus be considered as a special case of graph signal spectral analysis, with the adjacency matrix defined on an unweighted circular directed graph (a line graph with the connected last and first vertex), when $\mathbf{u}_k = [1, e^{j2\pi k/N}, \ldots, e^{j2\pi(N-1)k/N}]^T/\sqrt{N}$. This becomes obvious by recognizing that the eigenvalues of a directed unweighted circular graph, $\lambda_k = e^{-j2\pi k/N}$, are easily obtained as a solution of the eigenvalue/eigenvector (EVD) relation $\mathbf{A}\mathbf{u}_k = \lambda_k \mathbf{u}_k$. For a vertex $n$, this relation is of the form $u_k(n-1) = \lambda_k u_k(n)$. The previous vector elements $u_k(n)$ and eigenvalues $\lambda_k$ are the solutions of this difference equation. It can be shown that the eigenvectors of the graph

$\mathbf{x} = -160\mathbf{u}_1 + 16\mathbf{u}_2 - 8\mathbf{u}_3 - 40\mathbf{u}_4 + 16\mathbf{u}_5 - 24\mathbf{u}_6 + \varepsilon(n)$, where the random Gaussian noise, $\varepsilon(n)$, had standard deviation $\sigma_\varepsilon = 4$.

VIII. SPECTRAL DOMAIN OF A SYSTEM ON GRAPHS
Consider a system on a graph, as in (10), defined by its Laplacian matrix, given by
$$\mathbf{y} = \sum_{m=0}^{M-1} h_m \mathbf{L}^m \mathbf{x}. \tag{18}$$
Upon employing the eigen-domain (graph spectral) representation of the Laplacian matrix, $\mathbf{L} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{-1}$, we have
$$\mathbf{y} = \sum_{m=0}^{M-1} h_m \mathbf{U}\boldsymbol{\Lambda}^m \mathbf{U}^{-1}\mathbf{x} = \mathbf{U} H(\boldsymbol{\Lambda})\mathbf{U}^{-1}\mathbf{x}, \tag{19}$$
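A quick numerical check of the circular-graph special case, and of the spectral form in (19) with the adjacency matrix of a directed, unweighted circular graph used as the shift in place of the Laplacian, is sketched below. The size `N`, the coefficients `h`, and the variable names are assumptions for illustration; `np.fft.fft` serves only as a reference for the classical DFT.

```python
import numpy as np

N = 8
n = np.arange(N)

# Directed, unweighted circular graph: (A x)(n) = x((n - 1) mod N).
A = np.roll(np.eye(N), 1, axis=0)

# DFT basis vectors u_k(n) = exp(j 2 pi n k / N) / sqrt(N) as columns of U.
U = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)
lam = np.exp(-2j * np.pi * n / N)            # eigenvalues lambda_k

is_eigenbasis = np.allclose(A @ U, U * lam)  # A u_k = lambda_k u_k for every k

rng = np.random.default_rng(1)
x = rng.standard_normal(N)
X = U.conj().T @ x                           # GFT, since U^{-1} = U^H here
gft_is_dft = np.allclose(X, np.fft.fft(x) / np.sqrt(N))

# Vertex-domain system y = sum_m h_m A^m x versus its spectral form U H(Lambda) U^{-1} x.
h = np.array([0.5, 0.3, 0.2])                # hypothetical coefficients
y_vertex = sum(hm * np.linalg.matrix_power(A, m) @ x for m, hm in enumerate(h))
H_lam = sum(hm * lam**m for m, hm in enumerate(h))   # H(lambda_k)
y_spectral = U @ (H_lam * X)
forms_agree = np.allclose(y_vertex, y_spectral)
```

Here `H_lam` coincides with the zero-padded DFT of `h`, which is the classical frequency response of the FIR filter in (9).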
The classic spectral transfer function for (9) is then obtained by using the adjacency matrix of an unweighted directed circular graph whose eigenvalues are $\lambda_k = e^{-j2\pi k/N}$.

IX. SPECTRAL DOMAIN FILTER DESIGN
Consider a desired graph transfer function, $G(\boldsymbol{\Lambda})$. Like in classic signal processing, a system with this transfer function can be implemented either in the spectral domain or in the vertex domain.
The spectral domain implementation is straightforward and can be performed in the following three steps:
1) Calculate the GFT of the input graph signal $\mathbf{X} = \mathbf{U}^{-1}\mathbf{x}$,
2) Multiply the GFT of the input graph signal with transfer function $G(\boldsymbol{\Lambda})$ to obtain $\mathbf{Y} = G(\boldsymbol{\Lambda})\mathbf{X}$, and
3) Calculate the output graph signal as the inverse graph Fourier transform of $\mathbf{Y}$ to yield $\mathbf{y} = \mathbf{U}\mathbf{Y}$.
Notice that this procedure may be computationally very demanding for large graphs where it may be easier to implement the desired filter (or its close approximation) in the vertex domain, in analogy to the time domain in the classical approach. This means that we have to find the coefficients, $h_0, h_1, \ldots, h_{M-1}$ in (8), such that its spectral representation, $H(\boldsymbol{\Lambda})$, is equal (or at least as close as possible) to the desired $G(\boldsymbol{\Lambda})$.
In other words, the transfer function of the vertex domain system in (20), given by $H(\lambda_k) = h_0 + h_1\lambda_k^1 + \cdots + h_{M-1}\lambda_k^{M-1}$, should be equal to the desired transfer function, $G(\lambda_k)$, for each spectral index, $k$. This condition leads to a system of linear equations
$$\begin{aligned}
h_0 + h_1\lambda_1^1 + \cdots + h_{M-1}\lambda_1^{M-1} &= G(\lambda_1)\\
h_0 + h_1\lambda_2^1 + \cdots + h_{M-1}\lambda_2^{M-1} &= G(\lambda_2)\\
&\;\;\vdots\\
h_0 + h_1\lambda_N^1 + \cdots + h_{M-1}\lambda_N^{M-1} &= G(\lambda_N).
\end{aligned} \tag{22}$$
The matrix form of this system is given by
$$\mathbf{V}_\lambda \mathbf{h} = \mathbf{g}, \tag{23}$$
where $\mathbf{V}_\lambda$ is a Vandermonde matrix formed of the eigenvalues, $\lambda_k$, while $\mathbf{h} = [h_0, h_1, \ldots, h_{M-1}]^T$ is the vector of system coefficients that we wish to estimate, and
$$\mathbf{g} = [G(\lambda_1), G(\lambda_2), \ldots, G(\lambda_N)]^T = \mathrm{diag}(G(\boldsymbol{\Lambda})).$$
The system order $M$ is typically significantly lower than the number of equations, $N$, in (22). For such an overdetermined case, the least-squares approximation of $\mathbf{h}$ is obtained by minimizing the squared error, $e^2 = \|\mathbf{V}_\lambda\mathbf{h} - \mathbf{g}\|_2^2$. Like in standard least-squares, the solution is obtained by a direct minimization, $\partial e^2/\partial \mathbf{h}^T = 0$, to yield
$$\hat{\mathbf{h}} = (\mathbf{V}_\lambda^T\mathbf{V}_\lambda)^{-1}\mathbf{V}_\lambda^T\mathbf{g} = \mathrm{pinv}(\mathbf{V}_\lambda)\mathbf{g}. \tag{24}$$
The so obtained solution, $\hat{\mathbf{h}}$, therefore represents the mean square error minimizer for $\mathbf{V}_\lambda\mathbf{h} = \mathbf{g}$. Notice that this solution may not satisfy $\mathbf{V}_\lambda\mathbf{h} = \mathbf{g}$, in which case the coefficients $\hat{\mathbf{g}}$ (its spectrum $\hat{G}(\boldsymbol{\Lambda})$) may be used, that is
$$\mathbf{V}_\lambda\hat{\mathbf{h}} = \hat{\mathbf{g}}.$$
Such a solution, in general, differs from the desired system coefficients $\mathbf{g}$ (its spectrum $G(\boldsymbol{\Lambda})$).
Example: Consider the graph signal from Fig. 3. The task is to design a graph filter whose frequency response is $g(\lambda_k) = \exp(-\lambda_k)$ and to then filter the graph signal using this spectral domain graph filter. For $M = 4$, the corresponding system coefficients can be found to be $h_0 = 0.9606$, $h_1 = -0.7453$, $h_2 = 0.1936$, and $h_3 = -0.0162$. Upon signal filtering using the so defined graph transfer function, the output signal-to-noise ratio was $SNR = 21.74$ dB, that is, a 7.54 dB improvement over the original signal-to-noise ratio $SNR_0 = 14.2$ dB.
More detail on the solution of the system in (22) and (23) is provided in Sidebar 4.

X. OPTIMAL DENOISING
Consider a measurement, as in the temperature measurement scenario in Fig. 1, which is composed of a slow-varying desired signal, $\mathbf{s}$, and a superimposed fast changing disturbance, $\boldsymbol{\varepsilon}$, to give
$$\mathbf{x} = \mathbf{s} + \boldsymbol{\varepsilon}.$$
The aim is to design a graph filter for disturbance suppression (denoising), the output of which is denoted by $\mathbf{y}$ [11].
The optimal denoising task can then be defined through a minimization of the cost function
$$J = \frac{1}{2}\|\mathbf{y} - \mathbf{x}\|_2^2 + \alpha\,\mathbf{y}^T\mathbf{L}\mathbf{y}. \tag{25}$$
The minimization of the first term, $\frac{1}{2}\|\mathbf{y} - \mathbf{x}\|_2^2$, forces the output signal, $\mathbf{y}$, to be as close as possible, in terms of the minimum residual disturbance power, to the available observations, $\mathbf{x}$. As mentioned before, the second term, $\mathbf{y}^T\mathbf{L}\mathbf{y}$, represents a measure of smoothness of the graph filter output, $\mathbf{y}$. For more detail on promoting smoothness of a graph signal, see Sidebar 2. The parameter $\alpha$ models a balance between the closeness of the output, $\mathbf{y}$, to the observed data, $\mathbf{x}$, and the smoothness of the output estimate $\mathbf{y}$. While the problem in (25) could be expressed through a constrained Lagrangian optimization, we choose to focus more on the graph theoretic issues and hence adopt a simpler option whereby the mixing parameter $\alpha$ is chosen empirically.
The solution to this minimization problem follows from
$$\frac{\partial J}{\partial \mathbf{y}^T} = \mathbf{y} - \mathbf{x} + 2\alpha\mathbf{L}\mathbf{y} = 0$$
and results in a smoothing optimal denoiser in the form
$$\mathbf{y} = (\mathbf{I} + 2\alpha\mathbf{L})^{-1}\mathbf{x}.$$
The Laplacian spectral domain form of this relation is
$$\mathbf{Y} = (\mathbf{I} + 2\alpha\boldsymbol{\Lambda})^{-1}\mathbf{X},$$
with the corresponding graph filter transfer function
$$H(\lambda_k) = \frac{1}{1 + 2\alpha\lambda_k}.$$
For a small $\alpha$, $H(\lambda_k) \approx 1$ and $\mathbf{y} \approx \mathbf{x}$, while for a large $\alpha$, $H(\lambda_k) \approx \delta(k)$ and $\mathbf{y} \approx \text{const.}$, which forces $\mathbf{y}$ to be maximally smooth (a constant, without any variation). Using $\alpha = 4$, the obtained output signal-to-noise ratio for the graph signal from Fig. 3 was $SNR = 26$ dB, an 11.8 dB improvement over the original $SNR_0 = 14.2$ dB.
Remark 8: There are many cases when the graph topology is unknown, so that the graph structure, i.e., the Laplacian (graph edges and their weights) is also unknown. To this end, we may employ a class of methods for graph topology learning,
Sidebar 4: Comments on the Graph Filter in (22)
Consider the following cases:
1) All the eigenvalues of L are distinct:
   a) For $M = N$, the solution is unique.
   b) For $M < N$ (overdetermined system), the mean square sense solution is obtained.
2) Some of the eigenvalues are of a degree higher than one, so that the system reduces to $N_m < N$ linear equations.
   a) For $N_m < M \leq N$ (underdetermined system), $(M - N_m)$ filter coefficients are free variables. An infinite number of equivalent filters is obtained.
   b) For $M = N_m$, the solution is unique.
   c) For $M < N_m$ (overdetermined system), the mean square sense solution is obtained.
3) Any filter of an order $M > N_m$ has a unique equivalent filter whose order is at most $N_m$. Such equivalence can be obtained by setting the free variables to zero, $h_i = 0$ for $i = N_m, N_m + 1, \ldots, N - 1$.
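The design procedure in (22)-(24), cases 1a and 1b of this sidebar, and the optimal denoiser $\mathbf{y} = (\mathbf{I} + 2\alpha\mathbf{L})^{-1}\mathbf{x}$ can be explored with the sketch below. It uses a path graph rather than the graph from Fig. 3, so the coefficients it produces are not those quoted in the example; the graph, the random test signal, the value of `alpha`, and all function names are assumptions.

```python
import numpy as np

def path_laplacian(n):
    """Laplacian L = D - W of an undirected, unweighted path graph."""
    W = np.diag(np.ones(n - 1), k=1)
    W = W + W.T
    return np.diag(W.sum(axis=1)) - W

def design_filter(lam, G, M):
    """Least-squares solution of V h = g, with V[k, m] = lam[k]**m."""
    V = np.vander(lam, N=M, increasing=True)
    h, *_ = np.linalg.lstsq(V, G(lam), rcond=None)
    return h

def vertex_filter(L, h, x):
    """y = h0 x + h1 L x + ... + h_{M-1} L^{M-1} x, using only products with L."""
    y = np.zeros_like(x)
    Lm_x = x.copy()
    for hm in h:
        y += hm * Lm_x
        Lm_x = L @ Lm_x
    return y

def spectral_filter(U, lam, G, x):
    """Three steps: X = U^T x, Y = G(lam) X, y = U Y (U orthonormal)."""
    return U @ (G(lam) * (U.T @ x))

N = 8
L = path_laplacian(N)
lam, U = np.linalg.eigh(L)          # a path graph has distinct eigenvalues
rng = np.random.default_rng(0)
x = rng.standard_normal(N)

G = lambda l: np.exp(-l)            # desired response, as in the example
y_ref = spectral_filter(U, lam, G, x)
err_M4 = np.max(np.abs(vertex_filter(L, design_filter(lam, G, 4), x) - y_ref))
err_MN = np.max(np.abs(vertex_filter(L, design_filter(lam, G, N), x) - y_ref))

# Optimal denoiser: vertex-domain solve versus H(lam) = 1 / (1 + 2 alpha lam).
alpha = 4.0
y_solve = np.linalg.solve(np.eye(N) + 2 * alpha * L, x)
y_spec = spectral_filter(U, lam, lambda l: 1.0 / (1.0 + 2 * alpha * l), x)
```

With $M = N$ and distinct eigenvalues the polynomial filter reproduces the spectral-domain output up to rounding (case 1a), while $M = 4$ gives only a least-squares approximation (case 1b). The two denoiser outputs agree, and both preserve the sum of the signal because $\mathbf{L}$ annihilates constants.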
based on the minimization of the cost function in (25) with respect to both the Laplacian, L, and the output signal, y, with additional (commonly sparsity) constraints imposed on the Laplacian values.

XI. CURRENT GRAPH SIGNAL PROCESSING CHALLENGES
Current research is mainly focused on graphs themselves, for example, on reducing the complexity of calculation in very large graphs, including downsampling, multirate analysis, compressive sensing, graph segmentation, non-linear GSP, robust GSP, deep learning architectures for graph signals, multidimensional graph signals, and vertex-varying and vertex-frequency analysis.

XII. WHAT WE HAVE LEARNED
Natural signals (speech, biomedical, video) reside over irregular domains and are, unlike the signals in communications, not adequately processed using, e.g., standard harmonic analyses. While Data Analytics are heavily dependent on advances in DSP, neither the EE graduates worldwide nor practical data analysts are yet best prepared to employ graph algorithms in their future jobs. Our aim has been to fill this void by providing an example-driven platform to introduce graphs and their properties through the well understood notions of transfer functions, Fourier transform, and digital filtering.
While both a graph with N vertices and a classical discrete time signal with N samples can be viewed as N-dimensional vectors, structured graphs are much richer irregular domains which convey information about both the signal generation and propagation mechanisms. This allows us to employ intuition and our know-how from Euclidean domains to revisit basic dimensionality reduction operations, such as coarse graining of graphs (cf. standard downsampling). In addition, in the vertex domain a number of different distances (shortest-path, resistance, diffusion) have useful properties which can be employed to maintain data integrity throughout the processing, storage, communication and analysis stages, as the connectivities and edge weights are either dictated by the physics of the problem at hand or are inferred from the data. This particularly facilitates maintaining control and intuition over distributed operations throughout the processing chain.
It is our hope that this lecture note has helped to demystify graph signal processing for students and educators, together with empowering practitioners with enhanced intuition in graph-theoretic design and optimization. This material may also serve as a vehicle to seamlessly merge curricula in Electrical Engineering and Computing. The generic and physically meaningful nature of this example-driven Lecture Note is also likely to promote intellectual curiosity and serve as a platform to explore the numerous opportunities in manifold applications in our ever-growing interconnected world, facilitated by the Internet of Things.

ACKNOWLEDGMENTS
We are privileged to have had the help and advice of one of the pioneers in Graph Theory, Professor Nicos Christofides. We are grateful for his time, his incisive comments and valuable advice. We would also like to express our sincere gratitude to the students in our respective postgraduate courses, for their feedback on the material taught based on this Lecture Note.

AUTHORS
Ljubiša Stanković, FIEEE, ([email protected]) is a professor at the University of Montenegro. His research interests include time-frequency analysis, compressive sensing, and graph signal processing. He is a vice-president of the National Academy of Sciences and Arts of Montenegro (CANU) and a member of the European Academy of Sciences and Arts. Prof. Stanković is a recipient of the 2017 EURASIP Best Journal Paper Award.
Danilo P. Mandic, FIEEE, ([email protected]) is a professor of signal processing at Imperial College London, United Kingdom. He is a member of the IEEE Signal Processing Society Education Technical Committee, and has received the President’s Award for Excellence in Postgraduate Supervision at Imperial College. He is a recipient of the 2018 Best Paper Award in IEEE Signal Processing Magazine.
Miloš Daković ([email protected]) is a professor at the University of Montenegro. His research interests include graph signal processing and time-frequency analysis.
Ilya Kisil ([email protected]) is a Ph.D. candidate at Imperial College London. His research interests include tensor decompositions, big data, efficient software for large scale problems, and graph signal processing.
Ervin Sejdić, SMIEEE, ([email protected]) is an assistant professor at the University of Pittsburgh, USA. His research interests include biomedical signal processing, rehabilitation engineering, and neuroscience. He received the USA Presidential Early Career Award for Scientists and Engineers in 2016.
Anthony G. Constantinides, LFIEEE, (a.constantinides@imperial.ac.uk) is an emeritus professor of signal processing in the Department of Electrical and Electronic Engineering at Imperial College London, United Kingdom.
REFERENCES
[1] N. Christofides, “Graph theory: An algorithmic approach,” Academic Press, 1975.
[2] F. Afrati and A. G. Constantinides, “The use of graph theory in binary block code construction,” in Proceedings of the International Conference on Digital Signal Processing, pp. 228-233, Florence, Italy, 30 August - 2 September, 1978.
[3] O. J. Morris, M. de J. Lee, and A. G. Constantinides, “Graph theory for image analysis: An approach based on the shortest spanning tree,” IEE Proceedings F-Communications, Radar and Signal Processing, vol. 133, no. 2, pp. 146-152, 1986.
[4] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, “Discrete signal processing on graphs: Sampling theory,” IEEE Trans. on Signal Processing, vol. 63, no. 24, pp. 6510-6523, Dec. 15, 2015.
[5] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs,” IEEE Transactions on Signal Processing, vol. 61, no. 7, pp. 1644–1656, Apr. 2013.
[6] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs: Frequency analysis,” IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3042–3054, Jun. 2014.
[7] G. Cheung, E. Magli, Y. Tanaka, and M. K. Ng, “Graph spectral image processing,” Proceedings of the IEEE, vol. 106, no. 5, pp. 907-930, May 2018.
[8] A. Ortega, P. Frossard, J. Kovačević, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proceedings of the IEEE, vol. 106, no. 5, pp. 808–828, May 2018.
[9] S. Saito, H. Suzuki, and D. P. Mandic, “Hypergraph p-Laplacian: A differential geometry view,” in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pp. 3984–3991, 2018.
[10] L. Stanković and E. Sejdić, “Vertex-frequency analysis of graph signals,” Springer Nature, 2019.
[11] S. Segarra, A. G. Marques, and A. Ribeiro, “Optimal graph-filter design and applications to distributed linear network operators,” IEEE Transactions on Signal Processing, vol. 65, no. 15, pp. 4117–4131, 2017.