

Graph Signal Processing for Geometric Data and Beyond: Theory and Applications

Wei Hu, Senior Member, IEEE, Jiahao Pang, Member, IEEE, Xianming Liu, Member, IEEE, Dong Tian, Senior Member, IEEE, Chia-Wen Lin, Fellow, IEEE, and Anthony Vetro, Fellow, IEEE

arXiv:2008.01918v3 [cs.CV] 4 Sep 2021

Abstract—Geometric data acquired from real-world scenes, e.g., 2D depth images, 3D point clouds, and 4D dynamic point clouds, have found a wide range of applications including immersive telepresence, autonomous driving, surveillance, etc. Due to the irregular sampling patterns of most geometric data, traditional image/video processing methodologies are limited, while Graph Signal Processing (GSP)—a fast-developing field in the signal processing community—enables processing signals that reside on irregular domains and plays a critical role in numerous applications of geometric data, from low-level processing to high-level analysis. To further advance the research in this field, we provide the first timely and comprehensive overview of GSP methodologies for geometric data in a unified manner by bridging the connections between geometric data and graphs, among the various geometric data modalities, and with spectral/nodal graph filtering techniques. We also discuss the recently developed Graph Neural Networks (GNNs) and interpret the operation of these networks from the perspective of GSP. We conclude with a brief discussion of open problems and challenges.

Index Terms—Graph Signal Processing (GSP), Geometric Data, Riemannian Manifold, Graph Neural Networks (GNNs), Interpretability

Manuscript received March 31, 2021. (Corresponding author: Chia-Wen Lin)
Wei Hu is with the Wangxuan Institute of Computer Technology, Peking University, Beijing, China. (e-mail: [email protected])
Jiahao Pang and Dong Tian are with InterDigital, Princeton, NJ, USA. (e-mail: [email protected], [email protected])
Xianming Liu is with Harbin Institute of Technology, Harbin, China. (e-mail: [email protected])
Chia-Wen Lin is with the Department of Electrical Engineering and Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan, and with the Electronic and Optoelectronic System Research Laboratories, Industrial Technology Research Institute. (e-mail: [email protected])
Anthony Vetro is with Mitsubishi Electric Research Laboratories, Cambridge, MA, USA. (e-mail: [email protected])

I. INTRODUCTION

RECENT advances in depth sensing, laser scanning and image processing have enabled convenient acquisition and extraction of geometric data from real-world scenes, which can be digitized and formatted in a number of different ways. Efficiently representing, processing, and analyzing geometric data is central to a wide range of applications, from augmented and virtual reality [1], [2] to autonomous driving [3] and surveillance/monitoring applications [4].

Geometric data may be represented in various data formats. It has been recognized by Adelson et al. [5] that different representations of a scene can be expressed as approximations of the plenoptic function, a high-dimensional mathematical representation that provides complete information about any point within a scene and also how it changes when observed from different positions. This connection among the different scene representations has also been embraced and reflected in the work plans for the development of the JPEG Pleno standardization framework [6]. In this paper, we mainly consider explicit representations of geometry, which directly describe the underlying geometry, but the framework and techniques extend to implicit representations of geometry, in which the underlying geometry is present in the data but needs to be inferred, e.g., from camera data. Examples of explicit geometric representations include 2D geometric data (e.g., depth maps), 3D geometric data (e.g., point clouds and meshes), and 4D geometric data (e.g., dynamic point clouds), as demonstrated in Fig. 1. Examples of implicit geometric representations include camera-based inputs, e.g., multiview video. For many cases of interest that aim to render immersive imagery of a scene, the focus will be on dense representations of geometry. However, there are also some applications of interest that benefit from sparse representations of geometry, such as human activity analysis, in which the geometry of the human body can be represented with few data points.

Traditional image/video processing techniques assume sampling patterns over regular grids and have limitations when dealing with the wide range of geometric data formats, some of which have irregular sampling patterns. To overcome the limitations of traditional techniques, Graph Signal Processing (GSP) techniques have been proposed and developed in recent years to process signals that reside over connected graph nodes [7]–[9]. For geometric data, each sample is denoted by a graph node, and the associated 3D coordinate (or depth) is the signal to be analyzed. The underlying surface of geometric data provides an intrinsic graph connectivity or graph topology. The graph-based representation has several advantages over conventional representations: it is more compact and accurate, and it is structure-adaptive since it naturally captures geometric characteristics in the data, such as piece-wise smoothness (PWS) [10].

A unified framework of GSP for geometric data is illustrated in Fig. 1, in which we highlight how geometric data and graph operators are counterparts in the context of Riemannian manifolds. Given continuous functions on Riemannian manifolds, geometric data are discrete samples of the functions representing the geometry of objects, which often lies on a low-dimensional manifold; e.g., 3D point clouds essentially represent 2D surfaces embedded in the 3D space. Correspondingly, graph operators are discrete counterparts of the continuous functionals defined on Riemannian manifolds. Theoretically, it has been shown that graph operators converge to functionals on Riemannian manifolds under certain constraints [11], while graph regularizers converge to smooth functionals on Riemannian manifolds that are capable of enforcing low dimensionality of data [12]–[14]. Hence, GSP tools are naturally advantageous for geometric data processing by representing the underlying topology of geometry on graphs.

Fig. 1: Illustration of GSP for geometric data processing. Continuous functions on Riemannian manifolds are sampled into discrete geometric data (2D depth maps, 3D point clouds, 4D dynamic point clouds), while continuous functionals on the manifolds have graph operators as their discrete counterparts; graphs inferred from the data support spectral-domain and nodal-domain GSP methods, which in turn interpret graph neural networks.

A graph operator is typically constructed based on domain knowledge or inferred from training data, as shown in Fig. 1. It essentially specifies a graph filtering process, which can be performed either in the spectral domain (i.e., the graph transform domain) [15] or in the nodal domain (i.e., the spatial domain) [16]; the two are referred to as spectral-domain GSP methods and nodal-domain GSP methods, respectively. Nodal-domain methods typically avoid eigen-decomposition for fast computation over large-scale data while still relying on spectral analysis to provide insights [17]. A nodal-domain method might also be specified through a graph regularizer to enforce graph-signal smoothness [14], [18]. Sparsity and smoothness are two widely used domain models. Additionally, Graph Neural Networks (GNNs) have been developed to enable inference with graph signals including geometric data [19], and are often motivated or interpretable by GSP tools. Hence, methodologically, we will first elaborate on spectral-domain and nodal-domain GSP methods for geometric data, respectively, and then discuss the interpretability of GNNs from the perspective of GSP.

In practice, GSP for geometric data plays a critical role in numerous applications of geometric data, from low-level processing, such as restoration and compression, to high-level analysis. The processing of geometric data includes denoising, enhancement and resampling, as well as compression such as point cloud coding standardized in MPEG¹ and JPEG Pleno², while the analysis of geometric data addresses supervised or unsupervised feature learning for classification, segmentation, detection, and generation. These applications are unique relative to the use of GSP techniques for other data in terms of the signal model and processing methods.

This overview paper distinguishes itself from relevant review papers such as [9], [19]–[22] in the following aspects. While [9] provides a general overview of GSP covering core ideas and recent advances in developing basic GSP tools with a variety of applications, our paper is dedicated to GSP for geometric data with unique signal characteristics that have led to new insights and understanding. Compared with [19] and [20], which provide a comprehensive overview of geometric deep learning including GNNs, we focus on those GNNs that are motivated or interpretable by GSP tools. In comparison with [21], which reviews recent progress in deep learning methods for point clouds, we emphasize GNNs for geometric data that are explainable via GSP, while in [21] graph-based methods are discussed only as one of many types of approaches for 3D shape classification and point cloud segmentation, without further discussion of model interpretability. Furthermore, compared with [22], which analyzes machine learning on graphs from the graph diffusion perspective and connects different learning algorithms on graphs with different diffusion models, we emphasize the graph signal processing aspect of graph neural networks and endeavor to interpret their behavior in both the spectral and the nodal domains, as well as several aspects of the representation learning of graph neural networks from the perspective of GSP, as discussed in Section VI. In summary, this paper provides an overview of GSP methods specifically for a unique and important class of data—geometric data, as well as insights into the interpretability of GNNs from the perspective of GSP tools.

The remainder of this paper is organized as follows. Section II reviews basic concepts in GSP and the graph Fourier transform, as well as interpretations of graph variation operators in both the discrete domain and the continuous domain. Section III introduces the graph representation of geometric data based on their characteristics, along with problems and applications of geometric data to be discussed throughout the paper. Then, we elaborate on spectral-domain GSP methods for geometric data in Section IV and nodal-domain GSP methods in Section V. Next, we provide the interpretations of GNNs for geometric data from the perspective of GSP in Section VI. Finally, future directions and conclusions are discussed in Section VII and Section VIII, respectively.

¹ https://mpeg.chiariglione.org/standards/mpeg-i/point-cloud-compression
² https://jpeg.org/jpegpleno/

Fig. 2: Geometric data and their graph representations. The graphs of the patches enclosed in red squares are shown at the bottom; the vertices are colored by the corresponding graph signals. (a) 2D depth map [24]. (b) 3D point cloud [25]. (c) 4D dynamic point cloud [26], where the temporal edges of a point P are also shown.

II. REVIEW: GRAPH SIGNAL PROCESSING

A. Graph Variation Operators and Graph Signal

We denote a graph by G = {V, E, A}, which is composed of a vertex set V of cardinality |V| = N, an edge set E connecting vertices, and an adjacency matrix A. Each entry a_{i,j} in A represents the weight of the edge between vertices i and j, which often captures the similarity between adjacent vertices. In geometric data processing, we often consider an undirected graph with non-negative edge weights, i.e., a_{i,j} = a_{j,i} ≥ 0.

Among variation operators in GSP, we focus on the commonly used graph Laplacian matrix. The combinatorial graph Laplacian [7] is defined as L := D − A, where D is the degree matrix—a diagonal matrix with d_{i,i} = \sum_{j=1}^{N} a_{i,j}. Given real and non-negative edge weights in an undirected graph, L is real, symmetric, and positive semi-definite [23]. The symmetrically normalized version is L_{sym} := D^{-1/2} L D^{-1/2}, and the random walk graph Laplacian is L_{rw} := D^{-1} L; both are often used for theoretical analysis or in neural networks due to their normalization property.

A graph signal is a function that assigns a scalar or vector to each vertex. For simplicity, we consider x : V → R, such as the intensity on each vertex of a mesh. We denote graph signals by x ∈ R^N, where x_i represents the signal value at the i-th vertex.

B. Graph Fourier Transform

Because L is a real symmetric matrix, it admits an eigen-decomposition L = U Λ U^⊤, where U = [u_1, ..., u_N] is an orthonormal matrix containing the eigenvectors u_i, and Λ = diag(λ_1, ..., λ_N) consists of the eigenvalues {λ_1 = 0 ≤ λ_2 ≤ ... ≤ λ_N}. We refer to the eigenvalue λ_i as the graph frequency/spectrum, with a smaller eigenvalue corresponding to a lower graph frequency.

For any graph signal x ∈ R^N residing on the vertices of G, its graph Fourier transform (GFT) x̂ ∈ R^N is defined as [15]

\hat{x} = U^\top x.    (1)

The inverse GFT follows as

x = U \hat{x}.    (2)

With an appropriately constructed graph that captures the signal structure well, the GFT will lead to a compact representation of the graph signal in the spectral domain, which is beneficial for geometric data processing such as reconstruction and compression.
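To make the transform pair concrete, the following minimal sketch (ours, in Python with NumPy; not from the paper) builds the combinatorial Laplacian of a toy graph, obtains the GFT basis by eigen-decomposition, and verifies (1) and (2):

```python
import numpy as np

# A small undirected graph on four vertices, specified by its adjacency matrix.
A = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.5, 1.0, 0.0, 2.0],
              [0.0, 0.0, 2.0, 0.0]])

D = np.diag(A.sum(axis=1))           # degree matrix
L = D - A                            # combinatorial graph Laplacian L := D - A

# Eigen-decomposition L = U diag(lam) U^T; eigh returns eigenvalues in
# ascending order, so lam[0] ~ 0 is the lowest graph frequency.
lam, U = np.linalg.eigh(L)

x = np.array([0.2, 0.3, 0.1, 0.9])   # a graph signal, one scalar per vertex
x_hat = U.T @ x                      # forward GFT, Eq. (1)
x_rec = U @ x_hat                    # inverse GFT, Eq. (2)
assert np.allclose(x, x_rec)         # U is orthonormal, so the pair is lossless
```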
C. Interpretation of Graph Variation Operators

The graph variation operators have various interpretations, both in the discrete domain and in the continuous domain.

In the discrete domain, we can interpret graph Laplacian matrices as precision matrices under Gaussian-Markov Random Fields (GMRFs) [27] from a probabilistic perspective, and thus further show that the GFT approximates the Karhunen-Loève transform (KLT) for signal decorrelation under GMRFs. As discussed in [28], there is a one-to-one correspondence between precision matrices of different classes of GMRFs and types of graph Laplacian matrices. For instance, the combinatorial graph Laplacian corresponds to the precision matrix of an attractive, DC-intrinsic GMRF. Further, as the eigenvectors of the precision matrix (the inverse of the covariance matrix) constitute the basis of the KLT, the GFT approximates the KLT under a family of statistical processes, as proved in different ways in [10], [29]–[31]. This indicates that the GFT is approximately the optimal linear transform for signal decorrelation, which is beneficial to the compression of geometric data, as will be discussed in Section IV-C2.

TABLE I: Representative Geometric Datasets and Relevant Application Scenarios.

| Geometric Data Format | Datasets | Contents | Typical Applications/Tasks |
|---|---|---|---|
| 2D depth map | FlyingThings3D [34] | Synthetic scene | Stereo matching, depth completion |
| | Middlebury [24], Tsukuba [35] | Indoor scene | |
| | KITTI [36] | Driving scene | |
| 3D point cloud | Stanford 3D Scanning Repository [37], Benchmark [38] | Single object | 3D telepresence, surface reconstruction |
| | MPEG Sequences [39], Microsoft Sequences [26] | Single person | |
| | ShapeNet [40], ModelNet [41] | Single object | Classification, part segmentation |
| | Stanford Large-Scale 3D Indoor Spaces Dataset [42], ScanNet [25] | Indoor scene | Semantic/instance segmentation |
| | KITTI [36], WAYMO Open Dataset [43] | Driving scene | |
| 4D dynamic point cloud | MPEG Sequences [39], Microsoft Sequences [26] | Single person | 3D telepresence, compression |
| | KITTI [36], Semantic KITTI [44] | Driving scene | Semantic/instance segmentation, detection |

In the continuous domain, instead of viewing a neighborhood graph as inherently discrete, it can be treated as a discrete approximation of a Riemannian manifold [11], [32]. Thus, as the number of vertices on a graph increases, the graph converges to a Riemannian manifold. In this scenario, each observation of a graph signal is a discrete sample of a continuous signal (function) defined on the manifold. Note that not all graph signals can be interpreted in the continuous domain: the voting pattern in a social network or the paper information in a citation network is inherently discrete. With a focus on geometric data, which are indeed signals captured from a continuous surface, we have a continuous-domain interpretation of graph signals as discrete samples of a continuous function (Fig. 1). The link between neighborhood graphs and Riemannian manifolds enables us to process geometric data with tools from differential geometry and variational methods [33]. For instance, the graph Laplacian operator converges to the Laplace-Beltrami operator in the continuous manifold when the number of samples tends to infinity. Hence, without direct access to the underlying geometry (surface), it is still possible to infer properties of the geometry based on its discrete samples.

For a clearer presentation, Table II summarizes the most important mathematical symbols used in this paper.

TABLE II: Key notations employed in this review article.

| Notation | Description |
|---|---|
| G | The graph being studied. |
| A | Graph adjacency matrix. |
| L | Graph Laplacian matrix. |
| U | Inverse graph Fourier transform matrix. |
| x | Geometric data (graph signal) being studied. |
| a_{i,j} | Graph weight connecting vertices i and j. |
| λ_i | Graph frequency/spectrum. |
| ĥ(·) | Spectral-domain filter coefficient. |
| h_k | Nodal-domain filter coefficient. |

III. GRAPH REPRESENTATIONS OF GEOMETRIC DATA

In this section, we elaborate on the graph representations of geometric data, which arise from the unique characteristics of geometric data and serve as the basis of GSP for geometric data processing. Also, we discuss and compare with non-graph representations, which helps in understanding the advantages and insights of graph representations.

A. Problems and Challenges of Geometric Data

There exist various problems associated with geometric data, e.g., noise, holes (incomplete data), compression artifacts, large data size, and irregular sampling. For instance, due to inherent limitations in the sensing capabilities and the viewpoints that are acquired, geometric data often suffer from noise and holes, which will affect the subsequent rendering or downstream inference tasks since the underlying structures are deformed.

These problems must be accounted for in the diverse range of applications that rely on geometric data, including processing (e.g., restoration and enhancement), compression, and analysis (e.g., classification, segmentation, and recognition). Some of the representative geometric datasets along with the corresponding application scenarios are summarized in Table I.

We assert that the chosen representation of geometric data is critically important in addressing these problems and applications. Next, we discuss the characteristics of geometric data, which lay the foundation for using graphs for representation.

B. Characteristics of Geometric Data

Geometric data represent the geometric structure underlying the surface of objects and scenes in the 3D world and have unique characteristics that capture structural properties.

For example, 2D depth maps characterize the per-pixel physical distance between objects in a 3D scene and the sensor, which usually consists of sharp boundaries and smooth interior surfaces—referred to as piece-wise smoothness (PWS) [10], as shown in Fig. 2(a). The PWS property is suitable to be described by a graph, where most edge weights are 1 for smooth surfaces and a few weights are 0 for discontinuities across sharp boundaries. Such a graph construction will lead to a compact representation in the GFT domain, where most energy is concentrated on low-frequency components for the description of smooth surfaces [10].
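A toy illustration of such a construction (our sketch; the Gaussian kernel and its width are assumptions rather than the paper's prescription) connects each pixel of a depth map to its 4-connected neighbors, so that edge weights stay near 1 on smooth surfaces and drop toward 0 across depth discontinuities:

```python
import numpy as np
from scipy.sparse import coo_matrix

def depth_grid_graph(depth, sigma=0.1):
    """4-connected grid graph on a depth map (hypothetical helper).
    Weight exp(-(d_i - d_j)^2 / sigma^2) is ~1 within smooth regions
    and ~0 across sharp depth boundaries, matching the PWS model."""
    h, w = depth.shape
    idx = np.arange(h * w).reshape(h, w)
    # horizontal and vertical neighbor pairs of the pixel grid
    src = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    dst = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    d = depth.ravel()
    wgt = np.exp(-((d[src] - d[dst]) ** 2) / sigma ** 2)
    # store each undirected edge in both directions so that A is symmetric
    A = coo_matrix((np.concatenate([wgt, wgt]),
                    (np.concatenate([src, dst]), np.concatenate([dst, src]))),
                   shape=(h * w, h * w))
    return A.tocsr()

depth = np.vstack([np.full((4, 8), 1.0), np.full((4, 8), 3.0)])  # two flat layers
A = depth_grid_graph(depth)  # weights ~1 inside each layer, ~0 across the step
```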

3D geometric data such as point clouds form omnidirectional representations of a geometric structure in the 3D world. As shown in Fig. 2(b), the underlying surface of the 3D geometric data often exhibits the PWS property, as given by the normals of the data [45]. Moreover, 3D point clouds lie on a 2D manifold, as they represent 2D surfaces embedded in the 3D space.

For 4D geometric data such as dynamic point clouds, consistency/redundancy exists along the temporal dimension [46], [47], as shown in Fig. 2(c). However, in contrast to conventional video data, the temporal correspondences in dynamic point clouds are difficult to track, mainly because 1) the sampling pattern may vary from frame to frame due to the irregular sampling; and 2) the number of points in each frame of a dynamic point cloud sequence may vary significantly.

Thanks to the unique signal characteristics of geometric data, we may design methods tailored for geometric data instead of methods for general graph data. For instance, particular graph smoothness priors may be taken into account so that methods are optimized for the PWS property of depth maps, which leads to more robust or efficient processing, as will be discussed in Section V-B.

C. Non-Graph Representations of Geometric Data

There exist a variety of non-graph representations of geometric data. For instance, depth maps are represented as gray-scale images, while 3D point clouds are often quantized onto regular voxel grids [48] or projected onto a set of depth images from multiple viewpoints [49]. These quantization-based representations transform geometric data into regular Euclidean space, which is amenable to existing methods for Euclidean data such as images, videos, and regular voxel grids. Further, implicit functions (e.g., the Signed Distance Function (SDF)) for 3D shape representation have been proposed [50], [51], which represent a shape's surface by a discrete or continuous volumetric field: the magnitude of a point in the field represents the distance to the surface boundary, and the sign indicates whether the region is inside (−) or outside (+) of the shape. This enables high-quality representation of the shape surface, as well as interpolation and completion from partial and noisy 3D input data. Besides, the sparse tensor representation is employed due to its expressiveness and generalizability for high-dimensional spaces [52]. It also allows homogeneous data representation within traditional neural network libraries, as most of them support sparse tensors.

In spite of these advantages, non-graph representations may have the following limitations: 1) most importantly, representing geometric data without graphs is often deficient in capturing the underlying geometric structure explicitly; 2) quantization-based representations are sometimes inaccurate, e.g., due to the quantization loss introduced by voxelization or the projection errors when a point cloud is represented by a set of images or a discretized SDF; and 3) the representations can be redundant, e.g., a voxel-based representation of point clouds still needs to represent unoccupied space with zeros, leading to redundant storage or processing.

D. Graph Representations of Geometric Data

In contrast, graphs provide structure-adaptive, accurate, and compact representations for geometric data, which further inspire new insights and understanding.

To represent geometric data on a graph G = {V, E, A}, we consider points in the data (e.g., pixels in depth maps, points in point clouds and meshes) as vertices V with cardinality N. Further, for the i-th point, we represent the coordinate and possibly associated attribute (p_i, a_i) of each point as the graph signal on each vertex, where p_i ∈ R^2 or p_i ∈ R^3 represents the 2D or 3D coordinate of the i-th point (e.g., 2D for depth maps, and 3D for point clouds), and a_i represents associated attributes, such as depth values, RGB colors, reflection intensities, and surface normals. To ease mathematical computation, we denote the graph signal of all vertices by a matrix

X = \begin{bmatrix} x_1^\top \\ x_2^\top \\ \vdots \\ x_N^\top \end{bmatrix} \in \mathbb{R}^{N \times d},    (3)

where the i-th row vector x_i^⊤ = [p_i^⊤ a_i^⊤] ∈ R^{1×d} represents the graph signal on the i-th vertex and d denotes the dimension of the graph signal.

To capture the underlying structure, we use edges E in the graph to describe the pairwise relationship (the spatio-temporal relationship for 4D geometric data [53]–[55]) between points, which is encoded in the adjacency matrix A as reviewed in Section II. The construction of A, i.e., graph construction, is crucial to characterize the underlying topology of geometric data. We classify existing graph construction methods mainly into two families: 1) model-based graph construction, which builds graphs with models from domain knowledge [56], [57]; and 2) learning-based graph construction, which infers/learns the underlying graph from geometric data [58]–[61].

Model-based graph construction for geometric data often assumes that edge weights are inversely proportional to the affinities in coordinates, as in a K-nearest-neighbor graph (K-NN graph) and an ε-neighborhood graph (ε-N graph). A K-NN graph is a graph in which two vertices are connected by an edge when their Euclidean distance is among the K smallest Euclidean distances from one point to the others, while in an ε-N graph, two vertices are connected if their Euclidean distance is smaller than a given threshold ε. A K-NN graph intends to maintain a constant vertex degree in the graph, which may lead to a more stable algorithm implementation, while an ε-N graph intends to make the vertex degree reflect the local point density, leading to a more physical interpretation. Though these graphs exhibit manifold convergence properties [11], [33], it still remains challenging to find an efficient estimation of the sparsification parameters such as K and ε given finite and non-uniformly sampled data.

In learning-based graph construction, the underlying graph topology is inferred or optimized from geometric data in terms of a certain optimization criterion, such as enforcing low-frequency representations of observed signals. For example, given a single or partial observation, [18] optimizes a distance metric from relevant feature vectors on vertices by minimizing the graph Laplacian regularizer, leading to learned edge weights. Besides, edge weights could be trainable in an end-to-end learning manner [62]. Also, general graph learning methodologies can apply to the graph construction of geometric data [58]–[61].
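A minimal sketch of the model-based route (ours; the Gaussian-kernel weighting and the bandwidth heuristic are common choices, not mandated by the paper) builds a K-NN graph over a point cloud with scipy:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix

def knn_graph(points, k=10, sigma=None):
    """K-NN graph on an N x 3 point cloud with Gaussian edge weights."""
    n = points.shape[0]
    tree = cKDTree(points)
    dist, nbr = tree.query(points, k=k + 1)   # nearest neighbor is the point itself
    dist, nbr = dist[:, 1:], nbr[:, 1:]       # drop the self-match
    if sigma is None:
        sigma = dist.mean()                   # heuristic kernel bandwidth
    src = np.repeat(np.arange(n), k)
    wgt = np.exp(-dist.ravel() ** 2 / (2 * sigma ** 2))
    A = coo_matrix((wgt, (src, nbr.ravel())), shape=(n, n)).tocsr()
    return A.maximum(A.T)                     # symmetrize so that a_ij = a_ji

pts = np.random.rand(500, 3)                  # a synthetic point cloud
A = knn_graph(pts, k=10)
```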

Fig. 3: Representative works leveraging graph signal processing (GSP) to process or analyze geometric data, organized as a timeline from 2008 to 2021 and grouped into spectral-domain GSP methods, nodal-domain GSP methods, and GSP-interpretable graph neural networks.

IV. SPECTRAL-DOMAIN GSP METHODS FOR GEOMETRIC DATA

Based on the aforementioned graph representations, we will elaborate on GSP methodologies for geometric data, including spectral-domain GSP methods, nodal-domain GSP methods, and GSP-interpretable graph neural networks. Representative methods using GSP to process/analyze geometric data are summarized in chronological order in Fig. 3. We start from the spectral-domain methods that offer spectral interpretations.

A. Basic Principles

Spectral-domain methods represent geometric data in the graph transform domain and perform filtering on the resulting transform coefficients. While various graph transforms exist, we focus our discussion on the Graph Fourier Transform (GFT) discussed in Section II-B without loss of generality.

Let the frequency response of a graph spectral filter be denoted by ĥ(λ_k) (k = 1, ..., N); then the graph spectral filtering takes the form

Y = U \, \mathrm{diag}\big(\hat{h}(\lambda_1), \ldots, \hat{h}(\lambda_N)\big) \, U^\top X.    (4)

This filtering first transforms the geometric data X into the GFT domain U^⊤X, performs filtering on each eigenvalue (i.e., the spectrum of the graph), and finally projects back to the spatial domain via the inverse GFT to acquire the filtered output Y.

As discussed in Section II-C, the GFT leads to compact representations of geometric data if the constructed graph captures the underlying topology well. Based on the GFT representation, the key issue is to specify N graph frequency responses {ĥ(λ_k)}_{k=1}^{N} to operate on the geometric data; these filters should be designed according to the specific task. Widely used filters include low-pass graph spectral filters and high-pass graph spectral filters, which will be discussed further in the next subsection.

Due to the computational complexity of graph transforms, which often involve full eigen-decomposition, this class of methods is either dedicated to small-scale geometric data or applied in a divide-and-conquer manner. For instance, one may divide a point cloud into regular cubes and perform graph spectral filtering on individual cubes separately. Also, one may deploy a fast algorithm for the GFT (e.g., the fast GFT in [63]) to accelerate the spectral filtering process.

B. Representative Graph Spectral Filtering

1) Low-Pass Graph Spectral Filtering: Analogous to processing digital images on the regular 2D grid, we can use a low-pass graph filter to capture the rough shape of geometric data and attenuate noise, under the assumption that signals are smooth in the associated data domain. In practice, a geometric signal (e.g., coordinates, normals) is inherently smooth with respect to the underlying graph, where high-frequency components are likely to be generated by fine details or noise. Hence, we can perform geometric data smoothing via a low-pass graph filter, essentially leading to a smoothed representation in the underlying manifold.

One intuitive realization is an ideal low-pass graph filter, which completely eliminates all graph frequencies above a given bandwidth while keeping those below unchanged. The graph frequency response of an ideal low-pass graph filter with bandwidth b is

\hat{h}(\lambda_k) = \begin{cases} 1, & k \le b, \\ 0, & k > b, \end{cases}    (5)

which projects the input geometric data into a bandlimited subspace by removing components corresponding to large eigenvalues (i.e., high-frequency components).

The smoothed result provides a bandlimited approximation of the original geometric data. Fig. 4 demonstrates an example of the bandlimited approximation of the 3D coordinates of the point cloud Bunny (35,947 points) [37] with 10, 100 and 400 graph frequencies, respectively. Specifically, we construct a K-NN graph (K = 10) on the point cloud and compute the corresponding GFT. Then we set the respective bandwidth for low-pass filtering as in (4) and (5). One can observe that the first 10 low-frequency components are able to represent the rough shape, with finer details becoming more apparent with additional graph frequencies. This validates the assertion that the GFT achieves energy compaction for geometric data.

Fig. 4: Low-pass approximation of the point cloud Bunny. Plot (a) is the original point cloud with 35,947 points. Plots (b), (c) and (d) show the low-pass approximations with 10, 100 and 400 graph frequency components, respectively. Plot (e) presents the main graph spectral distribution, with frequencies higher than 500 omitted as the corresponding magnitudes are around zero.
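A compact sketch of this bandlimited approximation (ours; it assumes a sparse adjacency matrix such as the hypothetical knn_graph helper in Section III-D, and the dense eigen-decomposition limits it to modest point counts) implements (4) with the ideal response (5):

```python
import numpy as np
from scipy.sparse.csgraph import laplacian

def ideal_lowpass(points, A, b):
    """Keep the b lowest graph frequencies of the coordinate signal, Eqs. (4)-(5)."""
    L = laplacian(A).toarray()     # combinatorial Laplacian L = D - A (dense here)
    lam, U = np.linalg.eigh(L)     # GFT basis, graph frequencies in ascending order
    X_hat = U.T @ points           # forward GFT of the N x 3 coordinate signal
    X_hat[b:] = 0.0                # ideal low-pass: zero out frequencies above b
    return U @ X_hat               # inverse GFT back to the nodal domain

# rough_shape = ideal_lowpass(pts, knn_graph(pts, k=10), b=10)   # cf. Fig. 4(b)
```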

Another simple choice is a Haar-like low-pass graph filter, as discussed in [64], with the graph frequency response

\hat{h}(\lambda_k) = 1 - \lambda_k / \lambda_{\max},    (6)

where λ_max = λ_N is the maximum eigenvalue, used for normalization. As λ_{k−1} ≤ λ_k, we have ĥ(λ_{k−1}) ≥ ĥ(λ_k). As such, low-frequency components are preserved while high-frequency components are attenuated.

2) High-Pass Graph Spectral Filtering: In contrast to low-pass filtering, high-pass filtering eliminates low-frequency components and detects large variations in geometric data, such as geometric contours or texture variations. A simple design is a Haar-like high-pass graph filter with the following graph frequency response:

\hat{h}(\lambda_k) = \lambda_k / \lambda_{\max}.    (7)

As λ_{k−1} ≤ λ_k, we have ĥ(λ_{k−1}) ≤ ĥ(λ_k). This indicates that lower-frequency responses are attenuated while high-frequency responses are preserved.

3) Graph Spectral Filtering with a Desired Distribution: We may also design a desirable spectral distribution and then use graph filter coefficients to fit this distribution. For example, an L-length graph filter is in the form of a diagonal matrix:

\hat{h}(\Lambda) = \mathrm{diag}\Big( \sum_{k=0}^{L-1} \hat{h}_k \lambda_1^k, \; \ldots, \; \sum_{k=0}^{L-1} \hat{h}_k \lambda_N^k \Big),    (8)

where Λ is a diagonal matrix containing the eigenvalues of the graph Laplacian L as discussed in Section II-B, and ĥ_k are the filter coefficients. If the desirable response of the i-th graph frequency is c_i, we let

\hat{h}(\lambda_i) = \sum_{k=0}^{L-1} \hat{h}_k \lambda_i^k = c_i,    (9)

and solve a set of linear equations to obtain the graph filter coefficients ĥ_k. An alternative way to construct such a graph filter is via the Chebyshev polynomial coefficients introduced in [15].

C. Applications in Geometric Data

Having discussed graph spectral filtering, we review some representative applications of spectral-domain GSP methods for geometric data, including restoration and compression.

1) Geometric Data Restoration: Low-pass graph spectral filtering is often designed for geometric data restoration such as denoising. As demonstrated in the example of Fig. 4, clean geometric data such as point clouds are dominated by low-frequency components in the GFT domain. Hence, a carefully designed low-pass filter is able to remove high-frequency components that are likely introduced by noise or outliers.

Based on this principle, Hu et al. proposed depth map denoising by iterative thresholding in the GFT domain [65]. To jointly exploit the local smoothness and non-local self-similarity of a depth map, they cluster self-similar patches and compute an average patch, from which a graph is deduced to describe correlations among adjacent pixels. Then self-similar patches are transformed into the same GFT domain, where the GFT basis is computed from the derived correlation graph. Finally, iterative thresholding in the GFT domain is performed as the ideal low-pass graph filter in (5) to enforce group sparsity.

Rosman et al. proposed spectral point cloud denoising based on the non-local framework as well [66]. Similar to Block-Matching 3D filtering (BM3D) [67], they group similar surface patches into a collaborative patch and compute the graph Laplacian from this grouping. Then they perform shrinkage in the GFT domain by a low-pass filter similar to (5), which leads to denoising of the collaborative patch.

In contrast, high-pass graph filtering can be used to detect contours in 3D point cloud data, as these are usually represented by high-frequency components. For instance, Chen et al. proposed a high-pass graph-filtering-based resampling strategy to highlight contours for large-scale point cloud visualization; the same technique can also be used to extract key points for accurate 3D registration [64].

2) Geometric Data Compression: Transform-based coding is generally a low-pass filtering approach. When coding piece-wise smooth geometric data, the GFT produces small or zero high-frequency components since it does not filter across boundaries, thus leading to a compact representation in the transform domain. Further, as discussed in Section II-C, the GFT approximates the KLT in terms of optimal signal decorrelation under a family of statistical processes.
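Returning to the filter design of (8) and (9), the constraints form a Vandermonde system in the graph frequencies; a small sketch (ours) solves it in the least-squares sense when there are more target frequencies than coefficients:

```python
import numpy as np

def fit_polynomial_filter(lam, target, length):
    """Fit h_k so that sum_k h_k * lam_i^k ~ target_i for all i, Eq. (9)."""
    V = np.vander(lam, N=length, increasing=True)   # V[i, k] = lam_i^k
    h, *_ = np.linalg.lstsq(V, target, rcond=None)
    return h

lam = np.linspace(0.0, 2.0, 50)        # graph frequencies of a normalized Laplacian
target = np.exp(-2.0 * lam)            # a desired low-pass spectral distribution
h = fit_polynomial_filter(lam, target, length=5)
response = np.vander(lam, 5, increasing=True) @ h   # realized response, Eq. (8)
```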

TABLE III: Properties of different Graph Smoothness Regularizers (GSRs).

| Graph Smoothness Regularizer (GSR) | Math Expression | Dynamic | Typical Solver | Typical Works |
|---|---|---|---|---|
| Graph Laplacian Regularizer (GLR) | \sum_{i \sim j} a_{i,j} (x_i - x_j)^2 | No | Direct solver / CG for (17) | [13], [14], [18] |
| Reweighted Graph Laplacian Regularizer (RGLR) | \sum_{i \sim j} a_{i,j}(x_i, x_j) (x_i - x_j)^2 | Yes | Proximal gradient | [45] |
| Graph Total Variation (GTV) | \sum_{i \sim j} a_{i,j} \lvert x_i - x_j \rvert | No | Primal-dual method | [79], [80] |
| Reweighted Graph Total Variation (RGTV) | \sum_{i \sim j} a_{i,j}(x_i, x_j) \lvert x_i - x_j \rvert | Yes | ADMM | [81] |

Graph transform coding is suitable for depth maps due to their piece-wise smoothness. Shen et al. first introduced a graph-based representation for depth maps that is adaptive to depth discontinuities, transforming the depth map into the GFT domain for compression and outperforming traditional DCT coding [56]. Variants of this work include [68], [69]. To further exploit the piece-wise smoothness of depth maps, Hu et al. proposed a multi-resolution compression framework, where boundaries are encoded at the original high resolution to preserve sharpness, and smooth surfaces are encoded at low resolution for greater efficiency [10], [70]. It is also shown in [10] that the GFT approximates the KLT under a model specifically designed to characterize piece-wise smooth signals. Other graph transforms for depth map coding include Generalized Graph Fourier Transforms (GGFTs) [30] and lifting transforms on graphs [71].

3D point clouds also exhibit certain piece-wise smoothness in both geometry and attributes. Zhang et al. first proposed using graph transforms for attribute compression of static point clouds [57], where graphs are constructed over local neighborhoods in the point cloud by connecting nearby points, and the attributes are treated as graph signals. The graph transform decorrelates the signal and was found to be much more efficient than traditional octree-based coding methods. Other follow-up work includes graph transforms for sparse point clouds [72], [73], graph transforms with optimized Laplacian sparsity [74], normal-weighted graph transforms [75], the Gaussian Process Transform (GPT) [76], and graph transforms for the enhancement layer [77].

In 4D dynamic point clouds, motion estimation becomes necessary to remove the temporal redundancy [46], [55], [78]. Thanou et al. represented the time-varying geometry of dynamic point clouds with a set of graphs, and considered the 3D positions and color attributes of the point clouds as signals on the vertices of the graphs [46]. Motion estimation is then cast as a feature matching problem between successive graphs based on spectral graph wavelets. Dynamic point cloud compression remains a challenging task, as each frame is irregularly sampled without any explicit temporal pointwise correspondence with neighboring frames.

V. NODAL-DOMAIN GSP METHODS FOR GEOMETRIC DATA

A. Basic Principles

In contrast to spectral-domain GSP methods, this class of methods performs filtering on geometric data locally in the nodal domain, which is often computationally efficient and thus amenable to large-scale data.

Let N_{n,p} be the set of p-hop neighborhood nodes of the n-th vertex, whose cardinality often varies with n. Nodal-domain filtering is typically defined as a linear combination over local neighboring vertices:

y_n := \sum_{j \in \mathcal{N}_{n,p}} h_{n,j} x_j,    (10)

where h_{n,j} denotes the filter coefficients of the graph filter. Since N_{n,p} is node-dependent, h_{n,j} needs to be properly defined according to n.

Typically, h_{n,j} may be parameterized as a function of the adjacency matrix A:

y = h(A) x,    (11)

where

h(A) = \sum_{k=0}^{K-1} h_k A^k = h_0 I + h_1 A + \ldots + h_{K-1} A^{K-1}.    (12)

Here h_k is the k-th filter coefficient, which quantifies the contribution from the k-hop neighbors, and K is the length of the graph filter. A^k determines the k-hop neighborhood by definition; thus a higher order corresponds to a larger filtering range in the graph vertex domain. Operating A on a graph signal computes the average of the neighboring signal values at each vertex, which is essentially a low-pass filter.

A can be replaced by other graph operators such as the graph Laplacian L:

h(L) = \sum_{k=0}^{K-1} h_k L^k = h_0 I + h_1 L + \ldots + h_{K-1} L^{K-1}.    (13)

Operating L on a graph signal sums up the signal differences between each vertex and its neighbors, which is essentially a high-pass filter.

B. Nodal-domain Optimization

Besides direct filtering as in (10) or (11), nodal-domain filtering often employs graph priors for regularization. Graph Smoothness Regularizers (GSRs), which introduce prior knowledge about the smoothness of the underlying graph signal, play a critical role in a wide range of inverse problems, such as depth map denoising [13], [65], point cloud denoising [14], [18], and inpainting [82].

1) Formulation: In general, the formulation to restore a geometric datum x with a signal prior, e.g., a GSR, is given by the following maximum a posteriori optimization problem:

x^\star = \arg\min_x \; \|y - H(x)\|_2^2 + \mu \cdot \mathrm{GSR}(x, \mathcal{G}),    (14)

where y is the observed signal and H(·) is a degradation operator (e.g., down-sampling) defined over x.
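Looping back to the direct filtering of (11) and (12), a short sketch (ours) accumulates the output with repeated sparse matrix-vector products, so no eigen-decomposition is required and the cost stays linear in the number of edges per filter tap:

```python
import numpy as np

def polynomial_filter(A, x, h):
    """Apply y = sum_k h_k A^k x, Eq. (12), using len(h)-1 sparse mat-vec products."""
    y = h[0] * x                # k = 0 term: h_0 I x
    z = x
    for hk in h[1:]:
        z = A @ z               # one more hop of neighborhood aggregation
        y = y + hk * z
    return y

# Example: a 3-tap low-pass filter; A may be a (normalized) sparse adjacency.
# y = polynomial_filter(A, x, h=np.array([0.5, 0.3, 0.2]))
```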

The first term in (14) is a data fidelity term; µ ∈ R balances the importance between the data fidelity term and the signal prior.

Next, we discuss two classes of commonly used GSRs—the Graph Laplacian Regularizer (GLR) and the Graph Total Variation (GTV)—as well as techniques to solve (14) with these priors. A property comparison of different GSRs is summarized in Table III.

2) Graph Laplacian Regularizer (GLR): The most commonly used GSR is the GLR. Given a graph signal x residing on the vertices of G encoded in the graph Laplacian L, the GLR can be expressed as

x^\top L x = \sum_{i \sim j} a_{i,j} (x_i - x_j)^2,    (15)

where i ∼ j means vertices i and j are connected, implying the underlying points on the geometry are highly correlated, and a_{i,j} is the corresponding element of the adjacency matrix A. The signal x is smooth with respect to G if the GLR is small, as connected vertices x_i and x_j must be similar for a large edge weight a_{i,j}; for a small a_{i,j}, x_i and x_j can differ significantly. This prior also possesses an interpretation in the frequency domain:

x^\top L x = \sum_{k=1}^{N} \lambda_k \hat{x}_k^2,    (16)

where λ_k is the k-th eigenvalue of L, and x̂_k is the k-th GFT coefficient. In other words, x̂_k^2 is the energy in the k-th graph frequency for geometric data x. Thus, a small x^⊤Lx means that most of the signal energy is occupied by the low-frequency components.

When we employ the GLR as the prior in (14) and assume H(·) is differentiable, (14) exhibits a closed-form solution. For simplicity, we assume H = I (e.g., as in the denoising case); then setting the derivative of (14) to zero yields

x^\star = (I + \mu L)^{-1} y,    (17)

which is a set of linear equations and can be solved directly or with conjugate gradient (CG) [83]. As L is a high-pass operator, the solution in (17) is essentially an adaptive low-pass filtering result from the observation y. This can also be indicated by the corresponding graph spectral response:

\hat{h}(\lambda_k) = 1 / (1 + \mu \lambda_k),    (18)

which is a low-pass filter since smaller λ_k's correspond to lower frequencies. As described in Section IV-B1, the low-pass filtering will lead to smoothed geometric data with the underlying shape retained.

Further, as discussed in Section II-C, the graph Laplacian operator converges to the Laplace-Beltrami operator on the geometry in the continuous manifold when the number of samples tends to infinity. We can also interpret the GLR from a continuous manifold perspective. According to [33], given a Riemannian manifold M (or surface) and a set of N points uniformly sampled on M, an ε-neighborhood graph G can be constructed with each vertex corresponding to one sample on M. For a function x on manifold M and its discrete samples x on graph G (a graph signal), under mild conditions,

\lim_{N \to \infty,\, \epsilon \to 0} x^\top L x \sim \frac{1}{|\mathcal{M}|} \int_{\mathcal{M}} \|\nabla_{\mathcal{M}} x(s)\|_2^2 \, ds,    (19)

where ∇_M is the gradient operator on manifold M, and s is the natural volume element of M [33]. In other words, the GLR converges to a smoothness functional defined on the associated Riemannian manifold. The relationship (19) reveals that the GLR essentially regularizes graph signals with respect to the underlying manifold geometry, which justifies the usefulness of the GLR [84].

In the aforementioned GLR, the graph Laplacian L is fixed, which does not promote reconstruction of the target signal with discontinuities if the corresponding edge weights are not very small. It is thus extended to the Reweighted GLR (RGLR) in [13], [81], [85] by considering L as a learnable function of the graph signal x. The RGLR is defined as

x^\top L(x) x = \sum_{i \sim j} a_{i,j}(x_i, x_j) (x_i - x_j)^2,    (20)

where a_{i,j}(x_i, x_j) can be learned from the data. Now we have two optimization variables x and a_{i,j}, which can be optimized alternately via proximal gradient [86]. It has been shown in [81] that minimizing the RGLR iteratively can promote piece-wise smoothness in the reconstructed graph signal x, assuming that the edge weights are appropriately initialized. Since geometric data often exhibit piece-wise smoothness, as discussed in Section III-B, the RGLR helps to promote this property in the reconstruction process.

3) Graph Total Variation (GTV): Another popular line of GSRs generalizes the well-known Total Variation (TV) regularizer [87] to graph signals, leading to the Graph Total Variation (GTV) and its variants. The GTV is defined as [88]

\|x\|_{\mathrm{GTV}} = \sum_{i \sim j} a_{i,j} \, |x_i - x_j|,    (21)

where a_{i,j} is fixed during the optimization. Since the GTV is non-differentiable, (21) has no closed-form solution, but it can be solved via existing optimization methods such as the primal-dual algorithm [89].

Instead of using a fixed A, Bai et al. extended the conventional GTV to the Reweighted GTV (RGTV) [81], where the graph weights are dependent on x:

\|x\|_{\mathrm{RGTV}} = \sum_{i \sim j} a_{i,j}(x_i, x_j) \, |x_i - x_j|.    (22)

This can be solved by ADMM [90] or the algorithm proposed in [81].

The work of [81] also provides spectral interpretations of the GTV and RGTV by rewriting them as ℓ₁-Laplacian operators on a graph. The spectral analysis demonstrates that the GTV is a stronger PWS-preserving filter than the GLR, and that the RGTV has desirable properties including robustness to noise and blur while promoting sharpness. Hence, the RGTV is advantageous for boosting the piece-wise smoothness of geometric data.
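As a concrete instance of the GLR pipeline (our sketch), the solution (17) is obtained without explicit matrix inversion by solving the sparse, symmetric positive-definite system (I + µL)x = y with conjugate gradient, one solve per coordinate channel:

```python
import numpy as np
from scipy.sparse import identity
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import cg

def glr_denoise(A, Y, mu=5.0):
    """Solve (I + mu * L) X = Y column by column, Eq. (17)."""
    M = identity(A.shape[0]) + mu * laplacian(A)   # sparse SPD system matrix
    X = np.empty_like(Y)
    for c in range(Y.shape[1]):                    # one CG solve per channel
        X[:, c], info = cg(M, Y[:, c])
        assert info == 0                           # info == 0 means CG converged
    return X

# X_hat = glr_denoise(knn_graph(noisy_pts, k=10), noisy_pts)  # point cloud case
```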

Fig. 5: Point cloud denoising results with Gaussian noise σ = 0.04 for Quasimoto [38]: (a) the ground truth; (b) the noisy point cloud; (c) the denoised result by graph spectral low-pass (LP) filtering that we implement according to (5); (d) the denoised result by the nodal-domain GSP method in [14]; (e) the denoised result by the nodal-domain GSP method in [18].

C. Applications in Geometric Data

In the following, we review a few works on geometric data restoration with nodal-domain GSP methods. First, we present a few applications recovering geometric data with the simple-yet-effective GLR, and then extend our scope to more advanced graph smoothness regularizers.

1) Geometric Data Restoration with the GLR: To cope with various geometric data restoration problems, GLR-based methods place more emphasis on the choice of the neighborhood graph and the algorithm design. For instance, to remove additive white Gaussian noise (AWGN) from depth images, Pang and Cheung [13] adopted the formulation in (14) with the GLR. To understand the behavior of the GLR for 2D depth images, [13] performs an analysis in the continuous domain, leading to an ε-neighborhood graph (Section III-D) which not only smooths out noise but also sharpens edges.

Zeng et al. [14] applied the GLR to point cloud denoising. In contrast to [13], they first formulated the denoising problem with a low-dimensional manifold model (LDMM) [91]. The LDMM prior suggests that clean point cloud patches are samples from a low-dimensional manifold embedded in the high-dimensional space, though it is non-trivial to minimize the dimension of a Riemannian manifold. With (19) and tools from differential geometry [92], it is possible to "convert" the LDMM signal prior to the GLR. Hence, the problem of minimizing the manifold dimension is approximated by iteratively solving a quadratic program with the GLR.

Instead of constructing the underlying graph with pre-defined edge weights from hand-crafted parameters, Hu et al. proposed feature graph learning by minimizing the GLR using the Mahalanobis distance metric matrix M as a variable, assuming a feature vector per node is available [18]. Then the graph Laplacian L becomes a function of M, i.e., L(M). A fast algorithm with the GLR is presented and applied to point cloud denoising, where the graph for each set of self-similar patches is computed from 3D coordinates and surface normals as features.

2) Geometric Data Restoration with Other GSRs: Despite the simplicity of the GLR, applying it to geometric data restoration involves sophisticated graph construction or algorithmic procedures. This has motivated the development of other geometric data restoration methods using various GSRs that are tailored to specific restoration tasks.

To remove noise on point clouds, the method proposed in [80] first assumes smoothness in the gradient ∇_G Y of the point cloud Y on a graph G, leading to a Tikhonov regularization GSR_Tik(Y) = ‖∇_G Y‖_2^2, which is equivalent to the simple GLR. The method further assumes the underlying manifold of the point cloud to be piece-wise smooth rather than smooth, and then replaces the Tikhonov regularization with the GTV regularization (21), i.e., GSR_TV(Y) = ‖∇_G Y‖_1. In [79], Elmoataz et al. also applied the GTV for mesh filtering to simplify 3D geometry.

In [45], Dinesh et al. applied the RGTV (22) to regularize the surface normals for point cloud denoising, where the edge weight between two nodes is a function of the normals. Moreover, they established a linear relationship between normals and 3D point coordinates via bipartite graph approximation for ease of optimization. To perform point cloud inpainting, Hu et al. [82] also applied a modified GTV called the Anisotropic GTV (AGTV) as a metric to measure the similarity of point cloud patches.

3) Restoration with Nodal-domain Filtering: Solving optimization problems can be formidable and sometimes even impractical. An alternative strategy for geometric data recovery is to perform filtering in the nodal domain, as discussed in Section V-A. Examples include point cloud re-sampling [64] and depth image enhancement [93]. Essentially, nodal-domain filtering aims at "averaging" the samples of a graph signal adaptively, either locally or non-locally.

On the other hand, compared to the hand-crafted assumptions


made in model-based approaches, learning-based approaches
effectively learn to abstract high-level (or semantic) features
with a training process [95]. Consequently, they are more
suitable for high-level applications such as segmentation [96]
and classification [97].
(a) Spectral graph convolution. Convolutional Neural Networks (CNNs) have shown to
be extremely effective for a wide range of imaging tasks
but have been designed to process data defined on regular
grids, where spatial relationships between data samples (e.g.,
top, bottom, left and right) are uniquely defined. In order
1-hop to leverage such networks for geometric data, some prior
2-hop
works transform irregular geometric data to regular 3D voxel
(b) Spatial graph convolution. grids or collections of 2D images before feeding them to a
Fig. 6: Graph convolution operations. neural network [98], [99] or impose certain symmetries in
the network computation (e.g., PointNet [100], PointNet++
[101], PointCNN [102]). As discussed in Section III-C, non-
(13), where Lk = (UΛU> )k = UΛk U> since U> U = I. It graph representations are sometimes redundant, inaccurate or
follows that (13) can be rewritten as deficient in data structural description.
In contrast, GSP provides efficient filtering and sampling of
h(L) = Uh(Λ)U> , (23) such data with insightful spectral interpretation, which is able
which corresponds to a graph spectral filter with h(Λ) as to generalize the key operations (e.g., convolution and pooling)
the spectral response in (4). Another example is the spectral in a neural network to irregular geometric data. For example,
response of the solution to a GLR-regularized optimization graph convolution can be defined as graph filtering either in
problem, as presented in (18). the spectral or the spatial domain. This leads to the recently
In addition, some nodal-domain graph filtering methods are developed Graph Neural Networks (GNNs) (see [19] and ref-
approximations of spectral-domain filtering, including polyno- erences therein), which generalize CNNs to unstructured data.
mial approximations of graph spectral filters like Chebyshev GNNs have achieved success in both analysis and synthesis
polynomials [15]–[17] for depth map enhancement, as well as of geometric data. The input geometric features at each point
lifting transforms on graphs [71] for depth map coding. (vertex) of GNNs are usually assigned with coordinates, laser
Convolutional Neural Networks (CNNs) have been shown to be extremely effective for a wide range of imaging tasks, but they are designed to process data defined on regular grids, where spatial relationships between data samples (e.g., top, bottom, left and right) are uniquely defined. In order to leverage such networks for geometric data, some prior works transform irregular geometric data to regular 3D voxel grids or collections of 2D images before feeding them to a neural network [98], [99], or impose certain symmetries in the network computation (e.g., PointNet [100], PointNet++ [101], PointCNN [102]). As discussed in Section III-C, non-graph representations are sometimes redundant, inaccurate or deficient in data structural description.

In contrast, GSP provides efficient filtering and sampling of such data with insightful spectral interpretation, which makes it possible to generalize the key operations in a neural network (e.g., convolution and pooling) to irregular geometric data. For example, graph convolution can be defined as graph filtering either in the spectral or the spatial domain. This leads to the recently developed Graph Neural Networks (GNNs) (see [19] and references therein), which generalize CNNs to unstructured data. GNNs have achieved success in both analysis and synthesis of geometric data. The input features at each point (vertex) of a GNN are usually coordinates, laser intensities or colors, while features at each edge are usually geometric similarities between the two connected points.
Nonetheless, learning-based methods face common issues such as interpretability, robustness and generalization [94], [103]. In the remainder of this section, we will particularly discuss the interpretability of GNNs from the perspective of GSP for geometric data, which is expected to inspire more interpretable, robust, and generalizable designs of GNNs.

A. Interpreting Graph Convolution with GSP

GSP tools, particularly graph filters, inspire some early designs of basic operations in GNNs, including spectral graph convolution and spectrum-free graph convolution. In addition, GSP provides interpretation for spatial graph convolution from the perspective of spatial graph filtering.

1) Spectral Graph Convolution: As there is no clear definition of shift-invariance over graphs in the nodal domain, one may define graph convolution in the spectral domain via graph transforms according to the Convolution Theorem. That is, the graph convolution of signal f ∈ R^N and filter g ∈ R^N in the spectral domain with respect to the underlying graph G can be expressed as the element-wise product of their graph transforms:

    g ⋆_G f = U(U^⊤g ⊙ U^⊤f),    (24)
where U is the GFT basis and ⊙ denotes the Hadamard product. Let g_θ = diag(U^⊤g); the graph convolution can then be simplified as

    g ⋆_G f = U g_θ U^⊤ f.    (25)

The key difference among various spectral GNN methods is the choice of the filter g_θ, which captures the holistic appearance of the geometry. In an early work [104], g_θ = Θ is a learnable diagonal matrix.

As schematically shown in Fig. 6(a), the spectral-domain graph convolution (25) is essentially the spectral graph filtering defined in (4) if the diagonal entries of g_θ are the graph frequency response ĥ(λ_k). As g_θ is often learned so as to adapt to various tasks, it is analogous to graph spectral filtering with a desired distribution as discussed in Section IV-B3. Hence, we are able to interpret spectral-domain graph convolution via spectral graph filtering for geometric data processing.
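The equivalence of (24) and (25) is easy to verify numerically; the sketch below is purely illustrative, with an assumed toy graph and filter (in spectral GNNs such as [104], the diagonal entries of g_θ would instead be learned):

    # Illustrative sketch of spectral graph convolution, eqs. (24)-(25)
    import numpy as np

    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    L = np.diag(A.sum(1)) - A
    _, U = np.linalg.eigh(L)                # GFT basis

    f = np.array([0.5, 1.0, 0.0, 2.0])      # graph signal
    g = np.array([1.0, 0.3, 0.1, 0.0])      # an assumed filter signal

    # (24): transform both, multiply element-wise, transform back
    conv24 = U @ ((U.T @ g) * (U.T @ f))

    # (25): the same operation with a diagonal spectral filter g_theta
    g_theta = np.diag(U.T @ g)
    conv25 = U @ g_theta @ U.T @ f

    print(np.allclose(conv24, conv25))      # True: the two forms coincide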
2) Spectrum-free Graph Convolution: It has been noted earlier that the eigen-decomposition required by spectral-domain graph convolution incurs relatively high computational complexity. However, one may parameterize the filter using a smooth spectral transfer function Θ(Λ) [105]. One choice is to represent Θ(Λ) as a K-degree polynomial, such as the Chebyshev polynomial, which approximates the graph kernel well [15]:

    Θ(Λ) = Σ_{k=0}^{K−1} T_k(Λ̃),    (26)

where T_k(·) denotes the Chebyshev polynomial. It is defined as T_0(Λ̃) = I, T_1(Λ̃) = Λ̃, T_k(Λ̃) = 2Λ̃ T_{k−1}(Λ̃) − T_{k−2}(Λ̃). Λ̃ denotes the eigenvalues normalized to [−1, 1], the domain on which the Chebyshev polynomials are defined.
Combining (25) and (26), we have

    g ⋆_G f = U Θ(Λ) U^⊤ f    (27)
            ≈ U Σ_{k=0}^{K} T_k(Λ̃) U^⊤ f = Σ_{k=0}^{K} T_k(L̃) f,    (28)

where L̃ = U Λ̃ U^⊤ is a normalized graph Laplacian. This leads to the well-known ChebNet [106].
If we only consider the 1-degree Chebyshev polynomial, namely K = 1, it leads to the widely used Graph Convolutional Network (GCN) [107]. With a series of simplifications and renormalization, the convolutional layer of the GCN takes the form:

    g ⋆_G f = D̃^{−1/2} Ã D̃^{−1/2} Φ,    (29)

where Ã = A + I is the renormalized adjacency matrix, D̃ is the corresponding degree matrix, and Φ is a matrix of filter parameters.
While inspired from a graph spectral viewpoint, both ChebNet and the GCN can be implemented in the spatial domain directly, and are thus referred to as spectrum-free. The spectrum-free convolution in (28) and (29) is essentially the nodal-domain graph filtering presented in (13) and (12), respectively. For instance, the graph convolution in the GCN is a simple one-hop neighborhood averaging.
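A minimal sketch of the renormalized propagation in (29) makes this one-hop averaging explicit; the toy graph, feature matrix and filter parameters below are assumed for illustration only:

    # Illustrative sketch of one GCN layer, eq. (29)
    import numpy as np

    rng = np.random.default_rng(0)
    A = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]], dtype=float)

    A_t = A + np.eye(3)                     # renormalization: Ã = A + I
    D_is = np.diag(A_t.sum(1) ** -0.5)      # D̃^{-1/2}
    S = D_is @ A_t @ D_is                   # normalized one-hop averaging

    X = rng.standard_normal((3, 4))         # assumed node features (N x d)
    Phi = rng.standard_normal((4, 2))       # assumed filter parameters Φ

    H = np.maximum(S @ X @ Phi, 0.0)        # one GCN layer with ReLU
    print(H.shape)                          # (3, 2)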
3) Spatial Graph Convolution: Analogous to the convolution in CNNs, spatial graph convolution aggregates the information of neighboring vertices to capture the local geometric structure in the spatial domain, leading to feature propagation over adjacent vertices that enforces the smoothness of geometric data to some extent [108]–[110]. Such graph convolution filters over the neighborhood of each vertex in the spatial domain are essentially nodal-domain graph filters from the perspective of GSP.
As a representative spatial method on point clouds, Wang et al. introduced the concept of edge convolution [110], which generates edge features that characterize the relationships between each point and its neighbors. The edge convolution exploits local geometric structure and can be stacked to learn global geometric properties. Let x_i ∈ R^d and x_j ∈ R^d denote the graph signal on the i-th and j-th vertex, respectively; the output of edge convolution is:

    x′_i = Ψ_{(i,j)∈E} h(x_i, x_j) ∈ R^d,    (30)

where E is the set of edges and h(·, ·) is a generic edge feature function, implemented by a certain neural network. Ψ is a generic aggregation function, which could be the summation or maximum operation. The operation of (30) is demonstrated in Fig. 6(b), where we could consider not only the 1-hop neighbors but also 2-hop neighbors or more.

The edge convolution is also similar to nodal-domain graph filtering: both aggregate neighboring information; further, the edge convolution specifically models each pairwise relationship by a non-parametric function.
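The sketch below mimics one edge-convolution layer in the spirit of (30); the edge set, the small edge function h operating on (x_i, x_j − x_i), and the max aggregation Ψ are assumed design choices for illustration rather than the exact formulation in [110]:

    # Illustrative sketch of edge convolution, eq. (30)
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((5, 3))         # N = 5 points, d = 3 features
    edges = [(0, 1), (1, 0), (1, 2), (2, 1),
             (2, 3), (3, 2), (3, 4), (4, 3)]  # assumed symmetric edge set E

    W = rng.standard_normal((6, 8))         # weights of a tiny edge MLP h

    def h(xi, xj):
        # operate on the pair (x_i, x_j - x_i), then a linear map and ReLU
        return np.maximum(np.concatenate([xi, xj - xi]) @ W, 0.0)

    X_out = np.zeros((5, 8))
    for i in range(5):
        feats = [h(X[i], X[j]) for a, j in edges if a == i]
        X_out[i] = np.max(feats, axis=0)    # Ψ = max over the neighbors of i

    print(X_out.shape)                      # (5, 8)

Stacking such layers enlarges the receptive field, which is how local edge features accumulate into global geometric properties.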
B. Understanding Representation Learning of GNNs with GSP

GSP tools also provide interpretation for representation learning of GNNs, as discussed below.
1) Low-pass Graph Filtering of Features: Wu et al. [111] propose to simplify GCNs by successively removing the nonlinearities in GCNs and collapsing the weight matrices between consecutive layers, and show that the resulting simple graph convolution (SGC) corresponds to a fixed low-pass filter followed by a linear classifier. That is, SGC acts as a low-pass filter that produces smooth features over adjacent nodes in the graph. As a result, nearby nodes tend to share similar representations and, consequently, predictions.

Fu et al. [112] show that several popular GNNs can be interpreted as implicitly implementing denoising and/or smoothing of graph signals. In particular, spectral graph convolutions [106], [107] work as denoising node features, while graph attentions [62], [113] work as denoising edge weights.
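The collapse of K linear GCN layers into one fixed low-pass filter is easy to see in code; the sketch below is illustrative, with an assumed toy graph, random features, and a linear classification head:

    # Illustrative sketch of SGC [111]: S^K X followed by a linear classifier
    import numpy as np

    rng = np.random.default_rng(2)
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 1],
                  [1, 0, 0, 1],
                  [0, 1, 1, 0]], dtype=float)
    A_t = A + np.eye(4)
    D_is = np.diag(A_t.sum(1) ** -0.5)
    S = D_is @ A_t @ D_is                   # fixed low-pass graph filter

    X = rng.standard_normal((4, 5))         # assumed node features
    K = 2
    X_smooth = np.linalg.matrix_power(S, K) @ X   # K-step feature smoothing

    Theta = rng.standard_normal((5, 3))     # linear classifier weights
    logits = X_smooth @ Theta               # nearby nodes get similar logits
    print(logits.shape)                     # (4, 3)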
2) Introducing Domain Knowledge via GSP-based Regularization: Some works introduce domain knowledge via GSP-based regularization (e.g., GSRs) for better understanding the representational properties of GNNs. For instance, Te et al. proposed the Regularized Graph Convolutional Neural Network (RGCNN) [114] as one of the first approaches to utilize GNNs for point cloud segmentation, which regularizes each layer by the GLR introduced in Section V-B2. This prior essentially enforces the features of vertices within each connected component of the graph to be similar; it is incorporated into the loss function and enables explainable and robust segmentation. Also, the GLR has a spectral smoothing functionality as discussed in Section V-B2, i.e., low-frequency components are better preserved. Such regularization is robust to both low density and noise in point clouds.
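As a sketch of how such a regularizer enters training (illustrative PyTorch-style code; the dense adjacency matrix and the trade-off weight are assumed, and the task loss is omitted):

    # Illustrative sketch of a GLR penalty tr(F^T L F) added to a loss
    import torch

    def glr(F, W):
        """Graph Laplacian regularizer for features F (N x d) on a graph
        with adjacency W (N x N): small when connected vertices have
        similar features."""
        L = torch.diag(W.sum(dim=1)) - W    # combinatorial Laplacian
        return torch.trace(F.t() @ L @ F)

    W = torch.tensor([[0., 1., 0.],
                      [1., 0., 1.],
                      [0., 1., 0.]])
    F = torch.randn(3, 4, requires_grad=True)   # per-layer features

    loss = 0.1 * glr(F, W)                  # in practice: task loss + mu * GLR
    loss.backward()                         # gradients pull adjacent features
    print(F.grad.shape)                     # closer together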
3) Inferring Data Structure via GSP-based Graph Learning: Dong et al. [115] discuss that GSP-based graph learning frameworks enhance model interpretability by inferring hidden relational structure from data, which leads to a better understanding of a complex system. In particular, GSP-based graph learning has the unique advantage of enforcing certain desirable representations of the signals via frequency-domain analysis and filtering operations on graphs. For instance, models based on assumptions such as the smoothness or the diffusion of the graph signals show their superiority on geometric structure [53], [110], [116]–[118]. Please refer to [115] for more discussions.

4) Monitoring Intermediate Representations via GSP: Gripon et al. [119] improve interpretability by using GSP to monitor the intermediate representations obtained in a deep neural network. They demonstrate that the smoothness of the label signal on a k-nearest-neighbor feature graph is a good measure of the separation of classes in these intermediate representations.
C. Enhancing Robustness and Generalizability with GSP

1) Robustness: “Robustness” of a deep learning network may refer to: 1) robustness to noisy data or labels; 2) robustness to incomplete data; 3) robustness to a few training samples with supervised information (e.g., few-shot learning), etc. As discussed in Section VI-B2, introducing domain knowledge via GSP-based regularization leads to geometric deep learning that is robust to noisy data and incomplete data. Besides, Ziko et al. [120] proposed a transductive Laplacian-regularized inference for few-shot learning tasks, which encourages nearby query samples to have consistent label assignments and thus leads to robust performance.

2) Generalizability: The generalizability of a model expresses how well the model will perform on unseen data. Regarding generalizability, we assume the following hypothesis: even when the unseen data demonstrate rather different distribution characteristics, there may be an intrinsic structure embedded in the data, and such intrinsic structure usually can be better maintained from seen datasets to unseen data. That is, the data structure is assumed to be more stable than the data themselves. Consequently, when GSP tools are incorporated into deep learning networks, the graph structure (motivated by the data structure) can provide extra insights and guidance from the structure domain, in addition to the data domain, which finally enhances the generalizability of the network. For example, deep GLR [121] integrates graph Laplacian regularization as a trainable module into a deep learning framework, which exhibits strong cross-domain generalization ability.

VII. FUTURE DIRECTIONS

Regardless of the great success of GSP methods in various applications involving geometric data processing and analysis, there remain quite a few challenges ahead. Some open problems and potential future research directions include:
• GSP for time-varying geometric data processing: Unlike regularly sampled videos, 4D geometric data are characterized by irregularly sampled points, both spatially and temporally, and the number of points in each time instance may also vary. This makes it challenging to establish temporal correspondences and exploit the temporal information. While some works have been done in the context of 4D point cloud compression [46], [55] and restoration [47], [54] with GSP, it still remains challenging to address complex scenarios with fast motion.
• GSP for implicit geometric data processing: While not discussed in detail, the presented GSP framework embraces the processing of geometric data that is implicitly contained in the data, e.g., multi-view representations and light fields. For instance, Maugey et al. [122] proposed a graph-based representation of geometric information based on multi-view images. While this work aims at more efficient compression, we believe the use of such representations can potentially be leveraged for a wide range of inference tasks as well.
• GSP for enhancing model interpretability: GSP paves an insightful way to interpretable geometric deep learning, which also leads to more robust and generalizable deep learning. While we have discussed the interpretability of GNNs via GSP from several aspects in Section VI, we believe further steps could be made toward more interpretable geometric deep learning and even reasoning in artificial intelligence.
• GSP for model-based geometric deep learning: As mentioned in Section VI, model-based GSP methods lack flexibility but perform robustly for different input data, while learning-based methods (e.g., with GNNs) are highly flexible but may not generalize well. Hence, it is desirable to explicitly integrate GSP models (e.g., the GSRs in Section V-B) into learning-based methods to retain the benefits of both paradigms [94], [103].

VIII. CONCLUSIONS

We present a generic GSP framework for geometric data, from theory to applications. Distinguished from other graph signals, geometric data are discrete samples of continuous 3D surfaces, which exhibit unique characteristics such as piecewise smoothness that can be compactly, accurately, and adaptively represented on graphs. Hence, graph signal processing (GSP) is naturally advantageous for the processing and analysis of geometric data, with interpretations in both the discrete domain and the continuous domain with Riemannian geometry. In particular, we discuss spectral-domain GSP methods and nodal-domain GSP methods, as well as their relation. Further, we provide the interpretability of Graph Neural Networks (GNNs) from the perspective of GSP, highlighting that the basic graph convolution operation is essentially graph spectral or nodal filtering and that representation learning of GNNs can be understood or enhanced by GSP. We anticipate this interpretation will inspire future research on more principled GNN designs that leverage the key GSP concepts and theory. Finally, we discuss potential future directions and challenges in GSP for geometric data as well as GSP-based interpretable GNN designs.

REFERENCES

[1] G. C. Burdea and P. Coiffet, Virtual reality technology. John Wiley & Sons, 2003.
[2] D. Schmalstieg and T. Hollerer, Augmented reality: Principles and practice. Addison-Wesley Professional, 2016.
[3] S. Chen, B. Liu, C. Feng, C. Vallespi-Gonzalez, and C. Wellington, “3D point cloud processing and learning for autonomous driving,” IEEE Signal Process. Mag., 2020.
[4] C. Benedek, “3D people surveillance on range data sequences of a rotating LiDAR,” Pattern Recognit. Lett., vol. 50, pp. 149–158, 2014.
[5] E. H. Adelson and J. R. Bergen, “The plenoptic function and the elements of early vision,” MIT Press, 1991.
[6] T. Ebrahimi, S. Fossel, F. Pereira, and P. Schelkens, “JPEG Pleno: Toward an efficient representation of visual reality,” IEEE Multimedia, vol. 23, no. 4, pp. 14–20, 2016.
[7] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, 2013.
[8] A. Sandryhaila and J. M. Moura, “Discrete signal processing on graphs,” IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, 2013.
[9] A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proc. IEEE, vol. 106, no. 5, pp. 808–828, 2018.
[10] W. Hu, G. Cheung, A. Ortega, and O. C. Au, “Multiresolution graph Fourier transform for compression of piecewise smooth images,” IEEE Trans. Image Process., vol. 24, no. 1, pp. 419–433, 2015.
[11] D. Ting, L. Huang, and M. Jordan, “An analysis of the convergence of graph Laplacians,” in Proc. Int. Conf. Mach. Learn., 2010, pp. 1079–1086.
[12] M. Hein, “Uniform convergence of adaptive graph-based regularization,” in Proc. Int. Conf. Comput. Learn. Theory, 2006, pp. 50–64.
[13] J. Pang and G. Cheung, “Graph Laplacian regularization for image denoising: Analysis in the continuous domain,” IEEE Trans. Image Process., vol. 26, no. 6, pp. 1770–1785, 2017.
[14] J. Zeng, G. Cheung, M. Ng, J. Pang, and Y. Cheng, “3D point cloud denoising using graph Laplacian regularization of a low dimensional manifold model,” IEEE Trans. Image Process., vol. 29, pp. 3474–3489, December 2019.
[15] D. K. Hammond, P. Vandergheynst, and R. Gribonval, “Wavelets on graphs via spectral graph theory,” Appl. Comput. Harmonic Anal., vol. 30, no. 2, pp. 129–150, 2011.
[16] A. Gadde, S. K. Narang, and A. Ortega, “Bilateral filter: Graph spectral interpretation and extensions,” in Proc. IEEE Int. Conf. Image Process., 2013, pp. 1222–1226.
[17] D. Tian, H. Mansour, A. Knyazev, and A. Vetro, “Chebyshev and conjugate gradient filters for graph image denoising,” in Proc. IEEE Int. Conf. Multimedia Expo Workshop, 2014, pp. 1–6.
[18] W. Hu, X. Gao, G. Cheung, and Z. Guo, “Feature graph learning for 3D point cloud denoising,” IEEE Trans. Signal Process., vol. 68, pp. 2841–2856, 2020.
[19] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning: Going beyond Euclidean data,” IEEE Signal Process. Mag., vol. 34, no. 4, pp. 18–42, 2017.
[20] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and S. Y. Philip, “A comprehensive survey on graph neural networks,” IEEE Trans. Neural Netw. Learn. Syst., 2020.
[21] Y. Guo, H. Wang, Q. Hu, H. Liu, L. Liu, and M. Bennamoun, “Deep learning for 3D point clouds: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., June 2020.
[22] L. Stankovic, D. Mandic, M. Dakovic, M. Brajovic, B. Scalzo, S. Li, and A. G. Constantinides, “Graph signal processing–Part III: Machine learning on graphs, from graph topology to applications,” arXiv preprint arXiv:2001.00426, 2020.
[23] F. R. Chung and F. C. Graham, Spectral graph theory. American Mathematical Soc., 1997, no. 92.
[24] “Middlebury stereo datasets,” http://vision.middlebury.edu/stereo/data/, accessed October 12, 2019.
[25] A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser, and M. Nießner, “ScanNet: Richly-annotated 3D reconstructions of indoor scenes,” in Proc. CVPR, 2017, pp. 5828–5839.
[26] Q. Cai and P. A. Chou, “Microsoft voxelized upper bodies—a voxelized point cloud dataset,” in ISO/IEC JTC1/SC29 WG11 ISO/IEC JTC1/SC29/WG1 input document m38673/M72012, May 2016.
[27] H. Rue and L. Held, Gaussian Markov Random Fields: Theory and applications. CRC Press, 2005.
[28] H. E. Egilmez, E. Pavez, and A. Ortega, “Graph learning from data under Laplacian and structural constraints,” IEEE J. Sel. Topics Signal Process., vol. 11, no. 6, pp. 825–841, 2017.
[29] C. Zhang and D. Florêncio, “Analyzing the optimality of predictive transform coding using graph-based models,” IEEE Signal Process. Lett., vol. 20, no. 1, pp. 106–109, 2012.
[30] W. Hu, G. Cheung, and A. Ortega, “Intra-prediction and generalized graph Fourier transform for image coding,” IEEE Signal Process. Lett., vol. 22, no. 11, pp. 1913–1917, 2015.
[31] C. Zhang, P. Chou, and D. Florencio, “Graph signal processing–A probabilistic framework,” URL https://sigport.org/sites/default/files/graphSP_prob.pdf, 2016.
[32] A. Singer, “From graph to manifold Laplacian: The convergence rate,” Appl. Comput. Harmonic Anal., vol. 21, no. 1, pp. 128–134, 2006.
[33] M. Hein, J.-Y. Audibert, and U. v. Luxburg, “Graph Laplacians and their convergence on random neighborhood graphs,” J. Mach. Learn. Res., vol. 8, pp. 1325–1368, 2007.
[34] “FlyingThings3D datasets,” https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html, accessed October 12, 2019.
[35] “Tsukuba datasets,” https://home.cvlab.cs.tsukuba.ac.jp/dataset, accessed October 12, 2019.
[36] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite,” in Proc. CVPR, 2012, pp. 3354–3361.
[37] M. Levoy, J. Gerth, B. Curless, and K. Pulli, “The Stanford 3D scanning repository, 2005,” URL http://www-graphics.stanford.edu/data/3dscanrep (accessed September 2017), 2005.
[38] M. Berger, J. A. Levine, L. G. Nonato, G. Taubin, and C. T. Silva, “A benchmark for surface reconstruction,” ACM Trans. Graphics, vol. 32, no. 2, p. 20, 2013.
[39] E. d’Eon, B. Harrison, T. Myers, and P. A. Chou, “8i voxelized full bodies, version 2—A voxelized point cloud dataset,” ISO/IEC JTC1/SC29/WG11 m40059 ISO/IEC JTC1/SC29/WG1 M74006, Jan. 2017.
[40] A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su et al., “ShapeNet: An information-rich 3D model repository,” arXiv preprint arXiv:1512.03012, 2015.
[41] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, “3D ShapeNets: A deep representation for volumetric shapes,” in Proc. CVPR, 2015, pp. 1912–1920.
[42] I. Armeni, O. Sener, A. R. Zamir, H. Jiang, I. Brilakis, M. Fischer, and S. Savarese, “3D semantic parsing of large-scale indoor spaces,” in Proc. CVPR, 2016, pp. 1534–1543.
[43] “WAYMO open dataset,” https://github.com/waymo-research/waymo-open-dataset, accessed October 12, 2019.
[44] J. Behley, M. Garbade, A. Milioto, J. Quenzel, S. Behnke, C. Stachniss, and J. Gall, “SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences,” in Proc. IEEE Int. Conf. Comput. Vis., 2019.
[45] C. Dinesh, G. Cheung, and I. V. Bajić, “Point cloud denoising via feature graph Laplacian regularization,” IEEE Trans. Image Process., vol. 29, pp. 4143–4158, 2020.
[46] D. Thanou, P. A. Chou, and P. Frossard, “Graph-based compression of dynamic 3D point cloud sequences,” IEEE Trans. Image Process., vol. 25, no. 4, pp. 1765–1778, 2016.
[47] Z. Fu, W. Hu, and Z. Guo, “3D dynamic point cloud inpainting via temporal consistency on graphs,” Proc. IEEE Int. Conf. Multimedia Expo, July 2020.
[48] R. Schnabel and R. Klein, “Octree-based point-cloud compression,” SPBG, vol. 6, pp. 111–120, 2006.
[49] X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, “Multi-view 3D object detection network for autonomous driving,” in Proc. CVPR, July 2017.
[50] B. Curless and M. Levoy, “A volumetric method for building complex models from range images,” in Proc. SIGGRAPH, 1996, pp. 303–312.
[51] J. J. Park, P. Florence, J. Straub, R. Newcombe, and S. Lovegrove, “DeepSDF: Learning continuous signed distance functions for shape representation,” in Proc. CVPR, 2019, pp. 165–174.
[52] C. Choy, J. Gwak, and S. Savarese, “4D spatio-temporal convnets: Minkowski convolutional neural networks,” in Proc. CVPR, 2019, pp. 3075–3084.
[53] X. Gao, W. Hu, J. Tang, J. Liu, and Z. Guo, “Optimized skeleton-based action recognition via sparsified graph regression,” in Proc. ACM Int. Conf. Multimedia, 2019, pp. 601–610.
[54] Z. Fu and W. Hu, “Dynamic point cloud inpainting via spatial-temporal graph learning,” IEEE Trans. Multimedia, 2021.
[55] Y. Xu, W. Hu, S. Wang, X. Zhang, S. Wang, S. Ma, Z. Guo, and W. Gao, “Predictive generalized graph Fourier transform for attribute compression of dynamic point clouds,” arXiv preprint arXiv:1908.01970, 2019.
[56] G. Shen, W.-S. Kim, S. K. Narang, A. Ortega, J. Lee, and H. Wey, “Edge-adaptive transforms for efficient depth map coding,” in Proc. Picture Coding Symp., 2010, pp. 566–569.
[57] C. Zhang, D. Florencio, and C. Loop, “Point cloud attribute compression with graph transform,” in Proc. IEEE Int. Conf. Image Process., 2014, pp. 2066–2070.
[58] X. Dong, D. Thanou, M. Rabbat, and P. Frossard, “Learning graphs from data: A signal representation perspective,” IEEE Signal Process. Mag., vol. 36, no. 3, pp. 44–63, 2019.
[59] G. Mateos, S. Segarra, A. G. Marques, and A. Ribeiro, “Connecting the dots: Identifying network structure via graph signal processing,” IEEE Signal Process. Mag., vol. 36, no. 3, pp. 16–43, 2019.
[60] L. Franceschi, M. Niepert, M. Pontil, and X. He, “Learning discrete structures for graph neural networks,” in Proc. Int. Conf. Mach. Learn., 2019, pp. 1972–1982.
[61] W. Jin, Y. Ma, X. Liu, X. Tang, S. Wang, and J. Tang, “Graph structure learning for robust graph neural networks,” in Proc. ACM Int. Conf. Knowledge Discovery & Data Mining, 2020, pp. 66–74.
[62] R. Li, S. Wang, F. Zhu, and J. Huang, “Adaptive graph convolutional neural networks,” in Proc. AAAI Conf. Artif. Intell., 2018.
[63] L. Le Magoarou, R. Gribonval, and N. Tremblay, “Approximate fast graph Fourier transforms via multilayer sparse approximations,” IEEE Trans. Signal Inf. Process. Netw., vol. 4, no. 2, pp. 407–420, 2017.
[64] S. Chen, D. Tian, C. Feng, A. Vetro, and J. Kovačević, “Fast resampling of three-dimensional point clouds via graphs,” IEEE Trans. Signal Process., vol. 66, no. 3, pp. 666–681, 2017.
[65] W. Hu, X. Li, G. Cheung, and O. Au, “Depth map denoising using graph-based transform and group sparsity,” in Proc. IEEE Workshop Multimedia Signal Process., 2013, pp. 001–006.
[66] G. Rosman, A. Dubrovina, and R. Kimmel, “Patch-collaborative spectral point-cloud denoising,” in Comput. Graphics Forum, vol. 32, no. 8, 2013, pp. 1–12.
[67] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising with block-matching and 3D filtering,” in Image Process.: Algorithms Syst. Neural Netw. Mach. Learn., vol. 6064, 2006, p. 606414.
[68] G. Cheung, W.-S. Kim, A. Ortega, J. Ishida, and A. Kubota, “Depth map coding using graph based transform and transform domain sparsification,” in Proc. IEEE Workshop Multimedia Signal Process., 2011, pp. 1–6.
[69] W.-S. Kim, S. K. Narang, and A. Ortega, “Graph based transforms for depth video coding,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Process., 2012, pp. 813–816.
[70] W. Hu, G. Cheung, X. Li, and O. Au, “Depth map compression using multi-resolution graph-based transform for depth-image-based rendering,” in Proc. IEEE Int. Conf. Image Process., 2012, pp. 1297–1300.
[71] Y.-H. Chao, A. Ortega, W. Hu, and G. Cheung, “Edge-adaptive depth map coding with lifting transform on graphs,” in Proc. Picture Coding Symp., 2015, pp. 60–64.
[72] R. A. Cohen, D. Tian, and A. Vetro, “Attribute compression for sparse point clouds using graph transforms,” in Proc. IEEE Int. Conf. Image Process., 2016, pp. 1374–1378.
[73] ——, “Point cloud attribute compression using 3-D intra prediction and shape-adaptive transforms,” in Proc. Data Compression Conf., 2016, pp. 141–150.
[74] Y. Shao, Z. Zhang, Z. Li, K. Fan, and G. Li, “Attribute compression of 3D point clouds using Laplacian sparsity optimized graph transform,” in Proc. IEEE Vis. Commun. Image Process., 2017, pp. 1–4.
[75] Y. Xu, W. Hu, S. Wang, X. Zhang, S. Wang, S. Ma, and W. Gao, “Cluster-based point cloud coding with normal weighted graph Fourier transform,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Process., 2018, pp. 1753–1757.
[76] R. L. de Queiroz and P. A. Chou, “Transform coding for point clouds using a Gaussian process model,” IEEE Trans. Image Process., vol. 26, no. 7, pp. 3507–3517, 2017.
[77] P. de Oliveira Rente, C. Brites, J. Ascenso, and F. Pereira, “Graph-based static 3D point clouds geometry coding,” IEEE Trans. Multimedia, vol. 21, no. 2, pp. 284–299, 2018.
[78] A. Anis, P. A. Chou, and A. Ortega, “Compression of dynamic 3D point clouds using subdivisional meshes and graph wavelet transforms,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Process., 2016, pp. 6360–6364.
[79] A. Elmoataz, O. Lezoray, and S. Bougleux, “Nonlocal discrete regularization on weighted graphs: A framework for image and manifold processing,” IEEE Trans. Image Process., vol. 17, no. 7, pp. 1047–1060, 2008.
[80] Y. Schoenenberger, J. Paratte, and P. Vandergheynst, “Graph-based denoising for time-varying point clouds,” in Proc. IEEE 3DTV Conf., 2015, pp. 1–4.
[81] Y. Bai, G. Cheung, X. Liu, and W. Gao, “Graph-based blind image deblurring from a single photograph,” IEEE Trans. Image Process., vol. 28, no. 3, pp. 1404–1418, 2018.
[82] W. Hu, Z. Fu, and Z. Guo, “Local frequency interpretation and non-local self-similarity on graph for point cloud inpainting,” IEEE Trans. Image Process., vol. 28, no. 8, pp. 4087–4100, 2019.
[83] O. Axelsson and G. Lindskog, “On the rate of convergence of the preconditioned conjugate gradient method,” Numerische Mathematik, vol. 48, no. 5, pp. 499–523, 1986.
[84] M. Hein and M. Maier, “Manifold denoising,” in Proc. Adv. Neural Inf. Process. Syst., 2007, pp. 561–568.
[85] X. Liu, G. Cheung, X. Wu, and D. Zhao, “Random walk graph Laplacian-based smoothness prior for soft decoding of JPEG images,” IEEE Trans. Image Process., vol. 26, no. 2, pp. 509–524, 2016.
[86] N. Parikh and S. Boyd, “Proximal algorithms,” Foundations and Trends in Optimization, vol. 1, no. 3, pp. 123–231, 2013.
[87] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Phys. D, vol. 60, no. 1-4, pp. 259–268, Nov. 1992. [Online]. Available: http://dx.doi.org/10.1016/0167-2789(92)90242-F
[88] C. Couprie, L. Grady, L. Najman, J.-C. Pesquet, and H. Talbot, “Dual constrained TV-based regularization on graphs,” SIAM J. Imaging Sciences, vol. 6, no. 3, pp. 1246–1273, 2013.
[89] X. Zhang, M. Burger, and S. Osher, “A unified primal-dual algorithm framework based on Bregman iteration,” J. Sci. Comput., vol. 46, no. 1, pp. 20–46, 2011.
[90] B. Wahlberg, S. Boyd, M. Annergren, and Y. Wang, “An ADMM algorithm for a class of total variation regularized estimation problems,” IFAC Proceedings Volumes, vol. 45, no. 16, pp. 83–88, 2012.
[91] S. Osher, Z. Shi, and W. Zhu, “Low dimensional manifold model for image processing,” SIAM J. Imaging Sci., vol. 10, no. 4, pp. 1669–1690, 2017.
[92] S. Helgason, Differential geometry, Lie groups, and symmetric spaces. Academic Press, 1979.
[93] Y. Wang, A. Ortega, D. Tian, and A. Vetro, “A graph-based joint bilateral approach for depth enhancement,” in Proc. IEEE Int. Conf. Acoustic Speech Signal Process., 2014, pp. 885–889.
[94] J. Zeng, J. Pang, W. Sun, and G. Cheung, “Deep graph Laplacian regularization for robust denoising of real images,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops, 2019, pp. 1759–1768.
[95] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[96] X. Qi, R. Liao, J. Jia, S. Fidler, and R. Urtasun, “3D graph neural networks for RGBD semantic segmentation,” in Proc. ICCV, 2017, pp. 5199–5208.
[97] Y. Zhang and M. Rabbat, “A graph-CNN for 3D point cloud classification,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Process., 2018, pp. 6279–6283.
[98] H. Su, S. Maji, E. Kalogerakis, and E. Learned-Miller, “Multi-view convolutional neural networks for 3D shape recognition,” in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 945–953.
[99] C. Ma, Y. Guo, J. Yang, and W. An, “Learning multi-view representation with LSTM for 3-D shape recognition and retrieval,” IEEE Trans. Multimedia, vol. 21, no. 5, pp. 1169–1182, 2018.
[100] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” in Proc. CVPR, 2017, pp. 652–660.
[101] C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “PointNet++: Deep hierarchical feature learning on point sets in a metric space,” in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 5099–5108.
[102] Y. Li, R. Bu, M. Sun, W. Wu, X. Di, and B. Chen, “PointCNN: Convolution on X-transformed points,” in Proc. Adv. Neural Inf. Process. Syst., 2018, pp. 820–830.
[103] D. Valsesia, G. Fracastoro, and E. Magli, “Deep graph-convolutional image denoising,” IEEE Trans. Image Process., vol. 29, pp. 8226–8237, 2020.
[104] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, “Spectral networks and locally connected networks on graphs,” in Proc. Int. Conf. Learn. Rep., 2014.
[105] M. Henaff, J. Bruna, and Y. LeCun, “Deep convolutional networks on graph-structured data,” arXiv preprint arXiv:1506.05163, 2015.
[106] M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 3844–3852.
[107] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” Proc. Int. Conf. Learn. Rep., April 2017.
[108] M. Simonovsky and N. Komodakis, “Dynamic edge-conditioned filters in convolutional neural networks on graphs,” in Proc. CVPR, 2017, pp. 3693–3702.
[109] F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda, and M. M. Bronstein, “Geometric deep learning on graphs and manifolds using mixture model CNNs,” in Proc. CVPR, 2017, pp. 5115–5124.
[110] Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, “Dynamic graph CNN for learning on point clouds,” ACM Trans. Graphics, vol. 38, no. 5, pp. 1–12, 2019.
[111] F. Wu, A. Souza, T. Zhang, C. Fifty, T. Yu, and K. Weinberger, “Simplifying graph convolutional networks,” in Proc. Int. Conf. Mach. Learn., 2019, pp. 6861–6871.
[112] G. Fu, Y. Hou, J. Zhang, K. Ma, B. F. Kamhoua, and J. Cheng, “Understanding graph neural networks from graph signal denoising perspectives,” arXiv preprint arXiv:2006.04386, 2020.
[113] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, “Graph attention networks,” Proc. Int. Conf. Learn. Rep., April 2018.
[114] G. Te, W. Hu, A. Zheng, and Z. Guo, “RGCNN: Regularized graph CNN for point cloud segmentation,” in Proc. ACM Int. Conf. Multimedia, 2018, pp. 746–754.
[115] X. Dong, D. Thanou, L. Toni, M. Bronstein, and P. Frossard, “Graph signal processing for machine learning: A review and new perspectives,” IEEE Signal Process. Mag., vol. 37, no. 6, pp. 117–127, 2020.
[116] S. Chen, C. Duan, Y. Yang, D. Li, C. Feng, and D. Tian, “Deep unsupervised learning of 3D point clouds via graph topology inference and filtering,” IEEE Trans. Image Process., vol. 29, pp. 3183–3198, 2019.
[117] X. Gao, W. Hu, and Z. Guo, “Exploring structure-adaptive graph learning for robust semi-supervised classification,” in Proc. IEEE Int. Conf. Multimedia Expo, 2020, pp. 1–6.
[118] J. Tang, X. Gao, and W. Hu, “RGLN: Robust residual graph learning networks via similarity-preserving mapping on graphs,” in Proc. IEEE Int. Conf. Acoustics Speech Signal Process., 2021.
[119] V. Gripon, A. Ortega, and B. Girault, “An inside look at deep neural networks using graph signal processing,” in Proc. IEEE Inf. Theory Appl. Workshop, 2018, pp. 1–9.
[120] I. Ziko, J. Dolz, E. Granger, and I. B. Ayed, “Laplacian regularized few-shot learning,” in Proc. Int. Conf. Mach. Learn., 2020, pp. 11 660–11 670.
[121] J. Zeng, J. Pang, W. Sun, and G. Cheung, “Deep graph Laplacian regularization for robust denoising of real images,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops, 2019, pp. 1759–1768.
[122] T. Maugey, A. Ortega, and P. Frossard, “Graph-based representation for multiview image geometry,” IEEE Trans. Image Process., vol. 24, no. 5, pp. 1573–1586, 2015.
Wei Hu (Senior Member, IEEE) received the B.S. degree in Electrical Engineering from the University of Science and Technology of China in 2010, and the Ph.D. degree in Electronic and Computer Engineering from the Hong Kong University of Science and Technology in 2015. She was a Researcher with Technicolor, Rennes, France, from 2015 to 2017. She is currently an Assistant Professor with Wangxuan Institute of Computer Technology, Peking University. Her research interests are graph signal processing, graph-based machine learning and 3D visual computing. She has authored over 50 international journal and conference publications, with several paper awards including Best Student Paper Runner Up Award in ICME 2020 and Best Paper Candidate in CVPR 2021. She was awarded the 2021 IEEE Multimedia Rising Star Award—Honorable Mention. She serves as an Associate Editor for Signal Processing Magazine, IEEE Transactions on Signal and Information Processing over Networks, etc.

Jiahao Pang (Member, IEEE) received the B.Eng. degree from South China University of Technology, Guangzhou, China, in 2010, and the M.Sc. and Ph.D. degrees from the Hong Kong University of Science and Technology, Hong Kong, China, in 2011 and 2016, respectively. He was a Senior Researcher with SenseTime Group Limited, Hong Kong, China, from 2016 to 2019. He is currently a Staff Engineer with InterDigital, Princeton, NJ, USA. His research interests include 3D computer vision, image processing, graph signal processing and deep learning. He has over 30 international journal and conference publications.

Xianming Liu (Member, IEEE) is a Professor with the School of Computer Science and Technology, Harbin Institute of Technology (HIT), Harbin, China. He received the B.S., M.S., and Ph.D. degrees in computer science from HIT, in 2006, 2008 and 2012, respectively. In 2011, he spent half a year at the Dept of Electrical & Computer Engineering, McMaster University, Canada, as a visiting student, where he then worked as a post-doctoral fellow (2012-2013). He worked as a project researcher at National Institute of Informatics (NII), Tokyo, Japan (2014-2017). He has published over 80 international conference and journal publications, including top IEEE journals. He is the recipient of IEEE ICME 2016 Best Student Paper Award. His research interests include multimedia signal processing and computational imaging.

Dong Tian (Senior Member, IEEE) received the B.S. and M.Sc degrees from University of Science and Technology of China, Hefei, China, in 1995 and 1998, and the Ph.D. degree from Beijing University of Technology, Beijing, in 2001. He is currently a Sr. Principal Engineer with InterDigital, Princeton, NJ, USA, after serving as a Sr. Principal Research Scientist with MERL, Cambridge, MA, USA from 2010-2018, a Sr. Researcher with Thomson Corporate Research, Princeton, NJ, USA, from 2006-2010, and a Researcher with Tampere University of Technology from 2002-2005. His research interests include image processing, point cloud processing, graph signal processing, and deep learning. He has been actively contributing to both standards and academic communities. Dr. Tian serves as an AE of TIP (2018-), General Co-Chair of MMSP'20, TPC chair of MMSP'19, etc. He is also a TC member of IEEE MMSP, and IDSP.

Chia-Wen Lin (Fellow, IEEE) received his Ph.D. degree from National Tsing Hua University (NTHU), Taiwan, in 2000. Dr. Lin is currently Professor with the Department of Electrical Engineering and the Institute of Communications Engineering, NTHU, and R&D Director of the Electronic and Optoelectronic System Research Laboratories, Industrial Technology Research Institute. His research interests include image and video processing, computer vision, and video networking. He served as Fellow evaluating Committee member (2021) and Distinguished Lecturer (2018–2019) of IEEE Circuits and Systems Society. He has served on the editorial boards of TMM, TIP, TCSVT, IEEE Multimedia, and Elsevier JVCI. He is Chair of ICME Steering Committee and was Steering Committee member of IEEE TMM from 2013 to 2015. He served as TPC Co-Chair of ICIP 2019 and ICME 2010, and General Co-Chair of VCIP 2018. He received best paper awards from VCIP 2010 and 2015.

Anthony Vetro (Fellow, IEEE) received the B.S., M.S., and Ph.D. degrees in Electrical Engineering from Polytechnic University, Brooklyn, NY. Dr. Vetro is currently VP & Director at Mitsubishi Electric Research Labs, in Cambridge, MA. He is responsible for AI related research in the areas of computer vision, speech/audio processing, and data analytics. In his 20+ years with the company, he has contributed to the development and transfer of several technologies to Mitsubishi products. He has published more than 200 papers and has been a member of the MPEG and ITU-T video coding standardization committees for a number of years, serving in numerous leadership roles. He is also active in various IEEE conferences, technical committees, and boards, most recently serving on the Conference Board of IEEE SPS, as a Senior AE for the Open Journal on Signal Processing, and as General Co-Chair of ICIP 2017.