
This article has been accepted for publication in a future issue of IEEE Access, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2018.2886471, IEEE Access.
A Review of Sparse Recovery Algorithms


E. C. MARQUES1, N. MACIEL1, L. A. B. NAVINER1, (Senior Member, IEEE), H. CAI1,2, (Member, IEEE), and J. YANG2, (Member, IEEE)
1 Télécom ParisTech, Université Paris-Saclay, 46 Rue Barrault, Paris, France (e-mail: {ecrespo,nmaciel,lirida.naviner}@telecom-paristech.fr)
2 National ASIC System Engineering Center, Southeast University, Nanjing, 210096, China (e-mail: {hao.cai,dragon}@seu.edu.cn)
Corresponding author: E. C. Marques (e-mail: [email protected]).

ABSTRACT Nowadays, a large amount of information has to be transmitted or processed. This implies
high-power processing, large memory density, and increased energy consumption. In several applications,
such as imaging, radar, speech recognition, and data acquisition, the signals involved can be considered
sparse or compressible in some domain. The compressive sensing theory could be a proper candidate to deal
with these constraints. It can be used to recover sparse or compressible signals with fewer measurements
than traditional methods. Two problems must be addressed by compressive sensing theory: design of the
measurement matrix and development of an efficient sparse recovery algorithm. These algorithms are
usually classified into three categories: convex relaxation, non-convex optimization techniques, and greedy
algorithms. This paper intends to supply a comprehensive study and a state-of-the-art review of these
algorithms to researchers who wish to develop and use them. Moreover, a wide range of compressive sensing
theory applications is summarized and some open research challenges are presented.

INDEX TERMS
Bayesian compressive sensing, compressive sensing, convex relaxation, greedy algorithms, sparse recovery
algorithms, sparse signals.

I. INTRODUCTION

WITH the increasing amount of information available in the age of big data, the complexity and the cost of processing high-dimensional data systems become very critical [1]. Therefore, developing methods to acquire the prior information of the signals is very important and useful [2].

The Shannon-Nyquist sampling theorem is traditionally used to reconstruct images or signals from measured data. According to this theorem, a signal can be perfectly reconstructed from its samples if it is sampled at the Nyquist rate, that is, a rate equal to twice the bandwidth of the signal [3]. In several applications, especially those involving ultra-wideband communications and digital imaging, the Nyquist rate can be very high, resulting in too many samples, which makes it difficult to store or transmit them [4]–[6]. Furthermore, it can become unfeasible to implement these scenarios.

Most real signals can be considered sparse or compressible in some domain. Such signals have many coefficients that are equal to or close to zero. For example, many communication signals are compressible in a Fourier basis, while discrete cosine and wavelet bases tend to be suitable for compressing natural images [7]. Moreover, a sparse representation of a signal allows more efficient signal processing [8].

Compressive Sensing (CS) theory can be very useful when signals are sparse or compressible. This theory was developed by Candès et al. [9] and Donoho [10]. CS combines the acquisition and compression processes, exploiting the signals' sparsity. It allows reducing processing time and energy consumption as well as improving storage capacities [5]. While traditional methods use the Nyquist rate, that is, a rate that depends on the highest frequency component of the signal, CS relies on a sampling rate related to the signal's sparsity.

Researchers have invested effort in developing efficient algorithms for sparse signal estimation. In fact, there are some survey papers in the literature that address greedy algorithms for sparse signal recovery [11]–[15], measurement matrices used in compressive sensing [16], [17], and compressive sensing applications [18]–[24]. Various works focus on one specific category of sparse recovery algorithm or one specific application of CS theory. Moreover, the recent work [25] reviews the basic theoretical concepts related to CS and the CS acquisition strategies. However, although [25] presents some reconstruction algorithms, it is not focused on them.

Due to the significant amount of literature available, this work aims to review some concepts and applications of compressive sensing, and to provide a survey on the most important sparse recovery algorithms from each category. This work focuses on the single measurement vector problem.


Other works focused on Multiple Measurement Vectors (MMV) can be found in [26]–[29]. Some open research challenges related to sparse signal estimation are also discussed.
The main contributions of this paper are:
• A review of the most important concepts related to compressive sensing and sparse recovery algorithms;
• A list of some applications of CS;
• Comparison of some sparse recovery algorithms;
• Open research challenges.
To facilitate the reading of this paper, Table 1 provides the definitions of acronyms and notations. Moreover, vectors are denoted by bolded lowercase letters, while bolded capital letters represent matrices.
The paper is structured as follows. Section II introduces the key concepts of the CS theory. Its applications are illustrated in Section III. Section IV presents an overview of sparse recovery algorithms. Discussion of these algorithms and some open research challenges are presented in Section V. Finally, conclusions are presented in Section VI.

II. COMPRESSIVE SENSING (CS)
The main idea of compressive sensing is to recover signals from fewer measurements than the Nyquist rate [9], [10]. The underlying assumption is that signals are sparse or compressible under some transform (e.g., Fourier, wavelets). An s-sparse signal has only s non-zero coefficients. Otherwise, the coefficients z(i) of a compressible signal decrease in magnitude according to:

|z(I(k))| ≤ C k^(-1/r), k = 1, ..., n (1)

where I(k) is the k-th largest component of z sorted by magnitude from largest to smallest. Due to their rapid decay, such signals can be well approximated by s-sparse signals, keeping just the s largest coefficients of z.
Fig. 1 shows a 200-sample time-domain signal (Fig. 1(a)) representing 8 distinct sinusoids (Fig. 1(b)). This figure is an example of an 8-sparse signal in the frequency domain, that is, it can be seen in Fig. 1(b) that only 8 non-zero values exist among the 200 frequencies.

FIGURE 1. Samples of 8 sinusoids in (a) time and (b) frequency domains.

CS allows two main advantages: reducing the energy for transmission and storage through the projection of the information into a lower dimensional space; and reducing the power consumption by matching the sampling rate to the signal's information content rather than to its bandwidth [10].
CS includes three main steps: sparse representation, CS acquisition (measurement), and CS reconstruction (sparse recovery), as illustrated in Fig. 2 [30], [31].

FIGURE 2. Compressive sensing main processes: z -> Sparse Representation (Ψ) -> h -> CS Acquisition (A) -> y -> CS Reconstruction -> ĥ.

In a sparse representation, the signal is represented as a projection on a suitable basis, i.e., a linear combination of only s basis vectors, with s ≪ N. It means that a signal z given as an N × 1 column vector in its original representation can be represented with a basis of N × 1 vectors {ψi}, i = 1, ..., N. Let Ψ be the N × N basis matrix; the signal can be represented in its sparse form h by:

z = Ψh (2)

Next, in the second step (measurement - CS Acquisition), the signal z is measured by sampling it according to a matrix Φ ∈ C^(M×N), where φi denotes the i-th column of the matrix Φ. The system model is defined by:

y = Φz + n = ΦΨh + n = Ah + n (3)

where y = [y1, y2, ..., yM]^T denotes the received signal, h = [h1, h2, ..., hN]^T is the sparse signal vector with N > M, and n is the noise.
Recovery is possible if the following two fundamental premises underlying CS are satisfied [10]:
• Sparsity - means that the signal can be characterized by few significant terms in some domain.
• Incoherence - states that distances between sparse signals are approximately conserved as distances between their respective measurements generated by the sampling process.
The largest correlation between any two elements of Ψ and Φ is measured by the coherence between these matrices, defined by:

µ(Φ, Ψ) = √N max_{1≤k,j≤N} |⟨φk, ψj⟩| (4)

If Φ and Ψ contain correlated elements, the coherence is large; otherwise, the coherence is small. Compressive sensing is mainly concerned with low coherence pairs. In [32], considering C as a constant, the authors showed that if (5) holds, then with overwhelming probability a sparse recovery algorithm will recover the signal.

M ≥ C µ²(Φ, Ψ) s log N (5)

Equation (5) shows that fewer measurements will be required to recover the signal if the coherence between Ψ and Φ is small [2].
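To make the acquisition model in (3) concrete, the short NumPy sketch below builds a synthetic s-sparse vector h, a Gaussian measurement matrix A, and noisy measurements y = Ah + n. It is only an illustrative simulation; the dimensions (N = 200, M = 60, s = 8), the seed, and the noise level are assumptions chosen for the example and do not come from the paper.

import numpy as np

# Illustrative sketch of the CS acquisition model y = A h + n (eq. (3)/(6)).
# Dimensions and noise level are assumptions chosen for this example only.
rng = np.random.default_rng(0)
N, M, s = 200, 60, 8            # signal length, measurements, sparsity

# s-sparse signal: s non-zero coefficients at random positions
h = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
h[support] = rng.standard_normal(s)

# Random Gaussian measurement matrix (scaled columns)
A = rng.standard_normal((M, N)) / np.sqrt(M)

# Noisy measurements
sigma = 0.01
y = A @ h + sigma * rng.standard_normal(M)

print("measurements:", y.shape, "non-zeros in h:", np.count_nonzero(h))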

TABLE 1. Definitions of Acronyms and Notations.

Acronyms:
ADC: Analog to Digital Conversion
AIC: Analog to Information Conversion
AMP: Approximate Message Passing
AST: Affine Scaling Transformation
BAOMP: Back-tracking based Adaptive Orthogonal Matching Pursuit
BCS: Bayesian Compressive Sensing
BOMP: Block Orthogonal Matching Pursuit
BP: Basis Pursuit
BPDN: BP de-noising
CoSaMP: Compressive Sampling Matching Pursuit
CP: Chaining Pursuit
CR: Cognitive Radio
CS: Compressive Sensing
DOA: Direction-of-Arrival
D-OMP: Differential Orthogonal Matching Pursuit
DS: Dantzig Selector
EM: Expectation-Maximization
FBP: Forward-Backward Pursuit
FISTA: Fast Iterative Shrinkage Thresholding Algorithm
FOCUSS: Focal Underdetermined System Solution
FPGA: Field Programmable Gate Array
GBP: Greedy Basis Pursuit
GOAMP: Generalized Orthogonal Adaptive Matching Pursuit
GOMP: Generalized Orthogonal Matching Pursuit
GP: Gradient Pursuit
HDTV: High-Definition Television
HHS: Heavy Hitters on Steroids
IBTMC: Inter-Burst Translational Motion Compensation
IHT: Iterative Hard Thresholding
IoT: Internet of Things
IRLS: Iterative Reweighted Least Squares
ISAR: Inverse Synthetic Aperture Radar
ISDB-T: Integrated Services Digital Broadcasting-Terrestrial
IST: Iterative Soft Thresholding
LARS: Least Angle Regression
LASSO: Least Absolute Shrinkage and Selection Operator
LP: Linear Programming
LS: Least Square
MAP: Maximum a posteriori
ML: Maximum Likelihood
MMP: Multipath Matching Pursuit
MMV: Multiple Measurement Vectors
MP: Matching Pursuit
MPLS: Matching Pursuit based on Least Squares
MRI: Magnetic Resonance Imaging
NMSE: Normalized Mean Squared Error
OMP: Orthogonal Matching Pursuit
RIC: Restricted Isometry Constant
RIP: Restricted Isometry Property
ROMP: Regularized OMP
RVM: Relevance Vector Machine
SBL: Sparse Bayesian Learning
SGP: Stochastic Gradient Pursuit
SLSMP: Sequential Least Squares Matching Pursuit
SP: Subspace Pursuit
SpAdOMP: Sparse Adaptive Orthogonal Matching Pursuit
SpaRSA: Sparse Reconstruction by Separable Approximation
StOMP: Stagewise Orthogonal Matching Pursuit
TOMP: Tree-based Orthogonal Matching Pursuit
TSMP: Tree Search Matching Pursuit
VAMP: Vector Approximate Message Passing
WSN: Wireless Sensors Network

Notations:
(·)^H: Hermitian transpose matrix
(·)†: pseudo-inverse matrix
(·)^(-1): inverse matrix
µ(·): coherence
Sτ(·): soft thresholding function defined by (32)
Hs(·): sets all components to zero except the s largest magnitude components
λ: the absolute correlation of the active columns
ε: error tolerance
s: signal's sparsity
J: set of selected indices
N: length of the signal h
M: length of the signal y
T: maximum number of iterations
Λi: set of the indices chosen until iteration i
A(Λi): submatrix of A containing only those columns of A with indices in Λi
C: constant
δs: the smallest number that achieves RIP
Nit: number of iterations
||·||p: lp norm, ||x||p = (Σ_{i=1}^{n} |xi|^p)^(1/p)
tS: threshold value
γi: step size at iteration i
h: N × 1 sparse signal to be estimated
ĥ: estimation of h
Ψ: N × N basis matrix
A: M × N measurement matrix
y: M × 1 received signal
b: residual vector
c: projection vector
d: direction vector
⟨·,·⟩: inner product
As illustrated in Fig. 2, the last step (sparse recovery - CS Reconstruction) recovers the sparse signal from a small set of measurements y through a specific sparse recovery algorithm [31]. This step concerns the development of efficient sparse recovery algorithms. Some of them are addressed in Section IV.
Fig. 3 illustrates the relationship between the variables in a noiseless scenario. This work considers that the signal to be estimated is already in its sparse representation in a noisy scenario. The system is defined by:

y = Ah + n (6)

FIGURE 3. Representation of measurements used in Compressive Sensing: the M × 1 vector y equals the M × N matrix A times the N × 1 vector h, which has only s < M non-zero values.

One of the challenges associated with sparse signal estimation is to identify the locations of the non-zero signal components. In other words, this is finding the subspace generated by no more than s columns of the matrix A, related to the received signal y. After finding these positions, the non-zero coefficients can be calculated by applying the pseudo-inversion process.
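As a toy illustration of this two-stage view (find the support, then invert on it), the following self-contained NumPy sketch assumes the true support is known and estimates the non-zero coefficients with the pseudo-inverse of the corresponding columns of A. All sizes are assumptions made for the example.

import numpy as np

# Toy illustration: if the support (non-zero positions) of h were known,
# the coefficients would follow from a least-squares fit on those columns of A.
rng = np.random.default_rng(1)
N, M, s = 128, 40, 5
h = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
h[support] = rng.standard_normal(s)
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ h                                  # noiseless measurements (eq. (6) with n = 0)

# Restrict A to the support and apply the pseudo-inverse (LS on the subspace)
A_sub = A[:, support]
h_hat = np.zeros(N)
h_hat[support] = np.linalg.pinv(A_sub) @ y

print("reconstruction error:", np.linalg.norm(h_hat - h))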

TABLE 2. Comparison between random and deterministic sensing [2].
Random Sensing | Deterministic Sensing
Outside the mainstream of signal processing: worst case signal processing | Aligned with the mainstream of signal processing: average case signal processing
Less efficient recovery time | More efficient recovery time
No explicit constructions | Explicit constructions
Larger storage | Efficient storage
Looser recovery bounds | Tighter recovery bounds

CS theory addresses two main problems:
• Design of the measurement matrix A.
• Development of a sparse recovery algorithm for the efficient estimation of h, given only y and A.
In the first problem, the goal is to design a measurement matrix A which assures that the main information of any s-sparse or compressible signal is captured in the measurements [2]. The ideal goal is to design an appropriate measurement matrix with M ≈ s.
The measurement matrix is very important in the process of recovering the sparse signal. According to [10], if the Restricted Isometry Property (RIP) defined in (7) is satisfied, it is possible, using some recovery algorithm, to obtain an accurate estimation of the sparse signal h, for example by solving an lp-norm problem [33]. δs ∈ (0, 1) is the RIC (Restricted Isometry Constant) value and corresponds to the smallest number that achieves (7).

(1 − δs)||h||2² ≤ ||Ah||2² ≤ (1 + δs)||h||2² (7)

Table 2 reproduces a comparison between deterministic sensing and random sensing for the measurement matrix A presented in [2]. Random matrices are one approach to obtain a measurement matrix A that obeys the RIP condition. Many works deal with random measurement matrices generated by independent and identically distributed (i.i.d.) entries such as Bernoulli, Gaussian, and random Fourier ensembles [34]–[37]. However, these matrices require significant space for storage and they have excessive complexity in reconstruction [2]. Furthermore, it is difficult to verify whether these matrices satisfy the RIP property with a small RIC value [2]. Therefore, deterministic matrices have been studied for use as measurement matrices. In [38] and [39], the authors propose deterministic measurement matrices based on coherence and based on RIP, respectively. Moreover, deterministic measurement matrices are constructed via algebraic curves over finite fields in [40]. Furthermore, a survey on deterministic measurement matrices for CS can be found in [16].
Once the appropriate measurement matrix A is defined, h can be estimated by the least squares (LS) solution of (6), i.e., by solving the problem (8), where ε is a predefined error tolerance.

min ||ĥ||2 subject to ||y − Aĥ||2² < ε (8)

This system is "underdetermined" (the matrix A has more columns than rows). Let A† be the pseudo-inverse matrix of A and assume AA^H has an inverse matrix; according to the LS algorithm, the unique solution ĥ of this optimization problem is given by (9) [41].

ĥLS = A†y = A^H(AA^H)^(-1)y (9)

It is worth noting that the least squares minimization problem cannot return a sparse vector, so alternatives have been sought. By focusing on the sparsity constraint on the solution and solving the l0 norm minimization described by (10), it is possible to obtain a sparse approximation ĥ.

min ||ĥ||0 subject to ||y − Aĥ||2² < ε (10)

Lemma 1.2 of [42] shows that if the matrix A obeys the RIP condition with constant δ2s < 1, (10) has a unique solution and h can be reconstructed exactly from y and A. Unfortunately, an exhaustive search over all (N choose s) possible sparse combinations is required in the l0 minimization problem, which is computationally intractable for some practical applications. Thus, although this gives the desired solution, in practice it is not feasible to solve this equation. The excessive complexity of such a formulation can be avoided with the minimization of the l1 problem (11), which can efficiently compute (10) under certain conditions, as demonstrated in [43].

min ||ĥ||1 subject to ||y − Aĥ||2² < ε (11)

One of the advantages of the l1 norm minimization approach is that it can be solved efficiently by linear programming techniques [44]. Moreover, in [45], the authors state that sparse signals can be recovered through l1 minimization if M ≈ 2s log(N).
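As a hedged illustration of the l1 approach, the sketch below recasts the noiseless Basis-Pursuit-style problem min ||h||1 subject to Ah = y as a linear program and solves it with SciPy's linprog. The use of SciPy, the problem sizes, and the seed are assumptions for the example; they are not part of the paper.

import numpy as np
from scipy.optimize import linprog

# Minimal Basis-Pursuit-style sketch: min ||h||_1 s.t. A h = y,
# written as an LP over [h; u] with the constraint -u <= h <= u.
rng = np.random.default_rng(2)
N, M, s = 60, 30, 3
h_true = np.zeros(N)
h_true[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ h_true

c = np.concatenate([np.zeros(N), np.ones(N)])            # minimize sum(u)
A_eq = np.hstack([A, np.zeros((M, N))])                   # A h = y
A_ub = np.block([[np.eye(N), -np.eye(N)],                 #  h - u <= 0
                 [-np.eye(N), -np.eye(N)]])               # -h - u <= 0
b_ub = np.zeros(2 * N)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * N + [(0, None)] * N)
h_hat = res.x[:N]
print("recovery error:", np.linalg.norm(h_hat - h_true))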

III. APPLICATION OF COMPRESSIVE SENSING
This section overviews some application areas for the CS theory and its sparse recovery algorithms.

A. IMAGE AND VIDEO
1) Compressive Imaging
Natural images can be sparsely represented in wavelet domains, so the required number of measurements in compressive imaging can be reduced using CS [46], [47]. One example of application is the single-pixel camera, which allows reconstructing an image with a sub-Nyquist image acquisition, that is, from fewer measurements than the number of reconstructed pixels [48].

2) Medical Imaging
CS can be very useful for medical imaging. For example, magnetic resonance imaging (MRI) is a time-consuming and costly process. CS allows decreasing the number of samples and therefore reducing the acquisition time [49]. Similarly, bio-signals such as ECG signals are sparse in either the wavelet or the Fourier domain [50]. CS takes advantage of this sparsity and reduces the required number of collected measurements [49]–[52]. A hardware implementation on a system on chip (SoC) platform of a solution to tackle big data transmission and privacy issues is presented in [53].

3) Video Coding
Due to the development and growth of video surveillance, mobile video, and wireless camera sensor networks, wireless video broadcasting is becoming more popular and finding several real-time applications [54], [55]. In these cases, a single video stream is simultaneously transmitted to several receivers with different channel conditions [55]. In order to do this, many new video codecs have been proposed using compressive sensing [55]–[58].

4) Compressive Radar
Radar imaging systems aim to determine the direction, altitude, and speed of fixed and moving objects [15]. By solving an inverse problem using the compressive sensing theory, the received radar signal can be recovered from fewer measurements [15]. Therefore, the cost and the complexity of the receiver hardware are greatly reduced [15], [59]. Moreover, CS has provided a novel way to deal with the Inter-Burst Translational Motion Compensation (IBTMC) to achieve the exact recovery of Inverse Synthetic Aperture Radar (ISAR) images from limited measurements [60].

B. COMPRESSIVE TRANSMISSION DATA
1) Wireless Sensors Networks (WSNs)
Wireless sensor networks (WSNs) incur high communication costs and energy consumption. Due to critical resource constraints such as limited power supply, communication bandwidth, memory, and processing performance, CS can be used to reduce the number of bits to be transmitted or to represent the sensed data in WSNs [61]–[64].

2) Internet of Things (IoT)
The use of internet of things (IoT) devices has increased and it is estimated that it will continue to do so in the following years. This includes home automation/control devices, security cameras, mobile phones, and sensing devices [65]. However, IoT devices have computation, energy, and congestion constraints. Even if they need to transmit large amounts of data, they usually have limited power and low-computation capabilities. Moreover, given the large number of devices connected, they can suffer from congestion and packet drops [65]. Thus, special data transmission strategies have to be developed to enable low-power and low-cost signal processing operations, and energy-efficient communications [65]. Multimedia data usually possesses sparse structure. Therefore, the CS theory emerges as a good strategy to reduce the amount of data that IoT devices need to transmit while recovering the data with high fidelity [66].

3) Astrophysical signals
Radio receivers located in outer space suffer from strong restrictions on storage capacity, energy consumption, and transmission rate. To overcome these challenges, sampling architectures using CS provide a data acquisition technique with fewer measurements. Thus, the amount of collected data to be downloaded to Earth and the energy consumption are reduced. The simple coding process with low computational cost provided by CS promotes its use in real-time applications often found onboard spacecraft. Moreover, the reconstruction of the signals will be done on Earth, where there are much more computing and energy resources than onboard a satellite [67], [68].

4) Machine Learning
Machine learning algorithms perform pattern recognition (e.g., classification) on data that is too complex to model analytically, in order to solve high-dimensional problems. However, the amount of information generated by acquisition devices is huge and ever-growing. It can reach gigabytes of data or more, which exceeds the processing capacity of the most sophisticated machine learning algorithms [69]. To reduce the energy consumption of the applications, as in low-power wireless neural recording tasks, signals must be compressed before transmission to extend battery life. In these cases, CS can be used, and its potential has been demonstrated in neural recording applications [69]–[71].

C. COMMUNICATION SYSTEMS
1) Cognitive Radios (CRs)
Cognitive radios aim to provide a solution to the inefficient usage of the frequency spectrum. Spectrum sensing techniques suffer from computational complexity, hardware cost, and high processing time [31]. Since usually only some of the available channels are occupied by the users, the signal of interest is normally sparse in the frequency domain. Hence, CS can be used to sense a wider spectrum with reduced sampling requirements, resulting in more power-efficient systems [4], [18], [72], [73].

2) Sparse Channel Estimation
Channels of several communication systems, e.g., underwater communication systems [74], [75], wideband HF [76], [77], high-definition television (HDTV) [78], [79], ultra-wideband communications [6], [80], [81], and mmWave systems [82], [83], can be considered or well modelled as sparse channels. That is, the impulse response of these channels is mainly characterized by a few significant components widely separated in some domain. In these cases, better results can be achieved using the compressive sensing theory to estimate these sparse channels [84]–[86]. In [87], a low-complexity CS hardware implementation for channel estimation in the integrated services digital broadcasting-terrestrial (ISDB-T) system is proposed on an FPGA.

3) Analog to Information Conversion (AIC)
Analog to digital conversion (ADC) is based on the Nyquist sampling theorem in order to allow a perfect reconstruction of the information. That is, the signal is uniformly sampled at a rate at least twice its bandwidth. In several applications, the information content of the signal is much smaller than its bandwidth. In these cases, this represents a waste of hardware and software resources to sample the whole signal. To deal with this, an analog to information conversion (AIC) can use the CS theory to acquire a large bandwidth with relaxed sampling rate requirements, enabling faster, less expensive, and more energy-efficient solutions [88]–[92]. Examples of AIC are: the random demodulator [91], [93], [94], the modulated wideband converter [95], and non-uniform sampling [90], [96]–[98]. All these architectures have advantages and limitations. While the random demodulator AIC employs finite temporal sampling functions with infinite spectral support, the modulated wideband converter AIC has finite spectral sampling functions with infinite temporal support. Moreover, the modulated wideband converter AIC requires a large number of branches, so synchronization among the branches is also needed, thus consuming more area and power. On the other hand, the non-uniform sampling AIC is sensitive to timing jitter, i.e., a sampling time with a small error can lead to a big error in the sample value for input signals that change rapidly.

D. DETECTION AND RECOGNITION SYSTEMS
1) Speech Recognition
A dictionary of example speech tokens can be used to sparsely represent speech signals [99]. Moreover, the speech signal can have a sparse representation for a suitable selection of sparse basis functions, but for the noise it will be difficult to derive a sparse representation. So, it is possible to exploit this characteristic and, through the CS theory, achieve better speech recognition performance [20], [99], [100].

2) Seismology
The compressive sensing theory has an important use in data acquisition, that is, in situations when it is intricate to obtain a lot of samples, for example in the case of seismic data [101]. The layers of the Earth can be estimated by measuring the reflections of a signal from the different layers of the Earth. However, this requires a large data collection that is a time-consuming and expensive process. To deal with this, several works have proposed CS for different seismic applications [101]–[103].

3) Direction-of-Arrival (DOA)
Direction-of-Arrival (DOA) estimation is the process of determining which direction a signal impinging on an array has arrived from [104], [105]. Because there are only a few non-zeros in the spatial spectrum of array signals, which represent their corresponding spatial locations, this sparsity can be applied to the DOA estimation [106]. Hence, the compressive sensing theory can be applied to the problem of DOA estimation by splitting the angular region into N potential DOAs, where only s ≪ N of the DOAs have an impinging signal (alternatively, N − s of the angular directions have a zero-valued impinging signal present) [107], [108]. These DOAs are then estimated by finding the minimum number of DOAs with a non-zero valued impinging signal that still gives an acceptable estimate of the array output [23], [104].

IV. SPARSE RECOVERY ALGORITHMS
Several sparse recovery algorithms have been proposed in the last years. They have to recover a sparse signal from an undersampled set of measurements. They are usually classified into three main categories: convex relaxations, non-convex optimization techniques, and greedy algorithms [109]. Fig. 4 shows the algorithms that will be addressed in more detail in this work. For the following algorithms, the system model is defined by (6) and the notation is presented in Table 1.

FIGURE 4. Classification of sparse recovery algorithms: Convex Relaxation (BP, LASSO, LARS, DS, AMP, GraDeS, IST), Non-convex Optimization (BCS, FOCUSS, IRLS), and Greedy Algorithms (MP, MPLS, OMP, SP, StOMP, CoSaMP, ROMP, GOMP, GOAMP, GP, MMP, IHT).

Section IV-A presents some algorithms from the first category. These algorithms result in convex optimization problems for which efficient solutions exist, relying on advanced techniques such as projected gradient methods, interior-point methods, or iterative thresholding [84].
On the other hand, non-convex optimization approaches can recover the signal by taking into account previous knowledge of its distribution (see Section IV-B) [31]. Thanks to a posterior probability density function, these solutions offer complete statistics of the estimate. Nonetheless, they can be unsuitable for high-dimensional problems due to their intensive computational requirements [110].
The third category is composed of the greedy algorithms. They recover the signal in an iterative way, making a locally optimal selection at each iteration in the hope of finding the globally optimal solution at the end of the algorithm (see Section IV-C).

A. CONVEX RELAXATIONS
1) Basis Pursuit (BP)
Basis Pursuit (BP) is a signal processing technique that decomposes the signal into a superposition of basic elements. This decomposition is optimal in the sense that it leads to the smallest l1 norm of coefficients among all such decompositions [111]. The BP algorithm seeks to determine a signal's representation that solves the problem:

min ||h||1 subject to y = Ah (12)

BP is a principle of global optimization without any specified algorithm. One possible algorithm to be used is the BP-simplex [111], which is inspired by the simplex method of linear programming [112]. For the BP-simplex, first, an initial basis A(Λ) is found by selecting M linearly independent columns of A. Then, at each step, the swap which best improves the objective function is chosen to update the current basis, that is, one term in the basis is swapped for one term that is not in the basis [111].
In [113], the authors propose an algorithm for BP called Greedy Basis Pursuit (GBP). Unlike standard linear programming methods for BP, the GBP algorithm proceeds more like the MP algorithm, that is, it builds up the representation by iteratively selecting columns based on computational geometry [113]. Moreover, the GBP allows discarding columns that have already been selected [113].

2) BP de-noising (BPDN) / Least Absolute Shrinkage and Selection Operator (LASSO)
The Basis Pursuit Denoising (BPDN) [111] / Least Absolute Shrinkage and Selection Operator (LASSO) [114] algorithm considers the presence of the noise n:

min ||h||1 subject to y = Ah + n (13)

and aims to solve the optimization problem defined by:

min ( (1/2)||y − Ah||2² + λp||h||1 ) (14)

where λp > 0 is a scalar parameter [111], [114]. Its value greatly influences the performance of the LASSO algorithm and therefore should be chosen carefully. In [111], the authors suggest:

λp = σ √(2 log(p)) (15)

where σ > 0 is the noise level and p is the cardinality of the dictionary [111].
Comparing with the LS cost function, it is possible to see that (14) basically adds an l1 norm penalty term. Hence, under certain conditions, the solution would achieve the minimal LS error [115]. Since ||h||1 is not differentiable at any zero position of h, it is not possible to obtain an analytical solution for the global minimum of (14).
There are several iterative techniques to find the minimum of (14) [111], [114]. One of these is called "Shooting" [116] and starts from the solution:

ĥ = (A^H A + I)^(-1) A^H y (16)

where I is the identity matrix. Let aj be the j-th column of the matrix A and let Bj be defined by (18); each j-th element of ĥ is updated by:

ĥj = (λ − Bj)/(aj^T aj), if Bj > λ
ĥj = (−λ − Bj)/(aj^T aj), if Bj < −λ
ĥj = 0, if |Bj| ≤ λ (17)

Bj = −aj^T y + aj^T Σ_{l≠j} al ĥl (18)

The original Shooting method is applied to real variables. For complex variables, an adaptation is necessary. In [117], two schemes are presented to adapt the LASSO algorithm to estimate a complex signal h:
• r-LASSO: Let imag(.) and real(.) be the imaginary and real parts of a complex vector, respectively. It is defined by [117]:

yR = [real(y); imag(y)], hR = [real(h); imag(h)], AR = [real(A), −imag(A); imag(A), real(A)] (19)

These definitions are used in the Shooting method in (16), and each j-th element of ĥ is calculated by [117]:

ĥj = ĥR_j + √(−1) ĥR_{j+N} (20)

• c-LASSO: The complex l1-norm can be handled by some methods [117], [118]. It is defined by:

||h||1 = Σ_i |hi| = Σ_i √(real(hi)² + imag(hi)²) (21)

In many applications, the imaginary and real components tend to be either zero or non-zero simultaneously [117]. However, the r-LASSO does not take into account the information about any potential grouping of the real and imaginary parts [117]. On the other hand, the c-LASSO considers this extra information [117]. A comparison between r-LASSO and c-LASSO performed in [117] concludes that the c-LASSO outperforms the r-LASSO since it exploits the connection between the imaginary and the real parts.
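The Shooting update (16)-(18) is essentially a coordinate descent with soft thresholding. The NumPy sketch below is a minimal real-valued version of that idea, assuming a fixed λ and a fixed number of full sweeps; it is meant only to illustrate the per-coordinate rule, not to reproduce the exact scheme of [116]. Sizes, seed, and λ are assumptions for the example.

import numpy as np

# Minimal coordinate-descent LASSO sketch (Shooting-style soft thresholding).
rng = np.random.default_rng(3)
N, M, s, lam = 80, 40, 4, 0.05
h_true = np.zeros(N)
h_true[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ h_true + 0.01 * rng.standard_normal(M)

h = np.zeros(N)
col_sq = np.sum(A * A, axis=0)           # a_j^T a_j for each column
for sweep in range(100):                 # fixed number of full sweeps
    for j in range(N):
        # residual excluding column j's current contribution
        r_j = y - A @ h + A[:, j] * h[j]
        rho = A[:, j] @ r_j
        # soft-thresholding update of coordinate j
        h[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]

print("support found:", np.flatnonzero(np.abs(h) > 1e-3))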

3) Least Angle Regression (LARS)
The Least Angle Regression (LARS) algorithm begins with ĥ = 0, the residual vector b0 = y, and the active set Λ = ∅. This algorithm selects a new column from the matrix A at each iteration i and adds its index to the set Λi [119]. The column aj1 that has the smallest angle with b0 is selected at the first iteration. Then, the coefficient ĥ1(j1) associated with the selected column aj1 is increased [119]. Next, the smallest possible step in the direction of the column aj1 is taken until another column aj2 has as much absolute correlation with the current residual as the column aj1. The algorithm continues in a direction equiangular between the two active columns (aj1, aj2) until a third column aj3 earns its way into the most correlated set [119]. The algorithm stops when no remaining column has correlation with the current residual [119].
Fig. 5 illustrates the beginning of the LARS algorithm considering a two-dimensional system. As said before, LARS starts with ĥ0 = 0 and the residual vector b0 = y. Let θt(i) be the angle between the column aji and the current residual vector bi = y − Aĥi at iteration i; the column aj1 is selected due to its absolute correlation with the initial residual vector compared to aj2 (θ1(1) < θ1(2)) [120]. Next, the algorithm continues in the direction of aj1 by adding the step size γ1. γ1 is chosen in a way to guarantee that the columns aj1 and aj2 have the same absolute correlation with the current residual vector at the next iteration (θ2(1) = θ2(2)). The solution coefficient is ĥ1(j1) = γ1 [120]. The column aj2 is added to the set Λ at the second iteration, and the LARS continues in an equiangular direction with aj1 and aj2. Then, the step size γ2 that leads to the vector y is added [120]. Finally, the solution coefficients are equal to: ĥ2(j1) = γ1 + γ2 d2(j1) and ĥ2(j2) = γ2 d2(j2), where d2 is the updated direction at the second iteration, which is equiangular with the active columns (aj1, aj2). The estimated vector ĥ is updated by multiplying the step size γ by the updated direction d [120]. The algorithm continues until the residual is zero.

FIGURE 5. LARS approximates the vector y by using aj1 and aj2 [120].

A modified LARS, called the "homotopy algorithm", was proposed by Donoho and Tsaig to find a sparse solution of an underdetermined linear system [44].
These steps summarize the LARS algorithm [120]:
• Step 1: Initialize the residual vector b0 = y, the active set Λ = ∅, ĥ0 = 0, and the iteration counter i = 1.
• Step 2: Calculate the correlation vector: ci = A^T bi−1.
• Step 3: Find the maximum absolute value in the correlation vector: λi = ||ci||∞.
• Step 4: Stop the algorithm if λi ≈ 0. If not, go to Step 5.
• Step 5: Find the active set: Λ = {j : |ci(j)| = λi}.
• Step 6: Solve the following least squares problem to find the active entries of the updated direction: A^T(Λ)A(Λ)di(Λ) = sign(ci(Λ)).
• Step 7: Set the inactive entries of the updated direction to zero: di(Λ^c) = 0.
• Step 8: Calculate the step size γi by:

γi = min_{j∈Λ^c} { (λi − ci(j)) / (1 − aj^T A(Λ)di(Λ)), (λi + ci(j)) / (1 + aj^T A(Λ)di(Λ)) }

• Step 9: Calculate ĥi = ĥi−1 + γi di.
• Step 10: Update bi = y − Aĥi.
• Step 11: Stop the algorithm if ||bi||2 < ε. Otherwise, set i = i + 1 and return to Step 2.

4) The Dantzig Selector (DS)
The Dantzig Selector (DS) is a solution to the l1 minimization problem [121]:

min ||ĥ||1 subject to ||A^T b||∞ ≤ √(1 + δ1) λN σ (22)

where b = y − Aĥ is the residual vector, σ is the standard deviation of the Additive White Gaussian Noise in (6), λN > 0, and all the columns of A have norm less than √(1 + δ1). ||A^T b||∞ is defined by:

||A^T b||∞ = sup_{1≤i≤N} |(A^T b)i| (23)

For an orthogonal matrix A, the Dantzig Selector is the l1-minimizer subject to the constraint ||A^T y − ĥ||∞ ≤ λN σ, and the i-th element of ĥ is calculated by:

ĥ(i) = max(|(A^T y)i| − λN σ, 0) sgn((A^T y)i) (24)

5) Approximate Message Passing (AMP)
The Approximate Message Passing (AMP) algorithm is described in [122], [123]. This algorithm starts with ĥ0 = 0 and b0 = y. Then, in each iteration i, it updates these vectors by:

ĥi = ηi−1(ĥi−1 + A^T bi−1) (25)
bi = y − Aĥi + (1/δ) bi−1 ⟨η′i−1(A^T bi−1 + ĥi−1)⟩ (26)

where δ = M/N, ηi(.) is the soft thresholding function, ⟨u⟩ = Σ_{i=1}^{N} ui/N for u = (u1, ..., uN), and η′i(s) = (∂/∂s) ηi(s). The term (1/δ) bi−1 ⟨η′i−1(A^T bi−1 + ĥi−1)⟩ comes from the theory of belief propagation in graphical models [122].
The thresholding function ηi(.) depends on the iteration and on the problem setting. In [123], the authors consider a threshold control parameter λ and ηi(.) = η(.; λσi) defined by:

η(u; λσi) = u − λσi, if u ≥ λσi
η(u; λσi) = u + λσi, if u ≤ −λσi
η(u; λσi) = 0, otherwise (27)

where σi is the mean square error of the current estimate ĥi at iteration i. The optimal value of λ is [122]:

λ(δ) = (1/√δ) argmax_{x≤0} { [1 − (2/δ)((1 + x²)Φ(−x) − xφ(x))] / [1 + x² − 2((1 + x²)Φ(−x) − xφ(x))] } (28)

where Φ(x) = ∫_{−∞}^{x} e^(−t²/2)/√(2π) dt and φ(x) = e^(−x²/2)/√(2π).
A high-speed FPGA implementation of AMP is presented in [124]. Moreover, in [125], the authors present an implementation of AMP based on memristive crossbar arrays. Furthermore, an adaptive complex approximate message passing (CAMP) algorithm and its hardware implementation in FPGA are proposed in [126].

6) Gradient Descent with Sparsification (GraDeS)
This algorithm was proposed in [127]. It considers a measurement matrix A which satisfies the RIP with an isometry constant δ2s < 1/3. This algorithm finds a sparse solution for the l1 minimization problem in an iterative way.
First, the algorithm initializes the signal estimate ĥ0 = 0. Then, in each iteration i, it estimates the signal by:

ĥi = Hs( ĥi−1 + (1/γ) A^H (y − Aĥi−1) ) (29)

where γ > 1 and the operator Hs() sets all components to zero except the s largest magnitude components.
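A compact way to see the GraDeS update (29) is as a gradient step followed by the hard-thresholding operator Hs. The sketch below implements that iteration in NumPy under assumed sizes, step parameter γ, and iteration count; it illustrates the update rule only and is not a tuned implementation of [127].

import numpy as np

# Sketch of the GraDeS / hard-thresholding iteration (eq. (29)):
#   h <- H_s( h + (1/gamma) A^H (y - A h) )
# Sizes, gamma and the iteration count are assumptions for this example.
rng = np.random.default_rng(4)
N, M, s, gamma = 256, 100, 8, 8.0
h_true = np.zeros(N)
h_true[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ h_true

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x, zero out the rest (H_s)."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]
    out[keep] = x[keep]
    return out

h = np.zeros(N)
for _ in range(200):
    h = hard_threshold(h + (1.0 / gamma) * A.conj().T @ (y - A @ h), s)

print("relative error:", np.linalg.norm(h - h_true) / np.linalg.norm(h_true))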

7) Iterative Soft Thresholding (IST)
In [128], the authors demonstrate that soft thresholding can be used to minimize equations of the form:

(1/2)||Ah − y||2² + τ||h||1 (30)

The solution is given by the limit of a sequence in which each iteration, called the Landweber iteration, is defined by [128]:

ĥi = Sτ( ĥi−1 + β A^H (y − Aĥi−1) ) (31)

where ĥ0 = 0, β is a step size, and Sτ(.) is the soft thresholding function defined by (32), applied to each element of the vector.

Sτ(x) = x − τ, if x > τ
Sτ(x) = 0, if |x| ≤ τ
Sτ(x) = x + τ, if x < −τ (32)

Let BR ⊂ R^n be the l1 ball of radius R, defined by:

BR = {h ∈ R^n : ||h||1 ≤ R} (33)

In [129], the authors suggest a different form of (31), called the projected Landweber iteration, given by:

ĥi = PR( ĥi−1 + A^H (y − Aĥi−1) ) (34)

where PR(x) is the projection of a point x to the closest point (under the l2 norm) onto the convex set BR.
The next steps calculate PR(x), that is, they calculate the vector t that is the closest point (under the l2 distance) in the l1 ball of radius R to x [11]:
• Step 1: If ||x||1 ≤ R, then t = x.
• Step 2: Sort the components of x by magnitude to get the vector x̂ where |x̂1| ≥ |x̂2| ≥ ... ≥ |x̂n|.
• Step 3: Let ||Sx̂k(x)||1 = Σ_{i=1}^{k−1} (x̂i − x̂k); find k such that:

||Sx̂k(x)||1 ≤ R ≤ ||Sx̂k+1(x)||1, i.e., Σ_{i=1}^{k−1} (x̂i − x̂k) ≤ R ≤ Σ_{i=1}^{k} (x̂i − x̂k+1) (35)

• Step 4: Calculate µ = x̂k + (1/k)(R − ||Sx̂k(x)||1). Then t = Sµ(x).
According to [129], the projected Landweber iterative step with an adaptive descent parameter βi > 0 as in (36) will converge to solve argmin_ĥ ||Aĥ − y||2, that is, to minimize over ĥ in the l1 ball BR.

ĥi = PR( ĥi−1 + βi−1 A^H (y − Aĥi−1) ) (36)

βi can be chosen by [118]:
• βi = 1, ∀i.
• βi = ||A^H (y − Aĥi)||2² / ||AA^H (y − Aĥi)||2²
Although IST is guaranteed to converge [128], it converges slowly. Therefore, several modifications have been proposed to speed it up, such as the "fast ISTA" (FISTA) [130].

B. NON-CONVEX OPTIMIZATION TECHNIQUES
1) Bayesian Compressive Sensing (BCS)
Let σ² be the noise variance; the sparse Bayesian learning (SBL) framework assumes the Gaussian likelihood model [131]:

p(y|h; σ²) = (2πσ²)^(−M/2) exp( −(1/(2σ²)) ||y − Ah||² ) (37)

In a Bayesian formulation, the fact that h is sparse is formalized by placing a sparseness-promoting prior on h [132]. The Laplace density function is a widely used sparseness prior [133], [134]:

p(h|λ) = (λ/2)^N exp( −λ Σ_{i=1}^{N} |hi| ) (38)

and henceforth the subscript s on h is dropped, recognizing that the interest is in a sparse solution for the weights [132]. Thus, the solution of (6) corresponds to a maximum a posteriori (MAP) estimate using the prior in (38) [114], [133].
According to Bayesian probability theory, a class of prior probability distributions p(θ) is conjugate to a class of likelihood functions p(x|θ) if the resulting posterior distributions p(θ|x) are in the same family as p(θ) [132]. Since the Laplace prior is not conjugate to the Gaussian likelihood, the relevance vector machine (RVM) is used.
Assuming the hyperparameters α and α0 are known, a multivariate Gaussian distribution with mean and covariance given by (39) and (40) expresses the posterior for h [132].

µ = α0 Σ A^T y (39)
Σ = (α0 A^T A + D)^(-1) (40)

where D = diag(α1, α2, ..., αN). Therefore, the search for the hyperparameters α and α0 can be seen as a learning problem in the context of the RVM. A type-II maximum likelihood (ML) procedure can be used to estimate these hyperparameters from the data [135].
The logarithm of the marginal likelihood for α and α0, denoted L(α, α0), is given by [132]:

L(α, α0) = log p(y|α, α0) = log ∫ p(y|h, α0) p(h|α) dh = −(1/2) [ M log 2π + log |C| + y^T C^(-1) y ] (41)

with C = σ²I + AD^(-1)A^T. The maximization of (41) can be obtained with a type-II ML approximation that uses point estimates for α and α0. This can be achieved through the Expectation-Maximization (EM) algorithm [132], [135], to yield:

αi_new = γi / µi² (42)

where µi is the i-th posterior mean weight from (39) and γi = 1 − αi Σii, with Σii the i-th diagonal element of (40).
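To illustrate the RVM quantities in (39), (40), and (42), the sketch below computes the posterior mean and covariance for fixed hyperparameters and then applies a few hyperparameter re-estimation steps. The sizes, the initial values of α and α0, and the small fixed number of iterations are assumptions made for the example; a complete SBL/BCS implementation would iterate these updates to convergence together with a rule for pruning coefficients whose αi grow large.

import numpy as np

# A few fixed-point iterations of the RVM updates used in BCS:
#   Sigma = (alpha0 A^T A + D)^-1,  mu = alpha0 Sigma A^T y   (eqs. (39)-(40))
#   alpha_i <- gamma_i / mu_i^2, gamma_i = 1 - alpha_i Sigma_ii  (eq. (42))
rng = np.random.default_rng(5)
N, M, s = 64, 32, 4
h_true = np.zeros(N)
h_true[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((M, N)) / np.sqrt(M)
sigma = 0.05
y = A @ h_true + sigma * rng.standard_normal(M)

alpha = np.ones(N)            # precisions of the weights (hyperparameters)
alpha0 = 1.0 / sigma**2       # noise precision, assumed known here

for _ in range(20):
    D = np.diag(alpha)
    Sigma = np.linalg.inv(alpha0 * A.T @ A + D)
    mu = alpha0 * Sigma @ A.T @ y
    gamma = 1.0 - alpha * np.diag(Sigma)
    alpha = gamma / (mu**2 + 1e-12)       # eq. (42), guarded against division by zero

print("largest posterior means at:", np.argsort(np.abs(mu))[-s:])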

2) Focal Underdetermined System Solution (FOCUSS)
The Focal Underdetermined System Solution (FOCUSS) was proposed in [136] to solve (6). First, a low-resolution initial estimate of the real signal is made. Then, the iteration process refines the initial estimate to the final localized energy solution [136]. The FOCUSS iterations are based on a weighted minimum norm solution, defined as the solution minimizing a weighted norm ||W^(-1)h||2. It is given by [136]:

ĥ = W(AW)†y (43)

where the definition of a weighted minimum norm solution is to find h = Wq with q : min ||q||2, subject to AWq = y. When W is diagonal, the cost objective simply becomes ||W†h||2² = Σ_{i=1, wi≠0}^{N} (hi/wi)², where wi are the diagonal entries of W [136].
At the basis of the basic FOCUSS algorithm lies the Affine Scaling Transformation (AST):

q = Ĥ†_{k−1} ĥ (44)

where Ĥ_{k−1} = diag(ĥ_{k−1}) [136]. Let Wpk be the a posteriori weight in each iteration; the AST is used in the basic FOCUSS algorithm to construct the weighted minimum norm constraint (45) by setting Wpk = Ĥ_{k−1} [136].

||W†h||2² = ||q||2² = Σ_{i=1, wi≠0}^{N} (h(i)/w(i))² (45)

Let ĥ0 = 0; the steps of the algorithm are:

Step 1: Wpk = diag(ĥ_{k−1}) (46)
Step 2: qk = (AWpk)†y (47)
Step 3: ĥk = Wpk qk (48)

The algorithm continues until a minimal set of the columns of A that describe y is obtained [136].
By introducing two parameters, the authors extend the basic FOCUSS into a class of recursively constrained optimization algorithms in [136]. In the first extension, ĥ_{k−1} is raised to some power l [136], while in the second extension an additional weight matrix Wak, which is independent of the a posteriori constraints, is used [136]. The following steps describe the algorithm:

Step 1: Wpk = diag(ĥ_{k−1}^l), l ∈ N+ (49)
Step 2: qk = (AWak Wpk)†y (50)
Step 3: ĥk = Wak Wpk qk (51)

It can be assumed that Wak is constant for all iterations. According to [136], l > 0.5 is imposed when h(i) > 0.

3) Iterative Reweighted Least Squares (IRLS)
The Iterative Reweighted Least Squares (IRLS) algorithm is used for solving (52) through the weighted l2 norm given by (53), where the weights are computed from the previous iterate h_{n−1}, so wi = |h_{n−1}(i)|^(p−2) [137].

min_h ||h||p^p subject to Ah = y (52)

min_h Σ_{i=1}^{N} wi h²(i) subject to Ah = y (53)

Let Qn be the diagonal matrix with entries 1/wi = |h_{n−1}(i)|^(2−p); the solution of (53) can be given by:

hn = Qn A^T (A Qn A^T)^(-1) y (54)

To deal with the case 0 ≤ p ≤ 1, where wi will be undefined for h_{n−1}(i) = 0, the authors in [137] regularize the optimization problem by incorporating a small ε > 0:

wi = ((h_{n−1}(i))² + ε)^(p/2−1) (55)

FIGURE 6. Greedy algorithms diagram: Start -> Parameters Initialization -> Projection Subset Selection (c = A^H b_{i−1}, indices j) -> Support Merge (Λi = Λi−1 ∪ Ji) -> Residual Vector Calculation (bi) -> Stop Criterion? (no: repeat; yes: Signal Estimation ĥi -> Stop).

C. GREEDY ALGORITHMS
Several greedy algorithms follow the steps shown in Fig. 6. There are some differences in the number of columns chosen in each iteration, that is, in the way the indices j that compose the set Ji are selected. For the MP, the OMP, and the MPLS algorithms, only one column is chosen in each iteration. In contrast, the StOMP algorithm chooses all columns whose projection value is bigger than the threshold value tS. The calculation of the residual vector bi and the estimation of the non-zero values of ĥ in each iteration are other differences between the algorithms. For example, the MPLS and the SP algorithms estimate ĥ only at the end of the algorithm, as is explained in the subsections below.
Table 3 summarizes the inputs, the calculation of the residual vector bi, and the signal estimate components ĥi in each iteration. In the next subsections, the algorithms presented in Fig. 4 are explained.

TABLE 3. Main parameters and calculations of Greedy Algorithms.

Algorithm | Inputs | j | bi | ĥi
MP | A, y | max_j ||cj|| | b_{i−1} − (a_{li}^H b_{i−1}) a_{li} / ||a_{li}||2² | (a_{li}^H b_{i−1}) / ||a_{li}||2²
MPLS | A, y | max_j ||cj|| | b_{i−1} − (a_{li}^H b_{i−1}) a_{li} / ||a_{li}||2² | –
OMP | A, y | max_j ||cj|| | y − A(Λi)ĥi | A(Λi)†y
SP | A, y, s | s biggest ||cj|| | y − A(Λi)A†(Λi)y | –
StOMP | A, y, T, tS | j : ||cj|| > tS | y − A(Λi)ĥi | A(Λi)†y
CoSaMP | A, y, s | 2s biggest ||cj|| | y − A(Λi)ĥi | supp_s(A(Λi)†y)
ROMP | A, y, s | s biggest ||cj|| | y − A(Λi)ĥi | A(Λi)†y
GOMP | A, y, Q, s | Q biggest ||cj|| | y − A(Λi)ĥi | A(Λi)†y
GOAMP | A, y, Q | Q biggest ||cj|| | y − A(Λi)ĥi | A(Λi)†y
GP | A, y | max_j ||cj|| | b_{i−1} − ai A(Λi)di | ĥ_{i−1} + ai di
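Before the individual descriptions, a compact NumPy sketch of the greedy template of Fig. 6 is given below, instantiated in the style of OMP (one column selected per iteration, least-squares re-estimation on the current support). The problem sizes, the stopping rule, and the variable names are assumptions made for illustration; the subsections that follow define each algorithm precisely.

import numpy as np

# Minimal OMP-style instance of the greedy template of Fig. 6:
# select one column per iteration, re-fit by least squares on the support.
rng = np.random.default_rng(6)
N, M, s = 128, 48, 5
h_true = np.zeros(N)
h_true[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ h_true

support = []                       # Lambda_i
b = y.copy()                       # residual b_0 = y
for _ in range(s):                 # here we simply run s iterations
    c = A.conj().T @ b             # projection / correlation vector
    j = int(np.argmax(np.abs(c)))  # index of the best-aligned column
    if j not in support:
        support.append(j)
    coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    b = y - A[:, support] @ coeffs          # new residual
    if np.linalg.norm(b) < 1e-10:
        break

h_hat = np.zeros(N)
h_hat[support] = coeffs
print("recovered support:", sorted(support))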

1) Matching Pursuit (MP) The MP and the MPLS algorithms are different in the way
The Matching Pursuit (MP) algorithm is proposed in [138]. that they calculate the non-zero signal components. In the
Let ĥ0 = 0, each iteration i of the MP algorithm consists in MPLS algorithm, these components are estimated through
finding the column aki ∈ A which is best aligned with the the LS calculation only in the end of the algorithm.
residual vector bi−1 (b0 = y) according to (56) [138]. After reaching the stop criterion, the signal is estimated
by (61), where T is the number of iterations and A(ΛT ) is a
ki = arg max |al H bi−1 |, l = 1, 2, ...., N (56) submatrix of A consisting of the ai columns with i ∈ ΛT .
l

The index set Λi stores the indices of the best aligned ĥ = A† (ΛT )y (61)
columns after i iterations. Let Di be the matrix formed by
the columns aki chosen until iteration i, the next step is 3) Orthogonal Matching Pursuit (OMP)
Λi = Λi−1 ∪ ki and Di = [Di−1 , aki ], if ki ∈ / Λi−1 . The Orthogonal Matching Pursuit (OMP) algorithm is an
Otherwise, Λi = Λi−1 and Di = Di−1 . improvement of the MP [141]. It can be stated as follows:
Then, a new residual vector is computed as (57) by re- • Step 1: Initialize b0 = y, Λi = ∅, and i = 1.
moving the projection of bi−1 along this direction, and the • Step 2: Find l that solves the maximization problem
H
estimated coefficient is calculated by (58). max||Pal bi−1 ||2 = max al||abl ||i−1
2 and update Λi =
l l 2

(aki H bi−1 )aki Λi−1 ∪ {l}.


bi = bi−1 − Paki bi−1 = bi−1 − (57) • Step 3: Calculate ĥi = A (Λi )y and update bi = y −

||aki ||22
A(Λi )ĥi .
(aki H bi−1 ) • Step 4: Stop the algorithm if the stopping condition is
ĥi (ki ) = ĥi−1 (ki ) + (58) achieved (e.g. ||bi || ≤ ). Otherwise, set i = i + 1 and
||aki ||22
return to Step 2.
The stop criterion of the algorithm can be, for example, In the OMP, the residual vector bi is always orthogonal to
||bi || ≤ . The signal estimate corresponds to the projections the columns that have already been selected. Therefore, there
of the best columns of the matrix A. An ASIC implementa- will be no columns selected twice and the set of selected
tion of MP algorithm is proposed in [139]. columns is increased through the iterations. Moreover, the
sufficient and worst-case necessary conditions for recovering
2) Matching Pursuit based on Least Squares (MPLS) the signal sparsity are investigated in [142]. Furthermore,
Similarly to the MP algorithm, in each iteration of the Match- the condition for the exact support recovery with the OMP
ing Pursuit based on Least Squares (MPLS) [140] algorithm, algorithm based on RIP and the minimum magnitude of the
the column aki ∈ A which is best aligned with the residual non-zero taps of the signal are studied in [143], [144].
vector bi−1 (where b0 = y) is selected according to (59). In [145], the authors propose two modifications to the
OMP in order to reduce the hardware complexity of the
ki = arg max |al H bi−1 |, l = 1, 2, ...., N (59) OMP: Thresholding technique for OMP and Gradient De-
l
scent OMP. Reconfigurable, parallel, and pipelined architec-
Let Λi be the index set of the best aligned columns of A
tures for the OMP and its two modifications are implemented
until the iteration i, Λi is updated by Λi = Λi−1 ∪ ki if ki ∈
/
on 65nm CMOS technology operating at 1V supply voltage
Λi−1 . Otherwise, Λi = Λi−1 .
to reconstruct data vector sizes ranging from 128 to 1024.
Then, the new residual vector is computed as:
These modifications lead to a 33% reduction in reconstruc-
(aki H bi−1 )aki tion time and to a 44% reduction in chip area when compared
bi = bi−1 − Paki bi−1 = bi−1 − (60) to the OMP ASIC implementation.
||aki ||22
However, several other OMP hardware implementations are proposed in the literature [124], [146]–[152]. Step 3, specifically the least squares operation, is the most costly part of the OMP implementation. The most commonly used methods to deal with this are the QR decomposition and the Cholesky decomposition.
Because the OMP selects only one column in each iteration, it is very sensitive to the selection of the index [153]. Alternatively, various approaches investigating multiple columns chosen in each iteration have been proposed, such as the SP, the StOMP, the CoSaMP, the ROMP, the GOMP, the GOAMP, the MMP, and the GP algorithms. Furthermore, the Block Orthogonal Matching Pursuit (BOMP) algorithm [154] was developed to recover block sparse signals and its performance was investigated in [155], [156].

4) Subspace Pursuit (SP)
At each stage, in order to refine an initially chosen estimate for the subspace, the Subspace Pursuit (SP) algorithm tests subsets of s columns in a group [157]. That is, maintaining s columns of A, the algorithm executes a simple test in the spanned list of space, and then refines the list by discarding the unreliable candidates, retaining reliable ones while adding the same number of new candidates [157]. Basically, the steps of the SP are:
• Step 1: Initialize the support set Λ0 with the s indices corresponding to the largest magnitude entries in the vector A^H y, the residual vector b0 = y − A(Λ0)A(Λ0)† y, and the iteration counter i = 1.
• Step 2: Λ̂i = Λi−1 ∪ Ji, where Ji is the set of the s indices corresponding to the largest magnitude entries in the vector ci = A^H bi−1.
• Step 3: Calculate xi = A†(Λ̂i) y.
• Step 4: Update Λi = {s indices corresponding to the largest magnitude elements of xi}.
• Step 5: Update bi = y − A(Λi)A†(Λi) y.
• Step 6: Stop the algorithm if the stopping condition is achieved. Otherwise, set i = i + 1 and return to Step 2.
After T iterations, the estimated signal is given by ĥ = A†(ΛT) y.
When the signal is very sparse, the SP algorithm has computational complexity upper-bounded by O(sMN) (s ≤ const.·√N), that is, lower computational complexity than the OMP algorithm [157]. However, when the non-zero components of the sparse signal decay slowly, the computational complexity of the SP can be further reduced to O(MN log s) [157].
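The pruning logic of the SP is easier to see in code. Below is a small NumPy sketch of one possible reading of the steps above; the helper name, iteration budget, and the residual-based stop test are assumptions, not taken from [157].

```python
import numpy as np

def subspace_pursuit(A, y, s, n_iter=20):
    """Sketch of SP: keep a support of size s, test s new candidates,
    then prune back to the s largest least-squares coefficients."""
    M, N = A.shape
    # Step 1: initial support from the s largest entries of |A^H y|
    support = np.argsort(np.abs(A.conj().T @ y))[-s:]
    coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    b = y - A[:, support] @ coeffs
    for _ in range(n_iter):
        # Step 2: merge with the s indices most correlated with the residual
        extra = np.argsort(np.abs(A.conj().T @ b))[-s:]
        merged = np.union1d(support, extra)
        # Step 3: least squares on the merged support
        x, *_ = np.linalg.lstsq(A[:, merged], y, rcond=None)
        # Step 4: keep the s entries of largest magnitude
        support = merged[np.argsort(np.abs(x))[-s:]]
        # Step 5: update the residual on the pruned support
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        b_new = y - A[:, support] @ coeffs
        if np.linalg.norm(b_new) >= np.linalg.norm(b):  # simple stop test
            break
        b = b_new
    h_hat = np.zeros(N, dtype=A.dtype)
    h_hat[support] = coeffs
    return h_hat
```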
5) Stagewise Orthogonal Matching Pursuit (StOMP)
The Stagewise Orthogonal Matching Pursuit (StOMP) [158] algorithm is inspired by the OMP. Differently from the OMP algorithm, the StOMP algorithm selects multiple columns at each iteration. That is, according to a threshold, the StOMP algorithm selects the subspaces composed of the columns with the highest coherence between the remaining columns and the residual vector [158]. The number of iterations is fixed.
The input parameters are: the number of iterations T to perform, the threshold value tS, the received signal y, and the measurement matrix A. The StOMP algorithm can be stated as follows:
• Step 1: Initialize the residual vector b0 = y, Λ0 = ∅, and i = 1.
• Step 2: Find all columns al such that ||Pal bi−1|| > tS, that is, |al^H bi−1| / ||al||2 > tS, and add these columns to the set of selected columns: Λi = Λi−1 ∪ {l}.
• Step 3: Let ĥi = A(Λi)† y. Update bi = y − A(Λi) ĥi.
• Step 4: If the stopping condition is achieved (i = Nit = T), stop the algorithm. Otherwise, set i = i + 1 and return to Step 2.

6) Compressive Sampling Matching Pursuit (CoSaMP)
The Compressive Sampling Matching Pursuit (CoSaMP) algorithm is presented in [159] to mitigate the instability of the OMP algorithm. Similarly to the OMP, it starts by initializing a residual vector as b0 = y, the support set as Λ0 = ∅, the iteration counter as i = 1, and additionally sets ĥ0 = 0. The CoSaMP performs these steps [159]:
• Step 1 - Identification: a proxy of the residual vector from the current samples is formed and the largest components of the proxy ci = |A^H bi−1| are located. The 2s entries of ci with largest absolute values are selected, and the selected indices compose Ji.
• Step 2 - Support merger: the set of newly identified components is united with the set of components that appears in the current approximation. Λi = Ji ∪ supp(ĥi−1) is defined as the augmentation of the support of the previous estimate ĥi−1 with the 2s indices corresponding to the entries of ci with largest absolute values.
• Step 3 - Estimation: a least-squares problem to approximate the target signal on the merged set of components is solved: x̂i = A(Λi)† y.
• Step 4 - Pruning: a new approximation is produced by retaining only the largest entries in this least-squares signal approximation: ĥi keeps the s entries of x̂i with largest absolute values.
• Step 5 - Sample update: update bi = y − A(Λi) ĥi.
FPGA implementations of CoSaMP are presented in [160], [161]. While an iterative Chebyshev-type method is used in [161] to calculate the matrix inversion process during the algorithm, [160] uses a QR decomposition method.
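For comparison with the OMP and SP sketches above, one CoSaMP iteration can be written as follows; the function name, iteration budget, and stopping rule are illustrative assumptions.

```python
import numpy as np

def cosamp(A, y, s, n_iter=20):
    """Sketch of CoSaMP: identify 2s candidates, merge with the current
    support, solve least squares, then prune back to s entries."""
    M, N = A.shape
    h_hat = np.zeros(N, dtype=A.dtype)
    b = y.copy()
    for _ in range(n_iter):
        # Step 1 - Identification: 2s largest entries of the proxy |A^H b|
        proxy = np.abs(A.conj().T @ b)
        J = np.argsort(proxy)[-2 * s:]
        # Step 2 - Support merger with the current approximation
        support = np.union1d(J, np.flatnonzero(h_hat))
        # Step 3 - Estimation by least squares on the merged support
        x, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        # Step 4 - Pruning: keep only the s largest coefficients
        keep = np.argsort(np.abs(x))[-s:]
        h_hat = np.zeros(N, dtype=A.dtype)
        h_hat[support[keep]] = x[keep]
        # Step 5 - Sample update
        b = y - A @ h_hat
        if np.linalg.norm(b) <= 1e-6 * np.linalg.norm(y):
            break
    return h_hat
```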
7) Regularized OMP (ROMP)
The Regularized OMP (ROMP) algorithm was proposed in [162]. Firstly, the ROMP algorithm initializes Λ0 = ∅ and the residual vector b0 = y. Then, during each iteration i, the ROMP performs these three steps:
• Step 1 - Identification: Λ̂i = {s biggest indices in magnitude of the projection vector ci = A^H bi−1}.
• Step 2 - Regularization: Among all subsets Ji ⊂ Λ̂i with comparable coordinates, |c(l)| ≤ 2|c(j)| for all l, j ∈ Ji, choose the Ji with the maximal energy ||c(Ji)||2.
• Step 3 - Updating: Add the set Ji to the index set: Λi = Λi ∪ Ji. Calculate ĥi = A(Λi)† y and update the residual vector bi = y − A(Λi) ĥi.
The regularization step can be done in linear time. The running time of the ROMP is comparable to that of the OMP in theory, but it is often better than the OMP in practice [162].

8) Generalized Orthogonal Matching Pursuit (GOMP)
The Generalized Orthogonal Matching Pursuit (GOMP) algorithm is a direct extension of the OMP algorithm [163]. The GOMP selects the Q ≥ 1 columns of the matrix A with the largest correlation with the residual vector b. When Q = 1, the GOMP becomes the OMP. Moreover, Q ≤ s and Q ≤ √M. The steps of the GOMP are:
• Step 1: Initialize the residual vector b0 = y, Λ0 = ∅ and i = 1.
• Step 2: Find the Q biggest columns al1, ..., alQ that solve the maximization problem max_k ||Palk bi−1||2 = max_k |alk^H bi−1| / ||alk||2 and add these columns to the set of selected columns. Update Λi = Λi−1 ∪ {l1, ..., lQ}.
• Step 3: Calculate ĥi = A†(Λi) y. Update bi = y − A(Λi) ĥi.
• Step 4: Stop the algorithm if the stopping condition is achieved (Nit = min(s, M/Q) iterations reached or ||bi||2 ≤ ε). Otherwise, set i = i + 1 and return to Step 2.
The complexity of the GOMP algorithm is approximately 2 Nit M N + (2Q² + Q) Nit² M [163]. The RIP-based sufficient conditions for the exact support recovery with the GOMP in the noisy case are investigated in [164].

9) Generalized Orthogonal Adaptive Matching Pursuit (GOAMP)
The Generalized Orthogonal Adaptive Matching Pursuit (GOAMP) algorithm considers that the signal's sparsity is not known, so it adapts the variable Q of the GOMP algorithm during the iterations [165]. Basically, the GOAMP inserts a new step after the update of the residual vector:
• Step 1: Initialize the residual vector b0 = y, Λ0 = ∅ and i = 1.
• Step 2: Find the Q biggest columns al1, ..., alQ that solve the maximization problem max_k ||Palk bi−1||2 = max_k |alk^H bi−1| / ||alk||2 and add these columns to the set of selected columns. Update Λi = Λi−1 ∪ {l1, ..., lQ}.
• Step 3: Calculate ĥi = A(Λi)† y. Update bi = y − A(Λi) ĥi.
• Step 4: If ||bi−1 − bi||²₂ / ||bi−1||²₂ < ε2, set Q = f(Q). Otherwise, go to Step 5.
• Step 5: Stop the algorithm if the stopping condition is achieved (||bi||2 ≤ ε1). Otherwise, set i = i + 1 and return to Step 2.
Here, f(Q) is a function that increases the value of Q. According to [165], ε2 is about 0.7 − 0.9.
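Since the GOMP differs from the OMP essentially in the number of columns added per iteration, its selection loop can be sketched as below; the residual-based stopping threshold and the guard against reselecting columns are assumptions for illustration.

```python
import numpy as np

def gomp(A, y, s, Q=2, eps=1e-6):
    """Sketch of GOMP: add the Q columns most correlated with the
    residual at every iteration, then re-estimate by least squares."""
    M, N = A.shape
    support = np.array([], dtype=int)
    b = y.copy()
    n_iter = min(s, M // Q)            # iteration budget from the text
    for _ in range(n_iter):
        # Step 2: Q columns with the largest normalized correlation
        corr = np.abs(A.conj().T @ b) / np.linalg.norm(A, axis=0)
        corr[support] = 0              # guard: do not reselect chosen columns
        picks = np.argsort(corr)[-Q:]
        support = np.union1d(support, picks)
        # Step 3: least-squares re-estimation on the enlarged support
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        b = y - A[:, support] @ coeffs
        # Step 4: stopping condition on the residual norm
        if np.linalg.norm(b) <= eps:
            break
    h_hat = np.zeros(N, dtype=A.dtype)
    coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    h_hat[support] = coeffs
    return h_hat
```

Setting Q = 1 reduces this loop to the OMP sketch given earlier, which is exactly the relation stated in the text.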
10) Gradient Pursuit (GP)
The Gradient Pursuit (GP) algorithms were proposed in [166] as variations of the MP algorithm. In the GP, at iteration i, the signal estimate ĥi is:

ĥi = ĥi−1 + γi di    (62)

where di is the update direction and γi is the optimal step size defined by:

γi = ⟨bi−1, A(Λi) di⟩ / ||A(Λi) di||²₂    (63)

In the MP and the OMP algorithms, the update direction is taken to be in the direction of the best aligned column of the matrix A. In the OMP, once added, a column will not be selected again, as the orthogonalisation process ensures that all future residuals remain orthogonal to all currently selected columns. However, in the MP and the GP the orthogonality is not ensured. Hence, it is possible to select the same column again.
Each iteration i consists in finding the column ali ∈ A which is best aligned with the residual vector bi−1. The GP algorithms perform these steps:
• Step 1: Initialize b0 = y, Λ0 = ∅ and i = 1.
• Step 2: Find li that solves the maximization problem max_li ||Pali bi−1||2 = max_li |ali^H bi−1| / ||ali||2. Update Λi = Λi−1 ∪ {li}.
• Step 3: Update the direction di. Calculate γi = ⟨bi−1, A(Λi) di⟩ / ||A(Λi) di||²₂ and ĥi = ĥi−1 + γi di. Update bi = bi−1 − γi A(Λi) di.
• Step 4: Stop the algorithm if the stopping condition is achieved. Otherwise, set i = i + 1 and return to Step 2.
There are three different methods for calculating the update direction di [11], [166]:
• Gradient Pursuit: uses the direction that minimises ||y − Aĥi−1||2, that is:

di = A^T(Λi) (y − A(Λi) ĥi−1(Λi))    (64)

• Conjugate Gradient Pursuit: it is a directional optimization algorithm that is guaranteed to solve quadratic optimization problems in as many steps as the dimension of the problem [167]. Let φ(h) = ½ h^T G h − f^T h be the cost function to be minimised; this method chooses di that is G-conjugate to all the previous directions, that is:

di^T G dk = 0, ∀k < i    (65)

In this case, G = A^T(Λi) A(Λi). Let Di be the matrix whose columns are the update directions for the first i iterations and let gi be the gradient of the cost function in iteration i; the new update direction di in iteration i is given by [167]:

di = gi + Di−1 f    (66)

where f = −(D^T_{i−1} G D_{i−1})^{−1} D^T_{i−1} G g_{i−1}. The OMP uses a full conjugate gradient solver at every iteration. Instead, in this method, only a directional update step occurs for each newly added element.
• Approximate Conjugate Gradient Pursuit: the new direction is conjugate to the previous direction, but this can be extended to a larger number of directions:

di = gi + di−1 f    (67)

The G-conjugacy implies that:

⟨G di−1, (gi + f di−1)⟩ = 0    (68)

f = − ⟨A(Λi) di−1, A(Λi) gi⟩ / ||A(Λi) di−1||²₂    (69)

11) Multipath Matching Pursuit (MMP)
With the help of the greedy strategy, the Multipath Matching Pursuit (MMP) algorithm executes a tree search [153]. First, the MMP algorithm searches multiple promising candidate columns of the matrix A and then, at the end, it chooses the candidate minimizing the residual. The MMP algorithm cannot be represented by Fig. 6. Let L be the number of child paths of each candidate, f_i^k be the k-th candidate in the i-th iteration, Fi = {f_i^1, ..., f_i^u} be the set of candidates in the i-th iteration and |Fi| be the number of elements of Fi. Ω^k is the set of all possible combinations of k columns in A; for example, if Ω = {1, 2, 3} and k = 2, then Ω^k = {{1, 2}, {1, 3}, {2, 3}} [153].
Fig. 7 shows a comparison of a hypothetical choice of columns in the first 3 iterations of the OMP and the MMP algorithms. In this figure, the OMP selects the column with index 2 in the first iteration, then index 1 in the next iteration, and index 4 in the third iteration. On the other hand, the MMP algorithm selects the indices 2 and 4 in the first iteration; afterwards, for each selected index, the algorithm selects another L = 2 indices in each iteration. Then, in the second iteration, it selects the indices 1 and 5 for the index 2 and for the index 4, but it does not necessarily select the same indices, as can be noted in the third iteration, where the MMP selects the indices 4 and 5 for {2, 1}, composing f31 = {2, 1, 4} and f32 = {2, 1, 5}, and the indices 2 and 3 for {4, 1}, composing f31 = {2, 1, 4} and f35 = {4, 1, 3}. Moreover, it can be noticed that although the number of candidates increases as the iterations go on (each candidate brings forth multiple children), the increase is actually moderate since many candidates overlap in the middle of the search, as is the case of f31, f32 and f33 in Fig. 7 [153].

FIGURE 7. Comparison between the OMP and the MMP algorithms (L = 2): (a) OMP (b) MMP [153].

The residual vector of the k-th candidate in the i-th iteration is b_i^k = y − A(f_i^k) ĥ_i^k, where A(f_i^k) is the matrix A using only the columns indexed by f_i^k. Given the measurement matrix A, the received signal y, the signal's sparsity s and the parameter L, the MMP follows the steps below:
• Step 1: Initialize b0 = y, F0 = ∅ and i = 1.
• Step 2: Set Fi = ∅, u = 0 and k = 1.
• Step 3: Choose the L best column indices that solve the maximization problem arg max ||A^H b_{i−1}^k||²₂ to compose π and set j = 1.
• Step 4: Set f_temp = f_{i−1}^k ∪ {πj}, where πj is the j-th element of the set π.
• Step 5: If f_temp ∉ Fi then u = u + 1, f_i^u = f_temp, Fi = Fi ∪ {f_i^u}, update ĥ_i^u = A†(f_i^u) y and b_i^u = y − A(f_i^u) ĥ_i^u. Otherwise, go to Step 6.
• Step 6: Set j = j + 1. If j ≤ L then go to Step 4. Otherwise, go to Step 7.
• Step 7: Set k = k + 1. If k ≤ |Fi−1| then go to Step 3. Otherwise, go to Step 8.
• Step 8: Set i = i + 1. If i > s then go to Step 9. Otherwise, go to Step 2.
• Step 9: Find the index of the best candidate, that is, u* = arg min_u ||b_s^u||²₂. Set Λ = f_s^{u*} and calculate the estimated signal ĥ = A†(Λ) y.
If the arg max ||A^H b_{i−1}^k||²₂ in Step 3 is calculated as in the OMP algorithm, the MMP algorithm is called the Tree-based Orthogonal Matching Pursuit (TOMP) algorithm [168].

12) Iterative Hard Thresholding (IHT)
The Iterative Hard Thresholding (IHT) algorithm [169] is an iterative method that performs a thresholding function on each iteration. This algorithm cannot be represented by Fig. 6. Let ĥ0 = 0 and i = 1; for each iteration:

ĥi = Hs(ĥi−1 + A^H (y − A ĥi−1))    (70)

where Hs(·) is a non-linear operator that sets all elements to zero except the s elements having the largest amplitudes.
The IHT algorithm can stop after a fixed number of iterations or it can terminate when the sparse vector does not change much between consecutive iterations, for example [170].
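The IHT update (70) is a gradient step followed by hard thresholding, which makes it very compact to express; the sketch below assumes A is scaled so that the plain gradient step is stable, and uses an assumed iteration budget and change-based stop test.

```python
import numpy as np

def iht(A, y, s, n_iter=100):
    """Sketch of IHT: gradient step on ||y - A h||^2 followed by
    hard thresholding to the s largest-magnitude entries, as in (70)."""
    M, N = A.shape
    h_hat = np.zeros(N, dtype=A.dtype)        # h_0 = 0
    for _ in range(n_iter):
        grad_step = h_hat + A.conj().T @ (y - A @ h_hat)
        # H_s(.): keep only the s entries with largest amplitude
        keep = np.argsort(np.abs(grad_step))[-s:]
        h_new = np.zeros(N, dtype=A.dtype)
        h_new[keep] = grad_step[keep]
        # stop when the estimate barely changes between iterations
        if np.linalg.norm(h_new - h_hat) <= 1e-8 * (np.linalg.norm(h_hat) + 1e-12):
            h_hat = h_new
            break
        h_hat = h_new
    return h_hat
```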
D. OTHER ALGORITHMS
This work presents some sparse recovery algorithms. However, readers who want to know other algorithms, in addition to the vast list presented above, can find some of them in: Back-tracking based Adaptive Orthogonal Matching Pursuit (BAOMP) [171], Chaining Pursuit (CP) [172], Conjugate Gradient Iterative Hard Thresholding [173], Differential Orthogonal Matching Pursuit (D-OMP) [174], Fast Iterative Shrinkage Thresholding Algorithm (FISTA) [130], Forward-Backward Pursuit (FBP) [175], Fourier sampling algorithm [176], Hard Thresholding Pursuit [177], Heavy Hitters on Steroids (HHS) [178], Normalized Iterative Hard Thresholding [179], lp-Regularized Least-Squares Two Pass [180], Sequential Least Squares Matching Pursuit (SLSMP) [115], Sparse Adaptive Orthogonal Matching Pursuit (SpAdOMP) [181], Sparse Reconstruction by Separable Approximation (SpaRSA) [182], Stochastic Gradient Pursuit (SGP) [183], Stochastic Search Algorithms [184], Tree Search Matching Pursuit (TSMP) [185], and Vector Approximate Message Passing (VAMP) [186].

V. ALGORITHM DISCUSSION
This section presents a generic comparison of some of the algorithms previously mentioned and some performance comparisons found in the literature. Moreover, open research challenges related to the CS theory, especially concerning sparse recovery algorithms, are presented.

A. GENERIC DISCUSSION
1) BP × OMP
While the OMP algorithm begins with an empty set of columns and adds to the support set only the most important new column among all those available at each step, the BP-simplex begins with a "full" index set and then iteratively improves this set by replacing negligible terms with useful new ones [111].

2) OMP × SP
The difference between the OMP and the SP algorithms lies in the way they generate the index set Λi. In the case of the OMP, after an index is included in the set Λi, it stays there during the whole process. Differently from this, the SP holds an estimate Λi of size s that is refined in each iteration [157]. Hence, an index can be added to or removed from the estimated support set at any iteration of the algorithm [157]. Moreover, the SP solves two least squares problems in each iteration while the OMP solves only one.

3) SP × CoSaMP
The SP and the CoSaMP algorithms add the new candidate columns in different ways. With the SP algorithm, only s indices are added at each iteration, while 2s columns are added in the CoSaMP algorithm [157]. Furthermore, the signal estimate ĥ is different for SP and CoSaMP. While the SP solves an LS problem again to get the final approximation of the current iteration, the CoSaMP algorithm keeps the s largest components of x̂ [157]. Therefore, the SP solves two least squares problems in each iteration whereas the CoSaMP solves only one [157], [187].

4) StOMP × OMP
The StOMP and the OMP algorithms differ in the number of columns selected at each iteration: the OMP selects one column, while the StOMP selects several columns. Thus, the StOMP algorithm is generally faster than the OMP. StOMP can produce a good approximation with a small number of iterations [15], but it has to determine an appropriate threshold value, since different threshold values could lead to different results [15], [158].

5) ROMP × SP
The ROMP and the SP algorithms generate the support set Λi in different ways. The ROMP algorithm generates it sequentially, by adding one or many reliable indices to the existing list in each iteration. While the SP algorithm re-evaluates all the indices at each iteration, in the ROMP algorithm an index added to the list cannot be removed [162].

6) ROMP × StOMP
The threshold value is the difference between the ROMP and the StOMP algorithms. While the ROMP uses all columns whose dot product is above half the size of the largest dot product, the StOMP uses a preset threshold value [162].

7) GP × IHT
The difference between the GP and the IHT algorithms is in how the sparsity constraint is enforced. For the gradient pursuit algorithms, in each iteration a new dictionary element is added and cannot be removed afterwards. In contrast, in the IHT algorithm, indices can be added and removed because it keeps only the most important (decided by the largest magnitude) dictionary elements [166].

8) TOMP × OMP
The major difference between the TOMP and the OMP algorithms is the manner (quantity) of selecting the columns of the measurement matrix A. While the TOMP algorithm sequentially selects the whole next "good" family of columns, the OMP algorithm sequentially selects the next "good" column [168].

B. PERFORMANCE DISCUSSION
A performance comparison between algorithms from each category is analyzed below. From the Convex Relaxation category, the AMP and FISTA algorithms were implemented. The BCS via RVM was implemented representing the Non-convex Optimization category. And finally, from the Greedy algorithms, the MP and OMP were implemented.
Let Ns be the number of realizations. The average normalized mean squared error (NMSE) described by (71) is used to evaluate the algorithms in terms of the size of the measured signal y (M) and the signal's sparsity.

NMSE = (1/Ns) Σ ||h − ĥ||²₂ / ||h||²₂    (71)
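As an illustration of how such a Monte-Carlo evaluation can be organized, the sketch below draws Bernoulli-Gaussian test signals and i.i.d. Gaussian matrices (as in the simulation setup described next) and averages the NMSE of (71); the harness, the parameter defaults, and the recover(A, y, s) interface are assumptions and not the exact code used to produce Figs. 8 and 9.

```python
import numpy as np

def average_nmse(recover, M, N=1024, gamma=0.05, snr_db=30, n_runs=100, seed=0):
    """Monte-Carlo NMSE of (71) for a recovery routine `recover(A, y, s)`.
    A is i.i.d. Gaussian N(0, 1/M); h is Bernoulli-Gaussian with activity gamma."""
    rng = np.random.default_rng(seed)
    nmse = 0.0
    for _ in range(n_runs):
        A = rng.normal(0.0, np.sqrt(1.0 / M), size=(M, N))
        active = rng.random(N) < gamma               # Bernoulli support
        h = np.where(active, rng.normal(0.0, 1.0, N), 0.0)
        y_clean = A @ h
        noise_var = np.mean(np.abs(y_clean) ** 2) / (10 ** (snr_db / 10))
        y = y_clean + rng.normal(0.0, np.sqrt(noise_var), M)
        h_hat = recover(A, y, max(int(np.count_nonzero(h)), 1))
        nmse += np.sum(np.abs(h - h_hat) ** 2) / np.sum(np.abs(h) ** 2)
    return 10 * np.log10(nmse / n_runs)              # NMSE in dB
```

Passing different recovery routines to the same harness is one way to reproduce, in spirit, the NMSE-versus-M and NMSE-versus-γ comparisons shown in Figs. 8 and 9.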
The system model is defined by (6). For these simulations, Ns = 1000, N = 1024, and A is i.i.d. Gaussian, with elements distributed N(0, M−1). The sparse signal h to be estimated is Bernoulli-Gaussian, that is, its elements are i.i.d. N(0, 1) with probability γ and the others are set to 0. The signal-to-noise ratio (SNR) is 30 dB. The results are compared to the theoretical performance bound "Oracle", which has prior knowledge of the non-zero tap positions. Its non-zero coefficients are calculated by applying the Least Squares algorithm using the submatrix As composed of the columns related to the non-zero tap positions of the signal to be estimated.
First, the algorithms' performances are analyzed varying the size of M for γ = 0.05, as shown in Fig. 8.

FIGURE 8. Algorithms performances varying M for γ = 0.05.

It can be seen that the performances of all the algorithms increase when the number of measurements M increases. However, it can be noticed that a low M value (M < N) allows the algorithms to recover the sparse signal resulting in low NMSE values. Among the algorithms analyzed, the BCS presents the best performance. Furthermore, its performance is close to the one achieved by the "Oracle". It confirms the good results achieved by the algorithms from the Bayesian theory.
The algorithms' performances are also analyzed varying γ for M = 512, as shown in Fig. 9.

FIGURE 9. Algorithms performances varying γ for M = 512.

According to Fig. 9, as the signal becomes less sparse (i.e. γ increases), the performances of all algorithms decrease, that is, the NMSE values increase. When γ is low, BCS is the algorithm that achieves the best performance (lower NMSE value), which is close to the one achieved by the "Oracle". However, when the signal to be estimated is less sparse (big γ values), FISTA shows a better performance in recovering the signal.
Table 4 shows the percentage of non-zero tap positions correctly found for the five algorithms analyzed. The result of FISTA is not presented for M = 200 because in this scenario the algorithm did not converge. It can be observed that when the M value increases, the percentage of non-zero tap positions correctly found increases. Moreover, notice that although the AMP and FISTA algorithms present the highest percentage values for M = 400 and γ = 0.05, the algorithms BCS and OMP are the ones that achieve the best results in terms of NMSE (see Fig. 8). It means that even if BCS and OMP correctly find fewer non-zero tap positions than the algorithms AMP and FISTA, BCS and OMP are better able to estimate the non-zero coefficients, resulting in lower NMSE values.

TABLE 4. Percentage of non-zero tap positions correctly found.

Algorithm | γ = 0.05, M = 200 | γ = 0.05, M = 400 | γ = 0.1, M = 512 | γ = 0.2, M = 512
AMP       | 81.7%             | 98.4%             | 97.7%            | 88.5%
FISTA     | -                 | 98.5%             | 98.2%            | 89.5%
BCS       | 93.9%             | 97.0%             | 95.8%            | 91.9%
OMP       | 92.1%             | 96.7%             | 95.8%            | 92.9%
MP        | 67.6%             | 96.8%             | 96.2%            | 70.1%

Moreover, it can be observed that when the γ value increases, that is, the signal becomes less sparse, the percentage of non-zero tap positions correctly found decreases. This occurs for all the algorithms analyzed and confirms what was suggested in Fig. 9.
Other performance comparisons between algorithms can be found in the literature. Some of them are presented below.
A performance comparison of the SP, the OMP, the ROMP, the GOMP, and the GOAMP algorithms is made in [12] for the reconstruction of an image. The recovery performance was analyzed in terms of the Peak Signal to Noise Ratio (PSNR) value achieved and the running time elapsed. From these simulations, the PSNR value is better when the GOAMP algorithm is used.
In [31], the authors compare the BCS, the BP, the GraDeS, the OMP, and the IHT algorithms to estimate a noisy sparse signal of length N = 1024. The metrics used were: phase transition diagram, recovery time, recovery error, and covariance. The results show that convex relaxation techniques perform better in terms of recovery error, while greedy algorithms are faster, and Bayesian-based techniques appear to have an advantageous balance of small recovery error and short recovery time [31].

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2018.2886471, IEEE Access

E. C. Marques et al.: A Review of Sparse Recovery Algorithms

A comparison between the OMP and the modified LARS for solving the LASSO is made in [120], considering the solution accuracy and the convergence time. The results show that generally the OMP requires fewer iterations than the LARS to converge to the final solution, suggesting that the OMP is much faster than the LARS [120]. However, for the cases where some columns of A are highly correlated, the OMP was considered less accurate than the LARS [120].
In [163], the authors compare the GOMP, the OMP, the StOMP, the ROMP, and the CoSaMP algorithms for a 128 × 256 measurement matrix A generated by a Gaussian distribution N(0, 1/128). The sparse signal varies from s = 1 to s = 70 and it is generated in two ways: Gaussian signals and pulse amplitude modulation (PAM) signals. The results show that the critical sparsity of the GOMP algorithm is larger than that of the OMP, the ROMP, the StOMP, and the CoSaMP algorithms [163].
The OMP, StOMP, CoSaMP, MMP, and BPDN algorithms are compared in [153] varying the SNR for two different sparsity values (s = 20 and s = 30). The 100 × 256 measurement matrix is generated by a Gaussian distribution. The results show that the MMP performs close to the OMP for s = 20, but for s = 30 the performance of the MMP is better [153]. Moreover, the running time of these algorithms is shown as a function of s. The MMP algorithm has the highest running time, and the OMP and the StOMP algorithms have the lowest running time among the algorithms under test [153].
In [137], the authors compare the performance of the IRLS algorithm with and without regularization. The results show that for p = 1 the unregularized IRLS and the regularized IRLS are almost identical, but for p = 0 and p = 1/2 the regularized IRLS algorithm recovers the greatest range of signals [137].
The authors in [168] compare the performance of the TOMP, the BP, and the OMP algorithms. According to their results, the TOMP needs fewer iterations than the OMP because the TOMP algorithm selects a whole tree at a time and not only one element. Moreover, the TOMP can achieve better results than the BP and the OMP in reconstruction quality [168].
In [124], the authors implement the OMP and AMP algorithms in FPGA. As the OMP processing time increases quadratically with the number of non-zero coefficients of the signal to be estimated, this algorithm is more suitable to recover very sparse signals. On the other hand, if the signal to be estimated has several non-zero components, it is more efficient to use the AMP algorithm than the OMP to recover the signal.
The GraDeS and LARS algorithms have complexity O(MN). Table 5 presents the complexity of other algorithms as well as the minimum measurement (M) requirement.

TABLE 5. Complexity and minimum measurement (M) requirement.

Algorithm | Complexity  | M             | Reference
BP        | O(N³)       | O(s log N)    | [5], [111], [157]
CoSaMP    | O(MN)       | O(s log N)    | [5], [18], [159]
IHT       | O(MN)       | O(s log(N/s)) | [170]
MP        | O(MN Nit)   | O(s log(N/s)) | [138], [188]
OMP       | O(sMN)      | O(s log N)    | [5], [18], [157], [162]
ROMP      | O(sMN)      | O(s log² N)   | [5], [18], [159], [162]
SP        | O(sMN)      | O(s log(N/s)) | [5], [157]
StOMP     | O(N log N)  | O(N log N)    | [5], [18], [158]

The storage cost per iteration (number of floating point numbers) of some sparse recovery algorithms presented in [189] is reproduced in Table 6, where E is the computational cost of storing A or A^T, k is the size of the support set Λi in iteration i, and η is the number of conjugate gradient steps used per iteration, which is lower than or equal to the number of selected elements.

TABLE 6. Storage cost [189].

Algorithm | Storage cost
GP        | 2M + E + 2k + N
MP        | E + M + 2k + N
OMP       | 2Mk + 0.5k² + 2.5k + E + N
StOMP     | 2M + E + 2k + N

Finally, Table 7 presents the recovery condition related to the RIC value of the matrix A for some algorithms.

TABLE 7. The RIC value of the matrix A.

Algorithm | The RIC value                            | Reference
BP        | δ2s < √2 − 1 or δs + δ2s + δ3s < 1       | [127]
CoSaMP    | δ4s < 0.1                                | [127]
GraDeS    | δ2s < 1/3                                | [127]
IHT       | δ3s < 1/√32 and δ2s < 0.25               | [190]
LARS      | δ2s < √2 − 1 or δs + δ2s + δ3s < 1       | [127]
OMP       | δs+1 < 1/√(s+1)                          | [142]
ROMP      | δ2s < 0.03/√log(s)                       | [127]
SP        | δ3s < 0.06                               | [127]

C. RESEARCH CHALLENGES
As can be observed, several papers have addressed CS applications, the design of the measurement matrix, and sparse recovery algorithms. However, there are still many research challenges to overcome.
Each application domain has its own characteristics, and these should be used to improve the estimation of the sparse signal. For example, in [50] the authors optimize the sensing matrix by a proper specialization of a specific sparsity matrix, taking advantage of the statistical features of the input signal.
As said before, the two important challenges in compressive sensing are the design of the measurement matrix and the development of an efficient recovery algorithm.
Concerning the first challenge, while random measurement matrices have been widely studied, only a few deterministic measurement matrices have been considered [2]. However, in structures that allow fast implementation with reduced storage requirements, deterministic measurement matrices are highly desirable [2]. Therefore, this domain can be improved.
Regarding the second challenge, a lot of CS approaches assume that the signal's sparsity is known. However, in several applications, such as cognitive radio networks, this is not true. Thus, it is necessary to develop sparse recovery algorithms that do not need this information and that are

able to be adaptive to time changes. Another alternative is to develop a sparsity order estimation method, so that the sparsity order can be accurately estimated before using a recovery algorithm [191].
Moreover, in some cases, the signal's sparsity can be time-varying. Hence, the investigation of adaptive sparsity order estimation methods to capture the dynamicity of the signal of interest constitutes an important research challenge [72].
Another opportunity is the development of sparse recovery algorithms on a distributed platform such as a sensor network, as is done in [192].
Furthermore, sparse recovery algorithms can be combined with deep learning to improve sparse signal recovery. Some different deep-learning approaches to solve sparse linear inverse problems have already been reported in the literature [193]–[197]. However, some improvements in their performances can be suggested: for example, varying the optimization algorithm used in the neural network, the loss function, or the activation function of some of these neural networks, or even proposing a new neural network for sparse signal estimation in order to produce faster and more accurate results.

VI. CONCLUSION
Compressive sensing and its sparse recovery algorithms are used in several areas and have been extensively studied in this work. With the growing demand for cheaper, faster, and more efficient devices, the usefulness of compressive sensing theory is progressively greater and more important. This paper has provided a review of this theory. We have presented the mathematical and theoretical foundations of the key concepts. More specifically, we have focused on the sparse recovery algorithms, illustrating numerous algorithms. Some comparisons between them were also discussed. Furthermore, several applications of compressive sensing have been presented, such as image and video, compressive data transmission, communication systems, and detection and recognition systems. The importance of choosing an efficient sparse recovery algorithm to increase the performance of the sparse signal estimation was also highlighted. As shown in the previous sections of this paper, the compressive sensing theory can provide useful and promising techniques in the future. Indeed, this theme is under significant and wide development in several applications. However, it still faces a number of open research challenges, for example, determining a suitable measurement matrix and developing a sparse recovery algorithm that does not need to know the signal's sparsity and can be adaptive to time-varying sparsity. Moreover, signal statistical information can be added in the CS acquisition or CS reconstruction to reduce the amount of required resources (time, hardware, energy, etc.).

REFERENCES
[1] M. Amin, Compressive Sensing for Urban Radar, 1st ed. Boca Raton, FL, USA: CRC Press, Inc., 2017.
[2] M. Abo-Zahhad, A. Hussein, and A. Mohamed, "Compressive sensing algorithms for signal processing applications: A survey," International Journal of Communications, Network and System Sciences, vol. 8, pp. 197–216, 2015.
[3] C. E. Shannon, "Communication in the presence of noise," Proceedings of the IRE, vol. 37, no. 1, pp. 10–21, Jan 1949.
[4] S. K. Sharma, E. Lagunas, S. Chatzinotas, and B. Ottersten, "Application of compressive sensing in cognitive radio communications: A survey," IEEE Communications Surveys Tutorials, vol. 18, no. 3, pp. 1838–1860, 2016.
[5] S. Qaisar, R. M. Bilal, W. Iqbal, M. Naureen, and S. Lee, "Compressive sensing: From theory to applications, a survey," Journal of Communications and Networks, vol. 15, no. 5, pp. 443–456, Oct 2013.
[6] S. Sharma, A. Gupta, and V. Bhatia, "A new sparse signal-matched measurement matrix for compressive sensing in UWB communication," IEEE Access, vol. 4, pp. 5327–5342, 2016.
[7] S. Mallat, A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way, 3rd ed. Academic Press, 2008.
[8] Z. Qin, J. Fan, Y. Liu, Y. Gao, and G. Y. Li, "Sparse representation for wireless communications: A compressive sensing approach," IEEE Signal Processing Magazine, vol. 35, no. 3, pp. 40–58, May 2018.
[9] E. J. Candes, J. Romberg, and T. Tao, "Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, Feb 2006.
[10] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, April 2006.
[11] G. Pope, "Compressive sensing: a summary of reconstruction algorithms," Master's thesis, Eidgenossische Technische Hochschule, Swiss, 2009.
[12] S. Budhiraja, "A survey of compressive sensing based greedy pursuit reconstruction algorithms," International Journal of Image, Graphics and Signal Processing (IJIGSP), vol. 7, no. 10, 2015.
[13] T. Akhila and R. Divya, "A survey on greedy reconstruction algorithms in compressive sensing," International Journal of Research in Computer and Communication Technology, vol. 5, no. 3, pp. 126–129, 2016.
[14] J. A. Tropp, "Greed is good: algorithmic results for sparse approximation," IEEE Transactions on Information Theory, vol. 50, no. 10, pp. 2231–2242, Oct 2004.
[15] K. V. Siddamal, S. P. Bhat, and V. S. Saroja, "A survey on compressive sensing," in 2015 2nd International Conference on Electronics and Communication Systems (ICECS), Feb 2015, pp. 639–643.
[16] T. Nguyen and Y. Shin, "Deterministic sensing matrices in compressive sensing: A survey," The Scientific World Journal, 2013.
[17] A. Gilbert and P. Indyk, "Sparse recovery using sparse matrices," Proceedings of the IEEE, vol. 98, no. 6, pp. 937–947, June 2010.
[18] F. Salahdine, N. Kaabouch, and H. E. Ghazi, "A survey on compressive sensing techniques for cognitive radio network," Physical Commun. J., Elsevier, 2016.
[19] J. W. Choi, B. Shim, Y. Ding, B. Rao, and D. I. Kim, "Compressed sensing for wireless communications: Useful tips and tricks," IEEE Communications Surveys Tutorials, vol. 19, no. 3, pp. 1527–1550, 2017.
[20] U. P. Shukla, N. B. Patel, and A. M. Joshi, "A survey on recent advances in speech compressive sensing," in 2013 International Multi-Conference on Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), March 2013, pp. 276–280.
[21] G. Wunder, H. Boche, T. Strohmer, and P. Jung, "Sparse signal processing concepts for efficient 5G system design," IEEE Access, vol. 3, pp. 195–208, 2015.
[22] Y. Zhang, L. Y. Zhang, J. Zhou, L. Liu, F. Chen, and X. He, "A review of compressive sensing in information security field," IEEE Access, vol. 4, pp. 2507–2519, 2016.
[23] Q. Shen, W. Liu, W. Cui, and S. Wu, "Underdetermined DOA estimation under the compressive sensing framework: A review," IEEE Access, vol. 4, pp. 8865–8878, 2016.
[24] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, "A survey of sparse representation: Algorithms and applications," IEEE Access, vol. 3, pp. 490–530, 2015.
[25] M. Rani, S. B. Dhok, and R. B. Deshmukh, "A systematic review of compressive sensing: Concepts, implementations and applications," IEEE Access, vol. 6, pp. 4875–4894, 2018.
[26] S. W. Hu, G. X. Lin, S. H. Hsieh, and C. S. Lu, "Performance analysis of joint-sparse recovery from multiple measurement vectors via convex optimization: Which prior information is better?" IEEE Access, vol. 6, pp. 3739–3754, 2018.

18 VOLUME , 2018

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2018.2886471, IEEE Access

E. C. Marques et al.: A Review of Sparse Recovery Algorithms

[27] H. Palangi, R. Ward, and L. Deng, “Distributed compressive sensing: Transactions on Biomedical Circuits and Systems, vol. 10, no. 1, pp. 149–
A deep learning approach,” IEEE Transactions on Signal Processing, 162, Feb 2016.
vol. 64, no. 17, pp. 4504–4518, Sept 2016. [51] S. Vasanawala, M. Murphy, M. Alley, P. Lai, K. Keutzer, J. Pauly,
[28] J. Ziniel and P. Schniter, “Efficient high-dimensional inference in the and M. Lustig, “Practical parallel imaging compressed sensing MRI:
multiple measurement vector problem,” IEEE Transactions on Signal Summary of two years of experience in accelerating body MRI of pe-
Processing, vol. 61, no. 2, pp. 340–354, Jan 2013. diatric patients,” in 2011 IEEE International Symposium on Biomedical
[29] J. Wen, J. Tang, and F. Zhu, “Greedy block coordinate descent under Imaging: From Nano to Macro, March 2011, pp. 1039–1043.
restricted isometry property,” Mobile Networks and Applications, vol. 22, [52] D. Craven, B. McGinley, L. Kilmartin, M. Glavin, and E. Jones, “Adap-
no. 3, pp. 371–376, Jun 2017. tive dictionary reconstruction for compressed sensing of ECG signals,”
[30] E. J. Candes and M. B. Wakin, “An introduction to compressive sam- IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 3, pp.
pling,” IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21–30, 645–654, May 2017.
March 2008. [53] H. Djelouat, X. Zhai, M. A. Disi, A. Amira, and F. Bensaali, “System-
[31] Y. Arjoune, N. Kaabouch, H. E. Ghazi, and A. Tamtaoui, “Compressive on-chip solution for patients biometric: A compressive sensing-based
sensing: Performance comparison of sparse recovery algorithms,” in approach,” IEEE Sensors Journal, pp. 1–1, 2018.
2017 IEEE 7th Annual Computing and Communication Workshop and [54] S. Pudlewski, A. Prasanna, and T. Melodia, “Compressed-sensing-
Conference (CCWC), Jan 2017, pp. 1–7. enabled video streaming for wireless multimedia sensor networks,” IEEE
[32] E. Candes and J. Romberg, “Sparsity and incoherence in compressive Transactions on Mobile Computing, vol. 11, no. 6, pp. 1060–1072, Jun.
sampling,” Inverse Problems, vol. 23, no. 3, pp. 969–985, 2007. 2012.
[33] J. Wen, D. Li, and F. Zhu, “Stable recovery of sparse signals via lp - [55] B. Srinivasarao, V. C. Gogineni, S. Mula, and I. Chakrabarti, “A novel
minimization,” Applied and Computational Harmonic Analysis, vol. 38, framework for compressed sensing based scalable video coding,” Signal
no. 1, pp. 161 – 176, 2015. Processing: Image Communication, vol. 57, pp. 183–196, 2017.
[34] R. G. Baraniuk, “Compressive sensing [lecture notes],” IEEE Signal [56] C. Li, H. Jiang, P. Wilford, Y. Zhang, and M. Scheutzow, “A new com-
Processing Magazine, vol. 24, no. 4, pp. 118–121, July 2007. pressive video sensing framework for mobile broadcast,” IEEE Transac-
[35] E. J. Candes and T. Tao, “Near-optimal signal recovery from random tions on Broadcasting, vol. 59, no. 1, pp. 197–205, March 2013.
projections: Universal encoding strategies?” IEEE Transactions on Infor- [57] T. Goldstein, L. Xu, K. F. Kelly, and R. Baraniuk, “The STOne transform:
mation Theory, vol. 52, no. 12, pp. 5406–5425, Dec 2006. Multi-resolution image enhancement and compressive video,” IEEE
[36] Z. Chen and J. J. Dongarra, “Condition numbers of gaussian random Transactions on Image Processing, vol. 24, no. 12, pp. 5581–5593, Dec
matrices,” SIAM Journal on Matrix Analysis and Applications, vol. 27, 2015.
no. 3, pp. 603–620, Jul. 2005. [58] R. G. Baraniuk, T. Goldstein, A. C. Sankaranarayanan, C. Studer,
A. Veeraraghavan, and M. B. Wakin, “Compressive video sensing:
[37] W. U. Bajwa, J. D. Haupt, G. M. Raz, S. J. Wright, and R. D. Nowak,
Algorithms, architectures, and applications,” IEEE Signal Processing
“Toeplitz-structured compressed sensing matrices,” in 2007 IEEE/SP
Magazine, vol. 34, no. 1, pp. 52–66, Jan 2017.
14th Workshop on Statistical Signal Processing, Aug 2007, pp. 294–298.
[59] M. Herman and T. Strohmer, “Compressed sensing radar,” in 2008 IEEE
[38] A. Amini, V. Montazerhodjat, and F. Marvasti, “Matrices with small
International Conference on Acoustics, Speech and Signal Processing,
coherence using p-ary block codes,” IEEE Transactions on Signal Pro-
March 2008, pp. 1509–1512.
cessing, vol. 60, no. 1, pp. 172–181, Jan 2012.
[60] M. S. Kang, S. J. Lee, S. H. Lee, and K. T. Kim, “ISAR imaging of high-
[39] R. Calderbank, S. Howard, and S. Jafarpour, “Construction of a large
speed maneuvering target using gapped stepped-frequency waveform and
class of deterministic sensing matrices that satisfy a statistical isometry
compressive sensing,” IEEE Transactions on Image Processing, vol. 26,
property,” IEEE Journal of Selected Topics in Signal Processing, vol. 4,
no. 10, pp. 5043–5056, Oct 2017.
no. 2, pp. 358–374, April 2010.
[61] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless
[40] S. Li, F. Gao, G. Ge, and S. Zhang, “Deterministic construction of sensor networks: A survey,” Compuer Networks, vol. 38, pp. 393–422,
compressed sensing matrices via algebraic curves,” IEEE Transactions 2002.
on Information Theory, vol. 58, no. 8, pp. 5035–5041, Aug 2012.
[62] M. A. Razzaque, C. Bleakley, and S. Dobson, “Compression in wireless
[41] S. Haykin, Adaptive filter theory, 4th ed. Upper Saddle River, NJ: sensor networks: A survey and comparative evaluation,” ACM Transac-
Prentice Hall, 2002. tions on Sensor Networks, vol. 10, no. 5, 2013.
[42] E. J. Candes and T. Tao, “Decoding by linear programming,” IEEE [63] C. Karakus, A. C. Gurbuz, and B. Tavli, “Analysis of energy efficiency of
Transactions on Information Theory, vol. 51, no. 12, pp. 4203–4215, Dec compressive sensing in wireless sensor networks,” IEEE Sensors Journal,
2005. vol. 13, no. 5, pp. 1999–2008, May 2013.
[43] M. Elad and A. M. Bruckstein, “A generalized uncertainty principle and [64] M. Hooshmand, M. Rossi, D. Zordan, and M. Zorzi, “Covariogram-based
sparse representation in pairs of bases,” IEEE Transactions on Informa- compressive sensing for environmental wireless sensor networks,” IEEE
tion Theory, vol. 48, no. 9, pp. 2558–2567, Sep 2002. Sensors Journal, vol. 16, no. 6, pp. 1716–1729, March 2016.
[44] D. L. Donoho and Y. Tsaig, “Fast solution of l1 -norm minimization [65] Z. Li, H. Huang, and S. Misra, “Compressed sensing via dictionary
problems when the solution may be sparse,” IEEE Transactions on learning and approximate message passing for multimedia internet of
Information Theory, vol. 54, no. 11, pp. 4789–4812, Nov 2008. things,” IEEE Internet of Things Journal, vol. 4, no. 2, pp. 505–512, April
[45] D. L. Donoho and J. Tanner, “Counting faces of randomly projected 2017.
polytopes when the projection radically lowers dimension,” Journal of [66] M. Mangia, F. Pareschi, R. Rovatti, and G. Setti, “Low-cost security of
the American Mathematical Society, vol. 22, no. 1, pp. 1–53, 2009. IoT sensor nodes with rakeness-based compressed sensing: Statistical and
[46] G. Satat, M. Tancik, and R. Raskar, “Lensless imaging with compressive known-plaintext attacks,” IEEE Transactions on Information Forensics
ultrafast sensing,” IEEE Transactions on Computational Imaging, vol. 3, and Security, vol. PP, no. 99, pp. 1–1, 2017.
no. 3, pp. 398–407, Sept 2017. [67] Y. Gargouri, H. Petit, P. Loumeau, B. Cecconi, and P. Desgreys, “Com-
[47] U. V. Dias and M. E. Rane, “Block based compressive sensed thermal pressed sensing for astrophysical signals,” in 2016 IEEE International
image reconstruction using greedy algorithms,” International Journal of Conference on Electronics, Circuits and Systems (ICECS), Dec 2016,
Image, Graphics and Signal Processing, vol. 6, no. 10, pp. 36–42, 2014. pp. 313–316.
[48] M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. [68] J. Bobin, J. L. Starck, and R. Ottensamer, “Compressed sensing in
Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sam- astronomy,” IEEE Journal of Selected Topics in Signal Processing, vol. 2,
pling,” IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 83–91, no. 5, pp. 718–726, Oct 2008.
March 2008. [69] J. Yang, “A machine learning paradigm based on sparse signal represen-
[49] A. Saucedo, S. Lefkimmiatis, N. Rangwala, and K. Sung, “Improved tation,” Ph.D. dissertation, University of Wollongong, 2013.
computational efficiency of locally low rank MRI reconstruction using [70] J. Lu, N. Verma, and N. K. Jha, “Compressed signal processing on
iterative random patch adjustments,” IEEE Transactions on Medical nyquist-sampled signals,” IEEE Transactions on Computers, vol. 65,
Imaging, vol. 36, no. 6, pp. 1209–1220, June 2017. no. 11, pp. 3293–3303, Nov 2016.
[50] F. Pareschi, P. Albertini, G. Frattini, M. Mangia, R. Rovatti, and G. Setti, [71] B. Sun, H. Feng, K. Chen, and X. Zhu, “A deep learning framework
“Hardware-algorithms co-design and implementation of an analog-to- of quantized compressed sensing for wireless neural recording,” IEEE
information converter for biosignals based on compressed sensing,” IEEE Access, vol. 4, pp. 5169–5178, 2016.

VOLUME , 2018 19

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2018.2886471, IEEE Access

E. C. Marques et al.: A Review of Sparse Recovery Algorithms

[72] H. Sun, A. Nallanathan, C. X. Wang, and Y. Chen, “Wideband spectrum converter using random demodulation,” in 2007 IEEE International Sym-
sensing for cognitive radio networks: a survey,” IEEE Wireless Commu- posium on Circuits and Systems, May 2007, pp. 1959–1962.
nications, vol. 20, no. 2, pp. 74–81, April 2013. [94] T. Ragheb, J. N. Laska, H. Nejati, S. Kirolos, R. G. Baraniuk, and
[73] A. Ali and W. Hamouda, “Advances on spectrum sensing for cognitive ra- Y. Massoud, “A prototype hardware for random demodulation based
dio networks: Theory and applications,” IEEE Communications Surveys compressive analog-to-digital conversion,” in 2008 51st Midwest Sym-
Tutorials, vol. 19, no. 2, pp. 1277–1304, 2017. posium on Circuits and Systems, Aug 2008, pp. 37–40.
[74] E. Panayirci, H. Senol, M. Uysal, and H. V. Poor, “Sparse channel [95] Y. Chen, M. Mishali, Y. C. Eldar, and A. O. Hero, “Modulated wideband
estimation and equalization for OFDM-based underwater cooperative converter with non-ideal lowpass filters,” in 2010 IEEE International
systems with amplify-and-forward relaying,” IEEE Transactions on Sig- Conference on Acoustics, Speech and Signal Processing, March 2010,
nal Processing, vol. 64, no. 1, pp. 214–228, Jan 2016. pp. 3630–3633.
[75] C. Li, K. Song, and L. Yang, “Low computational complexity design over [96] D. E. Bellasi, L. Bettini, C. Benkeser, T. Burger, Q. Huang, and C. Studer,
sparse channel estimator in underwater acoustic OFDM communication “VLSI design of a monolithic compressive-sensing wideband analog-to-
system,” IET Communications, vol. 11, no. 7, pp. 1143–1151, 2017. information converter,” IEEE Journal on Emerging and Selected Topics
[76] J. Ying, J. Zhong, M. Zhao, and Y. Cai, “Turbo equalization based on in Circuits and Systems, vol. 3, no. 4, pp. 552–565, Dec 2013.
compressive sensing channel estimation in wideband HF systems,” in [97] Y. Gargouri, H. Petit, P. Loumeau, B. Cecconi, and P. Desgreys, “Analog-
2013 International Conference on Wireless Communications and Signal to-information converter design for low-power acquisition of astrophysi-
Processing, Oct 2013, pp. 1–5. cal signals,” in 2017 15th IEEE International New Circuits and Systems
[77] E. C. Marques, N. Maciel, L. A. B. Naviner, H. Cai, and J. Yang, Conference (NEWCAS), June 2017, pp. 113–116.
“Compressed sensing for wideband HF channel estimation,” International [98] M. Trakimas, R. D’Angelo, S. Aeron, T. Hancock, and S. Sonkusale, “A
Conference on Frontiers of Signal Processing, pp. 1–5, 2018. compressed sensing analog-to-information converter with edge-triggered
[78] W. F. Schreiber, “Advanced television systems for terrestrial broadcast- sar adc core,” IEEE Transactions on Circuits and Systems I: Regular
ing: Some problems and some proposed solutions,” Proceedings of the Papers, vol. 60, no. 5, pp. 1135–1148, May 2013.
IEEE, vol. 83, no. 6, pp. 958–981, Jun 1995. [99] J. F. Gemmeke, H. V. Hamme, B. Cranen, and L. Boves, “Compressive
[79] Z. Fan, Z. Lu, and Y. Han, “Accurate channel estimation based on sensing for missing data imputation in noise robust speech recognition,”
bayesian compressive sensing for next-generation wireless broadcasting IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 2, pp.
systems,” in 2014 IEEE International Symposium on Broadband Multi- 272–287, April 2010.