RLS Algorithm With Convex Regularization
Abstract—In this letter, the RLS adaptive algorithm is considered in the system identification setting. The RLS algorithm is regularized using a general convex function of the system impulse response estimate. The normal equations corresponding to the convex regularized cost function are derived, and a recursive algorithm for the update of the tap estimates is established. We also introduce a closed-form expression for selecting the regularization parameter. With this selection of the regularization parameter, we show that the convex regularized RLS algorithm performs as well as, and possibly better than, the regular RLS when there is a constraint on the value of the convex function evaluated at the true weight vector. Simulations demonstrate the superiority of the convex regularized RLS with automatic parameter selection over regular RLS for the sparse system identification setting.

Index Terms—Adaptive filter, convex regularization, ℓ1 norm, ℓ0 norm, RLS, sparsity.
I. INTRODUCTION

THE last decade has seen a flurry of activity in the regularization of otherwise ill-posed inverse problems by a convex, most of the time sparsity based, prior. The sparsity prior utilizes the knowledge that the object to be recovered is sparse in a certain, known representation. The replacement of the nonconvex ℓ0 pseudo-norm, which counts sparsity, with the convex ℓ1 norm has led to new data acquisition paradigms introduced under Compressive Sensing [1], and it has found numerous applications including sparse channel estimation [2].

These advances in sparse signal representation have also impacted sparse adaptive system identification. In [3], the authors propose to modify the LMS cost function by the addition of a convex approximation for the ℓ0 norm penalty. The resulting sparsity enhancing LMS variant is called the ℓ0-LMS. The authors of [4] propose to regularize the LMS cost function by adding an ℓ1 norm term or a log-sum term. They have recently considered the regularization of the LMS algorithm by a general convex function [5]. ℓ1-norm regularized recursive least squares (RLS) adaptive algorithms have also been suggested in the literature. The SPARLS algorithm [6] presents an expectation-maximization (EM) approach for sparse system identification. The authors of [7] propose the application of an online coordinate descent algorithm together with the least-squares cost function penalized by an ℓ1-norm term. Another RLS algorithm for sparse system identification is proposed in [8], where the RLS cost function is regularized by adding a weighted ℓ1 norm of the current system estimate. Adaptive sparse system identification has also recently been successfully extended to nonlinear systems [9], [10].

In this letter we consider regularization of the RLS cost function in a manner akin to the approach outlined in [8]. However, here the regularizing term is defined as a general convex function of the system estimate, rather than specifically as the weighted ℓ1 norm. This generalization allows the utilization of any convex function for regularization, which permits one to exploit a much more general class of prior knowledge about the system to be identified, rather than being limited only to sparsity. We develop the update algorithm for the convex regularized RLS using results from subgradient calculus. Additionally, we develop conditions for the proper selection of the regularization parameter. We prove that if the regularization parameter is selected accordingly, the convex regularized RLS algorithm performs as well as, if not better than, the regular RLS algorithm in terms of the mean square deviation (MSD) of the tap estimates. We consider the ℓ1 norm and a smoothed ℓ0 norm as examples of regularizing convex functions. Simulations demonstrate that the resulting ℓ1-RLS and ℓ0-RLS algorithms outperform the regular RLS in the sparse system identification setting.

II. CONVEX REGULARIZED RLS ALGORITHM

We first review the adaptive input-output system identification setting:

d(n) = w^T x(n) + v(n).    (1)

Here w = [w_0, w_1, ..., w_{N-1}]^T is the impulse response of the FIR system to be identified, and x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T is the input vector, where x(n) is the input signal. d(n) is the desired output signal, and v(n) denotes the observation noise at time n. The estimate for the system tap vector at time n is given by w(n) = [w_0(n), w_1(n), ..., w_{N-1}(n)]^T. The regular RLS cost function with exponential forgetting factor λ is defined as

J(n) = \sum_{i=1}^{n} \lambda^{n-i} \xi^2(i).    (2)

Here, ξ(i) is the instantaneous error between the desired output and the estimated system output:

\xi(i) = d(i) - w^T(n) x(i).    (3)

We modify the RLS cost function by the addition of a convex function of the instantaneous system estimate. This convex penalty function can be chosen to reflect any prior knowledge about the true system, including but not limited to sparsity:

J_f(n) = \sum_{i=1}^{n} \lambda^{n-i} \xi^2(i) + \gamma(n) f(w(n)).    (4)

Here f(·) is a general convex function, and γ(n) is the possibly time-varying regularization parameter which governs the compromise between the effect of the regularizing convex function term and the estimation error.
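As a concrete illustration of the data model in (1), the following short NumPy sketch generates input-output data from a sparse FIR system. It is only an illustrative setup: the filter length, sparsity level, noise level, and variable names are assumed here and are not taken from the letter.

import numpy as np

rng = np.random.default_rng(0)
N, S, T = 64, 8, 1000             # assumed filter length, number of nonzero taps, data length
sigma_v = 0.01                     # assumed observation-noise standard deviation

# Sparse impulse response w with S nonzero taps at random positions
w = np.zeros(N)
w[rng.choice(N, size=S, replace=False)] = rng.standard_normal(S)

x = rng.standard_normal(T)         # white input signal x(n)
d = np.empty(T)                    # desired output d(n)
for n in range(T):
    xn = np.zeros(N)               # input vector x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T
    m = min(n + 1, N)
    xn[:m] = x[n::-1][:m]
    d[n] = w @ xn + sigma_v * rng.standard_normal()   # d(n) = w^T x(n) + v(n), cf. (1)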
We wish to find the optimal system tap vector w(n) which minimizes the regularized cost function J_f(n). For convex but nondifferentiable functions, subgradient analysis offers a substitute for the gradient when finding this minimum [11]. At any point where the convex function fails to be differentiable, there exist possibly many valid subgradient vectors. All the subgradients together are called the subdifferential of f and are designated by ∂f. We denote a subgradient vector of f at w by ∇^s f(w) ∈ ∂f(w). A valid subgradient vector of J_f(n) with respect to w(n) can be written as follows, by using the fact that the least-squares term is differentiable everywhere:

\nabla^s J_f(n) = -2 \sum_{i=1}^{n} \lambda^{n-i} x(i) [d(i) - x^T(i) w(n)] + \gamma(n) \nabla^s f(w(n)).    (5)

One theorem from subdifferential calculus states that a point minimizes a convex function if and only if 0 belongs to the subdifferential at that point, that is, if 0 is a subgradient of the function there [11]. Hence, to find the optimal w(n) which minimizes J_f(n), we set the subgradient of J_f(n) as given in (5) equal to 0. After evaluating the gradient of the least-squares term and setting the subgradient equal to 0, the relation for the k-th term reads as follows:

\sum_{i=1}^{n} \lambda^{n-i} x_k(i) [d(i) - x^T(i) w(n)] = \frac{\gamma(n)}{2} [\nabla^s f(w(n))]_k.    (6)

The relations for all k = 0, 1, ..., N-1 can be written together in matrix form as a set of modified normal equations:

\Phi(n) w(n) = r(n) - \frac{\gamma(n)}{2} \nabla^s f(w(n))    (7)

where Φ(n) is the deterministic autocorrelation matrix estimate for the input signal and r(n) is the corresponding deterministic cross-correlation vector estimate:

\Phi(n) = \sum_{i=1}^{n} \lambda^{n-i} x(i) x^T(i)    (8)

r(n) = \sum_{i=1}^{n} \lambda^{n-i} d(i) x(i).    (9)

These quantities obey the rank-one recursions

\Phi(n) = \lambda \Phi(n-1) + x(n) x^T(n)    (10)

r(n) = \lambda r(n-1) + d(n) x(n).    (11)

Applying the matrix inversion lemma to (10), the inverse autocorrelation matrix P(n) = Φ^{-1}(n) can be updated recursively as

P(n) = \lambda^{-1} [P(n-1) - k(n) x^T(n) P(n-1)].    (12)
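The recursions (10)-(12) are the standard RLS bookkeeping for the correlation estimates. The following minimal NumPy sketch shows one such rank-one update; the vector k computed inside corresponds to the gain vector k(n) defined in the next paragraph, lam denotes λ, and the function and variable names are illustrative, not the letter's.

import numpy as np

def correlation_updates(Phi, r, P, xn, dn, lam):
    """One rank-one update of the RLS statistics, following (10)-(12)."""
    Phi_new = lam * Phi + np.outer(xn, xn)        # (10): autocorrelation matrix estimate
    r_new = lam * r + dn * xn                     # (11): cross-correlation vector estimate
    k = (P @ xn) / (lam + xn @ P @ xn)            # gain vector k(n)
    P_new = (P - np.outer(k, xn @ P)) / lam       # (12): matrix inversion lemma update of P(n)
    return Phi_new, r_new, P_new, k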
Here k(n) is the gain vector, defined as

k(n) = \frac{P(n-1) x(n)}{\lambda + x^T(n) P(n-1) x(n)}.

Equation (7) can be rewritten as follows:

w(n) = P(n) r(n) - \frac{\gamma(n)}{2} P(n) \nabla^s f(w(n)).    (13)

After evaluating (13) using the recursions (11) and (12), and approximating the subgradient term with its value at the previous estimate w(n-1), we come up with the following update equation for w(n):

w(n) = w(n-1) + \xi(n) k(n) - \frac{\gamma(n)}{2} P(n) \nabla^s f(w(n-1))    (14)

where ξ(n) = d(n) - x^T(n) w(n-1) is the a priori estimation error. Let us recall the update equation for the standard a priori RLS algorithm:

w(n) = w(n-1) + \xi(n) k(n).    (15)

Equation (14) differs from the standard RLS update equation (15) by the inclusion of the rightmost term. Equation (14) summarizes an adaptive algorithm which calculates (approximately) the solution to the convex regularized normal equations given in (7). We entitle this adaptive RLS-based algorithm the "Convex Regularized-RLS" (CR-RLS). The CR-RLS algorithm is summarized in Algorithm 1.

Algorithm 1: CR-RLS
inputs: λ, γ(n), initial values w(0) and P(0)
1: for n = 1, 2, ... do                                      (time recursion)
2:   k(n) = P(n-1) x(n) / (λ + x^T(n) P(n-1) x(n))           (gain vector)
3:   ξ(n) = d(n) - x^T(n) w(n-1)                             (a priori error)
4:   P(n) = λ^{-1} [P(n-1) - k(n) x^T(n) P(n-1)]
5:   w(n) = w(n-1) + ξ(n) k(n) - (γ(n)/2) P(n) ∇^s f(w(n-1))
6: end for                                                   (end of recursion)
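As a concrete realization of Algorithm 1, the following NumPy sketch implements one CR-RLS time recursion as reconstructed above. The argument subgrad_f stands for a user-supplied subgradient routine ∇^s f(·) for the chosen convex penalty; the γ(n)/2 factor follows the normal-equation convention of (6) and (7), and the names and initialization values in the usage comment are illustrative assumptions, not values from the letter.

import numpy as np

def cr_rls_step(w, P, xn, dn, lam, gamma, subgrad_f):
    """One CR-RLS iteration (Algorithm 1). subgrad_f(w) returns a subgradient of f at w."""
    k = (P @ xn) / (lam + xn @ P @ xn)                    # step 2: gain vector k(n)
    xi = dn - xn @ w                                       # step 3: a priori error xi(n)
    P = (P - np.outer(k, xn @ P)) / lam                    # step 4: update of P(n)
    w = w + xi * k - 0.5 * gamma * (P @ subgrad_f(w))      # step 5: regularized tap update, old w = w(n-1)
    return w, P

# Illustrative usage with the l1 penalty f(w) = ||w||_1:
#   w_hat, P = np.zeros(N), 100.0 * np.eye(N)
#   w_hat, P = cr_rls_step(w_hat, P, xn, d[n], lam=0.999, gamma=1e-3, subgrad_f=np.sign)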
III. SELECTION OF THE REGULARIZATION PARAMETER

Comparing the deviation of the CR-RLS estimate w(n) from the true system w with the deviation of the regular RLS estimate results in a relation between the mean square deviations of the two algorithms, given in (18). Equation (18) leads to the following theorem.

Theorem 1: The MSD of the CR-RLS algorithm is no larger than the MSD of the regular RLS algorithm if 0 ≤ γ(n) ≤ γ_max(n), where γ_max(n) is the bound given in (19).

Proof: From (18) it is obvious that the MSD of CR-RLS does not exceed that of the regular RLS as long as the additional term introduced by the regularization is nonpositive. This condition can be rewritten as the inequality (20). We only allow γ(n) ≥ 0; hence, when the bound in (19) vanishes, the above inequality holds only for γ(n) = 0 and becomes an equality. Otherwise, for the inequality to hold, γ(n) can be any value between 0 and γ_max(n) as given in (19).

Theorem 1 states that the CR-RLS algorithm provides an MSD as low as, and possibly lower than, that of the regular RLS algorithm, if γ(n) is chosen using (19). However, it is not possible to evaluate γ_max(n) in (19) directly, because it refers to the estimation deviation of the regular RLS and hence to the true system w. We therefore seek a calculable approximation to γ_max(n) by replacing this deviation with a computable counterpart; carried out for the white input case, this development results in the closed-form expression (26) for the automatic selection of γ(n).

IV. SIMULATION RESULTS

We will employ two sparsity inducing convex penalty functions in the CR-RLS algorithm and analyze their performance in sparse system identification. The true measure of sparsity is the ℓ0 pseudo-norm, which is known to be a nonconvex function. One obvious convex relaxation option for this sparsity measure is the ℓ1 norm. For this choice f(w) = ||w||_1, where a corresponding subgradient is calculated as ∇^s f(w) = sgn(w) [4], [8]. Here sgn(·) is the component-wise sign function. The CR-RLS algorithm resulting from this choice of f is equivalent to the ℓ1-RLS algorithm as outlined in [8].

Another choice for convexly relaxing the ℓ0 pseudo-norm is the approximation as follows [3]:

f(w) = \sum_{i=0}^{N-1} \left( 1 - e^{-\beta |w_i|} \right)    (27)

where β is an appropriate constant. A subgradient for (27) is approximately calculated as [3]

[\nabla^s f(w)]_i \approx \begin{cases} \beta \,\mathrm{sgn}(w_i) - \beta^2 w_i, & |w_i| \le 1/\beta \\ 0, & \text{elsewhere.} \end{cases}    (28)
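To make the two penalty choices concrete, the following small NumPy sketch gives the corresponding subgradient routines, which could be plugged into the cr_rls_step sketch given after Algorithm 1. The ℓ0 routine follows the piecewise approximation (28); the default value of beta is illustrative, not a value taken from the letter.

import numpy as np

def subgrad_l1(w):
    """Subgradient of f(w) = ||w||_1: the component-wise sign of w."""
    return np.sign(w)

def subgrad_l0_approx(w, beta=5.0):
    """Approximate subgradient of the smoothed l0 penalty (27), following (28)."""
    g = beta * np.sign(w) - beta**2 * w        # first-order approximation
    g[np.abs(w) > 1.0 / beta] = 0.0            # zero outside |w_i| <= 1/beta
    return g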
Fig. 1. Performance of the different algorithms.

In the first experiment, the CR-RLS algorithms with automatic parameter selection converge to almost the same MSD values as the CR-RLS algorithms with an ad hoc, optimally selected constant γ. Hence, we can state that (26) presents a viable systematic method for automatically selecting γ(n) in the white input case, rather than resorting to improvisation of a parameter value for each simulation setting.

As a second experiment we consider the effect of the sparsity on the algorithm performance. We simulate the algorithms for several sparsity levels, where the largest level corresponds to a completely non-sparse system; the respective parameters for SPARLS are kept fixed. The steady-state MSD values at the end of 1000 iterations for the algorithms are given in Table I.

TABLE I. STEADY-STATE MSD FOR DIFFERENT SPARSITY VALUES.

Table I shows that the RLS performance does not vary with sparsity. The performance of the other algorithms deteriorates with decreasing sparsity, and for the completely non-sparse system all MSD values become equivalent. The CR-RLS algorithms have better performance than RLS when sparsity is present, and they gracefully converge to the RLS algorithm with decreasing sparsity. We also ran simulations where γ is chosen nonideally as a fixed constant; the resulting steady-state MSD values for ℓ1-RLS and ℓ0-RLS are noticeably worse. These results suggest that a rough selection of γ leads to a deterioration of performance for the CR-RLS algorithms, and it can be stated that for very large γ values the CR-RLS algorithms approach the regular RLS.

Fig. 2. Performance of auto ℓ1-RLS and RLS for varying sparsity.

In Fig. 2, we plot the MSD curves of auto ℓ1-RLS and RLS with varying sparsity. The curves affirm that auto ℓ1-RLS performs better when sparsity is present and converges to the regular RLS when sparsity vanishes. There is no need to tweak any parameters for the automatic CR-RLS algorithms depending on the simulation scenario.
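The steady-state MSD figures of the kind reported in Table I can be computed along the following lines; this is only a sketch of the metric under an assumed averaging window, not the letter's exact measurement procedure.

import numpy as np

def steady_state_msd_db(w_true, w_history, last=100):
    """Steady-state MSD in dB: ||w(n) - w||^2 averaged over the final iterations."""
    dev = np.sum((np.asarray(w_history)[-last:] - w_true) ** 2, axis=1)
    return 10.0 * np.log10(np.mean(dev))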
V. CONCLUSION

In this letter, we introduced a convex regularized RLS approach for adaptive system identification, to be used when there is a priori information about the true system formalized in the form of a convex function. We developed the update steps for the resulting algorithm by employing subgradient analysis on the convex regularized cost function. We also presented a closed-form expression for the automatic selection of the regularization parameter in the case of white input. Simulation results suggest that the automatic parameter selection works almost as well as optimizing a constant regularization parameter manually. Simulations also show that the introduced ℓ1-RLS and ℓ0-RLS algorithms with automatic parameter selection perform better than RLS in the sparse setting, and that they gracefully converge to the regular RLS algorithm when sparsity vanishes.

REFERENCES

[1] E. J. Candes and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.
[2] C. R. Berger, Z. Wang, J. Huang, and S. Zhou, "Application of compressive sensing to sparse channel estimation," IEEE Commun. Mag., vol. 48, no. 11, pp. 164–174, Nov. 2010.
[3] Y. Gu, J. Jin, and S. Mei, "ℓ0 norm constraint LMS algorithm for sparse system identification," IEEE Signal Process. Lett., vol. 16, no. 9, pp. 774–777, Sep. 2009.
[4] Y. Chen, Y. Gu, and A. O. Hero, "Sparse LMS for system identification," in Proc. ICASSP, Apr. 2009, pp. 3125–3128.
[5] Y. Chen, Y. Gu, and A. O. Hero, "Regularized least-mean-square algorithms," ArXiv e-prints, Dec. 2010 [Online]. Available: http://arxiv.org/abs/1012.5066v2
[6] B. Babadi, N. Kalouptsidis, and V. Tarokh, "SPARLS: The sparse RLS algorithm," IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4013–4025, Aug. 2010.
[7] D. Angelosante, J. A. Bazerque, and G. B. Giannakis, "Online adaptive estimation of sparse signals: Where RLS meets the ℓ1-norm," IEEE Trans. Signal Process., vol. 58, no. 7, pp. 3436–3447, Jul. 2010.
[8] E. M. Eksioglu, "Sparsity regularized RLS adaptive filtering," IET Signal Process., Feb. 2011, to be published.
[9] N. Kalouptsidis, G. Mileounis, B. Babadi, and V. Tarokh, "Adaptive algorithms for sparse system identification," Signal Process., vol. 91, no. 8, pp. 1910–1919, Aug. 2011.
[10] G. Mileounis, B. Babadi, N. Kalouptsidis, and V. Tarokh, "An adaptive greedy algorithm with application to nonlinear communications," IEEE Trans. Signal Process., vol. 58, no. 6, pp. 2998–3007, Jun. 2010.
[11] D. Bertsekas, A. Nedic, and A. Ozdaglar, Convex Analysis and Optimization. Belmont, MA: Athena Scientific, 2003.