RLS Algorithm With Convex Regularization

This document discusses a modification to the recursive least squares (RLS) adaptive filtering algorithm by adding a regularization term in the form of a general convex function of the system impulse response estimate. The normal equations for the convex regularized cost function are derived. A recursive algorithm is developed to update the tap estimates. A closed-form expression for selecting the regularization parameter automatically is also introduced.

Uploaded by

Mourad Benziane

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

34 views4 pages

RLS Algorithm With Convex Regularization

Uploaded by

Mourad Benziane

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 4

470 IEEE SIGNAL PROCESSING LETTERS, VOL. 18, NO.

8, AUGUST 2011

RLS Algorithm With Convex Regularization

Ender M. Eksioglu, Member, IEEE, and A. Korhan Tanc, Student Member, IEEE

Abstract—In this letter, the RLS adaptive algorithm is consid- Adaptive sparse system identification has been recently suc-
ered in the system identification setting. The RLS algorithm is cessfully extended to nonlinear systems [9], [10].
regularized using a general convex function of the system impulse In this letter we consider regularization of the RLS cost func-
response estimate. The normal equations corresponding to the
convex regularized cost function are derived, and a recursive al-
tion in a manner alike to the approach as outlined in [8]. How-
gorithm for the update of the tap estimates is established. We also ever, here the regularizing term is defined as a general convex
introduce a closed-form expression for selecting the regularization function of the system estimate, rather than being defined specif-
parameter. With this selection of the regularization parameter, ically as the weighted norm. This generalization allows uti-
we show that the convex regularized RLS algorithm performs as lization of any convex function for regularization, which per-
well as, and possibly better than, the regular RLS when there is mits one to exploit a much more general class of prior knowl-
a constraint on the value of the convex function evaluated at the
true weight vector. Simulations demonstrate the superiority of the
edge about the system to be identified, rather than being limited
convex regularized RLS with automatic parameter selection over only to sparsity. We develop the update algorithm for the convex
regular RLS for the sparse system identification setting. regularized RLS using results from subgradient calculus. Addi-
Index Terms—Adaptive filter, convex regularization, l1 norm, l0
tionally, we develop conditions on the proper selection of the
norm, RLS, sparsity. regularization parameter. We prove that if the regularization pa-
rameter is selected accordingly, the convex regularized RLS al-
gorithm performs as well as, if not better than, the regular RLS
I. INTRODUCTION algorithm in terms of the mean square deviation (MSD) of the
tap estimates. We consider norm and smoothed norm as ex-
T HE last decade has seen a flurry of activities in regular-
ization of an otherwise ill-posed inverse problem by a
convex, most of the time sparsity based prior. The sparsity prior
amples for regularizing convex functions. Simulations demon-
strate that the resulting -RLS and -RLS algorithms outper-
form the regular RLS in the sparse system identification setting.
utilizes the knowledge that the object to be recovered is sparse
in a certain, known representation. The replacement of the non-
II. CONVEX REGULARIZED RLS ALGORITHM
convex pseudo-norm as a count for sparsity with the convex
norm has led to new data acquisition paradigms introduced We first review the adaptive input-output system identifica-
under Compressive Sensing [1], and it has found numerous ap- tion setting:
plications including sparse channel estimation [2].
These advances in sparse signal representation have also (1)
impacted sparse adaptive system identification. In [3], the is the im-
authors propose to modify the LMS cost function by addi- pulse response for the FIR system to be identified.
tion of a convex approximation for the norm penalty. The is the input vector
resulting sparsity enhancing LMS variant is called as the where is the input signal. is the desired output
-LMS. The authors of [4] propose to regularize the LMS signal, and denotes the observation noise at time . The
cost function by adding an norm term or a log-sum term. estimate for the system tap vector at time is given by
They have recently considered the regularization of the LMS . The regular RLS
algorithm by a general convex function [5]. -norm regular- cost function with exponential forgetting factor is defined as
ized recursive least squares (RLS) adaptive algorithms have
also been suggested in the literature. The SPARLS algorithm
(2)
[6] presents an expectation-maximization (EM) approach for
sparse system identification. The authors of [7] propose the
application of an online coordinate descent algorithm together Here, is the instantaneous error between the desired output
with the least-squares cost function penalized by an -norm and estimated system output:
term. Another RLS algorithm for sparse system identification
(3)
is proposed in [8], where the RLS cost function is regularized
by adding a weighted norm of the current system estimate. We modify the RLS cost function by the addition of a convex
function of the instantaneous system estimate. This convex
Manuscript received April 14, 2011; revised May 31, 2011; accepted June 02, penalty function can be chosen to reflect any prior knowledge
2011. Date of publication June 13, 2011; date of current version June 23, 2011. about the true system, including but not limited to sparsity:
The associate editor coordinating the review of this manuscript and approving
it for publication was Dr. Yuriy V. Zakharov.
The authors are with the Department of Electronics and Communications
(4)
Engineering, Istanbul Technical University, Istanbul, Turkey (e-mail: ek-
[email protected]; [email protected]). is a general convex function. is the pos-
Digital Object Identifier 10.1109/LSP.2011.2159373 sibly time-varying regularization parameter which governs the

1070-9908/$26.00 © 2011 IEEE

EKSIOGLU AND TANC: RLS ALGORITHM WITH CONVEX REGULARIZATION 471

compromise between the effect of the regularizing convex func- is the gain vector defined as
tion term and the estimation error. We wish to find the optimal . Equation (7) can be rewritten as follows.
system tap vector which minimizes the regularized cost
function . For convex and nondifferentiable functions subgra- (13)
dient analysis offers a substitute for the gradient when finding
this minimum [11]. At any point where the convex function After evaluating (13) using the recursions (11) and (12), we
fails to be differentiable, there exist possibly many valid sub- come up with the following update equation for :
gradient vectors. All the subgradients together are called as the
subdifferential of and is designated by . We denote a (14)
subgradient vector of at with . A valid
where is the a priori estimation error. Let
subgradient vector of with respect to can be written as
us remember the update equation for the standard a priori RLS
follows, by using the fact that is differentiable everywhere:
algorithm:
(5)
(15)
One theorem from the subdifferential calculus states that a point Equation (14) differs from the standard RLS algorithm update
minimizes a convex function if and only if equation (15) with the inclusion of the rightmost term. Equa-
, that is if is a subgradient of at [11]. Hence, to find tion (14) summarizes an adaptive algorithm which calculates
the optimal which minimizes we set the subgradient of (approximately) the solution to the convex regularized normal
as given in (5) equal to . After evaluating the gradient equation as given in (7). We entitle this adaptive RLS based
and setting the subgradient equal to , the relation for the algorithm as the “Convex Regularized-RLS” (CR-RLS). The
th term reads as follows: CR-RLS algorithm is summarized in Algorithm 1.

Algorithm 1 Convex Regularized-RLS (CR-RLS) algorithm

(6) inputs
The relations for all can be written together 1: for do time recursion
in a matrix form as a set of modified normal equations: 2: gain vector
3: a priori error
(7) 4:
where is the deterministic autocorrelation matrix 5:
estimate for the input signal : 6: end for end of recursion

(8) III. SELECTION OF THE REGULARIZATION PARAMETER

The cost function in (4) includes the penalty term
and is the deterministic cross-correlation estimate to put to use some a priori knowledge about the true system.
vector between and : The convex function formalizes this a priori information. We
assume that this a priori information is in the form of a constraint
(9) on the true system parameters given as follows:

and both have rank-one update equations associated with (16)

them. For the right hand side of (7) a new variable can be
defined: where denotes an upper bound constant. is the solution to
the normal equation (7). is the solution to the nonregular-
(10) ized normal equation given as or .
We denote the deviation of the system estimates from the true
The update equation (9) and the definition (10) together lead to system parameters as and . From
an update equation for . Assuming that and (7) it follows that
do not change considerably over a single time step, this update
equation can be approximately written as (17)

(11) The instantaneous square deviation for is calculated as fol-

lows:
We define the inverse of the autocorrelation matrix by
. Using the matrix inversion lemma and (8), there is a well-
known update equation for :

(12) (18)
472 IEEE SIGNAL PROCESSING LETTERS, VOL. 18, NO. 8, AUGUST 2011

where . Equation (18) leads to the following the- IV. SIMULATION RESULTS
orem. We will employ two sparsity inducing convex penalty func-
Theorem 1: if , where tions in the CR-RLS algorithm and analyze their performances
in sparse system identification. The true measure of sparsity is
(19) the pseudo-norm, which is known to be a nonconvex func-
tion. One obvious convex relaxation option for the sparsity
Proof: From (18) it is obvious that as long as measure is the norm. For this choice
. This condition , where a corresponding subgradient is calculated
can be rewritten as as [4], [8]. Here is the compo-
nent-wise sign function. The CR-RLS algorithm resulting from
(20) this choice of is equivalent to the -RLS algorithm as out-
lined in [8].
We only allow , hence when
Another choice for convexly relaxing is the approximation
the above inequality holds only for and becomes an
as follows [3]:
equality. If , for the inequality to hold
can be any value between 0 and as given in (19).
Theorem 1 states that the CR-RLS algorithm provides an (27)
MSD as low as, and possibly lower than, that of the regular RLS
algorithm, if is chosen using (19). However, it is not possible where is an appropriate constant. A subgradient for (27) is
to evaluate in (19), because it refers to and hence to . approximately calculated as [3]
Now we will try to find a calculable approximation to by re-
placing . can be rewritten as (28)

This cost function with the corresponding subgradient has been

(21) utilized in the LMS context, and the resulting algorithm has been
called as the -LMS [3]. Fittingly, we entitle the novel algo-
where . At this stage of (19) becomes
rithm which results from utilizing the cost function (27) in the
CR-RLS approach as the -RLS algorithm.
(22) In the experiments the true system function has a total of
taps, where only of them are nonzero. The nonzero
There are two terms in the right hand side nominator of (22). coefficients are positioned randomly and take their values from
The term is calculable. employs , the a distribution. The input signal is ,
calculation of which would only require additional oper- and measurement noise is , where is chosen
ations per time step in Algorithm 1. The second term is to fulfill the desired SNR. The CR-RLS algorithms are realized
in two different modes, first with constant and secondly
(23) with automatic selection using (26). For constant case, is
From the definition of the subgradient for a convex function found as the optimum value which results in minimum steady-
[11] and using (16) the following holds. state MSD using repeated simulations. For the automatic case
, where is calculated at each time instant
(24) via (26). The value is taken to be the true value of , that
is for -RLS and for -RLS . We also
Assuming the input is white and is large enough, the following implement the regular RLS, the SPARLS of [6]1 and an oracle
inequality can be deduced using (24): RLS algorithm. For SPARLS the algorithm parameters are fine-
tuned as to result in minimum steady-state MSD. The oracle
(25) RLS is the regular RLS algorithm where the positions of the
true nonzero system parameters are known. For all algorithms
Here, denotes the matrix trace operator. With (25), the , and each simulation setting is averaged over
expression in (22) modifies into 2000 independent realizations. For SPARLS and for
-RLS .
In the first experiment we realize the algorithms for
and dB. For -RLS the optimum , for
(26)
-RLS the optimum and for SPARLS .
We plot the variation of the MSD versus iteration number in
Equation (26) presents a calculable approximation for in
Fig. 1. The oracle RLS has the best performance as expected.
the case of white input. The instantaneous regularization param-
On the other hand, CR-RLS algorithms present considerable im-
eter can be automatically updated by as
provement over the regular RLS. The -RLS has better perfor-
suggested by Theorem 1, where is calculated using (26). The
mance than -RLS and SPARLS, and it is not very far off from
operational complexity of Algorithm 1 with automatic up-
the oracle. The CR-RLS variants with automatic regularization
date via (26) will be per iteration, just like the regular
RLS. 1The authors of [6] did generously share their code for simulations.
EKSIOGLU AND TANC: RLS ALGORITHM WITH CONVEX REGULARIZATION 473

Fig. 1. Performance of different algorithms for and dB. Fig. 2. Performance of auto -RLS and RLS under dB and for
varying .
TABLE I
STEADY-STATE MSD FOR DIFFERENT SPARSITY VALUES V. CONCLUSION
In this letter, we introduced a convex regularized RLS ap-
proach for adaptive system identification, when there is a priori
information about the true system formalized in the form of a
convex function. We develop the update steps for the resulting
algorithm by employing subgradient analysis on the convex reg-
ularized cost function. We also present a closed-form expression
for the automatic selection of the regularization parameter in
the case of white input. Simulation results suggest that the au-
tomatic parameter selection works almost as well as optimizing
parameter selection converge to almost the same MSD values
a constant regularization parameter manually. Simulations also
as the CR-RLS algorithms with the ad hoc, optimally selected
show that the introduced -RLS and -RLS algorithms with
. Hence, we can state that (26) presents a viable systematic
automatic parameter selection show better performance than
method for automatically selecting in the white input case,
RLS in the sparse setting, and that they gracefully converge to
rather than resorting to improvisation of a parameter value for
the regular RLS algorithm when sparsity vanishes.
each simulation setting.
As a second experiment we consider the effect of the sparsity
REFERENCES
on the algorithm performance. We simulate the algorithms with
dB and for and , where [1] E. J. Candes and M. B. Wakin, “An introduction to compressive sam-
pling,” IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar.
corresponds to a completely non-sparse system. The respec- 2008.
tive parameters for SPARLS are . The [2] C. R. Berger, Z. Wang, J. Huang, and S. Zhou, “Application of com-
steady-state MSD values at the end of 1000 iterations for the pressive sensing to sparse channel estimation,” IEEE Commun. Mag.,
vol. 48, no. 11, pp. 164–174, Nov. 2010.
algorithms are given in Table I. Table I shows that RLS perfor- [3] Y. Gu, J. Jin, and S. Mei, “ norm constraint LMS algorithm for sparse
mance does not vary with sparsity. Performance of the other al- system identification,” IEEE Signal Process. Lett., vol. 16, no. 9, pp.
gorithms deteriorate with decreasing sparsity, where for 774–777, Sep. 2009.
[4] Y. Chen, Y. Gu, and A. O. Hero, “Sparse LMS for system identifica-
all MSD values become equivalent. The CR-RLS algorithms tion,” in Proc. ICASSP, Apr. 19–24, 2009, pp. 3125–3128.
have better performance than RLS when sparsity is present, [5] Y. Chen, Y. Gu, and A. O. Hero, “Regularized least-mean-square al-
and they gracefully converge to the RLS algorithm with de- gorithms,” ArXiv e-prints Dec. 2010 [Online]. Available: http://arxiv.
creasing sparsity. We also did simulations for where org/abs/1012.5066v2
[6] B. Babadi, N. Kalouptsidis, and V. Tarokh, “SPARLS: The sparse RLS
is chosen nonideally as . The steady-state MSD algorithm,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4013–4025,
for -RLS comes out as , and for -RLS it comes Aug. 2010.
out as . These results suggest that rough selection [7] D. Angelosante, J. A. Bazerque, and G. B. Giannakis, “Online adaptive
estimation of sparse signals: Where RLS meets the -norm,” IEEE
of leads to a detoriation of performance for the CR-RLS al- Trans. Signal Process., vol. 58, no. 7, pp. 3436–3447, Jul. 2010.
gorithms, and it can be stated that for very large values the [8] E. M. Eksioglu, “Sparsity regularized RLS adaptive filtering,” IET
CR-RLS algorithms approach the regular RLS. Signal Process., Feb. 2011, to be published.
[9] N. Kalouptsidis, G. Mileounis, B. Babadi, and V. Tarokh, “Adaptive
In Fig. 2, we plot the MSD curves of auto -RLS and RLS algorithms for sparse system identification,” Signal Process., vol. 91,
with varying for dB. The curves affirm that auto no. 8, pp. 1910–1919, Aug. 2011.
-RLS performs better when sparsity is present and converges [10] G. Mileounis, B. Babadi, N. Kalouptsidis, and V. Tarokh, “An adap-
to the regular RLS when sparsity vanishes. There is no need for tive greedy algorithm with application to nonlinear communications,”
IEEE Trans. Signal Process., vol. 58, no. 6, pp. 2998–3007, Jun. 2010.
tweaking any parameters for the automatic CR-RLS algorithms [11] D. Bertsekas, A. Nedic, and A. Ozdaglar, Convex Analysis and Opti-
depending on the simulation scenario. mization. Cambridge, MA: Athena Scientific, 2003.