A Robust and Regularized Extreme Learning Machine
Abstract—At a moment when the study of outlier robustness within the Extreme Learning Machine is still in its infancy, we propose a method that combines maximization of the hidden layer's information transmission, through Batch Intrinsic Plasticity (BIP), with robust estimation of the output weights. The method, named R-ELM/BIP, produces a reliable solution in the presence of corrupted data, with good generalization capability and small weight norms. Computer experiments were carried out on three regression problems using the traditional ELM, ELM with BIP, ELM with Iteratively Reweighted Least Squares as the estimation method (ROB-ELM) and our proposal (R-ELM/BIP).

I. INTRODUCTION

Machine learning problems are often contaminated by noise, which reflects inaccuracies in the observations and the stochastic nature of the underlying process [1]. This contamination may generate outliers, which can be defined intuitively as data points inconsistent with the remainder of the data set [2]. Khamis et al. (2005) [3] and Steege et al. (2012) [4] show that outliers affect both the modeling accuracy and the estimated parameters. Therefore, when fitting a model to the data, those outliers need to be identified and eliminated or, alternatively, examined closely, as they may be of main interest themselves [1].

Robust neural networks have been a subject of interest for many years and in many different applications. Liu (1993) [5] shows that the conventional back-propagation algorithm for neural network regression is robust to leverage points (corrupted input data x), but not to outliers (corrupted output data y). Larsen et al. (1998) [6] proposed a neural network optimized with the maximum a posteriori technique and a modified likelihood function that incorporates the potential risk of outliers in the data. Lee et al. (2009) [7] proposed a Welsch M-estimator radial basis function network with pruning and growing techniques for noisy time series prediction. Łobos et al. (2000) [8] present on-line techniques for robust estimation of the parameters of harmonic signals based on total least-squares criteria, which can be implemented by analogue adaptive circuits. Feng et al. (2010) [9] propose a neural network quantile regression algorithm based on a Majorization-Minimization optimization scheme and apply it to an empirical analysis of credit card portfolio data. Aladag et al. (2014) [10] propose a median neuron model multilayer feedforward (MNM-MFF) model, trained with a modified particle swarm optimization metaheuristic, to deal with forecasting performance problems caused by outliers.

In recent years, a multilayer feedforward neural network with random hidden weight values, commonly known as the Extreme Learning Machine (ELM) [11], has become very popular due to its generalization performance and much faster learning speed [12]. The random choice of the input-to-hidden-layer weights (input weights, for short) leaves only the hidden-to-output-layer weights (output weights) to be determined analytically. However, even with the considerable attention that ELM techniques have recently received in the computational intelligence and machine learning communities, the study of the effects of outliers on ELM is only in its infancy [13].

Two aspects influence the robustness properties of an ELM network: computational robustness, related to numerical stability, and outlier robustness. The first has been generally ignored, since most efforts emphasize the accuracy of the solutions [14]. These computational problems occur when the hidden layer output matrix H is ill-conditioned, as a consequence of the random selection of the input weights and biases. The linear system used to train the output weights then yields a solution that is sensitive to data perturbations and becomes a poor estimate of the true mapping [14]. Besides, it is known that the size of the output layer weights is more relevant for the generalization capability than the configuration of the neural network in terms of number of neurons and type of activation function [15], [16]. Works such as [15], [17], [18] and [19] explore this issue.

The second aspect, outlier robustness, has been explored in recent years in a few proposals that use estimation methods known to be less sensitive to outliers than Ordinary Least Squares (OLS). Huynh and Won (2008) [2] replace the Singular Value Decomposition method with Weighted Least Squares, which is similar to OLS but assigns penalties to the training patterns in order to weight their contribution to the final solution. Barros and Barreto (2013) [20] concentrate their efforts on robust classification problems, proposing an ELM that uses Iteratively Reweighted Least Squares (IRLS), named ROB-ELM. Finally, Horata et al. (2013) [13] address both aspects by applying three estimation methods, IRLS, the Multivariate Least-Trimmed Squares (MLTS) estimator and the One-Step Reweighted MLTS (RMLTS), modified by the Extended Complete Orthogonal Decomposition (ECOD), which tackles the computational problem.

That being said, it is not possible to overlook the importance of addressing both aspects. We therefore propose a method that adds both regularization and outlier robustness to ELM learning: the hidden layer output is optimized by altering the parameters of the activation functions with the recently proposed Batch Intrinsic Plasticity (BIP) method [21], and the output weights are then estimated with the robust IRLS method. BIP shapes the output distributions of the hidden neurons into exponential distributions through the adaptation of the slope and bias of each neuron's activation function. By forcing the hidden neuron activations into an exponential distribution, it maximizes the network's information transmission, owing to the high entropy of that distribution, and also leads to good generalization properties [21]. Our proposal has been tested on regression problems with artificial and real-world datasets, with promising results.
II. ALGORITHMS

A. Extreme Learning Machine

The ELM is a two-layer network with a random and fixed weight matrix in the hidden layer. Its input-to-hidden-layer weight matrix is represented by β_hdd ∈ ℝ^{p×q}, where p is the number of input attributes and q the number of hidden neurons [12]. The output of the hidden layer is given by

    h_n = φ(β_hdd^T u_n + b_n),    (1)

where h_n ∈ ℝ^q, u_n ∈ ℝ^p is the current input vector, b_n ∈ ℝ^q contains the biases and φ is a logistic activation function.

During the first step of the training phase, all inputs from the training sequence {(u_n, d_n)}, n = 1, ..., N, are presented to the network and the corresponding network states (h_n, d_n) are harvested in the matrices H and D, respectively, where d_n is the desired output.

Since the network output is given by Eq. (2), for the second and last part of training we compute the output weights β ∈ ℝ^{q×m}, which connect the hidden layer to the output neurons, by solving a linear regression problem:

    Y = β^T H.    (2)

The ordinary least squares (OLS) solution of the resulting linear system is given by the Moore-Penrose generalized inverse as follows:

    β = (H H^T)^{-1} H D^T.    (3)

In several real-world problems the matrix H H^T can be singular, undermining the use of Eq. (3). In fact, a nearly singular (yet invertible) H H^T is also a problem, because it can lead to numerically unstable results. Huang et al. [11] addressed this issue by using the Singular Value Decomposition (SVD) to compute the Moore-Penrose pseudo-inverse instead of (H H^T)^{-1} H [13]. That method supports both full and deficient column rank matrices [13], which gives the ELM some computational robustness. Unfortunately, it is also computationally expensive when dealing with large datasets and may produce an unreliable solution when the training data is corrupted by outliers.
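To make the two training steps above concrete, the following minimal MATLAB sketch implements Eqs. (1)-(3) for a generic problem. The variable names (U, D, q, B_hdd) and the logistic activation are our illustrative choices, so this should be read as a sketch of the standard ELM procedure rather than the authors' exact code.

    % Minimal ELM training sketch (assumed variable names; illustration only).
    % U: p x N matrix of training inputs, D: m x N matrix of desired outputs.
    q = 100;                                 % number of hidden neurons
    [p, N] = size(U);
    B_hdd = rand(p, q) * 2 - 1;              % random input weights in [-1, 1]
    b     = rand(q, 1) * 2 - 1;              % random biases
    phi   = @(z) 1 ./ (1 + exp(-z));         % logistic activation, Eq. (1)
    H = phi(B_hdd' * U + repmat(b, 1, N));   % q x N hidden-layer states
    B = (H * H') \ (H * D');                 % OLS output weights, Eq. (3)
    Y = B' * H;                              % network outputs, Eq. (2)
    % pinv(H') * D' would give the SVD-based pseudo-inverse solution mentioned above.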
B. Introduction to M-Estimation

An important characteristic of OLS is that it assigns the same importance to all error samples, i.e., all errors contribute in the same way to the final solution [20]. To handle this issue we could, in principle, remove the outliers from the training data, although this is not always feasible. A second approach, known as robust regression, uses estimation methods that are not as sensitive to outliers as OLS.

Huber [22] introduced the concept of M-estimation, where M stands for "maximum likelihood". Here, robustness is achieved by minimizing a function other than the sum of the squared errors [20]. Based on Huber's theory, a general M-estimator applied to the i-th output neuron minimizes the following objective function:

    J(β_i) = Σ_{n=1}^{N} ρ(d_in − β_i^T h_n),    (4)

where the function ρ(·) gives the contribution of each error e_in = d_in − y_in to the objective function, and d_in is the desired output of the i-th output neuron for the n-th linear system input sample h_n. OLS is a particular case of M-estimator, characterized by ρ(e_in) = e_in².

The function ρ should possess the following properties [23]:

Property 1: ρ(e_in) ≥ 0.
Property 2: ρ(0) = 0.
Property 3: ρ(e_in) = ρ(−e_in).
Property 4: ρ(e_in) ≥ ρ(e_i'n), for |e_in| > |e_i'n|.

Let ψ be the derivative of ρ. Differentiating J with respect to the estimated weight vector β_i and setting the result to zero, we have

    Σ_{n=1}^{N} ψ(d_in − β̂_i^T h_n) h_n^T = 0,    (5)

where 0 ∈ ℝ^q is a row vector of zeros. Then, defining the weight function w(e_in) = ψ(e_in)/e_in and letting w_in = w(e_in), the estimating equations are given by

    Σ_{n=1}^{N} w_in (d_in − β̂_i^T h_n) h_n^T = 0.    (6)

Thus, solving the estimating equations corresponds to solving a weighted least-squares problem, minimizing

    Σ_n w_in² e_in² = Σ_n w²(e_in) e_in².    (7)

It should be highlighted that the weights depend on the residuals (i.e., the estimated errors), the residuals depend upon the estimated coefficients, and the estimated coefficients depend upon the weights [20]. As a consequence, there is no closed-form equation for the estimation of β_i. An alternative is the iterative estimation method named iteratively reweighted least squares (IRLS) [23], which is often used and is explained below.
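As a concrete illustration of the weight function w(e) = ψ(e)/e, the MATLAB fragment below evaluates it for the classical Huber ρ from [22]; the tuning constant k and the function handles are our own illustrative names, and the paper itself adopts the bisquare function defined later in Eq. (9).

    % Huber M-estimator: rho, its derivative psi, and the IRLS weight w(e) = psi(e)/e.
    k   = 1.345;                                        % common Huber tuning constant
    rho = @(e) (abs(e) <= k) .* (0.5 * e.^2) + ...
               (abs(e) >  k) .* (k * abs(e) - 0.5 * k^2);
    psi = @(e) max(min(e, k), -k);                      % derivative of rho
    w   = @(e) psi(e) ./ e;                             % weight tends to 1 as e -> 0
    e   = [-4 -2 -1 -0.1 0.1 1 2 4];
    disp([e; w(e)]);                                    % small errors weigh ~1, large errors less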
C. Iteratively Reweighted Least Squares

As described in [20], [23]:

Step 1 - Provide an initial estimate β̂_i(0) using the OLS solution in Eq. (3).

Step 2 - At each iteration t, compute the residuals from the previous iteration, e_in(t−1), n = 1, ..., N, associated with the i-th output neuron, and then compute the corresponding weights w_in(t−1) = w[e_in(t−1)].

Step 3 - Solve for the new weighted least-squares estimate of β_i(t):

    β̂_i(t) = [H W(t−1) H^T]^{-1} H W(t−1) D_i^T,    (8)

where W(t−1) = diag{w_in(t−1)} is an N×N weight matrix and D_i is the desired output matrix for the i-th output neuron. Repeat Steps 2 and 3 until convergence of the estimated coefficient vector β̂_i(t).

Several weighting functions for the M-estimators can be chosen; in this work, we adopted the bisquare weighting function:

    w(e_in) = [1 − (e_in/κ)²]²,  if |e_in| ≤ κ,
              0,                 otherwise,    (9)

where the parameter κ is a tuning constant. Smaller values of κ lead to more resistance to outliers, but at the expense of lower efficiency when the errors are normally distributed [20]. In particular, κ = 4.685σ for the bisquare function, where σ is a robust estimate of the standard deviation of the errors. A common approach is to take σ = MAR/0.6745, where MAR is the median absolute residual.
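The steps above can be written compactly as in the MATLAB sketch below, which reuses H and D from the earlier ELM fragment; the iteration cap, the convergence tolerance and the single-output assumption are our own illustrative choices, not prescriptions from the paper.

    % IRLS with the bisquare weight function of Eq. (9), for one output neuron.
    d    = D(1, :);                          % desired outputs of the i-th output neuron (1 x N)
    beta = (H * H') \ (H * d');              % Step 1: OLS initialization, Eq. (3)
    for t = 1:50                             % iterate Steps 2 and 3
        e     = d - beta' * H;               % residuals e_in(t-1)
        sigma = max(median(abs(e)) / 0.6745, eps);   % robust scale estimate (MAR-based)
        kappa = 4.685 * sigma;
        w     = (1 - (e / kappa).^2).^2 .* (abs(e) <= kappa);   % bisquare weights, Eq. (9)
        W     = diag(w);                     % N x N weight matrix
        beta_new = (H * W * H') \ (H * W * d');                 % Eq. (8)
        if norm(beta_new - beta) < 1e-6, beta = beta_new; break; end
        beta = beta_new;
    end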
D. Batch Intrinsic Plasticity

BIP is an unsupervised learning rule, based on a biologically plausible mechanism, that adapts the bias (b_j) and slope (a_j) of the hidden neurons' activation functions, tuning them into more suitable regimes, maximizing information transmission and acting as a feature regularizer [21]. This is accomplished by forcing the activation of the j-th hidden neuron, equipped with a logistic-type activation function (a hyperbolic tangent in this case, see Eq. (10)), into a desired exponential distribution f_des:

    h_jn = (1 − exp(−a_j x_jn − b_j)) / (1 + exp(−a_j x_jn − b_j)).    (10)

For each hidden neuron, all the incoming synaptic sums x_j = β_hdd,j^T U are collected, where U = (u(1), ..., u(N)). Then random targets t_fdes = (t_1, ..., t_N)^T are drawn from the desired exponential output distribution, and both the targets and the collected stimuli x_j are sorted in ascending order. The model Φ(x_j) = (x_j^T, (1, ..., 1)^T) is built so that we can calculate

    (a_j, b_j)^T = (Φ(x_j)^T Φ(x_j) + λI)^{-1} Φ(x_j)^T f^{-1}(t_fdes),    (11)

where f^{-1} is the inverse of the activation function, λ > 0 is the regularization parameter and I ∈ ℝ^{2×2} is an identity matrix.
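The following MATLAB sketch shows one way to implement Eqs. (10)-(11) for a single hidden neuron j, reusing B_hdd and U from the earlier fragment. The exponential mean mu, the regularization value lambda and the clipping of the random targets away from the activation bounds are practical assumptions we added, not values specified in the paper.

    % BIP adaptation of (a_j, b_j) for hidden neuron j (illustrative sketch).
    x_j    = (B_hdd(:, j)' * U)';            % N x 1 vector of synaptic sums for neuron j
    mu     = 0.2;  lambda = 1e-3;            % desired exponential mean and ridge parameter (assumed)
    t      = -mu * log(rand(size(x_j)));     % random targets from the exponential distribution f_des
    t      = min(t, 1 - 1e-3);               % keep targets inside the activation range (practical guard)
    x_j    = sort(x_j);  t = sort(t);        % sort stimuli and targets in ascending order
    Phi    = [x_j, ones(size(x_j))];         % model Phi(x_j) = (x_j, 1)
    finv   = @(h) -log((1 - h) ./ (1 + h));  % inverse of the activation in Eq. (10)
    ab     = (Phi' * Phi + lambda * eye(2)) \ (Phi' * finv(t));   % Eq. (11): ab = (a_j; b_j)
    a_j = ab(1);  b_j = ab(2);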
E. Robust ELM with Batch Intrinsic Plasticity

In sum, the basic idea of the proposed approach is very simple: we combine the regularizing effect and the learning optimization property of BIP with the outlier robustness of the M-estimation framework and the IRLS algorithm to create an optimized ELM network. We will refer to this approach as Robust ELM/BIP (R-ELM/BIP, for short). The steps for its implementation follow below, with a sketch of the resulting workflow after the list.

Step 1 - Randomly initialize β_hdd and collect all stimuli X = (x_1, ..., x_q) with all the training input data;

Step 2 - Calculate (a_j, b_j) for all hidden neurons as described in Section II-D;

Step 3 - Re-introduce the training input data to the network and collect the network states H;

Step 4 - Finally, find β̂_i for all output neurons according to Section II-C.
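Putting the previous fragments together, Steps 1-4 can be organized as in the MATLAB outline below; bip_tune and irls_bisquare are hypothetical wrappers around the code shown in Sections II-C and II-D, so this is a sketch of the workflow rather than the authors' implementation.

    % R-ELM/BIP training outline (hypothetical helper functions).
    [p, N] = size(U);  q = 100;
    B_hdd  = rand(p, q) * 2 - 1;                 % Step 1: random input weights
    X      = B_hdd' * U;                         % stimuli x_j collected row-wise (q x N)
    for j = 1:q
        [a(j), b(j)] = bip_tune(X(j, :)');       % Step 2: BIP slope and bias (Section II-D)
    end
    Z = diag(a) * X + repmat(b', 1, N);          % a_j * x_jn + b_j
    H = (1 - exp(-Z)) ./ (1 + exp(-Z));          % Step 3: hidden states with Eq. (10)
    for i = 1:size(D, 1)
        B(:, i) = irls_bisquare(H, D(i, :));     % Step 4: robust output weights (Section II-C)
    end
    Y = B' * H;                                  % network predictions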
III. RESULTS

The experiments were carried out with three regression problems: SinC, Abalone and Boston Housing. The first is an artificial dataset composed of 2000 samples with 1 input and 1 output, generated from Eq. (12):

    y_i = sin(π x_i)/(π x_i),  i = 1, ..., N,  with −π ≤ x_i ≤ π.    (12)

The Abalone and Boston Housing Corrected (Boston, for short) datasets, taken respectively from the UCI (http://archive.ics.uci.edu/ml/index.html) and StatLib repositories, are real-world problems. The Abalone dataset offers 4177 samples with 7 inputs and 1 output, while Boston has 506 samples with 18 inputs and 1 output. For the ELM's training and testing, the sets were divided as follows: SinC (1000/1000), Abalone (2000/2177) and Boston (379/127). Besides, the attributes of all sets were scaled to [0, 1] and their target values to [-1, 1].

Following Horata's evaluation methodology [13], the outlier robustness properties of the methods presented in Section II are investigated by randomly contaminating the training data targets with one-sided or two-sided outliers. To apply those outliers, a subset K ⊂ {1, ..., N} of row indexes of D indicates which samples will be contaminated, and Δ_k ∈ ℝ^m, ∀k ∈ K, is a row vector of normally distributed errors.

Let d_k be a row of D and d̃_k be the corresponding contaminated row. For one-sided outliers,

    d̃_k = d_k + |Δ_k|,    (13)

and for two-sided outliers,

    d̃_k = d_k + Δ_k.    (14)

Each problem gives rise to sub-problems depending on the type of outlier and its contamination rate. The training targets are corrupted by one-sided or two-sided outliers, and the percentage of contaminated samples may be 10%, 20%, 30% or 40% of the total number of training samples. Hence, this setup results in 8 sub-problems for each dataset, as shown in Tab. I and II.
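A minimal MATLAB sketch of this contamination protocol is given below; the outlier standard deviation sigma_out and the use of randperm to pick the subset K are our assumptions, since the paper does not report the spread of the normally distributed errors.

    % Contaminate a fraction of the training targets with outliers (Eqs. (13)-(14)).
    rate      = 0.30;                          % contamination rate: 0.10, 0.20, 0.30 or 0.40
    sigma_out = 1.0;                           % spread of the Gaussian perturbation (assumed)
    N         = size(D, 2);
    K         = randperm(N, round(rate * N));  % indexes of the samples to contaminate
    Delta     = sigma_out * randn(size(D, 1), numel(K));
    D_one     = D;  D_one(:, K) = D(:, K) + abs(Delta);   % one-sided outliers, Eq. (13)
    D_two     = D;  D_two(:, K) = D(:, K) + Delta;        % two-sided outliers, Eq. (14)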
TABLE I. Comparison of training's mean RMSE and standard deviation of ELM, ELM/BIP, ROB-ELM and R-ELM/BIP with the artificial dataset (SinC) and real regression problems (Abalone and Boston Housing).
Outlier contamination rate (%):        10                       20                       30                       40
ELM 0.31728 ± 0.01759 0.42294 ± 0.022938 0.50394 ± 0.017375 0.54911 ± 0.016443
SinC ELM/BIP 0.30301 ± 0.018275 0.41215 ± 0.023355 0.4943 ± 0.017155 0.53996 ± 0.016893
(1 sided) R-ELM 0.32699 ± 0.018297 0.44896 ± 0.024869 0.55008 ± 0.019694 0.59641 ± 0.020741
R-ELM/BIP 0.31575 ± 0.019083 0.44451 ± 0.025284 0.55402 ± 0.018957 0.63107 ± 0.019579
ELM 0.32718 ± 0.021844 0.44715 ± 0.01762 0.55631 ± 0.023449 0.6355 ± 0.02309
SinC ELM/BIP 0.3141 ± 0.022108 0.43723 ± 0.018417 0.54661 ± 0.023217 0.62714 ± 0.023176
(2 sided) R-ELM 0.32833 ± 0.02183 0.44892 ± 0.017735 0.55872 ± 0.023568 0.63782 ± 0.022994
R-ELM/BIP 0.3167 ± 0.022154 0.44112 ± 0.018683 0.55189 ± 0.023331 0.63266 ± 0.02314
ELM 0.30937 ± 0.015178 0.41892 ± 0.014043 0.49249 ± 0.015587 0.54649 ± 0.014396
Abalone ELM/BIP 0.30962 ± 0.01533 0.41921 ± 0.01404 0.49273 ± 0.01571 0.54659 ± 0.014432
(1 sided) R-ELM 0.32316 ± 0.015803 0.45169 ± 0.014966 0.54808 ± 0.017707 0.62445 ± 0.01714
R-ELM/BIP 0.32332 ± 0.015825 0.45202 ± 0.014961 0.54818 ± 0.017724 0.62428 ± 0.017088
ELM 0.31865 ± 0.015152 0.44631 ± 0.014997 0.54589 ± 0.016201 0.63184 ± 0.016713
Abalone ELM/BIP 0.31893 ± 0.015148 0.44648 ± 0.015001 0.54601 ± 0.016026 0.63168 ± 0.016412
(2 sided) R-ELM 0.32091 ± 0.015043 0.44927 ± 0.015082 0.54944 ± 0.01625 0.63559 ± 0.01667
R-ELM/BIP 0.32104 ± 0.015045 0.44941 ± 0.015083 0.54954 ± 0.016202 0.6356 ± 0.016628
ELM 0.29169 ± 0.033615 0.40355 ± 0.030263 0.47067 ± 0.033956 0.52405 ± 0.032386
Boston ELM/BIP 0.29252 ± 0.033987 0.40409 ± 0.029703 0.47028 ± 0.033765 0.52485 ± 0.032287
(1 sided) R-ELM 0.31603 ± 0.036465 0.45043 ± 0.033717 0.54187 ± 0.040013 0.5924 ± 0.043863
R-ELM/BIP 0.3168 ± 0.035917 0.45117 ± 0.033769 0.54166 ± 0.040368 0.591 ± 0.044263
ELM 0.29624 ± 0.036144 0.42591 ± 0.034494 0.53481 ± 0.03501 0.59901 ± 0.043044
Boston ELM/BIP 0.29661 ± 0.036586 0.42572 ± 0.034833 0.53576 ± 0.035314 0.60062 ± 0.042521
(2 sided) R-ELM 0.3116 ± 0.035407 0.44557 ± 0.035873 0.5599 ± 0.037153 0.62804 ± 0.043756
R-ELM/BIP 0.31081 ± 0.036255 0.44614 ± 0.035326 0.56035 ± 0.036784 0.62959 ± 0.044111
TABLE II. Comparison of test's mean RMSE and standard deviation of ELM, ELM/BIP, ROB-ELM and R-ELM/BIP with the artificial dataset (SinC) and real regression problems (Abalone and Boston Housing).
Outlier contamination rate (%):        10                       20                       30                       40
ELM 0.11997 ± 0.0050207 0.18408 ± 0.0091126 0.26066 ± 0.0095486 0.33467 ± 0.010247
SinC ELM/BIP 0.089072 ± 0.0071982 0.16654 ± 0.010607 0.25085 ± 0.0094876 0.3269 ± 0.011963
(1 sided) R-ELM 0.087491 ± 0.0012655 0.089213 ± 0.003473 0.091219 ± 0.0015507 0.12792 ± 0.013189
R-ELM/BIP 5.0649×10−5 ± 8.5172×10−5 2.2089×10−5 ± 4.0014×10−5 4.711×10−6 ± 6.1517×10−6 4.6419×10−6 ± 9.6504×10−6
ELM 0.091381 ± 0.0030023 0.095552 ± 0.0048112 0.1019 ± 0.0067492 0.10256 ± 0.0076451
SinC ELM/BIP 0.040336 ± 0.0079798 0.057581 ± 0.010523 0.076226 ± 0.012953 0.08296 ± 0.016879
(2 sided) R-ELM 0.087293 ± 0.0011283 0.087445 ± 0.001065 0.088951 ± 0.0079636 0.088068 ± 0.0013317
R-ELM/BIP 4.4631×10−5 ± 8.7431×10−5 1.721×10−5 ± 3.2819×10−5 9.2179×10−6 ± 1.6389×10−5 4.3035×10−6 ± 7.1121×10−6
ELM 0.10728 ± 0.0062478 0.17863 ± 0.0070967 0.25384 ± 0.0090707 0.33196 ± 0.0099962
Abalone ELM/BIP 0.10809 ± 0.0055507 0.17856 ± 0.0070735 0.25435 ± 0.0087548 0.33289 ± 0.0099281
(1 sided) R-ELM 0.062609 ± 0.0012822 0.062607 ± 0.0016053 0.06265 ± 0.0015469 0.067467 ± 0.0018998
R-ELM/BIP 0.063793 ± 0.0020453 0.063137 ± 0.0015558 0.063912 ± 0.0021094 0.06857 ± 0.0019539
ELM 0.072112 ± 0.0049853 0.081733 ± 0.006677 0.0897 ± 0.0095399 0.096313 ± 0.012501
Abalone ELM/BIP 0.07261 ± 0.0035585 0.082111 ± 0.0075051 0.089434 ± 0.0083095 0.098868 ± 0.01285
(2 sided) R-ELM 0.06307 ± 0.0015705 0.062332 ± 0.001205 0.062575 ± 0.0015548 0.063167 ± 0.0018103
R-ELM/BIP 0.063934 ± 0.0015417 0.063521 ± 0.0021724 0.063724 ± 0.001802 0.063873 ± 0.0020998
ELM 0.14095 ± 0.026788 0.22239 ± 0.027087 0.29328 ± 0.032442 0.376 ± 0.034651
Boston ELM/BIP 0.13998 ± 0.020638 0.22301 ± 0.025641 0.29786 ± 0.039615 0.37361 ± 0.030934
(1 sided) R-ELM 0.067481 ± 0.0081434 0.064883 ± 0.0060752 0.069308 ± 0.011972 0.14881 ± 0.060624
R-ELM/BIP 0.069561 ± 0.0098091 0.067461 ± 0.0088994 0.073066 ± 0.010384 0.1506 ± 0.056514
ELM 0.11801 ± 0.032185 0.15066 ± 0.025102 0.19344 ± 0.033945 0.21684 ± 0.03698
Boston ELM/BIP 0.11963 ± 0.019048 0.15508 ± 0.020773 0.18842 ± 0.030141 0.21325 ± 0.036929
(2 sided) R-ELM 0.066718 ± 0.011338 0.069368 ± 0.014331 0.067116 ± 0.010247 0.087772 ± 0.023071
R-ELM/BIP 0.068507 ± 0.0099174 0.068531 ± 0.011441 0.069932 ± 0.011803 0.084385 ± 0.017704
TABLE III. Holm-Sidak's post hoc test results for the comparison of all methods with R-ELM/BIP.

                     ELM       ELM/BIP    ROB-ELM
1-sided problems
  p-values           0.0000    0.0000     0.2709
  α                  0.0170    0.0253     0.0500
2-sided problems
  p-values           0.0001    0.0005     0.0787
  α                  0.0170    0.0253     0.0500

[Fig. 1: average norm of the estimated output weights (vertical scale ×10^10) for ELM, ELM/BIP, R-ELM and R-ELM/BIP, under 1-sided and 2-sided contamination.]

All implementations were executed in MATLAB and, for each dataset and sub-problem, training and test samples were randomly drawn in 50 independent runs.

In Tab. I, we present the training's mean RMSE and standard deviation for all datasets and their respective sub-problems. The algorithms showed similar performance in training although, as the contamination rate increased, the methods without robust estimation presented better results. This behavior was expected, since the non-robust methods cannot ignore the contaminated samples at any level and try to learn even the noise.
In Tab. II, we present the test's mean RMSE and its respective standard deviation for all datasets and their respective sub-problems. With the artificial dataset SinC, it is clear that R-ELM/BIP produces the best performance. Furthermore, the methods that use M-estimators presented smaller variations in their mean RMSE in spite of the increasing contamination rate. To determine the statistical significance of the rank differences observed for each method on the three datasets, we carried out a non-parametric Friedman test (provided by MATLAB) on the ranking of the mean test RMSE for 1-sided problems, 2-sided problems and both at once. Given the null hypothesis that all algorithms are equivalent, the test provided p = 1.9368 × 10^-6, p = 9.1203 × 10^-5 and p = 1.1826 × 10^-10, respectively, all with p < 0.01. Therefore, we can reject the null hypothesis, stating that there is a statistically significant difference.

Based on this rejection, we applied the Holm-Sidak post hoc test (provided by [24]) to compare all methods with the proposed R-ELM/BIP; it tests whether the performance of each of the other methods is significantly different from that of R-ELM/BIP. In Tab. III, we present the p-values and their α, where it is clear that our proposed method is significantly different, in terms of test mean RMSE, from ELM and ELM/BIP but not from ROB-ELM.

Another aspect to be investigated is presented in Fig. 1 and Fig. 2: the average norm of the estimated output weights for each type of outlier and contamination rate. As discussed previously, the size of the output weights influences the generalization and the sensitivity to data perturbation; therefore, the methods using BIP show the best results. Even so, due to the ill-conditioned H matrix, the resulting output weights can reach very large values (see Fig. 1). In this particular case, R-ELM/BIP presents norms around 10^7, which is smaller than the other methods' norms of around 10^10. Also, from Fig. 2, we can see the evolution of the average weight norm across the different contamination rates. For the networks without BIP, the norm values not only increase with the data corruption but can also reach high values, depending on the problem. However, the networks with BIP showed norms that were almost insensitive to the corruption rate, providing much smaller values.

Fig. 2. Average weight norm with the Abalone and Boston Housing datasets, contaminated by 1-sided (1) and 2-sided (2) noise.
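For reference, the step-down Holm-Sidak correction behind Tab. III can be reproduced with a few lines of MATLAB; the raw p-value vector below reuses the 1-sided row of Tab. III, and [24] provides a ready-made routine for the same computation.

    % Holm-Sidak step-down correction (p-values from the 1-sided row of Tab. III).
    p     = [0.0000 0.0000 0.2709];          % raw p-values of ELM, ELM/BIP, ROB-ELM vs. R-ELM/BIP
    alpha = 0.05;  m = numel(p);
    [p_sorted, order] = sort(p);             % test the smallest p-value first
    alpha_sidak = 1 - (1 - alpha).^(1 ./ (m - (1:m) + 1));   % 0.0170, 0.0253, 0.0500, as in Tab. III
    reject = false(1, m);
    for j = 1:m
        if p_sorted(j) <= alpha_sidak(j)
            reject(order(j)) = true;         % significant difference w.r.t. R-ELM/BIP
        else
            break;                           % stop at the first non-rejection (step-down rule)
        end
    end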
IV. CONCLUSION

This work presented a new ELM algorithm endowed with an optimized hidden layer combined with outlier-robust estimation of the output weights. That optimization forces the hidden neurons to respond only to the few most important stimuli, preventing saturated neurons. Moreover, the IRLS promotes a learning process that takes into account the contribution of each residual to the objective function, which diminishes the influence of outliers on the final solution.

From the results in Tab. II and Figs. 1 and 2, the proposed R-ELM/BIP approach achieved results consistent with the best performances presented, with little variation along the increasing contamination rate, plus regularized output weights that provide less sensitivity to data perturbation. Even though there was no statistically significant difference in test mean RMSE between ROB-ELM and R-ELM/BIP, the output weight values given by our approach are less sensitive to the contamination rates and also to the different problems, which makes the achievable results more reliable.

As future work, we will investigate the influence of model selection and of different regularization methods on the ELM performance with corrupted data.

ACKNOWLEDGMENT

The authors would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and the Fundação Núcleo de Tecnologia Industrial do Ceará (NUTEC) for the financial support.

REFERENCES

[1] G. Beliakov, A. Kelarev, and J. Yearwood, "Robust artificial neural networks and outlier detection. Technical report," CoRR, 2011. [Online]. Available: http://dblp.uni-trier.de/db/journals/corr/corr1110.html#abs-1110-0169
[2] H. T. Huynh and Y. Won, "Weighted least squares scheme for reducing effects of outliers in regression based on extreme learning machine," J. Digital Content Technol. Appl. (JDCTA), vol. 2, no. 3, pp. 40-46, 2008. [Online]. Available: http://dblp.uni-trier.de/db/journals/jdcta/jdcta2.html#HuynhW08
[3] A. Khamis, Z. Ismail, K. Haron, and A. T. Mohammed, "The effects of outliers data on neural network performance," Journal of Applied Sciences, vol. 5, no. 8, pp. 1394-1398, 2005.
[4] F. Steege, V. Stephan, and H. Grob, "Effects of noise-reduction on neural function approximation," in Proc. 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2012, pp. 73-78.
[5] Y. Liu, "Robust parameter estimation and model selection for neural network regression," in Advances in Neural Information Processing Systems (NIPS), J. D. Cowan, G. Tesauro, and J. Alspector, Eds. Morgan Kaufmann, 1993, pp. 192-199.
[6] J. Larsen, L. Nonboe, M. Hintz-Madsen, and L. Hansen, "Design of robust neural network classifiers," in Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, May 1998, pp. 1205-1208.
[7] C.-C. Lee, C.-L. Tsai, Y.-C. Chiang, and C.-Y. Shih, "Noisy time series prediction using M-estimator based robust radial basis function neural networks with growing and pruning techniques," Expert Systems with Applications, vol. 36, no. 3, pp. 4717-4724, 2009.
[8] T. Łobos, P. Kostyła, Z. Wacławek, and A. Cichocki, "Adaptive neural networks for robust estimation of signal parameters," The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, vol. 19, no. 3, pp. 903-912, 2000.
[9] Y. Feng, R. Li, A. Sudjianto, and Y. Zhang, "Robust neural network with applications to credit portfolio data analysis," Stat Interface, vol. 3, no. 4, pp. 437-444, 2010.
[10] C. H. Aladag, E. Egrioglu, and U. Yolcu, "Robust multilayer neural network based on median neuron model," Neural Computing and Applications, vol. 24, no. 3-4, pp. 945-956, 2014.
[11] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, "Extreme learning machine: Theory and applications," Neurocomputing, vol. 70, no. 1-3, pp. 489-501, 2006.
[12] G.-B. Huang, D. Wang, and Y. Lan, "Extreme learning machines: a survey," International Journal of Machine Learning and Cybernetics, vol. 2, no. 2, pp. 107-122, 2011.
[13] P. Horata, S. Chiewchanwattana, and K. Sunat, "Robust extreme learning machine," Neurocomputing, vol. 102, pp. 31-34, 2013 (Advances in Extreme Learning Machines, ELM 2011).
[14] G. Zhao, Z. Shen, and Z. Man, "Robust input weight selection for well-conditioned extreme learning machine," International Journal of Information Technology, vol. 17, no. 1, 2011.
[15] A. C. P. Kulaif and F. J. V. Zuben, "Improved regularization in extreme learning machines," in Annals of the Congresso Brasileiro de Inteligência Computacional (CBIC), 2013.
[16] P. L. Bartlett, "The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network," IEEE Transactions on Information Theory, vol. 44, no. 2, pp. 525-536, March 1998.
[17] W. Deng, Q. Zheng, and L. Chen, "Regularized extreme learning machine," in CIDM, IEEE, 2009, pp. 389-395.
[18] Y. Wang, F. Cao, and Y. Yuan, "A study on effectiveness of extreme learning machine," Neurocomputing, vol. 74, no. 16, pp. 2483-2490, 2011.
[19] J. M. Martínez-Martínez, P. Escandell-Montero, E. Soria-Olivas, J. D. Martín-Guerrero, R. Magdalena-Benedito, and J. Gómez-Sanchis, "Regularized extreme learning machine for regression problems," Neurocomputing, vol. 74, no. 17, pp. 3716-3721, 2011.
[20] A. L. B. Barros and G. A. Barreto, "Building a robust extreme learning machine for classification in the presence of outliers," in Hybrid Artificial Intelligent Systems, ser. Lecture Notes in Computer Science, J.-S. Pan, M. Polycarpou, M. Woźniak, A. C. Carvalho, H. Quintián, and E. Corchado, Eds. Springer Berlin Heidelberg, 2013, vol. 8073, pp. 588-597.
[21] K. Neumann and J. Steil, "Optimizing extreme learning machines via ridge regression and batch intrinsic plasticity," Neurocomputing, vol. 102, pp. 23-30, 2013 (Advances in Extreme Learning Machines, ELM 2011).
[22] P. J. Huber, "Robust estimation of a location parameter," Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73-101, 1964.
[23] J. Fox, Applied Regression Analysis, Linear Models, and Related Methods. Sage Publications, 1997.
[24] G. Cardillo, "Holm-Sidak t-test: a routine for multiple t-test comparisons," http://www.mathworks.com/matlabcentral/fileexchange/12786.