1 Introduction
Machine learning (ML) is a research field within Artificial Intelligence (AI) that studies techniques for building systems capable of learning automatically from experience. Within this area, there is a specific field dedicated to working with temporal data, or time series. Time series data mining techniques have been explored extensively in the literature over the past decade, and they remain an important topic frequently addressed by researchers today. Numerous works have contributed advances in time series data mining techniques [2, 11, 12, 15, 17, 21]. Classification, clustering, anomaly detection and forecasting are examples of common data mining tasks applied to time series.
Time series classification has been the subject of various studies that explored temporal data collected from sensors, such as accelerometers, electrocardiograms (ECG) and electroencephalograms (EEG), as sources for multi-class identification. Examples of applications are Human Activity Recognition (HAR) [15], ECG-based [2, 5, 17, 21] and EEG-based [11] classification.
As time series data may have complex characteristics, it is necessary to apply sophisticated solutions that can handle nonlinear behavior. Among data mining techniques, the Artificial Neural Network (ANN) is an interesting computational intelligence approach for addressing the problem due to its adaptive and generalization capabilities. Classical algorithms such as the Support Vector Machine (SVM) [17], the Multi-Layer Perceptron (MLP) [15] and Learning Vector Quantization (LVQ) [5, 12] have been employed in time series classification problems. Furthermore, deep neural networks such as the Deep MLP [2] and the Convolutional Neural Network (CNN) [21] have been increasingly explored in related studies.
Another category of intelligent classifiers is based on Hybrid Neural Systems (HNS) [19]. HNS are systems that combine artificial neural networks with other intelligent methods in a single model for solving a specific problem. This approach aims to extract specific advantages from different techniques in order to build a more robust system. One type of hybrid neural system is the Fuzzy Neural Network, in which concepts from Fuzzy Sets and Artificial Neural Networks are combined in a single method.
Hybrid Fuzzy-LVQ neural networks have been widely explored in the literature. For instance, in [5], the authors presented a new model for data classification using an LVQ-based neural network combined with type-2 fuzzy logic. In that work, a fuzzy inference system was employed to determine which of the network's prototypes is the nearest to an input vector. The new method was implemented and tested on two data sets to compare its effectiveness against the original LVQ algorithm and a type-1 Fuzzy-LVQ. In a different approach, [8] employed an FLVQ-based algorithm with wavelet transformation for classifying abnormalities in images of the inner surface of the eye. In that work, the authors compared the FLVQ method with two other methods: Levenberg-Marquardt (LM) and the Adaptive Neuro-Fuzzy Inference System (ANFIS).
In [4], a study was conducted on the application of the FLVQ model proposed by [18] to probability distribution identification. In a more practical vein, a classification algorithm based on the Generalized Fuzzy-LVQ method was designed and implemented on an FPGA [1, 10]. Furthermore, in [20] three different variations of LVQ algorithms are introduced and compared: Fuzzy-soft LVQ, batch LVQ and Fuzzy-LVQ.
As a motivation, few works have explored Fuzzy-LVQ strategies for dealing with high-dimensional temporal data. Therefore, we believe hybrid Fuzzy-LVQ classifiers have the potential to be explored more deeply.
The learning method in LVQ consists of using the input vectors as guidance for organizing the prototypes in specific regions that define a class. First, a set of prototypes is initialized and each prototype is assigned a class. Each class must be represented by at least one prototype, a class can have multiple prototypes, and a prototype represents only a single class. Then, during the learning process, each instance from the training set is compared with all of the network's prototypes using a similarity measure. LVQ-based algorithms are classified as competitive learning due to the selection of the closest prototype within the set of P prototypes:
\[ w = \arg\min_{1 \le i \le P} d(x_j, p_i) \tag{1} \]
where w is the index of the winner prototype (the prototype closest to a specific instance x_j). The distance is measured by a distance function; the Euclidean distance, or L2-norm, is generally used:
\[ d(x_j, p_i) = \left\| x_j - p_i \right\|_2 = \sqrt{\sum_{k=1}^{n} (x_{jk} - p_{ik})^2} \tag{2} \]
where n is the dimension of the instance x_j, which is the same as that of p_i. If the class of an instance is equal to the class of the closest prototype (the winner), this prototype is moved towards the instance; otherwise it is moved away. Let t be the iteration counter of the training algorithm. The learning rule for Kohonen's LVQ1 algorithm is given by:
\[ p_w(t+1) = \begin{cases} p_w(t) + \alpha(t)\,[x_j - p_w(t)] & \text{if } C(p_w) = C(x_j), \\ p_w(t) - \alpha(t)\,[x_j - p_w(t)] & \text{if } C(p_w) \neq C(x_j). \end{cases} \tag{3} \]
All other prototypes p_i(t), i ≠ w, remain unchanged. In our experiments, we adopted a linearly decreasing learning rate α(t) = α(0)(1 − t/N), where α(0) is the initial learning rate and N is the maximum number of training iterations.
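To make the update concrete, the following is a minimal NumPy sketch of one LVQ1 training iteration, assuming the prototype matrix and its labels are already initialized; all names are illustrative, not from the paper:

```python
import numpy as np

def lvq1_step(x, y, prototypes, proto_labels, alpha):
    """One LVQ1 iteration: pick the winner (Eqs. 1-2) and move it (Eq. 3)."""
    dists = np.linalg.norm(prototypes - x, axis=1)  # Euclidean distance, Eq. 2
    w = int(np.argmin(dists))                       # winner index, Eq. 1
    sign = 1.0 if proto_labels[w] == y else -1.0    # attract or repel, Eq. 3
    prototypes[w] += sign * alpha * (x - prototypes[w])
    return prototypes

# Linearly decreasing learning rate: alpha(t) = alpha(0) * (1 - t/N)
alpha0, N = 0.05, 10000
alpha = lambda t: alpha0 * (1.0 - t / N)
```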
Kohonen introduced the LVQ2 algorithm in 1988, a variation similar to the original LVQ [14]. In LVQ2, however, the learning process is based on the two prototypes p_1st and p_2nd that are the first and second nearest to an instance x_j, respectively. One of them must belong to the correct class and the other to an incorrect class. Furthermore, the instance must fall into a zone defined around the midplane between the two prototypes. For an instance x_j and the two nearest prototypes p_1st and p_2nd, let d_1st and d_2nd be the distances of x_j to p_1st and p_2nd, respectively. Then x_j falls into a window of width w if Equation 4 is satisfied. It is recommended to adopt a width w between 0.2 and 0.3 [13]. The LVQ2 prototype learning rule is given by Equation 5.
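Following Kohonen's standard formulation [13, 14], the window condition and the update rule take the form (a reconstruction under that assumption):

\[ \min\left( \frac{d_{1st}}{d_{2nd}}, \frac{d_{2nd}}{d_{1st}} \right) > \frac{1 - w}{1 + w} \tag{4} \]

\[ p_c(t+1) = p_c(t) + \alpha(t)\,[x_j - p_c(t)], \qquad p_e(t+1) = p_e(t) - \alpha(t)\,[x_j - p_e(t)] \tag{5} \]

where p_c denotes whichever of p_1st, p_2nd belongs to the correct class C(x_j) and p_e denotes the one belonging to the incorrect class.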
The quantization error (Q_E), used to monitor training, is the mean squared distance between each instance and its winner prototype:

\[ Q_E = \frac{1}{N} \sum_{j=1}^{N} \left\| x_j - p_w \right\|^2 \tag{6} \]
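A direct NumPy rendering of this measure (a sketch; the function name is illustrative):

```python
import numpy as np

def quantization_error(X, prototypes):
    """Mean squared distance from each instance to its winner prototype (Eq. 6)."""
    # Pairwise distances: rows index instances, columns index prototypes
    dists = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return float((dists.min(axis=1) ** 2).mean())
```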
Adaptability
Adaptive LVQ-ANNs are a specific variation of LVQ-based methods that have the capability of adjusting their architecture to improve network performance during the training process. In general, the adaptive characteristic implies the ability to make changes in the network's structure by including or removing prototypes (codebooks or neurons). In previous work [3], a study was conducted on a proposed adaptive LVQ algorithm applied to human activity recognition using data collected from a tri-axial accelerometer. In [3], Kohonen's LVQ algorithm was modified to include an adaptive step at the end of each epoch during network training. The adaptive process consisted of two stages.
Fuzzy
In the proposed AFLVQ method, the fuzzy part is based on the Fuzzy-LVQ introduced by Chung [7]. The algorithm optimizes a fuzzy objective function by minimizing the network output error, calculated from the difference between the target and actual class memberships, while also minimizing the distances between training patterns and competing neurons. In their work, Chung and Lee [7] define the following objective function:
\[ Q_m(U, V) = \sum_{j=1}^{N} \sum_{i=1}^{P} \left[ (t_{ji})^m - (\mu_{ji})^m \right] d(x_j, p_i) \tag{7} \]
The term d(x_j, p_i) represents the distance between the i-th prototype and the j-th instance (see Equation 2). The fuzziness parameter m defines weights for the membership functions of each prototype in such a way that the greater the value of m, the smoother the learning process. The target class membership value of neuron i for input pattern j is represented by t_ji ∈ {0, 1}. Hence, the FLVQ learning rule (Equation 8) and the membership updating rule (Equation 9) are as follows.
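Equation 8 is given here under the assumption that the learning rule is the gradient-descent update of Q_m with squared Euclidean distance (the constant factor from differentiating the distance is absorbed into the learning rate α(t)):

\[ p_i(t+1) = p_i(t) + \alpha(t) \left[ (t_{ji})^m - (\mu_{ji})^m \right] (x_j - p_i(t)) \tag{8} \]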
" 1 #−1
P µ d (x , p ) ¶ m−1
j i
µji =
X
(9)
`=1 d (x j , p ` )
Note that the previous equations are only valid when the number of prototypes P is equal to the number of classes. For LVQ architectures with multiple prototypes per class, we introduce a competitive step in the training process. The FLVQ network is depicted in Figure 1.
In the figure, an input layer receives the instances from the dataset. The distance layer calculates the distance between each prototype and the presented instance. Then, in the competitive layer (called the MIN layer by Chung [7]), only the closest prototype from each class is chosen to undergo the fuzzy competition. Therefore, the membership computations and prototype updates are applied to at most k competing prototypes, where k is the number of classes in the problem.
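The step just described can be sketched as follows, assuming squared Euclidean distances; `flvq_step` and all other names are illustrative, not from the paper:

```python
import numpy as np

def flvq_step(x, y, prototypes, proto_labels, alpha, m=2.0):
    """One FLVQ iteration with the per-class competitive (MIN) layer."""
    d = ((prototypes - x) ** 2).sum(axis=1)           # distance layer
    classes = np.unique(proto_labels)
    # MIN layer: index of the closest prototype of each class
    winners = np.array([np.flatnonzero(proto_labels == c)[
        np.argmin(d[proto_labels == c])] for c in classes])
    dw = d[winners] + 1e-12                           # avoid division by zero
    # Fuzzy memberships over the k competing prototypes (Eq. 9)
    mu = 1.0 / ((dw[:, None] / dw[None, :]) ** (1.0 / (m - 1.0))).sum(axis=1)
    t = (classes == y).astype(float)                  # target memberships t_ji
    # Membership-weighted update of the competing prototypes (Eq. 8)
    prototypes[winners] += alpha * (t**m - mu**m)[:, None] * (x - prototypes[winners])
    return prototypes
```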
– Training: During the training stage, the instances from the training dataset are presented to the neural network, and the prototypes are adjusted based on Chung's Fuzzy-LVQ algorithm [7].
– Adaptation: After completing an epoch, the resulting LVQ network is evaluated in order to verify the need for adaptation. If there is a need, the adaptive method from our previous work [3] removes or includes prototypes according to criteria that aim to improve the network performance.
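Combining the two stages, the overall loop can be sketched as below. The concrete adaptation criteria used here (pruning prototypes that never win, inserting a prototype for the class with the most errors) are illustrative assumptions, not the exact rules of [3]:

```python
import numpy as np

def adapt(prototypes, proto_labels, X, y):
    """Adaptation stage sketch: prune dead prototypes, grow where errors occur."""
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    winners = d.argmin(axis=1)
    pred = proto_labels[winners]
    # Prune prototypes that never won, keeping at least one per class
    keep = np.ones(len(prototypes), dtype=bool)
    for i in range(len(prototypes)):
        if i not in winners and ((proto_labels == proto_labels[i]) & keep).sum() > 1:
            keep[i] = False
    prototypes, proto_labels = prototypes[keep], proto_labels[keep]
    # Insert a prototype at the mean of the misclassified samples of the
    # class with the most errors (hypothetical criterion)
    errors = pred != y
    if errors.any():
        cls, cnt = np.unique(y[errors], return_counts=True)
        c = cls[cnt.argmax()]
        prototypes = np.vstack([prototypes, X[errors & (y == c)].mean(axis=0)])
        proto_labels = np.append(proto_labels, c)
    return prototypes, proto_labels

def train_aflvq(X, y, prototypes, proto_labels, alpha0=0.05, epochs=100):
    """Training stage (flvq_step from the sketch above) plus adaptation stage."""
    for epoch in range(epochs):
        alpha = alpha0 * (1.0 - epoch / epochs)       # decreasing learning rate
        for xj, yj in zip(X, y):                      # training stage
            prototypes = flvq_step(xj, yj, prototypes, proto_labels, alpha)
        prototypes, proto_labels = adapt(prototypes, proto_labels, X, y)
    return prototypes, proto_labels
```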
4 Methodology
4.1 Dataset
The dataset used in our experiments was selected from the UCI Machine Learning Repository [9]. It is an activity recognition database built from recordings of 30 volunteers executing multiple activities while carrying a waist-mounted smartphone. The dataset was introduced by Anguita et al. [6] and contains data collected from embedded inertial sensors: a tri-axial accelerometer and a gyroscope. In this time series classification problem, the aim is to recognize activities performed by humans based on information retrieved from body-worn motion sensors. The selected activity classes are: standing, sitting, laying down, walking, walking downstairs and walking upstairs. The dataset is composed of 10,299 samples, of which 7,352 compose the training set and 2,947 the test set, a division of approximately 70% and 30% for training and test data, respectively.
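As a convenience, the dataset ships as plain-text matrices and can be loaded directly; the paths below follow the layout of the UCI archive and should be adjusted to a local copy:

```python
import numpy as np

base = "UCI HAR Dataset"  # directory layout as distributed by the UCI repository
X_train = np.loadtxt(f"{base}/train/X_train.txt")             # shape (7352, 561)
y_train = np.loadtxt(f"{base}/train/y_train.txt", dtype=int)  # labels 1..6
X_test = np.loadtxt(f"{base}/test/X_test.txt")                # shape (2947, 561)
y_test = np.loadtxt(f"{base}/test/y_test.txt", dtype=int)
```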
Features
In order to properly represent the input data, multiple features were extracted from the sensors' raw data. As this is a time series classification problem, the features of each instance were calculated over multiple observations in an ordered sequence, usually called a sub-series or data window. Examples of extracted features are described in Table 1. These features were selected and extracted by Anguita et al. [6].
Table 1. Examples of extracted features

Function     Description
mean         Mean value
std          Standard deviation
max          Largest value in array
min          Smallest value in array
sma          Signal magnitude area
correlation  Correlation coefficient
energy       Average sum of the squares
In total, 561 features were extracted. Thus, for an instance x ∈ R^D, the dimension is D = 561.
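To illustrate how such features arise from a raw sensor window (the precise definitions are those of Anguita et al. [6]; this sketch only approximates a few of them):

```python
import numpy as np

def window_features(window):
    """A few Table 1-style features for one window (rows: samples, cols: axes)."""
    return {
        "mean": window.mean(axis=0),
        "std": window.std(axis=0),
        "max": window.max(axis=0),
        "min": window.min(axis=0),
        # Signal magnitude area: mean of the summed absolute values per sample
        "sma": float(np.abs(window).sum(axis=1).mean()),
        # Energy: average sum of the squares
        "energy": (window ** 2).mean(axis=0),
    }

# Example: a 2.56 s window at 50 Hz gives 128 tri-axial accelerometer samples
features = window_features(np.random.randn(128, 3))
```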
4.2 Experiments
5 Results
In Figure 3(b), taking the best model (LVQ2 and ALVQ2) as an example, we can observe that initially Q_E = 15.78. After training, the error increased to Q_E = 30.45 instead of decreasing. Strange as it may seem, a low Q_E does not necessarily mean a well-trained model, nor does a high Q_E mean a poorly trained model. However, there are limits for acceptable Q_E values, which may change according to the disposition of the data. It is extremely important to evaluate the relationship between quantization error and classification error in order to properly choose a classification model.
[Figure 3: panel (a) shows the Classification Error and panel (b) the Quantization Error over 100 training epochs for the LVQ1, LVQ2, FLVQ, ALVQ1, ALVQ2 and AFLVQ models; in panel (b), the marked points are (0, 15.78) and (100, 30.45).]
Fig. 3. Classification Error (C_E) and Quantization Error (Q_E) evolution throughout the epochs
Table 4. Confusion matrix for the best test result (LVQ2 and ALVQ2 with Nc = 6)
C1 C2 C3 C4 C5 C6 Recall
C1 478 7 11 0 0 0 96.37%
C2 18 450 3 0 0 0 95.54%
C3 5 18 397 0 0 0 94.52%
C4 0 2 0 448 40 1 91.24%
C5 0 0 0 29 503 0 94.55%
C6 0 0 0 0 0 537 100.00%
Precision 95.41% 94.34% 96.59% 93.92% 92.63% 99.81% 95.45%
Conclusion
Acknowledgements
We would like to express our gratitude to the Coordination for the Improvement of
Higher Education Personnel (CAPES) for the financial support.
References
1. Afif, I.N., Wardhana, Y., Jatmiko, W.: Implementation of adaptive fuzzy neuro generalized learning vector quantization (AFNGLVQ) on field programmable gate array (FPGA) for real world application. In: Advanced Computer Science and Information Systems (ICACSIS), 2015 International Conference on. pp. 65–71. IEEE (2015)
2. Al Rahhal, M.M., Bazi, Y., AlHichri, H., Alajlan, N., Melgani, F., Yager, R.R.: Deep learning
approach for active classification of electrocardiogram signals. Information Sciences 345,
340–354 (2016)
3. Albuquerque, R.F., Braga, A.P.d.S., Torrico, B.C., Reis, L.L.N.d.: Classificação de dinâmicas de sistemas utilizando redes neurais LVQ adaptativas [Classification of system dynamics using adaptive LVQ neural networks]. In: Conferência Brasileira de Dinâmica, Controle e Aplicações - DINCON (2017)
4. Alfa, G.D., Kurniasari, D., Usman, M., et al.: Neural network fuzzy learning vector quantization (FLVQ) to identify probability distributions. International Journal of Computer Science and Network Security (IJCSNS) 16(10), 16 (2016)
5. Amezcua, J., Melin, P., Castillo, O.: New Classification Method Based on Modular Neural
Networks with the LVQ Algorithm and Type-2 Fuzzy Logic. Springer (2018)
6. Anguita, D., Ghio, A., Oneto, L., Parra, X., Reyes-Ortiz, J.L.: A public domain dataset for
human activity recognition using smartphones. In: ESANN (2013)
7. Chung, F.L., Lee, T.: Fuzzy learning vector quantization. In: Neural Networks, 1993.
IJCNN’93-Nagoya. Proceedings of 1993 International Joint Conference on. vol. 3, pp.
2739–2743. IEEE (1993)
8. Damayanti, A.: Fuzzy learning vector quantization, neural network and fuzzy systems for
classification fundus eye images with wavelet transformation. In: 2017 2nd International
conferences on Information Technology, Information Systems and Electrical Engineering
(ICITISEE). pp. 331–336 (Nov 2017)
9. Dheeru, D., Karra Taniskidou, E.: UCI machine learning repository (2017),
http://archive.ics.uci.edu/ml
10. Fajar, M., Jatmiko, W., Agus, I.M., et al.: FNGLVQ FPGA design for sleep stages classification based on electrocardiogram signal. In: Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on. pp. 2711–2716. IEEE (2012)
11. Hajinoroozi, M., Mao, Z., Jung, T.P., Lin, C.T., Huang, Y.: EEG-based prediction of driver's cognitive performance by deep convolutional neural network. Signal Processing: Image Communication 47, 549–555 (2016)
12. Jain, B.J., Schultz, D.: Asymmetric learning vector quantization for efficient nearest
neighbor classification in dynamic time warping spaces. Pattern Recognition 76, 349–366
(2018)
13. Kohonen, T.: The self-organizing map. Proceedings of the IEEE 78(9), 1464–1480 (1990)
14. Kohonen, T., Barna, G., Chrisley, R.: Statistical pattern recognition with neural networks:
Benchmarking studies. In: IEEE International Conference on Neural Networks. vol. 1, pp.
61–68 (1988)
15. Nakano, K., Chakraborty, B.: Effect of dynamic feature for human activity recognition
using smartphone sensors. In: 2017 IEEE 8th International Conference on Awareness
Science and Technology (iCAST). pp. 539–543 (Nov 2017)
16. Peres, S.M., Rocha, T., Biscaro, H.H., Madeo, R.C.B., Boscarioli, C.: Tutorial sobre fuzzy-c-means e fuzzy learning vector quantization: Abordagens híbridas para tarefas de agrupamento e classificação [A tutorial on fuzzy c-means and fuzzy learning vector quantization: hybrid approaches for clustering and classification tasks]. Revista de Informática Teórica e Aplicada 19(1), 120–163 (2012)
17. Rajesh, K.N., Dhuli, R.: Classification of ECG heartbeats using nonlinear decomposition methods and support vector machine. Computers in Biology and Medicine 87, 271–284 (2017)
18. Sakuraba, Y., Nakamoto, T., Moriizumi, T.: New method of learning vector quantization
using fuzzy theory. Systems and computers in Japan 22(13), 93–103 (1991)
19. Wermter, S.: Hybrid neural systems. Springer Science & Business Media (2000)
20. Wu, K.L., Yang, M.S.: A fuzzy-soft learning vector quantization. Neurocomputing 55(3-4),
681–697 (2003)
21. Xia, Y., Wulan, N., Wang, K., Zhang, H.: Detecting atrial fibrillation by deep convolutional neural networks. Computers in Biology and Medicine 93, 84–92 (2018)