
Abstract- In this paper, a transfer learning based method is proposed for the classification of seizure and non-seizure Electroencephalogram (EEG) signals. Recognizing seizure signals is important in the clinical diagnosis of epileptic seizure. Various machine learning and deep learning techniques have been employed for this purpose; however, their classification accuracy is often unsatisfactory and the models must be built from scratch. Transfer learning overcomes this by reusing pre-trained networks such as GoogLeNet, ResNet101 and VGG19 trained on the ImageNet database. We used these pre-trained image classification networks, which have already learned to extract informative features from natural images, as a starting point to learn a new task, in our case the classification of seizure signals. VGG19 shows the highest accuracy among the three but takes comparatively more prediction time. We mainly emphasize the results obtained from GoogLeNet, since it provides effective accuracy while taking less time for prediction. The proposed method achieved an accuracy above 99% for a small number of epochs and a high accuracy of 100% when the number of epochs was increased. Our experimental results show that the proposed methodology achieves better performance than many state-of-the-art classification algorithms.

Keywords- Electroencephalogram, Epileptic seizure, Transfer learning, Pre-trained networks, Classification of seizure and non-seizure

I. INTRODUCTION
The brain contains billions of cells; about half of them are neurons and the other half facilitate the activity of neurons. The brain cells, or neurons, communicate using electrical impulses [1]. These electrical impulses can be recorded by placing electrodes on the scalp surface of a person (subject). The Electroencephalogram (EEG) is an electrophysiological technique for recording the electrical activity arising from the human brain, and it acts as a window into the functioning of the brain. Electrode locations are specified by the 10-20 electrode placement system devised by the International Federation of Societies for Electroencephalography. This system is based on the relationship between the position of an electrode and the underlying area of the cerebral cortex. These scalp electrodes are used to record the EEG signals.
EEG signals are highly random and complex, and they contain a great deal of useful information about the brain state. The pattern of activity changes with the level of a person's arousal: if a person is relaxed, the EEG has many slow waves, and if a person is excited, the EEG has numerous fast waves [2]. Since these signals are non-linear and non-stationary in nature, it is rather difficult to extract useful information directly from them in the time domain by simple observation. Therefore, advanced machine learning and signal processing techniques are used to extract important features for the diagnosis of different diseases. Electroencephalography is a non-invasive technique and is thus used to diagnose brain-related diseases and symptoms. The recordings given by EEG help in detecting many neurological diseases, such as epilepsy, autism, tumor, Alzheimer's disease, cerebrovascular lesions, stroke, sleep disorders, depression and problems associated with trauma.
In this paper, we will mainly focus on the classification of EEG signals as seizure and non-seizure.
Epilepsy is a prevalent neurological disorder affecting about 50 million people worldwide according to the World Health Organization (WHO) [3]. It is estimated that 2.4 million people are diagnosed with epilepsy every year. Epileptic patients are subject to epileptic seizures caused by abnormal, disordered and recurring electrical discharges of the neurons of the cerebral cortex, which lead to involuntary movements, convulsions and loss of consciousness. Epilepsy can affect anyone at any age, and seizures that occur suddenly may lead to serious conditions. A timely and precise diagnosis of epilepsy is essential for patients in order to initiate anti-epileptic drug therapy and subsequently decrease the risk of future seizures and seizure-related complications [4]. EEG therefore plays a significant role in detecting epilepsy: it measures voltage differences between electrodes placed along the subject's scalp, arising from ionic currents flowing within brain neurons, and thus provides spatial and temporal information about the brain.

Over the past few years, researchers have developed various techniques for the detection of epileptic seizures using EEG signals. Most of the existing automatic seizure detection techniques use machine learning or deep learning approaches. From the machine learning point of view, a lot of work has already been carried out. In a study presented by Chen Zhang et al. [5], the authors used an area- and energy-efficient machine learning method for seizure onset and termination detection. Two area-efficient support vector machine classifiers are combined with a weight-and-average algorithm to achieve a high sensitivity of 95.1% and a specificity of 96.2%. U. Rajendra Acharya et al. [6] suggested the classification of three classes of EEG (normal, interictal and ictal) by extracting four non-linear entropy-based features (RE, SampEn, SEN and PE) from the full EEG time series and comparing seven classifiers (fuzzy Sugeno classifier, support vector machine, KNN, probabilistic neural network, decision tree, Gaussian mixture model and naïve Bayes). The fuzzy classifier was found to differentiate among the three classes with an accuracy of 98.1%. For the detection of seizures in ambulatory EEG, Patel et al. [7] presented a low-power, real-time classification technique. The quadratic discriminant analysis (QDA), linear discriminant analysis (LDA), Mahalanobis discriminant analysis (MDA) and SVM classifiers were compared, and LDA showed the best result. When tested on 13 subjects, LDA gave 90.9% sensitivity, 59.5% specificity and 76.5% overall accuracy.

Considering deep learning approaches, Ihsan Ullah et al. [8] proposed an ensemble-based system of pyramidal 1D convolutional neural networks. The proposed model is built on a refinement approach and uses 60% fewer parameters than traditional CNN models, giving an accuracy of 99.1±0.9% on the University of Bonn dataset. Ram Bilas Pachori and Shivnarayan Patidar [9] proposed a method based on Empirical Mode Decomposition (EMD) and the second-order difference plot (SODP) for the detection of epileptic seizures. In this method an artificial neural network (ANN) is used for the classification of ictal and seizure-free EEG signals, with a feature space obtained from the ellipse area parameters of two IMFs.
A 13-layer deep convolutional neural network (CNN) algorithm for the automated detection of normal, preictal and seizure classes has been proposed by U. Rajendra Acharya et al. [10] using the data provided by the University of Bonn. This was the first time a CNN was used for EEG signal analysis. Here, a computer-aided diagnosis (CAD) system is developed to distinguish among the different classes of EEG signals. The proposed technique achieved an accuracy of 88.67%, a specificity of 90.00% and a sensitivity of 95.00%.
Mingrui Sun et al. [11] used a data preprocessing method based on the discrete Fourier transform and two deep learning prediction methods, a convolutional neural network (CNN) and a recurrent neural network (RNN), to perform unsupervised feature learning of EEG in epilepsy. A method was outlined to extract frequency-domain and time-series features with a two-layer CNN from the EEG prediction dataset available from a Kaggle competition. The model outperforms other linear and deep learning methods on different evaluation criteria. The results were also compared with traditional ML algorithms such as linear discriminant analysis and logistic regression, with the area under the curve as the evaluation criterion, and the proposed method exhibited excellent performance compared to the others.

Many of the above-mentioned approaches showed good classification accuracy but require the classification network to be trained from scratch. In addition, three main problems affect classification accuracy: the distributions of the data used for training and testing may differ, the amount of training data may not be sufficient, and most machine learning approaches generate black-box models that are difficult to interpret. Therefore, in this paper we focus on the implementation of a transfer learning approach that employs pre-trained networks for the classification of seizures. Yizhang Jiang et al. [12] introduced the transfer learning, semi-supervised learning and TSK fuzzy system (TL-SSL-TSK) model, which integrates all three to increase the robustness, accuracy and interpretability of the classifier for EEG signal classification. Although an accuracy above 95% is achieved, the challenge lies in reducing the computational cost of the algorithms used.
To the best of our knowledge, our proposed methodology is the first to achieve higher accuracy than the existing methods with less computational cost.

The manuscript is arranged as follows: the second section contains a brief description of different machine learning techniques, the third section discusses the dataset used, the fourth section introduces the details of the proposed methodology, and the fifth section contains the results of different classification techniques. Section six compares the performance of the proposed method along with the analysis and discussion of the obtained results, and the seventh section concludes the findings of the study.
II. DESCRIPTION OF DIFFERENT MACHINE LEARNING TECHNIQUES
A. Support Vector Machine
SVM, or support vector machine, is a supervised machine learning algorithm that can be used for binary classification and regression. It constructs an optimal hyperplane as a decision surface such that the margin of separation between the two classes in the data is maximized [15]. SVM uses a kernel that computes a similarity between pairs of observations; for this reason, SVMs are also referred to as kernel machines. To maximize the distance between the closest members of the separate classes, the algorithm finds a decision boundary. Training a support vector machine mainly involves two steps. The first step, the kernel trick, transforms the predictors (the input data) into a high-dimensional feature space. In the second step, a quadratic optimization problem is solved to fit an optimal hyperplane that classifies the features into two classes.
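As an illustration of these two steps, the following is a minimal MATLAB sketch that trains a Gaussian-kernel SVM; the variable names (X, y, Xtest, ytest) and the kernel settings are assumptions for illustration rather than the exact configuration used in this study.

% Minimal sketch: Gaussian-kernel SVM for binary classification.
% X is an N-by-P feature matrix, y an N-by-1 label vector (assumed names).
svmModel = fitcsvm(X, y, ...
    'KernelFunction', 'gaussian', ...  % radial basis function kernel
    'KernelScale', 'auto', ...         % heuristic estimate of the kernel scale
    'Standardize', true);              % z-score the predictors

% Evaluate on held-out data (Xtest, ytest)
yPred    = predict(svmModel, Xtest);
accuracy = mean(yPred == ytest);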
B. Artificial Neural Network
The basic elements of the human brain are neurons. These neurons provide us with the ability to apply our previous experiences in our actions. Dr. Robert Hecht-Nielsen defines a neural network as "a computing system made up of a number of simple, highly interconnected processing elements which process information by their dynamic state response to external inputs". Artificial neural networks (ANNs) are computing algorithms that mimic the basic functions of biological neurons. Each artificial neuron receives inputs from other neurons, combines them, performs an operation on the result and outputs the final value. ANNs can learn in a self-organizing way to perform brain-like functions such as classification, pattern recognition and optimization. The architecture of an ANN describes the connections between neurons; it contains an input layer, an output layer and one or more hidden layers in between [16].
C. Transfer Learning
Conventional machine learning and deep learning algorithms are traditionally designed to work in isolation and are trained to solve a specific task; once the feature space distribution changes, the model has to be rebuilt from scratch. Transfer learning overcomes this isolated learning paradigm by utilizing knowledge acquired for one task to solve related tasks. Transfer learning is commonly used in deep learning applications: we may take a pre-trained network and use it as a starting point to learn a new task. Rather than training a network with randomly initialized weights from scratch, it is much easier and faster to fine-tune a network with transfer learning, and the learned features can be quickly transferred to the new task using a small amount of training data [17].
In transfer learning, we can either develop the source model or use a pre-trained model; in our case, we used the pre-trained model approach. In the first step, a pre-trained model is chosen from the available models or a source model is selected. In the second step, the selected model is reused as the starting point for the second task. In the last step, the model may need to be refined or adapted on the input-output pair data available for the task. This study mainly emphasizes transfer learning. Fig1 shows the transfer learning architecture used for the classification in this study.

Fig1. Transfer learning architecture: the pre-trained network is modified, then trained using the training data and the chosen training algorithm options; if the resulting accuracy is insufficient, the options are modified and the network is retrained, otherwise the process is done.


III. DATA DESCRIPTION
In this paper, we used the dataset acquired by the research team of the University of Bonn, which has been used extensively in research on epileptic seizure detection. The EEG signals were recorded using the standard 10-20 electrode placement system. The original dataset described in [18] consists of 5 folders, each with 100 files, where each file represents a single subject/person. Each file records brain activity for 23.6 seconds, and the corresponding time series is sampled into 4097 data points, where each data point represents the value of the EEG recording at a different point in time. Therefore, we have a total of 500 individuals, each with 4097 data points over a duration of 23.6 seconds.
For our purpose, every recording of 4097 data points is divided and shuffled into 23 chunks, each chunk containing 178 data points corresponding to 1 second. Therefore, we have 11500 rows of information (23×500), and each row contains 178 data points for 1 second, shown by the columns. The last column (column 179) represents the response variable, i.e., the category of the 178-dimensional input vector, and the explanatory variables are X1, X2, …, X178. Subjects in classes 2, 3, 4 and 5 do not have epileptic seizure, while subjects falling in class 1 have epileptic seizure. We considered binary classification, taking class 1 as seizure activity and the 5th category as class 0 for non-seizure [19].
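The binary selection described above can be sketched in MATLAB as follows; the file name and exact file layout are assumptions, and a header row or identifier column may need to be removed first.

% Minimal sketch: load the 179-column data (X1..X178 plus the class label)
% and keep only class 1 (seizure) and class 5 (non-seizure).
data   = readmatrix('epileptic_seizure_data.csv');  % assumed file name
X      = data(:, 1:178);        % 1-second EEG segments
labels = data(:, 179);          % original categories 1..5

keep = labels == 1 | labels == 5;          % binary subset
X    = X(keep, :);
y    = double(labels(keep) == 1);          % 1 = seizure, 0 = non-seizure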
IV. PROPOSED METHODOLOGY
The proposed transfer learning method consists of different stages which are described below.
A. Data Preprocessing
For the classification of seizure and non-seizure EEG signals, we divide the input data into two classes, seizure and non-seizure, and place each class in a separate matrix. Let the two matrices be
n×(m-s) for the non-seizure class (1)
n×(m-r) for the seizure class (2)
where m = r + s for both matrices, 's' is the number of samples belonging to class 1 (seizure) and 'r' is the number of samples belonging to class 0 (non-seizure). Each class matrix is then converted into a single vector, giving vectors of size 1×(n·(m-s)) and 1×(n·(m-r)) for the non-seizure and seizure classes respectively. The total numbers of sample points for class 0 and class 1 are given by equations (3) and (4):
r×n (3)
s×n (4)
Next, each of these totals is divided by 224; this value is chosen so that the data can be arranged in the input format required by the network models. The idea is to determine how many sample points can form complete 224×224 matrices: the integer quotient is multiplied by 224, and that many sample points are extracted from the totals obtained in equations (3) and (4).
The extracted vector is then reshaped using the exact quotient and 224. Suppose the matrix obtained after processing (3) and (4) is
[B]p×224 (5)
then we generate square matrices from the input matrix given in equation (5):
[B1]224×224, [B2]224×224, …

In our proposed method, the original data samples are represented by a 4591×178 matrix. This original input data matrix is divided into two separate classes, seizure and non-seizure: the seizure class is given by a 2292×178 matrix and the non-seizure class by a 2299×178 matrix. Therefore, the numbers of data points in the seizure and non-seizure classes are 407976 and 409222 respectively. In the next step, these totals are divided by 224; the quotient helps us determine the exact number of sample points to keep so that the total is completely divisible by 224. Since dividing the total number of data points by 224 gives a fractional quotient, we take the largest value not exceeding the total number of data points that is divisible by 224 with no remainder, which ensures minimal data loss. For both the seizure and non-seizure classes, this value is 407680.
We now have an individual 407680×1 vector for each class. In the next step, each vector is reshaped into a 1820×224 matrix, which is then divided into smaller matrices of size 224×224. Hence, we finally obtain 8 separate 224×224 matrices for the seizure class and 8 for the non-seizure class, giving 16 matrices in total. The matrices are labelled s01, s02, …, s08 and ns01, ns02, …, ns08 for the seizure and non-seizure classes respectively.
To prepare the final result as input, the obtained matrices are first converted to image format using the image function in MATLAB. The image function displays the data in an array as an image, where each element of the array specifies the color of one pixel. The resulting image is a 224-by-224 grid of pixels and can now be used as input.
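The steps above can be sketched in MATLAB roughly as follows; the variable names and output folder are assumptions, and the channel replication used to obtain RGB images is one simple substitute for the colormap rendering performed by the image function.

% Minimal sketch of the preprocessing for one class. Xc is the data matrix of
% that class, e.g. the 2292-by-178 seizure matrix (assumed variable name).
v     = reshape(Xc.', 1, []);              % flatten to a single row vector
nKeep = floor(numel(v) / 224) * 224;       % largest multiple of 224
v     = v(1:nKeep);                        % discard the remainder
B     = reshape(v, 224, []).';             % p-by-224 matrix, p = nKeep/224

nBlocks = floor(size(B, 1) / 224);         % number of full 224x224 blocks
for k = 1:nBlocks
    blk = B((k-1)*224+1 : k*224, :);       % one 224x224 block
    img = repmat(rescale(blk), 1, 1, 3);   % scale to [0,1], replicate to RGB
    imwrite(img, fullfile('images', 'seizure', sprintf('s%02d.png', k)));
end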
B. Model configuration of Transfer Learning
In the proposed method, we used pre-trained image classification networks that have already learned to extract informative features from natural images, and then used them as a starting point to learn a new task. The experiment is performed by selecting the GoogLeNet, ResNet101 and VGG19 networks trained on the ImageNet database. The network depth of the networks used is defined as the largest number of sequential convolutional or fully connected layers on the path from the input layer to the output layer. It should be noted that all the networks used in this work take RGB images as input. VGG19 has a depth of 19 layers, slightly smaller than GoogLeNet with 22 layers and much smaller than ResNet101 with 101 layers. In terms of CPU time, however, GoogLeNet is the fastest and VGG19 the slowest of the three networks. Although VGG19 provides better accuracy than GoogLeNet for a small number of epochs, we preferred GoogLeNet because of VGG19's high training and prediction time. The GoogLeNet network that we used is trained on ImageNet, although a version trained on the Places365 data set is also available. It classifies input images into 1000 object categories, has an image input size of 224-by-224-by-3 and is a DAG (directed acyclic graph) network with 144 layers. The sequence of the layers present in GoogLeNet is….
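A minimal sketch for loading and inspecting these pre-trained networks in MATLAB, assuming the corresponding model support packages are installed:

% Load a pre-trained network and inspect its input size and layer graph.
net = googlenet;                     % 144-layer DAG network trained on ImageNet
% net = resnet101;                   % alternatives used in the comparison
% net = vgg19;
inputSize = net.Layers(1).InputSize  % [224 224 3] for googlenet
analyzeNetwork(net)                  % interactive view of the layers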
C. Simulation Environment
In the first step, we load the input images into an image datastore, where the images are labelled automatically based on their folder names. An image datastore is used because it can handle large image collections, including data that do not fit in memory, and it efficiently reads batches of images during network training. In the second step, we use 70% of the data for training and 30% for validation, each stored in a separate datastore. The third step involves loading a pre-trained network such as GoogLeNet. If the network support packages are not installed, they can be downloaded via 'Add-Ons' and installed. After installation, the network is imported into the Deep Network Designer in the Apps section. The first element of the Layers property is the image input layer; for GoogLeNet, this layer requires an input size of 224-by-224-by-3, where 3 denotes the number of color channels.
In the fourth step, our aim is to replace the final layers according to the new dataset. The fully connected layer is replaced with a new fully connected layer whose number of outputs equals the number of classes in the new data set. To make the new layers learn faster, the learning rate factors of the weights and biases are set to 10. The classification layer is then replaced with a new one without class labels; the output classes of this layer are set automatically at training time. In the next step, we analyze the modified network and, if no errors are found, export it to the workspace. We then use an augmented image datastore to automatically resize the training images; augmentation also helps prevent overfitting. Next, we specify the training options and compute the validation accuracy. The training options are created using stochastic gradient descent with momentum. For our study, the initial learning rate is set to 1e-4. The maximum number of epochs is set separately for each run, as we tested 6, 50, 100 and 150 epochs (full training cycles over the entire data set). A mini-batch of 10 observations is used at each iteration. The training progress plot is turned on to display the training metrics at each iteration, where each iteration is an estimation of the gradient and an update of the network parameters. Since validation data are specified in the training options, the figure also shows the validation metrics each time the network is validated; the validation frequency is set to 6 iterations. The training data are shuffled before each training epoch and the validation data before each network validation, which is achieved by setting the 'Shuffle' parameter to 'every-epoch'. Training progress information can be printed in the command window by setting 'Verbose' to true; in our case it is set to false. Finally, we train the network on the training data using the CPU; if a GPU is available, it can be used instead.
After observing the training progress, we classify the validation images using the fine-tuned network and calculate the classification accuracy, taken as the mean correctness over the validation images. To further validate the results, we may display some random validation images together with their predicted labels and the predicted probabilities of those labels. Summarizing the process, we may visualize it as given in Fig2.

Fig2 Approach applied in reusing the pre-trained network: load the pre-trained network, replace the final layers, train the network, predict and assess accuracy, then deploy the results, improving the network if the accuracy is insufficient.
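Putting the steps of this section together, a minimal MATLAB sketch of the workflow might look as follows. The image folder layout is an assumption; the layer names 'loss3-classifier' and 'output' refer to GoogLeNet as shipped with the Deep Learning Toolbox, and the option values follow those stated above.

% Minimal sketch of the transfer-learning workflow (assumed folder layout:
% 'images/seizure' and 'images/nonseizure').
imds = imageDatastore('images', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.7, 'randomized');  % 70/30 split

net       = googlenet;
inputSize = net.Layers(1).InputSize;           % [224 224 3]
lgraph    = layerGraph(net);

numClasses = numel(categories(imdsTrain.Labels));   % 2: seizure / non-seizure
newFc = fullyConnectedLayer(numClasses, 'Name', 'new_fc', ...
    'WeightLearnRateFactor', 10, 'BiasLearnRateFactor', 10);
lgraph = replaceLayer(lgraph, 'loss3-classifier', newFc);  % final FC layer
lgraph = replaceLayer(lgraph, 'output', classificationLayer('Name', 'new_out'));

% Augmented datastores resize the images to the network input size.
augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain);
augVal   = augmentedImageDatastore(inputSize(1:2), imdsVal);

options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...
    'MaxEpochs', 6, ...                 % also tested with 50, 100 and 150
    'MiniBatchSize', 10, ...
    'ValidationData', augVal, ...
    'ValidationFrequency', 6, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', false, ...
    'Plots', 'training-progress');

trainedNet = trainNetwork(augTrain, lgraph, options);

% Classify the validation images and compute the mean accuracy.
predLabels = classify(trainedNet, augVal);
accuracy   = mean(predLabels == imdsVal.Labels)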

V. RESULTS OF DIFFERENT CLASSIFICATION TECHNIQUES


In order to validate the effectiveness of the presented method, we performed the classification of EEG signals with traditional as well as modern machine learning and deep learning techniques. Among the traditional classification methods, we observed that the fine Gaussian support vector machine outperforms every other technique, providing an accuracy of 99.8% with a prediction speed of 6500 obs/sec and a training time of 43.377 sec. The best performance in discriminant analysis was given by the quadratic discriminant, with an accuracy of 96.8%, a prediction speed of 14000 obs/sec and a training time of 20.802 sec. Among the various ensemble methods implemented, we obtained the best results from bagged trees, with an accuracy of 98.7%, a prediction speed of 10000 obs/sec and a training time of 234.45 sec. Logistic regression provided an accuracy of only 62.9%, with a prediction speed of 47000 obs/sec and a training time of 35.53 sec. Both naïve Bayes methods showed good results, with an accuracy of 99.2% for Gaussian naïve Bayes and 99.4% for kernel naïve Bayes; however, the kernel method was much slower, with a prediction speed of only 120 obs/sec and a training time of 159.8 sec, compared with 25000 obs/sec and 4.9649 sec for the Gaussian method. The model trained with fine k-nearest neighbor (KNN) achieved an accuracy of 90.5% at 1300 obs/sec with a training time of 88.478 sec. The fine decision tree resulted in 93.8% accuracy with a 35000 obs/sec prediction speed and a 10.77 sec training time. Therefore, among all the implemented machine learning techniques, the fine Gaussian support vector machine shows the best performance, with a high accuracy of 99.8%. The performances of the various traditional machine learning techniques are summarized in Table1.
Table1. Performance of different machine learning techniques

Method                                          Accuracy(%)   Prediction Speed(obs/sec)   Training time(sec)
Discriminant Analysis - Linear discriminant     63.8          12000                       24.392
Discriminant Analysis - Quadratic discriminant  96.8          14000                       20.802
Ensemble method - Bagged trees                  98.7          10000                       234.45
Ensemble method - Boosted trees                 98.4          11000                       251.05
Ensemble method - RUSBoosted trees              91.7          15000                       298.89
Ensemble method - Subspace discriminant         65.0          1900                        256.99
Ensemble method - Subspace KNN                  91.0          140                         401.22
K Nearest Neighbor - Coarse KNN                 57.7          1200                        139
K Nearest Neighbor - Cosine KNN                 87.3          1400                        137.88
K Nearest Neighbor - Cubic KNN                  80.0          46                          456.39
K Nearest Neighbor - Fine KNN                   90.5          1300                        88.478
K Nearest Neighbor - Medium KNN                 80.7          1300                        66.71
K Nearest Neighbor - Weighted KNN               82.8          1600                        135.78
Logistic Regression                             62.9          47000                       35.53
Naïve Bayes - Gaussian naïve Bayes              99.2          25000                       4.9649
Naïve Bayes - Kernel naïve Bayes                99.4          120                         159.8
Support Vector Machine - Coarse Gaussian SVM    84.6          8300                        48.068
Support Vector Machine - Cubic SVM              95.6          16000                       35.868
Support Vector Machine - Fine Gaussian SVM      99.8          6500                        43.377
Support Vector Machine - Linear SVM             61.2          2900                        39.993
Support Vector Machine - Medium Gaussian SVM    99.1          21000                       45.588
Support Vector Machine - Quadratic SVM          92.4          9200                        31.189
Decision Tree - Coarse tree                     84.7          100000                      11.102
Decision Tree - Fine tree                       93.8          35000                       10.77
Decision Tree - Medium tree                     91.7          48000                       10.077

The accuracy of the fine Gaussian SVM model can be evaluated using the confusion matrix shown in Fig3. A confusion matrix tabulates the model's predicted labels against the true class labels and is used to describe the performance of a classification model on a set of input data.
Fig3. Confusion matrix for fine Gaussian SVM
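A confusion matrix of this kind can be produced with a short MATLAB sketch such as the following, assuming true labels ytest and predictions yPred from the earlier SVM example:

% Plot a confusion matrix for the binary classifier (assumed variables).
cm = confusionchart(ytest, yPred, ...
    'RowSummary', 'row-normalized', ...       % per-class recall
    'ColumnSummary', 'column-normalized');    % per-class precision
cm.Title = 'Fine Gaussian SVM: seizure vs non-seizure';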
Further, the fine Gaussian SVM model can be evaluated using the ROC curve given in Fig4. The ROC (Receiver Operating Characteristic) curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as its discrimination threshold is varied. It is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

Fig4 ROC curve for fine Gaussian SVM taking positive class as ‘seizure’
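An ROC curve like the one in Fig4 can be computed with perfcurve, as in the sketch below; the score-column index assumes the SVM class order is [0 1], with 1 denoting the seizure class.

% ROC curve and AUC for the fine Gaussian SVM (assumed variables from above).
[~, scores] = predict(svmModel, Xtest);              % scores(:,2): class-1 score
[fpr, tpr, ~, auc] = perfcurve(ytest, scores(:, 2), 1);
plot(fpr, tpr)
xlabel('False positive rate'), ylabel('True positive rate')
title(sprintf('ROC for fine Gaussian SVM (AUC = %.3f)', auc))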

In the deep learning approach, we classified the seizure and non-seizure signals using an artificial neural network. The tansig transfer function is used and the model is trained with the Levenberg-Marquardt backpropagation algorithm. Although Levenberg-Marquardt backpropagation requires more memory than other algorithms, it is one of the fastest available. The sample data are divided into 80% training, 10% validation and 10% testing data, with 10 hidden neurons. A training performance goal of 10^-3 is set, with 500 epochs. The validation performance obtained for the trained network is shown in Fig5: the best validation performance of 0.00076621 is achieved at epoch 38. The model gives a gradient of 0.062346 and a training gain (Mu) of 1e-05, as depicted in Fig6.
Fig5. Validation performance of the model using tansig transfer function and Levenberg-Marquardt algorithm

Fig6 Training result obtained from neural network technique
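A minimal MATLAB sketch of this network configuration, assuming a 178-by-N feature matrix X and a 1-by-N binary target vector T (names are assumptions), might look as follows:

% Pattern-recognition network: 10 hidden neurons, Levenberg-Marquardt training.
net = patternnet(10, 'trainlm');
net.layers{1}.transferFcn  = 'tansig';   % hidden-layer transfer function
net.divideParam.trainRatio = 0.80;       % 80% training
net.divideParam.valRatio   = 0.10;       % 10% validation
net.divideParam.testRatio  = 0.10;       % 10% testing
net.trainParam.goal        = 1e-3;       % performance goal
net.trainParam.epochs      = 500;        % maximum number of epochs

[net, tr] = train(net, X, T);            % tr records the training progress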

In this paper we mainly emphasize the results obtained with the transfer learning approach. GoogLeNet, ResNet101 and VGG19 are selected as the pre-trained networks for the input data. After analyzing the results, we focus on those obtained with GoogLeNet, even though it gives lower accuracy than VGG19 for a small number of epochs; the main reason for choosing GoogLeNet is that its training time is much shorter than that of VGG19. In order to validate the test results, we ran the transfer learning starting from 6 epochs and increasing the number to 50, 100 and then 150, keeping the learning rate the same. With 6 epochs, VGG19 gave an accuracy of 100% for the detection of seizure signals and 99.6% for non-seizure signals, taking a training time of 10.45 min, as presented by the training progress shown in Fig7. For the same number of epochs, GoogLeNet gave accuracies of 98.6% and 98.5% for seizure and non-seizure respectively, but in much less time, within 41 sec, as shown in Fig8. As we increase the number of epochs to 50, 100 and 150, the mean accuracy of GoogLeNet and VGG19 reaches 100% for both the seizure and non-seizure classes, but in terms of training time GoogLeNet outperforms VGG19: GoogLeNet takes 7.30 min to train for 150 epochs, whereas VGG19 takes 42.33 min for the same number of epochs, which is much higher. ResNet101 also achieved high accuracies of 99.7% and 99.9% in distinguishing seizure and non-seizure respectively at 150 epochs, but for a small number of epochs such as 6 it provided accuracies of only 77.9% and 80.5% for seizure and non-seizure respectively, taking 3.52 min for training, as shown by its training progress in Fig9.

Fig7 Training progress for vgg19 for 6 epochs

Fig8 Training progress for googlenet for 6 epochs


Fig9 Training progress for resnet101 for 6 epochs

VI. RESULT ANALYSIS AND DISCUSSION


Different classification techniques were carried out for the detection of seizure and non-seizure EEG signals. Traditional machine learning techniques were implemented and their results compared with the proposed transfer learning methodology. Among the various machine learning techniques, the quadratic discriminant performed best in discriminant analysis, as shown by its confusion matrix in Fig10 and the corresponding ROC curve in Fig11.

Fig10 Confusion matrix for quadratic discriminant


Fig11. ROC curve for quadratic discriminant taking positive class as ‘Seizure’

Further observing the results, we determined that bagged trees outperform every other technique among the ensemble methods, as shown by the confusion matrix in Fig12 and the ROC curve in Fig13.

Fig12 Confusion matrix for Bagged tree ensemble method


Fig13 ROC curve for bagged tree ensemble method for positive class as ‘Seizure’

Fine KNN gave better accuracy than the other KNN algorithms, as can be observed from the confusion matrix shown in Fig14 and the ROC curve in Fig15.

Fig14 Confusion matrix for fine KNN


Fig15 ROC curve for fine KNN for positive class as ‘Seizure’

Logistic regression falls short in accuracy when compared with the other machine learning techniques. The corresponding confusion matrix is depicted in Fig16 and its ROC curve in Fig17.

Fig16 Confusion matrix of logistic regression


Fig17 ROC curve of logistic regression for positive class as ‘seizure’

Both naïve Bayes methods, Gaussian naïve Bayes and kernel naïve Bayes, provided high accuracy, but kernel naïve Bayes outperforms the other. The obtained confusion matrix is shown in Fig18 and its ROC curve in Fig19.

Fig18 Confusion matrix of kernel naïve bayes

Fig19 ROC curve of kernel naïve bayes for positive class as ‘Seizure’

The fine decision tree provides more accurate results than the other decision trees; its confusion matrix and ROC curve are shown in Fig20 and Fig21.
Fig20 Confusion matrix of fine decision tree

Fig21 ROC curve of fine decision tree for positive class as ‘Seizure’
The fine Gaussian support vector machine provides the best accuracy among the machine learning techniques; its confusion matrix and ROC curve were already shown in the previous section.
The proposed transfer learning method is tested by varying the pre-trained network and the number of epochs, keeping the learning rate constant. Table2 presents a summary of the complete study conducted on the classification of seizure and non-seizure signals using the transfer learning technique. Although machine learning and deep learning methods can potentially detect seizures from EEG signals, transfer learning is an advanced approach that not only provides high accuracy but also saves us from training the network from scratch.

Table2 Performance comparison of used pre-trained networks

Pre-trained network   No. of layers   Type of architecture   Size (MB)   Epochs   Non-seizure accuracy (%)   Seizure accuracy (%)   Training time (min)
GoogLeNet             144             DAG Network            27          6        98.5                       98.6                   0.68
GoogLeNet             144             DAG Network            27          50       100                        100                    2.19
GoogLeNet             144             DAG Network            27          100      100                        100                    4.52
ResNet101             347             DAG Network            167         6        80.5                       77.9                   3.52
ResNet101             347             DAG Network            167         50       98                         99.7                   8.48
ResNet101             347             DAG Network            167         100      99.7                       97.1                   30.10
VGG19                 47              Series Network         535         6        99.6                       100                    10.45
VGG19                 47              Series Network         535         50       100                        100                    13.7
VGG19                 47              Series Network         535         100      100                        100                    227.58

In this paper, we evaluated both traditional machine learning techniques and the proposed transfer learning framework in order to establish a relative comparison between the two. The performances of the different techniques are shown in Table3.

Table3 Comparison of performances of the implemented classification methods

Classification Method          Parameters
Fine Gaussian SVM              Accuracy = 99.8%; Prediction speed = 6500 obs/sec; Training time = 43.377 sec
ANN                            Best validation performance = 0.00076621 (epoch 38); Gradient = 0.062346; Mu = 1e-05
Transfer Learning (GoogLeNet)  Validation accuracy = 100% (higher epoch counts); Seizure class validation accuracy = 98.6% and non-seizure class validation accuracy = 98.5% (6 epochs)

Various classification schemes for EEG signals have been carried out using different classification methods and datasets. A comparison of some of these methods with the proposed method is shown in Table4. It can be observed from Table4 that the classification accuracies of these methods are lower than that of the proposed transfer learning approach. Hence, the proposed method can be used efficiently for the discrimination of seizure and non-seizure signals.
Table4. Comparison of different methods

Schemes                          Accuracy                    Classification method
Y. Jiang et al. [12]             95% for most of the cases   Transfer learning, semi-supervised learning, TSK fuzzy system
L. Xie et al. [14]               95.43%                      Transductive transfer learning
Y. Yuan et al. [13]              94.37%                      Multi-view deep learning framework
U. Rajendra Acharya et al. [10]  88.67%                      Convolutional neural network
I. Ullah et al. [8]              99.1±0.9%                   Pyramidal 1D CNN
Ram Bilas Pachori et al. [9]     97.75%                      Artificial neural network
Proposed method                  100%                        Transfer learning

VII. CONCLUSION
In this work, a transfer learning based method is proposed for the discrimination of seizure and non-seizure EEG signals. For this method, the pre-trained network GoogLeNet, which has comparatively low CPU time, has been employed and reused by replacing its final layers and training it on the University of Bonn dataset. The results are assessed and deployed once sufficient accuracy is reached. The results show that the accuracy is above 99%, reaching a high accuracy of 100% when the number of epochs is increased. Results from the other pre-trained networks, ResNet101 and VGG19, are also evaluated. The test results show that seizure and non-seizure classification based on transfer learning is a promising approach and can thus be used efficiently.
Future research can test the results by varying the learning rate and other parameters. Validation on a larger dataset with more epileptic patients can also be carried out to improve the generalization ability, and more work can be done using other efficient pre-trained networks.

VIII. REFERENCES

[1] https://imotions.com/blog/what-is-eeg/
[2] https://faculty.washington.edu/chudler
[3] https://www.who.int/news-room/fact-sheets/detail/epilepsy
[4] https://www.mayoclinic.org/diseases-conditions/epilepsy/diagnosis-treatment
[5] C. Zhang, M. A. Bin Altaf and J. Yoo, "Design and Implementation of an On-Chip Patient-Specific Closed-Loop Seizure Onset and Termination Detection System," IEEE Journal of Biomedical and Health Informatics, vol. 20, no. 4, pp. 996-1007.
[6] U. Rajendra Acharya, H. Fujita, Vidya K. Sudarshan, Shreya Bhat, Joel E. W. Koh, "Application of entropies for automated diagnosis of epilepsy using EEG signals," Knowledge-Based Systems, vol. 88, November 2015, pp. 85-96.
[7] K. Patel, C. Chua, S. Fau and C. J. Bleakley, "Low power real-time seizure detection for ambulatory EEG," 2009 3rd International Conference on Pervasive Computing Technologies for Healthcare, London, 2009, pp. 1-7.
[8] I. Ullah, M. Hussain, E.-u.-H. Qazi and H. Aboalsamh, "An Automated System for Epilepsy Detection Using EEG Brain Signals Based on Deep Learning Approach," 2018.
[9] Ram Bilas Pachori, Shivnarayan Patidar, "Epileptic seizure classification in EEG signals using second-order difference plot of intrinsic mode functions," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, February 2014, pp. 494-502.
[10] U. Rajendra Acharya, Shu Lih Oh, Yuki Hagiwara, Jen Hong Tan, Hojjat Adeli, "Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals," Computers in Biology and Medicine, vol. 100, 1 September 2018, pp. 270-278.
[11] Mingrui Sun, Fuxu Wang, Tengfei Min, Tianyi Zang, "Prediction for High Risk Clinical Symptoms of Epilepsy Based on Deep Learning Algorithm," IEEE Access, vol. PP, no. 99, pp. 1-1.
[12] Y. Jiang et al., "Seizure Classification From EEG Signals Using Transfer Learning, Semi-Supervised Learning and TSK Fuzzy System," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 12, Dec. 2017, pp. 2270-2284.
[13] Y. Yuan, G. Xun, K. Jia and A. Zhang, "A Multi-View Deep Learning Framework for EEG Seizure Detection," IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 1, Jan. 2019, pp. 83-94.
[14] L. Xie, Z. Deng, P. Xu, K. Choi and S. Wang, "Generalized Hidden-Mapping Transductive Transfer Learning for Recognition of Epileptic Electroencephalogram Signals," IEEE Transactions on Cybernetics, vol. 49, no. 6, pp. 2200-2214, June 2019.
[15] https://elitedatascience.com/machine-learning-algorithms
[16] Atsu S. S. Dorvlo, Joseph A. Jervase, Ali Al-Lawati, "Solar radiation estimation using artificial neural networks," Applied Energy, vol. 71 (2002), pp. 307-319.
[17] https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning
[18] Andrzejak RG, Lehnertz K, Rieke C, Mormann F, David P, Elger CE (2001), "Indications of nonlinear deterministic and finite dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state," Phys. Rev. E, vol. 64, 061907.
[19] https://archive.ics.uci.edu/ml/datasets/Epileptic+Seizure+Recognition
