Ieee Access Image Malware Aug22

This document summarizes a research article that proposes a novel combination of lightweight deep learning models for image-based malware classification. Specifically, it combines small convolutional neural networks (CNNs) with an advanced variational autoencoder (VAE) enhanced by channel and spatial attention mechanisms. The goal is to achieve high classification performance while keeping the model architecture and computational requirements small enough for real-time applications. Experimental results on malware image datasets showed the proposed approach outperformed other state-of-the-art techniques in terms of accuracy, while requiring less time and resources.

This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3198072


An Attention Mechanism for Combination of CNN and VAE for Image-Based Malware Classification
TUAN VAN DAO, HIROSHI SATO AND MASAO KUBO
Department of Computer Science, National Defense Academy, 1-10-20 Hashirimizu, Yokosuka, Kanagawa, Japan
Corresponding author: Tuan Van Dao (e-mail: [email protected]).

ABSTRACT Malware is increasing dramatically in both number and complexity. Several techniques and methodologies have been proposed to detect and neutralize malicious software. However, traditional methods based on the signatures or behaviors of malware often require considerable computational time and resources for feature engineering. Recent studies have applied machine learning to the problems of identifying and classifying malware families. Combining many state-of-the-art techniques has become popular, but choosing an appropriate and efficient combination is still a problem. Classification performance has been significantly improved using complex neural network architectures. However, the more complex the network, the more resources it requires. This paper proposes a novel lightweight architecture that combines small Convolutional Neural Networks with an advanced Variational Autoencoder, enhanced by channel and spatial attention mechanisms. In various experiments on the unbalanced and balanced Malimg datasets, the proposed method outperforms other cutting-edge techniques while keeping processing time acceptable.

INDEX TERMS Malware Classification, Variational Autoencoder, channel attention, spatial attention,
latent representation, information security.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

T.V. Dao, H. Sato, M. Kubo et al.: A Novel Combination of Light-weight Deep Learning Model for Image-Based Malware Classification

I. INTRODUCTION
The Internet has become an essential part of our lives. However, while providing excellent service, it also raises many security threats. Malware is a powerful tool that lets an attacker intrude on, sabotage, and control a target indirectly, for example as a remote administration tool operated through the Internet. The abuse of various malware has a significant impact on cyber-security and threatens individuals, society, and countries [1], [2]. Malware authors mix different evasion techniques, such as user interaction, environment awareness, obfuscation, code compression, and code encryption, to change the appearance of existing malicious code and bypass Anti-virus Systems and Intrusion Detection Systems (IDS). However, the new variants often still have the same malicious intentions and characteristics as the original malware.

There are two malware detection and analysis techniques: static analysis and dynamic analysis. Static analysis investigates malware without executing it [3]. This type of analysis utilizes various information, such as Application Programming Interface (API) calls, the entropy of files, and strings, all of which are embedded in the raw bytes of the Portable Executable (PE) [4]. The main limitation of static analysis is that it is not sufficient in the case of code obfuscation and zero-day malware. In addition, the analysis is time-consuming if the malware is mixed with many disruptive methods.

On the other hand, dynamic analysis investigates malware as it is executed in simulated environments such as sandboxes or virtual machines [5]. Unlike static analysis, it does not require disassembling, decompressing, or unpacking the PE file in advance to obtain the malware's features. Its main limitation is that it may not always uncover malicious behavior, because some malware can detect virtual environments and change its behavior. Moreover, because of the rapid development of many automatic malware creation tools [6], these methods cannot keep up with the speed of malware generation.

Machine learning has become more potent because its highly developed algorithms can solve most problems encountered in almost every field. Several methods extract elements from malicious software, such as API calls [7],
[8], and feed them into machine learning models. Some take advantage of Natural Language Processing (NLP) to process string elements for detection [9] and classification tasks [10]. Existing malware classification research uses machine learning techniques such as Support Vector Machine (SVM) [11], K-Nearest Neighbor [12], and Random Forest [13].

An alternative to these machine learning-based methods for malware classification is the vision-based approach [14-39]. Although attackers use obfuscation techniques to achieve spoofing, malware variants from the same family still maintain similar code and data order, even though these may not appear in the same location. Convolutional Neural Networks (CNN) can extract the common features of a family. Conti et al. proposed a method to visualize malware binaries as grayscale images and noticed that visual analysis of a malware binary helps distinguish the various regions of data in the image [15]. The advantage of malware visualization analysis is that it does not require any decompiler or dynamic running environment. Moreover, in [16] malware samples are converted into RGB (Red, Green, Blue) images by encoding and arranging the bytes of binary files. A color image can carry more information than a grayscale image.

The growth of high-performance computing, coupled with huge CNN architectures, has made it possible to process images at a higher level of complexity. However, recent studies indicate that a simple network structure with fewer parameters gives relatively satisfactory results and can be applied to low-profile devices such as IoT devices [17] or smartphones [18]. Taking advantage of different well-known CNN architectures, transfer learning is also applied to image-based malware classification [19], [24], [25], [28], [30]. By using and fine-tuning pre-trained CNNs, several CNNs together can extract richer features than simple ones [19].

Another approach to extracting the features of an image is the Autoencoder (AE). AE is an unsupervised deep learning algorithm with a unique neural network structure. AE transforms the input into an output with minimal reconstruction error and can work with small data. However, AE often falls into overfitting, and the problem of organizing the latent space is complex. The VAE was then introduced as an autoencoder whose training is regularised to avoid overfitting and to ensure that the latent space has suitable properties that enable a generative process. While VAE can represent global features through the latent space, CNN captures local features through small kernels. The combination of VAE and CNN promises to obtain an overall feature of the object [32]. However, this combination has not yet achieved the expected performance.

Attention mechanisms [20] have been a significant breakthrough in deep learning. They have been widely used in image recognition, NLP, and speech recognition. However, few studies on malware classification are based on attention mechanisms in terms of computer vision. Moreover, compared to multi-head attention [20], the type of attention considered here is oriented toward feed-forward CNNs and can be applied at every convolutional block in deep networks.

So far, recent studies have focused mainly on the depth and width of neural networks and on increasing the number of features, but not yet on enriching the quality of object features. This paper aims to gather as many worthwhile features as possible while keeping the model architecture small, by utilizing a CNN and combining it with a new type of Variational Autoencoder enhanced by the attention mechanism, which we call "AVAE". The AVAE can provide more discriminative features and can map and refine the original feature space into a latent representation.

The main contribution of this paper is an image-based malware classification system built through feature synthesis from VAE, CNN, and the attention mechanism. Because the processing depends only on images, the system does not require in-depth knowledge of the malware or of the environment to determine its behavior. Moreover, some classifiers can give the result in under a second, so our model can be applied in real-time countermeasures against malware.

The rest of the paper is organized as follows: Section 2 discusses related work on popular and recent techniques in malware detection and classification. Section 3 illustrates the proposed model in detail. Section 4 evaluates the performance of the proposed approach. Finally, we summarize our work in Section 5.

II. RELATED WORK
In this section, we investigate various recent studies on image-based malware classification, ranging from models with simple structures to complex ones; some hybrid models with different structural combinations have achieved high performance in malware classification.

Nataraj et al. were the first to propose an approach for visualizing and classifying malware using image processing techniques [12]. They visualized malware as a grayscale image based on the observation that images of the same class were very similar in layout and texture. They utilized the GIST descriptor, based on wavelet decomposition of an image, as the feature extractor and k-nearest neighbor (kNN) as the classifier. The paper achieved an accuracy of 97.18% on the dataset it introduced, Malimg, which contains 9,339 malware samples from 25 different malware families. Other feature descriptors, such as HOG and HOC+GIST, have also been applied [22]. However, this method is not suitable for processing a massive amount of malware because of its high computational cost. Naeem et al. [23] utilized a new type of feature descriptor that combines and balances collective local and global feature vectors. As a result, they achieved a high classification rate of 98% on the Malimg dataset.

Current research focuses on building complex network models with deep CNNs, for example, more than ten Conv layers [2], VGG16 in [24], VGG19 in [25], or a combination of multiple CNN architectures [19]. On the other hand, [26] minimizes parameters to speed up training. That model achieves an accuracy less than 1% below the state-of-the-art result while reducing by 99.7%
the number of trainable parameters of the best model in the comparison section.

Verma et al. [27] try to enrich the extracted malware features by concatenating CNN features with 35 other statistical texture features. Many CNNs require high-resolution images for training. The input image size of these networks is usually around 224x224 to 299x299 [28]. The larger the size, the higher the computational cost. Roseline et al. [29] employed lightweight CNNs with merely three convolutional layers of increasing depth: 16, 32, and 64. The model is optimized by Adam and utilizes categorical cross-entropy loss, and the input image is resized to 32x32. With this setting, [29] achieved an accuracy of 97.68% over 50 epochs.

Rezende et al. [30] transferred the first 49 layers of a ResNet-50 trained on ImageNet to the malware classification task. The frozen layers can be seen as learned feature extraction layers. The authors replaced the last layer, a 1000-way fully connected softmax, with a 25-way fully connected one, according to the number of classes in the Malimg dataset. After 750 epochs, the paper reached an average accuracy of 98.62% with 10-fold cross-validation. They also compared features extracted from the Deep CNN (DCNN) with GIST features using the same kNN classifier. The experimental result showed that ResNet-50 performed better than handcrafted GIST by 0.52%, with 98.00% and 97.48%, respectively.

Vasan et al. [19] utilized an ensemble of CNNs. They assumed that different CNNs provide different semantic representations of the image; therefore, higher-quality features are extracted than with traditional methods. VGG16 and ResNet-50 pre-trained on ImageNet were fine-tuned for malware images. This ensemble method achieves high detection accuracy with a low false rate.

Anandhi et al. [21] introduced another type of deep CNN with densely connected networks (DenseNet). DenseNet comprises dense blocks, a composite function, and a transition layer. This architecture addresses the vanishing gradient problem, which is caused by the shrinking of the gradient through a deep network. The authors utilized DenseNet201, which is 201 layers deep, and achieved an accuracy of 98.97% on the original Malimg dataset and 99.36% after combining similar families, C2LOP and Swizzor.

Çayır et al. [26] proposed a simple architecture called Random CapsNet Forest instead of engineering complex CNN architectures. This model contains capsules similar to autoencoders, with each capsule learning how to represent an instance of a given class. Although the proposed method does not use data augmentation, data resampling, transfer learning, or a weighted loss function, it still achieved acceptable results with an accuracy of 98.72%.

Nisa et al. [31] combine the features extracted from pre-trained AlexNet and Inception-V3. These fused features are then classified using different classifiers such as SVM, kNN, and Decision Tree (DT). [31] achieved an accuracy of 98.7% on the Malimg dataset. The result improved to 99.3% when augmentation was applied to turn Malimg into a balanced dataset.

Lee et al. [1] illustrate the effectiveness of the autoencoder by applying multiple AEs. Each AE model classifies only one type of malware and is trained using only samples from the corresponding family. As a result, the authors achieve an accuracy of 94.03% for a system in which all AEs share the same network structure and 97.75% with various AEs. Moreover, the model improves by 0.46%, from 97.75% to 98.21%, when similar classes are combined. However, the approach still misclassifies quite a lot of samples, showing that AE has not been effective in extracting the characteristics of image-based malware.

Burks et al. [32] inserted a VAE into a handcrafted Residual Network (RN); the resulting accuracy of 85% was 2% and 6% higher than that of the original RN and a Generative Adversarial Network (GAN) model, respectively.

Awan et al. [25] applied spatial convolutional attention, called dynamic spatial convolution, to the VGG19 network. This attention utilizes a global average pooling (GAP) mechanism, rescales the output of GAP with a lambda layer, and feeds it into a dropout layer with a rate of 0.25 before the fully connected layer; the authors used softmax as the traditional CNN classifier. The performance was evaluated on the Malimg dataset and reached an accuracy of 97.68%. Ma et al. [33] applied the attention mechanism [20] in a handcrafted architecture with five parts: Input layer, Local Attention, Global Attention layer, Dense layer, and Output layer. Compared with other methods, the combination of the attention mechanism and CNN achieved the best classification accuracy of 96.09% on Microsoft's Kaggle dataset.

B. N. Narayanan et al. [42] state that each malicious program belonging to a family has a distinct pattern. The authors use Principal Component Analysis (PCA) as a linear dimension reduction that saves computational time at the cost of losing some valuable information. As a result, the performance obtained is still far behind CNN.

V. S. P. Davuluru et al. [43] indicate a trade-off between computational time and model complexity. The authors also highlight the advantages of using a CNN as a feature extractor. Using an SVM instead of the original CNN classifier (softmax) can overcome the drawback of a limited, unbalanced dataset.

B. N. Narayanan et al. [44] proposed a novel approach that fuses a Natural Language Processing (NLP)-based approach, LSTM, with image-based approaches, including a simple CNN, AlexNet, ResNet, and VGG16, into a single simple architecture. The combination of several different CNN feature extractors is also somewhat similar to the characteristics of the DenseNet model, which concatenates intermediate layers [21]. The authors extract 9 features from each of those architectures, compiling a suite of 45 in total. Choosing the appropriate features from the total number of features in each architecture also becomes an optimization problem for two different architectures. Besides, recent malware is obfuscated, and the obtained opcode sequences will be entangled with a lot of noise, leading to limitations in finding
relationships between words and in the quality of the LSTM embedding. As a result, it will affect the assembled architecture. Besides, observing the malware visualizations from the Microsoft Malware Classification Challenge (BIG 2015) dataset, it appears that different families have distinctly different images that the naked eye can distinguish, and the number of families is not too large. In the Malimg dataset, by contrast, there are up to 25 families, and several malware samples from different families look the same and cannot be distinguished by the human eye. Therefore, data of higher complexity needs an additional refinement mechanism; in this study, we focus on filtering and selecting essential features so that the model can process data with high similarity even if the naked eye cannot distinguish it.

FIGURE 1. The structure of a binary file
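The byte-to-pixel visualization that the image-based studies surveyed in this section rely on can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `bytes_to_grayscale`, the default `width`, and the zero-padding of the last row are assumptions made here for the example.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 64) -> List[List[int]]:
    """Interpret every byte as one pixel (0 = black, 255 = white).

    The bytes are laid out row by row in a 2-D array of the given width.
    The last row is padded with zeros (an assumption of this sketch).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = -(-len(data) // width)  # ceiling division
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Usage: four bytes (hex 4d 5a 90 00) arranged in rows of two pixels.
image = bytes_to_grayscale(bytes.fromhex("4d5a9000"), width=2)
print(image)  # [[77, 90], [144, 0]]
```

Because the width is fixed, the height of the resulting image grows with the file size, so files of different sizes yield images of different heights.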

III. PROPOSED METHOD

A. IMAGE REPRESENTATION FOR MALWARE
To visualize a malware sample as an image, we interpret every byte as one pixel. Figure 1 shows a binary file as the hexadecimal representation of the malware's PE. The first row is the offset of the memory address. The second one represents the pairs of hexadecimal digits. Each hexadecimal pair is treated as a single decimal number, which serves as a pixel value of the image. The resulting array is organized as a 2-D array whose values are in the range [0,255] (0: black, 255: white). The size of the image depends on the binary file's size. Table 1 presents the different heights of malware images, which result from the different sizes of malware files while the width of the images is fixed. Table 1 also illustrates that converting malware into grayscale images does not take long; common malicious code less than 1Mb in size takes no more than 0.01s to convert.

We then convert the grayscale images into three-channel RGB images by replicating the grayscale channel three times. Figure 2 illustrates part of the malware plots from the Malimg dataset, which Nataraj et al. [12] created. It can be observed that images from a given family are similar while distinct from those of a different family. New variants are often created by changing a small part of the code. Therefore, if the predecessor is reused, the result is very similar. Furthermore, by converting malware into an image, it is possible to detect the small changes while keeping the comprehensive structure of samples belonging to the same family.

FIGURE 2. Samples from the Malimg dataset

TABLE 1. Image height for different malware file sizes

File size      Image height   Conversion time (ms)
<10 kB         32             0.105
10kB-30kB      64             0.312
30kB-60kB      128            0.428
60kB-100kB     256            0.571
100kB-200kB    384            0.748
200kB-500kB    512            0.665
500kB-1Mb      768            0.814
>1Mb           1024           2.85

B. VARIATIONAL AUTOENCODER
VAE [34] is a variant of the autoencoder (AE) that also consists of an encoder and a decoder. The autoencoder is trained solely to encode and decode with as little loss (reconstruction loss) as possible, no matter how the latent space is organized. Therefore, it is hard to guarantee that the encoder will organize the latent space well. Moreover, AE often faces an overfitting problem, which causes irregularity in the latent space. The VAE, on the other hand, applies a Gaussian probability density $q_\phi(z \mid x)$, which makes the encoder return a distribution over the latent space. VAE tackles the latent space irregularity problem by adding to the loss function a regularisation term over that returned distribution, to ensure a better organization of the latent space.

Let ϕ = (W, b) and θ = (W', b'). The loss function of VAE includes two terms as follows:

$$l_{VAE}(x_i, \theta, \phi) = -\mathbb{E}_{q_\phi(z \mid x_i)}\left[\log p_\theta(x_i \mid z)\right] + D_{KL}\left(q_\phi(z \mid x_i) \,\|\, p(z)\right) \tag{1}$$

The first term is the expected negative log-likelihood of the i-th data point. This term is also called the reconstruction error (RE) of VAE since it forces the decoder to learn to reconstruct the input data. The second term is the Kullback-Leibler (KL) divergence between the encoder's distribution
$q_\phi(z \mid x)$ and the expected distribution $p(z)$. This divergence measures the relation of q and p [34]. In the VAE, $p(z)$ is specified as a standard normal distribution with mean zero and unit standard deviation, denoted as N(0, 1). If the encoder outputs representations z that differ from the standard normal distribution, it receives a penalty in the loss. Since the gradient descent algorithm is not suitable for training a VAE with a random variable z sampled from $p(z)$, the loss function of the VAE is re-parameterized as follows:

$$l_{VAE}(x_i, \theta, \phi) = -\frac{1}{K}\sum_{k=1}^{K} \log p_\theta(x_i \mid z_{i,k}) + D_{KL}\left(q_\phi(z \mid x_i) \,\|\, p(z)\right) \tag{2}$$

where $z_{i,k} = g_\phi(\epsilon_{i,k}, x_i)$ and $\epsilon_{i,k}$ is drawn from N(0, 1).

After training, the latent layers of VAE can be utilized for a classification task. The original data is then passed through the encoder part of VAE to generate the latent representation.

Symbol                 Meaning
W                      Weight matrix of encoder
W'                     Weight matrix of decoder
b                      Bias vector of encoder
b'                     Bias vector of decoder
ϕ                      Parameter for training encoder
θ                      Parameter for training decoder
x                      Training dataset
z                      Representation of the input sample
x_i                    i-th data point
q_ϕ                    Encoder
p_θ                    Decoder
l_VAE(x_i, θ, ϕ)       Loss function of VAE for a data point x_i
g                      Deterministic function
K                      Number of samples that are utilized to reparameterize z

C. ATTENTION MECHANISM
The structure of the attention module is described in Figure 3. There are two sequential sub-modules: the Channel Attention Module (CAM) and the Spatial Attention Module (SAM). The former reduces the input tensor to two vectors, generated by Global Average Pooling and Global Max Pooling, and feeds them into a multi-layer perceptron with one hidden layer. After that, both vectors are merged using element-wise summation. The latter applies Max Pooling and Average Pooling across channels and concatenates the results, followed by a convolution layer that generates a spatial attention map.

Through the attention mechanism, the model can learn what and where to emphasize or suppress, and it refines intermediate features effectively [40]. In this paper, we apply both CAM and SAM, a combination called the Convolutional Block Attention Module (CBAM) [40], in the encoder part of the VAE. We name the result Attention of Variational Autoencoder (AVAE).

FIGURE 3. The structure of CBAM [40]

D. FEATURE COMBINATION AND CLASSIFICATION
Fig. 4 illustrates the architecture of our system. We utilize a lightweight CNN with merely two convolutional layers, with a kernel size of 32 followed by 64. Before flattening the pooled feature map, we apply dropout with a rate of 0.5 to avoid overfitting. Moreover, we use Adam as the fine-tuning optimizer with a small learning rate of 0.001. In the AVAE model, we insert CBAM in turn between the convolutional layers. As the latent representation, we use the mean vector (the dense layer µ), with the latent dimension set to 100. We concatenate these extracted features with a fully connected layer of the CNN. Both the CNN model and the AVAE model are trained on low-resolution images of size 64x64, and the number of epochs is 50.

We utilize early stopping to end training when there is no improvement for five epochs. We use typical machine learning classifiers to evaluate our system. To evaluate our method, we utilize 10-fold Cross-Validation. One of the ten subsamples is held out as validation data, and the remaining nine subsamples are used as training data. This process is repeated ten times, with each of the ten subsamples used once as validation data. The average of the ten results is taken as the quality of the method.

FIGURE 4. An overview of proposed method

IV. EXPERIMENTAL RESULTS
A. DATASET
This study evaluates our model using the Malimg dataset, which consists of 9,339 malware samples from 25 different families. Table 2 shows the number of malware samples in each class. It is clear that the Malimg dataset is unbalanced; 2,949 images represent the Allaple.A malware family, while merely 80 images are present in the Skintrim.N family. Imbalanced datasets are a common problem in machine learning in general, and in computer vision in particular [28], [35], [36].

Furthermore, imbalanced data harms the performance of CNNs by causing underfitting and overfitting [37]. There are two standard methods to deal with imbalanced class distributions: oversampling and undersampling. Instead of adding more samples to the under-represented malware families, [32] utilized image augmentation, which generates new data for the classes with fewer samples in the dataset. However, augmentation has an extremely high computational cost. In this study, we adopt undersampling to balance the Malimg dataset. Specifically, we reduce the number of malware samples in all groups to that of the smallest family, Skintrim.N, as in [38]. The total number of variants is now 2,000, less than one-fourth of the original Malimg dataset.

B. CLASSIFICATION RESULT
We applied some standard classifiers to the unbalanced Malimg dataset. The result is shown in Table 3. The Random Forest (RF) classifier achieves the highest accuracy of 99.40%, while Nearest Centroid runs fastest with merely 0.11 seconds with an accuracy difference of 1.26% compared to RF in the 10-fold Cross-Validation. Table 8 depicts a confusion matrix that gives the detailed performance of the proposed method using the Random Forest classifier. As can be seen, 22 out of 24 families attain F-scores greater than 90%, while Swizzor.gen!E and Swizzor.gen!I attain 88.1% and 89.2%, respectively.

The results on the balanced Malimg dataset are shown in Table 4. Even though the number of data is reduced dramatically,
we still achieve a high accuracy of 98.40% when using the RF classifier. The result shows that our method can extract crucial features of image-based malware. When moving from the unbalanced to the balanced dataset, the accuracy of our proposed method drops by 1%, while that of the previous study [38] drops four times as much, by 4% (Table 5). The results on the unbalanced Malimg dataset are compared with the results of other studies using the same dataset in Table 7.

As shown in Table 7, the lightweight CNNs proposed by Roseline et al. [29] have merely 0.83M parameters, but the result has not changed sharply since the dataset was first introduced by Nataraj et al. [12]: it improved by 0.31%, from 97.18% to 97.49%. This suggests that using only a few parameters does not necessarily extract enough features of the object. On the other hand, utilizing a model with an enormous number of parameters, such as ResNet-50 [30] and VGG19 [25], improved the result slightly; however, it requires more computational power. Nevertheless, using a sufficient number of parameters, our proposed lightweight model improves accuracy significantly and saves computational cost. Moreover, classifying each malicious code takes only 0.01s on average.

Complex architectures such as [25], [30], [32] require high image quality and computational processing capacity. The reason for using complex networks is that the deep layers are expected to extract specific features, such as ears and eyes in image processing tasks concerned with humans. On the other hand, the shallow layers focus on overall image features such as the edges of objects. For example, in Fig. 2, many uncomplicated elements can be found by observing the simple grayscale images of malware samples. Therefore, we focus on the first layers to extract adequate features with a smaller image size of 64x64, while still ensuring high accuracy.

The Malimg dataset contains many samples processed through obfuscation techniques such as encryption and packing. Among them, malware samples belonging to Adialer.C, Autorun.K, Lolyda.AT, Malex.gen!J, VB.AT, and Yuner.A are
6 VOLUME 4, 2016

T.V. Dao, H. Sato, M. Kubo et al.: A Novel Combination of Light-weight Deep Learning Model for Image-Based Malware Classification

packed with the same packing process, which gives them a similar structure and pattern. As a result, analysts often have difficulty distinguishing them. However, our method can process these samples directly without unpacking, with corresponding accuracies of 100%, 100%, 100%, 99.26%, 99.75%, and 100%, respectively. The experiment indicates that our method is robust against these specific obfuscation attacks.
Moreover, despite achieving high overall classification accuracy, many studies have struggled to classify two family variants, Swizzor.gen!E and Swizzor.gen!I, which are highly similar and difficult to distinguish. The accuracy for both families, compared with other authors, is shown in Table 6. We achieve the best result, with 87.5% and 87.9% accuracy, respectively.

TABLE 2. Original Malimg dataset

Class  Family name      No. of samples  Percentage (%)
0      Adialer.C        122             1.31
1      Agent.FYI        116             1.24
2      Allaple.A        2949            31.58
3      Allaple.L        1591            17.04
4      Alueron.gen!J    198             2.12
5      Autorun.K        106             1.14
6      C2LOP.gen!g      200             2.14
7      C2LOP.P          146             1.56
8      Dialplatform.B   177             1.89
9      Dontovo.A        162             1.73
10     Fakerean         381             4.08
11     Instantaccess    431             4.62
12     Lolyda.AA1       213             2.28
13     Lolyda.AA2       184             1.97
14     Lolyda.AA3       123             1.32
15     Lolyda.AT        159             1.70
16     Malex.gen!J      136             1.46
17     Obfuscator.AD    142             1.52
18     Rbot!gen         158             1.69
19     Skintrim.N       80              0.86
20     Swizzor.gen!E    128             1.37
21     Swizzor.gen!I    132             1.41
22     VB.AT            408             4.37
23     Wintrim.BX       97              1.04
24     Yuner.A          800             8.57

TABLE 3. Performance comparison of the various classifiers on the unbalanced Malimg dataset

Classifier            Accuracy (%)  Time (s)
Decision Tree         98.12         7.00
k-Nearest Neighbors   99.30         10.49
Naive Bayes           98.16         0.32
Nearest Centroid      98.82         0.08
Random Forest         99.40         32.52
SVM                   98.23         12.65

TABLE 4. Performance comparison of the various classifiers on the balanced Malimg dataset. The best configuration (Random Forest) was highlighted in bold.

Classifier            Accuracy (%)  Time (s)
Decision Tree         94.15         2.43
k-Nearest Neighbors   98.35         0.68
Naive Bayes           96.75         0.10
Nearest Centroid      97.60         0.05
Random Forest         98.40         10.95
SVM                   93.75         2.19

TABLE 5. Comparison of accuracy on both the unbalanced and balanced Malimg datasets with previous work

Study                   Accuracy (%)
                        Unbalanced Malimg  Balanced Malimg
Yajamanam et al. [38]   97.00              93.00
This paper              99.40              98.40

TABLE 6. Comparison of accuracy for the two commonly misclassified families

Studies                  Accuracy (%)
                         Swizzor.gen!E  Swizzor.gen!I
Yajamanam et al. [38]    51.0           36.0
Naeem et al. [23]        30.0           50.0
Roseline et al. [29]     70.0           45.0
Çayır et al. [26]        56.3           68.8
Verma et al. [27]        87.5           81.8
Awan et al. [25]         48.0           56.0
V. Anandhi et al. [21]   84.2           52.5
This paper               87.5           87.9

V. CONCLUSION
Recent studies have developed huge, complex neural network models for malware analysis to obtain desirable features. However, they demand more resources than the average system can provide. Therefore, this paper focuses on building simple, lightweight models while still ensuring high classification results. We propose a feature selection method called AVAE. AVAE consists of a small CNN, a variational autoencoder, and an attention mechanism.
Experimental results show that our method can classify malware families efficiently. Our method achieved the best accuracy of 99.40% with the Random Forest classifier, while Nearest Centroid reaches nearly 99% in under a second. Furthermore, with merely 80 images of each family, our method achieves a high accuracy of 98.40%, which suits the practical situation in which some new families lack data. The total time to convert malicious code into an image (with common malicious code under 1 Mb in size) and classify it is merely 0.02 s. Based on these results, we think our method will be applicable to existing systems.
Another advantage of our method is that it can distinguish similar malware families with high accuracy even when they are packed. Therefore, our proposed method can help malware analysts reduce the time needed to classify variants. Furthermore, once the malware family is identified, it is possible to know its typical characteristics, its intended use, and its impact on the target.
In the latent space of a VAE, the global features are organized in a more planned manner than in an AE. However, the importance of the individual elements has not been considered. In this study, we further emphasize the importance of the attention mechanism in selecting and weighting features for the VAE, which helps it acquire important features in the latent space. At the same time, to ensure feature diversity, we combined a light-weight CNN

TABLE 7. Comparison with existing state-of-the-art algorithms

Studies Year Techniques Accuracy(%) Number of parameters(M)

Nataraj et al. [12] 2011 GIST feature + kNN 97.18 (-)
Garcia et al. [13] 2016 ANN + Random Forest 95.26 (-)
Agarap [11] 2017 GRU-SVM 84.92 (-)
Rezende et al. [30] 2017 ResNet-50 + Softmax 98.62 25.56
Yajamanam et al. [38] 2018 Deep learning + Softmax 97.00 (-)
Naeem et al. [23] 2019 Local Feature Extraction + Global Feature Extraction 98.40 (-)
Burks et al. [32] 2019 ResNet-18 + VAE 85.00 12.46
Roseline et al. [29] 2020 Lightweight CNNs 97.49 0.83
Çayır et al. [26] 2020 Capsule Networks + Softmax 98.72 (-)
Verma et al. [27] 2020 Combine first-order and second-order statistical texture features 98.58 (-)
Awan et al. [25] 2021 VGG19 + Spatial Convolutional Attention 97.68 143.67
Nisa et al. [31] 2021 SFTA + Cosine kNN 98.70 88.26
Moussas et al. [41] 2021 Image and file features, ANN 99.13 (-)
Lee et al. [1] 2021 Multiple Autoencoders 97.75 23.81
V. Anandhi et al. [21] 2021 Gabor filter + DenseNet Markov 98.97 7.98
This paper 2022 Lightweight CNN + AVAE 99.40 3.62
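
The accuracy and parameter counts in Table 7 can be read as a two-objective trade-off. The following Python sketch is an illustration only and not part of the proposed method: it takes the rows of Table 7 that report a parameter count and keeps those that no other row beats on both accuracy and model size. The helper names `dominates` and `non_dominated` and the dominance rule itself are assumptions made for this example.

```python
# Rows of Table 7 that report a parameter count:
# (study, accuracy in %, number of parameters in millions).
TABLE7 = [
    ("Rezende et al. [30]", 98.62, 25.56),
    ("Burks et al. [32]", 85.00, 12.46),
    ("Roseline et al. [29]", 97.49, 0.83),
    ("Awan et al. [25]", 97.68, 143.67),
    ("Nisa et al. [31]", 98.70, 88.26),
    ("Lee et al. [1]", 97.75, 23.81),
    ("V. Anandhi et al. [21]", 98.97, 7.98),
    ("This paper", 99.40, 3.62),
]


def dominates(a, b):
    """Return True if row a is no worse than row b on both criteria
    (higher accuracy, fewer parameters) and strictly better on one."""
    _, acc_a, par_a = a
    _, acc_b, par_b = b
    no_worse = acc_a >= acc_b and par_a <= par_b
    strictly_better = acc_a > acc_b or par_a < par_b
    return no_worse and strictly_better


def non_dominated(rows):
    """Hypothetical helper: keep the rows that no other row dominates."""
    return [r for r in rows
            if not any(dominates(other, r) for other in rows if other is not r)]


front = non_dominated(TABLE7)
for study, acc, params in front:
    print(f"{study}: {acc:.2f}% with {params:.2f}M parameters")
```

Under these assumptions only the lightweight CNNs of Roseline et al. [29] and the proposed model remain: the former is smaller, the latter is more accurate.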

TABLE 8. Unbalanced Malimg dataset confusion matrix for 10-fold cross validation using RF classifier

Class 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Precision Recall F1 Score

0 122 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
1 0 116 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
2 0 0 2948 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
3 0 0 0 1591 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.999 1.000 1.000
4 0 0 0 0 198 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
5 0 0 0 0 0 106 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
6 0 0 0 1 0 0 193 3 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0.949 0.930 0.935
7 0 0 0 0 0 0 1 144 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0.952 0.990 0.969
8 0 0 0 0 0 0 0 0 175 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1.000 0.988 0.994
9 0 0 0 0 0 0 0 0 0 162 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
10 0 0 0 0 0 0 1 0 0 0 379 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1.000 0.994 0.998
11 0 0 0 0 0 0 0 0 0 0 0 431 0 0 0 0 0 0 0 0 0 0 0 0 0 1.000 1.000 1.000
12 0 0 0 0 0 0 0 0 0 0 0 0 213 0 0 0 0 0 0 0 0 0 0 0 0 0.991 1.000 0.995
13 0 0 0 0 0 0 0 0 0 0 0 0 2 182 0 0 0 0 0 0 0 0 0 0 0 1.000 0.989 0.994
14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 122 0 0 0 0 0 0 0 1 0 0 1.000 0.992 0.996
15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 158 0 0 0 0 0 0 1 0 0 1.000 0.994 0.997
16 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 135 0 0 0 0 0 0 0 0 0.993 0.993 0.993
17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 142 0 0 0 0 0 0 0 1.000 1.000 1.000
18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 158 0 0 0 0 0 0 0.994 1.000 0.997
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 80 0 0 0 0 0 1.000 1.000 1.000
20 0 0 0 0 0 0 1 3 0 0 0 0 0 0 0 0 0 0 0 0 112 12 0 0 0 0.899 0.876 0.881
21 0 0 0 0 0 0 3 2 0 0 0 0 0 0 0 0 0 0 1 0 10 116 0 0 0 0.908 0.877 0.892
22 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 407 0 0 0.991 0.998 0.995
23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 97 0 0.982 1.000 0.990
24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 800 1.000 1.000 1.000
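
The per-class figures discussed for Swizzor.gen!E and Swizzor.gen!I can be re-derived from the counts in Table 8. The sketch below is a minimal Python illustration, assuming the counts in Table 8 are pooled over the ten folds; the function and variable names are ours. The Precision, Recall and F1 columns of Table 8 were presumably averaged per fold (an assumption), so values computed from the pooled counts can differ from them in the third decimal.

```python
N_CLASSES = 25

# Diagonal of Table 8: correctly classified samples for classes 0..24.
DIAGONAL = [122, 116, 2948, 1591, 198, 106, 193, 144, 175, 162, 379, 431, 213,
            182, 122, 158, 135, 142, 158, 80, 112, 116, 407, 97, 800]

# Off-diagonal entries of Table 8 as {(true class, predicted class): count}.
OFF_DIAGONAL = {
    (2, 6): 1,
    (6, 3): 1, (6, 7): 3, (6, 16): 1, (6, 20): 1, (6, 23): 1,
    (7, 6): 1, (7, 20): 1,
    (8, 22): 2,
    (10, 6): 1, (10, 23): 1,
    (13, 12): 2,
    (14, 22): 1,
    (15, 22): 1,
    (16, 6): 1,
    (20, 6): 1, (20, 7): 3, (20, 21): 12,
    (21, 6): 3, (21, 7): 2, (21, 18): 1, (21, 20): 10,
    (22, 2): 1,
}


def build_matrix():
    """Rebuild the dense 25x25 confusion matrix (rows: true, columns: predicted)."""
    matrix = [[0] * N_CLASSES for _ in range(N_CLASSES)]
    for k, count in enumerate(DIAGONAL):
        matrix[k][k] = count
    for (true_cls, pred_cls), count in OFF_DIAGONAL.items():
        matrix[true_cls][pred_cls] = count
    return matrix


def class_metrics(matrix, k):
    """Precision, recall and F1 of class k computed from pooled counts."""
    tp = matrix[k][k]
    predicted_as_k = sum(row[k] for row in matrix)  # column sum
    truly_k = sum(matrix[k])                        # row sum
    precision = tp / predicted_as_k
    recall = tp / truly_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


matrix = build_matrix()
total = sum(sum(row) for row in matrix)
overall_accuracy = sum(DIAGONAL) / total

for k in (20, 21):  # Swizzor.gen!E and Swizzor.gen!I
    p, r, f1 = class_metrics(matrix, k)
    print(f"class {k}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"overall accuracy from pooled counts: {overall_accuracy:.4f}")
```

For Swizzor.gen!E (class 20) and Swizzor.gen!I (class 21), the pooled recall is 112/128 = 87.5% and 116/132 ≈ 87.9%, which matches the values reported for this paper in Table 6.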

to capture lower-range features. Unlike face images or animal images in ImageNet data, image data generated from malicious code does not contain many complex factors that need a deep CNN network. Combining the two complementary models helps us acquire rich and diverse characteristics of the object.
In future work, we will build a new malware dataset with recent malicious code. Additionally, we will apply the proposed method to an IDS to enhance its capacity for detecting and classifying potential dangers in cybersecurity.
In this paper, we have built a model focusing on the issue of classifying malware with a simple but effective architecture. We think it may be possible to apply our method to standard image classification even when data is lacking.

REFERENCES

[1] J. Lee and J. Lee, “A Classification System for Visualized Malware Based on Multiple Autoencoder Models”, IEEE Access, vol. 9, pp. 144786–144795, Oct. 2021. DOI: 10.1109/ACCESS.2021.3122083.
[2] G. Xiao, J. Li, Y. Chen, K. Li, “MalFCS: An effective malware classification framework with automated feature extraction based on deep convolutional neural networks”, J. Parallel Distrib. Comput., vol. 141, pp. 49–58, 2020. DOI: 10.1016/j.jpdc.2020.03.012.
[3] A. Moser, C. Kruegel, and E. Kirda, “Limits of static analysis for malware detection”, Twenty-third Annual Computer Security Applications Conference, pp. 421–430, 2007. DOI: 10.1109/ACSAC.2007.21.
[4] M. Wagner, F. Fischer, R. Luh, A. Haberson, A. Rind, D.A. Keim and W. Aigner, “A survey of Visualization of Systems for Malware Analysis”, Eurographics Conference on Visualization (EuroVis), 2015. DOI: 10.2312/eurovisstar.20151114.
[5] M. Egele, T. Scholte, E. Kirda and C. Kruegel, “A survey on automated dynamic malware-analysis techniques and tools”, ACM Computing Surveys, vol. 44, no. 6, pp. 1-42, 2012. DOI: 10.1145/2089125.2089126.
[6] Y. Ye, T. Li, D. Adjeroh and S. Iyengar, “A Survey on Malware Detection Using Data Mining Techniques”, Computer Science Review, vol. 50, no. 41, pp. 1-40, 2017. DOI: 10.1145/3073559.
[7] L. Liu, B.S. Wang, B. Yu, and Q.X. Zhong, “Automatic malware classification and new malware detection using machine learning”, Frontiers Inf. Technol. Electron. Eng., vol. 18, no. 9, pp. 1336–1347, Sep. 2017. DOI: 10.1631/FITEE.1601325.
[8] Q. Qian and M. Tang, “Dynamic API call sequence visualisation for malware classification”, IET Inf. Secur., vol. 13, no. 4, pp. 367–377, Oct. 2018. DOI: 10.1049/iet-ifs.2018.5268.
[9] M. Mimura, “An Improved Method of Detecting Macro Malware on an Imbalanced Dataset”, IEEE Access, vol. 8, pp. 204709–204717, Nov. 2020. DOI: 10.1109/ACCESS.2020.3037330.
[10] K. Tran and H. Sato, “NLP-based approaches for malware classification from API sequences”, 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES), Nov. 2017. DOI: 10.1109/IESYS.2017.8233569.
[11] A.M. Agarap, “Towards Building an intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine (SVM) for Malware Classification”, arXiv preprint 2017, arXiv:1801.00318.
[12] L. Nataraj, S. Karthikeyan, G. Jacob and B.S. Manjunath, “Malware images: visualization and automatic classification”, Proceedings of the 8th International Symposium on Visualization for Cyber Security, 2011. DOI: 10.1145/2016904.2016908.
[13] F.C.C. Garcia and F.P. Muga II, “Random Forest for Malware Classification”, arXiv preprint 2016, arXiv:1609.07770.
[14] L. Nataraj, S. Karthikeyan and B.S. Manjunath, “SATTVA: SpArsiTy inspired classificaTion of malware Variants”, Proceedings of the 3rd ACM Workshop on Information Hiding and Multimedia Security, pp. 135–140, 2015. DOI: 10.1145/2756601.2756616.
[15] G. Conti, E. Dean, M. Sinda, B. Sangster, “Visual reverse engineering of binary and data files”, Visualization for Computer Security, 5th International Workshop, VizSec, Jan. 2008.
[16] D.L Vu, T.K Nguyen, T.V Nguyen, T.N Nguyen, F. Massacci and P.H. Phung, “HIT4Mal: Hybrid image transformation for malware classification”, Transactions on Emerging Telecommunications Technologies, vol. 31, no. 5, Nov. 2019. DOI: 10.1002/ett.3789.
[17] H. Naeem, F. Ullah, M.R. Naeem, S. Khalid, D. Vasan, S. Jabbar, S. Saeed, “Malware detection in industrial internet of things based on hybrid image visualization and deep learning model”, Ad Hoc Networks, vol. 105, no. 1, May 2020. DOI: 10.1016/j.adhoc.2020.102154.
[18] Y. Ding, X. Zhang, J. Hu, W. Xu, “Android malware detection method based on bytecode image”, Journal of Ambient Intelligence and Humanized Computing, 2020. DOI: 10.1007/s12652-020-02196-4.
[19] D. Vasan, M. Alazab, S. Wassan, B. Safaei, Q. Zheng, “Image-Based malware classification using ensemble of CNN architectures (IMCEC)”, Computers and Security, vol. 92, May 2020, Art. no. 101748. DOI: 10.1016/j.cose.2020.101748.
[20] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, “Attention Is All You Need”, in Proc. NIPS, pp. 1-11, 2017.
[21] V. Anandhi, P. Vinod, V.G. Menon, “Malware visualization and detection using DenseNet”, Personal and Ubiquitous Computing, July 2021. DOI: 10.1007/s00779-021-01581-w.
[22] A. Bozkir, E. Tahillopglu, M. Aydos and I. Kara, “Catch them alive: A malware detection approach through memory forensics, manifold learning and computer vision”, Computers and Security, vol. 103, Apr. 2021, Art. no. 102166.
[23] H. Naeem, B. Guo, M.R. Naeem, F. Ullah, H. Aldabbas, M.S Javed, “Identification of malicious code variants based on image visualization”, Computers and Electrical Engineering, vol. 76, pp. 225–237, Apr. 2019. DOI: 10.1016/j.compeleceng.2019.03.015.
[24] E. Rezende, G. Ruppert, T. Carvalho, A. Theophilo, F. Ramos, P. de Geus, “Malicious software classification using VGG16 deep neural network’s bottleneck features”, Information Technology - New Generations, pp. 51-59, Jan. 2018.
[25] M. Awan, M. Mohoammed, A. Yasin, A. Zain, “Image-Based Malware Classification Using VGG19 Network and Spatial Convolutional Attention”, Electronics, vol. 10, no. 19, Oct. 2021, Art. no. 2444. DOI: 10.3390/electronics10192444.
[26] A. Çayır, U. Ünal, H. Dağ, “Random CapsNet forest model for imbalanced malware type classification task”, Computers and Security, vol. 102, 2021, Art. no. 102133.
[27] V. Verma, S.K Muttoo, V.B Singh, “Multiclass malware classification via first and second order texture statistics”, Computers and Security, vol. 97, 2020, Art. no. 101895.
[28] W. Shafai, I. Almomani and A. AlKhayer, “Visualized Malware Multi-Classification Framework Using Fine-Tuned CNN-Based Transfer Learning Models”, Applied Sciences, vol. 11, no. 14, 2021, Art. no. 6446. DOI: 10.3390/app11146446.
[29] A. Roseline, G. Hari, S. Geetha, R. Krishnamurthy, “Vision-Based Malware Detection and Classification Using Lightweight Deep Learning Paradigm”, in Computer Vision and Image Processing, pp. 62-73, 2020.
[30] E. Rezende, G. Ruppert, T. Carvalho, F. Ramos, P. De Geus, “Malicious software classification using transfer learning of RESNET-50 deep neural network”, in Proceedings 16th IEEE International Conference on Machine Learning and Applications, Dec. 2017. DOI: 10.1109/ICMLA.2017.00-19.
[31] M. Nisa, J.H Shah, S. Kanwal, M. Raza, M.A Khan, R. Damaševicius, T. Blažauskas, “Hybrid malware classification method using segmentation-based fractal texture analysis and deep convolution neural network features”, Applied Sciences, vol. 10, July 2020, Art. no. 4966. DOI: 10.3390/app10144966.
[32] R. Burks, K.A Islam, J. Li, Y. Lu, “Data augmentation with generative models for improved malware detection: a comparative study”, The IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference, Oct. 2019. DOI: 10.1109/UEMCON47517.2019.8993085.
[33] X. Ma, S. Guo, H. Li, Z. Pan, “How to Make Attention Mechanisms More Practical in Malware Classification”, IEEE Access, Oct. 2019. DOI: 10.1109/ACCESS.2019.2948358.
[34] D.P Kingma and M. Welling, “Auto-encoding variational bayes”, arXiv preprint 2013, arXiv:1312.6114.
[35] Ramasubramanian and H. Shanmugasundaram, “A Review on Classification of Data Imbalance using BigData”, International Journal of Managing Information Technology, vol. 13, no. 03, pp. 09-22, Aug. 2021. DOI: 10.5121/ijmit.2021.13302.
[36] F. Thabtah, S. Hammoud, F. Kamalov and A. Gonsalves, “Data imbalance in classification: Experimental evaluation”, Information Sciences, vol. 513, no. 3, Nov. 2019. DOI: 10.1016/j.ins.2019.11.004.
[37] K.S Kancherla, S. Mukkamala, “Image visualization based malware detection”, in Proceedings of the 2013 IEEE Symposium on Computational Intelligence in Cyber Security (CICS), April 2013. DOI: 10.1109/CICYBS.2013.6597204.
[38] S. Yajamanam, V.R.S Selvin, F.D. Troia, M. Stamp, “Deep learning versus gist descriptors for image-based malware classification”, 2nd International Workshop on Formal methods for Security Engineering, pp. 553–561, Jan. 2018. DOI: 10.5220/0006685805530561.
[39] D. Gibert, C. Mateu, J. Planes, R. Vicens, “Using convolutional neural networks for classification of malware represented as images”, Journal of Computer Virology and Hacking Techniques, vol. 15, no. 1, pp. 15–28. DOI: 10.1007/s11416-018-0323-0.
[40] S. Woo, J. Park, J.Y. Lee, I. Kweon, “CBAM: Convolutional Block Attention Module”, in Computer Vision – ECCV 2018, pp. 3-19, Sep. 2018.
[41] V. Moussas, A. Andretos, “Malware Detection Based on Code Visualization and Two-Level Classification”, Information, Mar. 2021. DOI: 10.3390/info12030118.
[42] B. N. Narayanan, O. Djaneye-Boundjou and T. M. Kebede, “Performance Analysis of Machine Learning and Pattern Recognition Algorithms for Malware Classification”, 2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS), Dayton, OH, 2016, pp. 338-342.
[43] V. S. P. Davuluru, B.N. Narayanan and E. J. Balster, “Convolutional Neural Networks as Classification Tools and Feature Extractors for Distinguishing Malware Programs”, 2019 IEEE National Aerospace and Electronics Conference (NAECON), Dayton, OH, USA, 2019, pp. 273-278.
[44] B. N. Narayanan and V. S. P. Davuluru, “Ensemble Malware Classification System using Deep Neural Networks”, Electronics 2020, 9 (5), 721.


VAN TUAN DAO was born in Thai Binh province, Viet Nam, in 1992. He received the B.E. and M.E. degrees from the Department of Computer Science, National Defense Academy of Japan, in 2016 and 2018, respectively. He is currently pursuing the Ph.D. degree in information security. His main research interests include computer vision, artificial intelligence, cyber security and machine learning.

HIROSHI SATO is an Associate Professor of the Department of Computer Science at the National Defense Academy in Japan. He obtained a degree in Physics from Keio University in Japan and degrees of Master and Doctor of Engineering from Tokyo Institute of Technology in Japan. He was previously a Research Associate at the Department of Mathematics and Information Sciences at Osaka Prefecture University in Japan. His research interests include agent-based simulation, evolutionary computation, and artificial intelligence. Dr. Sato is a member of the Japanese Society for Artificial Intelligence (JSAI), the Society of Instrument and Control Engineers (SICE), and The Institute of Electronics, Information and Communication Engineers (IEICE).

MASAO KUBO is an Associate Professor of the Department of Computer Science at the National Defense Academy in Japan. He graduated from the Precision Engineering Department, Hokkaido University, in 1991. He received his Ph.D. degree in Computer Science from Hokkaido University in 1996.


This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
