
Image-Based Scam Detection Method Using An Attention Capsule Network

This document describes an image-based method for detecting scams on Ethereum using an attention capsule network (SE-CapsNet). The method extracts features from smart contract bytecode and ABI, converts them into RGB images, and uses the images as input to an SE-CapsNet model to detect Ponzi scheme contracts. The SE-CapsNet obtains an F1 score of 98.38% for detecting Ponzi scheme contracts, outperforming methods relying only on source code or single features. The method addresses the lack of source code by working from bytecode and ABI, and it mitigates imbalanced data by using fancy PCA for data augmentation.

Received December 30, 2020, accepted February 4, 2021, date of publication February 16, 2021, date of current version March 3, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3059806

Image-Based Scam Detection Method Using an Attention Capsule Network
LINGYU BIAN1, LINLIN ZHANG1, KAI ZHAO1, HAO WANG2, AND SHENGJIA GONG1
1 School of Cyber Science and Engineering, College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
2 School of Software, Xinjiang University, Urumqi 830046, China

Corresponding author: Linlin Zhang ([email protected])


This work was supported in part by the Natural Science Foundation of Xinjiang Uygur Autonomous Region under Grant 2019D01C062,
Grant 2019D01C041, and Grant 2020D01C028; in part by the Graduate Research Innovation Project of Xinjiang Uygur Autonomous
Region under Grant XJ2020G065 and Grant XJ2019G063; in part by the National Innovation Training Project for College Students under
Grant 201910755030; in part by the National Natural Science Foundation of China under Grant 12061071; and in part by the Higher
Education of Xinjiang Uygur Autonomous Region under Grant XJEDU2017M005 and Grant XJEDU2020Y003.

ABSTRACT In recent years, the rapid development of blockchain technology has attracted much attention
from people around the world. Scammers take advantage of the pseudo-anonymity of blockchain to
implement financial fraud. The Ponzi scheme, one of the main scam methods, has defrauded investors of
large amounts of money, thereby harming their interests and hindering the application of blockchain.
Unfortunately, current detection technology largely relies on the source code of the contract or uses
a single feature that does not fully represent the contract characteristics. Efficient detection
of Ponzi schemes has therefore become urgent. In this paper, we propose an image-based scam
detection method using an attention capsule network (SE-CapsNet) focused on Ethereum. The sequence of
bytecode, the opcode frequency, and the application binary interface (ABI) call are extracted as features from
the contract bytecode and ABI, further converted into grayscale images, and then mapped into three color
channels to generate RGB images, which are used as the input of the model for detecting the Ponzi scheme
contract. In addition, we employ fancy PCA for data augmentation to reduce the impact of imbalanced data on
the detection results. Experimental results show that the image-based detection method using deep learning
models can effectively detect contracts before transactions occur. Among them, our proposed SE-CapsNet
obtains great detection results, with an F1 score of 98.38%.

INDEX TERMS Blockchain, capsule network, Ethereum, Ponzi scheme, smart contract.

The associate editor coordinating the review of this manuscript and approving it for publication was Dongxiao Yu.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

33654 VOLUME 9, 2021

I. INTRODUCTION

After decades of development, blockchain has emerged as a technology with a wide range of applications, and it has attracted extensive attention from both academia and industry, especially in the field of cryptocurrency, where the market valuations of currencies such as Bitcoin and Ether are rising at increasing rates. Under the lure of huge profits, and owing to the pseudo-anonymity of blockchain technology, scammers hidden behind pseudonymous accounts can easily complete cryptocurrency transactions as normal traders without their true intentions being identified [1]. Once a financial scam occurs, it is difficult to track, let alone take countermeasures or recover property, and this hurts investors heavily.

Currently, the situation is worsening with increasing fraud taking place in blockchains. According to the latest research report published by Chainalysis [2], a blockchain analysis company, the total value of defrauded cryptocurrency was as high as 4.3 billion US dollars in 2019, and most of it came from Ponzi schemes (up to 92%). The Ponzi scheme is a well-known type of pyramid scheme that usually promises high rates of return with little risk for investors to create the illusion of making money [3]. However, most investors are unable to identify scams, and once they invest, the economic losses caused are generally irreversible. Therefore, to some extent, we can say that the Ponzi scheme has damaged the reputation of the whole blockchain ecosystem, including Ethereum.

Ethereum is an open-source and blockchain-based decentralized platform that enables programmers, as well as
scammers, to create versatile smart contracts and decentralized applications [4]. That is, scammers can easily create a Ponzi scheme. In recent years, the number of Ponzi schemes has increased steadily. Many famous Ponzi schemes, such as PlusToken, Forsage, and FairWin, can be found on Ethereum. Investors have lost hundreds of millions of dollars to these Ponzi schemes. Hence, it is an urgent task to detect Ponzi schemes on Ethereum.

Early Ponzi schemes could be found in the investment advertisements of the Ethereum community forum. Ethereum uses an account-based model, which contains two types of accounts. One is an externally owned account, and the other is the so-called contract account, which is shortened to "contract" in the following text. In this case, the behavior of a scammer is often embodied by the externally owned account and its related transactions. Therefore, corresponding features can be extracted from historical transactions to detect Ponzi schemes. Nonetheless, this method cannot be performed in real time, and it requires a large amount of relevant transaction information, such as a full dataset.

Smart contracts are programs that execute autonomously without a third party on Ethereum [5]. Over time, the implementation of Ponzi schemes as smart contracts has gradually become more popular. Scammers use this self-executing feature of smart contracts to publish a convincing project plan, which shows that investors can obtain bonuses in time, to gain investor trust. Therefore, we can detect Ponzi schemes by analyzing the source code of the smart contracts before transactions occur, and this could prevent the widespread use of such Ponzi schemes. However, the following challenges regarding the analysis of Ponzi scheme contracts still exist.

1) Insufficient source code. According to Etherscan [6], only approximately 1% of smart contracts have available Solidity source code [7]. In cases with insufficient contract source code, the selection of appropriate features directly affects the performance of the detection method. In addition, how to express the selected features also needs to be carefully considered.
2) Low accuracy. Existing research has shown that deep learning is a feasible method in the field of smart contract classification [8]. Since the code length of a smart contract is short, we need to choose an appropriate deep learning model because it may directly affect the accuracy of our experiments.

The purpose of this paper is to design a novel detection method that can detect not only Ponzi schemes but also additional types of scams in the future. Therefore, we propose an image-based scam detection method using an attention capsule network (SE-CapsNet) based on Ethereum. First, based on the contract address of the Ponzi scheme, the bytecode and application binary interface (ABI) of the corresponding contract are downloaded as basic features. Then, three types of features are extracted and converted into grayscale images. Next, they are merged into an RGB image as an input for the model to complete the Ponzi scheme contract detection process. The main contributions of this detection task are divided into the following four aspects.

1) Most contracts have bytecode and ABI. By analyzing the bytecodes and ABIs of both Ponzi scheme contracts and non-Ponzi contracts, the problem of the lack of source code in practical applications can be solved. Once the contract is deployed, we can check whether the contract is a Ponzi scheme or not, so the losses incurred by investors can be reduced.
2) After years of research, malware detection combined with code visualization has proven to be an efficient and capable detection method. Based on the downloaded bytecode and ABI, the bytecode sequence, the opcode frequency sequence, and the ABI call sequence are obtained, and these three features are combined to generate RGB images. In this way, we can mitigate the problem that a single feature cannot comprehensively represent the characteristics of a Ponzi scheme contract.
3) We use fancy PCA to augment the data of Ponzi scheme images, and we obtain a total of 1,600 Ponzi scheme images to form a relatively balanced dataset. By using such a method, the impact of extremely imbalanced data on the detection results can be reduced.
4) We combine the Squeeze-and-Excitation (SE) block and capsule network to detect Ponzi schemes. The SE block has a simple structure, and the accuracy of the experiment can be improved by calculating the channel attention of the image. The capsule network can capture additional information, is suitable for a small dataset, and is proven to detect Ponzi scheme contracts efficiently on Ethereum.

The remainder of this paper is organized as follows. Section II introduces the research trends of related fields from three viewpoints. In Section III, we elaborate on the image-based scam detection method using the SE-CapsNet proposed in this paper; it is divided into three modules and introduced in detail. We evaluate the proposed method in Section IV and summarize the study in the last section of the paper.

II. RELATED WORK

A. SCAM ACCOUNT DETECTION

The ecosystem of Ethereum shows that the phenomenon of scamming is becoming increasingly serious, and the detection of scams is urgently needed. Existing research works [9], [10] have focused on extracting features from the transaction history information of externally owned accounts to detect scams. For example, Wu et al. [11] proposed a novel algorithm named trans2vec, based on transaction amounts and timestamps, to extract features, and combined it with a one-class support vector machine to detect phishing on Ethereum. Toyoda et al. [12] extracted the frequencies of patterns as key features. The experimental results established that approximately 83% of HYIP accounts (belonging to a Ponzi scheme) can be correctly classified using the XGBoost
algorithm. Bartoletti et al. [13] extracted a total of 32 features from the transaction history of Bitcoin, used downsampling to balance the dataset, and combined supervised machine learning algorithms to detect Ponzi schemes. Subsequently, Bartoletti et al. [14] identified a Ponzi scheme on Ethereum by analyzing the number of transactions and transaction amounts, and then they evaluated the impact of the Ponzi scheme.

There are certain deficiencies in the current scam detection methods for externally owned accounts, because a large amount of transaction data is required as the basis for feature extraction, and real-time detection is inferior. Therefore, related research based on contract codes has gradually emerged. Torres et al. [15] employed a symbolic execution approach, defined a heuristic method for automatically detecting honeypot contracts, and analyzed honeypot contracts from the perspectives of behavior, diversity, and activity. Chen et al. [16] collected smart contracts to obtain bytecode and built a control flow graph (CFG), which was used to identify non-deployable contracts and help explore contract transaction rules. Chen et al. [17] extracted features from the transaction histories and opcodes of smart contracts, and established a smart Ponzi scheme classification model. Jung et al. [18] proposed a full-feature model, which combined the Gini time features based on transaction behaviors with the opcode features of contracts to detect Ponzi schemes.

In summary, by analyzing the opcode and other features of a smart contract code, it is possible to analyze whether the logic of the code is similar to those of Ponzi scheme contracts, and this approach plays a positive role in improving the detection results. So, we use the bytecodes and ABIs owned by most contracts instead of source code to detect Ponzi schemes.

B. CODE VISUALIZATION

In more recent years, code visualization has been widely adopted in malware classification tasks, providing an end-to-end detection method that can effectively process data samples; this technique has obtained remarkable classification results [19]. In 2011, Nataraj et al. [20] proposed converting binary code to the values of pixels to generate a grayscale image for the classification of malware. Since then, code visualization has received widespread attention from researchers. Cui et al. [21] converted malicious code into grayscale images, and the problem of imbalanced data was solved using the bat algorithm. Finally, the features of the malware images were extracted automatically by a convolutional neural network, which classified the images with accuracy reaching 94.5%. Cui et al. [22] further improved upon their work by using the non-dominated sorting genetic algorithm II (NSGA-II) to obtain a better classification effect. Naeem et al. [23] designed a method to convert binary malware files into grayscale images and used two patterns, local and global, to classify malware. In the experimental portion of their study, the classification accuracy reached 98.4%.

The conventional methods of code visualization select a single feature, for example, binary data, which cannot completely reflect the structural characteristics of code. Research related to image enhancement has slowly emerged over time. Sun et al. [24] enhanced malicious code images by filling binary data, ASCII character information, and PE structure information into three channels to form an RGB image. This method increased the interpretability of the model and improved the accuracy and robustness of code detection. Fang et al. [25] mapped the entropy, bytecode and proportion of sections in a DEX file to the red channel, green channel and blue channel of an RGB image. Then, the color and texture of the image and text were extracted as features. Eventually, the F1 score of the familial malware classification was 96%.

The approach based on image enhancement not only improves the accuracy of detection but also solves the problem of insufficient information resulting from the use of a single feature. Therefore, we extract various features based on bytecodes and ABIs of smart contracts and choose the method of image enhancement to complete the detection task.

C. SE BLOCK AND CAPSULE NETWORK

This paper intends to combine the SE block with the capsule network. In the related field of computer vision, attention mechanisms are divided primarily into the spatial domain and channel domain. The spatial domain is mainly used to extract the key information of an image, while the channel domain focuses on the weight of the new channel generated by the image through the convolution kernel. The SE block [26] belongs to the attention mechanism of the channel domain, and it is mainly used to learn the relationships between channels with the goal of obtaining the channel attention weight. Moreover, the SE block is simple and compact with respect to the calculation, and it has also achieved outstanding performance on the ImageNet classification task. Yu [27] combined the SE block with the bottleneck layer for the classification of retinal diseases. The experimental results showed that the classification accuracy had been significantly improved with few network parameters.

The capsule network (CapsNet) was proposed by the Hinton team in 2017 [28]. Although a convolutional neural network has translation invariance, the pooling layer still causes information loss. The capsule network is based on structures called capsules and uses dynamic routing algorithms between capsules. The relationships between low-level features and high-level features can be well represented by the capsule network, which can obtain a good classification effect with a small amount of data. The capsule network consists primarily of a convolutional layer, a PrimaryCaps layer and a DigitCaps layer.

Research has shown that the capsule network performs well in image classification tasks, especially when the differences among classes are small [29]–[31]. However, the capsule network is not suitable for the classification of complex images. Given this situation, researchers have improved the capsule network according to application requirements.

FIGURE 1. The framework of Ponzi scheme contract detection.

Xiang et al. [32] proposed MS-CapsNet, which added multi-scale capsule encoding units behind the convolutional layer. These units can extract different levels of features and encode them into primary capsules of different dimensions, and the improved dropout can enhance the robustness of the capsule network. Cheng et al. [33] proposed two structures, Cv-CapsNet and Cv-CapsNet++, to obtain complex-valued features and complex-valued capsules. At the same time, the dynamic routing algorithm was extended to the complex-valued domain. Compared to other existing methods, this method requires fewer trainable parameters and can be adapted to complicated datasets.

It can be seen that improving the CapsNet for different application requirements is a feasible research method. Considering that the number of samples in this paper is small, that the image structure is not too complicated, and that the accuracy needs to be improved without the detection speed becoming too slow, we employ the SE block to improve the structure of the capsule network so that it properly fits the detection objects in this paper.

III. PROPOSED METHOD

Fraud continues to increase year by year, especially with an increasing number of Ponzi scheme contracts appearing on Ethereum; thus, an image-based scam detection method using SE-CapsNet is proposed. The detection samples in this paper are Ponzi scheme contracts and non-Ponzi scheme contracts. In the following, we refer to the Ponzi scheme and non-Ponzi scheme as Ponzi and non-Ponzi. The method consists of three modules: feature visualization, data balancing, and model detection. The framework of the proposed Ponzi scheme contract detection method is shown in Figure 1. See the following Algorithm 1 for details.

A. FEATURE VISUALIZATION

Developers use the high-level programming language Solidity [34] to write the code of smart contracts and compile smart contracts that include code logic through a compiler, while at the same time obtaining bytecodes and ABIs. They then deploy smart contracts on the Ethereum network. So, we can easily get the bytecodes and ABIs of almost all contracts. The existing scam contract detection methods mainly use a single feature, which has limited ability to express the content of the contract.

Feature visualization refers to the extraction of three types of features based on bytecodes and ABIs, including bytecode sequences, opcode frequency sequences and ABI call sequences. The sequence of bytecodes usually used in static analysis methods can reflect the semantic information of the contract. The frequency sequence of opcodes can indicate the most frequent operations of the contract. The sequence of ABI calls can represent the context information of the calling contract. Using these features will greatly help the accuracy of detection. Then, the corresponding images are generated, and these images are finally combined into RGB images. The RGB images have three color channels, and these channels can store information and intuitively visualize contracts. In addition, because the images generated by similar contracts are also similar, we can detect scams based on images.

1) BYTECODE SEQUENCE

A bytecode is composed of a string of hexadecimal digits, and this makes it difficult to read and understand the bytecode directly. Therefore, we use the code visualization method to convert the bytecodes into pixels, which can reduce the feature extraction time. We convert each byte of the bytecode, in order, into a decimal integer, with 255 representing white and 0 representing black; therefore, a grayscale image can be generated that shows the sequence of a given bytecode. Figure 2 demonstrates the process of producing the bytecode sequence image.

2) OPCODE FREQUENCY SEQUENCE

Our work uses the Ethereum EVM bytecode disassembler named ethereum-dasm [35] to convert bytecodes into
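
As a minimal sketch of the bytecode sequence visualization in Section III-A.1, the following Python snippet maps each byte of a hexadecimal bytecode string to one grayscale pixel (0 for black, 255 for white). The function name `bytecode_to_grayscale`, the near-square layout, and the zero padding of the last row are illustrative assumptions; the paper does not specify the image width or the padding rule.

```python
import math
from typing import Optional

import numpy as np


def bytecode_to_grayscale(bytecode_hex: str, width: Optional[int] = None) -> np.ndarray:
    """Map each byte of a hex bytecode string to one grayscale pixel (0..255)."""
    # Drop an optional "0x" prefix, then decode pairs of hex digits into bytes.
    if bytecode_hex.startswith(("0x", "0X")):
        bytecode_hex = bytecode_hex[2:]
    data = np.frombuffer(bytes.fromhex(bytecode_hex), dtype=np.uint8)

    # Assumption: a near-square image whose last row is padded with zeros (black).
    if width is None:
        width = max(1, math.ceil(math.sqrt(data.size)))
    height = max(1, math.ceil(data.size / width))

    padded = np.zeros(width * height, dtype=np.uint8)
    padded[: data.size] = data
    return padded.reshape(height, width)
```

For example, the 5-byte prefix `0x6080604052` yields a 2 × 3 image in which the last pixel is padding.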

Algorithm 1 Image-Based Scam Detection Method Using an Attention Capsule Network
Input: Ponzi scheme dataset: download the bytecode (B_i) and ABI (A_i) of each contract i;
Output: Detection results: accuracy, precision, recall, and F1 score;
1: Step 1: Feature visualization
2: While (A_i != null) and (B_i != null) do
3:     bytecode sequence = bytearray(B_i);
4:     Image_BY_i = visualization(bytecode sequence); // every eight-bit binary number is converted to a decimal number and then mapped into an image
5:     opcode sequence = disassembly(B_i);
6:     opcode weight = TF-IDF(opcode sequence);
7:     opcode frequency sequence = SimHash(opcode weight);
8:     Image_OP_i = visualization(opcode frequency sequence);
9:     ABI call sequence = DFS(A_i);
10:    get binary file after PV-DM(ABI call sequence);
11:    Image_AB_i = visualization(binary of ABI call sequence);
12:    Feature_image_i = merge(Image_BY_i, Image_OP_i, Image_AB_i);
13: End While
14: End Step 1
15: Step 2: Data balancing
16:    New images = Fancy PCA(Feature_image_K); // K stands for Ponzi contracts
17:    New dataset = Feature_image_i + New images;
18: End Step 2
19: Step 3: Model detection
20:    Add SE block after the convolutional layer of CapsNet to build the SE-CapsNet;
21:    Train the SE-CapsNet;
22:    Input the test samples into the trained network to detect;
23: End Step 3
24: Return the detection result.

FIGURE 2. The process of producing bytecode sequence.

FIGURE 3. The process of generating opcode frequency sequence.

opcodes. EVM uses single-byte opcodes, which means that each byte represents an operation or instruction. Based on the term frequency–inverse document frequency (TF-IDF) algorithm, we want to obtain opcodes that are frequently used and helpful for the Ponzi contract detection task. To obtain a square image, we select the 32 opcodes with the largest TF-IDF weights from the disassembly results. The calculation formulas of TF-IDF are shown in (1)-(3):

$$\mathrm{TF\text{-}IDF}_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_{i} \tag{1}$$

$$\mathrm{TF}_{i,j} = \frac{N_{i,j}}{\sum_{k} N_{k,j}} \tag{2}$$

$$\mathrm{IDF}_{i} = \log\left(\frac{D}{D_{i} + 1}\right) \tag{3}$$

In this paper, we disassemble each contract to obtain the corresponding opcodes, thus forming an opcode document for each contract. Here, TF stands for term frequency, the numerator N_{i,j} represents the number of occurrences of opcode i in document j, and the denominator is the sum of occurrences of all opcodes k in document j. IDF stands for inverse document frequency, D stands for the total number of opcode documents, and D_i stands for the total number of documents that contain opcode i.

SimHash is a locality-sensitive hash algorithm [36], which means that similar inputs yield similar outputs to maintain data similarity. SimHash is used to extract the binary information of the 32 opcodes by calculating the hash value of each opcode through the hash function. Using this method, each opcode can be encoded into binary code with a fixed length of 64 bits. Then, all the binary codes representing the opcodes are linked together and split into groups of eight binary digits, each of which is converted to a decimal number, to generate the image of the opcode frequency sequence. The image generation process is the same as the steps for generating bytecode sequence images. Figure 3 shows the process of generating an image for the opcode frequency sequence.

3) ABI CALL SEQUENCE

The ABI is the standard way to interact with contracts in the Ethereum ecosystem, and it is similar to the API of an application. The calling relationship between functions and events can be expressed effectively by the ABI, as we generally expect the ABI call sequences of scam contracts to differ from those of regular contracts. By extracting the ABI call sequence from a smart contract, we can discover the contextual relationships between ABIs. Therefore, we choose the ABI call sequence as a feature in this paper.

First, we traverse the ABI of each contract with depth-first searches and obtain the calling order between ABIs. Then, the
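
Equations (1)-(3) can be sketched in plain Python as follows. This is an illustration under stated assumptions: each contract is represented as a list of opcode mnemonics (one "opcode document" per contract), the logarithm is taken to be the natural logarithm, and the helper names `opcode_tfidf` and `top_opcodes`, as well as the alphabetical tie-breaking, are hypothetical choices that the paper does not specify.

```python
import math
from collections import Counter


def opcode_tfidf(documents):
    """Return one {opcode: TF-IDF weight} dict per opcode document, per Eqs. (1)-(3)."""
    total_docs = len(documents)          # D: number of opcode documents
    doc_freq = Counter()                 # D_i: documents containing opcode i
    for doc in documents:
        doc_freq.update(set(doc))

    weights = []
    for doc in documents:
        counts = Counter(doc)            # N_{i,j}: occurrences of opcode i in document j
        total = sum(counts.values())     # sum over k of N_{k,j}
        weights.append({
            op: (n / total) * math.log(total_docs / (doc_freq[op] + 1))
            for op, n in counts.items()
        })
    return weights


def top_opcodes(weight, k=32):
    """Select the k opcodes with the largest weights (ties broken alphabetically)."""
    ranked = sorted(weight.items(), key=lambda item: (-item[1], item[0]))
    return [op for op, _ in ranked[:k]]
```

Note that, because of the +1 in the denominator of (3), an opcode that occurs in every document receives a slightly negative weight, so ubiquitous opcodes such as `PUSH1` are ranked last under this formula.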

FIGURE 4. The process of generating ABI call sequence.

FIGURE 5. RGB image generation.

corresponding word segmentation task is performed. Next, we select the distributed memory model of paragraph vectors (PV-DM) [37] to generate paragraph vectors that can be used as the feature. In particular, the model adds a paragraph token that maps each paragraph to a unique vector. Each ABI sequence after text processing is regarded as a paragraph, and the paragraph vector of each ABI is obtained after training. Considering that the generated image cannot be too small, we aim to generate a 1024-dimensional paragraph vector for each ABI sequence, store it as a .bin file, and then convert the binary numbers to decimal numbers. The process of generating the ABI call sequence image is shown in Figure 4.

4) IMAGE MERGING

The three types of sequences reflect features of three different states. By mapping the bytecode sequence onto the R channel, the opcode frequency sequence onto the G channel, and the ABI call sequence onto the B channel, and finally merging the three channels together to form an RGB image, our method not only combines multiple features in a reasonable and orderly way but also compensates for the shortcomings of single features to some extent. Figure 5 illustrates how an RGB image is merged.

B. DATA BALANCING

The data of scam contracts are usually extremely unbalanced. In most cases, contracts are normal. Therefore, the number of Ponzi contracts is very small compared to the number of normal contracts. Thus, there is a data imbalance problem between the Ponzi images and the non-Ponzi images generated in this paper. Unlike in the field of image recognition, the images produced in this paper must have their integrity ensured. Fancy PCA (PCA jittering) has the advantages of simplicity and efficient operation [38], so it is suitable for the rapid augmentation of Ponzi contracts.

Fancy PCA performs principal component analysis on the pixel values of RGB images to obtain a 3 × 3 covariance matrix, calculates the corresponding eigenvalues and eigenvectors, and then arranges them in descending order. The augmented image is composed of the pixel value of the original RGB image plus the result of the following formula. The calculation formula is shown as (4):

$$[P_1, P_2, P_3]\,[\alpha_1 \lambda_1, \alpha_2 \lambda_2, \alpha_3 \lambda_3]^{T} \tag{4}$$

where P_1, P_2, P_3 are the eigenvectors, λ_1, λ_2, λ_3 represent the eigenvalues, and α_i is a random variable drawn from a Gaussian with mean 0 and standard deviation 0.1 in the original paper. New Ponzi images are generated by adjusting the standard deviation parameter; together with the original Ponzi images, they form a new Ponzi dataset, which balances the data and ultimately improves the detection performance of the proposed method.

C. MODEL DETECTION

The applications of deep learning in the related field of malware detection have achieved excellent research results; in particular, the capsule network considers the relationships between features, and this approach has advantages when applied to small samples. This paper combines a channel attention mechanism, called the SE block, with the capsule network to form the SE-CapsNet model, which is mainly composed of the following four layers.

1) CONVOLUTIONAL LAYER

The first layer is a simple convolutional layer designed to extract local features using 3 × 3 convolution kernels with a step size of 1 in combination with the ReLU activation function.

2) SE LAYER

The SE block is simple to use, improves the feature extraction ability of the model, and is conducive to classification. It consists of two operations: squeeze and excitation. The purpose of the squeeze operation is to obtain the global features of a given channel. u_C represents the C-th feature map, which is output by the convolutional layer. Through global average pooling, we can obtain channel-wise statistics z_C. The excitation operation is the process of learning channel weights, where σ represents the sigmoid activation function, δ denotes the ReLU function, and W_1 and W_2 are the dimensionality-reducing and dimensionality-increasing operations, respectively. Through the excitation operation, we can quickly learn a nonlinear interaction between channels and finally obtain the learned channel weights s. Finally,
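
The fancy PCA augmentation in (4) can be sketched with NumPy as follows. The sketch assumes an 8-bit RGB image of shape (H, W, 3) with more than one pixel, rescales pixel values to [0, 1] before the PCA, and clips the result back to the valid range. These choices, the function name `fancy_pca`, and the default `sigma=0.1` (the standard deviation quoted in Section III-B) are assumptions rather than details given by the authors.

```python
import numpy as np


def fancy_pca(image, sigma=0.1, rng=None):
    """Return a PCA-jittered copy of an 8-bit RGB image of shape (H, W, 3)."""
    rng = np.random.default_rng() if rng is None else rng

    # Treat every pixel as a 3-dimensional sample (R, G, B), rescaled to [0, 1].
    pixels = image.reshape(-1, 3).astype(np.float64) / 255.0
    cov = np.cov(pixels - pixels.mean(axis=0), rowvar=False)   # 3 x 3 covariance

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Eq. (4): [P1, P2, P3] [a1*l1, a2*l2, a3*l3]^T with a_i ~ N(0, sigma^2).
    alphas = rng.normal(0.0, sigma, size=3)
    shift = eigvecs @ (alphas * eigvals)

    # Add the same RGB shift to every pixel, then map back to 8-bit values.
    augmented = np.clip(pixels + shift, 0.0, 1.0) * 255.0
    return augmented.round().astype(np.uint8).reshape(image.shape)
```

Calling `fancy_pca` repeatedly on the same Ponzi image with different random draws (or a different `sigma`) produces distinct augmented images; with `sigma=0.0` the image is returned unchanged.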

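
The squeeze, excitation, and scale operations of the SE block, formalized in (5)-(7), can be sketched for a single input in NumPy. The weight shapes, the reduction to `C // r` hidden units, and the omission of bias terms follow the form of (6) and are otherwise assumptions; `se_block` is a hypothetical name, and a trained model would learn `w1` and `w2` rather than receive them as arguments.

```python
import numpy as np


def se_block(u, w1, w2):
    """Apply squeeze, excitation, and scale to feature maps u of shape (C, H, W).

    w1 has shape (C // r, C) and reduces the channel dimension;
    w2 has shape (C, C // r) and restores it.
    """
    # Squeeze, Eq. (5): global average pooling gives one statistic per channel.
    z = u.mean(axis=(1, 2))

    # Excitation, Eq. (6): s = sigmoid(W2 * relu(W1 * z)).
    hidden = np.maximum(w1 @ z, 0.0)
    s = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))

    # Scale, Eq. (7): reweight every feature map by its channel weight.
    return s[:, None, None] * u, s
```

With all-zero weights, every channel weight equals sigmoid(0) = 0.5, so the output is simply half of the input; learned weights instead emphasize informative channels and suppress the others.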

through a scale operation, the learned channel weights are multiplied with the original feature maps to obtain the attention feature maps as the output of the SE block. The calculation formulas are shown in (5)-(7):

$$z_C = F_{sq}(u_C) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_C(i, j) \tag{5}$$

$$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\, \delta(W_1 z)) \tag{6}$$

$$\tilde{x}_C = F_{scale}(u_C, s_C) = s_C \cdot u_C \tag{7}$$

3) PRIMARYCAPS LAYER
After the SE block, we take each feature map with its corresponding attention weight as the input of PrimaryCaps. The PrimaryCaps layer is different from an ordinary convolution layer. According to its definition, the output of this layer is a set of capsules. Capsules can also be called vectors, and they can store a large amount of information.

4) DIGITCAPS LAYER
The DigitCaps layer is used to store Ponzi and non-Ponzi capsules. The final output is represented by a vector. The capsule network uses a squashing function: while maintaining the direction of the vector, the length of the output vector is used as the probability of the presence of an entity. The calculation formulas between capsule i and capsule j are shown in (8)-(10):

$$\hat{u}_{j|i} = W_{ij} u_i \tag{8}$$

$$s_j = \sum_{i} c_{ij} \hat{u}_{j|i} \tag{9}$$

$$v_j = \frac{\lVert s_j \rVert^2}{1 + \lVert s_j \rVert^2} \cdot \frac{s_j}{\lVert s_j \rVert} \tag{10}$$

where $W_{ij}$ is the weight matrix representing the relationship between capsule i and capsule j, and $\hat{u}_{j|i}$ is the prediction that the i-th low-level capsule makes for the j-th high-level capsule. $c_{ij}$ is the coupling coefficient obtained through dynamic routing. The output $v_j$ is the result of the final squashing function. The process between capsule layers is shown in Figure 6.

FIGURE 6. Connection process between capsule layers.

IV. EXPERIMENTAL EVALUATION
A. DATASET
This paper uses the public Ponzi scheme dataset [17], in which the code logic of each contract was checked manually to determine whether the contract is a Ponzi scheme. The creators marked 3590 non-Ponzi contracts and 200 Ponzi contracts. The bytecodes and ABIs of the contracts are obtained through the API provided by Etherscan [6], and a contract is removed if its bytecode or ABI is null. There are 27 such abnormal non-Ponzi contracts, so 3563 non-Ponzi contracts are finally selected as the data for this paper. Then, the features of the downloaded bytecodes and ABIs are extracted, processed, visualized, and converted into RGB images, yielding a relatively imbalanced dataset. After image data augmentation, additional Ponzi images are obtained, forming a more balanced dataset. For this experiment, 70% of the data are selected randomly as the training set, 10% as the validation set, and the remainder as the test set.

B. METRICS
For single-label image classification problems, accuracy, precision, recall, F1 score, ROC and AUC are usually chosen as evaluation metrics. Considering the experimental data, this paper selects the first four metrics to measure the effectiveness of the proposed method. The calculation formulas are shown in (11)-(14):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{12}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{13}$$

$$F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{14}$$

Here, TP is the number of Ponzi contracts that are correctly judged as Ponzi contracts; FN is the number of real Ponzi contracts that are judged as non-Ponzi contracts; FP is the number of non-Ponzi contracts that are misjudged as Ponzi contracts; and TN is the number of non-Ponzi contracts that are correctly judged as non-Ponzi contracts.

C. EXPERIMENTAL SETUP
This paper combines the SE block and the capsule network to detect Ponzi contracts on Ethereum. The image width of the model inputs may affect the results of the experiments. According to the image widths recommended for various file sizes in [20], the image width for files smaller than 10 KB is generally selected as 32. Therefore, we uniformly use 32 × 32 RGB images as the input of the model.
In terms of experimental settings, we implement our method in Python. For feature visualization, we employ NumPy, Pandas, OpenCV and other Python packages for image feature extraction and processing. The models are built with Keras and TensorFlow. During the experiment, the parameters of the model affect the training results. Considering the actual situation regarding the type of detection task, the number of samples, the memory size of


the CPU and the time consumption of the training process, we select appropriate parameters, as shown in Table 1.

TABLE 1. The parameters in the proposed method.

D. EXPERIMENTAL RESULTS AND ANALYSIS
1) COMPARISON WITH DIFFERENT DATA AUGMENTATION METHODS
Imbalanced data is the primary problem that needs to be resolved. After data screening, the ratio of Ponzi contracts to non-Ponzi contracts is approximately 1:18, which is a serious data imbalance. Existing data augmentation methods include perspective skewing, elastic distortions, rotating, shearing, cropping, mirroring, etc. The fancy PCA method realizes image augmentation mainly by changing the intensity of the RGB channels in the training images. Figure 7 shows the experimental results of image augmentation using the skewing, shearing, rotating, cropping, mirroring, and fancy PCA methods.

FIGURE 7. Comparison with different data augmentation methods.

As shown in Figure 7, the recall of fancy PCA is lower than those of cropping and skewing, but fancy PCA obtains the best results for the other metrics. Its F1 score is 4.30% and 4.14% higher than those of cropping and skewing, respectively. Its accuracy is approximately 2% higher than those of the other five methods. After analyzing the experimental results and weighing all the metrics, we select the fancy PCA method for data augmentation, as it achieves the best overall effect and has a positive influence on the classification results.

2) PERFORMANCE EVALUATION OF DATASETS WITH DIFFERENT RATIOS
When the number of Ponzi contracts is insufficient, the classification effect of the model is seriously affected, resulting in large classification errors and an extremely low detection rate with respect to Ponzi contracts. By changing the value of the standard deviation αi in fancy PCA, we can obtain many new Ponzi contract images. In real life, the number of non-Ponzi contracts is generally greater than that of Ponzi contracts. Under the assumption that this condition is met, we explore the impact of different ratios of non-Ponzi contracts to Ponzi contracts on the experimental results. The experimental results are shown in Table 2.

TABLE 2. Performance evaluation of datasets with different ratios.

For imbalanced data, the accuracy metric is not informative, because a model that is biased towards the majority class can easily achieve a high accuracy. In this part, we use the weighted average to calculate the precision, recall, and F1 score of extremely imbalanced data. From the table, we can see that when the original dataset is used for Ponzi contract detection without image augmentation, the model predicts most of the test set data as non-Ponzi contracts, resulting in very low precision, recall and F1 score. When the ratio between the two contract types is approximately 1:5, the F1 score rises steadily, reaching 71.80%. Finally, when the ratio of Ponzi contracts to non-Ponzi contracts is 1:2, all evaluation metrics achieve relatively good results. This ratio is also consistent with the fact that there are more non-Ponzi contracts than Ponzi contracts. From this, we can see that expanding the dataset to balance the data and selecting an appropriate data ratio are critical to the effectiveness of the experimental results.

3) COMPARISON WITH EXPERIMENTAL RESULTS OBTAINED BY DIFFERENT MODELS
Next, we examine whether different models have an impact on the detection of Ponzi schemes. Nine models are selected: Random Forest, XGBoost, AdaBoost, LightGBM, VGGNet, ResNet, MiniGoogLeNet, MobileNet, and DenseNet. The detection results obtained by the different models in the comparative experiment are shown in Table 3.

TABLE 3. Experimental results obtained by different models.
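The weighted-average precision, recall, and F1 score used for the imbalanced-data results can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "weighted average" means averaging the per-class values of (12)-(14) with weights proportional to the number of true samples in each class, and the function names `class_counts` and `weighted_metrics` are hypothetical.

```python
import numpy as np


def class_counts(y_true, y_pred, label):
    """Return (TP, FP, FN) when `label` is treated as the positive class."""
    tp = int(np.sum((y_true == label) & (y_pred == label)))
    fp = int(np.sum((y_true != label) & (y_pred == label)))
    fn = int(np.sum((y_true == label) & (y_pred != label)))
    return tp, fp, fn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed scheme)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    labels, support = np.unique(y_true, return_counts=True)

    precision = recall = f1 = 0.0
    for label, n in zip(labels, support):
        tp, fp, fn = class_counts(y_true, y_pred, label)
        p = safe_div(tp, tp + fp)          # Eq. (12) for this class
        r = safe_div(tp, tp + fn)          # Eq. (13) for this class
        f = safe_div(2 * p * r, p + r)     # Eq. (14) for this class
        weight = n / len(y_true)           # share of true samples in this class
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = float(np.mean(y_true == y_pred))  # Eq. (11)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Usage: label 1 = Ponzi, label 0 = non-Ponzi (toy labels, not paper data).
metrics = weighted_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Under this weighting scheme the majority class dominates the average, so the per-class values for the Ponzi class are worth inspecting alongside the weighted figures.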


TABLE 4. The SE-CapsNet model compared with the CapsNet.

TABLE 5. Performance comparison only based on contract information.

FIGURE 8. The time consumption of each step.

As seen from Table 3, the image-based scam detection method achieves good results with both machine learning and deep learning models. XGBoost performs better than the other machine learning models. The detection methods based on deep learning obtain higher F1 scores, which means that deep learning can be applied to Ponzi contract detection.
However, due to the complex structures of models such as DenseNet and MobileNet, their results under the same training conditions are not yet optimal. In particular, SE-CapsNet yields good experimental results in most evaluation metrics. Its accuracy is 0.71% higher than that of VGGNet, and its F1 score is 0.99% higher than that of MiniGoogLeNet. The performance improvement is probably due to the architecture of SE-CapsNet. The SE-CapsNet model can not only retain a large amount of information such as position, but can also highlight the key channel information through the SE block. To verify whether the introduction of the SE block has an impact on the effectiveness of the model, we next compare the SE-CapsNet model with CapsNet alone. Table 4 shows the results of this experiment.
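For reference, the SE block whose contribution is tested here follows the squeeze, excitation, and scale steps of (5)-(7). The NumPy sketch below is illustrative only and is not the authors' Keras implementation: bias terms are omitted, random weights stand in for learned parameters, and the reduction ratio `r` and the function name `se_block` are assumptions rather than settings taken from this paper.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid, the sigma in Eq. (6)."""
    return 1.0 / (1.0 + np.exp(-x))


def se_block(u, w1, w2):
    """Apply squeeze, excitation, and scale to feature maps.

    u  : array of shape (H, W, C), the convolutional feature maps u_C
    w1 : array of shape (C // r, C), dimensionality-reducing weights W_1
    w2 : array of shape (C, C // r), dimensionality-increasing weights W_2
    Returns the reweighted feature maps and the channel weights s.
    """
    z = u.mean(axis=(0, 1))                      # squeeze, Eq. (5): shape (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))    # excitation, Eq. (6): ReLU then sigmoid
    return u * s, s                              # scale, Eq. (7): broadcast over H and W


# Usage with random stand-in values (C = 16 channels, assumed ratio r = 4).
rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(8, 8, 16))
w_reduce = rng.normal(size=(4, 16))
w_expand = rng.normal(size=(16, 4))
attended, channel_weights = se_block(feature_maps, w_reduce, w_expand)
```

Each channel weight lies strictly between 0 and 1, so the block rescales channels without changing the spatial layout of the feature maps passed on to PrimaryCaps.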
An analysis of the experimental results shows that the accuracy of the SE-CapsNet model is 98.97% and the F1 score reaches 98.38%, so SE-CapsNet obtains desirable classification results. Compared with the CapsNet model alone, the accuracy is improved by 0.54%, while the F1 score is increased by 0.51%. Therefore, we can conclude that the introduction of the SE block promotes the classification performance of the proposed method for detecting Ponzi scheme contracts.

4) COMPARISON BETWEEN DIFFERENT DETECTION METHODS
Based on the research in [17], this paper proposes a variety of features based on contract information. In recent years, the amount of related research has gradually increased. However, some researchers use both contract and transaction features for detection. Methods that rely on transaction features cannot discover Ponzi contracts in time. Below, we use only contract information to compare our proposed method with the corresponding methods in prior work.
As shown in Table 5, our method can be used to detect Ponzi contracts as soon as they are deployed to the blockchain. It has an F1 score of 98%, which is improved from 82%, 95% and 96% in prior works. Through experiments, we can see that our proposed method is not only effective for early Ponzi scheme detection but also improves the accuracy of detection.

5) TIME CONSUMPTION
To further explore the efficiency of the experiment, we recorded the time consumption of each step: feature visualization, data balancing and model detection. The experimental results are shown in Figure 8.
In Figure 8, the blue, yellow and green bars represent the time consumption of feature visualization, data balancing and model detection, respectively. The processing time of the data balancing module is the shortest, while the feature visualization component takes up a large amount of time. Within feature visualization, the time needed to extract the opcodes and convert them into opcode frequency sequences is about 16 minutes. Further calculation shows that processing each contract takes approximately 0.39 s, of which feature visualization takes 0.27 s.

FIGURE 9. Performance comparison with different features.

6) INFLUENCES OF DIFFERENT FEATURES ON THE EXPERIMENTAL RESULTS
To clarify the effects of the bytecode sequence, the opcode frequency sequence and the ABI call sequence on the performance of the experiment, we set the pixels of the above


three types of feature images to 0 in turn to obtain all-black images. We then combine each black image with the remaining feature images to form the input of the SE-CapsNet model. Next, we determine which feature is most important to the experimental results. Figure 9 compares the experiments using different features.

FIGURE 10. The Ponzi and non-Ponzi images obtained by Grad-CAM calculation.

TABLE 6. Analysis on the detection results of honeypot contract.
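The channel-zeroing procedure described in this subsection can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes each feature image is a 2-D uint8 array of the same size, follows the channel assignment stated for image merging (bytecode on R, opcode frequency on G, ABI calls on B), and uses the hypothetical helper names `merge_rgb` and `ablate`.

```python
import numpy as np

# Channel index of each feature in the merged RGB image (R, G, B order).
CHANNELS = {"bytecode": 0, "opcode_freq": 1, "abi_call": 2}


def merge_rgb(bytecode, opcode_freq, abi_call):
    """Stack three 2-D feature images into one (H, W, 3) RGB image."""
    return np.stack([bytecode, opcode_freq, abi_call], axis=-1).astype(np.uint8)


def ablate(image, feature):
    """Return a copy of `image` with the channel of `feature` set to 0 (all black)."""
    out = image.copy()
    out[..., CHANNELS[feature]] = 0
    return out


# Usage with random stand-in data for one 32 x 32 contract image.
rng = np.random.default_rng(0)
r, g, b = (rng.integers(0, 256, size=(32, 32), dtype=np.uint8) for _ in range(3))
full_image = merge_rgb(r, g, b)
without_opcodes = ablate(full_image, "opcode_freq")
```

Zeroing a channel instead of dropping it keeps the input shape at 32 × 32 × 3, so the same model architecture can be trained on every ablated variant. If images are written with OpenCV, note that it stores channels in BGR order.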
As shown in Figure 9, Ponzi contract detection without the opcode frequency sequence feature achieves poor experimental results: the F1 score is 86.78%, and the accuracy is 90.71%. This shows that the opcode frequency sequence effectively enhances the performance of the model. There is not much difference between the results of the experiments without the bytecode sequence and without the ABI call sequence. Their accuracy rates still exceed 95%. Although these two features are not the most important features affecting the performance of the model, they are still an indispensable part of the method proposed in this paper.

7) FEATURE INTERPRETABILITY ANALYSIS
As research on interpretability has deepened, numerous interpretable models have emerged, making the black box of neural networks easier for humans to understand to some extent. Grad-CAM [40], a technique that provides visual interpretation, is mainly adopted in this paper. Using this method, the pixels that influence the predicted category can be obtained and highlighted on the original image.
To analyze the extracted features clearly, this paper uses a convolutional neural network (CNN) to train on the images of the bytecode sequence, the opcode frequency sequence and the ABI call sequence, and then carries out Grad-CAM calculations, visually displaying the differences between Ponzi contracts and non-Ponzi contracts. Figure 10 shows the images of the Ponzi contract and the non-Ponzi contract obtained by Grad-CAM calculations.
From Figure 10, we know that although the three features selected are different, the images generated by the Ponzi contract and the non-Ponzi contract show regular patterns. From the perspective of the bytecode sequence features, the pixels that affect the classification results of the Ponzi images are mainly concentrated in the right and middle areas, showing an overall sporadic distribution trend. For non-Ponzi images, the pixels highlighted by Grad-CAM are mostly concentrated in the tail row. We can see that the main pixels determined using the opcode frequency sequence and the ABI call sequence features are similar. The calculation result of the non-Ponzi images is highlighted in the left area, at regular intervals. Highlighted pixels of Ponzi images can be clearly observed in the first row and the right area. One can see that there are obvious logical differences between Ponzi contracts and non-Ponzi contracts. When a given contract has source code, it is possible to analyze the end of the source code, the most frequent opcode sequence, and the front part of the ABI call sequence, thereby enabling highly efficient smart contract scam detection.

8) CASE STUDY
To demonstrate whether the proposed method can be used to detect other fraudulent accounts, we next detect a newer type of scam smart contract in Ethereum called a "honeypot". Scammers deliberately design contracts with a flaw to entice greedy users to exploit that flaw, thereby draining the funds of those users and leading to irreparable losses [41]. This paper uses the 1124 honeypot contract accounts from the HONEYBADGER project [15] as the fraud dataset for the case study. Combined with 3563 benign accounts, after feature visualization, the SE-CapsNet model is used to detect honeypot contracts. The results of the case study are shown in Table 6.
From the experimental results, we can see that the accuracy of the SE-CapsNet model reaches 97.67%, and the F1 score reaches 94.44%. It can be seen from the case study that the


method proposed in this paper can detect not only Ponzi contracts with imbalanced samples but also relatively balanced honeypot contracts, which has research value for the detection of other scam accounts in Ethereum.

V. CONCLUSION
The trend towards using smart contracts to execute scams is becoming increasingly severe. A Ponzi scheme is a typical scam method in Ethereum. It is difficult to detect Ponzi schemes in real time with traditional transaction-based methods. Therefore, this paper uses only contract information for Ponzi scheme detection. The proposed method uses the bytecode and ABI of the contract for detection and analysis to overcome the limitation that results from using only the source code of the contract. After feature visualization, SE-CapsNet is used to detect Ponzi schemes in Ethereum. The detection results are better than those of other detection methods. However, the method in this paper has two shortcomings. One is that the training time required for the model is relatively long, and the other is that the number of available Ponzi contracts is insufficient. In the future, we may consider continuing to improve the experiment, collecting additional Ponzi samples, appropriately increasing the mapping relationships between features to enrich the feature space, and optimizing the detection process in this paper (for example, using a two-sample test for detection [42]). At the same time, we can extend the applicability of the proposed method to other types of fraud detection, such as ransomware and fake token sales.

ACKNOWLEDGMENT
The authors would like to thank reviewers for their precious remarks and comments.

REFERENCES

[1] J. Wu, J. Liu, W. Chen, H. Huang, Z. Zheng, and Y. Zhang, "Detecting mixing services via mining bitcoin transaction network with hybrid motifs," 2020, arXiv:2001.05233. [Online]. Available: http://arxiv.org/abs/2001.05233
[2] Chainalysis. (Jan. 2020). The 2020 State of Crypto Crime. [Online]. Available: https://blog.chainalysis.com/reports/cryptocurrency-crime-2020-report
[3] T. Moore, J. Han, and R. Clayton, "The postmodern Ponzi scheme: Empirical analysis of high-yield investment programs," in Proc. 16th Int. Conf. Financial Cryptogr. Data Secur., Mar. 2012, pp. 41–56.
[4] Q. Bai, C. Zhang, Y. Xu, X. Chen, and X. Wang, "Evolution of ethereum: A temporal graph perspective," 2020, arXiv:2001.05251. [Online]. Available: http://arxiv.org/abs/2001.05251
[5] S. Rouhani and R. Deters, "Security, performance, and applications of smart contracts: A systematic survey," IEEE Access, vol. 7, pp. 50759–50779, Apr. 2019.
[6] (Jun. 2019). Etherscan. [Online]. Available: https://etherscan.io/
[7] W. Joon-Wie Tann, X. Jie Han, S. Sen Gupta, and Y.-S. Ong, "Towards safer smart contracts: A sequence learning approach to detecting security threats," 2018, arXiv:1811.06632. [Online]. Available: http://arxiv.org/abs/1811.06632
[8] G. Tian, Q. Wang, Y. Zhao, L. Guo, Z. Sun, and L. Lv, "Smart contract classification with a bi-LSTM based approach," IEEE Access, vol. 8, pp. 43806–43816, Mar. 2020.
[9] S. Farrugia, J. Ellul, and G. Azzopardi, "Detection of illicit accounts over the ethereum blockchain," Expert Syst. Appl., vol. 150, Jul. 2020, Art. no. 113318.
[10] L. Bian, L. Zhang, K. Zhao, and F. Shi, "Ethereum malicious account detection method based on LightGBM," Netinfo Secur., vol. 20, no. 4, pp. 73–80, Apr. 2020.
[11] J. Wu, Q. Yuan, D. Lin, W. You, W. Chen, C. Chen, and Z. Zheng, "Who are the phishers? Phishing scam detection on ethereum via network embedding," 2019, arXiv:1911.09259. [Online]. Available: http://arxiv.org/abs/1911.09259
[12] K. Toyoda, T. Ohtsuki, and P. T. Mathiopoulos, "Identification of high yielding investment programs in bitcoin via transactions pattern analysis," in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2017, pp. 1–6.
[13] M. Bartoletti, B. Pes, and S. Serusi, "Data mining for detecting bitcoin ponzi schemes," in Proc. Crypto Valley Conf. Blockchain Technol. (CVCBT), Jun. 2018, pp. 75–84.
[14] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, "Dissecting ponzi schemes on ethereum: Identification, analysis, and impact," Future Gener. Comput. Syst., vol. 102, pp. 259–277, Jan. 2020.
[15] C. Ferreira Torres, M. Steichen, and R. State, "The art of the scam: Demystifying honeypots in ethereum smart contracts," 2019, arXiv:1902.06976. [Online]. Available: http://arxiv.org/abs/1902.06976
[16] T. Chen, T. Hu, J. Chen, X. Zhang, Z. Li, Y. Zhang, X. Luo, A. Chen, K. Yang, B. Hu, T. Zhu, and S. Deng, "DataEther: Data exploration framework for ethereum," in Proc. IEEE 39th Int. Conf. Distrib. Comput. Syst. (ICDCS), Jul. 2019, pp. 1369–1380.
[17] W. Chen, Z. Zheng, E. C.-H. Ngai, P. Zheng, and Y. Zhou, "Exploiting blockchain data to detect smart ponzi schemes on ethereum," IEEE Access, vol. 7, pp. 37575–37586, Mar. 2019.
[18] E. Jung, M. Le Tilly, A. Gehani, and Y. Ge, "Data mining-based ethereum fraud detection," in Proc. IEEE Int. Conf. Blockchain (Blockchain), Jul. 2019, pp. 266–273.
[19] A. Kartel, E. Novikova, and A. Volosiuk, "Analysis of visualization techniques for malware detection," in Proc. IEEE Conf. Russian Young Researchers Electr. Electron. Eng. (EIConRus), Jan. 2020, pp. 337–340.
[20] L. Nataraj, S. Karthikeyan, G. Jacob, and B. S. Manjunath, "Malware images: Visualization and automatic classification," in Proc. IEEE 8th Int. Symp. Vis. Cyber Secur., Jul. 2011, pp. 1–7.
[21] Z. Cui, F. Xue, X. Cai, Y. Cao, G.-G. Wang, and J. Chen, "Detection of malicious code variants based on deep learning," IEEE Trans. Ind. Informat., vol. 14, no. 7, pp. 3187–3196, Jul. 2018.
[22] Z. Cui, L. Du, P. Wang, X. Cai, and W. Zhang, "Malicious code detection based on CNNs and multi-objective algorithm," J. Parallel Distrib. Comput., vol. 129, pp. 50–58, Jul. 2019.
[23] H. Naeem, B. Guo, M. R. Naeem, F. Ullah, H. Aldabbas, and M. S. Javed, "Identification of malicious code variants based on image visualization," Comput. Electr. Eng., vol. 76, pp. 225–237, Jun. 2019.
[24] B. Sun, P. Zhang, M. Cheng, X. Li, and Q. Li, "Malware detection method based on enhanced code images," J. Tsinghua Univ. (Sci. Technol.), vol. 60, no. 5, pp. 386–392, Apr. 2020.
[25] Y. Fang, Y. Gao, F. Jing, and L. Zhang, "Android malware familial classification based on DEX file section features," IEEE Access, vol. 8, pp. 10614–10627, Jan. 2020.
[26] J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, "Squeeze-and-excitation networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 8, pp. 2011–2023, Aug. 2020.
[27] H. Yu, "Research on classification of retinal diseases based on SE-block," M.S. thesis, College Softw., Jilin Univ., Jilin, China, 2019.
[28] S. Sabour, N. Frosst, and G. E. Hinton, "Dynamic routing between capsules," in Proc. 31st Neural Inf. Process. Syst., Dec. 2017, pp. 3856–3866.
[29] Z. Dong and S. Lin, "Research on image classification based on capsnet," in Proc. IEEE 4th Adv. Inf. Technol., Electron. Autom. Control Conf. (IAEAC), Dec. 2019, pp. 1023–1026.
[30] R. Mukhometzianov and J. Carrillo, "CapsNet comparative performance evaluation for image classification," 2018, arXiv:1805.11195. [Online]. Available: http://arxiv.org/abs/1805.11195
[31] S. Bonheur, D. Štern, C. Payer, M. Pienn, H. Olschewski, and M. Urschler, "Matwo-CapsNet: A multi-label semantic segmentation capsules network," in Proc. 22nd Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., Oct. 2019, pp. 664–672.
[32] C. Xiang, L. Zhang, Y. Tang, W. Zou, and C. Xu, "MS-CapsNet: A novel multi-scale capsule network," IEEE Signal Process. Lett., vol. 25, no. 12, pp. 1850–1854, Dec. 2018.
[33] X. Cheng, J. He, J. He, and H. Xu, "Cv-CapsNet: Complex-valued capsule network," IEEE Access, vol. 7, pp. 85492–85499, Jun. 2019.


[34] (Apr. 2020). Solidity. [Online]. Available: https://solidity.readthedocs.io/en/v0.6.6/
[35] (Jul. 2020). Ethereum-Dasm. [Online]. Available: https://github.com/tintinweb/ethereum-dasm
[36] M. S. Charikar, "Similarity estimation techniques from rounding algorithms," in Proc. 34th Annu. ACM Symp. Theory Comput. (STOC), 2002, pp. 380–388.
[37] Q. V. Le and T. Mikolov, "Distributed representations of sentences and documents," in Proc. 31st Int. Mach. Learn., Jun. 2014, pp. 1188–1196.
[38] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. 25th Neural Inf. Process. Syst., Dec. 2012, pp. 1097–1105.
[39] J. Peng and G. Xiao, "Detection of smart Ponzi schemes using opcode," in Proc. 2nd BlockSys, Aug. 2020, pp. 192–204.
[40] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," Int. J. Comput. Vis., vol. 128, no. 2, pp. 336–359, Feb. 2020.
[41] R. Camino, C. F. Torres, M. Baden, and R. State, "A data science approach for detecting honeypots in ethereum," in Proc. IEEE Int. Conf. Blockchain Cryptocurrency (ICBC), May 2020, pp. 1–9.
[42] R. Gao, F. Liu, J. Zhang, B. Han, T. Liu, G. Niu, and M. Sugiyama, "Maximum mean discrepancy is aware of adversarial attacks," 2020, arXiv:2010.11415. [Online]. Available: http://arxiv.org/abs/2010.11415

LINGYU BIAN received the B.S. degree in network engineering from Xinjiang University, in 2018, where she is currently pursuing the master's degree with the School of Cyber Science and Engineering, College of Information Science and Engineering. Her research interests include blockchain security and data analysis.

LINLIN ZHANG received the Ph.D. degree in computer software and theory from Wuhan University, in 2009. She is currently an Associate Professor with the School of Cyber Science and Engineering, College of Information Science and Engineering, Xinjiang University. Her current research interests include software security, big data analysis, and so on.

KAI ZHAO received the Ph.D. degree in computer software and theory from Wuhan University, in 2011. He is currently an Associate Professor with the School of Cyber Science and Engineering, College of Information Science and Engineering, Xinjiang University. His current research interests include software security, big data analysis, medical information processing, and so on.

HAO WANG received the B.S. degree in internet of things engineering from the China University of Petroleum, East China, in 2018. He is currently pursuing the master's degree with the School of Software, Xinjiang University. His research interest includes blockchain applications.

SHENGJIA GONG received the B.S. degree in computer science and technology from Huanggang Normal University, in 2018. He is currently pursuing the master's degree with the School of Cyber Science and Engineering, College of Information Science and Engineering, Xinjiang University. His research interest includes blockchain applications.
