Adversarial Attacks Against Binary Similarity Systems
ABSTRACT Binary analysis has become essential for software inspection and security assessment. As the
number of software-driven devices grows, research is shifting towards autonomous solutions using deep
learning models. In this context, a hot topic is the binary similarity problem, which involves determining
whether two assembly functions originate from the same source code. However, it is unclear how deep
learning models for binary similarity behave in an adversarial context. In this paper, we study the resilience of
binary similarity models against adversarial examples, showing that they are susceptible to both targeted and
untargeted (w.r.t. similarity goals) attacks performed by black-box and white-box attackers. We extensively
test three state-of-the-art binary similarity solutions against (i) a black-box greedy attack that we enrich
with a new search heuristic, terming it Spatial Greedy, and (ii) a white-box attack in which we repurpose
a gradient-guided strategy used in attacks to image classifiers. Interestingly, the target models are more
susceptible to black-box attacks than white-box ones, exhibiting greater resilience in the case of targeted
attacks.
INDEX TERMS Adversarial attacks, binary analysis, binary code models, binary similarity, black-box
attacks, greedy, white-box attacks.
In spite of the wealth of works identifying similar functions with ever improving accuracy, we found that an extensive study on the resilience of (DNN-based) binary similarity solutions against adversarial attacks is missing. Indeed, we believe binary similarity systems are an attractive target for an adversary. As examples, an attacker: (1) may hide a malicious function inside a firmware by making it similar to a benign white-listed function, as similarly done in malware misclassification attacks [17]; (2) may make a plagiarized function dissimilar to the original one, analogously to source code authorship attribution attacks [18]; or, we envision, (3) may replace a function—entirely or partially, as in forward porting of bugs [19]—with an old version known to have a vulnerability and make the result dissimilar from the latter.
In this context, we can define an attack targeted when the goal is to make a rogue function be the most similar to a target, as with example (1). Conversely, an attack is untargeted when the goal is to make a rogue function the most dissimilar from its original self, as with examples (2) and (3). In both scenarios, the adversarial instance has to preserve the semantics (i.e., execution behavior) of the rogue function as in the original.
In this paper, we aim to close this gap by proposing and evaluating techniques for targeted and untargeted attacks using both black-box (where adversaries have access to the similarity model without knowing its internals) and white-box (where they also know its internals) methods.
For the black-box scenario, we adopt a greedy optimizer to modify a function by inserting a single assembly instruction into its body at each optimization step. Where applicable, we consider an enhanced gray-box [17] variant that, leveraging limited knowledge of the model, chooses only between instructions that the model treats as distinct.
We then enrich the greedy optimizer with a novel black-box search heuristic, where we transform the discrete space of assembly instructions into a continuous space using a technique based on instruction embeddings [20]. We call this enhanced black-box attack Spatial Greedy. When using our heuristic, the black-box attack is on par with or outperforms the gray-box greedy attack, without requiring any knowledge of the model. For the white-box scenario, we repurpose a method for adversarial attacks on images that relies on gradient descent [21] and use it to drive instruction insertion decisions.
We test our techniques against three binary similarity systems—Gemini [9], GMN [22], and SAFE [4]—focusing on three research questions: (RQ1) determining whether the target models are more robust against targeted or untargeted attacks; (RQ2) assessing whether the target models exhibit greater resilience to black-box or white-box approaches; and (RQ3) exploring how target models influence the effectiveness of our attacks. Our results indicate that all three models are inherently more vulnerable to untargeted attacks. In the targeted scenario, the best attack technique misled the target models in 31.6% of instances for Gemini, 59.68% for GMN, and 60.68% for SAFE. However, in the untargeted scenario, these percentages increased to 53.89% for Gemini, 93.81% for GMN, and 90.62% for SAFE. Our analysis shows that all target models are more resilient to our white-box procedure; we believe this is largely due to the inherent challenges of conducting gradient-based attacks on models that use discrete representations.
A. CONTRIBUTIONS
This paper proposes the following contributions:
• we propose to study the problem of adversarial attacks against binary similarity systems, identifying targeted and untargeted attack opportunities;
• we investigate black-box attacks against DNN-based binary similarity systems, exploring an instruction insertion technique based on a greedy optimizer. Where applicable, we enhance it in a gray-box fashion for efficiency, using partial knowledge of the model sensitivity to instruction types;
• we propose Spatial Greedy, a fully black-box attack that matches or outperforms gray-box greedy by using a novel search heuristic for guiding the choice of the candidates' instructions used during the attack;
• we investigate white-box attacks against DNN-based binary similarity systems, exploring a gradient-guided search strategy for inserting instructions;
• we conduct an extensive experimental evaluation of our techniques in different attack scenarios against three systems backed by largely different models and with high performance in recent studies [23].
II. RELATED WORKS
In this section, we first discuss loosely related approaches for attacking image classifiers and natural language processing (NLP) models; then, we describe attacks against source code models. Finally, we discuss prominent attacks against models for binary code analysis.
A. ATTACKS TO IMAGE CLASSIFIERS AND NLP MODELS
Historically, the first adversarial attacks targeted image classifiers. The crucial point for these attacks is to insert inside a clean image instance a perturbation that should not be visible to the human eye while being able to fool the target model, as first pointed out by [12] and [13].
Most of the attacks modify the original instances using gradient-guided methods. In particular, when computing an adversarial example, they keep the weights constant while altering the starting input in the direction of the gradient that minimizes (or maximizes, depending on whether the attack is targeted or untargeted) the loss function of the attacked model. The FGSM attack [13] explicitly implements this technique. Other attacks, such as the Carlini-Wagner one [14], generate a noise that is subject to Lp-norm constraints to preserve similarity to the original objects.
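As an illustration of the gradient-direction update described above, the following is a minimal sketch of an FGSM-style untargeted step, assuming a PyTorch-style differentiable model and loss; the function name and the epsilon value are ours and purely illustrative, not taken from [13].

import torch

def fgsm_untargeted_step(model, loss_fn, x, y, epsilon=0.03):
    # Keep the model weights fixed and compute the gradient of the loss
    # with respect to the input only.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # Move the input in the direction that increases the loss (untargeted);
    # a targeted variant would subtract the signed gradient instead.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()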
As observed in Section III-B, adversarial example generation is possibly easier in the image domain than in the textual one, due to the continuous representation of the original objects. In the NLP domain, the inputs are discrete objects, a fact that prevents any direct application of gradient-guided methods for adversarial example generation. Ideally, perturbations crafted to fool deep models for language analysis should be grammatically correct and semantically coherent with the original instance.
One of the earliest methodologies for attacking NLP models is presented in [16]. The authors propose attacks to mislead deep learning-based reading comprehension systems by inserting perturbations in the form of new sentences inside a paragraph, so as to confuse the target model while keeping the original correct answer intact. The attacks proposed in [24] and [25] focus on finding replacement strategies for words composing the input sequence. Intuitively, valid substitutes should be searched among synonyms; however, this strategy could fall short in considering the context surrounding the word to substitute. Works like [26] and [27] further investigate this idea using BERT-based models for identifying accurate word replacements.
B. ATTACKS AGAINST MODELS FOR SOURCE CODE ANALYSIS
This section covers some prominent attacks against models that work on source code.
The general white-box attack of [28] iteratively substitutes a target variable name in all of its occurrences with an alternative name until a misclassification occurs. The attack against plagiarism detection from [18] uses genetic programming to augment a program with code lines picked from a pool and validated for program equivalence by checking that an optimizing compiler removes them. The attack against clone detectors from [29] combines several semantics-preserving perturbations of source code using different optimization heuristic strategies.
We highlight that these approaches have limited applicability in the binary similarity scenario, as their perturbations may not survive compilation (e.g., variable renaming) or result in marginal differences in compiled code (e.g., turning a while-loop into a for-loop).
C. ATTACKS AGAINST MODELS FOR BINARY CODE ANALYSIS
We complete our review of related works by covering research on evading ML-based models for analysis of binary code.
1) ATTACKS AGAINST MALWARE DETECTORS
Attacks such as [30] and [31] against malware detectors based on convolutional neural networks add perturbations in a new non-executable section appended to a Windows PE binary. Both use gradient-guided methods for choosing single-byte perturbations to mislead the model in classifying the whole binary. We emphasize that binary similarity systems analyze executable code, meaning these attacks are ineffective in our scenario.
Pierazzi et al. [17] explore transplanting binary code gadgets into a malicious Android program to avoid detection. The attack follows a gradient-guided search strategy based on a greedy optimization. In the initialization phase, they mine from benign binaries code gadgets that modify features that the classifier uses to compute its classification score. In the attack phase, they pick the gadgets that can mostly contribute to the (mis)classification of the currently analyzed malware sample; they insert gadgets in order of decreasing negative contribution, repeating the procedure until misclassification occurs. To preserve program semantics, gadgets are injected into never-executed code portions. Differently from our main contribution, their attack is only applicable in a targeted white-box scenario.
Lucas et al. [32] target malware classifiers analyzing raw bytes. They propose a functionality-preserving iterative procedure viable for both black-box and white-box attackers. At every iteration, the attack determines a set of applicable perturbations for every function in the binary and applies a randomly selected one (following a hill-climbing approach in the black-box scenario or using the gradient in the white-box one). Done via binary rewriting, the perturbations are local and include instruction reordering, register renaming, and replacing instructions with equivalent ones of identical length. The results show that these perturbations can be effective even against (ML-based) commercial antivirus products, leading the authors to advocate for augmenting such systems with provisions that do not rely on ML. In the context of binary similarity, though, we note that these perturbations would have limited efficacy if done on a specific pair of functions: for example, both instruction reordering and register renaming would go completely unnoticed by Gemini and GMN (Sections VIII-A and VIII-B). Furthermore, since [32] is mainly designed for models that classify binary programs, it is not directly applicable in our scenario, where the output of the model is a real value representing the distance between the two inputs.
MAB-Malware [33] is a reinforcement learning-based approach for generating adversarial examples against PE malware classifiers in a black-box context. Adversarial examples are generated through a multi-armed bandit (MAB) model that has to keep the sample in a single, non-evasive state when selecting actions while learning reward probabilities. The goal of the optimization strategy is to maximize the total reward. The set of applicable perturbations (which can be considered as actions) are standard PE manipulation techniques from prior works: header manipulation, section insertion and manipulation (e.g., adding trailing bytes), and in-place randomization of an instruction sequence (i.e., replacing it with a semantically equivalent one). Each action is associated with a specific content—a payload—added to the malware when the action is selected. An extensive evaluation is conducted on two popular ML-based classifiers and three commercial antivirus products. As outlined for other works, our scenario does not allow for the application of this approach for two primary reasons. Firstly, this attack is specifically designed to target classifiers. Secondly, many of the proposed transformations are ineffective when applied to binary similarity systems.
Different attack types may suit different scenarios best. A white-box attack, for example, could be attempted on an open-source malware classifier. Conversely, a black-box attack would also suit a model hosted on a remote server to interrogate, as with a commercial cloud-based antivirus.
A. THREAT MODEL
The focus of this work is to create adversarial instances
that attack a model at inference time (i.e., we do not
investigate attacks at training time). Following the description
provided in Section III-A, we consider two different attack
scenarios: respectively, a black-box and a white-box one.
In the first case, the adversary has no knowledge of the target
binary similarity model; nevertheless, we assume they can
perform an unlimited number of queries to observe the output
produced by the model. In the second case, we assume that the
attacker has perfect knowledge of the target binary similarity
model.
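In other words, the only capability we grant the black-box attacker can be modeled as a query oracle wrapped around the target system. The sketch below is illustrative: the class and method names are ours, and the wrapped similarity call stands in for whatever interface the attacked model actually exposes.

class SimilarityOracle:
    """Black-box view of a binary similarity model: unlimited queries, no access to internals."""

    def __init__(self, model):
        self._model = model      # opaque to the attacker
        self.queries = 0         # the attacker may track its own query budget

    def query(self, f1_asm, f2_asm):
        # Returns only the similarity score in [0, 1] for two assembly functions.
        self.queries += 1
        return self._model.similarity(f1_asm, f2_asm)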
FIGURE 3. Examples of semantics-preserving perturbations that do not alter the binary CFG layout. We modify the assembly snippet in (a) by applying, in turn, (b) Instruction Reordering, (c) Semantics-Preserving Rewriting, and (d) Register Renaming. Altered instructions are in red.

Among CFG-preserving perturbations, we identify:
• (IR) Instruction Reordering: reorder independent instructions in the function;
• (SPR) Semantics-Preserving Rewriting: substitute a sequence of instructions with a semantically equivalent sequence;
• (DSL) Modify the Data-Section Layout: modify the memory layout of the .data section and update all the global memory offsets referenced by instructions;

B. PROBLEM DEFINITION
Let sim be a similarity function that takes as input two functions, f1 and f2, and returns a real number, the similarity score between them, in [0, 1].
We define two binary functions to be semantically equivalent if they are two implementations of the same abstract functionality. We assume that there exists an adversary that wants to attack the similarity function. The adversary can mount two different kinds of attacks:
• Targeted attack. Given two binary functions, f1 (identified as source) and f2 (identified as target), the adversary wants to find a binary function fadv semantically equivalent to f1 such that: sim(fadv, f2) ≥ τt, where τt is a
FIGURE 4. Overall workflow of the black-box ε-greedy perturbation-selection strategy in the targeted scenario.
FIGURE 5. Toy example describing how the source function f1 is modified during the various steps of our Spatial Greedy attack. We first identify the set
of available positions and initialize the candidates’ set CAND (a). Then, we enumerate all the possible perturbations (b) and choose one according to the
ε-greedy strategy while updating CAND according to the Spatial Greedy heuristic (c). This process (d) is repeated until a successful adversarial example is
generated or we reach a maximum number of iterations.
success threshold¹ chosen by the attacker depending on the victim at hand.
• Untargeted attack. Given a binary function f1, the adversary's goal consists of finding a binary function fadv semantically equivalent to f1 such that: sim(f1, fadv) ≤ τu. The threshold τu is the analogue of τt for the untargeted attack scenario.
Loosely speaking, in the case of a targeted attack, the attacker wants to create an adversarial example that is as similar as possible to a specific function, as in the example scenario (1) presented in Section I. In the case of an untargeted attack, the goal of the attacker consists of creating an adversarial example that is as dissimilar as possible from its original version, as in the example scenarios (2) and (3), also from Section I.

¹ Although fadv and f2 are similar for the model, they are not semantically equivalent: this is precisely the purpose of an attack that wants to fool the model into considering them as such, while they are not.
² For example, basic block-local transformations such as IR and RR would have limited efficacy on models that study an individual block for its instruction types and counts or other coarse-grained abstractions. This is the case with Gemini and GMN that we attack in this paper.

C. PERTURBATION SELECTION
Given a binary function f1, our attack consists in applying to it perturbations that do not alter its semantics.
To study the feasibility of our approach, we choose dead branch addition (DBA) among the suitable perturbations outlined in Section III-C. We find DBA adequate for this study for two reasons: it is sufficiently expressive so as to affect heterogeneous models (which may not hold for others²) and its implementation complexity for an attacker is fairly
limited. Nonetheless, other choices remain possible, as we will further discuss in Section XIII.
At each application, our embodiment of DBA inserts in the binary code of f1 one or more instructions in a new or existing basic block guarded by a branch that is never taken at runtime (i.e., we use an always-false branch predicate).
Such a perturbation can be done at compilation time or on an existing binary function instance. For our study, we apply DBA during compilation by adding placeholder blocks as inline assembly, which eases the generation of many adversarial examples from a single attacker-controlled code. State-of-the-art binary rewriting techniques would work analogously over already-compiled functions.
We currently do not attempt to conceal the nature of our branch predicates for preprocessing robustness, which [17] discusses as something that attackers should be wary of to mount stronger attacks. We believe off-the-shelf obfuscations (e.g., opaque predicates, mixed boolean-arithmetic expressions) or more complex perturbation choices may improve our approach in this respect. Nevertheless, our main goal was to investigate its feasibility in the first place.

V. BLACK-BOX ATTACK: SOLUTION OVERVIEW
In this section, we describe our black-box attack. We first introduce our baseline (named Greedy), highlighting its limitations. We then move to our main contribution in the black-box scenario (named Spatial Greedy). Figure 4 depicts a general overview of our black-box approach.

A. GREEDY
The baseline black-box approach we consider for attacking binary function similarity models consists of an iterative perturbation-selection rule that follows a greedy optimization strategy. Starting from the original sample f1, we iteratively apply perturbations T1, T2, . . . , Tk selected from a set of available ones, generating a series of instances fadv1, fadv2, . . . , fadvk. This procedure ends upon generating an example fadv meeting the desired similarity threshold; otherwise the attack fails after δ̄ completed iterations.
For instantiating Greedy using DBA, we reason on a set of positions BLK for inserting dead branches in function f1 and a set of instructions CAND, which we call the set of candidates. Each perturbation consists of a ⟨bl, in⟩ pair made of the branch bl ∈ BLK and an instruction in ∈ CAND to insert in the dead code block guarded by bl.
The naive perturbation-selection rule (i.e., greedy) at each step selects the perturbation that, in case of a targeted attack, locally maximizes the relative increase of the objective function. Conversely, for an untargeted attack, the optimizer selects the perturbation that locally maximizes the relative decrease of the objective function.
This approach, however, may be prone to finding local optima. To avoid this problem, we choose as our Greedy baseline an ε-greedy perturbation-selection rule. Here, we select with a small probability ε a suboptimal perturbation instead of the one that the standard greedy strategy picks, and with probability 1 − ε the one representing the local optimum.
In case of a targeted attack, the objective function is the similarity between fadv and the target function f2 (formally, sim(fadv, f2)), while it is the negative of the similarity between fadv and the original function in case of an untargeted attack (formally, −sim(f1, fadv)). In the following, we only discuss the maximization strategy followed by targeted attacks; mutatis mutandis, the same rationale holds for untargeted attacks.

1) LIMITATIONS OF THE COMPLETE ENUMERATION STRATEGY
At each step, Greedy enumerates all the applicable perturbations computing the marginal increase of the objective function, thus resulting in selecting an instruction in by enumerating all the possible instructions of the considered set of candidates CAND for each position bl ∈ BLK.
Unfortunately, the Instruction Set Architecture (ISA) of a modern CPU may consist of a large number of instructions. To give an example, consider the x86-64 ISA: according to [37], it has 981 unique mnemonics and a total of 3,684 instruction variants (without counting register operand choices for them). Therefore, it would be unfeasible to have a CAND set that covers all possible instructions of an ISA. This means that the size of CAND must be limited. One possibility is to use hand-picked instructions. However, this approach has two problems. Such a set could not cover all the possible behaviors of the ISA, missing fundamental aspects (for example, leaving vector instructions uncovered); furthermore, this effort has to be redone for a new ISA. There is also a more subtle pitfall: a set of candidates fixed in advance could include instructions that the specific binary similarity model under attack deems as not significant.
On specific models, it may still be possible to use a small set of candidates profitably, enabling a gray-box attack strategy for Greedy. In particular, one can restrict the set of instructions to the ones that effectively impact the features extracted by the attacked model (which obviously requires knowledge of the features it uses; hence, the gray-box characterization). In such cases, this strategy is equivalent to the black-box Greedy attack that picks from all the instructions in the ISA, but computationally much more efficient.

B. SPATIAL GREEDY
In this section, we extend the baseline approach by introducing a fully black-box search heuristic. To differentiate between the baseline solution and the heuristic-enhanced one, we name the latter Spatial Greedy.
When using this heuristic, the black-box attack overcomes all the limitations discussed for Greedy using an adaptive procedure that dynamically updates the set of candidates according to a feedback from the model under attack without requiring any knowledge of it.
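To make the perturbation-selection rule and the adaptive candidate update concrete, the sketch below outlines the targeted black-box loop in Python. It is a simplification, not a reproduction of Algorithm 1: keeping CAND fixed yields the Greedy baseline, while the update at the end of each iteration approximates the Spatial Greedy heuristic by mixing instructions that are close, in the instruction-embedding space, to the best-performing ones with freshly sampled random ones (we use r here as the fraction of CAND refreshed from that neighborhood). All helpers (query, random_instructions, embedding_neighbors, insert_dead_instruction) are ours and stand in for model- and toolchain-specific code; the untargeted case is analogous, with objective −sim(f1, fadv) and threshold τu.

import random

def epsilon_greedy_attack(f1, f2, query, random_instructions, embedding_neighbors,
                          insert_dead_instruction, positions, tau_t=0.80,
                          eps=0.1, n_cand=400, top_k=10, r=0.75, max_iters=60):
    """Targeted black-box sketch: push query(f_adv, f2) above the threshold tau_t."""
    cand = random_instructions(n_cand)              # initial candidate set CAND
    f_adv = f1
    for _ in range(max_iters):
        # Enumerate every <position, instruction> perturbation and score it.
        scored = []
        for bl in positions:
            for ins in cand:
                trial = insert_dead_instruction(f_adv, bl, ins)
                scored.append((query(trial, f2), trial, ins))
        scored.sort(key=lambda t: t[0], reverse=True)
        # Epsilon-greedy rule: usually keep the local optimum, sometimes a suboptimal choice.
        if random.random() < eps and len(scored) > 1:
            score, f_adv, _ = random.choice(scored[1:])
        else:
            score, f_adv, _ = scored[0]
        if score >= tau_t:
            return f_adv                            # successful adversarial example
        # Spatial Greedy-style update of CAND: exploit the embedding neighborhood of the
        # best-performing instructions, keep exploring with random instructions.
        best_ins = [ins for _, _, ins in scored[:top_k]]
        n_exploit = int(r * n_cand)
        cand = embedding_neighbors(best_ins, n_exploit) + random_instructions(n_cand - n_exploit)
    return None                                     # attack failed within the iteration budget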
exploitation. With the random instructions, we randomly sample the solution space to escape from a possibly local optimum found for the objective function. With the selected instructions, we exploit the part of the space that in the past has brought the best solutions. Figure 6 provides a pictorial representation of the update procedure.
We present the complete description of Spatial Greedy in the case of a targeted attack in Algorithm 1, together with a simplified execution example in Figure 5. The first step (a) consists in identifying the positions BLK where to introduce dead branches (function getPositions(f1, B) at line 3) and initializing the set of candidates CAND with N random instructions (function getRandomInstructions(N) at line 4). Then, during the iterative procedure (d), we first enumerate all the possible perturbations (b). Then (c), we apply the perturbation-selection rule according to the value of ε, and we get the top-k greedy perturbations (line 20) as depicted in Figure 6. Finally, we update the set of candidates (line 21).

VI. WHITE-BOX ATTACK: SOLUTION OVERVIEW
As pointed out in Section III-A, in a white-box scenario the attacker has perfect knowledge of the target deep learning model, including its loss function and gradients. We discuss next how we can build on them to mount an attack.

A. GRADIENT-GUIDED CODE ADDITION METHOD
White-box adversarial attacks have been largely investigated against image classifiers by the literature, resulting in valuable effectiveness [13]. Our attack strategy for binary similarity derives from the design pattern of the PGD attack [21], which iteratively targets image classifiers.
We call our proposed white-box attack Gradient-guided Code Addition Method (GCAM). It consists in applying a set of perturbations using a gradient-guided strategy. In the case of a targeted attack, our goal is to minimize the loss function of the attacked model on the given input while keeping the perturbation size small and respecting the semantics-preserving constraint. We achieve this by using the Lp-norm as a soft constraint. On the other hand, for an untargeted attack, we aim to maximize the loss function while also keeping the size of the perturbation small.
Because of the inverse feature mapping problem, gradient optimization-based approaches cannot be directly applied in our context (Section III-B). We need a further (hard) constraint that acts on the feature-space representation of the input binary function. This constraint strictly depends on the target model: we will further investigate its definition in Section VIII. In the following, we focus on the loss minimization strategy argued for targeted attacks. As before, we can easily adapt the same concepts to the untargeted case.
We can describe a DNN-based model for binary similarity as the concatenation of the two functions λ and simv. In particular, λ is the function that maps a problem-space object to a feature vector (i.e., the feature mapping function discussed in Section III-B), while simv is the neural network computing the similarity given the feature vectors.
Given two binary functions f1 and f2, we aim to find a perturbation δ that minimizes the loss function of simv, which corresponds to maximizing simv(λ(f1) + δ, λ(f2)). To do so, we use an iterative strategy where, during each iteration, we solve the following optimization problem:

min_δ L(simv(λ(f1) + δ, λ(f2)), θ) + ϵ||δ||p,   (1)

where L is the loss function, θ are the weights of the target model, and ϵ is a coefficient in [0, ∞).
We randomly initialize the perturbation δ and then update it at each iteration by a quantity given by the negative gradient of the loss function L. The vector δ has several components equal to zero and it is crafted so that it modifies only the (dead) instructions in the added blocks. The exact procedure depends on the target model: we return to this aspect in Section VIII.
Notice that the procedure above allows us to find a perturbation in the feature space, while our final goal is to find a problem-space perturbation to modify the function f1. Therefore, we derive from the perturbation δ a problem-space perturbation δp. The exact technique is specific to the model we are attacking, as we further discuss in Section VIII. The common idea behind all technique instances is to find the problem-space perturbation δp whose representation in the feature space is the closest to δ. Essentially, we use a rounding-based inverse strategy to solve the inverse feature mapping problem, which amounts to rounding the feature-space vector to the closest vector that corresponds to an object in the problem space. The generated adversarial example is fadv = f1 + δp. As for the black-box scenario, the process ends whenever we reach a maximum number of iterations or the desired threshold for the similarity value.

VII. COMPARISON BETWEEN THE ATTACKS
In this section, we present a more direct comparison between the three proposed attack methodologies. We summarize in Table 1 the key differences according to four interesting aspects: attacker's knowledge, perturbation type, usage of the candidates' set, and usage of an additional instruction embedding model.
From a technical perspective, GCAM is a white-box attack that assumes an attacker having complete knowledge of the target model's internals. Contrarily, both Spatial Greedy and Greedy are black-box approaches, meaning that they can be easily adapted to attack any binary similarity model, without having any prior knowledge. This distinction according to the attacker's knowledge underlines a more subtle difference among the approaches; indeed, while the two black-box attacks operate in the problem space producing valid adversarial examples, GCAM initially produces perturbations in the feature space, which must then be converted into problem-space objects using a rounding process.
Looking at more practical aspects, both Greedy and Spatial Greedy depend on the concept of candidates' set, while GCAM leverages the internals of the target model to guide the choice of the instructions to insert into the function according to the objective function.
FIGURE 7. GCAM attack against Gemini. Once the initial CFG of the function f1 is obtained, we initialize an empty dead branch in one of the available positions (a). In particular, each node is represented as a feature vector v, which is the linear combination of three embedding vectors corresponding to three different categories of instructions (green block). We then iteratively apply gradient descent to modify the coefficients nj associated with the instruction vectors (b), obtaining a vector of non-integer values. Finally, we round the obtained coefficients to the closest integer values (c) and, (d), we insert into the dead branch as many instructions belonging to class j as specified by the coefficient nj.
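The coefficient-optimization-and-rounding workflow of Figure 7 and Eq. (1) can be sketched as follows for a Gemini-like model. This is a simplified, illustrative implementation under our own assumptions: similarity_from_features is a differentiable stand-in for simv composed with the node aggregation, category_vectors stacks the uj modification vectors, 1 − sim stands in for the actual loss L, and the optimizer, learning rate, and step budget are arbitrary.

import torch

def gcam_targeted_gemini(similarity_from_features, f1_feats, f2_feats,
                         category_vectors, dead_node_idx, tau_t=0.80,
                         lr=0.05, reg=0.01, steps=1000):
    """Sketch of GCAM (targeted): optimize real-valued instruction counts, then round."""
    num_categories = category_vectors.shape[0]
    n = torch.zeros(num_categories, requires_grad=True)   # one coefficient n_j per category
    mask = torch.zeros(f1_feats.shape[0], 1)
    mask[dead_node_idx] = 1.0                              # only the dead-branch node is perturbed
    opt = torch.optim.Adam([n], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Feature-space perturbation: add sum_j n_j * u_j to the dead-branch node only.
        feats = f1_feats + mask * (n @ category_vectors)
        sim = similarity_from_features(feats, f2_feats)
        loss = (1.0 - sim) + reg * n.norm(p=2)             # loss term plus epsilon * ||delta||_p
        loss.backward()
        opt.step()
        with torch.no_grad():
            n.clamp_(min=0.0)                              # negative instruction counts are meaningless
        if sim.item() >= tau_t:
            break
    # Inverse feature mapping by rounding: insert round(n_j) instructions of category j
    # into the dead branch to obtain the problem-space perturbation.
    return torch.round(n.detach()).to(torch.int64)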
Specifically, GCAM can potentially utilize the entire set of instructions encountered by the target model during training, while the black-box methods are constrained to a predetermined set of instructions that can be tested during each iteration. As highlighted in Section V-A1, the usage of a manually-crafted candidates' set represents the main weakness of the Greedy procedure, which we addressed with the Spatial Greedy heuristic by proposing an adaptive set based on the usage of instruction embeddings.
Finally, when considering Spatial Greedy, it is important to note that one should train from scratch an instruction embedding model to effectively apply the embedding-based search heuristic. However, we remark that the model has to be trained only once and then it can be reused for all the attacks against binaries for a certain ISA.

As anticipated, we evaluate our attacks against three binary similarity systems that differ from one another. This approach allows us to test the generality of our solution. Specifically, the three models we selected can be distinguished by the following features:
• NN architecture: Both Gemini and GMN are GNN-based models, while SAFE is an RNN-based one.
• Input representation: Both Gemini and GMN represent functions through their CFGs, while SAFE uses the linear disassembly.
• Feature mapping process: Both Gemini and GMN use manual features from the CFG nodes, while SAFE learns features using an instruction embedding model.
In the following, we provide an overview of the internal workings of the models and then discuss specific provisions for the Greedy (Section V-A) and GCAM (Section VI) attacks. Notably, Spatial Greedy needs no adaptations.
A. GEMINI
Gemini [9] represents each input function through an attributed CFG (ACFG) and uses a graph neural network that converts the ACFG into an embedding vector, obtained by aggregating the embedding vectors of individual ACFG nodes. The similarity score for two functions is given by the cosine similarity of their ACFG embedding vectors.

1) GREEDY ATTACK
Each ACFG node contributes a vector of 8 manually selected features. Five of these features depend on the characteristics of the instructions in the node, while the others depend on the graph topology. The model distinguishes instructions from an ISA only for how they contribute to these 5 features. This enables a gray-box variant of our Greedy attack: we measure the robustness of Gemini using a set of candidates CAND of only five instructions, carefully selected for covering the five features. Later in the paper, we use this variant as the baseline approach for a comparison with Spatial Greedy.

2) GCAM ATTACK
As described in the previous section, some of the components of a node feature vector v depend on the instructions inside the corresponding basic block. As Gemini maps all possible ISA instructions into 5 features, we can associate each instruction with a deterministic modification of v represented as a vector u. We select five categories of instructions and for each category cj we compute the modification uj that will be applied to the feature vector v. We selected the categories so as to cover the aforementioned features.
When we introduce in the block an instruction belonging to category cj, we add its corresponding uj modification to the feature vector v. Therefore, inserting instructions inside the block modifies the feature vector v by adding to it a linear combination vector Σj nj uj, where nj is the number of instructions of category cj added. Our perturbation δ acts on the feature vector of the function only in the components corresponding to the added dead branches, by modifying the coefficients of the linear combination above.
Since negative coefficients are meaningless, we avoid them by adding to the optimization problem appropriate constraints. Moreover, we solve the optimization problem without forcing the components of δ to be integers, as this would create an integer programming problem. Therefore, at the end of the iterative optimization process, we get our problem-space perturbation δp by rounding to the closest positive integer value each component of δ. It is immediate to obtain from δp the problem-space perturbation to insert in our binary function f1. Indeed, in each dead block, we must add as many instructions belonging to a category as the corresponding coefficient in δp. We report a simplified example of the GCAM procedure against Gemini in Figure 7.

B. GMN
Graph Matching Network (GMN) [22] computes the similarity between two graph structures. When functions are represented through their CFGs, GMN offers state-of-the-art performance for the binary similarity problem [22], [23]. Differently from solutions based on standard GNNs (e.g., Gemini), which compare embeddings built separately for each graph, GMN computes the distance between two graphs as it attempts to match them. In particular, while in a standard GNN the embedding vector for a node captures properties of its neighborhood only, GMN also accounts for the similarity with nodes from the other graph.

1) GREEDY ATTACK
Similarly to the case of Gemini, each node of the graph consists of a vector of manually-engineered features. In particular, each node is a bag of 200 elements, each of which represents a class of assembly instructions, grouped according to their mnemonics. The authors do not specify why they only consider these mnemonics among all the available ones in the x86-64 ISA. Analogously to Gemini, when testing the robustness of this model against the Greedy approach we devise a gray-box variant by considering a set of candidates CAND of 200 instructions, each belonging to one and only one of the considered classes.

2) GCAM ATTACK
Our white-box attack operates analogously to what we presented in Section VIII-A2 and illustrated in Figure 7. Similarly to the Gemini case, each dead branch adds a node to the CFG, while the feature mapping function transforms each CFG node into a feature vector. The feature vector is a bag of the instructions contained in the node, where assembly instructions are divided into one of 200 categories using their mnemonics.

C. SAFE
SAFE [4] is an embedding-based similarity model. It represents functions in the problem space as sequences of assembly instructions. It first converts assembly instructions into continuous vectors using an instruction embedding model based on the word2vec [20] word embedding technique. Then, it supplies such vectors to a bidirectional self-attentive recurrent neural network (RNN), obtaining an embedding vector for the function. The similarity between two functions is the cosine similarity of their embedding vectors.

1) GREEDY ATTACK
The Greedy attack against SAFE follows the black-box approach described in Section V-A. Since SAFE does not use manually engineered features, we cannot select a restricted set of instructions that generates all vectors of the feature space for a gray-box variant. We test its resilience against the Greedy approach considering a carefully designed list of candidates CAND composed of random and hand-picked instructions, meaning that the baseline is a black-box attack.

2) GCAM ATTACK
In the feature space, we represent a binary function as a sequence of instruction embeddings belonging to a
TABLE 2. Evaluation metrics with τt = 0.80 relative to the black-box attacks against the three target models in the targeted scenario. Spatial Greedy (SG)
is evaluated using parameters ε = 0.1 and r = 0.75. Greedy (G) is evaluated using ε = 0.1. G* is the gray-box version of Greedy: when such a version is
available (Section VIII), we show it instead of G. When examining G against SAFE, a set of candidates of size 400 is considered.
TABLE 3. Evaluation metrics with τu = 0.50 relative to the black-box attacks against the three target models in the untargeted scenario. Spatial Greedy
(SG) is evaluated using parameters ε = 0.1 and r = 0.75. Greedy (G) is evaluated using ε = 0.1. Similarly to Table 2, G* is the gray-box version of Greedy
where applicable. When examining G against SAFE, a set of candidates of size 400 is considered.
TABLE 4. Evaluation metrics with τt = 0.80 for the white-box targeted attack against the three target models. The GCAM attack is executed up to 20k
iterations for Gemini and up to 1k for GMN and SAFE.
TABLE 5. Evaluation metrics with τu = 0.50 for the white-box untargeted attack against the three target models. The GCAM attack is executed up to 20k
iterations for Gemini and up to 1k for GMN and SAFE.
further confirmed by the normalized increment N-inc metric; comparing the results, the impact of the candidates selected using Spatial Greedy is more consistent than that of the candidates selected using the baseline approach. We omit a discussion of the untargeted case for brevity.
Comparing Spatial Greedy with Greedy, we measure on the Targ dataset an average A-rate increase of 2.27 and a decreased M-size by 0.46 instructions across all configurations and models. When considering the Untarg dataset, Spatial Greedy sees an average A-rate increase of 1.75, whereas the average M-size is smaller by 0.16 instructions. We report detailed results in Table 6.
Restricted Set Experiments: Finally, we perform a further experiment between the black-box version of Greedy and Spatial Greedy, considering a candidates' set of smaller size; in particular, we consider a set of 50 instructions which, in the case of Greedy, is a subset of the one considered for the previously detailed experiments. Our hypothesis is that the smaller the size of the candidates' set, the higher the difference in terms of A-rate in favour of Spatial Greedy. We highlight that we applied the black-box version of Greedy also when targeting Gemini and GMN. In the following, we refer to the results obtained in the targeted case when considering the C4 scenario.
When targeting SAFE, there is a significant difference, with an A-rate of 30.74% for Greedy and 49.70% for Spatial Greedy. A similar trend is seen with Gemini, where Greedy shows an A-rate of 9.58% compared to 25.55% for Spatial Greedy, and with GMN, where Greedy's A-rate is 37.13% versus 49% for Spatial Greedy.

The peak A-rate at τu = 0.50 is 53.89% for Gemini, 91.62% for GMN, and 90.62% for SAFE.
The number of instructions M-size needed for generating valid adversarial examples further confirms the weak resilience of the target models to untargeted attacks. When considering the worst setting according to M-size (i.e., C4), while we need only a few instructions for untargeted attacks at τu = 0.50 (i.e., 11.35 for Gemini, 4.14 for GMN, and 7.64 for SAFE), we need a significantly higher number of added instructions for targeted attacks (i.e., 51.85 for Gemini, 40.99 for GMN, and 40.31 for SAFE) at τt = 0.80.

Takeaway: On all the attacked models, both targeted and untargeted attacks are feasible, especially using Spatial Greedy (see also RQ2). Their resilience against untargeted attacks is significantly lower.

D. RQ2: BLACK-BOX VS. WHITE-BOX ATTACKS
An interesting finding from our tests is that the white-box strategy does not always outperform the black-box one. Figure 12 depicts a comparison in the targeted scenario between Spatial Greedy and GCAM for the attack success rate A-rate, average similarity A-sim, and normalized increment N-inc metrics. The figure shows how different values of the success attack threshold τt can influence the considered metrics. On GMN and SAFE, Spatial Greedy is
TABLE 6. Difference between SG and G for the A-rate and M-size in the four settings C1-C4, averaged on the three models. Where applicable, we consider
the gray-box version of Greedy.
more effective than GCAM, resulting in significantly higher A-rate values, while the two perform similarly on Gemini. Interestingly, in contrast with the evaluation based on the A-rate metric, both the A-sim and N-inc values highlight a coherent behavior among the three target models. Generally, adversarial examples generated using Spatial Greedy exhibit a higher A-sim value than the white-box ones (considering τt = 0.80, we have 0.86 vs. 0.86 for Gemini, 0.93 vs. 0.84 for GMN, and 0.92 vs. 0.85 for SAFE). Looking at N-inc, we face a completely reversed situation; the metric is better in the adversarial samples generated using GCAM (0.62 for Gemini, 0.79 for GMN, and 0.71 for SAFE) compared to those from Spatial Greedy (0.27 for Gemini, 0.79 for GMN, and 0.55 for SAFE). These two observations lead us to the hypothesis that the black-box attack is more effective against pairs of binary functions that exhibit high initial similarity values and can potentially reach a high final similarity. On the other hand, GCAM is particularly effective against pairs that are very dissimilar at the beginning.
For the untargeted scenario, our results (Tables 3 and 5) for the A-rate metric considering τu = 0.50 show that Spatial Greedy has a slight advantage over GCAM. For Spatial Greedy, we have best-setting values of 53.89% for Gemini, 91.62% for GMN, and 90.62% for SAFE; for GCAM, we have 39.52% for Gemini, 84.63% for GMN, and 88.42% for SAFE.
In our experiments, GCAM performed worse than the black-box strategy, which may look puzzling since theoretically a white-box attack should be more potent than a black-box one. We believe this behavior is due to the inverse feature mapping problem. Hence, we conducted a GCAM attack exclusively in the feature space by eliminating all constraints needed to identify a valid potential sample in the problem space (i.e., non-negativity of coefficients for Gemini and GMN, rounding to genuine instruction embeddings for SAFE). As a result, GCAM achieved an A-rate between 92.90% and 99.81% in targeted scenarios and between 97.01% and 100% in untargeted ones.

Takeaway: In our tests, the Spatial Greedy black-box attack is on par with or beats the white-box GCAM attack based on a rounding inverse strategy. Further investigation is needed to confirm if this result will hold for more refined inverse feature mapping techniques and when attacking other models.

E. RQ3: IMPACT OF FEATURE EXTRACTION AND ARCHITECTURES ON ATTACKS
As detailed in Section VIII, we can distinguish the target models according to three aspects (NN architecture, input representation, and feature mapping process). Here, we are interested in investigating whether these aspects can influence the performance of our attacks or not.
In the targeted black-box scenario (Figure 9 and Table 2), SAFE and GMN are the weakest among the three considered models, as the peak attack success rate A-rate at τt = 0.80 is 60.68% for SAFE, 59.68% for GMN, and 27.54% for Gemini (C4 setting). These results highlight that our attack is sensitive to some aspects of the target model, in particular to the feature mapping process and the DNN architecture employed by the considered models. To assess this insight, we conduct different analyses to check whether our Spatial Greedy attack is exploiting some hidden aspects of the considered models to guide the update of the candidates' set. First, we check whether or not there exists a correlation between the number of initial instructions composing the function f1 and the obtained final similarity value. This analysis is particularly interesting for SAFE, as this model computes the similarity between two functions by only considering their first 150 instructions; the results of this study are reported in Figure 13. From the plots it is visible that there exists a negative correlation between the final similarity and the initial number of instructions composing the function we are modifying, also confirmed by the Pearson's r correlation coefficient, highlighting that this negative correlation is almost moderate for SAFE (with r = −0.38) while it is weak for both Gemini and GMN (with r = −0.25 and r = −0.22, respectively). These results confirm that when Spatial Greedy modifies a function that is initially small (in particular, composed of fewer than 150 instructions), then our adversarial example and the function f2 are more likely to have a final similarity value near 1 when targeting SAFE rather than the two other models.
Then, since both Gemini and GMN implement a feature mapping function that looks closely at the particular assembly instructions composing the single blocks, we conduct a further analysis to assess whether or not the instructions inserted by Spatial Greedy trigger the features required by the two considered models. In particular, for each inserted instruction, we check whether or not it is mapped over the features considered by the two models and, for each adversarial example, we calculate the percentage of instructions
FIGURE 13. Correlation between initial number of instructions of the function f1 and the similarity between
the generated adversarial example fadv and the target function f2 . The considered adversarial examples are
generated in the targeted scenario, using our Spatial Greedy approach (ε = 0.1, r = 0.75, and |CAND| = 400)
in setting C4. We also reported an equal number of samples randomly drawn from a uniform distribution.
satisfying this condition. In the targeted black-box scenario using Spatial Greedy, we find that on average 90.83% of the inserted instructions for Gemini and 100% for GMN are mapped to the considered features.
To verify how the particular architecture implemented by the model affects the performance of Spatial Greedy, we checked how the instructions inserted by our procedure are distributed across the various dead branches. Our hypothesis is that when targeting GNN-based models (as Gemini and GMN), our attack should spread the inserted instructions across the various dead branches; on the contrary, the position of the block should not influence the choice of the attack when targeting an RNN such as SAFE. For all the considered models, the block where our attack inserts the majority of the instructions for each adversarial example is the one closest to the beginning of the function. However, while this is evident for GMN and SAFE (where the first block contains most of the inserted instructions in 313 and 205 of the considered examples, respectively), in Gemini the inserted instructions are more uniformly distributed across the various dead branches. We report the complete distribution in Figure 14. To further validate these results, we calculated the entropy of the generated adversarial examples, resulting in values of 2.94 for Gemini, 2.77 for GMN, and 2.68 for SAFE. Higher entropy suggests a more uniform distribution of inserted instructions across dead branches, while lower values indicate concentration in specific blocks. These entropy values reinforce our previous conclusions. We highlight that these results are partially coherent with our initial hypothesis; indeed, the first block is the closest to the prologue of the function, which plays a key role for both SAFE and GMN. Indeed, as mentioned in [44], SAFE primarily targets function prologues, which explains why our attack inserts most instructions into the first block, as it is closest to the function prologue; for GMN, since prologue instructions typically follow specific compiler patterns, the nodes containing these instructions are likely to match, prompting Spatial Greedy to insert most instructions into the dead branches closest to the prologue.
The small M-size results in the untargeted scenario prevent us from obtaining meaningful results when running these analyses in this context, so we decided not to report the obtained results.
For the white-box attack, we believe that the different levels of robustness among the considered models are mainly due to their feature mapping processes. As mentioned in Section X-D, we evaluated a variant of the GCAM attack solely in the feature space, removing all constraints necessary for producing adversarial examples valid in the problem space. For both Gemini and GMN, we removed the non-negativity constraint on the coefficients, and for SAFE, we eliminated the rounding to real instruction embeddings constraint. In the C4 targeted scenario, the unconstrained version of GCAM increases the A-rate of the standard GCAM on Gemini from 31.60% to 96%, on GMN from 35.33% to 99.81%, and on SAFE from 21.76% to 92.90%. This demonstrates that the performance of our attack is primarily influenced by the feature mapping method rather than the specific model architectures. The results in the untargeted scenario confirm this hypothesis, as the unbounded GCAM reaches an A-rate near 100% for both GMN and SAFE, and a value of 97.01% against Gemini.

Takeaway: When considering the black-box scenario, the particular architecture seems to influence the position where the instructions are inserted. In general, the particular feature mapping process adopted by the models seems to play a crucial role in the choice of the instructions. In the white-box scenario, the feature mapping processes adopted by the models prevent the attack from reaching optimal A-rate results.

XI. MIRAI CASE STUDY
We complement our evaluation with a case study examining our attacks in the context of disguising functions from malware.
FIGURE 14. Distribution of the blocks where, for each adversarial example (successful or not), our Spatial
Greedy attack inserts most of the instructions. The considered adversarial examples are generated in the
targeted scenario, using our Spatial Greedy approach (ε = 0.1, r = 0.75, and |CAND| = 400) in setting C4.
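For reference, the entropy values reported in Section X-E, which summarize how uniformly the inserted instructions spread across the dead branches shown in Figure 14, can be computed as in the sketch below. Whether the paper averages a per-example entropy or computes it on the aggregate distribution is not detailed here, so the helper and the example counts are purely illustrative.

from collections import Counter
from math import log2

def insertion_entropy(branch_ids):
    """Shannon entropy (bits) of the distribution of inserted instructions over dead branches."""
    counts = Counter(branch_ids)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Example: instructions concentrated in the first dead branch give a low entropy value.
print(insertion_entropy([0, 0, 0, 0, 1, 2]))     # ~1.25 bits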
other two models. Compared to the main evaluation results, patterns of compiler-generated code or not, using models for
targeted attacks have worse performance than untargeted ones checking compiler provenance [45], [46].
also on this dataset. Moreover, successful untargeted attacks Adversarial training [13], [47] is the standard solution
continue to require fewer instructions: in particular, across for increasing the robustness of an already trained model;
all models, a successful black-box targeted attack needs on however, while it could improve the robustness against our
average 42.63 instructions, whereas the untargeted one adds methodologies, there is no guarantee that the retrained models
on average 5.27 instructions. would be robust against zero-day attacks. To overcome these
limitations, techniques based on randomized defenses [48]
XII. PRACTICAL IMPACTS AND POSSIBLE could be considered. In particular [48] proposes a method-
COUNTERMEASURES ology to increase the robustness of DNN classifiers against
In this section we discuss the practical impacts of our paper adversarial examples by introducing random noise inside
and possible countermeasures. the input representation during both training and inference.
While originally designed for the computer vision scenario,
this method has been adapted to the malware classification
A. PRACTICAL IMPACTS
domain, by randomly substituting [49] and deleting [50] bytes
The findings in Section X-B reveal that the evaluated
from the input sample. However, the applicability of these
binary similarity systems are susceptible to both targeted
approaches in the binary similarity domain has not been
and untargeted attacks, though their resilience differs. These
studied yet and must focus on manipulating directly assembly
systems show higher robustness against targeted attacks,
instructions or CFG nodes.
with an average A-rate of 49.43%, whereas the average
A more promising approach consists of analyzing only
A-rate for untargeted attacks is 79.44%. From a practical
a subset of the instructions from the input functions; the
perspective, as detailed in Section I we can consider the three
rationale is that this could thwart the attack by partially
main uses cases of binary similarity systems: vulnerability
destroying the pattern of instructions introduced by the
detection, plagiarism detection, and malware detection. The
adversary. Similarly to [51], one could learn the function
results in the untargeted scenario imply that when having an
representation by focusing only on some random portions of
attacker that is trying to substitute a function with an older,
the input. A more refined approach could consist of filtering
vulnerable version or make a plagiarized function dissimilar
out instructions using techniques such as data-flow analysis
to the original one, then they succeed in nearly 80% of
and micro-trace execution, to concentrate solely on the ones
cases. This suggests that current binary similarity models are
with the highest semantic importance. However, one has to
unfit for tasks such as vulnerability detection or authorship
keep in mind that refined analyses at the pre-processing stage
identification when used in a context that could be subject to
could introduce significant delays that would partially nullify
adversarial attacks (as example, but not limited to, when used
the speed advantages of using DNNs solutions over symbolic
in security sensitive scenarios). To remark on this our results
execution ones.
in Section XI practically show that, when an attacker creates
Finally, one could use an ensemble of all the target models
a new variant of a malicious function without targeting any
combined with a majority voting approach to determine the
specific benign function, then the models fail in recognising
final similarity. As discussed in the evaluation, the various
it as similar to any known malicious sample in nearly 78%
attacked models respond differently to our attacks. This
of the cases. In contrast, the considered models show greater
suggests that an ensemble model could be a feasible defense.
XIII. LIMITATIONS AND FUTURE WORKS
In this paper, we have seen how adding dead code is a natural and effective way to realize appreciable perturbations for a selection of heterogeneous binary similarity systems.
In Section IV-C, we acknowledged how, in the face of defenders that pre-process code with static analysis, our implementation would be limited by the fact that the inserted dead blocks are guarded by non-obfuscated branch predicates.
Furthermore, we highlight that all the approaches we propose insert into dead branches sequences of instructions that exhibit no data dependencies among themselves, which makes them easier to detect.
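As a rough illustration of why such filler stands out, a defender could flag long blocks in which no instruction reads a register written earlier in the same block. The sketch below is ours, uses a deliberately naive Intel-syntax operand parser restricted to common x86-64 general-purpose registers, and is not meant as a production analysis.

import re
from typing import List, Set

# Naive register extractor: common x86-64 general-purpose registers only.
GPR = re.compile(r"\b(?:r(?:[a-d]x|[sd]i|[sb]p|8|9|1[0-5])|e(?:[a-d]x|[sd]i|[sb]p))\b")

def _registers(text: str) -> Set[str]:
    return set(GPR.findall(text))

def looks_like_independent_filler(block: List[str], min_len: int = 4) -> bool:
    """Return True if the block is long yet shows no intra-block def-use pair."""
    written: Set[str] = set()
    for ins in block:
        parts = ins.split(None, 1)
        operands = parts[1].split(",") if len(parts) > 1 else []
        # Crude model: first operand is the destination, the rest are sources
        # (the implicit read of the destination, as in add, is ignored).
        sources = _registers(",".join(operands[1:]))
        destination = _registers(operands[0]) if operands else set()
        if sources & written:   # an earlier result is consumed: looks like real code
            return False
        written |= destination
    return len(block) >= min_len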
Our experiments suggest that, depending on the characteristics of a given model and pair of functions, the success of an attack may be affected by factors like the initial difference in code size and CFG topology, among others. In this respect, it could be interesting to explore how to alternate our dead-branch addition perturbation, for example, with the insertion of dead fragments within existing blocks.
We believe both limitations could be addressed in future work with implementation effort, whereas the main goal of this paper was to show that adversarial attacks against binary similarity systems are a concrete possibility. To enhance our attacks, we could explore more complex patching implementation strategies based on binary rewriting or a modified compiler back-end. Such studies may then also include other performant similarity systems, such as Asm2Vec [8] or jTrans [52].

XIV. CONCLUSION
We presented the first study on the resilience of code models for binary similarity to black-box and white-box adversarial attacks, covering targeted and untargeted scenarios. Our tests highlight that current state-of-the-art solutions in the field (Gemini, GMN, and SAFE) are not robust to adversarial attacks crafted to mislead binary similarity models. Furthermore, their resilience against untargeted attacks appears significantly lower in our tests. Our black-box Spatial Greedy technique also shows that an instruction-selection strategy guided by a dynamic exploration of the entire ISA is more effective than using a fixed set of instructions. We hope to encourage follow-up studies by the community to improve the robustness and performance of these systems.

ACKNOWLEDGMENT
This work has been carried out while Gianluca Capozzi was enrolled in the Italian National Doctorate on Artificial Intelligence run by Sapienza University of Rome.

REFERENCES
[1] T. Dullien and R. Rolles, "Graph-based comparison of executable objects (English version)," in Proc. Symp. sur la sécurité des Technol. de l'information et des Commun. (SSTIC), 2005, vol. 5, no. 1, p. 3.
[2] W. M. Khoo, A. Mycroft, and R. Anderson, "Rendezvous: A search engine for binary code," in Proc. 10th Work. Conf. Mining Softw. Repositories (MSR), May 2013, pp. 329–338.
[3] S. Alrabaee, P. Shirani, L. Wang, and M. Debbabi, "SIGMA: A semantic integrated graph matching approach for identifying reused functions in binary code," Digit. Invest., vol. 12, pp. S61–S71, Mar. 2015.
[4] L. Massarelli, G. A. Di Luna, F. Petroni, L. Querzoni, and R. Baldoni, "Function representations for binary similarity," IEEE Trans. Dependable Secure Comput., vol. 19, no. 4, pp. 2259–2273, Jul. 2022.
[5] Y. David, N. Partush, and E. Yahav, "Statistical similarity of binaries," in Proc. 37th ACM SIGPLAN Conf. Program. Lang. Design Implement., Jun. 2016, pp. 266–280.
[6] Y. David, N. Partush, and E. Yahav, "Similarity of binaries through re-optimization," in Proc. 38th ACM SIGPLAN Conf. Program. Lang. Design Implement., Jun. 2017, pp. 79–94.
[7] M. Egele, M. Woo, P. Chapman, and D. Brumley, "Blanket execution: Dynamic similarity testing for program binaries and components," in Proc. 23rd USENIX Secur. Symp. (SEC), 2014, pp. 303–317.
[8] S. H. H. Ding, B. C. M. Fung, and P. Charland, "Asm2Vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization," in Proc. IEEE Symp. Secur. Privacy (SP), May 2019, pp. 472–489.
[9] X. Xu, C. Liu, Q. Feng, H. Yin, L. Song, and D. Song, "Neural network-based graph embedding for cross-platform binary code similarity detection," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., Oct. 2017, pp. 363–376.
[10] J. Pewny, F. Schuster, L. Bernhard, T. Holz, and C. Rossow, "Leveraging semantic signatures for bug search in binary programs," in Proc. 30th Annu. Comput. Secur. Appl. Conf. (ACSAC), 2014, pp. 406–415.
[11] X. Yuan, P. He, Q. Zhu, and X. Li, "Adversarial examples: Attacks and defenses for deep learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 9, pp. 2805–2824, Sep. 2019.
[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," 2013, arXiv:1312.6199.
[13] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," 2014, arXiv:1412.6572.
[14] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in Proc. IEEE Symp. Secur. Privacy (SP), May 2017, pp. 39–57.
[15] J. Li, S. Qu, X. Li, J. Szurley, J. Z. Kolter, and F. Metze, "Adversarial music: Real world audio adversary against wake-word detection system," in Proc. 32nd Annu. Conf. Neural Inf. Process. Syst. (NeurIPS), 2019, pp. 11908–11918.
[16] R. Jia and P. Liang, "Adversarial examples for evaluating reading comprehension systems," in Proc. 22nd Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2017, pp. 2021–2031.
[17] F. Pierazzi, F. Pendlebury, J. Cortellazzi, and L. Cavallaro, "Intriguing properties of adversarial ML attacks in the problem space," in Proc. 41st IEEE Symp. Secur. Privacy (SP), 2020, pp. 1332–1349.
[18] B. Devore-McDonald and E. D. Berger, "Mossad: Defeating software plagiarism detection," in Proc. ACM Program. Lang. (OOPSLA), vol. 4, Jun. 2020, pp. 1–28.
[19] A. Hazimeh, A. Herrera, and M. Payer, "Magma: A ground-truth fuzzing benchmark," in Proc. ACM Meas. Anal. Comput. Syst., 2020, vol. 4, no. 3, pp. 1–29.
[20] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proc. 27th Annu. Conf. Neural Inf. Process. Syst. (NeurIPS), 2013, pp. 3111–3119.
[21] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," 2017, arXiv:1706.06083.
[22] Y. Li, C. Gu, T. Dullien, O. Vinyals, and P. Kohli, "Graph matching networks for learning the similarity of graph structured objects," in Proc. Int. Conf. Mach. Learn., 2019, pp. 3835–3845.
[23] A. Marcelli, M. Graziano, X. Ugarte-Pedrero, Y. Fratantonio, M. Mansouri, and D. Balzarotti, "How machine learning is solving the binary function similarity problem," in Proc. 31st USENIX Secur. Symp. (SEC), 2022, pp. 2099–2116.
[24] N. Mrkšić, D. Ó Séaghdha, B. Thomson, M. Gašić, L. M. Rojas-Barahona, P.-H. Su, D. Vandyke, T.-H. Wen, and S. Young, "Counter-fitting word vectors to linguistic constraints," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol., 2016, pp. 142–148.
[25] S. Ren, Y. Deng, K. He, and W. Che, "Generating natural language adversarial examples through probability weighted word saliency," in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, 2019, pp. 1085–1097.
[26] D. Li, Y. Zhang, H. Peng, L. Chen, C. Brockett, M.-T. Sun, and B. Dolan, "Contextualized perturbation for textual adversarial attack," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol., 2021, pp. 5053–5069.
[27] L. Li, R. Ma, Q. Guo, X. Xue, and X. Qiu, "BERT-ATTACK: Adversarial attack against BERT using BERT," in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2020, pp. 6193–6202.
[28] N. Yefet, U. Alon, and E. Yahav, "Adversarial examples for models of code," in Proc. ACM Program. Lang. (OOPSLA), vol. 4, Jun. 2020, pp. 1–30.
[29] W. Zhang, S. Guo, H. Zhang, Y. Sui, Y. Xue, and Y. Xu, "Challenging machine learning-based clone detectors via semantic-preserving code transformations," IEEE Trans. Softw. Eng., vol. 49, no. 5, pp. 3052–3070, May 2023.
[30] B. Kolosnjaji, A. Demontis, B. Biggio, D. Maiorca, G. Giacinto, C. Eckert, and F. Roli, "Adversarial malware binaries: Evading deep learning for malware detection in executables," in Proc. 26th Eur. Signal Process. Conf. (EUSIPCO), Sep. 2018, pp. 533–537.
[31] F. Kreuk, A. Barak, S. Aviv-Reuven, M. Baruch, B. Pinkas, and J. Keshet, "Adversarial examples on discrete sequences for beating whole-binary malware detection," 2018, arXiv:1802.04528.
[32] K. Lucas, M. Sharif, L. Bauer, M. K. Reiter, and S. Shintre, "Malware makeover: Breaking ML-based static analysis by modifying executable bytes," in Proc. 16th ACM Asia Conf. Comput. Commun. Secur. (AsiaCCS), 2021, pp. 744–758.
[33] W. Song, X. Li, S. Afroz, D. Garg, D. Kuznetsov, and H. Yin, "MAB-malware: A reinforcement learning framework for blackbox generation of adversarial malware," in Proc. 17th ACM Asia Conf. Comput. Commun. Secur. (AsiaCCS), 2022, pp. 990–1003.
[34] L. Jia, B. Tang, C. Wu, Z. Wang, Z. Jiang, Y. Lai, Y. Kang, N. Liu, and J. Zhang, "FuncFooler: A practical black-box attack against learning-based binary code similarity detection methods," 2022, arXiv:2208.14191.
[35] B. Biggio and F. Roli, "Wild patterns: Ten years after the rise of adversarial machine learning," Pattern Recognit., vol. 84, pp. 317–331, Dec. 2018.
[36] P. Borrello, D. C. D'Elia, L. Querzoni, and C. Giuffrida, "Constantine: Automatic side-channel resistance using efficient control and data flow linearization," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., Nov. 2021, pp. 715–733.
[37] S. Heule, E. Schkufza, R. Sharma, and A. Aiken, "Stratified synthesis: Automatically learning the x86-64 instruction set," in Proc. 37th ACM SIGPLAN Conf. Program. Lang. Des. Implement. (PLDI), 2016, pp. 237–250.
[38] Z. L. Chua, S. Shen, P. Saxena, and Z. Liang, "Neural nets can learn function type signatures from binaries," in Proc. 26th USENIX Secur. Symp. (SEC), 2017, pp. 99–116.
[39] H. Dai, B. Dai, and L. Song, "Discriminative embeddings of latent variable models for structured data," in Proc. 33rd Int. Conf. Mach. Learn. (ICML), vol. 48, 2016, pp. 2702–2711.
[40] J. Pennington, R. Socher, and C. Manning, "GloVe: Global vectors for word representation," in Proc. 19th Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2014, pp. 1532–1543.
[41] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, "Enriching word vectors with subword information," Trans. Assoc. Comput. Linguistics, vol. 5, pp. 135–146, Dec. 2017.
[42] C. Guo, J. R. Gardner, Y. You, A. G. Wilson, and K. Q. Weinberger, "Simple black-box adversarial attacks," in Proc. 36th Int. Conf. Mach. Learn. (ICML), vol. 97, 2019, pp. 2484–2493.
[43] J. Chen, M. I. Jordan, and M. J. Wainwright, "HopSkipJumpAttack: A query-efficient decision-based attack," in Proc. 41st IEEE Symp. Secur. Privacy (SP), 2020, pp. 1277–1294.
[44] W. K. Wong, H. Wang, Z. Li, and S. Wang, "BinAug: Enhancing binary similarity analysis with low-cost input repairing," in Proc. IEEE/ACM 46th Int. Conf. Softw. Eng., vol. 9, Feb. 2024, pp. 1–13.
[45] L. Massarelli, G. A. Di Luna, F. Petroni, L. Querzoni, and R. Baldoni, "Investigating graph embedding neural networks with unsupervised features extraction for binary analysis," in Proc. Workshop Binary Anal. Res., 2019, pp. 1–11.
[46] X. He, S. Wang, Y. Xing, P. Feng, H. Wang, Q. Li, S. Chen, and K. Sun, "BinProv: Binary code provenance identification without disassembly," in Proc. 25th Int. Symp. Res. Attacks, Intrusions Defenses, Oct. 2022, pp. 350–363.
[47] K. Lucas, S. Pai, W. Lin, L. Bauer, M. K. Reiter, and M. Sharif, "Adversarial training for raw-binary malware classifiers," in Proc. 32nd USENIX Secur. Symp., 2023, pp. 1163–1180.
[48] J. Cohen, E. Rosenfeld, and J. Z. Kolter, "Certified adversarial robustness via randomized smoothing," in Proc. 36th Int. Conf. Mach. Learn. (ICML), vol. 97, 2019, pp. 1310–1320.
[49] D. Gibert, G. Zizzo, and Q. Le, "Towards a practical defense against adversarial attacks on deep learning-based malware detectors via randomized smoothing," 2023, arXiv:2308.08906.
[50] Z. Huang, N. G. Marchant, K. Lucas, L. Bauer, O. Ohrimenko, and B. I. P. Rubinstein, "RS-Del: Edit distance robustness certificates for sequence classifiers via randomized deletion," in Proc. 36th Annu. Conf. Neural Inf. Process. Syst. (NeurIPS), 2023, pp. 1–36.
[51] D. Gibert, L. Demetrio, G. Zizzo, Q. Le, J. Planes, and B. Biggio, "Certified adversarial robustness of machine learning-based malware detectors via (De)Randomized smoothing," 2024, arXiv:2405.00392.
[52] H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang, "jTrans: Jump-aware transformer for binary code similarity detection," in Proc. 31st ACM SIGSOFT Int. Symp. Softw. Test. Anal. (ISSTA), 2022, pp. 1–13.

GIANLUCA CAPOZZI received the master's degree in engineering in computer science from the Sapienza University of Rome, Italy, in 2021, where he is currently pursuing the Ph.D. degree. His main research interest includes adversarial machine learning against neural network models for binary analysis.

DANIELE CONO D'ELIA received the Ph.D. degree in engineering in computer science from the Sapienza University of Rome, in 2016. He is currently a tenure-track Assistant Professor with the Sapienza University of Rome. His research activities span several fields across software and systems security, with contributions in the analysis of adversarial code and in the design of program analyses and transformations to make software more secure.

GIUSEPPE ANTONIO DI LUNA received the Ph.D. degree. After the Ph.D. study, he did postdoctoral research with the University of Ottawa, Canada, working on fault-tolerant distributed algorithms, distributed robotics, and algorithm design for programmable particles. In 2018, he started postdoctoral research with Aix-Marseille University, France, where he worked on dynamic graphs. Currently, he is performing research on applying NLP techniques to the binary analysis domain. He is an Associate Professor with the Sapienza University of Rome, Italy.

LEONARDO QUERZONI received the Ph.D. degree with a thesis on efficient data routing algorithms for publish/subscribe middleware systems, in 2007. He is a Full Professor with the Sapienza University of Rome, Italy. He has authored more than 80 papers published in international scientific journals and conferences. His research interests range from cyber security to distributed systems, in particular binary similarity, distributed stream processing, dependability, and security in distributed systems. In 2017, he received the Test of Time Award from the ACM International Conference on Distributed Event-Based Systems for the paper TERA: Topic-Based Event Routing for Peer-to-Peer Architectures, published in 2007.
Open Access funding provided by 'Università degli Studi di Roma La Sapienza' within the CRUI CARE Agreement.