
Received 11 September 2024, accepted 22 October 2024, date of publication 30 October 2024, date of current version 11 November 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3488204

Adversarial Attacks Against Binary Similarity Systems
GIANLUCA CAPOZZI , DANIELE CONO D’ELIA , GIUSEPPE ANTONIO DI LUNA ,
AND LEONARDO QUERZONI
Department of Computer, Control, and Management Engineering Antonio Ruberti, Sapienza University of Rome, 00185 Rome, Italy
Corresponding author: Gianluca Capozzi ([email protected])
This work was supported in part by the Ministero dell’ Università e della Ricerca (MUR) National Recovery and Resilience Plan funded by
the European Union–NextGenerationEU through the Project SEcurity and RIghts In the CyberSpace (SERICS) under Grant PE00000014
and through the project Rome Technopole under Grant ECS00000024, in part by Sapienza Ateneo under Project RM1221816C1760BF and
Project AR1221816C754C33, and in part by the Amazon Web Services (AWS) Cloud Credit Program.

ABSTRACT Binary analysis has become essential for software inspection and security assessment. As the
number of software-driven devices grows, research is shifting towards autonomous solutions using deep
learning models. In this context, a hot topic is the binary similarity problem, which involves determining
whether two assembly functions originate from the same source code. However, it is unclear how deep
learning models for binary similarity behave in an adversarial context. In this paper, we study the resilience of
binary similarity models against adversarial examples, showing that they are susceptible to both targeted and
untargeted (w.r.t. similarity goals) attacks performed by black-box and white-box attackers. We extensively
test three state-of-the-art binary similarity solutions against (i) a black-box greedy attack that we enrich
with a new search heuristic, terming it Spatial Greedy, and (ii) a white-box attack in which we repurpose
a gradient-guided strategy used in attacks to image classifiers. Interestingly, the target models are more
susceptible to black-box attacks than white-box ones, exhibiting greater resilience in the case of targeted
attacks.

INDEX TERMS Adversarial attacks, binary analysis, binary code models, binary similarity, black-box
attacks, greedy, white-box attacks.

I. INTRODUCTION
An interesting problem that currently is a hot topic in the security and software engineering research communities [1], [2], [3], is the binary similarity problem. That is, to determine if two functions in assembly code are compiled from the same source code [4]: if so, the two functions are said similar. This problem is far from trivial: it is well-known that different compilers and optimization levels radically change the shape of the generated assembly code.
Binary similarity has many applications, including plagiarism detection, malware detection and classification, and vulnerability detection [5], [6], [7]. It can also be a valid aid for a reverse engineer as it helps with the identification of functions taken from well-known libraries or open-source software. Recent research [4] shows that techniques for binary similarity generalize, as they are able to find similarities between semantically similar functions.
We can distinguish binary similarity solutions between the ones that use deep neural networks (DNNs), like [4], [8], and [9], and the ones that do not, like [1] and [10]. Nearly all of the most recent works rely on DNNs, which offer in practice state-of-the-art performance while being computationally inexpensive. This aspect is particularly apparent when compared with solutions that build on symbolic execution or other computationally intensive techniques.
A drawback of DNN-based solutions is their sensitivity to adversarial attacks [11], where an adversary crafts an innocuously looking instance with the purpose of misleading the target neural network model. Successful adversarial attacks are well-documented for DNNs that process, for example, images [12], [13], [14], audio and video samples [15], and text [16].
The associate editor coordinating the review of this manuscript and approving it for publication was Mahmoud Elish.

In spite of the wealth of works identifying similar functions with ever improving accuracy, we found that an extensive study on the resilience of (DNN-based) binary similarity solutions against adversarial attacks is missing. Indeed, we believe binary similarity systems are an attractive target for an adversary. As examples, an attacker: (1) may hide a malicious function inside a firmware by making it similar to a benign white-listed function, as similarly done in malware misclassification attacks [17]; (2) may make a plagiarized function dissimilar to the original one, analogously to source code authorship attribution attacks [18]; or, we envision, (3) may replace a function—entirely or partially, as in forward porting of bugs [19]—with an old version known to have a vulnerability and make the result dissimilar from the latter.
In this context, we can define an attack targeted when the goal is to make a rogue function be the most similar to a target, as with example (1). Conversely, an attack is untargeted when the goal is to make a rogue function the most dissimilar from its original self, as with examples (2) and (3). In both scenarios, the adversarial instance has to preserve the semantics (i.e., execution behavior) of the rogue function as in the original.
In this paper, we aim to close this gap by proposing and evaluating techniques for targeted and untargeted attacks using both black-box (where adversaries have access to the similarity model without knowing its internals) and white-box (where they also know its internals) methods.
For the black-box scenario, we adopt a greedy optimizer to modify a function by inserting a single assembly instruction into its body at each optimization step. Where applicable, we consider an enhanced gray-box [17] variant that, leveraging limited knowledge of the model, chooses only between instructions that the model treats as distinct.
We then enrich the greedy optimizer with a novel black-box search heuristic, where we transform the discrete space of assembly instructions into a continuous space using a technique based on instruction embeddings [20]. We call this enhanced black-box attack Spatial Greedy. When using our heuristic, the black-box attack is on par with or outperforms the gray-box greedy attack, without requiring any knowledge of the model. For the white-box scenario, we repurpose a method for adversarial attacks on images that relies on gradient descent [21] and use it to drive instruction insertion decisions.
We test our techniques against three binary similarity systems—Gemini [9], GMN [22], and SAFE [4]—focusing on three research questions: (RQ1) determining whether the target models are more robust against targeted or untargeted attacks; (RQ2) assessing whether the target models exhibit greater resilience to black-box or white-box approaches; and (RQ3) exploring how target models influence the effectiveness of our attacks. Our results indicate that all three models are inherently more vulnerable to untargeted attacks. In the targeted scenario, the best attack technique misled the target models in 31.6% of instances for Gemini, 59.68% for GMN, and 60.68% for SAFE. However, in the untargeted scenario, these percentages increased to 53.89% for Gemini, 93.81% for GMN, and 90.62% for SAFE. Our analysis shows that all target models are more resilient to our white-box procedure; we believe this is largely due to the inherent challenges of conducting gradient-based attacks on models that use discrete representations.

A. CONTRIBUTIONS
This paper proposes the following contributions:
• we propose to study the problem of adversarial attacks against binary similarity systems, identifying targeted and untargeted attack opportunities;
• we investigate black-box attacks against DNN-based binary similarity systems, exploring an instruction insertion technique based on a greedy optimizer. Where applicable, we enhance it in a gray-box fashion for efficiency, using partial knowledge of the model sensitivity to instruction types;
• we propose Spatial Greedy, a fully black-box attack that matches or outperforms gray-box greedy by using a novel search heuristic for guiding the choice of the candidates' instructions used during the attack;
• we investigate white-box attacks against DNN-based binary similarity systems, exploring a gradient-guided search strategy for inserting instructions;
• we conduct an extensive experimental evaluation of our techniques in different attack scenarios against three systems backed by largely different models and with high performance in recent studies [23].

II. RELATED WORKS
In this section, we first discuss loosely related approaches for attacking image classifiers and natural language processing (NLP) models; then, we describe attacks against source code models. Finally, we discuss prominent attacks against models for binary code analysis.

A. ATTACKS TO IMAGE CLASSIFIERS AND NLP MODELS
Historically, the first adversarial attacks targeted image classifiers. The crucial point for these attacks is to insert inside a clean image instance a perturbation that should not be visible to the human eye while being able to fool the target model, as first pointed out by [12] and [13].
Most of the attacks modify the original instances using gradient-guided methods. In particular, when computing an adversarial example, they keep the weights constant while altering the starting input in the direction of the gradient that minimizes (or maximizes, depending on whether the attack is targeted or untargeted) the loss function of the attacked model. The FGSM attack [13] explicitly implements this technique. Other attacks, such as the Carlini-Wagner [14] one, generate a noise that is subject to Lp-norm constraints to preserve similarity to original objects.
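To make the gradient-guided recipe concrete, the following is a minimal PyTorch-style sketch of a single FGSM step as just described; the classifier, input tensor, and label are placeholders, and the perturbation bound is an arbitrary choice.

```python
# A minimal PyTorch sketch of one FGSM step; `model`, `image`, and `label`
# are placeholders for an attacked classifier and one of its inputs.
import torch
import torch.nn.functional as F

def fgsm_step(model, image, label, epsilon=0.03, targeted=False):
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Untargeted: ascend the loss; targeted: descend it toward the chosen label.
    direction = -1.0 if targeted else 1.0
    adv = image + direction * epsilon * image.grad.sign()
    return adv.clamp(0.0, 1.0).detach()
```

Iterating such steps with a projection onto a small norm ball yields PGD-style attacks like the one repurposed later in Section VI.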


As observed in Section III-B, adversarial examples generation is possibly easier in the image domain than in the textual one, due to the continuous representation of the original objects. In the NLP domain, the inputs are discrete objects, a fact that prevents any direct application of gradient-guided methods for adversarial examples generation. Ideally, perturbations to fool deep models for language analysis should be grammatically correct and semantically coherent with the original instance.
One of the earliest methodologies for attacking NLP models is presented in [16]. The authors propose attacks to mislead deep learning-based reading comprehension systems by inserting perturbations in the form of new sentences inside a paragraph, so as to confuse the target model while maintaining intact the original correct answer. The attacks proposed in [24] and [25] focus on finding replacement strategies for words composing the input sequence. Intuitively, valid substitutes should be searched through synonyms; however, this strategy could fall short in considering the context surrounding the word to substitute. Works like [26] and [27] further investigate this idea using BERT-based models for identifying accurate word replacements.

B. ATTACKS AGAINST MODELS FOR SOURCE CODE ANALYSIS
This section covers some prominent attacks against models that work on source code.
The general white-box attack of [28] iteratively substitutes a target variable name in all of its occurrences with an alternative name until a misclassification occurs. The attack against plagiarism detection from [18] uses genetic programming to augment a program with code lines picked from a pool and validated for program equivalence by checking that an optimizing compiler removes them. The attack against clone detectors from [29] combines several semantics-preserving perturbations of source code using different optimization heuristic strategies.
We highlight that these approaches have limited applicability in the binary similarity scenario, as their perturbations may not survive compilation (e.g., variable renaming) or result in marginal differences in compiled code (e.g., turning a while-loop into a for-loop).

C. ATTACKS AGAINST MODELS FOR BINARY CODE ANALYSIS
We complete our review of related works by covering research on evading ML-based models for analysis of binary code.

1) ATTACKS AGAINST MALWARE DETECTORS
Attacks such as [30] and [31] to malware detectors based on convolutional neural networks add perturbations in a new non-executable section appended to a Windows PE binary. Both use gradient-guided methods for choosing single-byte perturbations to mislead the model in classifying the whole binary. We emphasize that binary similarity systems analyze executable code, meaning these attacks are ineffective in our scenario.
Pierazzi et al. [17] explore transplanting binary code gadgets into a malicious Android program to avoid detection. The attack follows a gradient-guided search strategy based on a greedy optimization. In the initialization phase, they mine from benign binaries code gadgets that modify features that the classifier uses to compute its classification score. In the attack phase, they pick the gadgets that can mostly contribute to the (mis)classification of the currently analyzed malware sample; they insert gadgets in order of decreasing negative contribution, repeating the procedure until misclassification occurs. To preserve program semantics, gadgets are injected into never-executed code portions. Differently from our main contribution, their attack is only applicable in a targeted white-box scenario.
Lucas et al. [32] target malware classifiers analyzing raw bytes. They propose a functionality-preserving iterative procedure viable for both black-box and white-box attackers. At every iteration, the attack determines a set of applicable perturbations for every function in the binary and applies a randomly selected one (following a hill-climbing approach in the black-box scenario or using the gradient in the white-box one). Done via binary rewriting, the perturbations are local and include instruction reordering, register renaming, and replacing instructions with equivalent ones of identical length. The results show that these perturbations can be effective even against (ML-based) commercial antivirus products, leading the authors to advocate for augmenting such systems with provisions that do not rely on ML. In the context of binary similarity, though, we note that these perturbations would have limited efficacy if done on a specific pair of functions: for example, both instruction reordering and register renaming would go completely unnoticed by Gemini and GMN (Sections VIII-A and VIII-B). Furthermore, since [32] is mainly designed for models that classify binary programs, it is not directly applicable in our scenario, where the output of the model is a real value representing the distance between the two inputs.
MAB-Malware [33] is a reinforcement learning-based approach for generating adversarial examples against PE malware classifiers in a black-box context. Adversarial examples are generated through a multi-armed bandit (MAB) model that has to keep the sample in a single, non-evasive state when selecting actions while learning reward probabilities. The goal of the optimization strategy is to maximize the total reward. The set of applicable perturbations (which can be considered as actions) are standard PE manipulation techniques from prior works: header manipulation, section insertion and manipulation (e.g., adding a trailing byte), and in-place randomization of an instruction sequence (i.e., replacing it with a semantically equivalent one). Each action is associated with a specific content—a payload—added to the malware when the action is selected. An extensive evaluation is conducted on two popular ML-based classifiers and three commercial antivirus products.
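As a purely illustrative aid (not code from MAB-Malware), the toy loop below shows the bandit-style selection the description above refers to: each PE manipulation is treated as an arm, reward probabilities are learned from observed evasions via Thompson sampling, and the action names and reward oracle are placeholders.

```python
# Toy illustration of bandit-style action selection in the spirit of the
# MAB formulation above; action names and the reward oracle are placeholders.
import random

ACTIONS = ["header_manipulation", "section_insertion", "trailing_bytes",
           "in_place_randomization"]

def thompson_pick(successes, failures):
    # Sample a Beta posterior per action and pick the highest draw.
    draws = {a: random.betavariate(1 + successes[a], 1 + failures[a])
             for a in ACTIONS}
    return max(draws, key=draws.get)

def run_bandit(reward_oracle, rounds=100):
    successes = {a: 0 for a in ACTIONS}
    failures = {a: 0 for a in ACTIONS}
    total = 0
    for _ in range(rounds):
        action = thompson_pick(successes, failures)
        reward = reward_oracle(action)   # 1 if the classifier was evaded, else 0
        total += reward
        (successes if reward else failures)[action] += 1
    return total, successes
```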


As outlined for other works, our scenario does not allow for the application of this approach for two primary reasons. Firstly, this attack is specifically designed to target classifiers. Secondly, many of the proposed transformations are ineffective when applied to binary similarity systems.

2) ATTACKS AGAINST BINARY SIMILARITY MODELS
Concurrently to our work, a publicly available technical report proposes FuncFooler [34] as a black-box algorithm for attempting untargeted attacks against ranking systems (i.e., top-k most similar functions) based on binary similarity. The key idea behind the attack is to insert instructions likely to push the source function below the top results returned by the search engine. Insertion points are fixed: specifically, CFG nodes that dominate the exit points of a function. The algorithm picks the instructions directly from those functions with the least similarity in the pool under analysis, then it compensates for their side effects through additional insertions. Differently from their goal to attack binary similarity-based ranking systems, our goal is to directly attack the similarity function implemented by the target model; additionally, differently from their black-box approach designed only for untargeted attacks, we propose methodologies for assessing the robustness of the considered systems against both targeted and untargeted attacks, extending the evaluation to white-box attacks.

III. BACKGROUND
In this section, we provide background knowledge for adversarial attacks against models for code analysis. Then, we introduce a categorization of semantics-preserving perturbations for binary functions.

A. ADVERSARIAL KNOWLEDGE
We can describe a deep learning model through different aspects: training data, layers architecture, loss function, and weights parameters. Having complete or partial knowledge about such elements can facilitate an attack from a computational point of view. According to seminal works in the area [17], [35], we can distinguish between:
• white-box attacks, where the attacker has perfect knowledge of the target model, including all the dimensions mentioned before. These types of attacks are realistic when the adversary has direct access to the model (e.g., an open-source malware classifier);
• gray-box attacks, where the attacker has partial knowledge of the target model. For example, they have knowledge about feature representation (e.g., categories of features relevant for feature extraction);
• black-box attacks, where the attacker has zero knowledge of the target model. Specifically, the attacker is only aware of the task the model was designed for and has a rough idea of what potential perturbations to apply to cause some feature changes [35].
Different attack types may suit different scenarios best. A white-box attack, for example, could be attempted on an open-source malware classifier. Conversely, a black-box attack would also suit a model hosted on a remote server to interrogate, as with a commercial cloud-based antivirus.

FIGURE 1. A feature mapping function maps problem-space objects into feature vectors. The two boxed binary functions implement similar functionalities and are mapped to two points close in the feature space.

B. INVERSE FEATURE MAPPING PROBLEM
In the following, we refer to the input domain as problem space and to all its instances as problem-space objects. Deep learning models can manipulate only continuous problem-space objects. When inputs have a discrete representation, a first phase must map them into continuous instances. The phase usually relies on a feature mapping function (Figure 1) whose outputs are feature vectors. The set of all possible feature vectors is known as the feature space.
Traditional white-box attacks against deep learning models solve an optimization problem in the feature space by minimizing an objective function in the direction following its negative gradient [21]. When the optimization ends, they obtain a feature vector that corresponds to a problem-space object representing the generated adversarial example.
Unfortunately, given a feature vector, it is not always possible to obtain its problem-space representation. This issue is called the inverse feature mapping problem [17]. For code models, the feature mapping function is neither invertible nor differentiable. Therefore, one cannot understand how to modify an original problem-space object to obtain the given feature vector. In particular, the attacker has to employ approximation techniques that create a feasible problem-space object from a feature vector. Ultimately, mounting an attack requires a manipulation of a problem-space object via perturbations guided by either gradient-space attacks (as in the white-box case above) or "gradient-free" optimization techniques (as with black-box attacks). We discuss perturbations specific to our context next.
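For illustration only, the sketch below shows a toy feature mapping in the spirit of Figure 1 (a histogram of coarse instruction categories) and why it admits no exact inverse: distinct problem-space objects collapse onto the same feature vector, so an attacker can only search for code whose features approximate a desired vector.

```python
# A toy feature mapping for illustration only: it counts coarse instruction
# categories in a function. Many different functions share the same counts,
# which is why the mapping has no exact inverse.
from collections import Counter

CATEGORIES = {
    "mov": "transfer", "push": "transfer", "pop": "transfer",
    "add": "arith", "sub": "arith", "imul": "arith",
    "jmp": "branch", "je": "branch", "call": "call",
}

def feature_map(instructions):
    """Map a list of mnemonics to a fixed-length count vector."""
    counts = Counter(CATEGORIES.get(m, "other") for m in instructions)
    order = ["transfer", "arith", "branch", "call", "other"]
    return [counts[c] for c in order]

# Two different functions, identical feature vectors: the inverse mapping
# from vectors back to code is ill-defined.
f_a = ["push", "mov", "add", "call", "pop"]
f_b = ["mov", "push", "sub", "call", "pop"]
assert feature_map(f_a) == feature_map(f_b)
```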


FIGURE 2. Taxonomy of semantics-preserving perturbations suitable for the proposed attacks. Acronyms are spelled out in the body of the paper.

C. SEMANTICS-PRESERVING PERTURBATIONS OF PROBLEM-SPACE OBJECTS
In this section, we discuss how to modify problem-space objects in the specific case of binary code models working on functions. To this end, we review and extend perturbations from prior works [17], [32], [33], identifying those suitable for adversarial manipulation of functions.
For our purpose, we seek to modify an original binary function f into an adversarial binary example fadv that preserves the semantics of f; intuitively, this restricts the set of available perturbations for the adversary. We report a taxonomy of possible semantics-preserving perturbations in Figure 2, dividing them according to how they affect the binary layout of the function's control-flow graph (CFG).

FIGURE 3. Examples of semantics-preserving perturbations that do not alter the binary CFG layout. We modify the assembly snippet in (a) by applying, in turn, (b) Instruction Reordering, (c) Semantics-Preserving Rewriting, and (d) Register Renaming. Altered instructions are in red.

Among CFG-preserving perturbations, we identify:
• (IR) Instruction Reordering: reorder independent instructions in the function;
• (SPR) Semantics-Preserving Rewriting: substitute a sequence of instructions with a semantically equivalent sequence;
• (DSL) Modify the Data-Section Layout: modify the memory layout of the .data section and update all the global memory offsets referenced by instructions;
• (RR) Register Renaming: change all the occurrences of a register as instruction operand with a register currently not in use, or swap the use of two registers.
Figure 3 shows examples of their application. As for perturbations that affect the (binary-level) CFG layout, we can identify the ones that involve adding or deleting nodes:
• (DBA) Dead Branch Addition: add dead code in a basic block guarded by an always-false branch;
• (NS) Node Split: split a basic block without altering the semantics of its instructions (e.g., the original block will jump to the one introduced with the split);
• (NM) Node Merge: merge two basic blocks when semantics can be preserved; for example, by using predicated execution to linearize branch-dependent assignments as conditional mov instructions [36].
And the ones that leave the graph structure unaltered:
• (CP) Complement Predicates: change the predicate of a conditional branch and the branch instruction with their negated version;
• (IBR) Independent Blocks Reordering: change the order in which independent basic blocks appear in the binary representation of the function.

IV. THREAT MODEL AND PROBLEM DEFINITION
In this section, we define our threat model together with the problem of attacking binary similarity models.

A. THREAT MODEL
The focus of this work is to create adversarial instances that attack a model at inference time (i.e., we do not investigate attacks at training time). Following the description provided in Section III-A, we consider two different attack scenarios: respectively, a black-box and a white-box one. In the first case, the adversary has no knowledge of the target binary similarity model; nevertheless, we assume they can perform an unlimited number of queries to observe the output produced by the model. In the second case, we assume that the attacker has perfect knowledge of the target binary similarity model.

B. PROBLEM DEFINITION
Let sim be a similarity function that takes as input two functions, f1 and f2, and returns a real number, the similarity score between them, in [0, 1].
We define two binary functions to be semantically equivalent if they are two implementations of the same abstract functionality. We assume that there exists an adversary that wants to attack the similarity function. The adversary can mount two different kinds of attacks:
• Targeted attack. Given two binary functions, f1 (identified as source) and f2 (identified as target), the adversary wants to find a binary function fadv semantically equivalent to f1 such that sim(fadv, f2) ≥ τt, where τt is a success threshold¹ chosen by the attacker depending on the victim at hand.


FIGURE 4. Overall workflow of the black-box ε-greedy perturbation-selection strategy in the targeted scenario.

FIGURE 5. Toy example describing how the source function f1 is modified during the various steps of our Spatial Greedy attack. We first identify the set of available positions and initialize the candidates' set CAND (a). Then, we enumerate all the possible perturbations (b) and choose one according to the ε-greedy strategy while updating CAND according to the Spatial Greedy heuristic (c). This process (d) is repeated until a successful adversarial example is generated or we reach a maximum number of iterations.

• Untargeted attack. Given a binary function f1, the adversary goal consists of finding a binary function fadv semantically equivalent to f1 such that sim(f1, fadv) ≤ τu. The threshold τu is the analogue of τt for the untargeted attack scenario.
Loosely speaking, in case of a targeted attack, the attacker wants to create an adversarial example that is as similar as possible to a specific function, as in the example scenario (1) presented in Section I. In case of an untargeted attack, the goal of the attacker consists of creating an adversarial example that is as dissimilar as possible from its original version, as in the example scenarios (2) and (3) also from Section I.

C. PERTURBATION SELECTION
Given a binary function f1, our attack consists in applying to it perturbations that do not alter its semantics.
To study the feasibility of our approach, we choose dead branch addition (DBA) among the suitable perturbations outlined in Section III-C. We find DBA adequate for this study for two reasons: it is sufficiently expressive so as to affect heterogeneous models (which may not hold for others²) and its implementation complexity for an attacker is fairly limited.

¹ Although fadv and f2 are similar for the model, they are not semantically equivalent: this is precisely the purpose of an attack that wants to fool the model to consider them as such, while they are not.
² For example, basic block-local transformations such as IR and RR would have limited efficacy on models that study an individual block for its instruction types and counts or other coarse-grained abstractions. This is the case with Gemini and GMN that we attack in this paper.
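To give a flavor of how DBA can be realized, the sketch below (our own illustration of the compilation-time strategy described in the next paragraphs) emits a placeholder dead block as GCC-style inline assembly guarded by a predicate that is always false at run time; the guard variable, label naming, and payload instructions are hypothetical.

```python
# A hedged sketch of emitting a DBA placeholder at compilation time: a block
# of attacker-chosen instructions guarded by a branch that is never taken.
# The guard `always_false_<id>` is assumed to be a value the compiler cannot
# constant-fold (e.g., a volatile global that is zero at run time).

def emit_dead_branch(block_id, instructions):
    """Return C source text for one dead branch hosting `instructions`."""
    asm_body = "\\n\\t".join(instructions)  # emit literal \n\t separators in C
    return (
        f"if (always_false_{block_id}) {{\n"
        f"    __asm__ __volatile__(\"{asm_body}\");\n"
        f"}}\n"
    )

print(emit_dead_branch(0, ["nop", "add $0, %rax", "xchg %rbx, %rbx"]))
```

In practice the guard must be opaque enough that the compiler does not simply delete the block, which is why obfuscations such as opaque predicates are mentioned below as a possible strengthening.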


Nonetheless, other choices remain possible, as we will further discuss in Section XIII.
At each application, our embodiment of DBA inserts in the binary code of f1 one or more instructions in a new or existing basic block guarded by a branch that is never taken at runtime (i.e., we use an always-false branch predicate).
Such a perturbation can be done at compilation time or on an existing binary function instance. For our study, we apply DBA during compilation by adding placeholder blocks as inline assembly, which eases the generation of many adversarial examples from a single attacker-controlled code. State-of-the-art binary rewriting techniques would work analogously over already-compiled source functions. We currently do not attempt to conceal the nature of our branch predicates for preprocessing robustness, which [17] discusses as something that attackers should be wary of to mount stronger attacks. We believe off-the-shelf obfuscations (e.g., opaque predicates, mixed boolean-arithmetic expressions) or more complex perturbation choices may improve our approach in this respect. Nevertheless, our main goal was to investigate its feasibility in the first place.

V. BLACK-BOX ATTACK: SOLUTION OVERVIEW
In this section, we describe our black-box attack. We first introduce our baseline (named Greedy), highlighting its limitations. We then move to our main contribution in the black-box scenario (named Spatial Greedy). Figure 4 depicts a general overview of our black-box approach.

A. GREEDY
The baseline black-box approach we consider for attacking binary function similarity models consists of an iterative perturbation-selection rule that follows a greedy optimization strategy. Starting from the original sample f1, we iteratively apply perturbations T1, T2, ..., Tk selected from a set of available ones, generating a series of instances fadv1, fadv2, ..., fadvk. This procedure ends upon generating an example fadv meeting the desired similarity threshold; otherwise, the attack fails after δ̄ completed iterations.
For instantiating Greedy using DBA, we reason on a set of positions BLK for inserting dead branches in function f1 and a set of instructions CAND, which we call the set of candidates. Each perturbation consists of a ⟨bl, in⟩ pair made of the branch bl ∈ BLK and an instruction in ∈ CAND to insert in the dead code block guarded by bl.
The naive perturbation-selection rule (i.e., greedy) at each step selects the perturbation that, in case of targeted attack, locally maximizes the relative increase of the objective function. Conversely, for an untargeted attack, the optimizer selects the perturbation that locally maximizes the relative decrease of the objective function.
This approach, however, may be prone to finding local optima. To avoid this problem, we choose as our Greedy baseline an ε-greedy perturbation-selection rule. Here, we select with a small probability ε a suboptimal perturbation instead of the one that the standard greedy strategy picks, and with probability 1 − ε the one representing the local optimum.
In case of targeted attack, the objective function is the similarity between fadv and the target function f2 (formally, sim(fadv, f2)), while it is the negative of the similarity between fadv and the original function in case of untargeted attack (formally, −sim(f1, fadv)). In the following, we only discuss the maximization strategy followed by targeted attacks; mutatis mutandis, the same rationale holds for untargeted attacks.

1) LIMITATIONS OF THE COMPLETE ENUMERATION STRATEGY
At each step, Greedy enumerates all the applicable perturbations computing the marginal increase of the objective function, thus resulting in selecting an instruction in by enumerating all the possible instructions of the considered set of candidates CAND for each position bl ∈ BLK.
Unfortunately, the Instruction Set Architecture (ISA) of a modern CPU may consist of a large number of instructions. To give an example, consider the x86-64 ISA: according to [37], it has 981 unique mnemonics and a total of 3,684 instruction variants (without counting register operand choices for them). Therefore, it would be unfeasible to have a CAND set that covers all possible instructions of an ISA. This means that the size of CAND must be limited. One possibility is to use hand-picked instructions. However, this approach has two problems. Such a set could not cover all the possible behaviors of the ISA, missing fundamental aspects (for example, leaving vector instructions uncovered); furthermore, this effort has to be redone for a new ISA. There is also a more subtle pitfall: a set of candidates fixed in advance could include instructions that the specific binary similarity model under attack deems as not significant.
On specific models, it may still be possible to use a small set of candidates profitably, enabling a gray-box attack strategy for Greedy. In particular, one can restrict the set of instructions to the ones that effectively impact the features extracted by the attacked model (which obviously requires knowledge of the features it uses; hence, the gray-box characterization). In such cases, this strategy is equivalent to the black-box Greedy attack that picks from all the instructions in the ISA, but computationally much more efficient.

B. SPATIAL GREEDY
In this section, we extend the baseline approach by introducing a fully black-box search heuristic. To differentiate between the baseline solution and the heuristic-enhanced one, we name the latter Spatial Greedy.
When using this heuristic, the black-box attack overcomes all the limitations discussed for Greedy using an adaptive procedure that dynamically updates the set of candidates according to feedback from the model under attack, without requiring any knowledge of it.
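The ε-greedy selection rule just described can be summarized with the following Python sketch; `similarity` stands for the black-box query oracle of the target model and `apply` for the DBA insertion routine, both placeholders, and the parameter values are arbitrary.

```python
# Minimal sketch of the ε-greedy Greedy baseline (targeted case).
# `similarity(f, f2)` queries the attacked model as a black box and
# `apply(f, bl, ins)` inserts `ins` into the dead block `bl`; both are
# placeholders for the real tooling.
import random

def greedy_attack(f1, f2, BLK, CAND, similarity, apply,
                  tau_t=0.95, eps=0.1, max_steps=50):
    f_adv = f1
    for _ in range(max_steps):
        if similarity(f_adv, f2) >= tau_t:      # attack succeeded
            return f_adv
        # Enumerate every <block, instruction> perturbation and score it.
        scored = [((bl, ins), similarity(apply(f_adv, bl, ins), f2))
                  for bl in BLK for ins in CAND]
        if random.random() < eps:               # explore: suboptimal choice
            (bl, ins), _ = random.choice(scored)
        else:                                   # exploit: local optimum
            (bl, ins), _ = max(scored, key=lambda p: p[1])
        f_adv = apply(f_adv, bl, ins)
    return None                                 # budget exhausted: attack failed
```

For an untargeted attack, the same loop maximizes −sim(f1, fadv) and stops once the score drops below τu.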


Algorithm 1 Spatial Greedy procedure (targeted case)
Input: source function f1, target function f2, similarity threshold τt, max number of dead branches B, max number of instructions to be inserted δ̄, max number of instructions to be tested N, max number of random instructions r, max number of neighbours c, probability of selecting a random perturbation ε.
Output: adversarial sample fadv.
Definitions:
• The function getPositions(f1, B) identifies B positions inside f1 where it is possible to insert dead branches.
• The function getRandomInstructions(N) samples uniformly N instructions from the entire ISA.
• The operator ⊕ indicates the insertion into a function of a certain instruction into a specific block.
• The function selectGreedy(·) takes as input a vector of pairs ⟨⟨bl, in⟩, currSim⟩ and returns the ⟨bl, in⟩ perturbation associated to the maximum currSim value.
• The function selectRandom(·) takes as input a vector of pairs ⟨⟨bl, in⟩, currSim⟩ and returns a perturbation uniformly sampled.
• The function getTopK(·, K) takes as input a vector of pairs ⟨⟨bl, in⟩, currSim⟩ and returns the instructions associated to the top-K greedy actions.
• The function updateInstructions(·, r, c) takes as input a vector of instructions and returns a vector containing c of their neighbours and r instructions sampled uniformly at random.
1: fadv ← f1
2: instr ← 0
3: BLK ← getPositions(f1, B)
4: CAND ← getRandomInstructions(N)
5: sim ← sim(fadv, f2)
6: while sim ≤ τt AND instr < δ̄ do
7:   iterSim ← sim
8:   iterBlock ← ⟨⟩
9:   testedPerts ← [ ]
10:  for ⟨bl, in⟩ ∈ BLK × CAND do
11:    f′adv ← fadv ⊕ ⟨bl, in⟩
12:    currSim ← sim(f′adv, f2)
13:    testedPerts.append(⟨⟨bl, in⟩, currSim⟩)
14:  prob ← uniform(0, 1)
15:  if prob < ε then
16:    iterPert, iterSim ← selectRandom(testedPerts)
17:  else
18:    iterPert, iterSim ← selectGreedy(testedPerts)
19:  fadv ← fadv ⊕ iterPert
20:  elected ← getTopK(testedPerts, K)
21:  CAND ← updateInstructions(elected, r, c)
22:  sim ← iterSim
23:  instr ← instr + 1
24: return fadv

FIGURE 6. Dynamic update of the set of candidates. The mov instruction is the greedy action for the current iteration and is mapped to the blue point in the instruction embedding space. The set of candidates is updated selecting c/k neighbours of the considered top-k perturbation (represented in red), c − c/k instructions among the closest neighbours of the remaining top-k greedy perturbations, and rN random instructions.

In Spatial Greedy, we extend the ε-greedy perturbation-selection strategy by adaptively updating the set of candidates that we use at each iteration. Using instruction embedding techniques, we transform each instruction in ∈ CAND into a vector of real values. This creates vectors that partially preserve the semantics of the original instructions. Chua et al. [38] showed that such vectors may be grouped by instruction semantics, creating a notion of proximity between instructions: for example, vectors representing arithmetic instructions are in a cluster, vectors representing branches in another, and so on.
Here, at each step, we populate a portion of the set of candidates by selecting the instructions that are close, in the embedding metric space, to instructions that have shown a good impact on the objective function. The remaining portion of the set is composed of random instructions. We discuss our choices for instruction embedding techniques and dynamic candidates selection in the following.
In the experimental section, for the black-box realm, we will compare Spatial Greedy against the Greedy approach, opting for the computationally efficient gray-box flavor of Greedy when allowed by the specific model under study.

1) INSTRUCTION EMBEDDING SPACE
We embed assembly instructions into numeric vectors using an instruction embedding model [20]. Given such a model M and a set I of assembly instructions, we map each i ∈ I to a vector of real values ⃗i ∈ Rn, using M. The model is such that, for two instructions having similar semantics, the embeddings it produces will be close in the metric space.

2) DYNAMIC SELECTION OF THE SET OF CANDIDATES
The process for updating the set of candidates for each iteration of the ε-greedy perturbation-selection procedure represents the focal point of Spatial Greedy.
Let N be the size of the set of candidates CAND. Initially, we fill it with N random instructions. Then, at each iteration of the ε-greedy procedure, we update CAND by replacing the current instructions with rN random instructions, where r ∈ [0, 1), and c instructions we select among the closest neighbors of the instructions composing the top-k greedy actions of the last iteration.
In case of a targeted attack, the top-k greedy perturbations are the k perturbations that, at the end of the last iteration, achieved the highest increase of the objective function. To keep the size of the set stable at value N, we take the closest c/k neighbors of each top-k action.³
The rationale of having r random and c selected instructions is seeking a balance between exploration and exploitation.

³ We also apply rounding so that we can work with integer numbers.
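The candidate-update step at line 21 of Algorithm 1 can be sketched as follows, assuming a dictionary that maps every ISA instruction to a precomputed embedding vector (e.g., produced by a word2vec-style instruction embedding model); the helper names are ours, and deriving the neighbor portion from N and r is a simplification of the c and r parameters above.

```python
# A sketch of the dynamic candidate-set update. `embeddings` maps every ISA
# instruction to a precomputed vector; helper names are ours, not the
# paper's code.
import random
import numpy as np

def nearest_neighbors(instr, embeddings, how_many):
    """Return the `how_many` instructions closest to `instr` in embedding space."""
    base = embeddings[instr]
    dists = {other: np.linalg.norm(vec - base)
             for other, vec in embeddings.items() if other != instr}
    return sorted(dists, key=dists.get)[:how_many]

def update_candidates(top_k_instrs, embeddings, N, r):
    """Rebuild CAND: neighbors of the top-k greedy actions plus rN random picks."""
    num_random = round(r * N)
    c = N - num_random
    per_action = max(1, c // max(1, len(top_k_instrs)))
    cand = []
    for instr in top_k_instrs:                      # exploitation portion
        cand.extend(nearest_neighbors(instr, embeddings, per_action))
    pool = list(embeddings)                          # exploration portion
    cand.extend(random.sample(pool, num_random))
    return cand[:N]
```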


With the random instructions, we randomly sample the solution space to escape from a possibly local optimum found for the objective function. With the selected instructions, we exploit the part of the space that in the past has brought the best solutions. Figure 6 provides a pictorial representation of the update procedure.
We present the complete description of Spatial Greedy in case of targeted attack in Algorithm 1, together with a simplified execution example in Figure 5. The first step (a) consists in identifying the positions BLK where to introduce dead branches (function getPositions(f1, B) at line 3) and initializing the set of candidates CAND with N random instructions (function getRandomInstructions(N) at line 4). Then, during the iterative procedure (d), we first enumerate all the possible perturbations (b). Then (c), we apply the perturbation-selection rule according to the value of ε, and we get the top-k greedy perturbations (line 20) as depicted in Figure 6. Finally, we update the set of candidates (line 21).

VI. WHITE-BOX ATTACK: SOLUTION OVERVIEW
As pointed out in Section III-A, in a white-box scenario the attacker has perfect knowledge of the target deep learning model, including its loss function and gradients. We discuss next how we can build on them to mount an attack.

A. GRADIENT-GUIDED CODE ADDITION METHOD
White-box adversarial attacks have been largely investigated against image classifiers by the literature, resulting in valuable effectiveness [13]. Our attack strategy for binary similarity derives from the design pattern of the PGD attack [21], which iteratively targets image classifiers.
We call our proposed white-box attack Gradient-guided Code Addition Method (GCAM). It consists in applying a set of perturbations using a gradient-guided strategy. In the case of a targeted attack, our goal is to minimize the loss function of the attacked model on the given input while keeping the perturbation size small and respecting the semantics-preserving constraint. We achieve this by using the Lp-norm as a soft constraint. On the other hand, for an untargeted attack, we aim to maximize the loss function while also keeping the size of the perturbation small.
Because of the inverse feature mapping problem, gradient optimization-based approaches cannot be directly applied in our context (Section III-B). We need a further (hard) constraint that acts on the feature-space representation of the input binary function. This constraint strictly depends on the target model: we will further investigate its definition in Section VIII. In the following, we focus on the loss minimization strategy argued for targeted attacks. As before, we can easily adapt the same concepts to the untargeted case.
We can describe a DNN-based model for binary similarity as the concatenation of the two functions λ and simv. In particular, λ is the function that maps a problem-space object to a feature vector (i.e., the feature mapping function discussed in Section III-B), while simv is the neural network computing the similarity given the feature vectors.
Given two binary functions f1 and f2, we aim to find a perturbation δ that minimizes the loss function of simv, which corresponds to maximizing simv(λ(f1) + δ, λ(f2)). To do so, we use an iterative strategy where, during each iteration, we solve the following optimization problem:

    min_δ  L(simv(λ(f1) + δ, λ(f2)), θ) + ϵ ||δ||_p ,    (1)

where L is the loss function, θ are the weights of the target model, and ϵ is a coefficient in [0, ∞).
We randomly initialize the perturbation δ and then update it at each iteration by a quantity given by the negative gradient of the loss function L. The vector δ has several components equal to zero and is crafted so that it modifies only the (dead) instructions in the added blocks. The exact procedure depends on the target model: we return to this aspect in Section VIII.
Notice that the procedure above allows us to find a perturbation in the feature space, while our final goal is to find a problem-space perturbation to modify the function f1. Therefore, we derive from the perturbation δ a problem-space perturbation δp. The exact technique is specific to the model we are attacking, as we further discuss in Section VIII. The common idea behind all technique instances is to find the problem-space perturbation δp whose representation in the feature space is the closest to δ. Essentially, we use a rounding-based inverse strategy to solve the inverse feature mapping problem, which amounts to rounding the feature-space vector to the closest vector that corresponds to an object in the problem space. The generated adversarial example is fadv = f1 + δp. As for the black-box scenario, the process ends whenever we reach a maximum number of iterations or the desired threshold for the similarity value.

VII. COMPARISON BETWEEN THE ATTACKS
In this section, we present a more direct comparison between the three proposed attack methodologies.
We summarize in Table 1 the key differences according to four interesting aspects: attacker's knowledge, perturbation type, usage of the candidates' set, and usage of an additional instruction embedding model.
From a technical perspective, GCAM is a white-box attack that assumes an attacker having complete knowledge of the target model's internals. Contrarily, both Spatial Greedy and Greedy are black-box approaches, meaning that they can be easily adapted to attack any binary similarity model without having any prior knowledge. This distinction according to the attacker's knowledge underlines a more subtle difference among the approaches; indeed, while the two black-box attacks operate in the problem space producing valid adversarial examples, GCAM initially produces perturbations in the feature space, which must then be converted into problem-space objects using a rounding process.
Looking at more practical aspects, both Greedy and Spatial Greedy depend on the concept of candidates' set, while GCAM leverages the internals of the target model to guide the choice of the instructions to insert into the function according to the objective function.
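A PyTorch-style sketch of one possible realization of the iterative minimization in Eq. (1) is shown below; it assumes a differentiable similarity head sim_v, precomputed feature vectors for λ(f1) and λ(f2), a mask restricting δ to the added dead blocks, and 1 − similarity as the loss. These are our assumptions, not the authors' implementation.

```python
# PyTorch-style sketch of the iterative minimization in Eq. (1), assuming
# `sim_v` is a differentiable similarity head and `feat1`, `feat2` are the
# feature-space representations λ(f1), λ(f2). `mask` zeroes every component
# of δ that does not belong to an added dead block.
import torch

def gcam_feature_attack(sim_v, feat1, feat2, mask,
                        steps=200, lr=0.05, reg=0.01, p=2, tau_t=0.95):
    delta = (torch.randn_like(feat1) * mask).requires_grad_(True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(steps):
        score = sim_v(feat1 + delta * mask, feat2)
        loss = (1.0 - score) + reg * delta.norm(p=p)  # targeted: push score up
        if score.item() >= tau_t:
            break
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (delta * mask).detach()   # still a feature-space perturbation
```

The returned δ still lives in the feature space; Section VIII describes the model-specific rounding that turns it into a problem-space perturbation δp.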


TABLE 1. Comparison and underlying principles of the three attack techniques.

FIGURE 7. GCAM attack against Gemini. Once obtained the initial CFG of the function f1, we initialize an empty dead branch in one of the available positions (a). In particular, each node is represented as a feature vector v, which is the linear combination of three embedding vectors corresponding to three different categories of instructions (green block). We then iteratively apply the gradient descent to modify the coefficients nj associated to the instruction vectors (b), obtaining a vector of non-integer values. Finally, we round the obtained coefficients to the closest integer values (c) and, (d), we insert into the dead branch as many instructions belonging to the class j as specified by the coefficient nj.

Specifically, GCAM can potentially utilize the entire set of instructions encountered by the target model during training, while the black-box methods are constrained to a predetermined set of instructions that can be tested during each iteration. As highlighted in Section V-A1, the usage of a manually-crafted candidates' set represents the main weakness of the Greedy procedure, which we addressed with the Spatial Greedy heuristic proposing an adaptive set based on the usage of instruction embeddings.
Finally, when considering Spatial Greedy, it is important to note that one should train from scratch an instruction embedding model to effectively apply the embedding-based search heuristic. However, we remark that the model has to be trained only once and can then be reused for all the attacks against binaries for a certain ISA.

VIII. TARGET SYSTEMS
In this section, we illustrate the three models we attacked: Gemini [9], GMN [22], and SAFE [4].
We selected the models by conducting a literature review [23] to identify plausible candidates. We then analyzed the characteristics of existing binary similarity systems, choosing models that are fundamentally different from one another. This approach allows us to test the generality of our solution. Specifically, the three models we selected can be distinguished by the following features:
• NN architecture: Both Gemini and GMN are GNN-based models, while SAFE is an RNN-based one.
• Input representation: Both Gemini and GMN represent functions through their CFGs, while SAFE uses the linear disassembly.
• Feature mapping process: Both Gemini and GMN use manual features from the CFG nodes, while SAFE learns features using an instruction embedding model.
In the following, we provide an overview of the internal workings of the models and then discuss specific provisions for the Greedy (Section V-A) and GCAM (Section VI) attacks. Notably, Spatial Greedy needs no adaptations.

A. GEMINI
Gemini [9] represents functions in the problem space through their Attributed Control Flow Graph (ACFG). An ACFG is a control flow graph where each basic block consists of a vector of manual features (i.e., node embeddings).
The focal point of this approach consists of a graph neural network (GNN) based on the Structure2vec [39] model that converts the ACFG into an embedding vector, obtained by aggregating the embedding vectors of individual ACFG nodes. The similarity score for two functions is given by the cosine similarity of their ACFG embedding vectors.
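Both Gemini and SAFE ultimately reduce the comparison to a cosine similarity between two function embeddings; the sketch below captures that final step, with `embed` as a placeholder for the model-specific embedding pipeline.

```python
# Minimal sketch of the similarity computation shared by embedding-based
# models such as Gemini and SAFE: embed both functions, then take the cosine
# similarity. `embed` is a placeholder wrapping the attacked model.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def sim(f1, f2, embed):
    """Similarity score; cosine lies in [-1, 1] and may be rescaled to [0, 1]."""
    return cosine_similarity(embed(f1), embed(f2))
```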


1) GREEDY ATTACK
Each ACFG node contributes a vector of 8 manually selected features. Five of these features depend on the characteristics of the instructions in the node, while the others depend on the graph topology. The model distinguishes instructions from an ISA only for how they contribute to these 5 features. This enables a gray-box variant of our Greedy attack: we measure the robustness of Gemini using a set of candidates CAND of only five instructions, carefully selected for covering the five features. Later in the paper, we use this variant as the baseline approach for a comparison with Spatial Greedy.

2) GCAM ATTACK
As described in the previous section, some of the components of a node feature vector v depend on the instructions inside the corresponding basic block. As Gemini maps all possible ISA instructions into 5 features, we can associate each instruction with a deterministic modification of v represented as a vector u. We select five categories of instructions and for each category cj we compute the modification uj that will be applied to the feature vector v. We selected the categories so as to cover the aforementioned features.
When we introduce in the block an instruction belonging to category cj, we add its corresponding uj modification to the feature vector v. Therefore, inserting instructions inside the block modifies the feature vector v by adding to it a linear combination vector Σj nj uj, where nj is the number of instructions of category cj added. Our perturbation δ acts on the feature vector of the function only in the components corresponding to the added dead branches, by modifying the coefficients of the linear combination above.
Since negative coefficients are meaningless, we avoid them by adding to the optimization problem appropriate constraints. Moreover, we solve the optimization problem without forcing the components of δ to be integers, as this would create an integer programming problem. Therefore, at the end of the iterative optimization process, we get our problem-space perturbation δp by rounding to the closest positive integer value each component of δ. It is immediate to obtain from δp the problem-space perturbation to insert in our binary function f1. Indeed, in each dead block, we must add as many instructions belonging to a category as the corresponding coefficient in δp. We report a simplified example of the GCAM procedure against Gemini in Figure 7.

B. GMN
Graph Matching Network (GMN) [22] computes the similarity between two graph structures. When functions are represented through their CFGs, GMN offers state-of-the-art performance for the binary similarity problem [22], [23].
Differently from solutions based on standard GNNs (e.g., Gemini), which compare embeddings built separately for each graph, GMN computes the distance between two graphs as it attempts to match them. In particular, while in a standard GNN the embedding vector for a node captures properties of its neighborhood only, GMN also accounts for the similarity with nodes from the other graph.

1) GREEDY ATTACK
Similarly to the case of Gemini, each node of the graph consists of a vector of manually-engineered features. In particular, each node is a bag of 200 elements, each of which represents a class of assembly instructions, grouped according to their mnemonics. The authors do not specify why they only consider these mnemonics among all the available ones in the x86-64 ISA. Analogously to Gemini, when testing the robustness of this model against the Greedy approach we devise a gray-box variant by considering a set of candidates CAND of 200 instructions, each belonging to one and only one of the considered classes.

2) GCAM ATTACK
Our white-box attack operates analogously to what we presented in Section VIII-A2 and illustrated in Figure 7. Similarly to the Gemini case, each dead branch adds a node to the CFG, while the feature mapping function transforms each CFG node into a feature vector. The feature vector is a bag of the instructions contained in the node, where assembly instructions are divided into one of 200 categories using the mnemonics.

C. SAFE
SAFE [4] is an embedding-based similarity model. It represents functions in the problem space as sequences of assembly instructions. It first converts assembly instructions into continuous vectors using an instruction embedding model based on the word2vec [20] word embedding technique. Then, it supplies such vectors to a bidirectional self-attentive recurrent neural network (RNN), obtaining an embedding vector for the function. The similarity between two functions is the cosine similarity of their embedding vectors.

1) GREEDY ATTACK
The Greedy attack against SAFE follows the black-box approach described in Section V-A. Since SAFE does not use manually engineered features, we cannot select a restricted set of instructions that generates all vectors of the feature space for a gray-box variant. We test its resilience against the Greedy approach considering a carefully designed list of candidates CAND composed of random and hand-picked instructions, meaning that the baseline is a black-box attack.

2) GCAM ATTACK
In the feature space, we represent a binary function as a sequence of instruction embeddings belonging to a predefined metric space.


In the feature space, we represent a binary function as a sequence of instruction embeddings belonging to a predefined metric space. The perturbation δ is a sequence of real-valued vectors initialized with embeddings of real random instructions; each dead block contains four of such vectors. In the optimization process, we modify each embedding ij ∈ δ by a small quantity given by the negative gradient of the loss function L. In other words, every time we optimize the objective function, we alter each ij ∈ δ by moving it in the negative direction identified through the gradient.

Since during optimization we modify instruction embeddings in terms of their single components, we have no guarantee that the obtained vectors are embeddings of real instructions. For this reason, after the optimization process, we compute the problem-space perturbation δp by approximating, at each iteration, the vectors in δ to the closest embeddings in the space of real instruction embeddings. At this point, it is straightforward to obtain from the approximated perturbation δp the instructions that should be added to the binary function f1; indeed, each vector in δp corresponds to the embedding of a real instruction that will be inserted into the function f1. We report a simplified example of the GCAM procedure against SAFE in Figure 8.

FIGURE 8. GCAM attack against SAFE. Once we obtain the initial linear disassembly of the function f1, we map each instruction to its embedding (a) using the embedding matrix em, obtaining the feature space representation of f1. Then, we initialize the perturbation by inserting into the feature space representation of f1 the embedding vector adv associated to a real instruction (b) uniformly sampled from the embeddings in em. We then iteratively modify adv by applying gradient descent (c). Finally, we approximate the obtained adv vector to the closest embedding in em (d) and we insert its corresponding instruction into f1 (e).
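The rounding to real instructions can be sketched as a nearest-neighbor lookup in the embedding matrix; the snippet below is only an illustration under our own assumptions (plain L2 distance, NumPy arrays), not the paper's implementation.

```python
import numpy as np

def round_to_real_instructions(delta, em, instructions):
    """Map each optimized perturbation vector in delta to the closest real
    instruction embedding in em, returning the corresponding instructions."""
    rounded = []
    for v in delta:
        idx = int(np.argmin(np.linalg.norm(em - v, axis=1)))  # nearest row of em
        rounded.append(instructions[idx])
    return rounded
```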
IX. DATASETS AND IMPLEMENTATION
In this section, we discuss the evaluation datasets and the corpus for training the embedding model of Spatial Greedy.

A. ATTACK DATASET
We test our approaches by considering pairs of binary functions randomly extracted from 6 open-source projects written in C language: binutils, curl, gsl, libconfig, libhttp, and openssl. We compile the programs for an x86-64 architecture using the gcc 9.2.1 compiler with -O0 optimization level on Ubuntu 20.04. We filter out all functions with less than six instructions. As a result, we obtain a dataset of code representative of real-world software, with source programs used in the evaluation or training of binary similarity solutions (e.g., [4], [8], [9], [23]), and that could be potential targets for the exemplary scenarios outlined in Section I.

To evaluate the robustness of the three target models against our proposed approaches, we used a dataset made of 500 pairs of binary functions sampled from the general dataset. The dataset, which we call Targ, consists of pairs of random functions. In its construction, a function cannot be considered more than once as a source function but may appear multiple times as a target. The functions within a pair differ at most by 1345 instructions, and on average by 135.27 instructions.

In the untargeted scenario, source and target functions have to coincide. For these attacks, we use the dataset Untarg, composed of the 500 functions used as source in the Targ dataset. Being pairs made of identical functions, they are trivially balanced for instructions and CFG nodes.

B. DATASET USED FOR SPATIAL GREEDY
As described in Section V-B1, in Spatial Greedy we use an instruction embedding model to induce a metric space over assembly instructions. We opt for word2vec [20]; the reader may wonder whether this choice may unfairly favor Spatial Greedy when attacking SAFE, which uses word2vec in its initial instruction embedding stage. We conducted additional experiments for SAFE using two other models, GloVe [40] and fastText [41]. The three models perform almost identically in targeted attacks, while in untargeted ones fastText occasionally outperforms the others by a small margin. For the sake of generality, in the paper evaluation we will report and discuss results for word2vec only. For each of the considered models, we use the following parameters during training: embedding size 100, window size 8, word frequency 8, and learning rate 0.05. We train these models using assembly instructions as tokens. We use as training set a corpus of 23,181,478 assembly instructions, extracted from 291,688 binary functions collected by compiling various system libraries with the same setup of the previous section.

One aspect worth emphasizing is that Spatial Greedy uses embeddings unrelated to the binary similarity model being targeted. We trained the Spatial Greedy embedding model using a distinct dataset and parameters compared to SAFE, whereas neither GMN nor Gemini incorporates a layer that converts a single instruction into a feature vector. Spatial Greedy relies on embeddings to enhance the instruction insertion process during the attack by clustering the instruction space, independently of the underlying model being attacked.
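As a concrete illustration of the embedding-model training described above, the following sketch uses gensim's word2vec implementation with the listed hyperparameters; the corpus-loading helper is hypothetical, and the choice of gensim (4.x API) is our assumption, not necessarily the toolchain used in the paper.

```python
from gensim.models import Word2Vec

# Hypothetical helper: returns a list of functions, each one a list of
# normalized assembly-instruction tokens.
corpus = load_instruction_corpus("system_libraries/")

model = Word2Vec(
    sentences=corpus,
    vector_size=100,  # embedding size 100
    window=8,         # window size 8
    min_count=8,      # word frequency 8
    alpha=0.05,       # learning rate 0.05
)
model.save("spatial_greedy_w2v.model")
```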

C. IMPLEMENTATION DETAILS
We implement our attacks in Python in about 3100 LOC. In the hope of fostering research in the area, we make the code available upon request to fellow researchers.

An aspect that is worth mentioning for the black-box attacks involves the application of the perturbation ⟨bl, in⟩ chosen at each iteration. Modifying the binary representation of the function every time incurs costs (recompilation in our case; binary rewriting in alternative implementations) that we may avoid through a simulation. In particular, we directly modify the data structures that the target models use for feature mapping when parsing the binary, simulating the presence of newly inserted instructions. The authors implemented these models in TensorFlow or PyTorch, which allows us to keep our modifications rather limited. In preliminary experiments, we have verified that the similarity values from our simulation are comparable with those we would have obtained by recompiling the modified functions output by our attacks. Finally, to avoid recalculating the adversarial examples for various thresholds, we selected two fixed values for our optimizer to satisfy: τt = 0.96 for the targeted case and τu = 0.50 for the untargeted one. For the evaluation in Section X, we consider the adversarial examples generated by inserting the perturbations obtained at the end of the simulation process into the corresponding functions and compiling them into object files.
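As an illustration of the simulation idea (not the actual implementation, whose data structures depend on each model's code), the sketch below patches a Gemini/GMN-style bag-of-categories node representation instead of recompiling the function; the names and the NumPy layout are assumptions.

```python
import numpy as np

def simulate_insertion(node_features, block_id, category_id, num_categories=200):
    """Return a copy of the per-node feature map where one extra instruction of
    the given category appears inside the dead branch block_id."""
    patched = {b: v.copy() for b, v in node_features.items()}
    bump = np.zeros(num_categories)
    bump[category_id] = 1.0
    patched[block_id] = patched[block_id] + bump
    return patched  # fed to the target model in batch with other candidates
```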
Finally, we want to highlight that each black-box iteration can be considered as a single query to the target model. This is possible because we are querying the model in batch mode, giving it in input a set of functions that are processed together. This implies that when setting a maximum number of iterations, we are implicitly limiting the number of queries that the attacker can perform, following the approaches adopted in [42] and [43].

X. EVALUATION
In this section, we evaluate our attacks and investigate the following research questions:

RQ1: Are the three target models more robust against targeted or untargeted attacks?
RQ2: Are the three target models more robust against black-box or white-box approaches?
RQ3: What is the impact of feature extraction methodologies and model architectures on the performance and the behaviour of our attacks?

Performance Metrics: Our main evaluation metric is the Attack success rate (A-rate), that is, the percentage of adversarial samples that successfully mislead the target model. We complement our investigation by collecting a set of support metrics to gain qualitative and quantitative insights into the attacking process:
• Modification size (M-size): number of inserted inline assembly instructions;
• Average Similarity (A-sim): obtained final similarity values;
• Normalized Increment (N-inc): similarity increments normalized with respect to the initial value; only used for targeted attacks;
• Normalized Decrement (N-dec): similarity decrements normalized with respect to the initial value; only used for untargeted attacks.

Support metrics are computed over the set of samples that successfully mislead the model (according to the success conditions outlined in Section X-A).

As an example, let us consider a targeted attack against three pairs of functions with initial similarities 0.40, 0.50, and 0.60. After the attack, we reach final similarities of 0.75, 0.88, and 0.94, by inserting respectively 4, 7, and 12 inline assembly instructions. We deem an attack as successful if the final similarity is above τt = 0.80 (the reason will be clear in the next section). In this example, we have an A-rate of 66.66%, an M-size of 9.5, an A-sim of 0.91, and an N-inc of 0.81.

The N-inc is the average, over the samples that successfully mislead the model, of the formula below:

    N-inc = (final similarity − initial similarity) / (1 − initial similarity)    (2)

The denominator of the fraction above is the maximum possible increment for the analyzed pair: we use it to normalize the obtained increment. Intuitively, the value of this metric is related to the initial similarities of the successfully attacked pairs. Consider a targeted attack where a pair exhibits a final similarity of 0.80. When the normalized increment is 0.7, their initial similarity is 0.33 (from Equation 2); when the normalized increment is 0.3, we have a much higher 0.7 initial similarity.
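For clarity, the support-metric computation on this toy example can be written out directly; the few lines of plain Python below simply recompute the numbers from the example above.

```python
initial = [0.40, 0.50, 0.60]
final   = [0.75, 0.88, 0.94]
added   = [4, 7, 12]          # inserted inline assembly instructions
tau_t   = 0.80

ok = [f > tau_t for f in final]                                    # which attacks succeed
a_rate = 100 * sum(ok) / len(ok)                                   # 66.66%
m_size = sum(a for a, s in zip(added, ok) if s) / sum(ok)          # 9.5
a_sim  = sum(f for f, s in zip(final, ok) if s) / sum(ok)          # 0.91
n_inc  = sum((f - i) / (1 - i) for f, i, s in zip(final, initial, ok) if s) / sum(ok)  # ~0.81
```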
The comparison between A-sim and the success threshold gives us insights on the ability of the attack to reach high similarity values. In the aforementioned example, the A-sim value of 0.91 shows that when the attack is able to exceed the success threshold, it typically brings the similarity to around 0.91.

Evaluation Outline: We test our black-box and white-box attacks against each target model in both the targeted and the untargeted scenario. As discussed in Section IX-A, we use dataset Targ for the former and dataset Untarg for the latter.

A. SETUP
In this section, we describe the attack parameters selected for our experimental evaluation.

Successful Attacks: An attack is successful depending on the similarity value between the adversarial example and the target function. For a targeted attack, the similarity score has to be increased during the attack until it trespasses a success threshold τt. For an untargeted attack, this score, which is initially 1, has to decrease until it falls below a success threshold τu.

Operatively, the values of such thresholds are determined by the way the similarity score is used in practice. In our experimental evaluation, we choose the thresholds as follows. We compute the similarity scores that our attacked systems give over a set of similar pairs and over a set of dissimilar pairs. For the first set, the average score is 0.79 with a standard deviation of 0.15. For the second set, these values are respectively 0.37 and 0.17. We thus opted for a success threshold τu = 0.5 for untargeted attacks and τt = 0.8 for targeted ones. Both τu and τt are within one standard deviation of the average similarity value measured for the relevant set for the attack. For the charts, we plot τu ∈ [0.46, 0.62] and τt ∈ [0.74, 0.88].

To fully understand the performance of the attacks, we also measure the amount of function pairs in a dataset already meeting a given threshold. For the targeted scenario, we plot it as a line labeled C0. As our readers can see (Figure 9), their contribution is marginal: hence, we do not discuss them in the remainder of the evaluation. For the untargeted scenario, no such pair can exist by construction.
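As a sanity check on the threshold choice above (using only the averages and standard deviations just reported), one can verify that both thresholds fall within one standard deviation of the relevant mean:

```python
similar_mean, similar_std       = 0.79, 0.15   # similar pairs
dissimilar_mean, dissimilar_std = 0.37, 0.17   # dissimilar pairs
tau_t, tau_u = 0.80, 0.50

assert abs(tau_t - similar_mean) <= similar_std        # 0.01 <= 0.15
assert abs(tau_u - dissimilar_mean) <= dissimilar_std  # 0.13 <= 0.17
```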
Black-box Attacks: To evaluate the effectiveness of Spatial Greedy against the black-box baseline Greedy, we select a maximum perturbation size δ̄ (namely, the number of inserted instructions) and a number of dead branches B in four settings: C1 (δ̄ = 15, B = 5), C2 (δ̄ = 30, B = 10), C3 (δ̄ = 45, B = 15), and C4 (δ̄ = 60, B = 20). We set ε = 0.1 in all greedy attacks. For Spatial Greedy and black-box Greedy, we test two sizes for the set of candidates: 110 and 400. For Greedy, we pick 110 instructions manually and then randomly add others for a total of 400; for Spatial Greedy, we recall that the selection is dynamic (Section V-B2). The larger size brought consistently better results in both attacks, hence we present results only for it. Finally, for Spatial Greedy, we use c = 10 and r ∈ {0.25, 0.50, 0.75}, with r = 0.75 being the most effective choice in our tests (thus, the only one presented next). For the gray-box Greedy embodiments for Gemini and GMN, we refer to Section VIII-A1 and VIII-B1, respectively.

White-box Attack: We evaluate GCAM considering four different values for the number B of dead branches inserted: C1 (B = 5), C2 (B = 10), C3 (B = 15), and C4 (B = 20). For each model, we use the number of iterations that brings the attack to convergence.
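For reference, the four evaluation settings can be summarized as a small configuration table (our own representation; δ̄ is written delta_bar):

```python
# Black-box settings: maximum inserted instructions (delta_bar) and dead branches (B).
BLACK_BOX_SETTINGS = {
    "C1": {"delta_bar": 15, "B": 5},
    "C2": {"delta_bar": 30, "B": 10},
    "C3": {"delta_bar": 45, "B": 15},
    "C4": {"delta_bar": 60, "B": 20},
}
# White-box (GCAM) settings only vary the number of dead branches.
WHITE_BOX_SETTINGS = {"C1": {"B": 5}, "C2": {"B": 10}, "C3": {"B": 15}, "C4": {"B": 20}}
```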
TABLE 2. Evaluation metrics with τt = 0.80 relative to the black-box attacks against the three target models in the targeted scenario. Spatial Greedy (SG) is evaluated using parameters ε = 0.1 and r = 0.75. Greedy (G) is evaluated using ε = 0.1. G* is the gray-box version of Greedy: when such a version is available (Section VIII), we show it instead of G. When examining G against SAFE, a set of candidates of size 400 is considered.

FIGURE 9. (a) Black-box targeted attack with Spatial Greedy against the three target models while varying the success threshold τt ∈ [0.74, 0.88], and settings C0 to C4. We use a set of candidates of 400 instructions, ε = 0.1, and r = 0.75. (b) White-box targeted attack against the three target models while varying the success threshold τt ∈ [0.74, 0.88] and settings C0 to C4. Left: GCAM attack with 20k iterations against GEMINI. Center: GCAM attack with 1k iterations against GMN. Right: GCAM attack with 1k iterations against SAFE.

B. COMPLETE ATTACK RESULTS
This section provides complete results for our black-box and white-box attacks on the three target models. For brevity, we focus only on Spatial Greedy when discussing black-box targeted and untargeted attacks, leaving out the results for the baseline Greedy. The two will see a detailed comparison later, with Spatial Greedy emerging as generally superior.

1) BLACK-BOX TARGETED ATTACK
Considering an attacker with black-box knowledge in a targeted scenario, the three target models show a similar behavior against Spatial Greedy.

The attack success rate A-rate is positively correlated with the number B of dead branches and the maximum number δ̄ of instructions introduced in the adversarial example. Fixing at τt = 0.80 the success threshold for the attack, we have an A-rate that on Gemini goes from 15.77% (setting C1) up to 27.54% (setting C4).

The other target models follow this behavior, as the A-rate for GMN goes from 31.13% up to 59.68%, and from 37.13% up to 60.68% for SAFE. This trend holds for other success thresholds, as visible in Figure 9. From these results, it is evident that the higher the values of the two parameters, the lower the robustness of the attacked models. Table 2 presents a complete overview of the results.

The other metrics confirm the relationship between the parameters B and δ̄ and the effectiveness of our attack. In particular, when increasing the perturbation size, as highlighted by the modification size M-size metric, both A-sim and the normalized increment N-inc increase, suggesting that incrementing the perturbation size is always beneficial.

TABLE 3. Evaluation metrics with τu = 0.50 relative to the black-box attacks against the three target models in the untargeted scenario. Spatial Greedy (SG) is evaluated using parameters ε = 0.1 and r = 0.75. Greedy (G) is evaluated using ε = 0.1. Similarly to Table 2, G* is the gray-box version of Greedy where applicable. When examining G against SAFE, a set of candidates of size 400 is considered.

2) BLACK-BOX UNTARGETED ATTACK
Considering an attacker with black-box knowledge in an untargeted scenario, all three target models are vulnerable to Spatial Greedy, with different degrees of robustness.

The observations highlighted in Section X-B1 also hold in this scenario. Incrementing B and δ̄ is beneficial for the attacker. As visible in Figure 10 and in Table 3, the attack success rate A-rate for τu = 0.50 in setting C1 is 22.95% for Gemini, 65.87% for GMN, and 56.49% for SAFE. The metric increases across settings, peaking at 53.89% for Gemini, 91.62% for GMN, and 90.62% for SAFE in setting C4.

FIGURE 10. (a) Black-box untargeted attack with Spatial Greedy against the three target models while varying the success threshold τu ∈ [0.46, 0.62], and the settings C1, C2, C3, and C4. We use a set of candidates of 400 instructions, ε = 0.1, and r = 0.75. (b) White-box untargeted attack against the three target models while varying the success threshold τu ∈ [0.46, 0.62], and the settings C1, C2, C3, and C4. Left: GCAM attack with 40k iterations against GEMINI. Center: GCAM attack with 1k iterations against GMN. Right: GCAM attack with 1k iterations against SAFE.

Table 3 also reports the results for the modification size metric M-size. In this case, we can see the effectiveness of Spatial Greedy, as a small number of inserted instructions is needed against each of the considered target models. Indeed, considering setting C4, which is the one that modifies the function most, the M-size at τu = 0.50 is 11.35 for Gemini, 4.14 for GMN, and 7.64 for SAFE.

3) WHITE-BOX TARGETED ATTACK
With an attacker with white-box knowledge in a targeted scenario, the three target models show different behaviors. Table 4 presents a complete overview of the results. Both Gemini and SAFE show a higher robustness to our GCAM attack if compared to GMN.

As visible in Figure 9, when attacking Gemini and SAFE, there is a positive correlation between the number B of locations (i.e., dead branches) where to insert instructions and the attack success rate A-rate. When considering setting C1, the A-rate for τt = 0.80 is 24.35% for Gemini, and 11.57% for SAFE; moving to C4, it increases up to 31.60% for Gemini, and 21.76% for SAFE. On the contrary, GMN does not show a monotonic A-rate increase for an increasing B value, as the peak A-rate is 38.32% in setting C2.

We now discuss the modification size M-size metric: fixing τt = 0.80 and considering the setting where A-rate peaks, we measure an M-size value of 38.90 for SAFE (C4), 133.84 for Gemini (C4), and 350.50 for GMN (C2): SAFE is the model that sees the insertion of fewer instructions. This is not surprising: due to the feature-space representation of SAFE, the embeddings we alter in the attack for it (Section VIII-C2) refer to a number of instructions that is fixed.

4) WHITE-BOX UNTARGETED ATTACK
Figure 10 and Table 5 report the results for our attacks with white-box knowledge in the untargeted scenario.

TABLE 4. Evaluation metrics with τt = 0.80 for the white-box targeted attack against the three target models. The GCAM attack is executed up to 20k iterations for Gemini and up to 1k for GMN and SAFE.

TABLE 5. Evaluation metrics with τu = 0.50 for the white-box untargeted attack against the three target models. The GCAM attack is executed up to 20k iterations for Gemini and up to 1k for GMN and SAFE.

Gemini looks more robust than the other models: for example, fixing τu = 0.50, we measure the highest attack success rate A-rate as 39.52% in the C4 setting. On the contrary, for the same τu, the highest A-rate for SAFE is 88.42% (setting C3) and 84.63% for GMN (setting C4). The general trend of having a positive correlation of B and the A-rate is still observable (with a sharp increase of the A-rate from setting C1 to C2). The M-size shows that SAFE is the most fragile model in terms of instructions to add, as they are much fewer than with the other two models.

5) GREEDY VS. SPATIAL GREEDY
We now compare the performance of Spatial Greedy against the Greedy baseline, until now left out of our discussions for brevity. Figure 11 shows the results for a targeted attack on the Targ dataset. Additional data points are available in Table 2.

FIGURE 11. Greedy and Spatial Greedy targeted attacks against the three models while varying the success threshold τt ∈ [0.74, 0.88], considering the setting C4. For both, we consider ε = 0.1 and |CAND| = 400. For Spatial Greedy, we also set r = 0.75.

We discuss Gemini and GMN first. We recall that we could exploit their feature extraction process to reduce the size of the set of candidates, devising a gray-box Greedy procedure. Spatial Greedy is instead always black-box.

Considering the A-rate at τt = 0.80, Spatial Greedy always outperforms the gray-box baseline, except for setting C4 on Gemini (although the two perform similarly: 27.94% for Greedy and 27.56% for Spatial Greedy). Looking at the other metrics, we can see that our black-box approach based on instruction embeddings is almost on par with or improves on the results provided by the gray-box baseline.

Moving to SAFE, we recall that only a black-box Greedy is feasible. Considering the A-rate, we can notice that increasing both δ̄ and B produces a more noticeable difference between the baseline technique and Spatial Greedy. In the C1 setting, the A-rate at τt = 0.80 is 34.33% for Greedy and 37.13% for Spatial Greedy; then, it increases up to 56.89% for Greedy and 60.68% for Spatial Greedy when considering the C4 scenario.

The other metrics confirm this behavior. Considering the average similarity A-sim, regardless of the chosen δ̄ and B from the setting, we observe that adversarial pairs generated through Spatial Greedy present a final average similarity that is higher than the one relative to the pairs generated using the baseline solution. The effectiveness of Spatial Greedy is

further confirmed by the normalized increment N-inc metric; comparing the results, the impact of the candidates selected using Spatial Greedy is more consistent than that of the candidates selected using the baseline approach. We omit a discussion of the untargeted case for brevity.

Comparing Spatial Greedy with Greedy, we measure on the Targ dataset an average A-rate increase of 2.27 and a decreased M-size by 0.46 instructions across all configurations and models. When considering the Untarg dataset, Spatial Greedy sees an average A-rate increase of 1.75, whereas the average M-size is smaller by 0.16 instructions. We report detailed results in Table 6.

Restricted Set Experiments: Finally, we perform a further experiment between the black-box version of Greedy and Spatial Greedy, considering a candidates' set of smaller size; in particular, we consider a set of 50 instructions which, in the case of Greedy, is a subset of the one considered for the previously detailed experiments. Our hypothesis is that the smaller the size of the candidates' set, the higher the difference in terms of A-rate in favour of Spatial Greedy. We highlight that we applied the black-box version of Greedy also when targeting Gemini and GMN. In the following, we refer to the results obtained in the targeted case when considering the C4 scenario.

When targeting SAFE, there is a significant difference, with an A-rate of 30.74% for Greedy and 49.70% for Spatial Greedy. A similar trend is seen with Gemini, where Greedy shows an A-rate of 9.58% compared to 25.55% for Spatial Greedy, and with GMN, where Greedy's A-rate is 37.13% versus 49% for Spatial Greedy.

Takeaway: Spatial Greedy is typically superior, and always at least comparable, to a Greedy attack even when an efficient gray-box Greedy variant is possible. The results suggest that our dynamic update of the set of candidates, done at each iteration of the optimization procedure, can lead to the identification of new portions of the instruction space (and, consequently, a new subset of the ISA) that can positively influence the attack results. Finally, the smaller the size of the candidates' set, the higher the effectiveness of Spatial Greedy compared to Greedy.

C. RQ1: TARGETED VS. UNTARGETED ATTACKS
From the previous sections, the attentive reader may have noticed that all our approaches are much more effective in an untargeted scenario for all models and proposed metrics.

When looking at the A-rate for all similarity thresholds, the three target models are less robust against untargeted attacks (rather than targeted ones) regardless of the adversarial knowledge. For the best attack among black-box and white-box configurations, in the targeted scenario, the peak A-rate at τt = 0.80 is 27.54% for Gemini, 59.68% for GMN, and 60.68% for SAFE. For the untargeted scenario, the peak A-rate at τu = 0.50 is 53.89% for Gemini, 91.62% for GMN, and 90.62% for SAFE.

The number of instructions M-size needed for generating valid adversarial examples further confirms the weak resilience of the target models to untargeted attacks. When considering the worst setting according to M-size (i.e., C4), while we need only a few instructions for untargeted attacks at τu = 0.50 (i.e., 11.35 for Gemini, 4.14 for GMN, and 7.64 for SAFE), we need a significantly higher number of added instructions for targeted attacks (i.e., 51.85 for Gemini, 40.99 for GMN, and 40.31 for SAFE) at τt = 0.80.

Takeaway: On all the attacked models, both targeted and untargeted attacks are feasible, especially using Spatial Greedy (see also RQ2). Their resilience against untargeted attacks is significantly lower.

FIGURE 12. Black-box and white-box targeted attacks against the three models while varying the success threshold τt ∈ [0.74, 0.88]. In the black-box scenario, all the results refer to the Spatial Greedy approach (ε = 0.1, r = 0.75, and |CAND| = 400). In the white-box scenario, the results for Gemini are for a GCAM attack with 20k iterations while the ones for SAFE and GMN are for a GCAM attack with 1k iterations. We consider all approaches in their most effective parameter choice, this being always setting C4 except for the GCAM attack against GMN, for which we consider setting C2.

D. RQ2: BLACK-BOX VS. WHITE-BOX ATTACKS
An interesting finding from our tests is that the white-box strategy does not always outperform the black-box one. Figure 12 depicts a comparison in the targeted scenario between Spatial Greedy and GCAM for the attack success rate A-rate, average similarity A-sim, and normalized increment N-inc metrics. The figure shows how different values of the attack success threshold τt can influence the considered metrics. On GMN and SAFE, Spatial Greedy is

more effective than GCAM, resulting in significantly higher A-rate values, while the two perform similarly on Gemini. Interestingly, in contrast with the evaluation based on the A-rate metric, both the A-sim and N-inc values highlight a coherent behavior among the three target models. Generally, adversarial examples generated using Spatial Greedy exhibit a higher A-sim value than the white-box ones (considering τt = 0.80, we have 0.86 vs. 0.86 for Gemini, 0.93 vs. 0.84 for GMN, and 0.92 vs. 0.85 for SAFE). Looking at N-inc, we face a completely reversed situation; the metric is better in the adversarial samples generated using GCAM (0.62 for Gemini, 0.79 for GMN, and 0.71 for SAFE) compared to those from Spatial Greedy (0.27 for Gemini, 0.79 for GMN, and 0.55 for SAFE). These two observations lead us to the hypothesis that the black-box attack is more effective against pairs of binary functions that exhibit high initial similarity values and can potentially reach a high final similarity. On the other hand, GCAM is particularly effective against pairs that are very dissimilar at the beginning.

For the untargeted scenario, our results (Tables 3 and 5) for the A-rate metric considering τu = 0.50 show that Spatial Greedy has a slight advantage over GCAM. For Spatial Greedy, we have best-setting values of 53.89% for Gemini, 91.62% for GMN, and 90.62% for SAFE; for GCAM, we have 39.52% for Gemini, 84.63% for GMN, and 88.42% for SAFE.

In our experiments, GCAM performed worse than the black-box strategy, which may look puzzling since theoretically a white-box attack should be more potent than a black-box one. We believe this behavior is due to the inverse feature mapping problem. Hence, we conducted a GCAM attack exclusively in the feature space by eliminating all constraints needed to identify a valid potential sample in the problem space (i.e., non-negativity of coefficients for Gemini and GMN, rounding to genuine instruction embeddings for SAFE). As a result, GCAM achieved an A-rate between 92.90% and 99.81% in targeted scenarios and between 97.01% and 100% in untargeted ones.

Takeaway: In our tests, the Spatial Greedy black-box attack is on par with or beats the white-box GCAM attack based on a rounding inverse strategy. Further investigation is needed to confirm if this result will hold for more refined inverse feature mapping techniques and when attacking other models.

TABLE 6. Difference between SG and G for the A-rate and M-size in the four settings C1-C4, averaged on the three models. Where applicable, we consider the gray-box version of Greedy.

E. RQ3: IMPACT OF FEATURE EXTRACTION AND ARCHITECTURES ON ATTACKS
As detailed in Section VIII, we can distinguish the target models according to three aspects (NN architecture, input representation, and feature mapping process). Here, we are interested in investigating whether these aspects can influence the performance of our attacks or not.

In the targeted black-box scenario (Figure 9 and Table 2), SAFE and GMN are the weakest among the three considered models, as the peak attack success rate A-rate at τt = 0.80 is 60.68% for SAFE, 59.68% for GMN, and 27.54% for Gemini (C4 setting). These results highlight that our attack is sensitive to some aspects of the target model, in particular to the feature mapping process and the DNN architecture employed by the considered models. To assess this insight, we conduct different analyses to check whether our Spatial Greedy attack is exploiting some hidden aspects of the considered models to guide the update of the candidates' set. First, we check whether or not there exists a correlation between the number of initial instructions composing the function f1 and the obtained final similarity value. This analysis is particularly interesting for SAFE, as this model computes the similarity between two functions by only considering their first 150 instructions; the results of this study are reported in Figure 13. From the plots it is visible that there exists a negative correlation between the final similarity and the initial number of instructions composing the function we are modifying, also confirmed by the Pearson's r correlation coefficient: this negative correlation is almost moderate for SAFE (with r = −0.38), while it is weak for both Gemini and GMN (with r = −0.25 and r = −0.22, respectively). These results confirm that when Spatial Greedy modifies a function that is initially small (in particular, composed of less than 150 instructions), our adversarial example and the function f2 are more likely to have a final similarity value near 1 when targeting SAFE rather than the two other models.
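The correlation analysis above can be reproduced with a few lines of SciPy; the two input lists below are placeholders for data an analyst would collect (initial instruction counts of f1 and final similarities of the corresponding adversarial examples), so this is only an illustrative sketch.

```python
from scipy.stats import pearsonr

# Placeholder inputs: one entry per attacked pair.
initial_sizes = [42, 130, 517, 88, 260]         # instructions in f1 before the attack
final_sims    = [0.97, 0.91, 0.62, 0.95, 0.78]  # similarity of f_adv to f2

r, p_value = pearsonr(initial_sizes, final_sims)
print(f"Pearson's r = {r:.2f} (p = {p_value:.3g})")
```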
Then, since both Gemini and GMN implement a feature mapping function that looks closely at the particular assembly instructions composing the single blocks, we conduct a further analysis to assess whether or not the instructions inserted by Spatial Greedy trigger the features required by the two considered models. In particular, for each inserted instruction, we check whether or not it is mapped over the features considered by the two models and, for each adversarial example, we calculate the percentage of instructions

satisfying this condition. In the targeted black-box scenario using Spatial Greedy, we find that, on average, 90.83% of the inserted instructions for Gemini and 100% for GMN are mapped to the considered features.

FIGURE 13. Correlation between the initial number of instructions of the function f1 and the similarity between the generated adversarial example fadv and the target function f2. The considered adversarial examples are generated in the targeted scenario, using our Spatial Greedy approach (ε = 0.1, r = 0.75, and |CAND| = 400) in setting C4. We also report an equal number of samples randomly drawn from a uniform distribution.

To verify how the particular architecture implemented by the model affects the performance of Spatial Greedy, we checked how the instructions inserted by our procedure are distributed across the various dead branches. Our hypothesis is that when targeting GNN-based models (such as Gemini and GMN), our attack should spread the inserted instructions across the various dead branches; on the contrary, the position of the block should not influence the choice of the attack when targeting an RNN such as SAFE. For all the considered models, the block where our attack inserts the majority of the instructions for each adversarial example is the one closest to the beginning of the function. However, while this is evident for GMN and SAFE (where the first block contains most of the inserted instructions in 313 and 205 of the considered examples, respectively), in Gemini the inserted instructions are more uniformly distributed across the various dead branches. We report the complete distribution in Figure 14.

To further validate these results, we calculated the entropy of the generated adversarial examples, resulting in values of 2.94 for Gemini, 2.77 for GMN, and 2.68 for SAFE. Higher entropy suggests a more uniform distribution of inserted instructions across dead branches, while lower values indicate concentration in specific blocks. These entropy values reinforce our previous conclusions.
inserted instructions across dead branches, while lower values nario, the particular architecture seems to influence
indicate concentration in specific blocks. These entropy the position where the instructions are inserted.
values reinforce our previous conclusions. We highlight that In general, the particular feature mapping process
these results are partially coherent with our initial hypothesis; adopted by the models seems playing a crucial role
indeed, the first block is the closest to the prologue of the in the choice of the instructions.
function, which plays a key role for both SAFE and GMN. In the white-box scenario, the feature mapping
Indeed, as mentioned in [44], SAFE primarily targets function processes adopted by the models prevent in reaching
prologues, which explains why our attack inserts most optimal A-rate results.
instructions into the first block, as it is closest to the function
prologue; For GMN, since prologue instructions typically
follow specific compiler patterns, the nodes containing these XI. MIRAI CASE STUDY
instructions are likely to match, prompting Spatial Greedy to We complement our evaluation with a case study examining
insert most instructions into the dead branches closest to the our attacks in the context of disguising functions from
prologue. malware.

FIGURE 14. Distribution of the blocks where, for each adversarial example (successful or not), our Spatial Greedy attack inserts most of the instructions. The considered adversarial examples are generated in the targeted scenario, using our Spatial Greedy approach (ε = 0.1, r = 0.75, and |CAND| = 400) in setting C4.

We consider the code base from a famous leak of the Mirai malware, compiling it with gcc 9.2.1 at the -O0 optimization level on Ubuntu 20.04. After filtering out all functions with less than six instructions, we obtain a set of 63 functions. We build distinct datasets for the targeted and untargeted case. For the former, we pair malicious Mirai functions with benign ones from the Targ dataset from the main evaluation. For the latter, each of the 63 functions is paired with itself.

Figure 15 reports on our targeted attacks, comparing Greedy, Spatial Greedy, and the white-box GCAM for the metrics of A-rate, A-sim and N-inc. For brevity, we focus on the performant C4 configuration from the main evaluation.

FIGURE 15. Experiments on the three models subject to black-box and white-box attackers in the targeted scenario, on the Mirai dataset for a different success threshold τt ∈ (0.74, 0.88) in setting C4. In case of black-box attacker, we test the Spatial Greedy approach against the target models with ε = 0.1, r = 0.75, |CAND| = 400. In case of white-box attacker, we test the GCAM attack with 20k iterations against GEMINI, with 1k iterations against GMN, and with 1k iterations against SAFE.

For the A-rate, when attacking GMN and SAFE, Spatial Greedy has an edge on both Greedy and GCAM, with the latter performing markedly worse than the two black-box ones. With Gemini, Spatial Greedy and Greedy perform similarly, with both resulting below GCAM. This behaviour is consistent with the main evaluation results (cf. Figure 12).

In more detail, with GMN, the average increase of A-rate for Spatial Greedy over Greedy is 3.73 (max. of 6.27 at τt = 0.74, min. of 2.27 at τt = 0.88). With SAFE, this increase is 3.81 (max. of 6.35 at τt = 0.74; min. of zero at τt = 0.8). With Gemini, GCAM is the best attack with an average 7.94 increase over Spatial Greedy (max. of 9.52% at τt = 0.74; min. of 6.35% at τt = 0.88). SAFE remains the easiest model to attack also on this dataset.

Regarding A-sim and N-inc, Spatial Greedy and Greedy perform similarly on GMN and SAFE, whereas on Gemini Spatial Greedy is slightly worse than Greedy for A-sim at lower thresholds. The relative performance of GCAM vs. the black-box attacks resembles the trends discussed in the main evaluation (cf. Figure 12).

FIGURE 16. Resilience of the three models to black-box and white-box attackers in the untargeted scenario, on the Mirai dataset for a different success threshold τu ∈ [0.46, 0.62], considering the setting C4. In case of black-box attacker, we test the Spatial Greedy approach against the target models with ε = 0.1, r = 0.75, |CAND| = 400. In case of white-box attacker, we test the GCAM attack with 20k iterations against GEMINI, with 1k iterations against GMN, and with 1k iterations against SAFE.

Figure 16 reports on the experiments we conducted for the untargeted scenario. We note that Spatial Greedy outperforms the other attacks on SAFE (with the exception of GCAM when τu = 0.46) and performs analogously to them on the

other two models. Compared to the main evaluation results, targeted attacks have worse performance than untargeted ones also on this dataset. Moreover, successful untargeted attacks continue to require fewer instructions: in particular, across all models, a successful black-box targeted attack needs on average 42.63 instructions, whereas the untargeted one adds on average 5.27 instructions.

XII. PRACTICAL IMPACTS AND POSSIBLE COUNTERMEASURES
In this section, we discuss the practical impacts of our paper and possible countermeasures.

A. PRACTICAL IMPACTS
The findings in Section X-B reveal that the evaluated binary similarity systems are susceptible to both targeted and untargeted attacks, though their resilience differs. These systems show higher robustness against targeted attacks, with an average A-rate of 49.43%, whereas the average A-rate for untargeted attacks is 79.44%. From a practical perspective, as detailed in Section I, we can consider the three main use cases of binary similarity systems: vulnerability detection, plagiarism detection, and malware detection. The results in the untargeted scenario imply that an attacker trying to substitute a function with an older, vulnerable version, or to make a plagiarized function dissimilar to the original one, succeeds in nearly 80% of cases. This suggests that current binary similarity models are unfit for tasks such as vulnerability detection or authorship identification when used in a context that could be subject to adversarial attacks (for example, but not limited to, when used in security-sensitive scenarios). To remark on this, our results in Section XI practically show that, when an attacker creates a new variant of a malicious function without targeting any specific benign function, the models fail to recognise it as similar to any known malicious sample in nearly 78% of the cases. In contrast, the considered models show greater resistance when the attacker is trying to create a variant of its input matching a specific target function. This implies that the considered models are more resistant when facing an attacker trying to make a malicious function closely resemble a specific whitelisted function than when the attacker is hiding the malevolent function. However, it is important to note that even in this scenario, as reported in Section XI, an attacker can bypass the binary similarity detection system in more than half of the cases.

B. COUNTERMEASURES
Given these results, even though our primary focus has been on the attack side, it is important to investigate potential defensive strategies.

A typical approach consists of using a classifier as a detector to distinguish between clean examples and adversarial ones. In our context, one could use an anomaly detection model to check whether the input function's code follows common patterns of compiler-generated code or not, using models for checking compiler provenance [45], [46].

Adversarial training [13], [47] is the standard solution for increasing the robustness of an already trained model; however, while it could improve the robustness against our methodologies, there is no guarantee that the retrained models would be robust against zero-day attacks. To overcome these limitations, techniques based on randomized defenses [48] could be considered. In particular, [48] proposes a methodology to increase the robustness of DNN classifiers against adversarial examples by introducing random noise inside the input representation during both training and inference. While originally designed for the computer vision scenario, this method has been adapted to the malware classification domain, by randomly substituting [49] and deleting [50] bytes from the input sample. However, the applicability of these approaches in the binary similarity domain has not been studied yet and must focus on manipulating assembly instructions or CFG nodes directly.

A more promising approach consists of analyzing only a subset of the instructions from the input functions; the rationale is that this could thwart the attack by partially destroying the pattern of instructions introduced by the adversary. Similarly to [51], one could learn the function representation by focusing only on some random portions of the input. A more refined approach could consist of filtering out instructions using techniques such as data-flow analysis and micro-trace execution, to concentrate solely on the ones with the highest semantic importance. However, one has to keep in mind that refined analyses at the pre-processing stage could introduce significant delays that would partially nullify the speed advantages of using DNN solutions over symbolic execution ones.

Finally, one could use an ensemble of all the target models combined with a majority voting approach to determine the final similarity. As discussed in the evaluation, the various attacked models respond differently to our attacks. This suggests that an ensemble model could be a feasible defense.
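A minimal sketch of this ensemble idea follows; the common .similarity() interface, the per-model thresholds, and the strict-majority rule are our own illustrative assumptions rather than an evaluated defense.

```python
def ensemble_is_similar(f1, f2, models, thresholds):
    """Declare two functions similar only if a strict majority of models agrees."""
    votes = sum(1 for model, tau in zip(models, thresholds)
                if model.similarity(f1, f2) >= tau)
    return votes > len(models) // 2

# Example usage (hypothetical wrappers around Gemini, GMN, and SAFE):
# ensemble_is_similar(f1, f2, [gemini, gmn, safe], [0.80, 0.80, 0.80])
```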
XIII. LIMITATIONS AND FUTURE WORKS
In this paper, we have seen how adding dead code is a natural and effective way to realize appreciable perturbations for a selection of heterogeneous binary similarity systems.

In Section IV-C, we acknowledged how, in the face of defenders that pre-process code with static analysis, our implementation would be limited by having the inserted dead blocks guarded by non-obfuscated branch predicates. Furthermore, we highlight that all the approaches we propose consist of inserting into dead branches sequences of instructions that do not present any data dependency, which makes them easier to detect.

Our experiments suggest that, depending on the characteristics of a given model and pair of functions, the success of an attack may be affected by factors like the initial difference in code size and CFG topology, among others. In this respect, it could be interesting to explore how to

alternate our dead-branch addition perturbation, for example, with the insertion of dead fragments within existing blocks.

We believe both limitations could be addressed in future work with implementation effort, whereas the main goal of this paper was to show that adversarial attacks against binary similarity systems are a concrete possibility. To enhance our attacks, we could explore more complex patching implementation strategies based on binary rewriting or a modified compiler back-end. Such studies may then include also other performant similarity systems, such as Asm2Vec [8] or jTrans [52].

XIV. CONCLUSION
We presented the first study on the resilience of code models for binary similarity to black-box and white-box adversarial attacks, covering targeted and untargeted scenarios. Our tests highlight that current state-of-the-art solutions in the field (Gemini, GMN, and SAFE) are not robust to adversarial attacks crafted for misleading binary similarity models. Furthermore, their resilience against untargeted attacks appears significantly lower in our tests. Our black-box Spatial Greedy technique also shows that an instruction-selection strategy guided by a dynamic exploration of the entire ISA is more effective than using a fixed set of instructions. We hope to encourage follow-up studies by the community to improve the robustness and performance of these systems.

ACKNOWLEDGMENT
This work has been carried out while Gianluca Capozzi was enrolled in the Italian National Doctorate on Artificial Intelligence run by Sapienza University of Rome.

REFERENCES
[1] T. Dullien and R. Rolles, "Graph-based comparison of executable objects (English version)," in Proc. Symp. sur la sécurité des Technol. de l'information et des Commun. (SSTIC), 2005, vol. 5, no. 1, p. 3.
[2] W. M. Khoo, A. Mycroft, and R. Anderson, "Rendezvous: A search engine for binary code," in Proc. 10th Work. Conf. Mining Softw. Repositories (MSR), May 2013, pp. 329–338.
[3] S. Alrabaee, P. Shirani, L. Wang, and M. Debbabi, "SIGMA: A semantic integrated graph matching approach for identifying reused functions in binary code," Digit. Invest., vol. 12, pp. S61–S71, Mar. 2015.
[4] L. Massarelli, G. A. Di Luna, F. Petroni, L. Querzoni, and R. Baldoni, "Function representations for binary similarity," IEEE Trans. Dependable Secure Comput., vol. 19, no. 4, pp. 2259–2273, Jul. 2022.
[5] Y. David, N. Partush, and E. Yahav, "Statistical similarity of binaries," in Proc. 37th ACM SIGPLAN Conf. Program. Lang. Design Implement., Jun. 2016, pp. 266–280.
[6] Y. David, N. Partush, and E. Yahav, "Similarity of binaries through re-optimization," in Proc. 38th ACM SIGPLAN Conf. Program. Lang. Design Implement., Jun. 2017, pp. 79–94.
[7] M. Egele, M. Woo, P. Chapman, and D. Brumley, "Blanket execution: Dynamic similarity testing for program binaries and components," in Proc. 23rd USENIX Secur. Symp. (SEC), 2014, pp. 303–317.
[8] S. H. H. Ding, B. C. M. Fung, and P. Charland, "Asm2Vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization," in Proc. IEEE Symp. Secur. Privacy (SP), May 2019, pp. 472–489.
[9] X. Xu, C. Liu, Q. Feng, H. Yin, L. Song, and D. Song, "Neural network-based graph embedding for cross-platform binary code similarity detection," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., Oct. 2017, pp. 363–376.
[10] J. Pewny, F. Schuster, L. Bernhard, T. Holz, and C. Rossow, "Leveraging semantic signatures for bug search in binary programs," in Proc. 30th Annu. Comput. Secur. Appl. Conf. (ACSAC), 2014, pp. 406–415.
[11] X. Yuan, P. He, Q. Zhu, and X. Li, "Adversarial examples: Attacks and defenses for deep learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 9, pp. 2805–2824, Sep. 2019.
[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," 2013, arXiv:1312.6199.
[13] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," 2014, arXiv:1412.6572.
[14] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in Proc. IEEE Symp. Secur. Privacy (SP), May 2017, pp. 39–57.
[15] J. Li, S. Qu, X. Li, J. Szurley, J. Z. Kolter, and F. Metze, "Adversarial music: Real world audio adversary against wake-word detection system," in Proc. 32nd Annu. Conf. Neural Inf. Process. Syst. (NeurIPS), 2019, pp. 11908–11918.
[16] R. Jia and P. Liang, "Adversarial examples for evaluating reading comprehension systems," in Proc. 22nd Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2017, pp. 2021–2031.
[17] F. Pierazzi, F. Pendlebury, J. Cortellazzi, and L. Cavallaro, "Intriguing properties of adversarial ML attacks in the problem space," in Proc. 41st IEEE Symp. Secur. Privacy (SP), 2020, pp. 1332–1349.
[18] B. Devore-McDonald and E. D. Berger, "Mossad: Defeating software plagiarism detection," in Proc. ACM Program. Lang. (OOPSLA), vol. 4, Jun. 2020, pp. 1–28.
[19] A. Hazimeh, A. Herrera, and M. Payer, "Magma: A ground-truth fuzzing benchmark," in Proc. ACM Meas. Anal. Comput. Syst., 2020, vol. 4, no. 3, pp. 1–29.
[20] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proc. 27th Annu. Conf. Neural Inf. Process. Syst. (NeurIPS), 2013, pp. 3111–3119.
[21] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," 2017, arXiv:1706.06083.
[22] Y. Li, C. Gu, T. Dullien, O. Vinyals, and P. Kohli, "Graph matching networks for learning the similarity of graph structured objects," in Proc. Int. Conf. Mach. Learn., 2019, pp. 3835–3845.
[23] A. Marcelli, M. Graziano, X. Ugarte-Pedrero, Y. Fratantonio, M. Mansouri, and D. Balzarotti, "How machine learning is solving the binary function similarity problem," in Proc. 31st USENIX Secur. Symp. (SEC), 2022, pp. 2099–2116.
[24] N. Mrkšic, D. Ó Séaghdha, B. Thomson, M. Gašic, L. M. Rojas-Barahona, P.-H. Su, D. Vandyke, T.-H. Wen, and S. Young, "Counter-fitting word vectors to linguistic constraints," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol., 2016, pp. 142–148.
[25] S. Ren, Y. Deng, K. He, and W. Che, "Generating natural language adversarial examples through probability weighted word saliency," in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics, 2019, pp. 1085–1097.
[26] D. Li, Y. Zhang, H. Peng, L. Chen, C. Brockett, M.-T. Sun, and B. Dolan, "Contextualized perturbation for textual adversarial attack," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol., 2021, pp. 5053–5069.
[27] L. Li, R. Ma, Q. Guo, X. Xue, and X. Qiu, "BERT-ATTACK: Adversarial attack against BERT using BERT," in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2020, pp. 6193–6202.
[28] N. Yefet, U. Alon, and E. Yahav, "Adversarial examples for models of code," in Proc. ACM Program. Lang. (OOPSLA), vol. 4, Jun. 2020, pp. 1–30.
[29] W. Zhang, S. Guo, H. Zhang, Y. Sui, Y. Xue, and Y. Xu, "Challenging machine learning-based clone detectors via semantic-preserving code transformations," IEEE Trans. Softw. Eng., vol. 49, no. 5, pp. 3052–3070, May 2023.
[30] B. Kolosnjaji, A. Demontis, B. Biggio, D. Maiorca, G. Giacinto, C. Eckert, and F. Roli, "Adversarial malware binaries: Evading deep learning for malware detection in executables," in Proc. 26th Eur. Signal Process. Conf. (EUSIPCO), Sep. 2018, pp. 533–537.
[31] F. Kreuk, A. Barak, S. Aviv-Reuven, M. Baruch, B. Pinkas, and J. Keshet, "Adversarial examples on discrete sequences for beating whole-binary malware detection," 2018, arXiv:1802.04528.

[32] K. Lucas, M. Sharif, L. Bauer, M. K. Reiter, and S. Shintre, ‘‘Malware [52] H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang,
makeover: Breaking ML-based static analysis by modifying executable ‘‘JTrans: Jump-aware transformer for binary code similarity detection,’’
bytes,’’ in Proc. 16th ACM Asia Conf. Comput. Commun. Secur. in Proc. 31st ACM SIGSOFT Int. Symp. Softw. Test. Anal. (ISSTA), 2022,
(AsiaCCS), 2021, pp. 744–758. pp. 1–13.
[33] W. Song, X. Li, S. Afroz, D. Garg, D. Kuznetsov, and H. Yin, ‘‘MAB-
malware: A reinforcement learning framework for blackbox generation of
adversarial malware,’’ in Proc. 17th ACM Asia Conf. Comput. Commun.
Secur. (AsiaCCS), 2022, pp. 990–1003.
[34] L. Jia, B. Tang, C. Wu, Z. Wang, Z. Jiang, Y. Lai, Y. Kang, N. Liu, and J. GIANLUCA CAPOZZI received the master’s
Zhang, ‘‘FuncFooler: A practical black-box attack against learning-based degree in engineering in computer science from
binary code similarity detection methods,’’ 2022, arXiv:2208.14191. the Sapienza University of Rome, Italy, in 2021,
[35] B. Biggio and F. Roli, ‘‘Wild patterns: Ten years after the rise of adversarial where he is currently pursuing the Ph.D. degree.
machine learning,’’ Pattern Recognit., vol. 84, pp. 317–331, Dec. 2018. His main research interest includes adversarial
[36] P. Borrello, D. C. D’Elia, L. Querzoni, and C. Giuffrida, ‘‘Constantine: machine learning against neural network models
Automatic side-channel resistance using efficient control and data flow for binary analysis.
linearization,’’ in Proc. ACM SIGSAC Conf. Comput. Commun. Secur.,
Nov. 2021, pp. 715–733.
[37] S. Heule, E. Schkufza, R. Sharma, and A. Aiken, ‘‘Stratified synthesis:
Automatically learning the x86–64 instruction set,’’ in Proc. 37th
ACM SIGPLAN Conf. Program. Lang. Des. Implement. (PLDI), 2016,
pp. 237–250.
DANIELE CONO D’ELIA received the Ph.D. degree in engineering in computer science from the Sapienza University of Rome, in 2016. He is currently a tenure-track Assistant Professor with the Sapienza University of Rome. His research activities span several fields across software and systems security, with contributions in the analysis of adversarial code and in the design of program analyses and transformations to make software more secure.

GIUSEPPE ANTONIO DI LUNA received the Ph.D. degree. After the Ph.D. study, he did postdoctoral research with the University of Ottawa, Canada, working on fault-tolerant distributed algorithms, distributed robotics, and algorithm design for programmable particles. In 2018, he started postdoctoral research with Aix-Marseille University, France, where he worked on dynamic graphs. Currently, he is performing research on applying NLP techniques to the binary analysis domain. He is an Associate Professor with the Sapienza University of Rome, Italy.

LEONARDO QUERZONI received the Ph.D. degree with a thesis on efficient data routing algorithms for publish/subscribe middleware systems, in 2007. He is a Full Professor with the Sapienza University of Rome, Italy. He has authored more than 80 papers published in international scientific journals and conferences. His research interests range from cyber security to distributed systems, in particular binary similarity, distributed stream processing, dependability, and security in distributed systems. In 2017, he received the Test of Time Award from the ACM International Conference on Distributed Event-Based Systems for the paper TERA: Topic-Based Event Routing for Peer-to-Peer Architectures, published in 2007.

Open Access funding provided by ‘Università degli Studi di Roma La Sapienza’ within the CRUI CARE Agreement.

