Semantics Aware Malware
Mihai Christodorescu∗ Somesh Jha∗ Sanjit A. Seshia† Dawn Song Randal E. Bryant†
University of Wisconsin, Madison Carnegie Mellon University
{mihai, jha}@cs.wisc.edu {sanjit@cs., dawnsong@, bryant@cs.}cmu.edu
(a) Template of malicious behavior.
    1: A = const_addr1
    2: B = const_addr2
    3: condition(A) ?
    4: mem[B] = f(mem[A])
    5: A = A + c
    6: B = B + d
    7: jump const_addr2

(b) Malware instance.
    1: eax = 0x403000
    2: ebx = 0x400000
    3: edx = eax + 3
    4: eax >= 0x406000 ?
    5: mem[ebx] = mem[edx−3] << 2 + 1
    6: eax = eax + 4
    7: edx = edx + 4
    8: ebx = ebx + 4
    9: jump 0x400000

(c) Execution context.
    const_addr1  ← 0x403000
    const_addr2  ← 0x400000
    condition(X) ← X ≥ 0x406000
    f(X)         ← X ≪ 2 + 1
    c            ← 4
    d            ← 4

(d) Symbolic constant types.
    const_addr1 : F(0)
    const_addr2 : F(0)
    c           : F(0)
    d           : F(0)
    f           : F(1)
    condition   : P(1)

Figure 1. Malware instance (b) satisfies the template (a) according to our semantics.
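The claim in the caption of Figure 1 can be sanity-checked by direct simulation. The sketch below is illustrative only and rests on several assumptions that are not stated in the figure text: memory is a Python dict from word address to integer (unset addresses read as 0), word width is ignored, f(X) is read as (X << 2) + 1, and the false branch of the conditional node enters the loop body while the true branch leads to the final jump. The helper names (`run_template`, `run_instance`, `affected`, `check_conditions`) are hypothetical.

```python
import random

# Execution context of Figure 1(c). Assumption: f(X) is read as (X << 2) + 1.
EC = {
    "const_addr1": 0x403000,
    "const_addr2": 0x400000,
    "condition": lambda x: x >= 0x406000,
    "f": lambda x: (x << 2) + 1,
    "c": 4,
    "d": 4,
}

def run_template(mem0, ec):
    """Run the template of Figure 1(a); return (final memory, final pc)."""
    mem = dict(mem0)
    a, b = ec["const_addr1"], ec["const_addr2"]    # nodes 1-2
    while not ec["condition"](a):                  # node 3 (assumed: false enters loop)
        mem[b] = ec["f"](mem.get(a, 0))            # node 4
        a += ec["c"]                               # node 5
        b += ec["d"]                               # node 6
    return mem, ec["const_addr2"]                  # node 7: jump target

def run_instance(mem0):
    """Run the instruction sequence of Figure 1(b); return (final memory, final pc)."""
    mem = dict(mem0)
    eax, ebx = 0x403000, 0x400000                  # nodes 1-2
    edx = eax + 3                                  # node 3
    while not eax >= 0x406000:                     # node 4
        mem[ebx] = (mem.get(edx - 3, 0) << 2) + 1  # node 5
        eax += 4                                   # node 6
        edx += 4                                   # node 7
        ebx += 4                                   # node 8
    return mem, 0x400000                           # node 9: jump target

def affected(before, after):
    """Addresses whose value differs between two memory snapshots."""
    return {a for a in after if after[a] != before.get(a, 0)}

def check_conditions(mem0, ec=EC):
    """Return (memory agrees on affected addresses, pc condition) for one pair of runs."""
    mem_t, pc_t = run_template(mem0, ec)
    mem_i, pc_i = run_instance(mem0)
    aff = affected(mem0, mem_t)
    same_memory = all(mem_t[a] == mem_i.get(a, 0) for a in aff)
    same_pc_region = (pc_t not in aff) or (pc_i in aff)
    return same_memory, same_pc_region

# Random contents for the source area; both runs start from the same memory.
rng = random.Random(0)
mem0 = {addr: rng.randrange(256) for addr in range(0x403000, 0x406000, 4)}
print(check_conditions(mem0))
```

Neither run generates system events in this model, so only the memory and program-counter requirements are exercised. A single pair of runs is evidence for one initial state, not a proof over all states.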
When does an instruction sequence contain the malicious behavior? Consider the instruction sequence shown in Figure 1(b). Assume the symbolic constants in the template are assigned the values shown in Figure 1(c). If the template and the instruction sequence shown in Figure 1(a) and 1(b) are executed from a state where the contents of the memory are the same, then after both executions the state of the memory is the same. In other words, the template and the instruction sequence have the same effect on the memory. This is because whenever memory accesses are performed the addresses in the two executions are the same. Moreover, stores to memory locations are also the same. Thus, there is an execution of the instruction sequence shown in Figure 1(b) that exhibits the behavior specified by the template given in Figure 1(a). In other words, the malicious behavior specified by the template is demonstrated by the instruction sequence. Note that this intuitive notion of an instruction sequence demonstrating a specified malicious behavior is not affected by program transformations, such as register renaming, inserting irrelevant instruction sequences, and changing starting addresses of memory blocks.

2.1. Formal semantics

A template $T = (I_T, V_T, C_T)$ is a 3-tuple, where $I_T$ is a sequence of instructions and $V_T$ and $C_T$ are the sets of variables and symbolic constants that appear in $I_T$. There are two types of symbolic constants: an n-ary function (denoted as $F(n)$) and an n-ary predicate (denoted as $P(n)$). Notice that a simple symbolic constant is a 0-ary function, that is, it has type $F(0)$. A constant $c$ of type $\tau$ is written as $c : \tau$. In the template shown in Figure 1(a) the variables $V_T$ are $\{A, B\}$ and the symbolic constants $C_T$ are shown in Figure 1(d).

Let $I$ be an instruction sequence or a program fragment. An example instruction sequence is shown in Figure 1(b). Memory contents are represented as a function $M : Addr \to Values$ from the set of addresses $Addr$ to the set of values $Values$, where $M[a]$ denotes the value stored at address $a$.

An execution context for a template $T$, with $T = (I_T, V_T, C_T)$, is an assignment of values of appropriate types to the symbolic constants in the set $C_T$. Formally, an execution context $EC_T$ for a template $T$ is a function with domain $C_T$, such that for all $c \in C_T$ the types of $c$ and $EC_T(c)$ are the same. An execution context for the template shown in Figure 1(a) is shown in Figure 1(c). Given an execution context $EC_T$ for a template $T$, let $EC_T(T)$ be the template obtained by replacing every constant $c \in C_T$ by $EC_T(c)$.

A state $s^T$ for the template $T$ is a 3-tuple denoted $(val^T, pc^T, mem^T)$, where $val^T : V_T \to Values$ is an assignment of values to the variables in $V_T$, $pc^T$ is a value for the program counter, and $mem^T : Addr \to Values$ gives the memory content. We follow the convention that a state and its components referring to the template are superscripted by $T$. Given a template state $s^T$, $val(s^T)$, $pc(s^T)$ and $mem(s^T)$ refer to the three components of the state. Similarly, a state for the instruction sequence $I$ is a 3-tuple $(val, pc, mem)$, where $val : Reg \to Values$ is an assignment of values to the set of registers $Reg$, $pc$ is a value for the program
counter, and $mem : Addr \to Values$ gives the memory contents. Let $S^T$ and $S$ be the state spaces for the template and the instruction sequence, respectively.

Assume that we are given an execution context $EC_T$ for a template $T$, and the template is in state $s^T$. If we execute an instruction $i$ from the template $EC_T(T)$ in state $s^T$, we transition to a new state $s^T_1$ and generate a system event $e$ (in our context, events are usually system calls or kernel traps). We denote a state change from $s^T$ to $s^T_1$ generating an event $e$ as $s^T \xrightarrow{e} s^T_1$. For uniformity, if an instruction $i$ does not generate a system event, we say that it generates the null event, or $e(i) = null$. For every initial template state $s^T_0$, executing the template $T$ in an execution context $EC_T$ generates a sequence as follows:

\[ \sigma(T, EC_T, s^T_0) = s^T_0 \xrightarrow{e^T_1} s^T_1 \xrightarrow{e^T_2} \cdots , \]

where for $i \geq 1$, $s^T_i$ is the state after executing the $i$-th instruction from the template $EC_T(T)$ and $e^T_i$ is the event generated by the $(i-1)$-th instruction. Notice that if the template does not terminate, $\sigma(T, EC_T, s^T_0)$ can be infinite. Similarly, $\sigma(I, s_0)$ denotes the sequence when the instruction sequence $I$ is executed from the initial state $s_0$.

Definition 1 We say that an instruction sequence $I$ contains a behavior specified by the template $T = (I_T, V_T, C_T)$ if there exists a program state $s_0$, an execution context $EC_T$, and a template state $s^T_0$ such that $mem(s^T_0) = mem(s_0)$ (the memory contents in the two states are the same), the two sequences $\sigma(I, s_0)$ and $\sigma(T, EC_T, s^T_0)$ are finite, and the following conditions hold on the two sequences:

• (Condition 1): Let the two execution sequences be given as follows:

\[ \sigma(T, EC_T, s^T_0) = s^T_0 \xrightarrow{e^T_1} s^T_1 \xrightarrow{e^T_2} \cdots \xrightarrow{e^T_k} s^T_k \]
\[ \sigma(I, s_0) = s_0 \xrightarrow{e_1} s_1 \xrightarrow{e_2} \cdots \xrightarrow{e_r} s_r \]

Let $affected(\sigma(T, EC_T, s^T_0))$ be the set of addresses $a$ such that $mem(s^T_0)[a] \neq mem(s^T_k)[a]$, i.e., $affected(\sigma(T, EC_T, s^T_0))$ is the set of memory addresses whose value changes after executing the template $T$ from the initial state. We require that $mem(s^T_k)[a] = mem(s_r)[a]$ holds for all $a \in affected(\sigma(T, EC_T, s^T_0))$, i.e., values at addresses that belong to the set $affected(\sigma(T, EC_T, s^T_0))$ are the same after executing the template $T$ and the instruction sequence $I$.

• (Condition 2): Ignoring null events, the event sequence $\langle e^T_1, \ldots, e^T_k \rangle$ is a subsequence of the event sequence $\langle e_1, \ldots, e_r \rangle$. For two system events to match, their arguments and return values should be identical.

• (Condition 3): If $pc(s^T_k) \in affected(\sigma(T, EC_T, s^T_0))$, then $pc(s_r) \in affected(\sigma(T, EC_T, s^T_0))$. In other words, if the program counter at the end of executing the template $T$ points to the affected memory area, then the program counter after executing $I$ should also point into the affected memory area.

Consider the example shown in Figure 1. Assume that we use the execution context shown in Figure 1(c) for the template shown in Figure 1(a). Suppose we execute the template and instruction sequence shown in Figure 1 from states with the same memory contents. The state of the memory is the same in both executions, so condition 1 is true. Condition 2 is trivially satisfied. Since the jumps have the same target, condition 3 is trivially true.

Definition 2 A program $P$ satisfies a template $T$ (denoted as $P \models T$) iff $P$ contains an instruction sequence $I$ such that $I$ contains a behavior specified by $T$. Given a program $P$ and a template $T$, we call the problem of determining whether $P \models T$ the template matching problem, or TMP.

Defining a variant family. Definition 2 can be used to define a variant family. The intuition is that most variants of a malware contain a common set of malicious behaviors, such as a decryption loop and a loop to search for email addresses. Let $\mathcal{T}$ be a set of templates (this set contains the specification of malicious behavior common to a certain malware family). The set $\mathcal{T}$ defines a variant family as follows:

\[ \{ P \mid \text{for all } T \in \mathcal{T},\ P \models T \} \]

In other words, the variant family defined by $\mathcal{T}$ contains all programs that satisfy all templates in the set $\mathcal{T}$.

Theorem 1 TMP is undecidable.

Proof: We will reduce the halting problem to TMP. Let $M$ be a Turing machine, and $P_M$ be a program that uses instructions in our IR and simulates $M$ (since our IR is Turing complete, this can be accomplished). Without loss of generality, assume that $P_M$ does not touch a special address sp_addr while simulating the Turing machine $M$. Before starting to simulate $M$, $P_M$ sets mem[sp_addr] to 0. After simulating $M$, if $P_M$ halts, it sets mem[sp_addr] to 1. Consider the template $T$ shown below:

    mem[sp_addr] = 0
    mem[sp_addr] = 1

It is easy to see that $P_M \models T$ iff $M$ halts.

2.2. A weaker semantics

In some scenarios the semantics described in definition 1 is too strict. For example, if a template uses certain memory locations to store temporaries, then these memory locations should not be checked for equality when comparing executions of an instruction sequence and a template. Let $\sigma(T, EC_T, s^T_0)$ be the sequence generated when a template $T$ is executed from a state $s^T_0$ using the execution context $EC_T$. Define a set of core memory locations $core(\sigma(T, EC_T, s^T_0))$ which is a subset of $affected(\sigma(T, EC_T, s^T_0))$. Condition 1 in definition 1 can be changed as follows to give a weaker semantics:

• (Modified condition 1): We require that for all $a \in core(\sigma(T, EC_T, s^T_0))$, we have $mem(s^T_k)[a] =$

[Figure: block diagram; recoverable labels: Program, IDAPro, IR conversion, Program IR, Template, Malware detector, Decision procedures, Yes/No.]
[Figure 3, partially recovered. Template node 3: A > const_addr3 ? (true/false branches). Program nodes 3: nop; 4: ecx = eax + 1; 5: ecx − 1 > 0x406000 ? (true/false branches); 6: eax = ecx − 1.]

Figure 3. Example of program (on the right) satisfying a template (on the left) according to our algorithm $\mathcal{A}_{MD}$. Gray arrows connect program nodes to their template counterparts. The dashed arrow on the left marks one of the def-use relations that holds true in the template, while the corresponding dashed arrow on the right marks the def-use relation that must hold true in the program.
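The two kinds of arrows in Figure 3 correspond to two checks: matching a template node against a program node, and verifying that a def-use relation carries the same value in the program. The sketch below illustrates both on the nodes recoverable from the figure. It is a simplified illustration, not the prototype's implementation: expressions are encoded as nested tuples, the helpers `match`, `bind`, and `refuted` are hypothetical names, registers are assumed to be 32 bits wide, and the value check uses plain random testing, which can refute value preservation but never prove it.

```python
import random

# Expression encoding (an assumption of this sketch):
#   ("var", name)   template variable      ("sym", name)  symbolic constant
#   ("reg", name)   program register       ("const", n)   program constant
#   ("op", symbol, arg, ...)               operator application

def bind(binding, name, expr):
    """Extend a binding; fail (None) if name is already bound to something else."""
    if name in binding and binding[name] != expr:
        return None
    return {**binding, name: expr}

def match(t, p, binding=None):
    """One-way matching: instantiate template names with ground program terms."""
    binding = {} if binding is None else binding
    if t[0] == "var":                       # any expression except an assignment
        if p[0] == "op" and p[1] == "=":
            return None
        return bind(binding, t[1], p)
    if t[0] == "sym":                       # only a program constant
        return bind(binding, t[1], p) if p[0] == "const" else None
    if t[0] == "op":                        # same operator, arguments must match
        if p[0] != "op" or p[1] != t[1] or len(p) != len(t):
            return None
        for t_arg, p_arg in zip(t[2:], p[2:]):
            binding = match(t_arg, p_arg, binding)
            if binding is None:
                return None
        return binding
    return binding if t == p else None      # ground template term

# Template node 3 and program node 5 of Figure 3.
T3 = ("op", ">", ("var", "A"), ("sym", "const_addr3"))
P5 = ("op", ">", ("op", "-", ("reg", "ecx"), ("const", 1)), ("const", 0x406000))

WORD = 2 ** 32  # assumed register width

def fragment_r(s):
    """Program nodes 2-4: ebx = 0x400000; nop; ecx = eax + 1."""
    s["ebx"] = 0x400000
    s["ecx"] = (s["eax"] + 1) % WORD
    return s

def refuted(fragment, pre_expr, post_expr, trials=200, seed=0):
    """True if some random run shows pre_expr(before) != post_expr(after)."""
    rng = random.Random(seed)
    for _ in range(trials):
        pre = {r: rng.randrange(WORD) for r in ("eax", "ebx", "ecx")}
        post = fragment(dict(pre))
        if pre_expr(pre) != post_expr(post):
            return True
    return False

print(match(T3, P5))
print(refuted(fragment_r, lambda s: s["eax"], lambda s: (s["ecx"] - 1) % WORD))
```

The first call yields a binding for A and const_addr3; the second reports whether random testing found a counterexample to "eax before the fragment equals ecx − 1 after it". A negative answer only means that no counterexample was found in the sampled states.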
match the program with the template as the ordering of memory updates in the template loop is different from that in the program loops.

Local unification. The unification step addresses the first condition of algorithm $\mathcal{A}_{MD}$, by producing an assignment to variables in an instruction at a template node such that it matches a program node instruction (if such an assignment exists). Since a program node expression contains only ground terms, the algorithm uses one-way matching that instantiates template variables with program expressions. In Figure 3, template node 3 matches program node 5, with the binding { A ← ecx - 1, const_addr3 ← 0x406000 }. In the prototype we implemented, a template can contain expressions with variables, symbolic constants, predefined functions (see the operators in Appendix B), and external function calls. The matching algorithm takes these restrictions into account:

• A variable in the template can be unified with any program expression, except for assignment expressions.

• A symbolic constant in the template can only be unified with a program constant.

• The predefined function memory : F(1) (traditionally written in array notation memory[. . . ]) can only be unified with the program function memory : F(1).

• A predefined operator in the template can only be unified with the same operator in the program.

• An external function call in the template can only be unified with the same external function call in the program.

Standard unification rules apply in all other cases: for example, an operator expression in the template unifies with an expression in the program if the program operator is the same as the template operator and the corresponding operator arguments unify. Template node 1 (A = const_addr1) can unify with program nodes 1 (eax = 0x403000) or 2 (ebx = 0x400000) but not with program nodes 8 or 9, since the expression at template node 1 has a symbolic constant as its right-hand side.

The result of the local unification step is a binding relating template variables to program expressions. The binding for the example in Figure 3 is shown in Table 1. Note that the bindings are different at various program points (at program node 1, template variable A is bound to eax, while at program node 5, template variable A is bound to (ecx - 1)). Requiring bindings to be consistent (monomorphic) seems like an intuitive approach, but leads to an overly restrictive semantics: any obfuscation attack that reassigns registers in a program fragment would evade detection. We want to check program expressions bound to the same template variable (e.g. eax and (ecx - 1) are both bound to A) and verify
they change value in the same way the template variable changes values. We employ a mechanism based on def-use chains and value preservation to answer this problem.

    Unified nodes    Bindings
    T1   P1          A ← eax,      const_addr1 ← 0x403000
    T2   P2          B ← ebx,      const_addr2 ← 0x400000
    T3   P5          A ← ecx - 1,  const_addr3 ← 0x406000
    T4   P7          A ← eax,      B ← ebx
    T5   P8          A ← eax,      increment1 ← 1
    T6   P9          B ← ebx,      increment2 ← 2
    T7   P10         const_addr2 ← 0x400000

Table 1. Bindings generated from the unification of template and program nodes in Figure 3. Notation Tn refers to node n of the template, and Pm refers to node m of the program.

Value preservation on def-use chains. The second condition of algorithm $\mathcal{A}_{MD}$ requires template variables and the corresponding program expressions to have similar update patterns (although not necessarily the same values). For each def-use chain in the template, the algorithm checks whether the value of the program expression corresponding to the template-variable use is the same as the value of the program expression corresponding to the template-variable definition. Consider Figure 3, where template nodes 1 and 3 are def-use related. Program nodes 1 and 5 map to template nodes 1 and 3, respectively. Denote by $R$ the program fragment between nodes 1 and 5, such that $R$ contains only program paths from program node 1 to node 5 that correspond to template paths between template nodes 1 and 3 (i.e., any path in $R$ has nodes that either map to template paths between 1 and 3, or have no corresponding template node). A def-use chain for template variable A connects template node 1 with template node 3: in the program, the expressions corresponding to template variable A after program node 1 and before program node 5 must be equal in value, across all paths in $R$. This condition can be expressed in terms of value predicates: for each path $I$ in program fragment $R$ (e.g. $I = \langle P2, P3, P4 \rangle$), $val_{pre\langle I \rangle}(eax) = val_{post\langle I \rangle}(ecx - 1)$, where $val_{pre\langle I \rangle}$ represents the variable-valuation function for the program state before path $I$ and $val_{post\langle I \rangle}$ is the variable-valuation function for the program state after path $I$. We can formulate this query about preserving def-use chains as a value preservation problem: given the program fragment $R$, we want to check whether it maintains the value predicate $\phi \equiv \forall I \in R \,.\, val_{pre\langle I \rangle}(eax) = val_{post\langle I \rangle}(ecx - 1)$.

The algorithm uses decision procedures to determine whether a value predicate holds. We discuss these decision procedures in Section 3.3; for now, we treat the decision procedures as oracles that can answer queries of the form "Does a program fragment $R$ maintain an invariant $\phi$?" For each match considered between a program node and a template node, the algorithm checks whether the def-use chain is preserved for each program expression corresponding to a template variable used at the matched node (see Appendix C for a listing of all the def-use chains checked for the example shown in Figure 3). This approach eliminates a large number of matches that cannot lead to a correct template assignment.

3.3. Using decision procedures to check value-preservation

A value-preservation oracle is a decision procedure that determines whether a given program fragment is a semantic nop with respect to certain program variables. Formally, given a program fragment $P$ and program expressions $e_1$, $e_2$, a decision procedure $D$ determines whether the value predicate $\phi(P, e_1, e_2) \equiv \forall I \in P \,.\, val_{pre\langle I \rangle}(e_1) = val_{post\langle I \rangle}(e_2)$ holds.

\[ D(P, e_1, e_2) = \begin{cases} \text{true} & \text{if } P \text{ is a semantic nop, i.e. } \phi(P, e_1, e_2) \text{ holds} \\ \bot & \text{otherwise} \end{cases} \]

Similarly, we can define decision procedures that determine whether $\neg\phi(P, e_1, e_2)$ holds (in this case, the result of $D(P, e_1, e_2)$ is "false" or $\bot$). We denote by $D^+$ a decision procedure for $\phi(P, e_1, e_2)$, and by $D^-$ a decision procedure for $\neg\phi(P, e_1, e_2)$.

As the value preservation queries are frequent in our algorithm (possibly at every step during node matching), the prototype uses a collection of decision procedures ordered by their precision and cost. Intuitively, the most naïve decision procedures are the least precise, but the fastest to execute. If a $D^+$-style decision procedure reports "true" on some input, all $D^+$-style decision procedures following it in the ordered collection will also report "true" on the same input. Similarly, if a $D^-$-style decision procedure reports "false" on some input, all $D^-$-style decision procedures following it in the ordered collection will also report "false" on the same input. As both $D^+$- and $D^-$-style decision procedures are sound, we define the order between $D^+$ and $D^-$ decision procedures based only on performance.

This collection of decision procedures provides us with an efficient algorithm for testing whether a program fragment $P$ preserves expression values: iterate
through the ordered collection of decision procedures, querying each $D_i$, and stop when one of them returns "true" (or "false", for $D^-$-style decision procedures). This algorithm provides for incrementally expensive and powerful checking of the program fragment, in step with its complexity: program fragments that are simple semantic nops will be detected early by decision procedures in the ordered collection. Complex value-preserving fragments will require passes through multiple decision procedures. We present, in order, four decision procedures that are part of our prototype.

Nop Library $D^+_{NOP}$. This decision procedure identifies sequences of actual nop instructions, which are processor-specific instructions similar to the skip statement in some high-level programming languages, as well as predefined instruction sequences known to be semantic nops. Based on simple pattern matching, the decision procedure annotates basic blocks as nop sequences where applicable. If the whole program fragment under analysis is annotated, then it is a semantic nop. The nop library can also act as a cache for queries already resolved by other decision procedures.

Randomized Symbolic Execution $D^-_{RE}$. This oracle is based on a $D^-$-style decision procedure using randomized execution. The program fragment is executed using a random initial state (i.e. the values in registers and memory are chosen to be random). At completion time, we check whether it is true that $\neg\phi(P, e_1, e_2)$: if true, at least one path in the program fragment is not a semantic nop, and thus the whole program fragment is not a semantic nop.

Theorem Proving $D^+_{ThSimplify}$. The value preservation problem can be formulated as a theorem to be proved, given that the program fragment has no loops. We use the Simplify theorem prover [14] to implement this oracle: the program fragment is represented as a state transformer $\delta$, using each program register and memory expression converted to SSA form. We then use Simplify to prove the formula $\delta \Rightarrow \phi(P, e_1, e_2)$, in order to show that all paths through the program fragment are semantic nops under $\phi(P, e_1, e_2)$. If Simplify confirms that the formula is valid, the program fragment is a semantic nop. One limitation of the Simplify theorem prover is the lack of bit-vector arithmetic, which binary programs are based on. Thus, we can query Simplify only on programs that do not use bit-vector operations.

Theorem Proving $D^+_{ThUCLID}$. A second theorem proving oracle is based on the UCLID infinite-state bounded model checker [22]. For our purposes, the logic supported by UCLID is a superset of that supported by Simplify. In particular, UCLID precisely models integer and bit-vector arithmetic. We model the program fragment instructions as state transformers for each register and for memory (represented as an uninterpreted function). UCLID then simulates the program fragment for a given number of steps and determines whether $\phi(P, e_1, e_2)$ holds at the end of the simulation.

For illustration, consider Figure 3: the value preservation problem consists of the program fragment $R$, created from program nodes 2, 3, and 4, and the value predicate $\phi \equiv \forall I \in R \,.\, val_{pre\langle I \rangle}(eax) = val_{post\langle I \rangle}(ecx - 1)$. To use the Simplify theorem proving oracle, the formula shown in Table 2 is generated from program fragment $R$.

    Instruction sequence:
        2: ebx = 0x400000
        3: nop
        4: ecx = eax + 1

    Value predicate:
        ∀I ∈ R . val_pre⟨I⟩(eax) = val_post⟨I⟩(ecx − 1)

    Simplify formula:
        (IMPLIES (AND (EQ ebx_1 4194304)
                      (EQ ecx_4 (+ eax_pre 1))
                      (EQ ecx_post ecx_4)
                 )
                 (EQ eax_pre (− ecx_post 1))
        )

Table 2. Example of Simplify query corresponding to program fragment R and value predicate φ.

3.4. Strengths and limitations

For the algorithm to be effective against real-life attacks, it has to "undo" various obfuscations and other program transformations that a malware writer might use. We list the strengths and weaknesses of our algorithm in Table 3. We discuss below in detail four classes of obfuscations algorithm $\mathcal{A}_{MD}$ can handle: code reordering, equivalent instruction replacement, register renaming, and garbage insertion.

    Obfuscation transformation     Handled by A_MD?
    Instruction reordering         ✓
    Register renaming              ✓
    Garbage insertion              ✓
    Instruction replacement        limited
    Equivalent functionality       ✗
    Reordered memory accesses      ✗

Table 3. Obfuscation transformations addressed by our malware detection algorithm and some limitations.

Code reordering is one of the simplest obfuscations hackers use to evade signature matching. The obfuscation changes the order of instructions on disk, in the binary image, while maintaining the execution order using jump instructions. This obfuscation is handled by the use of control flow graphs. Register renaming is a simple obfuscation that reassigns registers in selected program fragments. As a result, a signature matching tool will fail to detect any obfuscated variant as long as it searches for a signature with a specific register. Our template matching algorithm avoids this pitfall by using templates. The uninterpreted variables are assigned corresponding program registers and memory locations only during unification and, thus, the matching algorithm can identify any program with the same behavior as the template, irrespective of the register allocation. Garbage insertion obfuscates a program by inserting instruction sequences that do not change the behavior of the program. Algorithm $\mathcal{A}_{MD}$ tackles this class of obfuscations through the use of decision procedures to reason about value predicates on def-use chains. Equivalent instruction replacement uses the rich instruction set available on some architectures, notably on the Intel
IA-32 (x86), to replace groups of consecutive instructions with other short instruction sequences that have the same semantics. For example, to add 1 to register X in the x86 architecture, one can use the inc X (increment) instruction, or the add X, 1 instruction, or the sub X, -1 instruction. We handle a limited kind of instruction replacement by normalizing the code into an intermediate representation (IR) with semantically-disjoint operations.

Limitations. Our tool has a few limitations. First, the current implementation requires all of the IR instructions in the template to appear in the same form in the program. For example, if an IR node in the template contains "x = x * 2", the same operation (an assignment with a multiplication on the right hand side) has to appear in the program for that node to match. This means that we will not match the equivalent program instruction "eax = eax << 1" (arithmetic left shift). Attacks against this requirement are still possible, but are fairly hard, as the bar has been raised: the attacker has to devise multiple equivalent, yet distinct, implementations of the same computation, to evade detection. This is not an inherent limitation of the semantics, and can be handled using an IR normalization step.

The second limitation comes from the use of def-use chains for value preservation checking. The def-use relations in the malicious template effectively encode a specific ordering of memory updates; thus, our algorithm $\mathcal{A}_{MD}$ will detect only those programs that exhibit the same ordering of memory updates. We note that automatically generating functionally-equivalent variants is a hard problem. Handling obfuscations that reorder instructions to achieve a different ordering of memory updates is one goal of our ongoing research.

4. Experimental results

We evaluated our implementation of algorithm $\mathcal{A}_{MD}$ against real-world malware variants. The three major goals of our experiments were to develop malicious behavior templates that can match malware variants, to measure the false positive rates the algorithm generates with these templates, and to measure the detection algorithm's resilience to obfuscation. We used malware instances in the wild both for developing behavior templates and as a starting point for obfuscation transformations in the obfuscation resilience testing. Highlights of the evaluation are:

• The template-based algorithm detects worms from the same family, as well as unrelated worms, using a single template.

• No false positives were generated when running our malware detector on benign programs, illustrating the soundness of our algorithm in its current implementation.

• The algorithm exhibits improved resilience to obfuscation when compared to the commercial anti-virus tool McAfee VirusScan.

4.1. Variant detection evaluation

We developed two templates and tested malware samples and benign programs against them. One template captures the decryption-loop functionality common in many malicious programs, while the other template captures the mass-mailing functionality common to email worms. While throughout this section we use only the decryption-loop template as an example, the results from using the mass-mailing template were similar. For malware samples we used seven variants of Netsky (B, C, D, O, P, T, and W), seven variants of B[e]agle (I, J, N, O, P, R, and Y), and seven variants of Sober (A, C, D, E, F, G, and I). All of them are email worms, each with many variants in the wild, ranging in size from 12 kB to 75 kB.

    Malware     Template detection              Running time
    family      Decryption loop   Mass-mailer   Avg.        Std. dev.
    Netsky      100%              100%          99.57 s     41.01 s
    B[e]agle    100%              100%          56.41 s     40.72 s
    Sober       100%              0%            100.12 s    45.00 s

Table 4. Malware detection using algorithm $\mathcal{A}_{MD}$ for 21 e-mail worm instances.
The decryption-loop template describes the behavior of programs that unpack or decrypt themselves to memory: the template consists of (1) a loop that processes data from a source memory area and writes data to a destination memory area, and (2) a jump that targets the destination area. Many binary worms use this technique to make static analysis harder: the actual code of the worm is not available for inspection until runtime. To construct the template, we analyzed the Netsky.B email worm and manually extracted the code that performs the decryption and the jump. The template was then generalized from this code by replacing actual registers and memory addresses with variables. The mass-mailing template was developed in a similar manner: it [...] brary (used by Sober worm instances) is not supported.

[...] tion: programs are grouped by size, in 5 kB increments, and the disassembly and detection rates are plotted for each group. For example, the bar to the right of the 71,680 B point represents programs between 71,680 B and 76,799 B in size, and indicates that 97.78% of the programs in this group were disassembled successfully and were detected as benign, while 2.22% failed to disassemble (either because the disassembler crashed, or because it failed to group program code into functions). The running time per test case ranged from 9.12 s to 413.78 s, with an average of 165.5 s.

[Figure: "False Positive Evaluation of 2,000 Binaries"; x-axis: program size (grouped in 5 kB increments); y-axis tick: 100%.]