Dalvik Bytecode
General Design
- The machine model and calling conventions are meant to approximately imitate common real architectures and C-style calling conventions:
  - The VM is register-based, and frames are fixed in size upon creation. Each frame consists of a particular number of registers (specified by the method) as well as any adjunct data needed to execute the method, such as (but not limited to) the program counter and a reference to the .dex file that contains the method.
  - The N arguments to a method land in the last N registers of the method's invocation frame.
  - Registers are 32 bits wide. Adjacent register pairs are used for 64-bit values.
  - In terms of bitwise representation, (Object) null == (int) 0.
- The storage unit in the instruction stream is a 16-bit unsigned quantity. Some bits in some instructions are ignored / must-be-zero.
- Instructions aren't gratuitously limited to a particular type. For example, instructions that move 32-bit register values without interpretation don't have to specify whether they are moving ints or floats.
- There are separately enumerated and indexed constant pools for references to strings, types, fields, and methods.
- Bitwise literal data is represented in-line in the instruction stream.
- Because, in practice, it is uncommon for a method to need more than 16 registers, and because needing more than eight registers is reasonably common, many instructions may only address the first 16 registers. When reasonably possible, instructions allow references to up to the first 256 registers. In cases where an instruction variant isn't available to address a desired register, it is expected that the register contents get moved from the original register to a low register (before the operation) and/or moved from a low result register to a high register (after the operation).
- When installed on a running system, some instructions may be altered, changing their format, as an install-time static linking optimization. This is to allow for faster execution once linkage is known. See the associated instruction formats document for the suggested variants. The word "suggested" is used advisedly; it is not mandatory to implement these.
- Human-syntax and mnemonics:
  - Dest-then-source ordering for arguments.
  - Some opcodes have a disambiguating suffix with respect to the type(s) they operate on:
    - Type-general 32-bit opcodes are unmarked.
    - Type-general 64-bit opcodes are suffixed with -wide.
    - Type-specific opcodes are suffixed with their type (or a straightforward abbreviation), one of: -boolean -byte -char -short -int -long -float -double -object -string -class -void.
  - Some opcodes have a disambiguating suffix to distinguish otherwise-identical operations that have different instruction layouts or options. These suffixes are separated from the main names with a slash ("/") and exist mainly to provide a one-to-one mapping with static constants in the code that generates and interprets executables (that is, to reduce ambiguity for humans).
- See the instruction formats document for more details about the various instruction formats (listed under "Op & Format") as well as details about the opcode syntax.
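The rule that the N arguments land in the last N registers of the frame can be illustrated with a small sketch. The function and parameter names (`argument_registers`, `registers_size`, `ins_size`) are hypothetical and chosen only for this example; they are not taken from this document.

```python
def argument_registers(registers_size, ins_size):
    """Return the register indices holding a method's incoming arguments.

    registers_size: total number of registers in the frame (assumed name)
    ins_size: number of argument words N (assumed name)
    """
    if not 0 <= ins_size <= registers_size:
        raise ValueError("argument count must fit within the frame")
    first = registers_size - ins_size  # arguments occupy the tail of the frame
    return list(range(first, registers_size))


# A frame with 5 registers and 2 argument words: arguments are in v3 and v4.
print(argument_registers(5, 2))
```

Under this layout, the low-numbered registers stay free for locals, which fits the observation above that many instructions can only address the first 16 registers.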
Op & Format: 00 10x
Mnemonic / Syntax: nop
Arguments: (none)
Description: Waste cycles.

Op & Format: 01 12x
Mnemonic / Syntax: move vA, vB
Arguments: A: destination register (4 bits); B: source register (4 bits)
Description: Move the contents of one non-object register to another.

Op & Format: 02 22x
Mnemonic / Syntax: move/from16 vAA, vBBBB
Arguments: A: destination register (8 bits); B: source register (16 bits)
Description: Move the contents of one non-object register to another.

Op & Format: 03 32x
Mnemonic / Syntax: move/16 vAAAA, vBBBB
Arguments: A: destination register (16 bits); B: source register (16 bits)
Description: Move the contents of one non-object register to another.

Op & Format: 04 12x
Mnemonic / Syntax: move-wide vA, vB
Arguments: A: destination register pair (4 bits); B: source register pair (4 bits)
Description: Move the contents of one register-pair to another. Note: It is legal to move from vN to either vN-1 or vN+1, so implementations must arrange for both halves of a register pair to be read before anything is written.

Op & Format: 05 22x
Mnemonic / Syntax: move-wide/from16 vAA, vBBBB
Arguments: A: destination register pair (8 bits); B: source register pair (16 bits)
Description: Move the contents of one register-pair to another. Note: Implementation considerations are the same as move-wide, above.

Op & Format: 06 32x
Mnemonic / Syntax: move-wide/16 vAAAA, vBBBB
Arguments: A: destination register pair (16 bits); B: source register pair (16 bits)
Description: Move the contents of one register-pair to another. Note: Implementation considerations are the same as move-wide, above.

Op & Format: 07 12x
Mnemonic / Syntax: move-object vA, vB
Arguments: A: destination register (4 bits); B: source register (4 bits)
Description: Move the contents of one object-bearing register to another.

Op & Format: 08 22x
Mnemonic / Syntax: move-object/from16 vAA, vBBBB
Arguments: A: destination register (8 bits); B: source register (16 bits)
Description: Move the contents of one object-bearing register to another.

Op & Format: 09 32x
Mnemonic / Syntax: move-object/16 vAAAA, vBBBB
Arguments: A: destination register (16 bits); B: source register (16 bits)
Description: Move the contents of one object-bearing register to another.

Op & Format: 0a 11x
Mnemonic / Syntax: move-result vAA
Arguments: A: destination register (8 bits)
Description: Move the single-word non-object result of the most recent invoke-kind into the indicated register. This must be done as the instruction immediately after an invoke-kind whose (single-word, non-object) result is not to be ignored; anywhere else is invalid.

Op & Format: 0b 11x
Mnemonic / Syntax: move-result-wide vAA
Arguments: A: destination register pair (8 bits)
Description: Move the double-word result of the most recent invoke-kind into the indicated register pair. This must be done as the instruction immediately after an invoke-kind whose (double-word) result is not to be ignored; anywhere else is invalid.

Op & Format: 0c 11x
Mnemonic / Syntax: move-result-object vAA
Arguments: A: destination register (8 bits)
Description: Move the object result of the most recent invoke-kind into the indicated register. This must be done as the instruction immediately after an invoke-kind or filled-new-array whose (object) result is not to be ignored; anywhere else is invalid.

Op & Format: 0d 11x
Mnemonic / Syntax: move-exception vAA
Arguments: A: destination register (8 bits)
Description: Save a just-caught exception into the given register. This should be the first instruction of any exception handler whose caught exception is not to be ignored, and this instruction may only ever occur as the first instruction of an exception handler; anywhere else is invalid.

Op & Format: 0e 10x
Mnemonic / Syntax: return-void
Arguments: (none)
Description: Return from a void method.
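The move-wide note about overlapping register pairs can be made concrete with a short sketch. It models the register file as a plain Python list, which is an assumption of this example rather than anything the document prescribes; `move_wide` and `move_wide_naive` are hypothetical names.

```python
def move_wide(regs, dst, src):
    """Copy the pair (src, src+1) to (dst, dst+1), reading both halves first."""
    low, high = regs[src], regs[src + 1]  # read everything before writing
    regs[dst], regs[dst + 1] = low, high


def move_wide_naive(regs, dst, src):
    """Incorrect variant: writes the first half before reading the second."""
    regs[dst] = regs[src]
    regs[dst + 1] = regs[src + 1]  # may read a half that was just overwritten


# Moving the pair v0/v1 to v1/v2 overlaps in v1.
good = [0xAAAA, 0xBBBB, 0]
move_wide(good, 1, 0)

bad = [0xAAAA, 0xBBBB, 0]
move_wide_naive(bad, 1, 0)
print(good, bad)
```

In the naive version the high half of the source is clobbered before it is read, so the destination pair ends up holding the low half twice.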
Op & Format: 0f 11x
Mnemonic / Syntax: return vAA
Arguments: A: return value register (8 bits)
Description: Return from a single-width (32-bit) non-object value-returning method.

Op & Format: 10 11x
Mnemonic / Syntax: return-wide vAA
Arguments: A: return value register-pair (8 bits)
Description: Return from a double-width (64-bit) value-returning method.

Op & Format: 11 11x
Mnemonic / Syntax: return-object vAA
Arguments: A: return value register (8 bits)
Description: Return from an object-returning method.

Op & Format: 12 11n
Mnemonic / Syntax: const/4 vA, #+B
Arguments: A: destination register (4 bits); B: signed int (4 bits)
Description: Move the given literal value (sign-extended to 32 bits) into the specified register.

Op & Format: 13 21s
Mnemonic / Syntax: const/16 vAA, #+BBBB
Arguments: A: destination register (8 bits); B: signed int (16 bits)
Description: Move the given literal value (sign-extended to 32 bits) into the specified register.

Op & Format: 14 31i
Mnemonic / Syntax: const vAA, #+BBBBBBBB
Arguments: A: destination register (8 bits); B: arbitrary 32-bit constant
Description: Move the given literal value into the specified register.

Op & Format: 15 21h
Mnemonic / Syntax: const/high16 vAA, #+BBBB0000
Arguments: A: destination register (8 bits); B: signed int (16 bits)
Description: Move the given literal value (right-zero-extended to 32 bits) into the specified register.

Op & Format: 16 21s
Mnemonic / Syntax: const-wide/16 vAA, #+BBBB
Arguments: A: destination register (8 bits); B: signed int (16 bits)
Description: Move the given literal value (sign-extended to 64 bits) into the specified register-pair.

Op & Format: 17 31i
Mnemonic / Syntax: const-wide/32 vAA, #+BBBBBBBB
Arguments: A: destination register (8 bits); B: signed int (32 bits)
Description: Move the given literal value (sign-extended to 64 bits) into the specified register-pair.

Op & Format: 18 51l
Mnemonic / Syntax: const-wide vAA, #+BBBBBBBBBBBBBBBB
Arguments: A: destination register (8 bits); B: arbitrary double-width (64-bit) constant
Description: Move the given literal value into the specified register-pair.

Op & Format: 19 21h
Mnemonic / Syntax: const-wide/high16 vAA, #+BBBB000000000000
Arguments: A: destination register (8 bits); B: signed int (16 bits)
Description: Move the given literal value (right-zero-extended to 64 bits) into the specified register-pair.

Op & Format: 1a 21c
Mnemonic / Syntax: const-string vAA, string@BBBB
Arguments: A: destination register (8 bits); B: string index
Description: Move a reference to the string specified by the given index into the specified register.

Op & Format: 1b 31c
Mnemonic / Syntax: const-string/jumbo vAA, string@BBBBBBBB
Arguments: A: destination register (8 bits); B: string index
Description: Move a reference to the string specified by the given index into the specified register.

Op & Format: 1c 21c
Mnemonic / Syntax: const-class vAA, type@BBBB
Arguments: A: destination register (8 bits); B: type index
Description: Move a reference to the class specified by the given index into the specified register. In the case where the indicated type is primitive, this will store a reference to the primitive type's degenerate class.

Op & Format: 1d 11x
Mnemonic / Syntax: monitor-enter vAA
Arguments: A: reference-bearing register (8 bits)
Description: Acquire the monitor for the indicated object.

Op & Format: 1e 11x
Mnemonic / Syntax: monitor-exit vAA
Arguments: A: reference-bearing register (8 bits)
Description: Release the monitor for the indicated object. Note: If this instruction needs to throw an exception, it must do so as if the pc has already advanced past the instruction. It may be useful to think of this as the instruction successfully executing (in a sense), and the exception getting thrown after the instruction but before the next one gets a chance to run. This definition makes it possible for a method to use a monitor cleanup catch-all (e.g., finally) block as the monitor cleanup for that block itself, as a way to handle the arbitrary exceptions that might get thrown due to the historical implementation of Thread.stop(), while still managing to have proper monitor hygiene.

Op & Format: 1f 21c
Mnemonic / Syntax: check-cast vAA, type@BBBB
Arguments: A: reference-bearing register (8 bits); B: type index (16 bits)
Description: Throw if the reference in the given register cannot be cast to the indicated type. The type must be a reference type (not a primitive type).

Op & Format: 20 22c
Mnemonic / Syntax: instance-of vA, vB, type@CCCC
Arguments: A: destination register (4 bits); B: reference-bearing register (4 bits); C: type index (16 bits)
Description: Store in the given destination register 1 if the indicated reference is an instance of the given type, or 0 if not. The type must be a reference type (not a primitive type).
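The difference between the sign-extended and right-zero-extended const variants can be sketched as follows. The helper names are hypothetical, and results are shown as raw unsigned bit patterns purely for readability.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a twos-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def const_4(b):
    """const/4: 4-bit literal, sign-extended to 32 bits (raw bit pattern)."""
    return sign_extend(b, 4) & 0xFFFFFFFF


def const_high16(bbbb):
    """const/high16: 16-bit literal, right-zero-extended to 32 bits."""
    return (bbbb & 0xFFFF) << 16


def const_wide_high16(bbbb):
    """const-wide/high16: 16-bit literal, right-zero-extended to 64 bits."""
    return (bbbb & 0xFFFF) << 48


print(hex(const_4(0xF)), hex(const_high16(0x3F80)), hex(const_wide_high16(0x3FF0)))
```

The high16 forms are convenient for constants whose low bits are all zero; for instance, 0x3F800000 is the IEEE 754 single-precision bit pattern of 1.0.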
Op & Format: 21 12x
Mnemonic / Syntax: array-length vA, vB
Arguments: A: destination register (4 bits); B: array reference-bearing register (4 bits)
Description: Store in the given destination register the length of the indicated array, in entries.

Op & Format: 22 21c
Mnemonic / Syntax: new-instance vAA, type@BBBB
Arguments: A: destination register (8 bits); B: type index
Description: Construct a new instance of the indicated type, storing a reference to it in the destination. The type must refer to a non-array class.

Op & Format: 23 22c
Mnemonic / Syntax: new-array vA, vB, type@CCCC
Arguments: A: destination register (4 bits); B: size register; C: type index
Description: Construct a new array of the indicated type and size. The type must be an array type.

Op & Format: 24 35c
Mnemonic / Syntax: filled-new-array {vD, vE, vF, vG, vA}, type@CCCC
Arguments: B: array size and argument word count (4 bits); C: type index (16 bits); D..G, A: argument registers (4 bits each)
Description: Construct an array of the given type and size, filling it with the supplied contents. The type must be an array type. The array's contents must be single-word (that is, no arrays of long or double). The constructed instance is stored as a "result" in the same way that the method invocation instructions store their results, so the constructed instance must be moved to a register with a subsequent move-result-object instruction (if it is to be used).

Op & Format: 25 3rc
Mnemonic / Syntax: filled-new-array/range {vCCCC .. vNNNN}, type@BBBB
Arguments: A: array size and argument word count (8 bits); B: type index (16 bits); C: first argument register (16 bits); N = A + C - 1
Description: Construct an array of the given type and size, filling it with the supplied contents. Clarifications and restrictions are the same as filled-new-array, described above.

Op & Format: 26 31t
Mnemonic / Syntax: fill-array-data vAA, +BBBBBBBB (with supplemental data as specified below in "fill-array-data Format")
Arguments: A: array reference (8 bits); B: signed "branch" offset to table data (32 bits)
Description: Fill the given array with the indicated data. The reference must be to an array of primitives, and the data table must match it in type and size. Note: The address of the table is guaranteed to be even (that is, 4-byte aligned). If the code size of the method is otherwise odd, then an extra code unit is inserted between the main code and the table whose value is the same as a nop.

Op & Format: 27 11x
Mnemonic / Syntax: throw vAA
Arguments: A: exception-bearing register (8 bits)
Description: Throw the indicated exception.

Op & Format: 28 10t
Mnemonic / Syntax: goto +AA
Arguments: A: signed branch offset (8 bits)
Description: Unconditionally jump to the indicated instruction. Note: The branch offset may not be 0. (A spin loop may be legally constructed either with goto/32 or by including a nop as a target before the branch.)

Op & Format: 29 20t
Mnemonic / Syntax: goto/16 +AAAA
Arguments: A: signed branch offset (16 bits)
Description: Unconditionally jump to the indicated instruction. Note: The branch offset may not be 0. (A spin loop may be legally constructed either with goto/32 or by including a nop as a target before the branch.)

Op & Format: 2a 30t
Mnemonic / Syntax: goto/32 +AAAAAAAA
Arguments: A: signed branch offset (32 bits)
Description: Unconditionally jump to the indicated instruction.

Op & Format: 2b 31t
Mnemonic / Syntax: packed-switch vAA, +BBBBBBBB (with supplemental data as specified below in "packed-switch Format")
Arguments: A: register to test; B: signed "branch" offset to table data (32 bits)
Description: Jump to a new instruction based on the value in the given register, using a table of offsets corresponding to each value in a particular integral range, or fall through to the next instruction if there is no match. Note: The address of the table is guaranteed to be even (that is, 4-byte aligned). If the code size of the method is otherwise odd, then an extra code unit is inserted between the main code and the table whose value is the same as a nop.
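The alignment note for data tables (an extra nop-valued code unit when the main code has an odd number of code units) might be sketched like this. The function name and the list-of-integers model of the instruction stream are assumptions of the example; the padding value 0x0000 follows from nop being opcode 00.

```python
NOP_UNIT = 0x0000  # a code unit whose value is the same as a nop (opcode 00)


def append_table(code_units, table_units):
    """Append a data table after the main code at an even code-unit offset.

    Returns the combined list of 16-bit code units and the table's offset.
    """
    out = list(code_units)
    if len(out) % 2 == 1:      # odd code size: insert one padding unit
        out.append(NOP_UNIT)
    table_offset = len(out)    # even, so the table is 4-byte aligned
    out.extend(table_units)
    return out, table_offset


units, offset = append_table([0x0112, 0x000F, 0x0E00], [0x0100, 0x0000])
print(offset, [hex(u) for u in units])
```

Because each code unit is 16 bits, an even code-unit offset corresponds to a byte offset that is a multiple of four.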
Op & Format: 2c 31t
Mnemonic / Syntax: sparse-switch vAA, +BBBBBBBB (with supplemental data as specified below in "sparse-switch Format")
Arguments: A: register to test; B: signed "branch" offset to table data (32 bits)
Description: Jump to a new instruction based on the value in the given register, using an ordered table of value-offset pairs, or fall through to the next instruction if there is no match. Note: Alignment and padding considerations are identical to packed-switch, above.

Op & Format: 2d..31 23x
Mnemonic / Syntax: cmpkind vAA, vBB, vCC
    2d: cmpl-float (lt bias)
    2e: cmpg-float (gt bias)
    2f: cmpl-double (lt bias)
    30: cmpg-double (gt bias)
    31: cmp-long
Arguments: A: destination register (8 bits); B: first source register or pair; C: second source register or pair
Description: Perform the indicated floating point or long comparison, storing 0 if the two sources are equal, 1 if the first source is larger, or -1 if the second source is larger. The "bias" listed for the floating point operations indicates how NaN comparisons are treated: "gt bias" instructions return 1 for NaN comparisons, and "lt bias" instructions return -1. For example, to check whether floating point a < b, it is advisable to use cmpg-float; a result of -1 indicates that the test was true, and the other values indicate it was false, either due to a valid comparison or because one or the other value was NaN.

Op & Format: 32..37 22t
Mnemonic / Syntax: if-test vA, vB, +CCCC
    32: if-eq
    33: if-ne
    34: if-lt
    35: if-ge
    36: if-gt
    37: if-le
Arguments: A: first register to test (4 bits); B: second register to test (4 bits); C: signed branch offset (16 bits)
Description: Branch to the given destination if the given two registers' values compare as specified. Note: The branch offset may not be 0. (A spin loop may be legally constructed either by branching around a backward goto or by including a nop as a target before the branch.)

Op & Format: 38..3d 21t
Mnemonic / Syntax: if-testz vAA, +BBBB
    38: if-eqz
    39: if-nez
    3a: if-ltz
    3b: if-gez
    3c: if-gtz
    3d: if-lez
Arguments: A: register to test (8 bits); B: signed branch offset (16 bits)
Description: Branch to the given destination if the given register's value compares with 0 as specified. Note: The branch offset may not be 0. (A spin loop may be legally constructed either by branching around a backward goto or by including a nop as a target before the branch.)

Op & Format: 3e..43 10x
Mnemonic / Syntax: (unused)
Description: (unused)

Op & Format: 44..51 23x
Mnemonic / Syntax: arrayop vAA, vBB, vCC
    44: aget
    45: aget-wide
    46: aget-object
    47: aget-boolean
    48: aget-byte
    49: aget-char
    4a: aget-short
    4b: aput
    4c: aput-wide
    4d: aput-object
    4e: aput-boolean
    4f: aput-byte
    50: aput-char
    51: aput-short
Arguments: A: value register or pair; may be source or dest (8 bits); B: array register (8 bits); C: index register (8 bits)
Description: Perform the identified array operation at the identified index of the given array, loading or storing into the value register.

Op & Format: 52..5f 22c
Mnemonic / Syntax: iinstanceop vA, vB, field@CCCC
    52: iget
    53: iget-wide
    54: iget-object
    55: iget-boolean
    56: iget-byte
    57: iget-char
    58: iget-short
    59: iput
    5a: iput-wide
    5b: iput-object
    5c: iput-boolean
    5d: iput-byte
    5e: iput-char
    5f: iput-short
Arguments: A: value register or pair; may be source or dest (4 bits); B: object register (4 bits); C: instance field reference index (16 bits)
Description: Perform the identified object instance field operation with the identified field, loading or storing into the value register. Note: These opcodes are reasonable candidates for static linking, altering the field argument to be a more direct offset.
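The NaN bias of the cmpkind family can be illustrated with a short sketch. It follows the cmpg-float example in the description, where a result of -1 means the first source is smaller than the second; the function names are hypothetical stand-ins for the opcodes, and Python floats stand in for both float and double.

```python
import math


def _compare(b, c):
    """Return 1 if b > c, -1 if b < c, and 0 if they are equal."""
    return (b > c) - (b < c)


def cmp_long(b, c):
    return _compare(b, c)


def cmpl_float(b, c):
    """lt bias: any comparison involving NaN yields -1."""
    if math.isnan(b) or math.isnan(c):
        return -1
    return _compare(b, c)


def cmpg_float(b, c):
    """gt bias: any comparison involving NaN yields 1."""
    if math.isnan(b) or math.isnan(c):
        return 1
    return _compare(b, c)


def less_than(a, b):
    """Test a < b the way the description advises: cmpg-float, then check for -1."""
    return cmpg_float(a, b) == -1


print(less_than(1.0, 2.0), less_than(float("nan"), 2.0))
```

With the gt-bias variant, a NaN operand can never produce -1, so the "a < b" test is false for NaN without a separate check.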
Op & Format: 60..6d 21c
Mnemonic / Syntax: sstaticop vAA, field@BBBB
    60: sget
    61: sget-wide
    62: sget-object
    63: sget-boolean
    64: sget-byte
    65: sget-char
    66: sget-short
    67: sput
    68: sput-wide
    69: sput-object
    6a: sput-boolean
    6b: sput-byte
    6c: sput-char
    6d: sput-short
Arguments: A: value register or pair; may be source or dest (8 bits); B: static field reference index (16 bits)
Description: Perform the identified object static field operation with the identified static field, loading or storing into the value register. Note: These opcodes are reasonable candidates for static linking, altering the field argument to be a more direct offset.

Op & Format: 6e..72 35c
Mnemonic / Syntax: invoke-kind {vD, vE, vF, vG, vA}, meth@CCCC
    6e: invoke-virtual
    6f: invoke-super
    70: invoke-direct
    71: invoke-static
    72: invoke-interface
Arguments: B: argument word count (4 bits); C: method index (16 bits); D..G, A: argument registers (4 bits each)
Description: Call the indicated method. The result (if any) may be stored with an appropriate move-result* variant as the immediately subsequent instruction.
    invoke-virtual is used to invoke a normal virtual method (a method that is not static or final, and is not a constructor).
    invoke-super is used to invoke the closest superclass's virtual method (as opposed to the one with the same method_id in the calling class).
    invoke-direct is used to invoke a non-static direct method (that is, an instance method that is by its nature non-overridable, namely either a private instance method or a constructor).
    invoke-static is used to invoke a static method (which is always considered a direct method).
    invoke-interface is used to invoke an interface method, that is, on an object whose concrete class isn't known, using a method_id that refers to an interface.
    Note: These opcodes are reasonable candidates for static linking, altering the method argument to be a more direct offset (or pair thereof).

Op & Format: 73 10x
Mnemonic / Syntax: (unused)
Description: (unused)

Op & Format: 74..78 3rc
Mnemonic / Syntax: invoke-kind/range {vCCCC .. vNNNN}, meth@BBBB
    74: invoke-virtual/range
    75: invoke-super/range
    76: invoke-direct/range
    77: invoke-static/range
    78: invoke-interface/range
Arguments: A: argument word count (8 bits); B: method index (16 bits); C: first argument register (16 bits); N = A + C - 1
Description: Call the indicated method. See first invoke-kind description above for details, caveats, and suggestions.

Op & Format: 79..7a 10x
Mnemonic / Syntax: (unused)
Description: (unused)

Op & Format: 7b..8f 12x
Mnemonic / Syntax: unop vA, vB
    7b: neg-int
    7c: not-int
    7d: neg-long
    7e: not-long
    7f: neg-float
    80: neg-double
    81: int-to-long
    82: int-to-float
    83: int-to-double
    84: long-to-int
    85: long-to-float
    86: long-to-double
    87: float-to-int
    88: float-to-long
    89: float-to-double
    8a: double-to-int
    8b: double-to-long
    8c: double-to-float
    8d: int-to-byte
    8e: int-to-char
    8f: int-to-short
Arguments: A: destination register or pair (4 bits); B: source register or pair (4 bits)
Description: Perform the identified unary operation on the source register, storing the result in the destination register.
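The register range used by the /range instruction forms, where N = A + C - 1, can be sketched as a small helper. The name `range_registers` is hypothetical and exists only for this illustration.

```python
def range_registers(count, first):
    """List the registers vCCCC..vNNNN for a /range instruction.

    count: argument word count (the A field)
    first: first argument register (the C field)
    """
    if count == 0:
        return []
    last = count + first - 1          # N = A + C - 1
    return ["v%d" % r for r in range(first, last + 1)]


# Three argument words starting at v4 cover v4, v5, and v6.
print(range_registers(3, 4))
```

Unlike the five-register form, the range form requires the argument registers to be contiguous, which is why only the first register and the count are encoded.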
Op & Format: 90..af 23x
Mnemonic / Syntax: binop vAA, vBB, vCC
    90: add-int
    91: sub-int
    92: mul-int
    93: div-int
    94: rem-int
    95: and-int
    96: or-int
    97: xor-int
    98: shl-int
    99: shr-int
    9a: ushr-int
    9b: add-long
    9c: sub-long
    9d: mul-long
    9e: div-long
    9f: rem-long
    a0: and-long
    a1: or-long
    a2: xor-long
    a3: shl-long
    a4: shr-long
    a5: ushr-long
    a6: add-float
    a7: sub-float
    a8: mul-float
    a9: div-float
    aa: rem-float
    ab: add-double
    ac: sub-double
    ad: mul-double
    ae: div-double
    af: rem-double
Arguments: A: destination register or pair (8 bits); B: first source register or pair (8 bits); C: second source register or pair (8 bits)
Description: Perform the identified binary operation on the two source registers, storing the result in the destination register.

Op & Format: b0..cf 12x
Mnemonic / Syntax: binop/2addr vA, vB
    b0: add-int/2addr
    b1: sub-int/2addr
    b2: mul-int/2addr
    b3: div-int/2addr
    b4: rem-int/2addr
    b5: and-int/2addr
    b6: or-int/2addr
    b7: xor-int/2addr
    b8: shl-int/2addr
    b9: shr-int/2addr
    ba: ushr-int/2addr
    bb: add-long/2addr
    bc: sub-long/2addr
    bd: mul-long/2addr
    be: div-long/2addr
    bf: rem-long/2addr
    c0: and-long/2addr
    c1: or-long/2addr
    c2: xor-long/2addr
    c3: shl-long/2addr
    c4: shr-long/2addr
    c5: ushr-long/2addr
    c6: add-float/2addr
    c7: sub-float/2addr
    c8: mul-float/2addr
    c9: div-float/2addr
    ca: rem-float/2addr
    cb: add-double/2addr
    cc: sub-double/2addr
    cd: mul-double/2addr
    ce: div-double/2addr
    cf: rem-double/2addr
Arguments: A: destination and first source register or pair (4 bits); B: second source register or pair (4 bits)
Description: Perform the identified binary operation on the two source registers, storing the result in the first source register.

Op & Format: d0..d7 22s
Mnemonic / Syntax: binop/lit16 vA, vB, #+CCCC
    d0: add-int/lit16
    d1: rsub-int (reverse subtract)
    d2: mul-int/lit16
    d3: div-int/lit16
    d4: rem-int/lit16
    d5: and-int/lit16
    d6: or-int/lit16
    d7: xor-int/lit16
Arguments: A: destination register (4 bits); B: source register (4 bits); C: signed int constant (16 bits)
Description: Perform the indicated binary op on the indicated register (first argument) and literal value (second argument), storing the result in the destination register. Note: rsub-int does not have a suffix since this version is the main opcode of its family. Also, see below for details on its semantics.

Op & Format: d8..e2 22b
Mnemonic / Syntax: binop/lit8 vAA, vBB, #+CC
    d8: add-int/lit8
    d9: rsub-int/lit8
    da: mul-int/lit8
    db: div-int/lit8
    dc: rem-int/lit8
    dd: and-int/lit8
    de: or-int/lit8
    df: xor-int/lit8
    e0: shl-int/lit8
    e1: shr-int/lit8
    e2: ushr-int/lit8
Arguments: A: destination register (8 bits); B: source register (8 bits); C: signed int constant (8 bits)
Description: Perform the indicated binary op on the indicated register (first argument) and literal value (second argument), storing the result in the destination register. Note: See below for details on the semantics of rsub-int.

Op & Format: e3..ff 10x
Mnemonic / Syntax: (unused)
Description: (unused)
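The literal forms, and in particular the reversed operand order of rsub-int, can be sketched as follows. The helpers are hypothetical names; `to_int32` models 32-bit twos-complement wraparound, which Python integers do not do on their own.

```python
def to_int32(x):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def add_int_lit(reg, literal):
    """add-int/lit16 or add-int/lit8: register plus literal."""
    return to_int32(reg + literal)


def rsub_int(reg, literal):
    """rsub-int: reverse subtract, that is, literal minus register."""
    return to_int32(literal - reg)


print(rsub_int(3, 10), add_int_lit(0x7FFFFFFF, 1))
```

A reverse subtract with literal 0 negates the register, and like ordinary negation it wraps for the most negative int32 value.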
packed-switch Format
Name: ident
Format: ushort = 0x0100
Description: identifying pseudo-opcode

Name: size
Format: ushort
Description: number of entries in the table

Name: first_key
Format: int
Description: first (and lowest) switch case value

Name: targets
Format: int[]
Description: list of size relative branch targets. The targets are relative to the address of the switch opcode, not of this table.

Note: The total number of code units for an instance of this table is (size * 2) + 4.
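A packed-switch lookup over this table can be sketched as below. The function names are hypothetical; the table is modeled as a `first_key` plus a Python list of `targets`, and `None` stands for falling through to the next instruction.

```python
def packed_switch_target(first_key, targets, value):
    """Return the relative branch target for value, or None to fall through."""
    index = value - first_key          # keys are consecutive from first_key
    if 0 <= index < len(targets):
        return targets[index]
    return None


def packed_switch_code_units(size):
    """Total code units for a table with `size` entries: (size * 2) + 4."""
    return size * 2 + 4


# Cases 10, 11, 12 branch to offsets +5, +9, +13 from the switch opcode.
print(packed_switch_target(10, [5, 9, 13], 11), packed_switch_code_units(3))
```

The constant 4 accounts for ident and size (one code unit each) plus the 32-bit first_key (two code units); each 32-bit target takes two code units.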
sparse-switch Format
Name: ident
Format: ushort = 0x0200
Description: identifying pseudo-opcode

Name: size
Format: ushort
Description: number of entries in the table

Name: keys
Format: int[]
Description: list of size key values, sorted low-to-high

Name: targets
Format: int[]
Description: list of size relative branch targets, each corresponding to the key value at the same index. The targets are relative to the address of the switch opcode, not of this table.

Note: The total number of code units for an instance of this table is (size * 4) + 2.
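Because the keys are sorted low-to-high, a sparse-switch lookup can use binary search. The sketch below uses Python's standard `bisect` module; the function names are hypothetical, and `None` stands for falling through. Binary search is one reasonable choice here, not something this document mandates.

```python
from bisect import bisect_left


def sparse_switch_target(keys, targets, value):
    """Return the relative branch target paired with value, or None."""
    i = bisect_left(keys, value)       # keys must be sorted low-to-high
    if i < len(keys) and keys[i] == value:
        return targets[i]
    return None


def sparse_switch_code_units(size):
    """Total code units for a table with `size` entries: (size * 4) + 2."""
    return size * 4 + 2


print(sparse_switch_target([-5, 0, 100], [8, 12, 20], 100), sparse_switch_code_units(3))
```

Each entry costs four code units (a 32-bit key plus a 32-bit target), and the remaining two are ident and size.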
fill-array-data Format
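Name: ident
Format: ushort = 0x0300
Description: identifying pseudo-opcode

Name: element_width
Format: ushort
Description: number of bytes in each element

Name: size
Format: uint
Description: number of elements in the table

Name: data
Format: ubyte[]
Description: data values

Note: The total number of code units for an instance of this table is (size * element_width + 1) / 2 + 4.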
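The size formula for a fill-array-data table uses integer division, so an odd number of data bytes is rounded up to a whole code unit. A minimal sketch, with a hypothetical function name:

```python
def fill_array_data_code_units(size, element_width):
    """Total 16-bit code units: (size * element_width + 1) / 2 + 4.

    The division is integer division; adding 1 first rounds an odd byte
    count up to the next whole code unit.
    """
    data_bytes = size * element_width
    return (data_bytes + 1) // 2 + 4


# Three 1-byte elements need 2 code units of data plus 4 units of header.
print(fill_array_data_code_units(3, 1))
# Two 4-byte elements need 4 code units of data plus 4 units of header.
print(fill_array_data_code_units(2, 4))
```

The constant 4 is the fixed part of the table that precedes the data values.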
Opcode: int-to-float
C Semantics: int32 a; float result = (float) a;
Notes: Conversion of int32 to float, using round-to-nearest. This loses precision for some values.

Opcode: int-to-double
C Semantics: int32 a; double result = (double) a;
Notes: Conversion of int32 to double.

Opcode: long-to-int
C Semantics: int64 a; int32 result = (int32) a;
Notes: Truncation of int64 into int32.

Opcode: long-to-float
C Semantics: int64 a; float result = (float) a;
Notes: Conversion of int64 to float, using round-to-nearest. This loses precision for some values.

Opcode: long-to-double
C Semantics: int64 a; double result = (double) a;
Notes: Conversion of int64 to double, using round-to-nearest. This loses precision for some values.

Opcode: float-to-int
C Semantics: float a; int32 result = (int32) a;
Notes: Conversion of float to int32, using round-toward-zero. NaN and -0.0 (negative zero) convert to the integer 0. Infinities and values with too large a magnitude to be represented get converted to either 0x7fffffff or -0x80000000 depending on sign.

Opcode: float-to-long
C Semantics: float a; int64 result = (int64) a;
Notes: Conversion of float to int64, using round-toward-zero. The same special case rules as for float-to-int apply here, except that out-of-range values get converted to either 0x7fffffffffffffff or -0x8000000000000000 depending on sign.

Opcode: float-to-double
C Semantics: float a; double result = (double) a;
Notes: Conversion of float to double, preserving the value exactly.

Opcode: double-to-int
C Semantics: double a; int32 result = (int32) a;
Notes: Conversion of double to int32, using round-toward-zero. The same special case rules as for float-to-int apply here.

Opcode: double-to-long
C Semantics: double a; int64 result = (int64) a;
Notes: Conversion of double to int64, using round-toward-zero. The same special case rules as for float-to-long apply here.

Opcode: double-to-float
C Semantics: double a; float result = (float) a;
Notes: Conversion of double to float, using round-to-nearest. This loses precision for some values.

Opcode: int-to-byte
C Semantics: int32 a; int32 result = (a << 24) >> 24;
Notes: Truncation of int32 to int8, sign extending the result.

Opcode: int-to-char
C Semantics: int32 a; int32 result = a & 0xffff;
Notes: Truncation of int32 to uint16, without sign extension.

Opcode: int-to-short
C Semantics: int32 a; int32 result = (a << 16) >> 16;
Notes: Truncation of int32 to int16, sign extending the result.

Opcode: add-int
C Semantics: int32 a, b; int32 result = a + b;
Notes: Twos-complement addition.

Opcode: sub-int
C Semantics: int32 a, b; int32 result = a - b;
Notes: Twos-complement subtraction.

Opcode: rsub-int
C Semantics: int32 a, b; int32 result = b - a;
Notes: Twos-complement reverse subtraction.

Opcode: mul-int
C Semantics: int32 a, b; int32 result = a * b;
Notes: Twos-complement multiplication.

Opcode: div-int
C Semantics: int32 a, b; int32 result = a / b;
Notes: Twos-complement division, rounded towards zero (that is, truncated to integer). This throws ArithmeticException if b == 0.

Opcode: rem-int
C Semantics: int32 a, b; int32 result = a % b;
Notes: Twos-complement remainder after division. The sign of the result is the same as that of a, and it is more precisely defined as result == a - (a / b) * b. This throws ArithmeticException if b == 0.

Opcode: and-int
C Semantics: int32 a, b; int32 result = a & b;
Notes: Bitwise AND.

Opcode: or-int
C Semantics: int32 a, b; int32 result = a | b;
Notes: Bitwise OR.

Opcode: xor-int
C Semantics: int32 a, b; int32 result = a ^ b;
Notes: Bitwise XOR.

Opcode: shl-int
C Semantics: int32 a, b; int32 result = a << (b & 0x1f);
Notes: Bitwise shift left (with masked argument).

Opcode: shr-int
C Semantics: int32 a, b; int32 result = a >> (b & 0x1f);
Notes: Bitwise signed shift right (with masked argument).

Opcode: ushr-int
C Semantics: uint32 a, b; int32 result = a >> (b & 0x1f);
Notes: Bitwise unsigned shift right (with masked argument).
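The special cases of float-to-int and the narrowing conversions can be sketched in Python, which needs explicit clamping and masking because its integers are unbounded. The function names mirror the opcodes but are hypothetical, and a Python float (a double) stands in for the source value.

```python
import math

INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000


def float_to_int(a):
    """Round toward zero; NaN gives 0; out-of-range values clamp by sign."""
    if math.isnan(a):
        return 0
    if a >= INT32_MAX:
        return INT32_MAX
    if a <= INT32_MIN:
        return INT32_MIN
    return math.trunc(a)               # -0.0 truncates to the integer 0


def int_to_byte(a):
    """Equivalent of (a << 24) >> 24 on int32: keep 8 bits, sign extend."""
    return ((a & 0xFF) ^ 0x80) - 0x80


def int_to_char(a):
    """Equivalent of a & 0xffff: keep 16 bits, no sign extension."""
    return a & 0xFFFF


def int_to_short(a):
    """Equivalent of (a << 16) >> 16 on int32: keep 16 bits, sign extend."""
    return ((a & 0xFFFF) ^ 0x8000) - 0x8000


print(float_to_int(-2.7), float_to_int(float("inf")), int_to_byte(0x1FF), int_to_char(-1))
```

The xor-and-subtract idiom reproduces the arithmetic shift pair from the C semantics without relying on a fixed integer width.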
Opcode: add-long
C Semantics: int64 a, b; int64 result = a + b;
Notes: Twos-complement addition.

Opcode: sub-long
C Semantics: int64 a, b; int64 result = a - b;
Notes: Twos-complement subtraction.

Opcode: mul-long
C Semantics: int64 a, b; int64 result = a * b;
Notes: Twos-complement multiplication.

Opcode: div-long
C Semantics: int64 a, b; int64 result = a / b;
Notes: Twos-complement division, rounded towards zero (that is, truncated to integer). This throws ArithmeticException if b == 0.

Opcode: rem-long
C Semantics: int64 a, b; int64 result = a % b;
Notes: Twos-complement remainder after division. The sign of the result is the same as that of a, and it is more precisely defined as result == a - (a / b) * b. This throws ArithmeticException if b == 0.

Opcode: and-long
C Semantics: int64 a, b; int64 result = a & b;
Notes: Bitwise AND.

Opcode: or-long
C Semantics: int64 a, b; int64 result = a | b;
Notes: Bitwise OR.

Opcode: xor-long
C Semantics: int64 a, b; int64 result = a ^ b;
Notes: Bitwise XOR.

Opcode: shl-long
C Semantics: int64 a, b; int64 result = a << (b & 0x3f);
Notes: Bitwise shift left (with masked argument).

Opcode: shr-long
C Semantics: int64 a, b; int64 result = a >> (b & 0x3f);
Notes: Bitwise signed shift right (with masked argument).

Opcode: ushr-long
C Semantics: uint64 a, b; int64 result = a >> (b & 0x3f);
Notes: Bitwise unsigned shift right (with masked argument).

Opcode: add-float
C Semantics: float a, b; float result = a + b;
Notes: Floating point addition.

Opcode: sub-float
C Semantics: float a, b; float result = a - b;
Notes: Floating point subtraction.

Opcode: mul-float
C Semantics: float a, b; float result = a * b;
Notes: Floating point multiplication.

Opcode: div-float
C Semantics: float a, b; float result = a / b;
Notes: Floating point division.

Opcode: rem-float
C Semantics: float a, b; float result = a % b;
Notes: Floating point remainder after division. This function is different than IEEE 754 remainder and is defined as result == a - roundTowardZero(a / b) * b.

Opcode: add-double
C Semantics: double a, b; double result = a + b;
Notes: Floating point addition.

Opcode: sub-double
C Semantics: double a, b; double result = a - b;
Notes: Floating point subtraction.

Opcode: mul-double
C Semantics: double a, b; double result = a * b;
Notes: Floating point multiplication.

Opcode: div-double
C Semantics: double a, b; double result = a / b;
Notes: Floating point division.

Opcode: rem-double
C Semantics: double a, b; double result = a % b;
Notes: Floating point remainder after division. This function is different than IEEE 754 remainder and is defined as result == a - roundTowardZero(a / b) * b.
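Truncating division and remainder differ from Python's built-in `//` and `%`, which round toward negative infinity, so a sketch has to spell them out. The function names are hypothetical; Python's `ArithmeticError` stands in for ArithmeticException; 64-bit overflow wrapping is not modeled; and `rem_double` assumes finite operands with a nonzero divisor, since `math.fmod` raises an error outside that domain.

```python
import math


def div_long(a, b):
    """Division rounded toward zero; raises if b == 0."""
    if b == 0:
        raise ArithmeticError("divide by zero")
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def rem_long(a, b):
    """Remainder defined as a - (a / b) * b, taking the sign of a."""
    return a - div_long(a, b) * b


def rem_double(a, b):
    """a - roundTowardZero(a / b) * b, which is what math.fmod computes."""
    return math.fmod(a, b)


print(div_long(-7, 2), rem_long(-7, 2), rem_double(-7.5, 2.0))
# For comparison, the IEEE 754 style remainder rounds the quotient to nearest.
print(math.remainder(-7.5, 2.0))
```

The last line shows why the notes distinguish this remainder from the IEEE 754 one: the two give different results whenever the quotient's fractional part exceeds one half.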