The dis module supports the analysis of CPython bytecode by disassembling it. The
CPython bytecode which this module takes as an input is defined in the
file Include/opcode.h and used by the compiler and the interpreter.
def myfunc(alist):
    return len(alist)
>>> dis.dis(myfunc)
  2           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (alist)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE
class dis.Bytecode(x, *, first_line=None, current_offset=None)
This is a convenience wrapper around many of the functions listed below, most
notably get_instructions(), as iterating over a Bytecode instance yields the bytecode
operations as Instruction instances.
If first_line is not None, it indicates the line number that should be reported for
the first source line in the disassembled code. Otherwise, the source line
information (if any) is taken directly from the disassembled code object.
classmethod from_traceback(tb)
Construct a Bytecode instance from the given traceback, setting current_offset to
the instruction responsible for the exception.
codeobj
The compiled code object.
first_line
The first source line of the code object (if available).
dis()
Return a formatted view of the bytecode operations (the same as printed
by dis.dis(), but returned as a multi-line string).
info()
Return a formatted multi-line string with detailed information about the code
object, like code_info().
Example:
>>> bytecode = dis.Bytecode(myfunc)
>>> for instr in bytecode:
...     print(instr.opname)
...
LOAD_GLOBAL
LOAD_FAST
CALL_FUNCTION
RETURN_VALUE
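As a small sketch of the Bytecode API described above, the following collects the instruction names and the two formatted views as strings. The function add_one is a hypothetical stand-in, and the exact opcode names it produces differ between CPython releases.

```python
import dis


def add_one(value):
    # Hypothetical function used only as something to disassemble.
    return value + 1


bytecode = dis.Bytecode(add_one)

# Iterating over a Bytecode instance yields Instruction objects.
opnames = [instr.opname for instr in bytecode]

# dis() and info() return multi-line strings instead of printing.
listing = bytecode.dis()
summary = bytecode.info()

print(opnames)
print(listing)
print(summary)
```

Because both views are returned rather than printed, they can be logged or compared in tests without redirecting sys.stdout.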
dis.code_info(x)
Return a formatted multi-line string with detailed code object information for the
supplied function, method, source code string or code object.
Note that the exact contents of code info strings are highly implementation
dependent and they may change arbitrarily across Python VMs or Python
releases.
dis.show_code(x, *, file=None)
Print detailed code object information for the supplied function, method, source
code string or code object to file (or sys.stdout if file is not specified).
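The relationship between code_info() and show_code() can be illustrated with a short sketch. The source string below is an arbitrary example, and the contents of the report are implementation dependent, as noted above.

```python
import dis
import io

# An arbitrary source code string; dis compiles it before inspecting it.
source = "total = sum(range(10))"

# code_info() returns the report as a string.
info_text = dis.code_info(source)

# show_code() prints the same kind of report to the given file object.
buffer = io.StringIO()
dis.show_code(source, file=buffer)

print(info_text)
```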
dis.dis(x=None, *, file=None)
Disassemble the x object. x can denote either a module, a class, a method, a
function, a code object, a string of source code or a byte sequence of raw
bytecode. For a module, it disassembles all functions. For a class, it
disassembles all methods. For a code object or sequence of raw bytecode, it
prints one line per bytecode instruction. Strings are first compiled to code objects
with the compile() built-in function before being disassembled. If no object is
provided, this function disassembles the last traceback.
The disassembly is written as text to the supplied file argument if provided and
to sys.stdout otherwise.
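A minimal sketch of the file argument, assuming an in-memory io.StringIO buffer as the destination. The listing format and opcode names depend on the CPython version in use.

```python
import dis
import io


def myfunc(alist):
    return len(alist)


# Capture the disassembly of a function instead of printing it.
buffer = io.StringIO()
dis.dis(myfunc, file=buffer)
listing = buffer.getvalue()

# A string of source code is compiled first, then disassembled.
source_buffer = io.StringIO()
dis.dis("len([1, 2, 3])", file=source_buffer)
source_listing = source_buffer.getvalue()

print(listing)
print(source_listing)
```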
dis.distb(tb=None, *, file=None)
Disassemble the top-of-stack function of a traceback, using the last traceback if
none was passed. The instruction causing the exception is indicated.
The disassembly is written as text to the supplied file argument if provided and
to sys.stdout otherwise.
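The following sketch shows one way to disassemble the frame that raised an exception. The divide function is hypothetical, and the traceback is taken from the caught exception rather than relying on the last traceback. It also uses Bytecode.from_traceback(), described earlier, to obtain the same information as an object.

```python
import dis
import io


def divide(a, b):
    # Hypothetical function that fails when b is zero.
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError as exc:
    tb = exc.__traceback__

# distb() follows the traceback to its innermost frame and marks the
# instruction that was executing when the exception was raised.
buffer = io.StringIO()
dis.distb(tb, file=buffer)
listing = buffer.getvalue()

# The same frame as a Bytecode instance; current_offset is the failing instruction.
failing = dis.Bytecode.from_traceback(tb)

print(listing)
print(failing.current_offset)
```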
dis.get_instructions(x, *, first_line=None)
Return an iterator over the instructions in the supplied function, method, source
code string or code object.
The iterator generates a series of Instruction named tuples giving the details of
each operation in the supplied code.
If first_line is not None, it indicates the line number that should be reported for
the first source line in the disassembled code. Otherwise, the source line
information (if any) is taken directly from the disassembled code object.
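As an illustration, the iterator can be consumed directly to build a custom listing. This sketch prints three of the Instruction fields and collects the resolved argument values; the opcode names shown will vary by CPython version.

```python
import dis


def myfunc(alist):
    return len(alist)


# Print a compact custom listing from the Instruction fields.
for instr in dis.get_instructions(myfunc):
    print(instr.offset, instr.opname, instr.argrepr)

# argval holds the resolved argument, such as a variable or global name.
argvals = [instr.argval for instr in dis.get_instructions(myfunc)]
```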
dis.findlinestarts(code)
This generator function uses the co_firstlineno and co_lnotab attributes of the code
object code to find the offsets which are starts of lines in the source code. They
are generated as (offset, lineno) pairs.
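A short sketch of findlinestarts() on a code object compiled from a three-line source string. The file name "<example>" is an arbitrary placeholder, and the exact offsets depend on the interpreter version.

```python
import dis

source = "a = 1\nb = 2\nc = a + b\n"
code = compile(source, "<example>", "exec")

# Each pair gives the bytecode offset at which a source line begins.
line_starts = list(dis.findlinestarts(code))
for offset, lineno in line_starts:
    print(offset, lineno)
```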
dis.findlabels(code)
Detect all offsets in the code object code which are jump targets, and return a list
of these offsets.
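The sketch below lists the jump targets of a small loop. It passes the raw bytecode bytes (the co_code attribute of the code object), which is an assumption about the expected argument based on how CPython implements this function. The function count_up is hypothetical.

```python
import dis


def count_up(limit):
    # A loop guarantees at least one jump target in the bytecode.
    total = 0
    for i in range(limit):
        total += i
    return total


raw = count_up.__code__.co_code
labels = dis.findlabels(raw)
print(labels)
```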
dis.stack_effect(opcode[, oparg])
Compute the stack effect of opcode with argument oparg.
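For instance, the net change in stack depth can be queried for an opcode without an argument and for one whose effect depends on its argument. The opcode numbers are looked up by name through dis.opmap rather than hard-coded, since they change between releases.

```python
import dis

# POP_TOP removes one item and pushes nothing.
pop_effect = dis.stack_effect(dis.opmap["POP_TOP"])

# BUILD_TUPLE with argument 3 pops three items and pushes one tuple.
tuple_effect = dis.stack_effect(dis.opmap["BUILD_TUPLE"], 3)

print(pop_effect, tuple_effect)
```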
class dis.Instruction
Details for a bytecode operation
opcode
numeric code for operation, corresponding to the opcode values listed below and
the bytecode values in the Opcode collections.
opname
human readable name for operation
arg
numeric argument to operation (if any), otherwise None
argval
resolved arg value (if known), otherwise same as arg
argrepr
human readable description of operation argument
offset
start index of operation within bytecode sequence
starts_line
line started by this opcode (if any), otherwise None
is_jump_target
True if other code jumps to here, otherwise False
General instructions
NOP
Do nothing code. Used as a placeholder by the bytecode optimizer.
POP_TOP
Removes the top-of-stack (TOS) item.
ROT_TWO
Swaps the two top-most stack items.
ROT_THREE
Lifts second and third stack item one position up, moves top down to position
three.
DUP_TOP
Duplicates the reference on top of the stack.
DUP_TOP_TWO
Duplicates the two references on top of the stack, leaving them in the same
order.
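The stack manipulations above can be modelled with a plain Python list whose last element is TOS. This is only a toy model for illustration; the helper names are not part of the dis module.

```python
def rot_two(stack):
    # Swap the two top-most items.
    stack[-1], stack[-2] = stack[-2], stack[-1]


def rot_three(stack):
    # Move TOS down to position three; the second and third items move up.
    top = stack.pop()
    stack.insert(len(stack) - 2, top)


def dup_top(stack):
    # Duplicate the reference on top of the stack.
    stack.append(stack[-1])


def dup_top_two(stack):
    # Duplicate the two top-most references, keeping their order.
    stack.extend(stack[-2:])


demo = [1, 2, 3]          # 3 is TOS
rot_three(demo)
print(demo)               # [3, 1, 2]
```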
Unary operations
Unary operations take the top of the stack, apply the operation, and push the result
back on the stack.
UNARY_POSITIVE
Implements TOS = +TOS.
UNARY_NEGATIVE
Implements TOS = -TOS.
UNARY_NOT
Implements TOS = not TOS.
UNARY_INVERT
Implements TOS = ~TOS.
GET_ITER
Implements TOS = iter(TOS).
Binary operations
Binary operations remove the top of the stack (TOS) and the second top-most stack
item (TOS1) from the stack. They perform the operation, and put the result back on
the stack.
BINARY_POWER
Implements TOS = TOS1 ** TOS.
BINARY_MULTIPLY
Implements TOS = TOS1 * TOS.
BINARY_FLOOR_DIVIDE
Implements TOS = TOS1 // TOS.
BINARY_TRUE_DIVIDE
Implements TOS = TOS1 / TOS.
BINARY_MODULO
Implements TOS = TOS1 % TOS.
BINARY_ADD
Implements TOS = TOS1 + TOS.
BINARY_SUBTRACT
Implements TOS = TOS1 - TOS.
BINARY_SUBSCR
Implements TOS = TOS1[TOS].
BINARY_LSHIFT
Implements TOS = TOS1 << TOS.
BINARY_RSHIFT
Implements TOS = TOS1 >> TOS.
BINARY_AND
Implements TOS = TOS1 & TOS.
BINARY_XOR
Implements TOS = TOS1 ^ TOS.
BINARY_OR
Implements TOS = TOS1 | TOS.
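The operand order used by the binary opcodes (TOS1 on the left, TOS on the right) can be sketched with a toy evaluator over a list-based stack. The table and function below are illustrative only and cover a subset of the opcodes.

```python
import operator

# Map a few opcode names to the equivalent Python operator functions.
BINARY_OPS = {
    "BINARY_POWER": operator.pow,
    "BINARY_MULTIPLY": operator.mul,
    "BINARY_FLOOR_DIVIDE": operator.floordiv,
    "BINARY_SUBTRACT": operator.sub,
    "BINARY_SUBSCR": operator.getitem,
}


def run_binary(stack, opname):
    # Pop TOS, then TOS1, and push the result of TOS1 <op> TOS.
    tos = stack.pop()
    tos1 = stack.pop()
    stack.append(BINARY_OPS[opname](tos1, tos))


stack = [7, 2]                 # 2 is TOS, 7 is TOS1
run_binary(stack, "BINARY_SUBTRACT")
print(stack)                   # [5]
```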
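In-place operations
In-place operations are like binary operations, in that they remove TOS and TOS1,
and push the result back on the stack, but the operation is done in-place when TOS1
supports it, and the resulting TOS may be (but does not have to be) the original TOS1.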
INPLACE_POWER
Implements in-place TOS = TOS1 ** TOS.
INPLACE_MULTIPLY
Implements in-place TOS = TOS1 * TOS.
INPLACE_FLOOR_DIVIDE
Implements in-place TOS = TOS1 // TOS.
INPLACE_TRUE_DIVIDE
Implements in-place TOS = TOS1 / TOS.
INPLACE_MODULO
Implements in-place TOS = TOS1 % TOS.
INPLACE_ADD
Implements in-place TOS = TOS1 + TOS.
INPLACE_SUBTRACT
Implements in-place TOS = TOS1 - TOS.
INPLACE_LSHIFT
Implements in-place TOS = TOS1 << TOS.
INPLACE_RSHIFT
Implements in-place TOS = TOS1 >> TOS.
INPLACE_AND
Implements in-place TOS = TOS1 & TOS.
INPLACE_XOR
Implements in-place TOS = TOS1 ^ TOS.
INPLACE_OR
Implements in-place TOS = TOS1 | TOS.
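Whether the result of an in-place operation is the original left operand depends on the operand's type. The sketch below uses operator.iadd, which applies the same semantics as the += statement, to contrast a mutable list with an immutable tuple.

```python
import operator

# A list supports in-place addition, so the result is the same object.
items = [1, 2]
result = operator.iadd(items, [3])
same_object = result is items

# A tuple does not, so a new object is produced and the original is unchanged.
numbers = (1, 2)
new_tuple = operator.iadd(numbers, (3,))

print(same_object, new_tuple is numbers)
```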
STORE_SUBSCR
Implements TOS1[TOS] = TOS2.
DELETE_SUBSCR
Implements del TOS1[TOS].
Miscellaneous opcodes
PRINT_EXPR
Implements the expression statement for the interactive mode. TOS is removed
from the stack and printed. In non-interactive mode, an expression statement is
terminated with POP_TOP.
BREAK_LOOP
Terminates a loop due to a break statement.
CONTINUE_LOOP(target)
Continues a loop due to a continue statement. target is the address to jump to
(which should be a FOR_ITER instruction).
SET_ADD(i)
Calls set.add(TOS1[-i], TOS). Used to implement set comprehensions.
LIST_APPEND(i)
Calls list.append(TOS[-i], TOS). Used to implement list comprehensions.
MAP_ADD(i)
Calls dict.setitem(TOS1[-i], TOS, TOS1). Used to implement dict comprehensions.
RETURN_VALUE
Returns with TOS to the caller of the function.
YIELD_VALUE
Pops TOS and yields it from a generator.
YIELD_FROM
Pops TOS and delegates to it as a subiterator from a generator.
IMPORT_STAR
Loads all symbols not starting with '_' directly from the module TOS to the local
namespace. The module is popped after loading all names. This opcode
implements from module import *.
POP_BLOCK
Removes one block from the block stack. Per frame, there is a stack of blocks,
denoting nested loops, try statements, and such.
POP_EXCEPT
Removes one block from the block stack. The popped block must be an
exception handler block, as implicitly created when entering an except handler. In
addition to popping extraneous values from the frame stack, the last three
popped values are used to restore the exception state.
END_FINALLY
Terminates a finally clause. The interpreter recalls whether the exception has to
be re-raised, or whether the function returns, and continues with the outer-next
block.
LOAD_BUILD_CLASS
Pushes builtins.__build_class__() onto the stack. It is later called
by CALL_FUNCTION to construct a class.
SETUP_WITH(delta)
This opcode performs several operations before a with block starts. First, it
loads __exit__() from the context manager and pushes it onto the stack for later
use by WITH_CLEANUP. Then, __enter__() is called, and a finally block pointing
to delta is pushed. Finally, the result of calling the enter method is pushed onto
the stack. The next opcode will either ignore it (POP_TOP), or store it in (a)
variable(s) (STORE_FAST, STORE_NAME, or UNPACK_SEQUENCE).
WITH_CLEANUP
Cleans up the stack when a with statement block exits. TOS is the context
manager’s __exit__() bound method. Below TOS are 1–3 values indicating
how/why the finally clause was entered:
SECOND = None
(SECOND, THIRD) = (WHY_{RETURN,CONTINUE}), retval
SECOND = WHY_*; no retval below it
(SECOND, THIRD, FOURTH) = exc_info()
STORE_NAME(namei)
Implements name = TOS. namei is the index of name in the attribute co_names of
the code object. The compiler tries to use STORE_FAST or STORE_GLOBAL if
possible.
DELETE_NAME(namei)
Implements del name, where namei is the index into co_names attribute of the
code object.
UNPACK_SEQUENCE(count)
Unpacks TOS into count individual values, which are put onto the stack right-to-
left.
UNPACK_EX(counts)
Implements assignment with a starred target: Unpacks an iterable in TOS into
individual values, where the total number of values can be smaller than the
number of items in the iterable: one of the new values will be a list of all leftover
items.
The low byte of counts is the number of values before the list value, the high byte
of counts the number of values after it. The resulting values are put onto the
stack right-to-left.
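The byte layout of the counts argument described above amounts to simple bit arithmetic. The helper names in this sketch are hypothetical and only restate that layout.

```python
def pack_counts(before, after):
    # Low byte: values before the starred target; high byte: values after it.
    return before | (after << 8)


def unpack_counts(counts):
    # Recover (before, after) from the combined argument.
    return counts & 0xFF, counts >> 8


# For a target list such as "a, *rest, y, z" there is one value before
# the starred name and two after it.
counts = pack_counts(1, 2)
print(counts, unpack_counts(counts))
```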
STORE_ATTR(namei)
Implements TOS.name = TOS1, where namei is the index of name in co_names.
DELETE_ATTR(namei)
Implements del TOS.name, using namei as index into co_names.
STORE_GLOBAL(namei)
Works as STORE_NAME, but stores the name as a global.
DELETE_GLOBAL(namei)
Works as DELETE_NAME, but deletes a global name.
BUILD_TUPLE(count)
Creates a tuple consuming count items from the stack, and pushes the resulting
tuple onto the stack.
BUILD_MAP(count)
Pushes a new dictionary object onto the stack. The dictionary is pre-sized to
hold count entries.
IMPORT_NAME(namei)
Imports the module co_names[namei]. TOS and TOS1 are popped and provide
the fromlist and level arguments of __import__(). The module object is pushed
onto the stack. The current namespace is not affected: for a proper import
statement, a subsequent STORE_FAST instruction modifies the namespace.
IMPORT_FROM(namei)
Loads the attribute co_names[namei] from the module found in TOS. The resulting
object is pushed onto the stack, to be subsequently stored by
a STORE_FAST instruction.
JUMP_FORWARD(delta)
Increments bytecode counter by delta.
JUMP_IF_TRUE_OR_POP(target)
If TOS is true, sets the bytecode counter to target and leaves TOS on the stack.
Otherwise (TOS is false), TOS is popped.
JUMP_IF_FALSE_OR_POP(target)
If TOS is false, sets the bytecode counter to target and leaves TOS on the stack.
Otherwise (TOS is true), TOS is popped.
FOR_ITER(delta)
TOS is an iterator. Call its __next__() method. If this yields a new value, push it on
the stack (leaving the iterator below it). If the iterator indicates it is exhausted
TOS is popped, and the byte code counter is incremented by delta.
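The two outcomes of this iteration step can be sketched as a toy function over a list-based stack. Here pc stands for the address of the following instruction, and the function name is illustrative rather than part of CPython.

```python
def for_iter(stack, pc, delta):
    """Simulate one iteration step and return the next bytecode counter."""
    iterator = stack[-1]
    try:
        value = next(iterator)
    except StopIteration:
        # Exhausted: drop the iterator and jump past the loop body.
        stack.pop()
        return pc + delta
    # A new value: push it above the iterator and fall through.
    stack.append(value)
    return pc


stack = [iter([10])]
pc = for_iter(stack, 0, 8)     # pushes 10, counter unchanged
stack.pop()                    # the loop body consumes the value
pc = for_iter(stack, 0, 8)     # iterator exhausted, counter advanced by delta
print(pc, stack)
```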
SETUP_LOOP(delta)
Pushes a block for a loop onto the block stack. The block spans from the current
instruction with a size of delta bytes.
SETUP_EXCEPT(delta)
Pushes a try block from a try-except clause onto the block stack. delta points to
the first except block.
SETUP_FINALLY(delta)
Pushes a try block from a try-finally clause onto the block stack. delta points to
the finally block.
STORE_MAP
Store a key and value pair in a dictionary. Pops the key and value while leaving
the dictionary on the stack.
LOAD_CLOSURE(i)
Pushes a reference to the cell contained in slot i of the cell and free variable
storage. The name of the variable is co_cellvars[i] if i is less than the length
of co_cellvars. Otherwise it is co_freevars[i - len(co_cellvars)].
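The slot-to-name rule stated above can be written as a small helper. This follows the numbering scheme given in the text, which newer CPython releases may not share; the helper and the example functions are hypothetical.

```python
def slot_name(code, i):
    # Cell variables occupy the first slots, free variables the rest.
    cellvars = code.co_cellvars
    if i < len(cellvars):
        return cellvars[i]
    return code.co_freevars[i - len(cellvars)]


def outer():
    x = 1            # x is a cell variable of outer

    def inner():
        return x     # x is a free variable of inner

    return inner


inner = outer()
print(slot_name(outer.__code__, 0), slot_name(inner.__code__, 0))
```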
LOAD_DEREF(i)
Loads the cell contained in slot i of the cell and free variable storage. Pushes a
reference to the object the cell contains on the stack.
LOAD_CLASSDEREF(i)
Much like LOAD_DEREF but first checks the locals dictionary before consulting
the cell. This is used for loading free variables in class bodies.
STORE_DEREF(i)
Stores TOS into the cell contained in slot i of the cell and free variable storage.
DELETE_DEREF(i)
Empties the cell contained in slot i of the cell and free variable storage. Used by
the del statement.
MAKE_FUNCTION(argc)
Pushes a new function object on the stack. From bottom to top, the consumed
stack must consist of the default argument and annotation objects, the code
associated with the function (at TOS1), and the qualified name of the function
(at TOS).
MAKE_CLOSURE(argc)
Creates a new function object, sets its __closure__ slot, and pushes it on the
stack. TOS is the qualified name of the function, TOS1 is the code associated
with the function, and TOS2 is the tuple containing cells for the closure’s free
variables. argc is interpreted as in MAKE_FUNCTION; the annotations and
defaults are also in the same order below TOS2.
EXTENDED_ARG(ext)
Prefixes any opcode which has an argument too big to fit into the default two
bytes. ext holds two additional bytes which, taken together with the subsequent
opcode’s argument, comprise a four-byte argument, ext being the two most-
significant bytes.
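The way the two halves combine into a four-byte argument is plain bit arithmetic, sketched below under the two-byte argument layout described in the text. Later CPython releases use a different argument width, and the helper name is hypothetical.

```python
def combine_extended_arg(ext, arg):
    # ext supplies the two most-significant bytes, arg the two least-significant.
    return (ext << 16) | arg


full_arg = combine_extended_arg(0x0001, 0x0002)
print(hex(full_arg))
```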
CALL_FUNCTION_VAR(argc)
Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on
the stack contains the variable argument list, followed by keyword and positional
arguments.
HAVE_ARGUMENT
This is not really an opcode. It identifies the dividing line between opcodes which
don’t take arguments (< HAVE_ARGUMENT) and those which
do (>= HAVE_ARGUMENT).
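A minimal sketch of using this dividing line, with opcode numbers looked up by name through dis.opmap. The helper name is hypothetical, and only ordinary opcodes are assumed to follow the rule.

```python
import dis


def takes_argument(opcode):
    # Opcodes at or above the dividing line carry an argument.
    return opcode >= dis.HAVE_ARGUMENT


print(takes_argument(dis.opmap["POP_TOP"]))
print(takes_argument(dis.opmap["LOAD_CONST"]))
```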
dis.hasfree
Sequence of bytecodes that access a free variable (note that ‘free’ in this context
refers to names in the current scope that are referenced by inner scopes or
names in outer scopes that are referenced from this scope. It does not include
references to global or builtin scopes).
dis.hasname
Sequence of bytecodes that access an attribute by name.
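These collections hold numeric opcodes, so membership is tested against values from dis.opmap. The sketch assumes the collections are exposed as dis.hasname and dis.hasfree.

```python
import dis

# LOAD_ATTR refers to its operand by name.
attr_by_name = dis.opmap["LOAD_ATTR"] in dis.hasname

# LOAD_DEREF reads a cell or free variable.
deref_is_free = dis.opmap["LOAD_DEREF"] in dis.hasfree

print(attr_by_name, deref_is_free)
```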
When invoked from the command line, python -m pickletools will disassemble the
contents of one or more pickle files. Note that if you want to see the Python object
stored in the pickle rather than the details of pickle format, you may want to use
-m pickle instead. However, when the pickle file that you want to examine comes from an
untrusted source, -m pickletools is a safer option because it does not execute pickle
bytecode.
For example, with a tuple (1, 2) pickled in file x.pickle:
-o, --output=<file>
Name of a file where the output should be written.
-l, --indentlevel=<num>
The number of blanks by which to indent a new MARK level.
-m, --memo
When multiple objects are disassembled, preserve memo between
disassemblies.
-p, --preamble=<preamble>
When more than one pickle file is specified, print given preamble before each
disassembly.
pickletools.dis(pickle, out=None, memo=None, indentlevel=4, annotate=0)
Outputs a symbolic disassembly of the pickle to the file-like object out, defaulting
to sys.stdout. pickle can be a string or a file-like object. memo can be a Python dictionary
that will be used as the pickle’s memo; it can be used to perform disassemblies across
multiple pickles created by the same pickler. Successive levels, indicated
by MARK opcodes in the stream, are indented by indentlevel spaces. If a nonzero value
is given to annotate, each opcode in the output is annotated with a short description. The
value of annotate is used as a hint for the column where annotation should start.
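The same disassembly is available programmatically. This sketch pickles the tuple (1, 2) in memory, assuming the default pickle protocol of the running interpreter, and captures the annotated listing in a buffer instead of writing to sys.stdout.

```python
import io
import pickle
import pickletools

data = pickle.dumps((1, 2))

# Write the symbolic disassembly to a buffer, with short annotations.
buffer = io.StringIO()
pickletools.dis(data, out=buffer, annotate=1)
listing = buffer.getvalue()

print(listing)
```

Since the disassembly only reads the opcode stream, it does not reconstruct the pickled object.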
pickletools.genops(pickle)
pickletools.optimize(picklestring)
Returns a new equivalent pickle string after eliminating unused PUT opcodes.
The optimized pickle is shorter, takes less transmission time, requires less
storage space, and unpickles more efficiently.
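As a rough illustration of optimize(), the sketch below pickles a small list with protocol 2 (an arbitrary choice that produces PUT-style opcodes) and compares the result before and after. It relies on genops() yielding (opcode, arg, pos) triples whose opcode objects expose a name attribute.

```python
import pickle
import pickletools

obj = [1, 2, 3]
raw = pickle.dumps(obj, protocol=2)
optimized = pickletools.optimize(raw)

# List the opcode names in each pickle for comparison.
names_before = [opcode.name for opcode, arg, pos in pickletools.genops(raw)]
names_after = [opcode.name for opcode, arg, pos in pickletools.genops(optimized)]

print(len(raw), names_before)
print(len(optimized), names_after)
```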