A Concise Guide to Lua 5.1 Virtual Machine Instructions
Author: Kein-Hong Man, esq. <khman AT users.sf.net>
Version 0.1, 20060313
Contents
1 Introduction
2 Lua Instruction Basics
3 Really Simple Chunks
4 Lua Binary Chunks
5 Instruction Notation
6 Loading Constants
7 Upvalues and Globals
8 Table Instructions
9 Arithmetic and String Instructions
10 Jumps and Calls
11 Relational and Logic Instructions
12 Loop Instructions
13 Table Creation
14 Closures and Closing
15 Comparing Lua 5.0.2 and Lua 5.1
16 Digging Deeper
17 Acknowledgements
18 ChangeLog & ToDos
1 Introduction
This is a no-frills introduction to the instruction set of the Lua 5.1 virtual machine. Compared
to Perl or Python, the compactness of Lua makes it relatively easy for someone to peek
under the hood and understand its internals. I think that one cannot completely grok a
scripting language, or any complex system for that matter, without slitting the animal open
and examining the entrails, organs and other yucky stuff that isn’t normally seen. So this
document is supposed to help with the “peek under the hood” bit.
This introductory guide covers Lua 5.1 only. Please see the older document for the guide to
Lua 5.0.2 virtual machine instructions. This is intentional; the internals of Lua are not fixed or
standardized in any way, so users must not expect compatibility from one version of Lua to
another as far as internals are concerned.
ChunkSpy has an interactive mode: you can enter a source chunk and get an immediate
disassembly. This allows you to use this document as a tutorial by entering the examples into
ChunkSpy and seeing the results yourself. The interactive mode is also very useful when you
are exploring the behaviour of the Lua code generator on many short code snippets.
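This is only a quick introduction. It does not attempt a comprehensive or expert treatment of
the Lua virtual machine (from here on, unless stated otherwise, “Lua” means “Lua 5”) or its
instructions. It is meant as a simple, easily digested beginner’s guide to the Lua virtual
machine instruction set, with no stunts and no smoke rings.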
The objective of this introduction is to cover all the Lua virtual machine instructions and the
structure of Lua 5 binary chunks with a minimum of fuss. Then, if you want more detail, you
can use luac or ChunkSpy to study non-trivial chunks of code, or you can dive into the Lua
source code itself for the real thing.
This is currently a draft, and I am not a Lua internals expert. So feedback is welcome. If you
find any errors, or if you have anything to contribute please send me an e-mail (to khman AT
users.sf.net or mkh AT pl.jaring.my) so that I can correct it. Thanks.
2 Lua Instruction Basics
The Lua virtual machine instruction set we will look at is a particular implementation of the
Lua language. It is by no means the only way to skin the chicken. The instruction set just
happens to be the way the authors of Lua chose to implement version 5 of Lua. The following
sections are based on the instruction set used in Lua 5.1. The instruction set might change in
the future – do not expect it to be set in stone. This is because the implementation details of
virtual machines are not a concern to most users of scripting languages. For most applications,
there is no need to specify how bytecode is generated or how the virtual machine runs, as
long as the language works as advertised. So remember that there is no official specification
of the Lua virtual machine instruction set, and no need for one; the only official
specification is that of the Lua language.
In the course of studying disassemblies of Lua binary chunks, you will notice that many
generated instruction sequences aren’t as perfect as you would like them to be. This is
perfectly normal from an engineering standpoint. The canonical Lua implementation is not
meant to be an optimizing bytecode compiler or a JIT compiler. Instead it is supposed to load,
parse and run Lua source code efficiently. It is the totality of the implementation that counts.
If you really need the performance, you are supposed to drop down into native C functions
anyway.
Lua instructions have a fixed size, using a 32 bit unsigned integer data type by default. In
binary chunks, endianness is significant, but while in memory, an instruction can be portably
decoded or encoded in C using the usual integer shift and mask operations. The details can be
found in lopcodes.h, while the Instruction type is defined in llimits.h.
As of Lua 5.1, there are three instruction types, and 38 opcodes (numbered 0 through 37) are
in use. The instruction types are enumerated as iABC, iABx and iAsBx, and may be
visually represented as follows:
          31      24 23      16 15       8 7        0
iABC      |    B:9    |    C:9    |  A:8   | Opcode:6 |
iABx      |        Bx:18          |  A:8   | Opcode:6 |
iAsBx     |        sBx:18         |  A:8   | Opcode:6 |
Lua 5 instruction formats
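The field layout in the diagram can be exercised with ordinary shifts and masks. The following is a minimal Python sketch, not code from the Lua sources; the function names encode_abc and decode_fields are made up for illustration, and the bit positions are read off the diagram (opcode in the low 6 bits, then A, C and B).

```python
# Hypothetical helpers; bit positions are taken from the format diagram:
# opcode = bits 0-5, A = bits 6-13, C = bits 14-22, B = bits 23-31.

def encode_abc(opcode, a, b, c):
    """Pack an iABC instruction into one 32-bit word."""
    return (opcode & 0x3F) | ((a & 0xFF) << 6) | ((c & 0x1FF) << 14) | ((b & 0x1FF) << 23)

def decode_fields(word):
    """Split a 32-bit instruction word into its raw unsigned fields."""
    return {
        "opcode": word & 0x3F,
        "A": (word >> 6) & 0xFF,
        "C": (word >> 14) & 0x1FF,
        "B": (word >> 23) & 0x1FF,
        "Bx": (word >> 14) & 0x3FFFF,  # Bx spans the C and B bits
    }

word = encode_abc(30, 0, 1, 0)   # opcode 30 is RETURN; A=0, B=1, C=0
print(hex(word), decode_fields(word))
```

The same word can be read as iABC or iABx; which view is meaningful depends on the opcode.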
Instruction fields are encoded as simple unsigned integer values, except for sBx. Field sBx
can represent negative numbers, but it doesn’t use 2s complement. Instead, it has a bias equal
to half the maximum integer that can be represented by its unsigned counterpart, Bx. For a
field size of 18 bits, Bx can hold a maximum unsigned integer value of 262143, and so the
bias is 131071 (calculated as 262143 >> 1). A value of -1 will be encoded as (-1 + 131071)
or 131070 or 1FFFE in hexadecimal.
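The bias arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the names SIZE_BX, MAXARG_BX, BIAS, encode_sbx and decode_sbx are chosen here for readability and are not claimed to match the Lua sources.

```python
SIZE_BX = 18
MAXARG_BX = (1 << SIZE_BX) - 1   # 262143, largest unsigned Bx value
BIAS = MAXARG_BX >> 1            # 131071

def encode_sbx(value):
    """Turn a signed value into the unsigned number stored in the sBx field."""
    if not -BIAS <= value <= MAXARG_BX - BIAS:
        raise ValueError("value does not fit in an 18-bit biased field")
    return value + BIAS

def decode_sbx(raw):
    """Recover the signed value from the stored sBx field."""
    return raw - BIAS

print(encode_sbx(-1), hex(encode_sbx(-1)))
```

Note that the biased range is not symmetric: the smallest value is -131071 and the largest is 131072.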
Fields A, B and C usually refer to register numbers (I’ll use the term “register” because of its
similarity to processor registers). Although field A is the target operand in arithmetic
operations, this rule isn’t always true for other instructions. A register is really an index into
the current stack frame, register 0 being the bottom-of-stack position.
Unlike the Lua C API, negative indices (counting from the top of stack) are not supported.
For some instructions, where the top of stack may be required, it is encoded as a special
operand value, usually 0. Local variables are equivalent to certain registers in the current
stack frame, while dedicated opcodes allow read/write of globals and upvalues. For some
instructions, a value in fields B or C may be a register or an encoding of the number of a
constant in the constant pool. This will be described further in the section on instruction
notation.
By default, Lua has a maximum stack frame size of 250. This is encoded as MAXSTACK in
llimits.h. The maximum stack frame size in turn limits the maximum number of locals
per function, which is set at 200, encoded as LUAI_MAXVARS in luaconf.h. Other limits
found in the same file include the maximum number of upvalues per function (60), encoded
as LUAI_MAXUPVALUES , call depths, the minimum C stack size, etc. Also, with an sBx field
of 18 bits, jumps and control structures cannot exceed a jump distance of about 131071.
Opcode  Name       Description
0       MOVE       Copy a value between registers
1       LOADK      Load a constant into a register
2       LOADBOOL   Load a boolean into a register
3       LOADNIL    Load nil values into a range of registers
4       GETUPVAL   Read an upvalue into a register
5       GETGLOBAL  Read a global variable into a register
6       GETTABLE   Read a table element into a register
7       SETGLOBAL  Write a register value into a global variable
8       SETUPVAL   Write a register value into an upvalue
9       SETTABLE   Write a register value into a table element
10      NEWTABLE   Create a new table
11      SELF       Prepare an object method for calling
12      ADD        Addition operator
13      SUB        Subtraction operator
14      MUL        Multiplication operator
15      DIV        Division operator
16      MOD        Modulus (remainder) operator
17      POW        Exponentiation operator
18      UNM        Unary minus operator
19      NOT        Logical NOT operator
20      LEN        Length operator
21      CONCAT     Concatenate a range of registers
22      JMP        Unconditional jump
23      EQ         Equality test
24      LT         Less than test
25      LE         Less than or equal to test
26      TEST       Boolean test, with conditional jump
27      TESTSET    Boolean test, with conditional jump and assignment
28      CALL       Call a closure
29      TAILCALL   Perform a tail call
30      RETURN     Return from function call
31      FORLOOP    Iterate a numeric for loop
32      FORPREP    Initialization for a numeric for loop
33      TFORLOOP   Iterate a generic for loop
34      SETLIST    Set a range of array elements for a table
35      CLOSE      Close a range of locals being used as upvalues
36      CLOSURE    Create a closure of a function prototype
37      VARARG     Assign vararg function arguments to registers
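Because opcodes are numbered consecutively from 0, the table maps directly onto an indexable list. The sketch below is an illustration in Python; OPNAMES and opcode_name are names invented here, and the only assumption is that the opcode sits in the low 6 bits of the instruction word, as in the format diagram.

```python
# Opcode names in numeric order, copied from the table above.
OPNAMES = (
    "MOVE LOADK LOADBOOL LOADNIL GETUPVAL GETGLOBAL GETTABLE "
    "SETGLOBAL SETUPVAL SETTABLE NEWTABLE SELF ADD SUB MUL DIV MOD POW "
    "UNM NOT LEN CONCAT JMP EQ LT LE TEST TESTSET CALL TAILCALL RETURN "
    "FORLOOP FORPREP TFORLOOP SETLIST CLOSE CLOSURE VARARG"
).split()

def opcode_name(word):
    """Return the mnemonic for the opcode held in the low 6 bits of word."""
    op = word & 0x3F
    return OPNAMES[op] if op < len(OPNAMES) else "UNKNOWN(%d)" % op

print(len(OPNAMES), opcode_name(0x0080001E))
```

A 6-bit opcode field has room for 64 values, so 26 of them are unused in Lua 5.1; the sketch reports those as UNKNOWN rather than failing.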
3 Really Simple Chunks
Before heading into binary chunk and virtual machine instruction details, this section will
demonstrate briefly how ChunkSpy can be used to explore Lua 5 code generation. All the
examples in this document were produced using the Lua 5.1 version of ChunkSpy found in
the ChunkSpy 0.9.8 distribution.
First, start ChunkSpy in interactive mode (user input follows the prompt):
$ lua ChunkSpy.lua --interact
ChunkSpy: A Lua 5.1 binary chunk disassembler
Version 0.9.8 (20060307) Copyright (c) 2004-2006 Kein-Hong Man
The COPYRIGHT file describes the conditions under which this
software may be distributed (basically a Lua 5-style license.)
>
We’ll start with the shortest possible binary chunk that can be generated:
>do end
; source chunk: (interactive mode)
; x86 standard (32-bit, little endian, doubles)
ChunkSpy will treat your keyboard input as a small chunk of Lua source code. The library
function string.dump() is first used to generate a binary chunk string, then ChunkSpy will
disassemble that string and give you a brief assembly language-style output listing.
Some features of the listing: Comment lines are prefixed by a semicolon. The header portion
of the binary chunk is not displayed with the brief style. Data or header information that isn’t
an instruction is shown as an assembler directive with a dot prefix. luac-style comments are
generated for some instructions, and the instruction location is in square brackets.
A “do end” generates a single RETURN instruction and does nothing else. There are no
parameters, locals, upvalues or globals. For the rest of the disassembly listings shown in this
document, we will omit some common header comments and show only the function
disassembly part. Instructions will be referenced by their marked positions, e.g. line [1]. Here is
another very short chunk:
A RETURN instruction is generated for every return in the source. The first RETURN (line
[1]) is generated by the return keyword, while the second RETURN (line [2]) is always
added by the code generator. This isn’t a problem, because the second RETURN never gets
executed anyway, and only 4 bytes are wasted. Perfect generation of RETURN instructions
requires basic block analysis, and it is not done because there is no performance penalty for
an extra RETURN during execution, only a negligible memory penalty.
Notice in these examples, the minimum stack size is 2, even when the stack isn’t used. The
next snippet assigns a constant value of 6 to the global variable a:
>a=6
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.const "a" ; 0
.const 6 ; 1
[1] loadk 0 1 ; 6
[2] setglobal 0 0 ; a
[3] return 0 1
; end of function
All string and number constants are pooled on a per-function basis, and instructions refer to
them using an index value which starts from 0. Global variable names need a constant string
as well, because globals are maintained as a table. Line [1] loads the value 6 (with an index to
the constant pool of 1) into register 0, then line [2] sets the global table with the constant “a”
(constant index 0) as the key and register 0 (holding the number 6) as the value.
If we write the variable as a local instead, we get:
>local a="hello"
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
.const "hello" ; 0
[1] loadk 0 0 ; "hello"
[2] return 0 1
; end of function
Local variables reside in the stack, and they occupy a stack (or register) location for the
duration of their existence. The scope of a local variable is specified by a starting program
counter location and an ending program counter location; this is not shown in a brief
disassembly listing.
The local table in the function tells the user that register 0 is variable a. This information
doesn’t matter to the VM, because it needs to know register numbers only – register
allocation was supposed to have been properly done by the code generator. So LOADK in
line [1] loads constant 0 (the string “hello”) into register 0, which is the local variable a. A
stripped binary chunk will not have local variable names for debugging.
Some examples in the following sections have been further annotated with additional
comments in parentheses. Please note that ChunkSpy will not generate such comments, nor
will it indent functions that are at different nesting levels. Next we will take a look at the
structure of Lua 5.1 binary chunks.
4 Lua Binary Chunks
Lua can dump functions as binary chunks, which can then be written to a file, loaded and run.
Binary chunks behave exactly like the source code from which they were compiled.
A binary chunk consists of two parts: a header block and a top-level function. The header
portion contains 12 elements:
A Lua 5.1 binary chunk header is always 12 bytes in size. Since the characteristics of a Lua
virtual machine are hard-coded, the Lua undump code checks all 12 of the header bytes to
determine whether the binary chunk is fit for consumption or not. All 12 header bytes of the
binary chunk must exactly match the header bytes of the platform, otherwise Lua 5.1 will
refuse to load the chunk. The header is also not affected by endianness; the same code can be
used to load the main header of little-endian or big-endian binary chunks. The data type of
lua_Number is determined by the lua_Number size byte and the integral flag together.
In theory, a Lua binary chunk is portable; in real life, there is no need for the undump code to
support such a feature. If you need undump to load all kinds of binary chunks, you are
probably doing something wrong. If however you somehow need this feature, you can try
ChunkSpy’s rewrite option, which allows you to convert a binary chunk from one profile to
another.
Anyway, most of the time there is little need to seriously scrutinize the header, because
Lua source code is usually available and a chunk can be readily compiled into the native binary
chunk format.
The header block is followed immediately by the top-level function or chunk:
1 byte   is_vararg flag (see the explanation further below)
         • 1=VARARG_HASARG
         • 2=VARARG_ISVARARG
         • 4=VARARG_NEEDSARG
1 byte   maximum stack size (number of registers used)
List     list of instructions (code)
List     list of constants
List     list of function prototypes
List     source line positions (optional debug data)
List     list of locals (optional debug data)
List     list of upvalues (optional debug data)
A function block in a binary chunk defines the prototype of a function. To actually execute
the function, Lua creates an instance (or closure) of the function first. A function in a binary
chunk consists of a few header elements and a bunch of lists. Debug data can be stripped.
A String is defined in this way:
The string data size takes into consideration a NUL character at the end, so
an empty string (“”) has 1 as the size_t value. A size_t of 0 means zero
string data bytes; the string does not exist. This is often used by the source
name field of a function.
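To make the size convention concrete, here is a small Python sketch of a string reader. It is an illustration only: read_string is a made-up name, and the reader assumes a 4-byte little-endian size_t, which is what the simple.lua example chunk later in this chapter uses. The sample bytes are the source name field from that example.

```python
import struct

def read_string(data, pos, size_t_fmt="<I"):
    """Read one string; return (text or None, position after the string).

    Assumes a 4-byte little-endian size_t unless another format is given.
    """
    (size,) = struct.unpack_from(size_t_fmt, data, pos)
    pos += struct.calcsize(size_t_fmt)
    if size == 0:
        return None, pos                  # the string does not exist
    raw = data[pos:pos + size]
    return raw[:-1].decode("latin-1"), pos + size   # drop the trailing NUL

# Size 0x0B (11) followed by "simple.lua" and a NUL byte.
sample = bytes.fromhex("0B000000" "73696D706C652E6C" "756100")
print(read_string(sample, 0))
```

Returning None for a size of 0 keeps the missing string distinct from the empty string, whose size is 1.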
The source name is usually the name of the source file from which the binary chunk is
compiled. It may also refer to a string. This source name is specified only in the top-level
function; in other functions, this field consists only of a Size_t with the value 0.
The line defined and last line defined are the line numbers where the function prototype
starts and ends in the source file. For the main chunk, the values of both fields are 0. The next
two fields, the number of upvalues and the number of parameters, are self-explanatory, as
is the maximum stack size field. The is_vararg field is a bit more complicated, though.
These last four fields are all byte-sized.
The is_vararg flag comprises 3 bitfields. By default, Lua 5.1 defines the constant
LUA_COMPAT_VARARG, allowing the table arg to be used in functions that are defined
with a variable number of parameters (vararg functions). The table arg itself is not counted
in the number of parameters. For old-style code that uses arg, is_vararg is 7. If the code
within the vararg function uses ... instead of arg, then is_vararg is 3 (the
VARARG_NEEDSARG field is 0). If 5.0.2 compatibility is compiled out, then is_vararg is
2.
To summarize, the flag VARARG_ISVARARG (2) is always set for vararg functions. If
LUA_COMPAT_VARARG is defined, VARARG_HASARG (1) is also set. If ... is not used
within the function, then VARARG_NEEDSARG (4) is set. A normal function always has an
is_vararg flag value of 0, while the main chunk always has an is_vararg flag value of 2.
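The flag values discussed above can be unpacked with simple bit tests. The following Python sketch is illustrative; the constant names and values come from the text, while describe_is_vararg is a helper name invented here.

```python
VARARG_HASARG = 1
VARARG_ISVARARG = 2
VARARG_NEEDSARG = 4

def describe_is_vararg(flags):
    """List the names of the bits that are set in an is_vararg byte."""
    table = (
        (VARARG_HASARG, "VARARG_HASARG"),
        (VARARG_ISVARARG, "VARARG_ISVARARG"),
        (VARARG_NEEDSARG, "VARARG_NEEDSARG"),
    )
    return [name for bit, name in table if flags & bit]

for value in (0, 2, 3, 7):
    print(value, describe_is_vararg(value))
```

The four values printed are the ones the text mentions: 0 for a normal function, 2 for the main chunk, 3 for a vararg function that uses ..., and 7 for old-style code that uses arg.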
After the function header elements come a number of lists that store the information that
makes up the body of the function. Each list starts with an Integer as a list size count,
followed by a number of list elements. Each list has its own element format. A list size of 0
has no list elements at all.
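The count-then-elements layout is easy to read generically. Below is a minimal Python sketch for a list whose elements are Integers; read_int_list is a hypothetical name, and a 4-byte little-endian Integer is assumed, matching the simple.lua example later in this chapter. The sample bytes are the source line position list of function b from that example.

```python
import struct

def read_int_list(data, pos, int_fmt="<i"):
    """Read a list of Integers; return (values, position after the list)."""
    width = struct.calcsize(int_fmt)
    (count,) = struct.unpack_from(int_fmt, data, pos)
    pos += width
    values = []
    for _ in range(count):
        (value,) = struct.unpack_from(int_fmt, data, pos)
        values.append(value)
        pos += width
    return values, pos

# sizelineinfo = 4, then four entries that all point at source line 2.
sample = bytes.fromhex("04000000" + "02000000" * 4)
print(read_int_list(sample, 0))
```

Lists with other element formats follow the same pattern, with the loop body replaced by a reader for that element type.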
In the following boxes, a data type in square brackets, e.g. [Integer] means that there are
multiple numbers of the element, in this case an integer. The count is given by the list size.
Names in parentheses are the ones given in the Lua sources; they are data structure fields.
The first list is the instruction list, or the actual code to the function. This is the list of
instructions that will actually be executed:
The format of the virtual machine instructions was given in the last chapter. A RETURN
instruction is always generated by the code generator, so the size of the instruction list should
be at least 1. Next is the list of constants:
Number is the Lua number data type, normally an IEEE 754 64-bit double. Integer, Size_t
and Number are all endian-sensitive; Lua 5.1 will not load a chunk whose endianness is
different from that of the platform. Their sizes and formats are of course specified in the
binary chunk header. The data type of Number is determined by its size byte and the integral
flag. Boolean values are encoded as either 0 or 1.
The function prototype list comes after the constant list:
Function prototypes or function blocks have the exact same format as the top-level function
or chunk. However, function prototypes that are not the top-level function do not have the
source name field defined. In this way, function prototypes at different lexical scoping levels
are defined and nested. In a complex binary chunk, the nesting may be several levels deep. A
closure will refer to a function by its number in the list.
The lists following the list of prototypes are optional. They contain debug information and
can be stripped to save space. First comes the source line position list:
Next up is the local list. Each local variable entry has 3 fields, a string and two integers:
The last list is the upvalue list:
None of the lists are shared or re-used: locals, upvalues, constants and prototypes referenced
in the code must be specified in the respective lists in the same function. In addition, locals,
upvalues, constants and the function prototypes are indexed using numbers starting from 0. In
disassembly listings, both the source line position list and the instruction list are indexed
starting from 1. Note that the latter is by convention only; the indices do not matter to the
virtual machine itself, since all jump-related instructions use only signed displacements.
However, for debug information, the scope of local variables is encoded using absolute
program counter positions, and these positions are based on a starting index of 1. This is also
consistent with the output listing from luac.
How does it all fit in? You can easily generate a detailed binary chunk disassembly using
ChunkSpy. Enter the following short bit of code and name the file simple.lua:
local a = 8
function b(c) d = a + c end
Next, run ChunkSpy from the command line to generate the listing:
The following is a description of the generated listing (simple.lst), split into segments.
Pos Hex Data Description or Code
------------------------------------------------------------------------
0000 ** source chunk: simple.lua
** global header start **
0000 1B4C7561 header signature: "\27Lua"
0004 51 version (major:minor hex digits)
0005 00 format (0=official)
0006 01 endianness (1=little endian)
0007 04 size of int (bytes)
0008 04 size of size_t (bytes)
0009 04 size of Instruction (bytes)
000A 08 size of number (bytes)
000B 00 integral (1=integral)
* number type: double
* x86 standard (32-bit, little endian, doubles)
** global header end **
This is an example of a binary chunk header. ChunkSpy calls this the global header to
differentiate it from a function header. For binary chunks specific to a certain platform, it is
easy to match the entire header at one go instead of testing each field. As described
previously, the header is 12 bytes in size, and needs to be exactly compatible with the
platform or else Lua 5.1 won’t load the binary chunk.
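The 12 header bytes in the listing can be pulled apart with Python's struct module. This is a sketch for exploration, not the check that Lua itself performs; parse_header and HEADER_FIELDS are names invented here, and the sample bytes are copied from the listing above.

```python
import struct

HEADER_FIELDS = (
    "signature", "version", "format", "endianness", "sizeof_int",
    "sizeof_size_t", "sizeof_instruction", "sizeof_number", "integral",
)

def parse_header(data):
    """Split the 12-byte global header into named fields."""
    if len(data) < 12:
        raise ValueError("a Lua 5.1 header is 12 bytes long")
    # A 4-byte signature followed by eight single-byte fields.
    values = struct.unpack_from("<4s8B", data, 0)
    return dict(zip(HEADER_FIELDS, values))

sample = bytes.fromhex("1B4C7561" "51" "00" "01" "04" "04" "04" "08" "00")
header = parse_header(sample)
print(header)
```

Every field is a single byte or a byte string, which is why the header itself can be read the same way regardless of the chunk's endianness.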
The global header is followed by the function header of the top-level function:
000C ** function [0] definition (level 1)
** start of function **
000C 0B000000 string size (11)
0010 73696D706C652E6C+ "simple.l"
0018 756100 "ua\0"
source name: simple.lua
001B 00000000 line defined (0)
001F 00000000 last line defined (0)
0023 00 nups (0)
0024 00 numparams (0)
0025 02 is_vararg (2)
0026 02 maxstacksize (2)
A function’s header is always variable in size, due to the source name string. The source
name is only present in the top-level function. A top-level chunk does not have a line number
on which it is defined, so both the line defined fields are 0. There are no upvalues or
parameters. A top-level chunk can always take a variable number of parameters; is_vararg is
always 2 for the top-level chunk. The stack size is set at the minimum of 2 for this very
simple chunk.
Next we come to the various lists, starting with the code listing of the main chunk:
* code:
0027 05000000 sizecode (5)
002B 01000000 [1] loadk 0 0 ; 8
002F 64000000 [2] closure 1 0 ; 1 upvalues
0033 00000000 [3] move 0 0
0037 47400000 [4] setglobal 1 1 ; b
003B 1E008000 [5] return 0 1
The first line of the source code compiles to a single instruction, line [1]. Local a is register 0
and the number 8 is constant 0. In line [2], an instance of function prototype 0 is created, and
the closure is temporarily placed in register 1. The MOVE instruction in line [3] is actually
used by the CLOSURE instruction to manage the upvalue a; it is not really executed. This
will be explained in detail in Chapter 14. The closure is then placed into the global b in line
[4]; “b” is constant 1 while the closure is in register 1. Line [5] returns control to the calling
function. In this case, it exits the chunk.
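As a cross-check, the hex data in the code listing can be decoded by hand. The Python sketch below reads each 4-byte group as a little-endian word and extracts the fields using the layout from Chapter 2; decode_word is a hypothetical helper, not part of ChunkSpy.

```python
import struct

def decode_word(hex_bytes):
    """Decode one little-endian instruction given as 8 hex digits."""
    (word,) = struct.unpack("<I", bytes.fromhex(hex_bytes))
    return {
        "opcode": word & 0x3F,
        "A": (word >> 6) & 0xFF,
        "C": (word >> 14) & 0x1FF,
        "B": (word >> 23) & 0x1FF,
        "Bx": (word >> 14) & 0x3FFFF,
    }

# Hex data copied from the code listing of the main chunk.
listing = ["01000000", "64000000", "00000000", "47400000", "1E008000"]
decoded = [decode_word(h) for h in listing]
for pc, fields in enumerate(decoded, start=1):
    print(pc, fields)
```

Lines [1], [2] and [4] are iABx instructions, so A and Bx are the fields to read; lines [3] and [5] are iABC instructions, so A, B and C apply.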
The list of constants follows the instructions:
* constants:
003F 02000000 sizek (2)
0043 03 const type 3
0044 0000000000002040 const [0]: (8)
004C 04 const type 4
004D 02000000 string size (2)
0051 6200 "b\0"
const [1]: "b"
The top-level function requires two constants, the number 8 (which is used in the assignment
on line 1) and the string “b” (which is used to refer to the global variable b on line 2.)
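The two constants can be decoded from the hex data in the listing. The Python sketch below is an illustration under stated assumptions: read_constant is a made-up name, type codes 3 (number) and 4 (string) are taken from the listing, codes 0 (nil) and 1 (boolean) follow Lua's basic type tags, and the chunk is assumed to use little-endian doubles and a 4-byte size_t as in this example.

```python
import struct

def read_constant(data, pos):
    """Read one constant; return (value, position after the constant)."""
    ctype = data[pos]
    pos += 1
    if ctype == 0:                       # nil
        return None, pos
    if ctype == 1:                       # boolean, encoded as 0 or 1
        return data[pos] != 0, pos + 1
    if ctype == 3:                       # number, assumed 8-byte double
        (number,) = struct.unpack_from("<d", data, pos)
        return number, pos + 8
    if ctype == 4:                       # string: size, bytes, trailing NUL
        (size,) = struct.unpack_from("<I", data, pos)
        raw = data[pos + 4:pos + 4 + size]
        return raw[:-1].decode("latin-1"), pos + 4 + size
    raise ValueError("unexpected constant type %d" % ctype)

# Bytes of the two constants, copied from the listing (after sizek).
sample = bytes.fromhex("03" "0000000000002040" "04" "02000000" "6200")
values, pos = [], 0
while pos < len(sample):
    value, pos = read_constant(sample, pos)
    values.append(value)
print(values)
```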
This is followed by the function prototype list of the main chunk. On line 2 of the source, a
function prototype was declared within the main chunk. This function is instantiated and the
closure is assigned to global b.
The function prototype list holds all the relevant information, a function block within a
function block. ChunkSpy reports it as function prototype number 0, at level 2. Level 1 is the
top-level function; there is only one level 1 function, but there may be more than one function
prototype at other levels.
* functions:
0053 01000000 sizep (1)
Above is the first section of function b’s prototype. It has no name string; it is defined on line
2 (both values point to line 2); there is one upvalue; there is one parameter, c; it is not a
vararg function; and its maximum stack size is 2. Parameters are located from the bottom of
the stack, so the single parameter c of the function is at register 0.
The prototype has 4 instructions. Most Lua virtual machine instructions are easy to decipher,
but some of them have details that are not immediately evident. This example however
should be quite easy to understand. In line [1], 0 is the upvalue a and 1 is the target register,
which is a temporary register. Line [2] is the addition operation, with register 1 holding the
temporary result while register 0 is the function parameter c. In line [3], the global d (so
named by constant 0) is set, and in the next line, control is returned to the caller.
The constant list for the function has one entry: the string “d”, which is used to look up the
global variable of that name. This is followed by the source line position list:
* lines:
008A 04000000 sizelineinfo (4)
[pc] (line)
008E 02000000 [1] (2)
0092 02000000 [2] (2)
0096 02000000 [3] (2)
009A 02000000 [4] (2)
All four instructions that were generated came from line 2 of the source code.
The last two lists of the function prototype are the local list and the upvalue list:
* locals:
009E 01000000 sizelocvars (1)
00A2 02000000 string size (2)
00A6 6300 "c\0"
local [0]: c
00A8 00000000 startpc (0)
00AC 03000000 endpc (3)
* upvalues:
00B0 01000000 sizeupvalues (1)
00B4 02000000 string size (2)
00B8 6100 "a\0"
upvalue [0]: a
** end of function **
There is one local variable, which is parameter c. For parameters, the startpc value is 0.
Normal locals that are defined within a function have a startpc value of 1. There is also an
upvalue, a, which refers to the local a in the parent (top) function.
After the end of the function prototype data for function b, the chunk resumes with the debug
information for the top-level chunk:
* lines:
00BA 05000000 sizelineinfo (5)
[pc] (line)
00BE 01000000 [1] (1)
00C2 02000000 [2] (2)
00C6 02000000 [3] (2)
00CA 02000000 [4] (2)
00CE 02000000 [5] (2)
* locals:
00D2 01000000 sizelocvars (1)
00D6 02000000 string size (2)
00DA 6100 "a\0"
local [0]: a
00DC 01000000 startpc (1)
00E0 04000000 endpc (4)
* upvalues:
00E4 00000000 sizeupvalues (0)
** end of function **
From the source line list, we can see that there are 5 instructions in the top-level function.
The first instruction came from line 1 of the source, while the other 4 instructions came from
line 2 of the source.
The top-level function has one local variable, named “a”, active from program counter
location 1 to location 4, and it refers to register 0. There are no upvalues, so the size of that
table is 0. The binary chunk ends after the debug information of the main chunk is listed.
5 Instruction Notation
Before looking at some Lua virtual machine instructions, here is a little something about the
notation used for describing instructions. Instruction descriptions are given as comments in
the Lua source file lopcodes.h. The instruction descriptions are reproduced in the
following chapters, with additional explanatory notes. Here are some basic symbols:
The notation used to describe instructions is a little like pseudo-C. The operators used in the
notation are largely C operators, while conditional statements use C-style evaluation.
Booleans are evaluated C-style. Thus, the notation is a loose translation of the actual C code
that implements an instruction.
The operation of some instructions cannot be clearly described by one or two lines of
notation. Hence, this guide will supplement symbolic notation with detailed descriptions of
the operation of each instruction. Having described an instruction, examples will be given to
show the instruction working in a short snippet of Lua code. Using ChunkSpy’s interactive
mode, you can try out the examples yourself and get instant feedback in the form of
disassembled code. If you want a disassembled listing plus the byte values of data and
instructions, you can use ChunkSpy to generate a normal, verbose, disassembly listing.
The program counter of the virtual machine (PC) always points to the next instruction. This
behaviour is standard for most microprocessors. The rule is that once an instruction is read in
to be executed, the program counter is immediately updated. So, to skip a single instruction
following the current instruction, add 1 (the displacement) to the PC. A displacement of -1
will theoretically cause a JMP instruction to jump back onto itself, causing an infinite loop.
Luckily, the code generator is not supposed to be able to make up stuff like that.
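The program counter rule can be checked with a little arithmetic. The following is only a sketch in Python (not part of the Lua sources); the helper name jump_target is made up for illustration, and it assumes instructions are numbered from 1, as in ChunkSpy listings.

```python
def jump_target(line, sbx):
    """Return the listing line that a JMP located at `line` transfers control to.

    By the time the JMP executes, PC already points at line + 1,
    so the signed displacement sBx is added to that value.
    """
    next_pc = line + 1      # PC was advanced when the JMP was read in
    return next_pc + sbx


print(jump_target(2, 1))    # 4: a JMP on line [2] with displacement 1 skips line [3]
print(jump_target(2, -1))   # 2: displacement -1 lands back on the JMP itself
```

A displacement of 0 therefore behaves like a no-op, since execution simply falls through to the next instruction.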
As previously explained, registers and local variables are roughly equivalent. Temporary
results are always held in registers. For some instructions, fields B and C can point to a
constant instead of a register; this is the case when the field value has its MSB (most
significant bit) set. For example, a field B value of 256 will point to the constant at index 0, if
the field is 9 bits wide. For most instructions, field A is the target register. Disassembly
listings preserve the A, B, C operand field order for consistency.
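As a rough illustration of the MSB rule, the Python sketch below classifies a 9-bit field value. The names BITRK and decode_rk are chosen for this example only; the sketch assumes the 9-bit field width stated above.

```python
BITRK = 1 << 8   # most significant bit of a 9-bit field, i.e. 256


def decode_rk(field):
    """Classify a 9-bit B or C field as a register number or a constant index."""
    if field & BITRK:
        return ("constant", field & ~BITRK)   # clear the MSB to get the index
    return ("register", field)


print(decode_rk(256))   # ('constant', 0)
print(decode_rk(257))   # ('constant', 1)
print(decode_rk(3))     # ('register', 3)
```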
6 Loading Constants 加载常量
Loads and moves are the starting point of pretty much all processor or virtual machine
instruction sets, so we’ll start with primitive loads and moves:
The most straightforward use of MOVE is for assigning a local to another local:
Line [3] assigns (copies) the value in local a (register 0) to local b (register 1).
You won’t see MOVE instructions used in arithmetic expressions because they are not
needed by arithmetic operators. All arithmetic operators are in 2- or 3-operand style: the
entire local stack frame is already visible to operands R(A), R(B) and R(C) so there is no
need for any extra MOVE instructions.
Other places where you will see MOVE are:
• When moving parameters into place for a function call.
• When moving values into place for certain instructions where stack order is important, e.g.
GETTABLE, SETTABLE and CONCAT.
• When copying return values into locals after a function call.
• After CLOSURE instructions (discussed in Chapter 14.)
There are 3 fundamental instructions for loading constants into local variables. Other
instructions, for reading and writing globals, upvalues and tables are discussed in the
following chapters. The first constant loading instruction is LOADNIL:
LOADNIL uses the operands A and B to mean a range of register locations. The earlier example
for MOVE shows LOADNIL used to set a single register to nil.
Line [2] sets locals d and e to nil. Locals a and b need no LOADNIL instruction because it has
been optimized away, and local c is explicitly initialized with the value 0. The LOADNIL for
locals a and b can be optimized away as the Lua virtual machine always sets all locals to nil
prior to executing a function. The optimization rule is a simple one: If no other instructions
have been generated, then a LOADNIL as the first instruction can be optimized away.
In the example, although the LOADNIL on line [2] is redundant, it is still generated as there
is already an instruction that is not LOADNIL on line [1]. Ideally, one should put all locals
that are initialized to nil at the top of the function, before anything else. In the above case, we
can rearrange the locals to take advantage of the optimization rule:
Now, we save one LOADNIL instruction. In other parts of a function, an explicit assignment
of nil to a local variable will of course require a LOADNIL instruction.
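The register-range behaviour of LOADNIL can be modelled in a few lines. This is a Python sketch rather than the virtual machine's own code; the function name loadnil and the use of None for nil are assumptions of the example.

```python
def loadnil(registers, a, b):
    """LOADNIL A B: set registers R(A) through R(B), inclusive, to nil."""
    for i in range(a, b + 1):
        registers[i] = None     # None stands in for Lua's nil
    return registers


regs = [10, 20, 30, 40, 50]
loadnil(regs, 3, 4)     # one instruction clears two adjacent locals
print(regs)             # [10, 20, 30, None, None]
loadnil(regs, 0, 0)     # A == B clears a single register
print(regs)             # [None, 20, 30, None, None]
```

Because a single instruction covers a whole range, grouping nil-initialized locals together costs no more than initializing one of them.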
LOADK loads a constant from the constant list into a register or local. Constants are indexed
starting from 0. Some instructions, such as arithmetic instructions, can use the constant list
without needing a LOADK. Constants are pooled in the list, so duplicates are eliminated. The
list can hold nils, booleans, numbers or strings.
The constant 3 and the constant “foo” are both written twice in the source snippet, but in the
constant list, each constant has a single location. The constant list contains the names of
global variables as well, since GETGLOBAL and SETGLOBAL make an implied LOADK
operation in order to get the name string of a global variable before looking it up in the
global table.
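Constant pooling can be pictured with a toy model. The class below is a hypothetical Python sketch, not the code generator's actual data structure; it only illustrates that each distinct constant, including the name of a global, receives one index counted from 0.

```python
class ConstantList:
    """Toy constant list: each distinct constant gets one index, from 0."""

    def __init__(self):
        self.values = []
        self._index = {}

    @staticmethod
    def _key(value):
        # Lua 5.1 has a single number type, so 3 and 3.0 share a slot;
        # booleans are tagged separately because Python treats True == 1.
        if isinstance(value, bool):
            return ("boolean", value)
        if isinstance(value, (int, float)):
            return ("number", float(value))
        return (type(value).__name__, value)

    def add(self, value):
        key = self._key(value)
        if key not in self._index:
            self._index[key] = len(self.values)
            self.values.append(value)
        return self._index[key]


k = ConstantList()
for c in [3, "foo", 3, "foo", "print"]:   # "print" stands for a global's name
    k.add(c)
print(k.values)       # [3, 'foo', 'print']
print(k.add("foo"))   # 1
```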
The final constant-loading instruction is LOADBOOL, for setting a boolean value, and it has
some additional functionality.
LOADBOOL is used for loading a boolean value into a register. It’s also used where a
boolean result is supposed to be generated, because relational test instructions, for example,
do not generate boolean results – they perform conditional jumps instead. The operand C is
used to optionally skip the next instruction (by incrementing PC by 1) in order to support
such code. For simple assignments of boolean values, C is always 0.
>local a,b = true,false
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
.local "b" ; 1
[1] loadbool 0 1 0 ; true
[2] loadbool 1 0 0 ; false
[3] return 0 1
; end of function
This example is straightforward: Line [1] assigns true to local a (register 0) while line [2]
assigns false to local b (register 1). In both cases, field C is 0, so PC is not incremented and
the next instruction is not skipped.
This is an example of an expression that gives a boolean result and is assigned to a variable.
Notice that Lua does not optimize the expression into a true value; Lua 5.1 does not perform
compile-time constant evaluation for relational operations, but it can perform simple constant
evaluation for arithmetic operations.
Since the relational operator LT (which will be covered in greater detail later) does not give a
boolean result but performs a conditional jump, LOADBOOL uses its C operand to perform
an unconditional jump in line [3]. This saves one instruction and makes things a little tidier.
The reason for all this is that the instruction set is simply optimized for if...then blocks.
Essentially, local a = 5 > 2 is executed in the following way:
local a
if 2 < 5 then
a = true
else
a = false
end
In the disassembly listing, when LT tests 2 < 5, it evaluates to true and doesn’t perform a
conditional jump. Line [2] jumps over the false result path, and in line [4], the local a
(register 0) is assigned the boolean true by the instruction LOADBOOL. If 2 and 5 were
reversed, line [3] would be followed instead, setting a to false, and then the true result path
(line [4]) would be skipped, since the LOADBOOL on line [3] has its field C set to non-zero.
So the true result path goes like this (additional comments in parentheses):
[1] lt 1 257 256 ; 2 5, to [3] if false (if 2 < 5)
[2] jmp 1 ; to [4]
[4] loadbool 0 1 0 ; true (a = true)
[5] return 0 1
and the false result path (which never executes in this example) goes like this:
[1] lt 1 257 256 ; 2 5, to [3] if false (if 2 < 5)
[3] loadbool 0 0 1 ; false, to [5] (a = false)
[5] return 0 1
The true result path looks longer, but it isn’t, due to the way the virtual machine is
implemented. This will be discussed further in the section on relational and logic instructions.
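Both result paths can be traced with a very small interpreter. The following Python sketch handles only the four opcodes in the listing and is not how the real virtual machine is written; the function names are illustrative, and the constant order (5 at index 0, 2 at index 1) is inferred from the operands 257 and 256 and the comment "2 5" on line [1].

```python
def rk(field, regs, consts):
    """Resolve an RK operand: MSB (256) set means constant, else register."""
    return consts[field - 256] if field >= 256 else regs[field]


def run(code, consts, nregs=2):
    """Execute a tiny subset of opcodes; return the registers and a line trace."""
    regs, trace, pc = [None] * nregs, [], 1   # listing lines start at 1
    while True:
        op, *args = code[pc - 1]
        trace.append(pc)
        pc += 1                       # PC now points at the next instruction
        if op == "lt":                # if ((RK(B) < RK(C)) ~= A) then PC++
            a, b, c = args
            if (rk(b, regs, consts) < rk(c, regs, consts)) != bool(a):
                pc += 1
        elif op == "jmp":             # PC += sBx
            pc += args[0]
        elif op == "loadbool":        # R(A) := (Bool)B; if (C) PC++
            a, b, c = args
            regs[a] = bool(b)
            if c:
                pc += 1
        elif op == "return":
            return regs, trace


code = [("lt", 1, 257, 256), ("jmp", 1), ("loadbool", 0, 0, 1),
        ("loadbool", 0, 1, 0), ("return", 0, 1)]

regs, trace = run(code, consts=[5, 2])    # local a = 5 > 2
print(regs[0], trace)                     # True [1, 2, 4, 5]
regs, trace = run(code, consts=[2, 5])    # constants swapped: tests 5 < 2
print(regs[0], trace)                     # False [1, 3, 5]
```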
7 Upvalues and Globals Upvalue 和全局变量
When the Lua virtual machine needs an upvalue or a global, there are dedicated instructions
to load the value into a register. Similarly, when an upvalue or a global needs to be written to,
dedicated instructions are used.
The GETGLOBAL and SETGLOBAL instructions are very straightforward and easy to use.
The instructions require that the global variable name be a constant, indexed by instruction
field Bx. R(A) is either the source or target register. The names of the global variables used
by a function will be part of the constant list of the function.
From the example, you can see that “b” is the name of the local variable while “a” is the
name of the global variable. Line [1] loads the number 40 into register 0 (functioning as a
temporary register, since local b hasn’t been defined.) Line [2] assigns the value in register 0
to the global variable with name “a” (constant 0). By line [3], local b is defined and is
assigned the value of global a.
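The three steps just described can be mimicked with a dictionary standing in for the global table. This is a Python sketch only; the operand values in the comments and the position of 40 in the constant list are assumptions reconstructed from the description, not a copy of ChunkSpy output.

```python
def run_globals_example():
    consts = ["a", 40]          # "a" is constant 0; the slot for 40 is assumed
    regs = [None]               # register 0, later local b
    globals_table = {}          # stands in for the global table

    regs[0] = consts[1]                  # [1] loadk     0 1  ; 40
    globals_table[consts[0]] = regs[0]   # [2] setglobal 0 0  ; a
    regs[0] = globals_table[consts[0]]   # [3] getglobal 0 0  ; a
    return regs, globals_table


print(run_globals_example())    # ([40], {'a': 40})
```

In both global instructions the name is fetched from the constant list first, which is why global names always occupy constant slots.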
During execution, upvalues are set up by a CLOSURE, and maintained by the Lua virtual
machine. In the following example, function b is declared inside the main chunk, and is
shown in the disassembly as a function prototype within a function prototype. The
indentation, which is not in the original output, helps to visually separate the two functions.
.function 1 0 0 2
.upvalue "a" ; 0
.const 1 ; 0
[1] loadk 0 0 ; 1
[2] setupval 0 0 ; a
[3] getupval 0 0 ; a
[4] return 0 2
[5] return 0 1
; end of function
Line [2] in function b sets upvalue a (upvalue number 0 in the upvalue table) to a number
value of 1 (held in temporary register 0.) In line [3], the value in upvalue a is retrieved and
placed into register 0, where the following RETURN instruction will use it as a return value.
The RETURN in line [5] is unused.
8 Table Instructions 表指令
Accessing table elements is a little more complex than accessing upvalues and globals:
All 3 operand fields are used, and some of the operands can be constants. A constant is
specified by setting the MSB of the operand to 1. If RK(C) needs to refer to constant 1, the
encoded value will be (256 | 1) or 257, where 256 is the value of bit 8 of the operand.
Allowing constants to be used directly considerably reduces the need for temporary registers.
>local p = {}; p[1] = "foo"; return p["bar"]
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "p" ; 0
.const 1 ; 0
.const "foo" ; 1
.const "bar" ; 2
[1] newtable 0 0 0 ; array=0, hash=0
[2] settable 0 256 257 ; 1 "foo"
[3] gettable 1 0 258 ; "bar"
[4] return 1 2
[5] return 0 1
; end of function
In line [1], a new empty table is created and the reference placed in local p (register 0).
Creating and populating new tables is a little involved so it will only be discussed later.
Table index 1 is set to “foo” in line [2] by the SETTABLE instruction. Both the index and the
value for the table element are encoded constant numbers; 256 is constant 0 (the number 1)
while 257 is constant 1 (the string “foo”.) The R(A) value of 0 points to the new table that
was defined in line [1].
In line [3], the value of the table element indexed by the string “bar” is copied into temporary
register 1, which is then used by RETURN as a return value. 258 is constant 2 (the string
“bar”) while 0 in field B is the reference to the table.
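The operand decoding in lines [2] and [3] can be replayed with a Python dictionary as the table. The helper names below are made up for this sketch, which assumes SETTABLE performs R(A)[RK(B)] := RK(C) and GETTABLE performs R(A) := R(B)[RK(C)]; None models nil.

```python
def rk(field, regs, consts):
    """MSB (256) set selects a constant, otherwise a register."""
    return consts[field - 256] if field >= 256 else regs[field]


def settable(regs, consts, a, b, c):
    # R(A)[RK(B)] := RK(C)
    regs[a][rk(b, regs, consts)] = rk(c, regs, consts)


def gettable(regs, consts, a, b, c):
    # R(A) := R(B)[RK(C)]; a missing key yields nil (None here)
    regs[a] = regs[b].get(rk(c, regs, consts))


consts = [1, "foo", "bar"]
regs = [None, None]
regs[0] = {}                            # [1] newtable 0 0 0
settable(regs, consts, 0, 256, 257)     # [2] settable 0 256 257 ; 1 "foo"
gettable(regs, consts, 1, 0, 258)       # [3] gettable 1 0 258   ; "bar"
print(regs)                             # [{1: 'foo'}, None]
```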
RK(B) and RK(C) type operands are also used in virtual machine instructions that implement
binary arithmetic operators and relational operators.
9 Arithmetic and String Instructions 算数和字符串指令
The Lua virtual machine’s set of arithmetic instructions looks like 3-operand arithmetic
instructions on a RISC processor. 3-operand instructions allow arithmetic expressions to be
translated into machine instructions pretty efficiently.
The source operands, RK(B) and RK(C), may be constants. If a constant is out of range of
field B or field C, then the constant will be loaded into a temporary register in advance.
[9] return 0 1
; end of function
In the disassembly shown above, parts of the expression are shown as additional comments in
parentheses. Each arithmetic operator translates into a single instruction. This also means that
while the statement “count = count + 1” is verbose, it translates into a single instruction
if count is a local. If count is a global, then two extra instructions are required to read and
write to the global (GETGLOBAL and SETGLOBAL), since arithmetic operations can be
done on registers (locals) only.
As of Lua 5.1, the parser and code generator can perform limited constant expression folding
or evaluation. Constant folding only works for binary arithmetic operators and the unary
minus operator (UNM, which will be covered next.) There is no equivalent optimization for
relational, boolean or string operators.
The optimization rule is simple: If both terms of a subexpression are numbers, the
subexpression will be evaluated at compile time. However, there are exceptions. One, the
code generator will not attempt to divide a number by 0 for DIV and MOD, and two, if the
result is evaluated as a NaN (Not a Number) then the optimization will not be performed.
Also, constant folding is not done if one term is in the form of a string that needs to be coerced.
In addition, expression terms are not rearranged, so not all optimization opportunities can be
recognized by the code generator. This is intentional; the Lua code generator is not meant to
perform heavy duty optimizations, as Lua is a lightweight language. Here are a few examples
to illustrate how it works (additional comments in parentheses):
>local a = 4 + 7 + b; a = b + 4 * 7; a = b + 4 + 7
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
.const "b" ; 0
.const 11 ; 1
.const 28 ; 2
.const 4 ; 3
.const 7 ; 4
[1] getglobal 0 0 ; b
[2] add 0 257 0 ; 11 (a = 11 + b)
[3] getglobal 1 0 ; b
[4] add 0 1 258 ; 28 (a = b + 28)
[5] getglobal 1 0 ; b
[6] add 1 1 259 ; 4 (loc1 = b + 4)
[7] add 0 1 260 ; 7 (a = loc1 + 7)
[8] return 0 1
; end of function
For the first assignment statement, 4+7 is evaluated, thus 11 is added to b in line [2]. Next, in
lines [3] and [4], b and 28 are added together and assigned to a because multiplication has a
higher precedence and 4*7 is evaluated first. Finally, on lines [5] to [7], there are two
addition operations. Since addition is left-associative, code is generated for b+4 first, and
only after that, 7 is added. So in the third example, Lua performs no optimization. This can be
fixed using parentheses to explicitly change the precedence of a subexpression:
Now, the 4+7 subexpression can be evaluated at compile time. If the statement is written as:
the code generator will generate a single LOADK instruction; Lua first evaluates 4+7, then 7
is added, giving a total of 18. The arithmetic expression is completely evaluated in this case,
thus no arithmetic instructions are generated.
In order to make full use of constant folding in Lua 5.1, the user just needs to remember the
usual order of evaluation of an expression’s elements and apply parentheses where necessary.
The following are two expressions which will not be evaluated at compile time:
The first is due to a divide-by-0, while the second is due to a string constant that needs to be
coerced into a number. In both cases, constant folding is not performed, so the arithmetic
instructions needed to perform the operations at run time are generated instead.
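The folding rule and its exceptions can be approximated with a small expression-tree folder. This Python sketch is not the Lua code generator's algorithm; it merely encodes the rules stated in this chapter, with tuples as subexpressions and plain strings standing for both variable names and string constants (neither of which is folded).

```python
import math
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "%": operator.mod}


def fold(node):
    """Fold ("op", lhs, rhs) tuples bottom-up; leaves are numbers or names."""
    if not isinstance(node, tuple):
        return node
    op, lhs, rhs = node[0], fold(node[1]), fold(node[2])
    both_numbers = all(isinstance(x, (int, float)) for x in (lhs, rhs))
    if not both_numbers:
        return (op, lhs, rhs)         # a name or a string blocks folding
    if op in ("/", "%") and rhs == 0:
        return (op, lhs, rhs)         # never divide by zero at compile time
    result = OPS[op](lhs, rhs)
    if isinstance(result, float) and math.isnan(result):
        return (op, lhs, rhs)         # NaN results are left for run time
    return result


print(fold(("+", ("+", 4, 7), "b")))   # ('+', 11, 'b')             4 + 7 + b
print(fold(("+", "b", ("*", 4, 7))))   # ('+', 'b', 28)             b + 4 * 7
print(fold(("+", ("+", "b", 4), 7)))   # ('+', ('+', 'b', 4), 7)    b + 4 + 7
print(fold(("+", "b", ("+", 4, 7))))   # ('+', 'b', 11)             b + (4 + 7)
print(fold(("/", 1, 0)))               # ('/', 1, 0)
```

The third and fourth calls show why parentheses matter: terms are never rearranged, so only a subexpression whose two operands are already numbers gets folded.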
Next are instructions for performing unary minus and logical NOT:
Here is an example of these two unary operations:
>local p,q = 10,false; q,p = -p,not q
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "p" ; 0
.local "q" ; 1
.const 10 ; 0
[1] loadk 0 0 ; 10
[2] loadbool 1 0 0 ; false
[3] unm 2 0
[4] not 0 1
[5] move 1 2
[6] return 0 1
; end of function
Neither UNM nor NOT accepts a constant as a source operand, making the LOADK on
line [1] and the LOADBOOL on line [2] necessary. When a unary minus is applied to a
constant number, the unary minus is optimized away. Similarly, when a not is applied to true
or false, the logical operation is optimized away.
In addition to this, constant folding is performed for unary minus, if the term is a number. So,
the expression in the following is completely evaluated at compile time:
>local a = - (7 / 4)
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
.const -1.75 ; 0
[1] loadk 0 0 ; -1.75
[2] return 0 1
; end of function
Constant folding is performed on 7/4 first. Then, since the unary minus operator is applied to
the constant 1.75, constant folding can be performed again, and the code generated becomes a
simple LOADK (on line [1].)
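Returns the length of the object in R(B). For strings, the string length is returned, while for
tables, the table size (as defined in Lua) is returned. For other objects, the metamethod is
called. The result, which is a number, is placed in R(A).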
This instruction is new in Lua 5.1, implementing the # operator. If # operates on a constant,
then the constant is loaded in advance using LOADK. The LEN instruction is currently not
optimized away using compile time evaluation, even if it is operating on a constant string or
table.
In the above example, LEN operates on local b in line [1], leaving the result in local a. Since
LEN cannot operate directly on constants, line [2] first loads the constant “foo” into a
temporary local, and only then LEN is executed.
Like LOADNIL, CONCAT accepts a range of registers. Doing more than one string
concatenation at a time is faster and more efficient than doing them separately.
.function 0 0 2 6
.local "x" ; 0
.local "y" ; 1
.const "foo" ; 0
.const "bar" ; 1
[1] loadk 0 0 ; "foo"
[2] loadk 1 1 ; "bar"
[3] move 2 0
[4] move 3 1
[5] move 4 0
[6] move 5 1
[7] concat 2 2 5
[8] return 2 2
[9] return 0 1
; end of function
In this example, strings are moved into place first (lines [3] to [6]) in the concatenation order
before a single CONCAT instruction is executed in line [7]. The result is left in temporary
local 2, which is then used as a return value by the RETURN instruction on line [8].
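The effect of a register-range CONCAT can be reproduced as follows. This is a Python sketch with an invented helper name, assuming the semantics R(A) := R(B).. ... ..R(C) over an inclusive range; it replays lines [1] to [7] of the listing above.

```python
def concat(regs, a, b, c):
    """CONCAT A B C: R(A) := R(B) .. ... .. R(C), over the inclusive range."""
    regs[a] = "".join(str(regs[i]) for i in range(b, c + 1))


regs = ["foo", "bar", None, None, None, None]   # x, y and four temporaries
regs[2], regs[3] = regs[0], regs[1]   # [3] move 2 0   [4] move 3 1
regs[4], regs[5] = regs[0], regs[1]   # [5] move 4 0   [6] move 5 1
concat(regs, 2, 2, 5)                 # [7] concat 2 2 5
print(regs[2])                        # foobarfoobar
```

Four operands are joined by one instruction, which is why the code generator first lines the values up in consecutive registers.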
In the second example, three strings are concatenated together. Note that there is no string
constant folding. Lines [1] through [3] load the three constants in the correct order for
concatenation; the CONCAT on line [4] performs the concatenation itself and assigns the
result to local a.
10 Jumps and Calls 跳转和调用
Lua does not have any unconditional jump feature in the language itself, but in the virtual
machine, the unconditional jump is used in control structures and logical expressions.
For example, since a relational test instruction makes conditional jumps rather than generating
a boolean result, a JMP is used in the code sequence for loading either a true or a false:
>local m, n; return m >= n
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "m" ; 0
.local "n" ; 1
[1] le 1 1 0 ; to [3] if false (n <= m)
[2] jmp 1 ; to [4]
[3] loadbool 2 0 1 ; false, to [5] (false path)
[4] loadbool 2 1 0 ; true (true path)
[5] return 2 2
[6] return 0 1
; end of function
Line [1] performs the relational test. In line [2], the JMP skips over the false path (line [3]) to
the true path (line [4]). The result is placed into temporary local 2, and returned to the caller
by RETURN in line [5]. More examples where JMP is used will be covered in later chapters.
Next we will look at the CALL instruction, for calling instantiated functions:
In line [2], the call has zero parameters (field B is 1), zero results are retained (field C is 1),
while register 0 temporarily holds the reference to the function object from global z. Next we
see a function call with multiple parameters or arguments:
>z(1,2,3)
; function [0] definition (level 1)
; 0 upvalues, 0 params, 4 stacks
.function 0 0 2 4
.const "z" ; 0
.const 1 ; 1
.const 2 ; 2
.const 3 ; 3
[1] getglobal 0 0 ; z
[2] loadk 1 1 ; 1
[3] loadk 2 2 ; 2
[4] loadk 3 3 ; 3
[5] call 0 4 1
[6] return 0 1
; end of function
Lines [1] to [4] load the function reference and the arguments in order, then line [5] makes
the call with an operand B value of 4, which means there are 3 parameters. Since the call
statement is not assigned to anything, no return results need to be retained, hence field C is 1.
Here is an example that uses multiple parameters and multiple return values:
First, the function references are retrieved (lines [1] and [2]), then function y is called first
(temporary register 1). The CALL has a field C of 0, meaning multiple return values are
accepted. These return values become the parameters to function z, and so in line [4], field B
of the CALL instruction is 0, signifying multiple parameters. After the call to function z, 4
results are retained, so field C in line [4] is 5. Finally, here is an example with calls to
standard library functions:
>print(string.char(64))
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.const "print" ; 0
.const "string" ; 1
.const "char" ; 2
.const 64 ; 3
[1] getglobal 0 0 ; print
[2] getglobal 1 1 ; string
[3] gettable 1 1 258 ; "char"
[4] loadk 2 3 ; 64
[5] call 1 2 0
[6] call 0 0 1
[7] return 0 1
; end of function
When a function call is the last parameter to another function call, the former can pass
multiple return values, while the latter can accept multiple parameters.
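The off-by-one encoding of the CALL operands is easy to get wrong, so here is a small decoding helper. It is a Python sketch with a made-up name (decode_call), based only on the field conventions described in this chapter: B-1 parameters, C-1 retained results, and 0 meaning a variable number.

```python
def decode_call(b, c):
    """Return (argument count, result count); None means 'variable'."""
    nargs = None if b == 0 else b - 1       # B == 0: arguments run to top of stack
    nresults = None if c == 0 else c - 1    # C == 0: keep all returned values
    return nargs, nresults


print(decode_call(4, 1))   # (3, 0)      call 0 4 1, i.e. z(1,2,3)
print(decode_call(2, 0))   # (1, None)   call 1 2 0, i.e. string.char(64)
print(decode_call(0, 1))   # (None, 0)   call 0 0 1, i.e. print(...)
```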
Like CALL, a field B value of 0 signifies multiple return values (up to top of stack.)
>local e,f,g; return f,g
; function [0] definition (level 1)
; 0 upvalues, 0 params, 5 stacks
.function 0 0 2 5
.local "e" ; 0
.local "f" ; 1
.local "g" ; 2
[1] move 3 1
[2] move 4 2
[3] return 3 3
[4] return 0 1
; end of function
In line [3], 2 return values are specified (field B value of 3.) The return values are placed in
consecutive registers starting from register 3 by the MOVEs on lines [1] and [2]. The
RETURN in line [4] is redundant; it is always generated by the Lua code generator.
A TAILCALL is used only for one specific return style, described above. Multiple return
results are always produced by a tail call. Here is an example:
.const "x" ; 0
.const "foo" ; 1
.const "bar" ; 2
[1] getglobal 0 0 ; x
[2] loadk 1 1 ; "foo"
[3] loadk 2 2 ; "bar"
[4] tailcall 0 3 0
[5] return 0 0
[6] return 0 1
; end of function
Arguments for a tail call are handled in exactly the same way as arguments for a normal call,
so in line [4], the tail call has a field B value of 3, signifying 2 parameters. Field C is 0, for
multiple returns; this is due to the constant LUA_MULTRET in lua.h. In practice, field C is
not used by the virtual machine (except as an assert) since the syntax guarantees multiple
return results.
Line [5] is a RETURN instruction specifying multiple return results. This is required when
the function called by TAILCALL is a C function. In the case of a C function, execution
continues to line [5] upon return, thus the RETURN is necessary. Line [6] is redundant.
When Lua functions are tailcalled, the virtual machine does not return to line [5] at all.
The other instructions covered in this section are SELF and VARARG. Both instructions are
covered here because they are closely tied to function calls. We will start with VARARG:
The use of VARARG will become clear with the help of a few examples:
Note that the main or top-level chunk is a vararg function, as the is_vararg flag is set (the
third number of the .function directive) in the example above. In this example, the left hand
side of the assignment statement needs three values (or objects.) So in line [1], the operand B
of the VARARG instruction is (3+1), or 4. VARARG will copy three values into a, b and c.
If there are fewer than three values available, nils will be used to fill up the empty places.
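The copying and nil-padding behaviour of VARARG can be sketched as below. This is a Python model under stated assumptions, not the virtual machine's code: the function name vararg is invented, None models nil, and the register list is grown on demand where the real machine would rely on the stack size.

```python
def vararg(regs, a, b, varargs):
    """VARARG A B: copy B-1 vararg values into R(A) onward, nil-padded.

    B == 0 copies every available value instead.
    """
    count = len(varargs) if b == 0 else b - 1
    values = list(varargs[:count])
    values += [None] * (count - len(values))    # None models nil
    needed = a + count - len(regs)
    if needed > 0:
        regs.extend([None] * needed)            # grow the toy register file
    regs[a:a + count] = values
    return regs


print(vararg([None] * 3, 0, 4, ["x", "y"]))     # ['x', 'y', None]
print(vararg([None] * 2, 1, 0, [1, 2, 3]))      # [None, 1, 2, 3]
```

The first call corresponds to an operand B of 4 with only two values passed in, so the third target receives nil.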
>local a = function(...) local a,b,c = ... end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
Here is an alternate version where a function is instantiated and assigned to local a. The
old-style arg is retained for compatibility purposes, but is unused in the above example.
>local a; a(...)
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "a" ; 0
[1] move 1 0
[2] vararg 2 0
[3] call 1 0 1
[4] return 0 1
; end of function
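When a function is called with “...” as its argument, it accepts a variable number of
parameters. On line [2], a VARARG with a B field of 0 is used. The VARARG will copy all the
parameters passed to the main chunk into registers starting from register 2, so that the CALL
in the next line can use them as parameters of function a. The function call is set to accept a
variable number of parameters and to return zero results.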
>local a = {...}
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
[1] newtable 0 0 0 ; array=0, hash=0
[2] vararg 1 0
[3] setlist 0 0 1 ; index 1 to top
[4] return 0 1
; end of function
>return ...
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
[1] vararg 0 0
[2] return 0 0
[3] return 0 1
; end of function
Above are two other cases where VARARG needs to copy all passed parameters over to a set
of registers in order for the next operation to proceed. Both the above forms of table creation
and return accept a variable number of values or objects.
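then a reference to the table itself is placed in the next register, R(A+1). This
instruction saves some messy manipulation when setting up a method call.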
R(B) is the register holding the reference to the table with the method. The
method function itself is found using the table index RK(C), which may be
the value of register R(C) or a constant number.
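Finally, we have an instruction, SELF, for object-oriented programming. A SELF instruction
saves an extra instruction and speeds up the calling of methods in object-oriented
programming. It is only generated for method calls that use the colon syntax.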
>foo:bar("baz")
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.const "foo" ; 0
.const "bar" ; 1
.const "baz" ; 2
[1] getglobal 0 0 ; foo
[2] self 0 0 257 ; "bar"
[3] loadk 2 2 ; "baz"
[4] call 0 3 1
[5] return 0 1
; end of function
The method call is equivalent to: foo.bar(foo, "baz") , except that the global foo is only
looked up once. This is significant if metamethods have been set. The SELF in line [2] is
equivalent to a GETTABLE lookup (the table is in register 0 and the index is constant 1) and
a MOVE (copying the table reference from register 0 to register 1.)
Without SELF, a GETTABLE will write its lookup result to register 0 (which the code
generator will normally do) and the table reference will be overwritten before a MOVE can
be done. Using SELF saves roughly one instruction and one temporary register slot.
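The combined lookup-and-copy performed by SELF can be modelled like this. It is a Python sketch with an invented function name, assuming the semantics R(A+1) := R(B); R(A) := R(B)[RK(C)]; a dictionary plays the role of the table foo and a lambda plays the method.

```python
def self_op(regs, consts, a, b, c):
    """SELF A B C: R(A+1) := R(B); R(A) := R(B)[RK(C)]."""
    table = regs[b]              # read the table first, since A may equal B
    key = consts[c - 256] if c >= 256 else regs[c]
    regs[a + 1] = table          # the receiver becomes the first argument
    regs[a] = table[key]         # the method function itself


foo = {"bar": lambda self, s: (self is foo, s)}
consts = ["foo", "bar", "baz"]
regs = [foo, None, None]             # [1] getglobal 0 0   ; foo
self_op(regs, consts, 0, 0, 257)     # [2] self 0 0 257    ; "bar"
regs[2] = consts[2]                  # [3] loadk 2 2       ; "baz"
result = regs[0](regs[1], regs[2])   # [4] call 0 3 1 (the real call keeps no results)
print(result)                        # (True, 'baz')
```

Reading the table reference before overwriting R(A) is the point of the instruction: with A equal to B, a plain GETTABLE would destroy the reference that the method needs as its first argument.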
After setting up the method call using SELF, the call is made with the usual CALL
instruction in line [4], with two parameters. The equivalent code for a method lookup is
compiled in the following manner:
The alternative form of a method call is one instruction longer, and the user must take note of
any metamethods that may affect the call. The SELF in the previous example replaces the
GETTABLE on line [2] and the GETGLOBAL on line [3]. If foo is a local variable, then the
equivalent code is a GETTABLE and a MOVE.
Next we will look at more complicated instructions.
11 Relational and Logic Instructions 关系和逻辑指令
Relational and logic instructions are used in conjunction with other instructions to implement
control structures or expressions. Instead of generating boolean results, these instructions
conditionally perform a jump over the next instruction; the emphasis is on implementing
control blocks. Instructions are arranged so that there are two paths to follow based on the
relational test.
By comparing the result of the relational operation with A, the sense of the comparison can
be reversed. Obviously the alternative is to reverse the paths taken by the instruction, but that
will probably complicate code generation some more. The conditional jump is performed if
the comparison result is not A, whereas execution continues normally if the comparison result
matches A. Due to the way code is generated and the way the virtual machine works, a JMP
instruction is always expected to follow an EQ, LT or LE. The following JMP is optimized
by executing it in conjunction with EQ, LT or LE.
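The role of operand A is perhaps clearest when written out as a predicate. The Python sketch below uses invented names and assumes the notation if ((RK(B) op RK(C)) ~= A) then PC++ for EQ, LT and LE; it reports whether the instruction following the test is skipped.

```python
import operator

TESTS = {"eq": operator.eq, "lt": operator.lt, "le": operator.le}


def skips_next(op, a, lhs, rhs):
    """True if the test instruction skips the instruction that follows it.

    Mirrors the notation: if ((RK(B) op RK(C)) ~= A) then PC++
    """
    return TESTS[op](lhs, rhs) != bool(a)


# With A = 0 the skip happens when the two values are equal (both nil here)
print(skips_next("eq", 0, None, None))   # True
# With A = 1 the sense is reversed
print(skips_next("eq", 1, None, None))   # False
print(skips_next("lt", 1, 2, 5))         # False
```

When the result is False, execution continues with the JMP that follows the test, which is how the two code paths are selected.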
>local x,y; return x ~= y
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 0 3
.local "x" ; 0
.local "y" ; 1
[1] loadnil 0 1
[2] eq 0 0 1 ; to [4] if true (x ~= y)
[3] jmp 1 ; to [5]
[4] loadbool 2 0 1 ; false, to [6] (false result path)
[5] loadbool 2 1 0 ; true (true result path)
[6] return 2 2
[7] return 0 1
; end of function
In the above example, the equality test is performed in line [2]. However, since the result of
the comparison needs to be returned, LOADBOOL instructions are used to set a
register with the correct boolean value. This is the usual code pattern generated if the
expression requires a boolean value to be generated and stored in a register as an intermediate
value or a final result.
It is easier to visualize the disassembled code as:
if x ~= y then
return true
else
return false
end
The true result path (when the comparison result matches A) goes like this:
while the false result path (when the comparison result does not match A) goes like this:
ChunkSpy comments the EQ in line [2] by letting the user know when the conditional jump
is taken. The jump is taken when “the value in register 0 equals the value in register 1” (the
comparison) is not false (the value of operand A). If the comparison is x == y, everything
will be the same except that the A operand in the EQ instruction will be 1, thus reversing the
sense of the comparison. Anyway, these are just the Lua code generator’s conventions; there
are other ways to code x ~= y in terms of Lua virtual machine instructions.
For conditional statements, there is no need to set boolean results. Lua is optimized for
coding the more common conditional statements rather than conditional expressions.
>local x,y; if x ~= y then return "foo" else return "bar" end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "x" ; 0
.local "y" ; 1
.const "foo" ; 0
.const "bar" ; 1
[1] eq 1 0 1 ; to [3] if false (x ~= y)
[2] jmp 3 ; to [6]
[3] loadk 2 0 ; "foo" (true block)
[4] return 2 2
[5] jmp 2 ; to [8]
[6] loadk 2 1 ; "bar" (false block)
[7] return 2 2
[8] return 0 1
; end of function
In the above conditional statement, the same inequality operator is used in the source, but the
sense of the EQ instruction in line [1] is now reversed. Since the EQ conditional jump can
only skip the next instruction, additional JMP instructions are needed to allow large blocks of
code to be placed in both true and false paths. In contrast, in the previous example, only a
single instruction is needed to set a boolean value. For if statements, the true block comes
first followed by the false block in code generated by the code generator. To reverse the
positions of the true and false paths, the value of operand A is changed.
The true path (when x ~= y is true) goes from [1] to [3]–[5] and on to [8]. Since there is a
RETURN in line [4], the JMP in line [5] and the RETURN in [8] are never executed at all;
they are redundant but do not adversely affect performance in any way. The false path is
from [1] to [2] to [6]–[8] onwards. So in a disassembly listing, you should see the true and
false code blocks in the same order as in the Lua source.
Here is another example, this time with an elseif:
>if 8 > 9 then return 8 elseif 5 >= 4 then return 5 else return 9 end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.const 8 ; 0
.const 9 ; 1
.const 5 ; 2
.const 4 ; 3
[01] lt 0 257 256 ; 9 8, to [3] if true (9 < 8)
[02] jmp 3 ; to [6]
[03] loadk 0 0 ; 8 (1st true block)
[04] return 0 2
[05] jmp 7 ; to [13]
[06] le 0 259 258 ; 4 5, to [8] if true (4 <= 5)
[07] jmp 3 ; to [11]
[08] loadk 0 2 ; 5 (2nd true block)
[09] return 0 2
[10] jmp 2 ; to [13]
[11] loadk 0 1 ; 9 (2nd false block)
[12] return 0 2
[13] return 0 1
; end of function
This example is a little more complex, but the blocks are structured in the same order as the
Lua source, so interpreting the disassembled code should not be too hard.
Next are the two instructions used for performing boolean tests and implementing Lua’s logic
operators:
Used to implement the and and or logical operators, or for testing a single
register in a conditional statement.
For TESTSET, register R(B) is coerced into a boolean and compared to the
boolean field C. If R(B) matches C, R(B) is assigned to R(A) and the VM
continues with the next instruction; otherwise the next instruction is skipped.
The and operator uses a C of 0 (false) while or uses a C value of 1 (true).
(Translator's note: the C values are not fixed. For example, a = not b and c
gives a C of 1, while a = not b or c gives a C of 0.)
TEST is a more primitive version of TESTSET. TEST is used when the
assignment operation is not needed, otherwise it is the same as TESTSET
except that the operand slots are different.
For the fall-through case, a JMP is always expected, in order to optimize
execution in the virtual machine. In effect, TEST and TESTSET must always
be paired with a following JMP instruction.
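The behaviour of TESTSET can be sketched in Lua as follows. The model follows the ChunkSpy
comments in the listings of this section (for instance testset 2 0 1 ; to [3] if false, where
[3] is the instruction after the JMP). The helper names are hypothetical.

```lua
-- Lua truthiness: only nil and false count as false.
local function as_flag(v)
  if v == nil or v == false then return 0 else return 1 end
end

-- Hypothetical model of TESTSET A B C; rb stands for the value in R(B).
-- Returns true when R(A) := R(B) happens and the following JMP is executed,
-- false when the JMP is skipped and nothing is assigned.
local function testset_assigns(rb, c)
  return as_flag(rb) == c
end

print(testset_assigns(false, 0))  -- and-style (C = 0), false operand: assign, take the JMP
print(testset_assigns("x", 0))    -- and-style (C = 0), true operand: skip the JMP
print(testset_assigns("x", 1))    -- or-style (C = 1), true operand: assign, take the JMP
```

TEST can be read the same way, minus the assignment.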
TEST and TESTSET are used in conjunction with a following JMP instruction, while
TESTSET has an additional conditional assignment. Like EQ, LT and LE, the following
JMP instruction is compulsory, as the virtual machine will execute the JMP together with
TEST or TESTSET. The two instructions are used to implement short-circuit LISP-style
logical operators that retain and propagate operand values instead of booleans. First, we’ll
look at how and and or behave:
An and sequence exits on false operands, because any false
operands in a string of and operations will make the whole boolean expression false. If
operands evaluate to true, evaluation continues. When a string of and operations evaluates
to true, the result is the last operand value.
In line [1], local c is set to the first operand (the local a) when the test is false (with a
field C of 0), while the jump to [3] is made when the test is true, and then in line [3], the
expression result is set to the second operand (the local b). This is equivalent to:
if a then
  c = b -- executed by MOVE on line [3]
else
  c = a -- executed by TESTSET on line [1]
end
The c = a portion is done by TESTSET itself, while MOVE performs c = b. Now, if the result
is already set with one of the possible values, a TEST instruction is used instead:
The TEST instruction does not perform an assignment operation, since a = a is redundant.
This makes TEST a little faster. This is equivalent to:
Next, we look at the or operator:
>local a,b,c; c = a or b
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "a" ; 0
.local "b" ; 1
.local "c" ; 2
[1] testset 2 0 1 ; to [3] if false
[2] jmp 1 ; to [4]
[3] move 2 1
[4] return 0 1
; end of function
An or sequence exits on true operands, because any operands evaluating to true in a string of
or operations will make the whole boolean expression true. If operands evaluate to false,
evaluation continues. When a string of or operations evaluates to false, all operands must
have evaluated to false.
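Both operators hand back one of their operands rather than a coerced boolean. A short sketch in
Lua (the function names and_op and or_op are made up for the example, and they take operands
that are already evaluated, so they do not show the short-circuiting itself):

```lua
-- Same result as `a and b` for already-evaluated operands.
local function and_op(a, b)
  if a then return b else return a end
end

-- Same result as `a or b` for already-evaluated operands.
local function or_op(a, b)
  if a then return a else return b end
end

print(and_op(nil, "x"), and_op(1, "x"))   -- nil, then x
print(or_op(false, "y"), or_op(1, "y"))   -- y, then 1
```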
Short-circuit logical operators also mean that the following Lua code does not require the
use of a boolean operation:
For a single variable used in the expression part of a conditional statement, TEST is used to
boolean-test the variable:
>if Done then return end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.const "Done" ; 0
[1] getglobal 0 0 ; Done
[2] test 0 0 ; to [4] if true
[3] jmp 1 ; to [5]
[4] return 0 1
[5] return 0 1
; end of function
In line [2], the TEST instruction jumps to the true block if the value in temporary register 0
(from the global Done) is true. The JMP at line [3] jumps over the true block, which is the
code inside the if block (line [4]).
If the test expression of a conditional statement consists of purely boolean operators, then a
number of TEST instructions will be used in the usual short-circuit evaluation style:
>if Found and Match then return end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.const "Found" ; 0
.const "Match" ; 1
[1] getglobal 0 0 ; Found
[2] test 0 0 ; to [4] if true
[3] jmp 4 ; to [8]
[4] getglobal 0 1 ; Match
[5] test 0 0 ; to [7] if true
[6] jmp 1 ; to [8]
[7] return 0 1
[8] return 0 1
; end of function
In the last example, the true block of the conditional statement is executed only if both
Found and Match evaluate to true. The path is from [2] (test for Found) to [4] to [5] (test
for Match) to [7] (the true block, which is an explicit return statement).
If the statement has an else section, then the JMP on line [6] will jump to the false block (the
else block) while an additional JMP will be added to the true block to jump over this new
block of code. If or is used instead of and, the appropriate C operand will be adjusted
accordingly.
Finally, here is how Lua’s ternary operator (?: in C) equivalent works:
[1] test 0 0 ; to [3] if true
[2] jmp 2 ; to [5]
[3] testset 0 1 1 ; to [5] if false
[4] jmp 1 ; to [6]
[5] move 0 2
[6] return 0 1
; end of function
The TEST in line [1] is for the and operator. First, local a is tested in line [1]. If it is false,
then execution continues in [2], jumping to line [5]. Line [5] assigns local c to the end result
because if a is false, then a and b is false, and false or c is c.
If local a is true in line [1], the TEST instruction makes a jump to line [3], where there is a
TESTSET, for the or operator. If b evaluates to true, then the end result is assigned the value
of b, because b or c is b if b is not false. If b is also false, the end result will be c.
For the instructions in line [1], [3] and [5], the target (in field A) is register 0, or the local a,
which is the location where the result of the boolean expression is assigned. The equivalent
Lua code is:
The two a = c assignments are actually the same piece of code, but are repeated here to avoid
using a goto and a label. Normally, if we assume b is not false and not nil, we end up with
the more recognizable form:
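The equivalent code did not survive in this copy. Below is a sketch of what the description
above implies, written as a function so that it can be compared with the built-in operators.
The name pick is hypothetical, and returns stand in for the assignments to a.

```lua
-- Expanded form of `a and b or c`, following lines [1], [3] and [5] of the listing.
local function pick(a, b, c)
  if a then
    if b then return b else return c end   -- TESTSET keeps b, otherwise MOVE assigns c
  else
    return c                               -- TEST fails, MOVE assigns c
  end
end

print(pick(true, "yes", "no"))   -- yes
print(pick(false, "yes", "no"))  -- no
print(pick(true, false, "no"))   -- no: the idiom only acts as a ternary if b is not false or nil
```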
12 Loop Instructions
Lua has dedicated instructions to implement the two types of for loops, while the other two
types of loops use traditional test-and-jump.
FORLOOP also sets R(A+3), the external loop index that is local to the loop
block. This is significant if the loop index is used as an upvalue (see below.)
R(A), R(A+1) and R(A+2) are not visible to the programmer.
The loop variable ends with the last value before the limit is reached (unlike
C) because it is not updated unless the jump is made. However, since loop
variables are local to the loop itself, you should not be able to use it unless
you cook up an implementation-specific hack.
Loop indices behave a little differently in Lua 5.1 compared to Lua 5.0.2. Consider the
following, where loop index i is used as an upvalue in the instantiation of 10 functions:
Lua 5.0.2 will print out 10, while Lua 5.1 will print out 5. In Lua 5.0.2, the scope of the loop
index encloses the for loop, resulting in the creation of a single upvalue. In Lua 5.1, the loop
index is truly local to the loop, resulting in the creation of 10 separate upvalues.
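The snippet referred to above is missing from this copy. A minimal reconstruction, assuming it
resembled the usual closure-in-a-loop test (the table name a is an assumption):

```lua
local a = {}
for i = 1, 10 do
  a[i] = function() return i end   -- i is captured as an upvalue
end
print(a[5]())   -- 5 on Lua 5.1, since each iteration has its own i
```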
For the sake of efficiency, FORLOOP contains a lot of functionality, so when a loop iterates,
only one instruction, FORLOOP, is needed. Here is a simple example:
[1] loadk 0 0 ; 0
[2] loadk 1 1 ; 1
[3] loadk 2 2 ; 100
[4] loadk 3 3 ; 5
[5] forprep 1 1 ; to [7]
[6] add 0 0 4
[7] forloop 1 -2 ; to [6] if loop
[8] return 0 1
; end of function
In the above example, notice that the for loop causes three additional local pseudo-variables
(or internal variables) to be defined, apart from the external loop index, i. The three
pseudo-variables, named (for index), (for limit) and (for step), are required to completely
specify the state of the loop, and are not visible to Lua source code. They are arranged in
consecutive registers, with the external loop index given by R(A+3) or register 4 in the example.
The loop body is in line [6] while line [7] is the FORLOOP instruction that steps through the
loop state. The sBx field of FORLOOP is negative, as it always jumps back to the beginning
of the loop body.
Lines [2]–[4] initialize the three register locations where the loop state will be stored. If the
loop step is not specified in the Lua source, a constant 1 is added to the constant pool and a
LOADK instruction is used to initialize the pseudo-variable (for step) with the loop step.
FORPREP in line [5] makes a negative loop step and jumps to line [7] for the initial test. In
the example, at line [5], the internal loop index (at register 1) will be (1-5) or -4. When the
virtual machine arrives at the FORLOOP in line [7] for the first time, one loop step is made
prior to the first test, so the initial value that is actually tested against the limit is (-4+5) or 1.
Since 1 < 100, an iteration will be performed. The external loop index i is then set to 1 and a
jump is made to line [6], thus starting the first iteration of the loop.
The loop at line [6]–[7] repeats until the internal loop index exceeds the loop limit of 100.
The conditional jump is not taken when that occurs and the loop ends. Beyond the scope of
the loop body, the loop state ((for index), (for limit), (for step) and i) is not valid. This is
determined by the parser and code generator. The range of PC values for which the loop state
variables are valid is located in the locals list. The brief assembly listings generated by
ChunkSpy that you are seeing do not give the startpc and endpc values contained in the
locals list. In theory, these rules can be broken if you write Lua assembly directly.
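The interplay of FORPREP and FORLOOP can be imitated in plain Lua. The sketch below is a
model, not the VM's code: numeric_for is a hypothetical helper, and it assumes the example
loop was for i = 1,100,5 accumulating into a, as the constants in the listing suggest.

```lua
-- Hypothetical model of FORPREP/FORLOOP; body is called once per iteration.
local function numeric_for(init, limit, step, body)
  local index = init - step              -- FORPREP: subtract the step, jump to FORLOOP
  while true do
    index = index + step                 -- FORLOOP: make one loop step first ...
    local in_range
    if step > 0 then
      in_range = index <= limit          -- ... then test against the limit
    else
      in_range = index >= limit
    end
    if not in_range then break end       -- no jump back: the loop ends
    body(index)                          -- R(A+3) receives a copy of the index
  end
end

local a = 0
numeric_for(1, 100, 5, function(i) a = a + i end)
print(a)   -- same total as: for i = 1,100,5 do a = a + i end
```

Because the step is applied before the test, the first value tested is the initial value
itself, and the body never sees a value beyond the limit.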
>for i = 10,1,-1 do if i == 5 then break end end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 4 stacks
.function 0 0 2 4
.local "(for index)" ; 0
.local "(for limit)" ; 1
.local "(for step)" ; 2
.local "i" ; 3
.const 10 ; 0
.const 1 ; 1
.const -1 ; 2
.const 5 ; 3
[1] loadk 0 0 ; 10
[2] loadk 1 1 ; 1
[3] loadk 2 2 ; -1
[4] forprep 0 3 ; to [8]
[5] eq 0 3 259 ; 5, to [7] if true
[6] jmp 1 ; to [8]
[7] jmp 1 ; to [9]
[8] forloop 0 -4 ; to [5] if loop
[9] return 0 1
; end of function
In the second loop example above, except for a negative loop step size, the structure of the
loop is identical. The body of the loop is from line [5] to line [8]. Since no additional stacks
or states are used, a break translates simply to a JMP instruction (line [7]). There is nothing
to clean up after a FORLOOP ends or after a JMP to exit a loop.
Apart from a numeric for loop (implemented by FORPREP and FORLOOP), Lua has a
generic for loop, implemented by TFORLOOP:
TFORLOOP A C   R(A+3), ... ,R(A+2+C) := R(A)(R(A+1), R(A+2));
               if R(A+3) ~= nil then {
                 R(A+2) = R(A+3);
               } else {
                 PC++;
               }
Performs an iteration of a generic for loop. A Lua 5-style generic for loop
keeps 3 items in consecutive register locations to keep track of things. R(A)
is the iterator function, which is called once per loop iteration. R(A+1) is the
state, and R(A+2) is the enumeration index. At the start, R(A+2) has an initial
value. R(A), R(A+1) and R(A+2) are internal to the loop and cannot be accessed
by the programmer; at first, they are set with an initial state.
In addition to these internal loop variables, the programmer specifies one or
more loop variables that are external and visible to the programmer. These
loop variables reside at locations R(A+3) onwards, and their count is
specified in operand C. Operand C must be at least 1. They are also local to
the loop body, like the external loop index in a numerical for loop.
Each time TFORLOOP executes, the iterator function referenced by R(A) is
called with two arguments: the state and the enumeration index (R(A+1) and
R(A+2).) The results are returned in the local loop variables, from R(A+3)
onwards, up to R(A+2+C).
Next, the first return value, R(A+3), is tested. If it is nil, the iterator loop is at
an end, and TFORLOOP skips the next instruction and the for loop block
ends. Note that the state of the generic for loop does not depend on any of
the external iterator variables that are visible to the programmer.
If R(A+3) is not nil, there is another iteration, and R(A+3) is assigned as the
new value of the enumeration index, R(A+2). Then the next instruction, which
must be a JMP, is immediately executed, sending execution back to the
beginning of the loop. This is an optimization case; TFORLOOP will not work
correctly without the JMP instruction.
Like the numerical for loop, the generic for loop behaves a little differently in Lua 5.1
compared to Lua 5.0.2. In the following example:
This example has a loop with one additional result (v) in addition to the loop enumerator (i):
>for i,v in pairs(t) do print(i,v) end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 8 stacks
.function 0 0 2 8
.local "(for generator)" ; 0
.local "(for state)" ; 1
.local "(for control)" ; 2
.local "i" ; 3
.local "v" ; 4
.const "pairs" ; 0
.const "t" ; 1
.const "print" ; 2
[01] getglobal 0 0 ; pairs
[02] getglobal 1 1 ; t
[03] call 0 2 4
[04] jmp 4 ; to [9]
[05] getglobal 5 2 ; print
[06] move 6 3
[07] move 7 4
[08] call 5 3 1
[09] tforloop 0 2 ; to [11] if exit
[10] jmp -6 ; to [5]
[11] return 0 1
; end of function
The iterator function is located in register 0, and is named (for generator) for debugging
purposes. The state is in register 1, and has the name (for state). The enumeration index,
(for control), is contained in register 2. These correspond to locals R(A), R(A+1) and R(A+2)
in the TFORLOOP description. Results from the iterator function call are placed into registers 3
and 4, which are locals i and v, respectively. On line [9], the operand C of TFORLOOP is 2,
corresponding to two iterator variables (i and v).
Lines [1]–[3] prepare the iterator state. Note that the call to the pairs standard library
function has 1 parameter and 3 results. After the call in line [3], register 0 is the iterator
function, register 1 is the loop state, register 2 is the initial value of the enumeration index.
The iterator variables i and v are both invalid at the moment, because we have not entered the
loop yet.
Line [4] is a JMP to TFORLOOP on line [9]. With the initial (or zeroth) iterator state,
TFORLOOP calls the iterator function, generating the first set of enumeration results in
locals i, v. If i is not nil, the internal enumeration index (register 2) is set and the JMP on the
next line is immediately executed, starting the first iteration of the loop body (lines [5]–[8]).
The body of the generic for loop executes (print(i,v)) and then TFORLOOP is
encountered again, calling the iterator function to get the next iteration state. Finally, when
the first result is a nil, the loop ends, and execution continues on line [11].
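In Lua source terms, the generic for loop above behaves roughly like the following hand-written
loop. This is a sketch: the local names generator, state and control mirror the debugging names
in the listing, and the table contents and the seen table are made up for the example.

```lua
local t = { 10, 20, 30 }
local seen = {}

local generator, state, control = pairs(t)   -- lines [1]-[3]: R(A), R(A+1), R(A+2)
while true do
  local i, v = generator(state, control)     -- TFORLOOP: call the iterator, C = 2 results
  if i == nil then break end                 -- first result is nil: the loop ends
  control = i                                -- R(A+2) := R(A+3)
  seen[i] = v                                -- loop body
end

print(seen[1], seen[2], seen[3])             -- 10, 20, 30
```

Note that the loop state advances through control only; assigning to i inside the body would
not disturb the iteration.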
repeat and while loops use a standard test-and-jump structure. Here is a repeat loop:
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
.const 0 ; 0
.const 1 ; 1
.const 10 ; 2
[1] loadk 0 0 ; 0
[2] add 0 0 257 ; 1
[3] eq 0 0 258 ; 10, to [5] if true
[4] jmp -3 ; to [2]
[5] return 0 1
; end of function
The body of the repeat loop is line [2], while the test-and-jump scheme is implemented in
lines [3] and [4]. Although two instructions are needed to loop the loop, Lua 5.1 executes EQ
and JMP together, saving some time.
>local a = 1; while a < 10 do a = a + 1 end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "a" ; 0
.const 1 ; 0
.const 10 ; 1
[1] loadk 0 0 ; 1
[2] lt 0 0 257 ; 10, to [4] if true
[3] jmp 2 ; to [6]
[4] add 0 0 256 ; 1
[5] jmp -4 ; to [2]
[6] return 0 1
; end of function
For a while loop, the test (line [2]) is made first. If the test is true, execution continues with
the loop body (line [4]). A JMP on line [5] returns execution to the loop test instruction. This
is a little different from Lua 5.0.2 while loops, which have the loop test at the end of the loop
block and have a loop condition size limitation.
A while loop in the Lua 5.0.2 style will look like this:
.const 10 ; 1
[1] loadk 0 0 ; 1
[2] jmp 1 ; to [4]
[3] add 0 0 256 ; 1
[4] lt 1 0 257 ; 10, to [6] if false
[5] jmp -3 ; to [3]
[6] return 0 1
; end of function
The sense of the condition test is reversed, while the loop body is at line [3]. The condition
test is made at the end of the loop on line [4].
13 Table Creation
There are two instructions for table creation and initialization. One instruction creates a table
while the other instruction sets the array elements of a table.
Creating an empty table forces both array and hash sizes to be zero:
>local q = {}
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "q" ; 0
[1] newtable 0 0 0 ; array=0, hash=0
[2] return 0 1
; end of function
In later examples, we will see how the size values are encoded. But first, we need to learn
about the SETLIST instruction, which is used to initialize array elements in a table.
We start with a simple example:
>local q = {1,2,3,4,5,}
; function [0] definition (level 1)
; 0 upvalues, 0 params, 6 stacks
.function 0 0 2 6
.local "q" ; 0
.const 1 ; 0
.const 2 ; 1
.const 3 ; 2
.const 4 ; 3
.const 5 ; 4
[1] newtable 0 5 0 ; array=5, hash=0
[2] loadk 1 0 ; 1
[3] loadk 2 1 ; 2
[4] loadk 3 2 ; 3
[5] loadk 4 3 ; 4
[6] loadk 5 4 ; 5
[7] setlist 0 5 1 ; index 1 to 5
[8] return 0 1
; end of function
A table with the reference in register 0 is created in line [1] by NEWTABLE. Since we are
creating a table with no hash elements, the array part of the table has a size of 5, while the
hash part has a size of 0.
Constants are then loaded into temporary registers 1 to 5 (lines [2] to [6]) before the
SETLIST instruction in line [7] assigns each value to consecutive table elements. The start of
the block is encoded as 1 in operand C. The starting index is calculated as (1-1)*50+1 or 1.
Since B is 5, the range of the array elements to be set becomes 1 to 5, while the objects used
to set the array elements will be R(1) through R(5).
Next is a larger table with 55 array elements. This will require two blocks to initialize. Some
lines have been removed and ellipsis (...) added to save space.
>local q = {1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, \
>>1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, \
>>1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,}
; function [0] definition (level 1)
; 0 upvalues, 0 params, 51 stacks
.function 0 0 2 51
.local "q" ; 0
.const 1 ; 0
.const 2 ; 1
...
.const 0 ; 9
[01] newtable 0 30 0 ; array=56, hash=0
[02] loadk 1 0 ; 1
[03] loadk 2 1 ; 2
...
[51] loadk 50 9 ; 0
[52] setlist 0 50 1 ; index 1 to 50
[53] loadk 1 0 ; 1
[54] loadk 2 1 ; 2
...
[57] loadk 5 4 ; 5
[58] setlist 0 5 2 ; index 51 to 55
[59] return 0 1
; end of function
Since FPF is 50, the array will be initialized in two blocks. The first block is for index 1 to 50,
while the second block is for index 51 to 55. Each array block to be initialized requires one
SETLIST instruction. On line [1], NEWTABLE has a field B value of 30, or 00011110 in
binary. From the description of NEWTABLE, xxx is 110 (binary), while eeeee is 11 (binary).
Thus, the size of the array portion of the table is (1110 binary)*2^(11 binary - 1) or (14*2^2)
or 56.
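The decoding of this "floating point byte" can be written out as a small function. The name
fb2int is borrowed from the helper in Lua's own source (luaO_fb2int); the Lua version below is
only a sketch, and it avoids bitwise operators so that it also runs on Lua 5.1.

```lua
-- Decode a floating point byte of the form eeeeexxx, as used by NEWTABLE's B and C fields.
local function fb2int(x)
  local e = math.floor(x / 8) % 32     -- upper 5 bits: exponent eeeee
  local m = x % 8                      -- lower 3 bits: mantissa xxx
  if e == 0 then return m end          -- small sizes are stored directly
  local size = m + 8                   -- (1xxx) in binary
  for _ = 1, e - 1 do                  -- multiply by 2^(eeeee-1)
    size = size * 2
  end
  return size
end

print(fb2int(30))   -- 56, the array size shown by ChunkSpy on line [1]
print(fb2int(5))    -- 5, as in the earlier five-element example
```

The encoding is lossy by design: 55 elements cannot be represented exactly, so the size is
rounded up to 56.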
Lines [2] to [51] set the values used to initialize the first block. On line [52], SETLIST has a
B value of 50 and a C value of 1. So the block is from 1 to 50. Source registers are from R(1)
to R(50). Lines [53] to [57] set the values used to initialize the second block. On line [58],
SETLIST has a B value of 5 and a C value of 2. So the block is from 51 to 55. The start of the
block is calculated as (2-1)*50+1 or 51. Source registers are from R(1) to R(5).
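The index arithmetic for SETLIST is simple enough to check by hand, but a short sketch makes
the pattern explicit. The helper setlist_range is hypothetical, and it only covers the case
where B is greater than 0.

```lua
local FPF = 50   -- fields per flush (LFIELDS_PER_FLUSH in Lua 5.1)

-- Hypothetical helper: first and last table index written by SETLIST A B C.
local function setlist_range(b, c)
  local first = (c - 1) * FPF + 1
  return first, first + b - 1
end

print(setlist_range(50, 1))   -- 1 and 50, as on line [52]
print(setlist_range(5, 2))    -- 51 and 55, as on line [58]
```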
Here is a table with hash elements:
>local q = {a=1,b=2,c=3,d=4,e=5,f=6,g=7,h=8,}
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "q" ; 0
.const "a" ; 0
.const 1 ; 1
.const "b" ; 2
.const 2 ; 3
.const "c" ; 4
.const 3 ; 5
.const "d" ; 6
.const 4 ; 7
.const "e" ; 8
.const 5 ; 9
.const "f" ; 10
.const 6 ; 11
.const "g" ; 12
.const 7 ; 13
.const "h" ; 14
.const 8 ; 15
[01] newtable 0 0 8 ; array=0, hash=8
[02] settable 0 256 257 ; "a" 1
[03] settable 0 258 259 ; "b" 2
[04] settable 0 260 261 ; "c" 3
[05] settable 0 262 263 ; "d" 4
[06] settable 0 264 265 ; "e" 5
[07] settable 0 266 267 ; "f" 6
[08] settable 0 268 269 ; "g" 7
[09] settable 0 270 271 ; "h" 8
[10] return 0 1
; end of function
In line [1], NEWTABLE is executed with an array part size of 0 and a hash part size of 8. On
lines [2] to line [9], key-value pairs are set using SETTABLE. The SETLIST instruction is
only for initializing array elements. Using SETTABLE to initialize the key-value pairs of a
table in the above example is quite efficient as it can reference the constant pool directly.
If there are both array elements and hash elements in a table constructor, both SETTABLE
and SETLIST will be used to initialize the table after the initial NEWTABLE. In addition, if
the last element of the table constructor is a function call or a vararg operator, then the B
operand of SETLIST will be 0, to allow objects from R(A+1) up to the top of the stack to be
initialized as array elements of the table.
In the above example, the table is first created in line [1] with its reference in register 0, and it
has both array and hash elements to be set. The size of the array part is 3 while the size of the
hash part is also 3.
Lines [2]–[4] load the values for the first 3 array elements. Lines [5]–[7] set the 3 key-value
pairs for the hash part of the table. In lines [8] and [9], the call to function foo is made, and
then in line [10], the SETLIST instruction sets the first 3 array elements (in registers 1 to 3)
plus whatever additional results are returned by the foo function call (from register 4 onwards.)
This is accomplished by setting operand B in SETLIST to 0. For the first block, operand C is
1 as usual. If no results are returned by the function, the top of stack is at register 3 and only
the 3 constant array elements in the table are set.
Note that only the last function call in a table constructor retains all results. Other function
calls in the table constructor keep only one result. This is shown in the above example. For
vararg operators in table constructors, please see the discussion for the VARARG instruction
for an example.
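The rule about the last function call is easy to observe from Lua itself. In the sketch below,
foo is a stand-in function that returns three values.

```lua
local function foo() return "a", "b", "c" end

local q = { 1, 2, 3, foo() }   -- foo() is last: all three results are stored
local r = { foo(), 1 }         -- foo() is not last: only its first result is kept

print(#q, #r)                  -- 6 and 2
```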
14 Closures and Closing
The final two instructions of the Lua virtual machine are a little involved because of the
handling of upvalues. The first is CLOSURE, for instantiating function prototypes:
If the function prototype has no upvalues, then CLOSURE is pretty straightforward: Bx has
the function number and R(A) is assigned the reference to the instantiated function object.
However, when an upvalue comes into the picture, we have to look a little more carefully:
; function [0] definition (level 2)
; 1 upvalues, 0 params, 2 stacks
.function 1 0 0 2
.upvalue "u" ; 0
[1] getupval 0 0 ; u
[2] return 0 2
[3] return 0 1
; end of function
In the example, the upvalue in the level 2 function is u, and within the main chunk there is a
single function prototype (indented in the listing above for clarity.) In the top-level function,
line [1], the closure is made. In line [3] the function reference is saved into global p. Line [2]
is a part of the CLOSURE instruction (it is not really an actual MOVE), and its B field specifies
that upvalue number 0 in the closed function is really local u in the enclosing function.
Here is another example, with 3 levels of function prototypes:
>local m \
>>function p() \
>>  local n \
>>  function q() return m,n end \
>>end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 2 stacks
.function 0 0 2 2
.local "m" ; 0
.const "p" ; 0
[4] return 0 1
; end of function
First, look at the top-level function and the level 2 function: there is one upvalue, m. In the
top-level function, the closure in line [1] has one more instruction following it (the MOVE),
for the upvalue m. This is similar to the previous example.
Next, compare the level 2 function and the level 3 function: now there are two upvalues, m
and n. The m upvalue is found 2 levels up. In the level 2 function, the closure in line [1] has
two instructions following it. The first is for upvalue number 0 (m); it uses GETUPVAL to
indicate that the upvalue is one or more levels lower down. The second is for upvalue number
1 (n); it uses MOVE, which indicates that the upvalue is in the same level as the CLOSURE
instruction. For both of these pseudo-instructions, the B field is used to point either to the
upvalue or local in question. The Lua virtual machine uses this information (CLOSURE
information and upvalue lists) to manage upvalues; for the programmer, upvalues just work.
The last instruction to be covered in this guide, CLOSE, also deals with upvalues:
If a local is used as an upvalue, then the local variable needs to be placed
somewhere, otherwise it will go out of scope and disappear when a lexical
block enclosing the local variable ends. CLOSE performs this operation for
all affected local variables for do end blocks or loop blocks. RETURN also
does an implicit CLOSE when a function returns.
CLOSE is easier to understand with an example:
>do \
>> local p,q \
>> r = function() return p,q end \
>>end
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "p" ; 0
.local "q" ; 1
.const "r" ; 0
p and q are local to the do end block, and they are upvalues as well. The global r is assigned
an anonymous function that has p and q as upvalues. When p and q go out of scope at the
end of the do end block, both variables have to be put somewhere because they are part of
the environment of the function instantiated in r. This is where the CLOSE instruction comes
in.
In the top-level function, the CLOSE in line [5] makes the virtual machine find all affected
locals (they have to be open upvalues,) take them out of the stack, and place them in a safe
place so that they do not disappear when the block or function goes out of scope. A RETURN
instruction does an implicit CLOSE so the latter won’t appear very often in listings.
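What "taking locals out of the stack" means can be modelled in a few lines of Lua. This is a simplified sketch under stated assumptions, not the virtual machine's actual code (which lives in lfunc.c): all names (new_frame, find_upval, get, close) are hypothetical, and the open upvalues are kept in a table keyed by register index for brevity.

```lua
-- Simplified model of open and closed upvalues. All names are hypothetical.
function new_frame()
  local frame = { stack = {}, open = {} }   -- registers, and open upvalues by register index

  -- Return the open upvalue for a register, creating it on first use,
  -- so that closures capturing the same local share one upvalue.
  function frame.find_upval(idx)
    local uv = frame.open[idx]
    if not uv then
      uv = { idx = idx, closed = false }
      frame.open[idx] = uv
    end
    return uv
  end

  -- Read through an upvalue: from the stack while open, from its own slot once closed.
  function frame.get(uv)
    if uv.closed then return uv.value end
    return frame.stack[uv.idx]
  end

  -- Model of CLOSE A: close every open upvalue at register index >= a.
  function frame.close(a)
    for idx, uv in pairs(frame.open) do
      if idx >= a then
        uv.value = frame.stack[idx]   -- copy the value to its safe place
        uv.closed = true
        frame.open[idx] = nil         -- clearing a field during traversal is allowed
      end
    end
  end

  return frame
end

local f = new_frame()
f.stack[0], f.stack[1] = "p-value", "q-value"
local up_p, up_q = f.find_upval(0), f.find_upval(1)
f.close(0)
f.stack[0], f.stack[1] = nil, nil      -- the registers are gone or reused
print(f.get(up_p), f.get(up_q))        -- p-value  q-value
```

After close(0), the upvalues no longer depend on the stack slots, which is why the values survive once the registers are cleared.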
Here is another example which illustrates a rather subtle point with CLOSE (thanks to Rici
Lake for this nugget):
In the above example, a function is instantiated within a loop. In real-world code, a loop may
instantiate a number of such functions. Each of these functions will have its own p upvalue.
The subtle point is that the break (the JMP on line [4]) does not jump to the RETURN
instruction in line [7]; instead it reaches the CLOSE instruction on line [6]. Whether
execution exits a loop normally or through a break, the code within the loop may have
caused the instantiation of one or more functions and their associated upvalues. Thus the
enclosing do end block must execute its CLOSE instruction; if we always remember to
associate the CLOSE with the do end block, there will be no confusion.
(Translator's note: multiple closures can share the same upvalue, and a shared upvalue must
refer to the same original local variable. Only variables local to a loop body are distinct on
each iteration, so the p upvalue here is the same one.)
(Translator's note: it appears that when a block containing a closure ends and no RETURN
follows, a CLOSE is generated as the first instruction after the loop, and break jumps to that
instruction. That is one way to understand it.)
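The point about break can be tried out in plain Lua. The sketch below is illustrative only and is not the listing discussed above; the loop-local variable n and the values used are assumptions added to make the result visible.

```lua
do
  local p = "outer"                  -- declared outside the loop
  while true do
    local n = 42                     -- local to the loop body
    q = function() return p, n end   -- captures both p and n
    break                            -- leaves the loop early
  end
end
-- Both captured variables are still valid after the early exit.
print(q())   -- outer  42
```

Leaving the loop through break does not invalidate the closure: the captured variables are closed on the way out, just as they would be on a normal exit.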
CLOSE also appears when for loops are used in the same manner. When using loop indices
or loop iterators as upvalues to instantiate functions, each instantiation will have its own
unique upvalue. This is the expected behaviour in Lua 5.1 if loop indices or iterators are to be
considered as locals to the loop body. Previously, Lua 5.0.2 considered loop indices or iterators
to be local to a block enclosing the entire loop, and instantiation of multiple functions only
resulted in a single upvalue shared between the functions. Please see the section on loop
instructions for sample code that illustrates this behaviour.
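As a quick check of the Lua 5.1 behaviour, the following sketch contrasts a loop index used as an upvalue with a local declared outside the loop. The table names fresh and shared and the counter are hypothetical, chosen only for this illustration.

```lua
fresh = {}
for i = 1, 3 do
  fresh[i] = function() return i end   -- in Lua 5.1 each iteration has its own i
end
print(fresh[1]())   -- 1
print(fresh[3]())   -- 3

shared = {}
do
  local count = 0                      -- declared outside the loop: one upvalue for all
  for i = 1, 3 do
    shared[i] = function() count = count + 1; return count end
  end
end
print(shared[1]())  -- 1
print(shared[3]())  -- 2, because all three closures update the same count
```

The first loop yields three independent upvalues; the second yields a single one, because count belongs to the enclosing do end block rather than to the loop body.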
15 Comparing Lua 5.0.2 and Lua 5.1
The following is a list of changes to the Lua virtual machine instructions from version 5.0.2 to
version 5.1. This list is non-exhaustive; only changes noted during the writing of this guide
are listed. For the details, please read the relevant sections. If you are not familiar with Lua
5.0.2 virtual machine instructions, please read the older Lua 5.0.2 version of this guide.
• The function prototype header now includes the line on which the definition ends. The
is_vararg flag has changed significantly; it now has 3 fields.
• For a function prototype, debug data has been pushed to the end, while the code list
has been brought to the front. The list of constants can have LUA_TBOOLEAN.
• For RK(B) or RK(C) operands, an MSB flag is used instead of a biasing number to
differentiate registers and constants.
• LOADNILs at the start of a function are now optimized away.
• Limited constant folding is performed for arithmetic instructions, namely: ADD, SUB,
MUL, DIV, POW, MOD and UNM.
• The MOD instruction is new.
• The LEN instruction is new.
• The VARARG instruction is new.
• What used to be TEST in 5.0.2 is now TESTSET in 5.1.
• The TEST instruction is new.
• The FORPREP instruction is new.
• FORLOOP behaviour has changed.
• The semantics of the loop index for FORLOOP has changed.
• TFORLOOP behaviour has changed.
• TFORPREP has been deleted. Lua 5.1 no longer supports old-style generic loops.
• The semantics of loop iterators for TFORLOOP has changed.
• The limit to the complexity of while conditions has been removed.
• The encoding of sizes for NEWTABLE has changed.
• SETLIST behaviour has changed.
• SETLISTO has been deleted. Its functionality has been merged into SETLIST.
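To make the RK(B)/RK(C) item in the list above concrete, here is a minimal sketch of how such an operand can be classified in Lua 5.1. It assumes the 9-bit B and C fields defined in lopcodes.h, so the most significant bit has the value 256; the helper name decode_rk is hypothetical, and the input is assumed to lie in the range 0 to 511.

```lua
-- B and C are 9-bit fields in Lua 5.1, so the MSB flag is 2^8.
BITRK = 256

-- Classify an RK operand (hypothetical helper; expects 0 <= x <= 511).
function decode_rk(x)
  if x >= BITRK then
    return "constant", x - BITRK   -- index into the constant list
  end
  return "register", x             -- register number
end

print(decode_rk(3))     -- register  3
print(decode_rk(256))   -- constant  0
print(decode_rk(257))   -- constant  1
```

Lua 5.1 has no bitwise operators, so the sketch uses a comparison and a subtraction in place of masking the flag bit.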
16 Digging Deeper
For studying larger snippets of Lua code and its disassembly, you can try ChunkSpy’s
various disassembly functions. Both vmmerge5 and ChunkSpy can merge source code lines
into a disassembly listing. ChunkSpy can provide more detail, because it processes every bit
of a binary chunk.
A good way of studying how any instruction functions is to find where its opcode appears in
the Lua sources. For example, to see what MOVE does, look for OP_MOVE in lparser.c
(the parser), lcode.c (the code generator) and lvm.c (the virtual machine). From the code
implementing OP_MOVE, you can then move deeper into the code by following function
calls. I found that this approach (bottom-up, following the execution path from generated
opcodes to the functions that perform code generation) is a little easier than following the
recursive descent parser’s call graph. Once you have lots of little pictures, the big picture will
form on its own.
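The search for an opcode can itself be done from Lua. The following is a minimal sketch, assuming it is run from the src directory of a Lua 5.1 source tree; the helper name find_opcode is hypothetical, and files that cannot be opened are simply skipped.

```lua
-- Hypothetical helper: list every line mentioning an opcode in the given files.
function find_opcode(opcode, files)
  local hits = {}
  for _, path in ipairs(files) do
    local f = io.open(path, "r")
    if f then                                 -- skip files that are missing
      local n = 0
      for line in f:lines() do
        n = n + 1
        if line:find(opcode, 1, true) then    -- plain substring search, no patterns
          hits[#hits + 1] = string.format("%s:%d: %s", path, n, line)
        end
      end
      f:close()
    end
  end
  return hits
end

-- The file names are those mentioned in the text; the working directory is an assumption.
for _, hit in ipairs(find_opcode("OP_MOVE", { "lparser.c", "lcode.c", "lvm.c" })) do
  print(hit)
end
```

Each hit is reported as file, line number and line text, which gives a starting point for following the function calls from there.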
I hope you have enjoyed, as I did, poking your way through the internal organs of this Lua
thingy. Now that the Lua internals seem less magical and more practical, I look forward to
some Dr Frankenstein experiments with my newfound knowledge...
17 Acknowledgements
The author gratefully acknowledges valuable feedback from Rici Lake and Klaas-Jan Stol.
Changes:
20060313 Initial public release, adapted from the Lua 5.0.2 version of the document.
Thanks to Rici Lake for info about the semantics of for loops in Lua 5.1.