Lua
Roberto Ierusalimschy, PUC-Rio
WHAT IS LUA?
Associative arrays as the single data structure
  - first-class values
  - any value allowed as index (not only strings)
  - very efficient implementation
  - syntactic sugar: a.x for a["x"]
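As a small illustration of the points above, the sketch below builds one table and indexes it with several kinds of values. The variable names are made up for the example; the only restriction relied on is that nil cannot be used as a key.

```lua
local t = {}                 -- the same constructor serves records, arrays, sets
t.x = 10                     -- sugar for t["x"] = 10
t[1] = "first"               -- integer key
t[2.5] = "fractional key"    -- any number can be a key
local f = function() end
t[f] = "function key"        -- functions are values, so they can be keys
t[t] = "the table itself"    -- and so are tables
print(t["x"], t[1], t[2.5], t[f], t[t])
```

Because tables are first-class values, `t` can also be stored in other tables, passed to functions, and returned from them.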
Several not-so-conventional features
  - first-class functions, lexical scoping, proper tail calls, coroutines, "dynamic overloading"
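A minimal sketch of three of these features, using only standard library calls (`coroutine.wrap`, `coroutine.yield`). The function names `counter`, `loop`, and `gen` are invented for the example.

```lua
-- First-class function with lexical scoping: each counter keeps its own n.
local function counter()
  local n = 0
  return function()
    n = n + 1
    return n
  end
end

-- Proper tail call: the recursive call reuses the current stack frame,
-- so a very deep "recursion" does not overflow the stack.
local function loop(n)
  if n == 0 then return "done" end
  return loop(n - 1)
end

-- Coroutine used as a generator that yields 1, 2, 3.
local gen = coroutine.wrap(function()
  for i = 1, 3 do coroutine.yield(i) end
end)

local c = counter()
print(c(), c(), loop(100000), gen(), gen(), gen())
```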
WHY LUA?
Light
  - simple and small language, with few concepts
  - core with approximately 60K, complete executable with 140K
Portable
  - written in "clean C"
  - runs in PalmOS, EPOC (Symbian), Brew (Qualcomm), Playstation II, XBox, embedded systems, mainframes, etc.
Efficient
  - see benchmarks
Easy to embed
  - C/C++, Java, Fortran, Ruby, OPL (EPOC), C#
SOME APPLICATIONS
Games
  - LucasArts, BioWare, Microsoft, Relic Entertainment, Absolute Studios, Monkeystone Games, etc.
Other uses
  - tomsrtbt: "The most Linux on one floppy disk"
  - Crazy Ivan Robot (champion of RoboCup 2000/2001 in Denmark)
  - chip layouts (Intel)
  - APT-RPM (Conectiva & United Linux)
  - Space Shuttle Hazardous Gas Detection System (ASRC Aerospace)
POLL
  My engine doesn't have scripting   27.3%
  I made my own                      26.3%
  Lua                                20.5%
  C (with co-routines)               9.75%
  Python                             6.98%
  Lisp                               1.45%
  Perl                               1.31%
  Ruby                               1.16%
  TCL                                0.58%
  Other                              4.51%
VIRTUAL MACHINE
Most virtual machines use a stack model
  - heritage from Pascal p-code, followed by Java, etc.

    while a<lim do a=a+1 end

    3  GETLOCAL  0    ; a
    4  GETLOCAL  1    ; lim
    5  JMPGE     4    ; to 10
    6  GETLOCAL  0    ; a
    7  ADDI      1
    8  SETLOCAL  0    ; a
    9  JMP       -7   ; to 3
ANOTHER MODEL FOR VIRTUAL MACHINES
Stack-machine instructions are too low level
Interpreters add high overhead per instruction
Register machines allow more powerful instructions

    ADD 0 0 [1]    ; a=a+1

Overhead to decode more complex instructions is compensated by fewer instructions
  - "registers" for each function are allocated on the execution stack at activation time
  - large number of registers (up to 256) simplifies code generation
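The trade-off can be made concrete with two toy interpreters that run `while a<lim do a=a+1 end` and count dispatched instructions. This is only a sketch: the instruction tuples, the helper names `run_stack` and `run_register`, and the simplification that `ADD` always takes its third operand from the constant array are assumptions made for the example, not the real Lua virtual machine.

```lua
-- Toy stack machine: every operand travels through the stack.
local function run_stack(code, locals)
  local stack, top, pc, steps = {}, 0, 1, 0
  while code[pc] do
    local op, arg = code[pc][1], code[pc][2]
    pc, steps = pc + 1, steps + 1
    if op == "GETLOCAL" then
      top = top + 1; stack[top] = locals[arg]
    elseif op == "SETLOCAL" then
      locals[arg] = stack[top]; top = top - 1
    elseif op == "ADDI" then
      stack[top] = stack[top] + arg
    elseif op == "JMPGE" then          -- pop two values, jump if first >= second
      local x, y = stack[top - 1], stack[top]
      top = top - 2
      if x >= y then pc = pc + arg end
    elseif op == "JMP" then
      pc = pc + arg
    end
  end
  return steps
end

-- Toy register machine: operands name registers directly.
local function run_register(code, regs, consts)
  local pc, steps = 1, 0
  while code[pc] do
    local ins = code[pc]
    pc, steps = pc + 1, steps + 1
    if ins[1] == "ADD" then            -- R[A] = R[B] + K[C]
      regs[ins[2]] = regs[ins[3]] + consts[ins[4]]
    elseif ins[1] == "LT" then         -- skip the following jump if the test fails
      if not (regs[ins[2]] < regs[ins[3]]) then pc = pc + 1 end
    elseif ins[1] == "JMP" then
      pc = pc + ins[2]
    end
  end
  return steps
end

local stack_code = {
  {"GETLOCAL", 0}, {"GETLOCAL", 1}, {"JMPGE", 4},
  {"GETLOCAL", 0}, {"ADDI", 1}, {"SETLOCAL", 0}, {"JMP", -7},
}
local register_code = {
  {"JMP", 1}, {"ADD", 0, 0, 1}, {"LT", 0, 1}, {"JMP", -3},
}

local locals = {[0] = 0, [1] = 10}     -- a = 0, lim = 10
local regs   = {[0] = 0, [1] = 10}
local stack_steps = run_stack(stack_code, locals)
local reg_steps   = run_register(register_code, regs, {1})
print(stack_steps, reg_steps)
```

Each loop iteration costs seven dispatches on the stack machine and three on the register machine, at the price of a wider instruction to decode.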
INSTRUCTION FORMATS
Three-argument format, used for most operators
  - binary operators & indexing

    31      23 22      14 13       6 5       0
    |    B    |    C     |    A     |   OP   |

All instructions have a 6-bit opcode
  - the virtual machine in Lua 5.0 uses 35 opcodes
Operand A refers to a register
  - usually the destination
  - limits the maximum number of registers per function
Operands B and C can refer to a register or a constant
  - a constant can be any Lua value, stored in an array of constants private to each function
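The packing can be sketched with plain arithmetic. The field positions follow the bit boundaries printed on the slide (OP in bits 0-5, then an 8-bit field, then two 9-bit fields); assigning the names A, C, and B to those fields in that order is an assumption, as are the helper names and the opcode number used below.

```lua
-- Assumed layout: OP bits 0-5, A bits 6-13, C bits 14-22, B bits 23-31.
local function encode_abc(op, a, b, c)
  assert(op < 2^6 and a < 2^8 and b < 2^9 and c < 2^9, "operand out of range")
  return op + a * 2^6 + c * 2^14 + b * 2^23
end

-- Extract `size` bits starting at bit `pos`.
local function field(instr, pos, size)
  return math.floor(instr / 2^pos) % 2^size
end

-- Returns op, a, b, c.
local function decode_abc(instr)
  return field(instr, 0, 6), field(instr, 6, 8),
         field(instr, 23, 9), field(instr, 14, 9)
end

-- Hypothetical opcode number 12; operand 259 stands for constant 3 (256 + 3).
local instr = encode_abc(12, 0, 0, 259)
print(decode_abc(instr))
```

The 8-bit width of A is what caps a function at 256 registers, and the ninth bit of B and C leaves room to tell registers from constants.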
INSTRUCTION EXAMPLES
[Instruction listing not recoverable; the surviving operands are "0 0 0 0" and "259".]
Assuming that the variable a is in register 0, t is in register 1, the number 1 is at index 3 in the array of constants, and the string "x" is at index 4.
INSTRUCTION FORMATS
There is an alternative format for instructions that do not need three arguments or with arguments that do not fit in 9 bits
  - used for jumps, access to global variables, access to constants with indices greater than 256, etc.

    31                  14 13       6 5       0
    |         Bx          |    A     |   OP   |
INSTRUCTION EXAMPLES
[Instruction listing: the mnemonics and operands are not recoverable; the surviving comments are "a = x", "x = t", and "a < 1 ?".]
Assuming that the variable a is in register 0, t is in register 1, the number 1 is at index 3 in the array of constants, and the string "x" is at index 4.
  - conceptually, LT skips the next instruction (always a jump) if the test fails; in the current implementation, it does the jump if the test succeeds
  - jumps interpret the Bx field as a signed offset (in excess-$2^{17}$)
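A short sketch of the excess encoding for jump offsets: the stored Bx value is the signed offset plus a fixed bias, so an 18-bit unsigned field covers negative and positive jumps. The bias of $2^{17}$ is taken from the slide; the helper names and the opcode number are hypothetical, and the A field is simply left at zero.

```lua
local BIAS = 2^17                        -- excess-2^17, as stated on the slide

-- Opcode in bits 0-5, biased offset in bits 14-31 (the Bx field).
local function encode_jump(op, offset)
  assert(offset >= -BIAS and offset < BIAS, "offset out of range")
  return op + (offset + BIAS) * 2^14
end

-- Returns the opcode and the signed offset.
local function decode_jump(instr)
  local op = instr % 2^6
  local bx = math.floor(instr / 2^14)
  return op, bx - BIAS
end

local op, offset = decode_jump(encode_jump(20, -3))   -- 20 is a made-up opcode number
print(op, offset)
```

Compared with two's complement, the biased form needs no sign extension when decoding: one subtraction recovers the offset.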
CODE EXAMPLE
(all variables are local)

    while i<lim do a[i] = 0 end

    -- Lua 4.0
    2  GETLOCAL  2    ; i
    3  GETLOCAL  1    ; lim
    4  JMPGE     5    ; to 10
    5  GETLOCAL  0    ; a
    6  GETLOCAL  2    ; i
    7  PUSHINT   0
    8  SETTABLE
    9  JMP       -8   ; to 2

    -- Lua 5.0
    2  JMP       *  1        ; to 4
    3  SETTABLE  0  2  256   ; a[i] = 0
    4  LT        *  2  1     ; i < lim?
    5  JMP       *  -3       ; to 3
IMPLEMENTATION OF TABLES
Each table may have two parts, a "hash" part and an "array" part
Example:

    {n = 3; 100, 200, 300}

[Figure: table header pointing to a hash part that holds n = 3 and to an array part that holds 100, 200, 300.]
TABLES: HASH PART
Hashing with internal lists for collision resolution
Run a rehash when table is full:
[Figure: hash part with columns key, value, link, holding key 0, before and after "insert key 4".]
Avoid secondary collisions, moving old elements when inserting new ones
[Figure: hash part with columns key, value, link, holding keys 0 and 4, before and after "insert key 3".]
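The collision policy can be sketched as follows: a fixed array of nodes, chains threaded through the nodes themselves, and an insertion rule that evicts a colliding key when that key is not in its own main position. This is an illustration under simplifying assumptions (non-negative integer keys, `key % size` as the hash, no rehash, no update of existing keys); the names `new_hash`, `get`, `free_slot`, and `insert` are invented for the example.

```lua
local function new_hash(size)
  local t = {size = size, lastfree = size, nodes = {}}
  for i = 0, size - 1 do t.nodes[i] = {} end   -- each node: key, val, next
  return t
end

local function get(t, key)
  local i = key % t.size                 -- main position of the key
  while i do
    local n = t.nodes[i]
    if n.key == key then return n.val end
    i = n.next                           -- follow the internal list
  end
  return nil
end

local function free_slot(t)              -- scan downwards for an empty node
  while t.lastfree > 0 do
    t.lastfree = t.lastfree - 1
    if t.nodes[t.lastfree].key == nil then return t.lastfree end
  end
  return nil
end

local function insert(t, key, val)       -- assumes key is not yet present
  local mp = key % t.size
  local node = t.nodes[mp]
  if node.key == nil then                -- main position is empty: use it
    node.key, node.val = key, val
    return true
  end
  local free = free_slot(t)
  if free == nil then return false end   -- full: a real table would rehash here
  local other = node.key % t.size        -- main position of the colliding key
  if other ~= mp then
    -- The colliding key is only a guest here: move it to the free node
    -- and relink its chain, so the new key gets its main position.
    local prev = other
    while t.nodes[prev].next ~= mp do prev = t.nodes[prev].next end
    t.nodes[prev].next = free
    t.nodes[free] = node
    t.nodes[mp] = {key = key, val = val}
  else
    -- The colliding key is at home: the new key takes the free node.
    t.nodes[free] = {key = key, val = val, next = node.next}
    node.next = free
  end
  return true
end

local h = new_hash(4)
insert(h, 0, "a"); insert(h, 4, "b"); insert(h, 3, "c")
print(get(h, 0), get(h, 4), get(h, 3))
```

In this run key 4 first lands in node 3 as a guest of key 0's chain; inserting key 3 then moves key 4 to node 2, so that every chain starts at the main position of its keys.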
TABLES: ARRAY PART
Problem: how to distribute elements among the two parts of a table?
  - or: what is the best size for the array?
Sparse arrays may waste lots of space
  - a table with a single element at index 10,000 should not have 10,000 elements
How should the following table behave when we try to insert index 5?

    a = {n = 3; 100, 200, 300}; a[5] = 500

[Figure: two candidate layouts after the assignment. In one, the hash part holds n = 3 and 5 = 500 and the array part holds 100, 200, 300, nil. In the other, the hash part holds n = 3 and the array part grows to hold 100, 200, 300, nil, 500, nil, nil, nil.]
COMPUTING
The array part has size N, where N satisfies the following rules:
  - N is a power of 2
  - the table contains at least N/2 integer keys in the interval [1, N]
  - the table has at least one integer key in the interval [N/2 + 1, N]
COMPUTING
Basic algorithm: to build an array where $a_i$ is the number of integer keys in the interval $(2^{i-1}, 2^i]$
  - array needs only 32 entries
Easy task, given a fast algorithm to compute $\log_2 x$
  - the index of the highest one bit in x
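COMPUTING

    total = 0
    bestsize = 0
    for i = 0, 32 do
      if a[i] > 0 then
        total = total + a[i]         -- integer keys in [1, 2^i] so far
        if total >= 2^(i-1) then     -- at least half of the 2^i slots in use
          bestsize = i               -- array part would have size 2^bestsize
        end
      end
    end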
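A self-contained version of the computation, as a sketch: it builds the counting array from a list of keys and then applies the loop above. The helper names `slot` and `best_array_size` are invented, the logarithm is computed with a simple doubling loop rather than a fast bit trick, and returning 0 when no size qualifies is a choice made for the example.

```lua
-- Index i such that key lies in (2^(i-1), 2^i], for an integer key >= 1.
local function slot(key)
  local i, p = 0, 1
  while p < key do
    p = p * 2
    i = i + 1
  end
  return i
end

-- Returns the array-part size N for a list of keys (0 if no size qualifies).
local function best_array_size(keys)
  local a = {}
  for i = 0, 32 do a[i] = 0 end
  for _, k in ipairs(keys) do
    -- only integer keys in [1, 2^32] are candidates for the array part
    if type(k) == "number" and k >= 1 and k <= 2^32 and k == math.floor(k) then
      local i = slot(k)
      a[i] = a[i] + 1
    end
  end
  local total, best = 0, nil
  for i = 0, 32 do
    if a[i] > 0 then
      total = total + a[i]
      if total >= 2^(i - 1) then best = i end
    end
  end
  if best == nil then return 0 end
  return 2^best
end

print(best_array_size({1, 2, 3}))       -- keys of {100, 200, 300}
print(best_array_size({1, 2, 3, 5}))    -- after a[5] = 500
print(best_array_size({10000}))         -- a single key at index 10,000
```

Under these rules the table with keys 1, 2, 3, 5 gets an array part of size 8, while the lone key 10,000 stays in the hash part.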
PERFORMANCE

    program          Lua 4.0   Lua 5          Lua 5.0         Perl 5.6.1
    random (1e6)     1.03s     0.92s (89%)    1.08s (105%)    1.64s (159%)
    sieve (100)      0.94s     0.79s (84%)    0.62s (66%)     1.29s (137%)
    heapsort (5e4)   1.04s     1.00s (96%)    0.70s (67%)     1.81s (174%)
    matrix (50)      0.89s     0.78s (87%)    0.58s (65%)     1.13s (127%)
    fibo (30)        0.74s     0.66s (89%)    0.69s (93%)     2.91s (392%)
    ack (8)          0.91s     0.84s (92%)    0.84s (92%)     4.77s (524%)

  - all test code copied from The Great Computer Language Shootout
  - Lua 5 is Lua 5.0 without table-array optimization, tail calls, and dynamic stacks (related to coroutines)
  - percentages are relative to Lua 4.0
FINAL REMARKS
Compiler for register-based machine is more complex
  - needs some primitive optimizations to use registers
Interpreter for register-based machine is more complex
  - needs to decode instructions
Requirements
  - no more than 256 local variables and temporaries
Main gains:
  - avoid moves of local variables and constants
  - fewer instructions per task
  - potential gain with CSE optimizations