Creating a language using only assembly language
Kernel/VM Tanken-tai #11
Koichi Nakamura
Code
https://github.com/nineties/amber
Profile
Koichi Nakamura
twitter: @9_ties
developing an IoT device
http://idein.jp
I was a compiler writer
wrote compilers in a student lab course
MinCaml compiler in OCaml
MinCaml compiler in Haskell
https://github.com/nineties/Choco
studied optimizing compilers at graduate school
wrote compilers for special purpose CPUs
Wanted to create my own language
name: Amber
It was called rowl at first.
I wanted to enjoy the creation process itself.
How could I?
Let's play with limitations
1. Use assembly language only. (No high-level languages like C.)
2. No libraries. (No libc etc.)
3. No code generators. (No flex/bison etc.)
Strategy: Bootstrapping
Write language 1 in assembly language
Write a slightly higher-level language 2 in language 1
Write Amber in language k
Write Amber in Amber
here now
What's the point?
For fun.
To cultivate knowledge, techniques, and know-how of compiler writing.
But it's not a cost-effective way to study...
To feel a sense of gratitude and respect for predecessors.
I'll show the outline of my development process.
1. Created rowl0 in assembly language
Made a language slightly higher-level than assembly.
language name: rowl0
compiler name: rlc
From regular expressions of tokens
Wrote a state transition diagram
Converted to jump table
And wrote the lexer
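A minimal sketch of the idea in Python (the real lexer is in assembly). The states, character classes, and token names below are hypothetical and not rowl0's actual token set. The state transition diagram becomes a table indexed by state and character class.

```python
# Hypothetical character classes; rowl0's real token set is not shown here.
def char_class(c):
    if c.isdigit():
        return "digit"
    if c.isalpha() or c == "_":
        return "alpha"
    return "other"

# State transition table: TRANSITIONS[state][char_class] -> next state.
# In assembly this would be a jump table indexed the same way.
TRANSITIONS = {
    "start": {"digit": "int", "alpha": "ident"},
    "int": {"digit": "int"},
    "ident": {"alpha": "ident", "digit": "ident"},
}
ACCEPTING = {"int": "INT", "ident": "IDENT"}

def tokenize(text):
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state = "start"
        start = pos
        # Follow transitions as long as the table has an entry.
        while pos < len(text):
            nxt = TRANSITIONS[state].get(char_class(text[pos]))
            if nxt is None:
                break
            state = nxt
            pos += 1
        if state in ACCEPTING:
            tokens.append((ACCEPTING[state], text[start:pos]))
        else:
            # No transition from the start state: treat as a one-character symbol.
            tokens.append(("SYM", text[pos]))
            pos += 1
    return tokens
```

For example, tokenize("x1 = 42+y") splits the input into an identifier, a symbol, an integer, a symbol, and an identifier.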
Wrote rowl0's syntax in BNF
Then wrote the parser
recursive descent method
Generates code while parsing
Writing memory management is difficult here.
parsing
generates code without building syntax trees.
code generation
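A minimal sketch in Python of a recursive descent parser that emits code while parsing, with no syntax tree. The arithmetic grammar and the stack-style instruction names (push, add, ...) are assumptions for illustration; rlc's real grammar and output differ.

```python
import re

def compile_expr(source):
    # Hypothetical toy grammar:
    #   expr   := term (("+" | "-") term)*
    #   term   := factor (("*" | "/") factor)*
    #   factor := NUMBER | "(" expr ")"
    tokens = re.findall(r"\d+|[-+*/()]", source)
    pos = 0
    code = []  # instructions are appended as soon as a construct is recognized

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append("add" if op == "+" else "sub")

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append("mul" if op == "*" else "div")

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            code.append(f"push {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return code
```

Because each operator is emitted right after its operands are parsed, the output is already in evaluation order and nothing has to be kept in memory besides the call stack.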
Completed the first language rowl0!
no symbol tables.
function params must be p0,p1,p2,...
to use local variables, allocate stack memory with allocate(n), then use x0,x1,x2,...
2. Created a LISP, rowl-core, in rowl0
Made a LISP temporarily
language name: rowl-core
interpreter name: rlci
easy to implement
productivity improvement
Wrote lexer and parser
Writing became more comfortable
Wrote eval
No memory management
mmap and munmap are the only functions available
malloc, free
1. Does not reclaim garbage memory
2. Allocates fresh memory for every new object
3. So it will run out of memory eventually
As long as it can compile the next-generation compilers, that's no problem.
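The allocation policy above can be sketched as a bump allocator. This is a Python illustration only: the bytearray stands in for a region obtained with mmap, and the class name and 8-byte alignment are assumptions.

```python
class BumpAllocator:
    """Allocate-only heap: memory is handed out and never reclaimed."""

    def __init__(self, size):
        self.heap = bytearray(size)  # stands in for an mmap'ed region
        self.top = 0                 # next free offset

    def alloc(self, nbytes):
        nbytes = (nbytes + 7) & ~7   # round up to 8 bytes (assumed alignment)
        if self.top + nbytes > len(self.heap):
            # "It will die eventually": nothing is ever freed.
            raise MemoryError("heap exhausted")
        addr = self.top
        self.top += nbytes
        return addr
```

Allocation is a single pointer increment, which is about as simple as memory management gets when writing in a very low-level language.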
Completed a LISP rowl-core!
rich functions
lambda, map etc.
macros
3. Created a language for writing the VM in rowl-core
Decided to create a VM for the next generation
Created a language just for writing the virtual machine.
Defined it as a DSL in the LISP rowl-core
No need to write a lexer and parser!
Wrote the compiler like this
Now I could use higher-order functions
productivity was improved a lot
4. Created a virtual machine, rlvm, in the DSL
Wrote the VM code in the DSL like this
Wrote a garbage collector
Copying GC
Cheney's algorithm
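A minimal sketch of Cheney's algorithm in Python. The heap model is an assumption for illustration: each object is a list of fields, a field is either an immediate value or a Ref into the space, and a dictionary stands in for the forwarding pointers a real collector stores in the from-space objects. rlvm's actual object layout is not shown here.

```python
class Ref:
    """A pointer to an object, identified by its index in a space."""
    def __init__(self, index):
        self.index = index

def cheney_collect(from_space, roots):
    to_space = []
    forward = {}  # from-space index -> to-space index (forwarding pointers)

    def copy(ref):
        # Copy the object on first visit; afterwards reuse the forwarding entry.
        if ref.index not in forward:
            forward[ref.index] = len(to_space)
            to_space.append(list(from_space[ref.index]))
        return Ref(forward[ref.index])

    new_roots = [copy(r) for r in roots]

    # The scan pointer chases the allocation pointer (the end of to_space).
    # No explicit stack or queue is needed: to_space itself is the queue.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            if isinstance(field, Ref):
                obj[i] = copy(field)
        scan += 1
    return to_space, new_roots
```

Unreachable objects are simply never copied, and cycles are handled by the forwarding table.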
Wrote primitive functions
An application of meta-programming
The table of instructions of the VM: vm_instructions
Generates various code from the table
reflects changes to instructions automatically
It is very easy to make this kind of mechanism with LISP
eval loop of the VM
Assembler
Disassembler
Assembler used internally in Amber
Linker
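A minimal sketch of the table-driven approach, in Python rather than LISP. The four instructions and their encodings are hypothetical and are not rlvm's real instruction set. Both the assembler and the disassembler are derived from one table, so a change to the table is reflected in both.

```python
# Hypothetical instruction table: (mnemonic, number of operands).
# The opcode is the position in the table.
INSTRUCTIONS = [
    ("nop", 0),
    ("push", 1),
    ("add", 0),
    ("jump", 1),
]

# Derived from the table: mnemonic -> (opcode, operand count).
OPCODE = {name: (code, n) for code, (name, n) in enumerate(INSTRUCTIONS)}

def assemble(lines):
    out = []
    for line in lines:
        name, *args = line.split()
        code, n = OPCODE[name]
        if len(args) != n:
            raise ValueError(f"{name} takes {n} operand(s)")
        out.append(code)
        out.extend(int(a) for a in args)
    return out

def disassemble(code):
    lines = []
    pc = 0
    while pc < len(code):
        name, n = INSTRUCTIONS[code[pc]]
        operands = code[pc + 1 : pc + 1 + n]
        lines.append(" ".join([name, *map(str, operands)]))
        pc += 1 + n
    return lines
```

In a LISP the same table can drive macros that expand into the eval loop, the assembler, and the disassembler at compile time; here the table is simply consulted at run time.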
Wrote the instruction set
Floating-point arithmetic
Multi-precision integer arithmetic
Exception handling
Delimited Continuation
Completed the virtual machine rlvm!
186 instructions
stack machine
copying GC
exception handling
shift/reset delimited continuation
floating-point arithmetic, multi-precision arithmetic
5. Created a toolchain for rlvm
There were no programming tools for rlvm
Created a toolchain for the VM
a programming language rowl1
its compiler
assembler
disassembler
linker
Wrote rowl1, its compiler, and the assembler
Defined as a DSL in rowl-core
Wrote the linker and disassembler
Wrote these tools in rowl1, so they run on rlvm
The linker requires GC since it uses a lot of memory
Example outputs of the disassembler
Ready to program on rlvm!
writing programs for rlvm
disassembling bytecode
supports separate compilation
Reached the starting line
6. Wrote Amber in rowl1
Started developing Amber
dynamic scripting language
instance-based object-oriented system
runs on rlvm
Wrote an assembler
The former assembler assembles code ahead of time and runs on rlci
This assembler assembles code just in time and runs on rlvm
fills in addresses by backpatching
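A minimal sketch of backpatching in Python. The class and method names and the symbolic instruction words are hypothetical. A jump to a label that is not yet defined emits a placeholder, and the placeholder is filled in once all label addresses are known.

```python
class Assembler:
    def __init__(self):
        self.code = []     # emitted words
        self.labels = {}   # label name -> address
        self.fixups = []   # (position of placeholder, label name)

    def emit(self, *words):
        self.code.extend(words)

    def label(self, name):
        # A label names the address of the next emitted word.
        self.labels[name] = len(self.code)

    def jump(self, name):
        self.code.append("jump")
        # The target may not be defined yet: record where to patch later.
        self.fixups.append((len(self.code), name))
        self.code.append(None)

    def finish(self):
        # Backpatch: fill every placeholder with the resolved address.
        for pos, name in self.fixups:
            self.code[pos] = self.labels[name]
        return self.code
```

This lets the assembler emit code in a single pass, which suits just-in-time assembly.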
Wrote the object system
slots, messages and parent delegation
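A minimal sketch of slots, messages, and parent delegation in Python. The class and method names are assumptions and do not reflect Amber's actual object model. A message send looks up a slot in the receiver and, if it is missing, delegates to the parent while keeping the original receiver as self.

```python
class Obj:
    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        # Walk the parent chain until some object has the slot.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name, *args):
        value = self.lookup(name)
        # Method slots receive the original receiver, not the object
        # where the slot was found. That is what makes it delegation.
        return value(self, *args) if callable(value) else value


point = Obj(x=1, describe=lambda self: f"x={self.send('x')}")
child = Obj(parent=point, x=5)  # inherits describe, overrides x
```

Here child.send("describe") finds describe in point but evaluates x against child, so there are no classes, only instances that delegate to other instances.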
Wrote Amber's core features on top of the object system
dynamic pattern-matching engine
mechanism of partial function fusion
Wrote the compiler
Made the Amber compiler itself an Amber object
matching of syntax tree
compiler
pattern-matching engine
Amber's core system
object system
VM
resource management
Wrote closure-conversion
Wrote parsers
compiles parsers at run-time
each parser is an ordinary Amber object (closure)
parsers
compile
compiler
pattern-matching engine
Amber core system
object system
VM
very simple syntax
1. literals are expressions
2. for a symbol h and expressions e1,..,en (n>=0),
h{e1, ..., en} is an expression
3. there are no other forms of Amber expressions
Used the Packrat parsing method
scannerless
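A minimal sketch of a scannerless, memoizing (Packrat-style) parser for the h{e1, ..., en} form, in Python. The assumptions here are that literals are decimal integers only, symbols are alphabetic, and trees are represented as (head, args) tuples. Amber's real parsers are Amber objects compiled at run time and are not shown here.

```python
from functools import lru_cache

def parse(text):
    def skip(pos):
        # Scannerless: whitespace is handled by the parser itself.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        return pos

    def literal(pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (int(text[pos:end]), end) if end > pos else None

    def symbol(pos):
        end = pos
        while end < len(text) and (text[end].isalpha() or text[end] == "_"):
            end += 1
        return (text[pos:end], end) if end > pos else None

    def node(pos):
        head = symbol(pos)
        if head is None:
            return None
        name, pos = head
        if pos >= len(text) or text[pos] != "{":
            return None
        pos = skip(pos + 1)
        args = []
        if pos < len(text) and text[pos] == "}":
            return ((name, ()), pos + 1)
        while True:
            result = expr(skip(pos))
            if result is None:
                return None
            arg, pos = result
            args.append(arg)
            pos = skip(pos)
            if pos < len(text) and text[pos] == ",":
                pos += 1
            elif pos < len(text) and text[pos] == "}":
                return ((name, tuple(args)), pos + 1)
            else:
                return None

    @lru_cache(maxsize=None)
    def expr(pos):
        # Packrat: the result for each (rule, position) is memoized,
        # so backtracking never re-parses the same input twice.
        return node(pos) or literal(pos)

    result = expr(skip(0))
    if result is None or skip(result[1]) != len(text):
        raise SyntaxError("parse error")
    return result[0]
```

For example, parse("add{1, mul{2, 3}}") yields a nested tuple with head "add" whose second argument is the "mul" node.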
Encoding/decoding floating-point literals was difficult
3.14 <-> 0x40091eb851eb851f
strtod, sprintf: wrote them myself because of the no-libc limitation
They require the multi-precision integer arithmetic that I wrote before
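A minimal sketch in Python of why decoding a literal needs big integers. Python's built-in int plays the role of the multi-precision arithmetic. The function name is hypothetical, and it only handles positive decimals without an exponent that land in the normal double range. This is not the algorithm Amber uses, just one exact way to do the conversion.

```python
def decimal_to_double_bits(text):
    # Represent the literal exactly as a fraction num/den.
    int_part, _, frac_part = text.partition(".")
    num = int(int_part + frac_part)
    den = 10 ** len(frac_part)
    if num <= 0:
        raise ValueError("sketch handles positive values only")

    # Scale so that 2**52 <= num/den < 2**53; the value is (num/den) * 2**e.
    e = 0
    while num >= den << 53:
        den *= 2
        e += 1
    while num < den << 52:
        num *= 2
        e -= 1

    # Exact division, then round half to even.
    q, r = divmod(num, den)
    if 2 * r > den or (2 * r == den and q & 1):
        q += 1
    if q == 1 << 53:  # rounding overflowed the 53-bit significand
        q >>= 1
        e += 1

    biased_exponent = e + 52 + 1023
    return (biased_exponent << 52) | (q - (1 << 52))
```

The intermediate products grow far beyond 64 bits even for a short literal like 3.14, which is why a strtod written from scratch depends on multi-precision integers.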
The Amber interpreter is complete!
dynamic scripting language
runs on rlvm
instance-based object-oriented system
dynamic pattern-matching engine
partial function fusion
lexical closure
I got a modern programming language!
7. Created Amber's standard library
Amber has strong self-extensibility
Amber's simple syntax is extended in the standard library
amber/lib/syntax/parse.ab
Builds its syntax during the boot sequence
Only has a very simple syntax at first
uses string literals for comments
because there is no syntax for comments yet
Defines a syntax for defining syntaxes
Defines Amber's syntax with that syntax
Builds the macro system
Gives meaning to syntax by macros
Now Amber has a rich syntax
Extends the object system
Now Amber has a rich object system
Inheritance, mix-in etc.
Development is currently suspended
No plans for further updates
Try the following commands to invoke the Amber shell
See the output of the make command
% git clone https://github.com/nineties/amber.git
% cd amber
% make; sudo make install
% amber
Summary
I managed to reach a relatively high-level language. I feel satisfied.
[Figure: summary diagram of the bootstrapping chain. Languages: rowl0, rowl-core, lang. for writing VM, rowl1, Amber. Tools: rlc (compiler), rlci (interpreter), rlvm, compiler, linker, disassembler. Edge labels: "impl.", "run", "self-extension".]