Write A Computer Language Using Go (Golang)
Thorsten Ball
Chapter 1
Lexing
1.1 - Lexical Analysis
In order for us to work with source code we need to turn it into a more accessible form. As easy as plain text
is to work with in our editor, it becomes cumbersome pretty fast as soon as we try to interpret it as a
programming language from within another programming language.
So, what we need to do is represent our source code in other forms that are easier to work with. We're going
to change the representation of our source code two times before we interpret it: first from plain text into
tokens, and then from tokens into an abstract syntax tree built by the parser. The first transformation, from
source code to tokens, is called lexical analysis, or lexing for short, and it's done by a lexer.
Here's an example. This is the input we give to a lexer:

let x = 5 + 5;

And what comes out of the lexer looks kinda like this:
[
LET,
IDENTIFIER(x),
EQUAL_SIGN,
INTEGER(5),
PLUS_SIGN,
INTEGER(5),
SEMICOLON
]
All of these tokens have the original source code representation attached ("let" in the case of LET, "+" in the
case of PLUS_SIGN, and so on). Some, like IDENTIFIER and INTEGER in our example, also have the concrete
values they represent attached: 5 (not "5"!) in the case of INTEGER and "x" in the case of IDENTIFIER. But
that last part varies from lexer to lexer (e.g. some convert the "5" to an integer in the parser, or even later).
A thing to note about this example: whitespace has been ignored. In our case that's okay, because whitespace
is not significant in the Monkey language. It doesn't matter whether we type let x = 5; or spread it out over
several lines with as much whitespace as we like, such as let   x   =   5;.
In other languages, like Python, the whitespace is significant. That means the lexer can't just eat up the
whitespace and newline characters. It has to output them as tokens so the parser can later make sense
of them (or output an error, of course, if there is not enough or too much whitespace).
A production-ready lexer might also attach the line number, column number and filename to a token. Why?
For example, to later output more useful error messages in the parsing stage. Instead of "error: expected
semicolon token" it can output "error: expected semicolon token instead of plus sign. line
42, column 23, foobar.blub".
We're not going to bother with that. Not because it's too complex, but because it would take away from the
essential simplicity of the tokens and the lexer, making them harder to understand.
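Here is the small snippet of Monkey source code we're going to turn into tokens in this chapter. It's the
same snippet we'll feed to our lexer in the extended test later on:

let five = 5;
let ten = 10;

let add = fn(x, y) {
  x + y;
};

let result = add(five, ten);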
Let's break this down: which types of tokens does this example contain? First of all, there are the numbers
like 5 and 10. These are pretty obvious. Then we have the variable names x, y, add and result. There are
also parts of the language that are not numbers but just words, yet are not variable names either, like let and
fn. Of course, there are also a lot of special characters: (, ), {, }, =, ,, ;.
The numbers are just integers and we're going to treat them as such and give them a separate type. In the
lexer or parser we don't care whether the number is 5 or 10, we just want to know that it's a number. The same
goes for variable names: we'll call them identifiers and treat them all the same. Now, the other words, the ones
that look like identifiers but aren't really identifiers, since they're part of the language, are keywords. We won't
group these together with the identifiers, since it should make a difference in the parsing stage whether we
encounter a let or a fn. The same goes for the last category we identified: the special characters. We'll treat
them separately, since it makes a big difference whether we have a ( or a ) in the source code.
Let's define our Token data structure. Which fields does it need? As we just saw, we definitely need a type
attribute, so we can distinguish between, say, integers and right brackets. And it also needs a field
that holds the literal value of the token, so we can reuse it later and the information whether a number
token is a 5 or a 10 doesn't get lost.
In a new token package we define our Token struct and our TokenType type:
// token/token.go
package token
type TokenType string
type Token struct {
    Type    TokenType
    Literal string
}
We defined the TokenType type to be a string. That allows us to use many different values as TokenTypes,
which in turn allows us to distinguish between different types of tokens. Using string also has the advantage
of being easy to debug without a lot of boilerplate and helper functions: we can just print a string. Of
course, using a string might not lead to the same performance as using an int or a byte would, but for
this book a string is perfect.
As we just saw, there is a limited number of different token types in the Monkey language. That means we
can define the possible TokenTypes as constants. In the same file we add this:
// token/token.go
const (
    ILLEGAL = "ILLEGAL"
    EOF     = "EOF"

    // Identifiers + literals
    IDENT = "IDENT" // add, foobar, x, y, ...
    INT   = "INT"   // 1343456

    // Operators
    ASSIGN = "="
    PLUS   = "+"

    // Delimiters
    COMMA     = ","
    SEMICOLON = ";"

    LPAREN = "("
    RPAREN = ")"
    LBRACE = "{"
    RBRACE = "}"

    // Keywords
    FUNCTION = "FUNCTION"
    LET      = "LET"
)
As you can see, there are two special types: ILLEGAL and EOF. We didn't see them in the example above, but
we'll need them. ILLEGAL signifies a token/character we don't know about and EOF stands for "end of file",
which tells our parser later on that it can stop.
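Because TokenType is just a string, we already get the easy debugging we talked about above. A throwaway
program like this one (assuming the module is called monkey, as the import paths in the rest of this chapter
suggest) prints a token readably without any helper code:

package main

import (
    "fmt"

    "monkey/token"
)

func main() {
    tok := token.Token{Type: token.LET, Literal: "let"}
    fmt.Printf("%+v\n", tok) // prints: {Type:LET Literal:let}
}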
So far so good! We are ready to start writing our lexer. In a production setting it might make sense
to initialize the lexer with a *os.File or a set of them, so that filenames and line numbers could be attached
to tokens. But since that would add more complexity than we're here to handle, we'll start small and just use
a string and ignore filenames and line numbers.
Having thought this through, we now realize that what our lexer needs to do is pretty clear. So let's create
a new package and add a first test that we can run continuously to get feedback about the workings of
the lexer. We're starting small here and will extend the test case as we add more capabilities to the lexer:
// lexer/lexer_test.go
package lexer
import (
    "testing"

    "monkey/token"
)

func TestNextToken(t *testing.T) {
    input := `=+(){},;`

    tests := []struct {
        expectedType    token.TokenType
        expectedLiteral string
    }{
        {token.ASSIGN, "="},
        {token.PLUS, "+"},
        {token.LPAREN, "("},
        {token.RPAREN, ")"},
        {token.LBRACE, "{"},
        {token.RBRACE, "}"},
        {token.COMMA, ","},
        {token.SEMICOLON, ";"},
        {token.EOF, ""},
    }

    l := New(input)

    for i, tt := range tests {
        tok := l.NextToken()

        if tok.Type != tt.expectedType {
            t.Errorf("tests[%d] - tokentype wrong. expected=%q, got=%q",
                i, tt.expectedType, tok.Type)
            break
        }

        if tok.Literal != tt.expectedLiteral {
            t.Errorf("tests[%d] - literal wrong. expected=%q, got=%q",
                i, tt.expectedLiteral, tok.Literal)
            break
        }
    }
}
Of course the tests fail, since we haven't written any code yet:
% go test ./lexer
# monkey/lexer
lexer/lexer_test.go:31: undefined: New
lexer/lexer_test.go:34: l.NextToken undefined (type *Lexer has no field or method NextToken)
FAIL    monkey/lexer [build failed]
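Before this test can even compile we need the Lexer itself, in a new lexer package. Its definition is short,
and the field types follow directly from how we use them in the rest of this chapter:

// lexer/lexer.go
package lexer

type Lexer struct {
    input        string
    position     int  // current position in input (points to the rune in ch)
    readPosition int  // current reading position in input (after the rune in ch)
    ch           rune // current rune under examination
}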
Most of the fields in Lexer are pretty self-explanatory. The ones that might cause some confusion right now
are position and readPosition. The reason for two pointers pointing into our input string is the fact
that we will need to be able to peek further into the input, to look past the current rune and see what
comes up next. readPosition always points to the next character in the input. position points to the
character in the input that corresponds to the ch rune. A first helper method called readRune() should
make the usage of these fields easier to understand:
// lexer/lexer.go
import "unicode/utf8"
func (l *Lexer) readRune() {
    if l.readPosition >= len(l.input) {
        l.ch = -1
        l.position = l.readPosition
        return
    }

    character, width := utf8.DecodeRuneInString(l.input[l.readPosition:])

    l.ch = character
    l.position = l.readPosition
    l.readPosition += width
}
The first thing readRune does is check whether we have reached the end of our input and can't read any
more runes. If that's the case, it sets l.ch to -1, which signifies "end of file" for us. It also updates
l.position to the last read position, so that we later know how far we read into the input before reaching the
end. And then it returns.
But in case we haven't reached the end of the input yet, readRune reads the next rune in our input.
It does this by calling utf8.DecodeRuneInString. The reason for this call instead of just accessing
l.input[l.readPosition] is proper UTF-8 and Unicode support. Using l.input[l.readPosition] to get
the next character would not work when that character is more than one byte wide. By using DecodeRuneInString
we leverage the power of the rune data type, which can represent characters that are multiple bytes wide.
The l.ch field gets updated to the freshly read rune, l.position is updated to the just-used l.readPosition,
and l.readPosition is advanced by the width of the just-read rune. That way, l.readPosition always
points to the position where we're going to read from next and l.position always points to the position
where we last read. This will come in handy soon enough.
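To make this concrete, here is how the fields evolve for the two-rune input "a¢" (the cent sign is two bytes
wide in UTF-8), assuming the New() function we're about to write initializes the lexer and calls readRune()
once:

after New("a¢"):           ch = 'a'   position = 0   readPosition = 1
after the next readRune(): ch = '¢'   position = 1   readPosition = 3
after one more readRune(): ch = -1    position = 3   readPosition = 3 (end of input)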
Let's use readRune in our New() function so our lexer is in a fully working state, with l.ch, l.position
and l.readPosition already initialized, before anyone calls NextToken():
// lexer/lexer.go
func New(input string) *Lexer {
    l := &Lexer{
        input:        input,
        position:     0,
        readPosition: 0,
        ch:           -1,
    }

    l.readRune()

    return l
}
Our tests should now tell us that calling New(input) doesn't result in problems anymore, but the
NextToken() method is still missing. Let's fix that by adding a first version:
// lexer/lexer.go
import (
    "monkey/token"
    "unicode/utf8"
)

func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    switch l.ch {
    case '=':
        tok = newToken(token.ASSIGN, l.ch)
    case ';':
        tok = newToken(token.SEMICOLON, l.ch)
    case '(':
        tok = newToken(token.LPAREN, l.ch)
    case ')':
        tok = newToken(token.RPAREN, l.ch)
    case ',':
        tok = newToken(token.COMMA, l.ch)
    case '+':
        tok = newToken(token.PLUS, l.ch)
    case '{':
        tok = newToken(token.LBRACE, l.ch)
    case '}':
        tok = newToken(token.RBRACE, l.ch)
    case -1:
        tok.Literal = ""
        tok.Type = token.EOF
    }

    l.readRune()
    return tok
}
func newToken(tokenType token.TokenType, r rune) token.Token {
    return token.Token{Type: tokenType, Literal: string(r)}
}
That's the basic structure of the NextToken() method. We look at the current rune under examination
(l.ch) and return a token depending on which rune it is. Before returning the token we advance our pointers
into the input, so that when we call NextToken() again the l.ch field is already updated. A small function
called newToken helps us with initializing these tokens.
Running the tests we can see that they pass:
% go test ./lexer
ok      monkey/lexer 0.007s
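If you want to poke at the lexer outside of the test suite, a small throwaway program (again assuming the
module is called monkey) prints every token it produces until it reaches EOF:

package main

import (
    "fmt"

    "monkey/lexer"
    "monkey/token"
)

func main() {
    l := lexer.New("=+(){},;")
    for tok := l.NextToken(); tok.Type != token.EOF; tok = l.NextToken() {
        fmt.Printf("%+v\n", tok)
    }
}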
Great! Let's now extend the test case so it starts to resemble Monkey source code.
// lexer/lexer_test.go
func TestNextToken(t *testing.T) {
    input := `let five = 5;
let ten = 10;
let add = fn(x, y) {
  x + y;
};
let result = add(five, ten);
`

    tests := []struct {
        expectedType    token.TokenType
        expectedLiteral string
    }{
        {token.LET, "let"},
        {token.IDENT, "five"},
        {token.ASSIGN, "="},
        {token.INT, "5"},
        {token.SEMICOLON, ";"},
        {token.LET, "let"},
        {token.IDENT, "ten"},
        {token.ASSIGN, "="},
        {token.INT, "10"},
        {token.SEMICOLON, ";"},
        {token.LET, "let"},
        {token.IDENT, "add"},
        {token.ASSIGN, "="},
        {token.FUNCTION, "fn"},
        {token.LPAREN, "("},
        {token.IDENT, "x"},
        {token.COMMA, ","},
        {token.IDENT, "y"},
        {token.RPAREN, ")"},
        {token.LBRACE, "{"},
        {token.IDENT, "x"},
        {token.PLUS, "+"},
        {token.IDENT, "y"},
        {token.SEMICOLON, ";"},
        {token.RBRACE, "}"},
        {token.SEMICOLON, ";"},
        {token.LET, "let"},
        {token.IDENT, "result"},
        {token.ASSIGN, "="},
        {token.IDENT, "add"},
        {token.LPAREN, "("},
        {token.IDENT, "five"},
        {token.COMMA, ","},
        {token.IDENT, "ten"},
        {token.RPAREN, ")"},
        {token.SEMICOLON, ";"},
        {token.EOF, ""},
    }

    l := New(input)

    for i, tt := range tests {
        tok := l.NextToken()

        if tok.Type != tt.expectedType {
            t.Errorf("tests[%d] - tokentype wrong. expected=%q, got=%q",
                i, tt.expectedType, tok.Type)
            break
        }

        if tok.Literal != tt.expectedLiteral {
            t.Errorf("tests[%d] - literal wrong. expected=%q, got=%q",
                i, tt.expectedLiteral, tok.Literal)
            break
        }
    }
}
The updated input we initialize the lexer with is a subset of the Monkey language. It contains all the symbols
we already successfully turned into tokens. But what's been added now causes our tests to fail: identifiers,
keywords and numbers.
Let's start with the identifiers and keywords. What our lexer needs to do is recognize whether the current
rune under examination is a letter and, if so, read the rest of the identifier/keyword until it
encounters a non-letter character. Having read that identifier/keyword, we then need to find out whether it is an
identifier or a keyword, so we can use the correct token.TokenType. The first step is extending our switch
statement:
// lexer/lexer.go
import (
    "monkey/token"
    "unicode"
    "unicode/utf8"
)

func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    switch l.ch {
    // [...]
    default:
        if isLetter(l.ch) {
            tok.Literal = l.readIdentifier()
            return tok
        } else {
            tok = newToken(token.ILLEGAL, l.ch)
        }
    }

    // [...]
}

func (l *Lexer) readIdentifier() string {
    position := l.position
    for isLetter(l.ch) {
        l.readRune()
    }
    return l.input[position:l.position]
}

func isLetter(ch rune) bool {
    return 'a' <= ch && ch <= 'z' || 'A' <= ch && ch <= 'Z' ||
        ch == '_' ||
        ch >= utf8.RuneSelf && unicode.IsLetter(ch)
}
We added a default arm to our switch statement, so we can check for identifiers whenever l.ch is not
one of the recognized runes. We also added the generation of token.ILLEGAL tokens: if we end up there,
we truly don't know how to handle the current rune and declare it token.ILLEGAL.
The isLetter helper function just checks whether the given argument is a letter. It does this by checking
whether it's a normal ASCII letter or a Unicode letter. That's just a little dance we have to do to support
Unicode.
What's noteworthy about isLetter is that changing this function has a larger impact on the language our
interpreter will be able to parse than one would expect from such a small function. As you can see, in our
case it contains the check ch == '_', which means we'll treat _ as a letter and allow it in identifiers
and keywords. That means we can use variable names like foo_bar. Other programming languages even
allow ! and ? in identifiers. If you want to allow that too, this is the place to sneak it in, probably in a
separate function called isValidIdentifierCharacter that can be used in combination with isLetter.
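Such a helper is not part of our lexer, but a sketch of it could look like this, with the name and the exact
set of extra characters purely illustrative:

// Hypothetical helper, not used in Monkey's lexer: it would allow '!' and '?'
// in identifiers, in addition to everything isLetter already accepts.
func isValidIdentifierCharacter(ch rune) bool {
    return isLetter(ch) || ch == '!' || ch == '?'
}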
readIdentifier() does exactly what its name suggests: it reads in an identifier and advances our lexer's
positions until it encounters a non-letter character.
In the default: arm of the switch statement we use readIdentifier() to set the Literal field of our
current token. But what about its Type? Now that we have read in identifiers like let, fn or foobar, we
need to be able to tell user-defined identifiers apart from language keywords. We need a function that returns
the correct TokenType for the token literal we have. What better place than the token package to add such
a function?
// token/token.go
var keywords = map[string]TokenType{
    "fn":  FUNCTION,
    "let": LET,
}

func LookupIdent(ident string) TokenType {
    if tok, ok := keywords[ident]; ok {
        return tok
    }
    return IDENT
}
LookupIdent checks the keywords table to see whether the given identifier is in fact a keyword. If it is, it
returns the keyword's TokenType constant. If it isn't, we just get back token.IDENT, which is the TokenType
for all user-defined identifiers.
With this in hand we can now complete the lexing of identifiers and keywords:
// lexer/lexer.go
func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    switch l.ch {
    // [...]
    default:
        if isLetter(l.ch) {
            tok.Literal = l.readIdentifier()
            tok.Type = token.LookupIdent(tok.Literal)
            return tok
        } else {
            tok = newToken(token.ILLEGAL, l.ch)
        }
    }

    // [...]
}
The early exit here, our return tok statement, is needed because when calling readIdentifier() we call
readRune() repeatedly, advancing our readPosition and position fields past the last rune of the current
identifier. So we don't want to call readRune() again after the switch statement before returning.
Running our tests now, we can see that let is identified correctly but the tests still fail:
% go test ./lexer
--- FAIL: TestNextToken (0.00s)
lexer_test.go:70: tests[1] - tokentype wrong. expected="IDENT", got="ILLEGAL"
FAIL
FAIL    monkey/lexer 0.008s
The problem is the next token we want: an IDENT token with "five" in its Literal field. Instead we get an
ILLEGAL token. Why is that? Because of the whitespace character between let and five. But in Monkey
whitespace is not significant. It shouldn't make a difference whether we write let five or let    five. So
we need to skip over whitespace entirely.
// lexer/lexer.go
func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    l.skipWhitespace()

    switch l.ch {
    // [...]
}

func (l *Lexer) skipWhitespace() {
    for l.ch == ' ' || l.ch == '\t' || l.ch == '\n' || l.ch == '\r' {
        l.readRune()
    }
}
This little helper function is found in a lot of parsers. Sometimes it's called eatWhitespace, sometimes
consumeWhitespace and sometimes something entirely different. Which characters these functions actually
skip depends on the language. Some languages do create tokens for newline characters, for example, and throw
parsing errors if they are not at the correct place in the stream of tokens. We skip over newline characters
to make the parsing step later on a little easier. And in case you haven't noticed, there's still a bug in here:
one of those invisible Unicode characters, which is effectively whitespace in this context, would cause our lexer
to trip and output an ILLEGAL token. Again: for simplicity's sake we'll ignore these for now. Whitespace in
the Monkey language consists of the space, tab, newline and carriage return characters.
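If we ever wanted to treat those invisible Unicode whitespace characters as insignificant too, a variant of
the helper could lean on the unicode package, which lexer.go already imports for isLetter. This is just a
sketch, not what we use in this book:

// Sketch only: skips every rune that Unicode classifies as whitespace.
func (l *Lexer) skipWhitespace() {
    for unicode.IsSpace(l.ch) {
        l.readRune()
    }
}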
With skipWhitespace() in place, the lexer trips over the 5 in the let five = 5; part of our test input.
And that's right, it doesn't know yet how to turn numbers into tokens. It's time to add this.
As we did previously for identifiers, we now need to add more functionality to the default arm of our switch
statement.
// lexer/lexer.go
func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    l.skipWhitespace()

    switch l.ch {
    // [...]
    default:
        if isLetter(l.ch) {
            tok.Literal = l.readIdentifier()
            tok.Type = token.LookupIdent(tok.Literal)
            return tok
        } else if isDigit(l.ch) {
            tok.Type = token.INT
            tok.Literal = l.readNumber()
            return tok
        } else {
            tok = newToken(token.ILLEGAL, l.ch)
        }
    }

    // [...]
}

func (l *Lexer) readNumber() string {
    position := l.position
    for isDigit(l.ch) {
        l.readRune()
    }
    return l.input[position:l.position]
}

func isDigit(ch rune) bool {
    return '0' <= ch && ch <= '9'
}
As you can see, the added code closely mirrors the part concerned with reading identifiers and keywords.
The readNumber method is exactly the same as readIdentifier except for its usage of isDigit instead of
isLetter. We could probably generalize this by passing in the rune-identifying functions as arguments, but
won't, for simplicity's sake and ease of understanding.
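Just to illustrate what that generalization could look like (this is not code we'll use, and the name
readWhile is made up for this sketch), both methods could be expressed through one helper that takes the
rune-identifying function as an argument:

// Sketch only: a generalized reader. readIdentifier would become
// l.readWhile(isLetter) and readNumber would become l.readWhile(isDigit).
func (l *Lexer) readWhile(pred func(rune) bool) string {
    position := l.position
    for pred(l.ch) {
        l.readRune()
    }
    return l.input[position:l.position]
}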
The isDigit function is as simple as isLetter. It just returns whether the passed in rune is a Latin digit
between 0 and 9.
With this added, our tests pass:
% go test ./lexer
ok      monkey/lexer 0.008s
I don't know if you noticed, but we simplified things a lot in readNumber. We only read in integers. What
about floats? Or numbers in hex notation? Octal notation? We ignore them for now and just say that
Monkey doesn't support them. Of course, the reason for this is again the educational aim and limited scope
of this book.
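Still, just to show where such an extension would live (none of this is part of Monkey, and a new token type
such as FLOAT would also be needed), readNumber could be taught to accept a single decimal point like this:

// Sketch only: a readNumber variant that would also read simple float
// literals such as 3.14 by allowing one '.' followed by more digits.
func (l *Lexer) readNumber() string {
    position := l.position
    for isDigit(l.ch) {
        l.readRune()
    }
    if l.ch == '.' {
        l.readRune()
        for isDigit(l.ch) {
            l.readRune()
        }
    }
    return l.input[position:l.position]
}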
It's time to pop the champagne and celebrate: we successfully turned the small subset of the Monkey language
we used in our test case into tokens!
With this victory under our belt, it's easy to extend the lexer so it can tokenize a lot more of Monkey source
code.
End Of Sample
You've reached the end of the sample. I hope you enjoyed it. You'll soon be able to buy the full version of
the book (in multiple formats, including the code) online at:
http://interpreterbook.com