MiniGo Spec
MiniGo SPECIFICATION
Version 1.0
MiniGo’s SPECIFICATION
Version 1.2
1 Introduction
Go, also known as Golang, is a statically typed, compiled programming language developed
by Google. It was created in 2007 by Robert Griesemer, Rob Pike, and Ken Thompson, and
was officially released to the public in 2009. The language was designed to address the shortcomings
of other programming languages in terms of performance, scalability, and ease of use,
particularly for large software systems. Go emphasizes simplicity, efficiency, and concurrency,
making it ideal for modern, distributed applications.
One of the key reasons for Go’s creation was to improve the development process at Google.
The team wanted a language that would be faster to compile than C++ and more efficient
in terms of concurrency management than languages like Java. Go introduces features such
as goroutines (lightweight threads) and channels, which facilitate concurrent programming and
make it highly effective for handling multiple tasks simultaneously.
Go has become increasingly popular, particularly in cloud computing, microservices, and DevOps.
Its growing community and extensive standard library have contributed to its widespread
adoption. Some of the most notable projects developed using Go include Docker, Kubernetes,
Terraform, and the cloud infrastructure behind companies like Uber, Dropbox, and Netflix.
The language’s strong performance, ease of use, and scalability have made it a top choice for
building large, reliable systems that require high concurrency and low-latency processing.
MiniGo is a simplified version of the Go programming language, designed specifically for students
to practice building a compiler within a limited timeframe. It retains the core concepts
of Go, such as basic data types, structs, and interfaces, but removes more complex features like
goroutines, channels, and the extensive standard library. The goal of MiniGo is to provide a
manageable subset of Go that allows students to focus on fundamental concepts of programming
language implementation, including lexical analysis, parsing, semantic checking, and
code generation. By working with MiniGo, students gain hands-on experience in implementing
a programming language, which helps them understand the underlying principles of language
design and compilers, while also giving them the opportunity to create a working language from
scratch.
2 Program structure
A program in MiniGo consists of a single file that includes various declarations without strict
ordering. The file may contain constant declarations, variable declarations, type declarations
(such as structs and interfaces), and function declarations, all of which can appear in any order.
The program must define a main function, which serves as the entry point of execution.
The main function takes no parameters and returns no value.
3 Lexical structure
A MiniGo program is a sequence of characters from the ASCII character set. Whitespace
characters include blank spaces (' '), tabs ('\t'), form feeds (ASCII FF, '\f'), and carriage
returns (ASCII CR, '\r'). Newline characters (ASCII LF, '\n') are also treated as whitespace
in most contexts, but they additionally terminate statements: a semicolon is inserted
automatically where one is needed. Newlines are also used
to determine line numbers in MiniGo, which helps to accurately report errors and provide
context in the compiler’s error messages.
In MiniGo, there are two types of comments: single-line comments and multi-line comments.
Single-line comments begin with // and extend to the end of the line, allowing developers
to add brief explanations or notes within the code. These comments are typically used for
inline documentation or to temporarily disable code during development. Multi-line comments
begin with /* and end with */. They can span multiple lines, making them useful for longer
descriptions or for commenting out large blocks of code. A notable feature of MiniGo’s multi-
line comments is that they support nesting, meaning you can place one /* ... */ comment inside
another. Both types of comments are ignored by the compiler and do not affect the execution
of the program, but they are important for improving code readability and maintaining clarity
in complex code sections.
/* This is a /* nested
multi-line
comment. */
*/
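Because nested comments cannot be recognized by a plain regular expression, a scanner usually tracks the nesting depth with a counter. The following is a minimal sketch in Go of how a lexer might locate the end of a nested multi-line comment. The helper name skipBlockComment and its return convention are assumptions made for illustration and are not part of MiniGo.

```go
package main

import "fmt"

// skipBlockComment is a hypothetical helper: given src and the index of an
// opening "/*", it returns the index just past the matching "*/", or -1 if
// the comment is never closed. Nesting is tracked with a depth counter.
func skipBlockComment(src string, start int) int {
	depth := 0
	i := start
	for i+1 < len(src) {
		switch {
		case src[i] == '/' && src[i+1] == '*':
			depth++ // entering one more level of comment
			i += 2
		case src[i] == '*' && src[i+1] == '/':
			depth-- // leaving one level of comment
			i += 2
			if depth == 0 {
				return i
			}
		default:
			i++
		}
	}
	return -1
}

func main() {
	src := "/* This is a /* nested\nmulti-line\ncomment. */\n*/x := 1"
	end := skipBlockComment(src, 0)
	fmt.Println(src[end:]) // prints "x := 1"
}
```

A real lexer would also count the newlines it skips so that line numbers in later error messages stay accurate.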
A token is a sequence of one or more characters in the source code that, when grouped together,
acts as a single atomic unit of the language. In the MiniGo programming language,
tokens are the smallest meaningful elements in the code and are categorized into five types:
identifiers, keywords, operators, separators, and literals. Each token type serves a distinct
purpose, contributing to the syntactic and semantic structure of a MiniGo program.
3.3.1 Identifiers
In MiniGo, an identifier is the name used to identify variables, constants, types, functions, or
other user-defined elements within a program. Identifiers must adhere to the following rules:
3.3.2 Keywords
In MiniGo, keywords are reserved words that have special meaning within the language.
They are used to define the syntax and structure of a MiniGo program and cannot be used as
identifiers (e.g., variable names, function names, etc.). The following are the reserved keywords
in MiniGo:
• Keywords must be used only in their predefined context and cannot be redefined or used
as identifiers.
3.3.3 Operators
• +, -, *, /, %
• &&, ||, !
• .
These operators are essential for performing arithmetic, logical, relational, and other operations
in MiniGo. Their specific behaviors and precedence rules will be explained in detail later.
3.3.4 Separators
• (, )
• {, }
• [, ]
• ,
• ;
3.3.5 Literals
Literals in MiniGo represent fixed values directly written in the source code. They are classified
as follows:
• Integer Literals: Integer literals represent whole numbers and can be written in multiple
numeric systems:
– \n: Newline
– \t: Tab
– \r: Carriage return
– \": Double quote
– \\: Backslash
Examples of valid string literals: "hello", "123", "This is a string with a newline\n".
• Boolean Literals: Boolean literals represent the logical values true and false. They
are case-sensitive and must be written in lowercase.
Examples of valid boolean literals: true, false.
• Nil Literal: The nil literal, nil, represents the absence of a value. It is used in situations
where a value is optional or non-initialized.
Example of a valid nil literal: nil.
! && ||
+ - * / %
== != > >= < <=
A value of type float can represent real numbers, including those with decimal points.
The following operators can be used with float values:
+ - * /
== != > >= < <=
The string type represents a sequence of characters enclosed in double quotes " ". The
following operations are supported for string values:
Examples:
str1 := "Hello"
str2 := "World"
str3 := str1 + " " + str2 // str3 == "Hello World"
str4 := "apple"
str5 := "banana"
result := str4 == str5 // result == false
MiniGo supports the array type, which represents a collection of elements of the same type.
• Arrays are indexed from 0, and the size of the array is fixed once it is defined.
• An array type declaration begins with a list of dimensions followed by a type which is any
of the primitive types (int, float, boolean, string) or composite types such as struct.
A dimension is an integer literal or constant enclosed in a pair of square brackets [ ].
A struct in MiniGo is a composite data type that allows you to group different types of variables
together into a single unit. It is useful for representing objects with different properties,
especially when you need to combine various data types under one logical entity.
To define a struct type in MiniGo, you declare a struct type using the type keyword followed
by the name of the struct and the struct keyword. The fields of the struct are enclosed within
curly braces { } and are defined by a list of field names along with their associated types, each
followed by a semicolon or a newline. The struct type declaration itself ends with a semicolon or
a newline.
Each field in a struct represents a property of the object, and the type of the field can be any
valid MiniGo type (including primitive types, arrays, other structs, and interfaces). You can
define a struct with any number of fields, and these fields may be of different types.
Here is an example of how to define a struct in MiniGo:
type Person struct {
name string
age int
}
In this example, the Person struct has two fields: name (of type string) and age (of type int).
Notice that the fields are enclosed within curly braces { } and each field consists of a name
and a type, separated by a space.
Once you have defined a struct, you can create instances of it by initializing the struct with
values for its fields. For example,
p := Person{name: "Alice", age: 30}
Here, p is an instance of the Person struct with the field name set to "Alice" and the field age
set to 30. The values inside the curly braces { } are used to initialize the struct fields.
It is also possible to create an empty struct instance where the fields are initialized to their
zero values:
p := Person{}
In this case, p.name will be an empty string ("") and p.age will be zero.
Accessing the fields of a struct can be done using the dot notation:
p.age := 31
putIntLn(p.age) // Output: 31
A struct in MiniGo does not have methods by default, but you can define methods associated
with a struct type. A method is a function that has a special receiver argument, which is the
instance of the struct.
For example, you can define a method Greet for the Person struct to return a greeting message:
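As a rough illustration, the sketch below is written in standard Go, whose receiver syntax MiniGo follows. The greeting text is an assumption, and output goes through Go's fmt package rather than MiniGo's built-in functions.

```go
package main

import "fmt"

type Person struct {
	name string
	age  int
}

// Greet is a method: the receiver p, of type Person, appears between
// the func keyword and the method name.
func (p Person) Greet() string {
	return "Hello, " + p.name
}

func main() {
	p := Person{name: "Alice", age: 30}
	fmt.Println(p.Greet()) // prints "Hello, Alice"
}
```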
An interface defines a set of methods that a type must implement. Any type that implements
the methods of an interface satisfies that interface.
To define an interface in MiniGo, you declare an interface type using the type keyword
followed by the name of the interface and the interface keyword. The methods of the interface
are enclosed within curly braces { }, and each method consists of the method name, a possibly
empty list of parameters enclosed in parentheses, and an optional return type. Parameters can be
defined as either a list of names sharing the same type (e.g., x, y int), or as individual name-type
pairs separated by commas (e.g., x int, y float). Each method declaration ends with a
semicolon or a newline. The interface declaration also ends with a semicolon or a newline.
Example of an interface declaration:
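The following sketch, written in standard Go, shows one possible interface and a struct that satisfies it. The names Shape, Rectangle, Area, and Describe are hypothetical, and Go's float64 stands in for MiniGo's float.

```go
package main

import "fmt"

// Shape declares two methods: one without parameters and one with a parameter.
type Shape interface {
	Area() float64
	Describe(prefix string) string
}

type Rectangle struct {
	width, height float64
}

// Rectangle implements both methods, so it satisfies Shape.
func (r Rectangle) Area() float64 { return r.width * r.height }

func (r Rectangle) Describe(prefix string) string { return prefix + "rectangle" }

func main() {
	var s Shape = Rectangle{width: 2, height: 3}
	fmt.Println(s.Area())         // prints 6
	fmt.Println(s.Describe("a ")) // prints "a rectangle"
}
```

No explicit declaration links Rectangle to Shape; implementing the methods is enough.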
5.1 Variables
There are three kinds of variables: global variables, local variables, and function parameters.
1. Global variables: Global variables are declared outside any function in the program.
They are visible from the place where they are declared to the end of the program.
2. Local variables: Local variables are declared inside blocks (i.e., inside the body of functions).
They are visible only within the block where they are declared and all nested
blocks. A block is a list of statements enclosed by a pair of curly braces.
In MiniGo, a global or local variable is declared using the var keyword, followed by the variable
name, an optional type, an optional initialization value and a semicolon. If the type is omitted,
it is automatically inferred from the initialization expression. Initialization is done using an
equals sign (=) followed by an expression, and the value of this expression must be computable
at compile time.
Variables without an explicit initialization value are assigned the zero value of their type. For
instance:
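A short sketch in standard Go illustrates declarations with and without initializers. It assumes that MiniGo's zero values match Go's (0 for numbers, false for booleans, the empty string for strings) and uses Go's float64 and bool in place of MiniGo's float and boolean.

```go
package main

import "fmt"

var count int     // no initializer: zero value 0
var ratio float64 // no initializer: zero value 0
var ready bool    // no initializer: zero value false
var label string  // no initializer: zero value ""
var limit = 10    // no type given: int is inferred from the initializer

func main() {
	fmt.Println(count, ratio, ready, label == "", limit) // prints "0 0 false true 10"
}
```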
5.2 Constants
In MiniGo, constants are values that cannot be changed once they are assigned. They are used
to define fixed values that are immutable throughout the program. Constants must be assigned
a value at the time of declaration, and their value cannot be altered later.
Constants can be divided into two types:
1. Global constants: These constants are declared outside of any function, and they are
accessible throughout the entire program, from the point of declaration to the end of the
program.
2. Local constants: These constants are declared inside a function or block. Their scope
is limited to that function or block, and they are not accessible outside of it.
To declare a constant, use the const keyword followed by the constant’s name, an equals sign (=),
its value, and a terminating semicolon. The value can be either a literal (such as a number,
string, or boolean) or an expression that evaluates to a value.
Examples of constant declarations:
const Pi = 3.14;
It is important to note that when declaring a constant, the value must be determined at compile
time, meaning that the value can be an expression, but the expression must be fully evaluable
before runtime (i.e., without any dynamic computation).
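The sketch below, in standard Go, shows constants whose values are expressions built only from literals and other constants, so a compiler can evaluate them without running the program. The names TwoPi, Greeting, and MaxItems are illustrative assumptions.

```go
package main

import "fmt"

const Pi = 3.14
const TwoPi = 2 * Pi           // built from a literal and another constant
const Greeting = "Mini" + "Go" // string constants can be combined as well
const MaxItems = 4 * 8         // folded to 32 at compile time

func main() {
	fmt.Println(Greeting, MaxItems) // prints "MiniGo 32"
	fmt.Println(TwoPi > Pi)         // prints "true"
}
```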
In MiniGo, functions and methods are key components for creating reusable, modular code.
A function performs a specific task, while a method is a function associated with a particular
type, typically a struct or interface.
Functions
A function in MiniGo is a named block of code that performs a specific task. To declare a
function, use the func keyword followed by the function name, followed by a pair of parentheses
( ), which may optionally contain a comma-separated list of parameters. After the parameters
(if any), you can optionally specify a return type. Finally, the function body is enclosed in
curly braces { }.
Example of a simple function:
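As an illustration in standard Go, the sketch below declares one function with parameters and a return type and one with neither. The bodies are assumptions, and printing uses Go's fmt package rather than MiniGo's built-in functions.

```go
package main

import "fmt"

// add takes two parameters that share one type annotation and returns an int.
func add(x, y int) int {
	return x + y
}

// reset has an empty parameter list and no return type.
func reset() {
	fmt.Println("reset called")
}

func main() {
	fmt.Println(add(3, 4)) // prints 7
	reset()                // prints "reset called"
}
```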
Methods
In MiniGo, methods are a type of function that is associated with a specific type, usually a
struct or an interface. Methods allow you to define behavior that operates on the fields or
properties of an object.
To declare a method for a struct, write it similarly to a function, with the addition of a receiver
placed between the func keyword and the method name. The receiver consists of a name and
a type, enclosed in parentheses.
Example of a method in a struct:
In this example, the Add method is associated with the Calculator struct. The receiver c
refers to a Calculator instance, allowing the method to modify the struct’s value field.
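A comparable method can be sketched in standard Go as follows. The sketch uses a pointer receiver, which is what Go requires for the update to be visible to the caller; how MiniGo passes receivers is not stated here, so treat that detail as an assumption.

```go
package main

import "fmt"

type Calculator struct {
	value int
}

// Add updates the value field of the receiver c.
func (c *Calculator) Add(x int) {
	c.value = c.value + x
}

func main() {
	c := Calculator{value: 1}
	c.Add(4)
	fmt.Println(c.value) // prints 5
}
```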
Key Differences Between Functions and Methods
• Functions do not have access to the fields of a struct, while methods can access and
modify the struct’s fields.
• Methods can be declared for structs or interfaces, but not for primitive types or basic
data types in MiniGo.
6 Expressions
Expressions in MiniGo are constructs made up of operators and operands that work with
data to produce new data. An expression can involve constants, variables, calls, or results from
other operators.
In MiniGo, there are several types of operators, including arithmetic, relational, boolean and
operators for accessing array and struct elements. Each of these operators is applied to operands
of specific types and returns a result based on the operation.
MiniGo supports the standard arithmetic operators for working with integers and floating-point
numbers. In addition, the + operator can be used to concatenate two strings. These operators include:
Note:
• The + operator:
– If both operands are of type string, the result is of type string (concatenation).
– If both operands are of type int or float, the result is of the same type as the
operands.
– If one operand is int and the other is float, the result is float.
• The -, *, and / operators:
– If both operands are of type int or float, the result is of the same type as the
operands.
– If one operand is int and the other is float, the result is float.
• The % operator:
– Works only with operands of type int, and the result is also of type int.
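A semantic checker could encode these rules in a single function. The Go sketch below is one possible shape, not a prescribed design: the Type representation and the function name arithResult are hypothetical, and it assumes that the second pair of int/float rules applies to the -, *, and / operators.

```go
package main

import (
	"errors"
	"fmt"
)

// Type is a hypothetical representation of MiniGo's primitive types.
type Type string

const (
	Int    Type = "int"
	Float  Type = "float"
	String Type = "string"
)

// arithResult returns the result type of "left op right", or an error
// when the operand types are not allowed for the operator.
func arithResult(op string, left, right Type) (Type, error) {
	isNum := func(t Type) bool { return t == Int || t == Float }
	switch {
	case op == "%":
		if left == Int && right == Int {
			return Int, nil // % is defined only for int operands
		}
	case op == "+" && left == String && right == String:
		return String, nil // string concatenation
	case op == "+" || op == "-" || op == "*" || op == "/":
		if isNum(left) && isNum(right) {
			if left == Float || right == Float {
				return Float, nil // mixing int and float yields float
			}
			return Int, nil
		}
	}
	return "", errors.New("type mismatch: " + string(left) + " " + op + " " + string(right))
}

func main() {
	t, _ := arithResult("+", Int, Float)
	fmt.Println(t) // prints "float"
	_, err := arithResult("%", Float, Int)
	fmt.Println(err != nil) // prints "true"
}
```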
Relational operators are used to compare values. In MiniGo, these operators require that the
operands be of the same type, either both int, both float, or both string. The result of the
relational operations is always a boolean type.
The following relational operators are supported:
To access an element of an array, MiniGo uses the index operator [], where the expression inside
the brackets must evaluate to an integer.
For example:
a[2][3] := b[2] + 1;
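The same access pattern can be tried in standard Go, as in the sketch below. Note that Go spells assignment to an existing element with = where MiniGo uses :=; the function name lastRow and the array contents are illustrative assumptions.

```go
package main

import "fmt"

// lastRow builds a 3x4 array, stores into it by index, and returns row 2.
func lastRow() [4]int {
	var a [3][4]int // 3 rows of 4 ints, every element starts at 0
	b := [3]int{10, 20, 30}
	i := 1
	a[2][3] = b[2] + 1 // one index per dimension
	a[i+1][0] = b[i]   // any int-valued expression can serve as an index
	return a[2]
}

func main() {
	fmt.Println(lastRow()) // prints "[20 0 0 31]"
}
```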
To access the fields of a struct, MiniGo uses the dot operator (.), which references a specific
field of a struct value.
For example:
person.name := "John";
person.age := 30;
6.6 Literal
In addition to the literals described in Section 3.3.5, MiniGo supports composite literals, such
as array and struct literals.
An array literal begins with an array type, followed by a list of elements enclosed in curly
braces.
For example:
A struct literal begins with the name of the struct, followed by a pair of curly braces containing
an optional, comma-separated list of elements. Each element consists of a field name, a colon,
and an expression.
For example:
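The sketch below shows array and struct literals in standard Go, which MiniGo's composite literals resemble. The variable names are hypothetical, the nested-brace form for the two-dimensional array is Go's syntax, and float64 stands in for MiniGo's float.

```go
package main

import "fmt"

type Person struct {
	name string
	age  int
}

var primes = [3]int{2, 3, 5}                    // array type, then the elements in braces
var grid = [2][2]float64{{1.0, 2.0}, {3.0, 4.0}} // one inner list per row
var alice = Person{name: "Alice", age: 30}      // field name, colon, expression
var nobody = Person{}                           // the element list may be empty

func main() {
	fmt.Println(primes[2], grid[1][0], alice.name, nobody.age)
}
```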
In MiniGo, function and method calls invoke the behavior defined by functions and
methods, respectively. The two forms are similar; the difference is that a method call is applied
to a receiver, whereas a function call is not.
To call a function in MiniGo, use the function name followed by a pair of parentheses ()
containing the actual arguments (if any). The arguments in the parentheses must match the
parameters declared in the function’s signature, both in number and type.
For example, to call a function named add that takes two integer arguments, the call would be
written as:
add(3, 4)
If the function has no parameters, you can call it with empty parentheses:
reset()
A method call in MiniGo is similar to a function call, but with the addition of a receiver.
Methods are associated with types (usually structs), and the receiver is declared as
part of the method declaration.
To call a method, use the expression representing the instance of the type, followed by a dot .
and the method name, with the actual arguments in parentheses. The receiver is implicitly
passed to the method when it is called.
For example, if calculator is an instance of a struct Calculator, and the method add is
defined for this struct, you would call it like this:
calculator.add(3, 4)
calculator.reset()
In MiniGo, operators are evaluated on the basis of their precedence and associativity. The
following table summarizes the precedence and associativity of all operators:
A parenthesized expression has the highest precedence, so parentheses can be used to override the
default precedence of operators.
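The effect of parentheses can be seen in a small Go sketch. It assumes that MiniGo, like Go, ranks multiplicative operators above additive ones and groups binary arithmetic operators from the left; the variable names are illustrative.

```go
package main

import "fmt"

var (
	plain   = 2 + 3*4     // 14: * is applied before +
	grouped = (2 + 3) * 4 // 20: the parenthesized sum is evaluated first
	chained = 10 - 4 - 3  // 3: grouped from the left, as (10 - 4) - 3
)

func main() {
	fmt.Println(plain, grouped, chained) // prints "14 20 3"
}
```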
7 Statements
A statement in MiniGo specifies an action for the program to perform. Each statement must
end with a semicolon (;). However, the semicolon can be omitted if the statement ends with a
newline.
MiniGo has several kinds of statements, described below:
7.3 If Statement
• The boolean expression inside the parentheses must evaluate to true or false.
• If the condition evaluates to true, the statements within the first block are executed.
• If the condition evaluates to false, the statements within the optional else block are
executed.
• Additional conditions can be checked using the else if clause, which follows the same
structure as the if clause.
For example,
if (x > 10) {
putStringLn("x is greater than 10");
} else if (x == 10) {
putStringLn("x is equal to 10");
} else {
putStringLn("x is less than 10");
}
Note:
• Every block must be enclosed in curly braces, even if it contains only one statement.
• The boolean expression must be valid and not produce runtime errors. For instance, all
variables within the expression must be declared and initialized.
• Nested if statements are allowed and can be used to create more complex decision-making
logic.
The for statement in MiniGo provides three common forms of iteration: basic form with only
a logical expression, form with initialization, and form for iterating over an array. Below is a
detailed description of each form.
In the basic form of a for loop, only a logical expression (condition) is used to control the loop
execution. The syntax is as follows:
for condition {
// statements
}
Here, the condition is a boolean expression that is evaluated before each iteration. The loop
continues executing as long as the condition evaluates to true. If the condition evaluates to
false, the loop exits.
Example:
for i < 10 {
// loop body
}
In this case, the loop will run as long as the value of i is less than 10.
In this form, the for statement includes an initialization expression, a condition expression,
and an update expression. The syntax is as follows:
• Initialization: This is executed only once before the loop begins and is usually used to set
the initial state of variables used in the loop. It is written as an assignment
statement or a variable declaration with initialization.
• Condition: This is a boolean expression that is evaluated before each iteration. The
loop continues as long as this expression evaluates to true. If the condition becomes
false, the loop terminates.
• Update: This part is written as an assignment statement which is executed after each
iteration. It is typically used to update or increment loop variables.
Example:
In this example, i is initialized to 0, and the loop will continue as long as i is less than 10.
After each iteration, i is incremented by 1.
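A loop of this shape can be sketched in standard Go as follows. The function name sumBelow is hypothetical, and the update is written as the assignment i += 1 on the assumption that MiniGo accepts the same form.

```go
package main

import "fmt"

// sumBelow adds up the integers 0, 1, ..., n-1 with a three-part for loop.
func sumBelow(n int) int {
	total := 0
	for i := 0; i < n; i += 1 { // initialization; condition; update
		total += i
	}
	return total
}

func main() {
	fmt.Println(sumBelow(10)) // prints 45
}
```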
In MiniGo, the for loop can also be used to iterate over the elements of an array using the
range keyword. The syntax for iterating over an array is as follows:
• index: This is a scalar variable keeping the index of the current element in the array. It
is of type int. In each iteration, index will take the value of the position of the element
within the array (starting from 0).
• value: This is a scalar variable keeping the value of the current element in the array.
It matches the type of the elements in the array. In each iteration, value will hold the
element at the current index.
• array: This is the array that is being iterated over. The array is the collection of elements
that the loop is iterating through. It must be of a fixed array type (since MiniGo does
not support slices or maps).
The range keyword is used to iterate over arrays in MiniGo, which only supports arrays (not
slices or maps). The loop iterates through each element, providing both the index and the
value of the element during each iteration.
Example 1 (Array iteration):
Example 2 (Array iteration without the index): If you do not need the index, you can omit it
by using the blank identifier _ in its place.
In this example, only the value of each element is used, and the index is ignored.
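Both variants of array iteration can be sketched in standard Go, whose range clause MiniGo mirrors. The function names and array contents below are illustrative assumptions.

```go
package main

import "fmt"

// weightedSum uses both the index and the value of each element.
func weightedSum(arr [3]int) int {
	total := 0
	for index, value := range arr {
		total += index * value
	}
	return total
}

// sum ignores the index by writing the blank identifier _ in its place.
func sum(arr [3]int) int {
	total := 0
	for _, value := range arr {
		total += value
	}
	return total
}

func main() {
	arr := [3]int{10, 20, 30}
	fmt.Println(weightedSum(arr)) // prints 80
	fmt.Println(sum(arr))         // prints 60
}
```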
The break statement in MiniGo is used to immediately terminate the execution of a for loop.
When break is encountered, the control flow will jump to the first statement that follows the
loop, effectively exiting the loop early.
The syntax for the break statement is:
break;
Note: The break statement does not require any condition. It immediately exits the nearest
enclosing for loop, regardless of the current iteration or condition.
Example: Breaking out of a for loop when a certain condition is met:
In this example, the for loop will terminate when the value of i reaches 5, and control will be
transferred to the first statement following the loop.
Important: The break statement is used exclusively in loops such as for. It can only break
out of the innermost loop in which it is placed.
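The behavior of break can be checked with a small Go sketch. The function name countUntil and the loop bound of 10 are assumptions chosen for illustration.

```go
package main

import "fmt"

// countUntil counts loop iterations and leaves the loop once i equals stop.
func countUntil(stop int) int {
	count := 0
	for i := 0; i < 10; i += 1 {
		if i == stop {
			break // exit the enclosing for loop immediately
		}
		count += 1
	}
	return count // execution resumes here after break
}

func main() {
	fmt.Println(countUntil(5))  // prints 5: iterations 0 through 4 ran
	fmt.Println(countUntil(20)) // prints 10: break was never reached
}
```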
The continue statement in MiniGo is used to skip the remaining part of the current iteration
of a loop and immediately proceed to the next iteration. When continue is encountered, the
control flow jumps back to the loop’s condition, effectively causing the next iteration to begin.
The syntax for the continue statement is:
continue;
Note: The continue statement does not terminate the loop; instead, it skips the rest of the
current iteration and continues with the next iteration.
Example: Skipping over an iteration in a for loop when a certain condition is met:
In this example, when the value of i is 5, the continue statement causes the loop to skip
the remaining part of the current iteration and continue with the next iteration, i.e., i will be
incremented and the next iteration will begin.
Important: The continue statement can only be used inside loops, such as for loops. It only
affects the current iteration and does not terminate the loop itself.
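For comparison, the Go sketch below skips a single iteration with continue while the loop keeps running. The function name sumSkipping and the loop bound are hypothetical.

```go
package main

import "fmt"

// sumSkipping adds the integers 0 through 9 except the one equal to skip.
func sumSkipping(skip int) int {
	total := 0
	for i := 0; i < 10; i += 1 {
		if i == skip {
			continue // skip the rest of this iteration; the update i += 1 still runs
		}
		total += i
	}
	return total
}

func main() {
	fmt.Println(sumSkipping(5)) // prints 40, that is 45 minus the skipped 5
}
```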
A function or method call statement is similar to a function or method call, except that
it is not part of an expression and is always terminated by a semicolon or a newline.
For example:
A return statement is used to transfer control and data back to the caller of the function in
which it appears. The return statement begins with the keyword return, which may optionally
be followed by an expression.
If the function or method has a return type, a return statement is required to return an
appropriate value.
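Both forms of the return statement appear in the Go sketch below. The function names abs and logPositive are illustrative, and printing uses Go's fmt package rather than MiniGo's built-in functions.

```go
package main

import "fmt"

// abs has a return type, so every path ends in a return with a value.
func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// logPositive has no return type; a bare return leaves it early.
func logPositive(x int) {
	if x <= 0 {
		return
	}
	fmt.Println(x)
}

func main() {
	fmt.Println(abs(-3)) // prints 3
	logPositive(-1)      // prints nothing
	logPositive(7)       // prints 7
}
```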
8 Scope
In MiniGo, there are three kinds of scopes: global, function/method, and local.
• Global scope: This scope applies to all declarations that are made outside of functions
or methods. For variable and constant declarations, the scope extends from the point of
declaration to the end of the program. For other kinds of declarations, such as type or
function declarations, the scope applies throughout the entire program.
• Local scope: This scope applies to variables and constants declared within a block. The
scope begins at the point of declaration and extends until the end of the enclosing block.
9 Built-in Functions
For convenience, MiniGo provides the following built-in functions:
func getInt() int: reads and returns an integer value from the standard input
func putInt(i int): prints the value of the integer i to the standard output
func putIntLn(i int): same as putInt except that it also prints a newline
func getFloat() float: reads and returns a floating-point value from the standard input
func putFloat(f float): prints the value of the float f to the standard output
func putFloatLn(f float): same as putFloat except that it also prints a newline
func getBool() boolean: reads and returns a boolean value from the standard input
func putBool(b boolean): prints the value of the boolean b to the standard output
func putBoolLn(b boolean): same as putBool except that it also prints a newline
func getString() string: reads and returns a string value from the standard input
func putString(s string): prints the value of the string s to the standard output
func putStringLn(s string): same as putString except that it also prints a newline
func putLn(): prints a newline to the standard output