Universidade dos Açores

Departamento de Matemática

A compiler for the π-Calculus:

the backend

Tiago Cogumbreiro

Supervisor
Doutor Francisco Cipriano da Cunha Martins

August 29, 2007


Acknowledgements

I want to express my gratitude to my tutor, Professor Francisco Martins. I
want to thank you for placing your confidence in me, for the enlightening
afternoons we spent working together, and for the reassuring words of wisdom
that helped me keep focused. I am eager for our next endeavour together!
I am grateful to Professor Vasco Vasconcelos for the continuous support
and especially for the opportunity provided.
Finally, I wish to thank the Centro de Investigação em Informática e
Tecnologias da Informação, of the Universidade Nova de Lisboa, for the
financial support.

Abstract

Our work covers the creation of the backend of the compiler for the
π-calculus. The backend consists of the translation to an abstract assembly
language. The abstract assembly language used in our compiler is MIL, a
multi-threaded typed assembly language.
MIL has the concept of types and locks. It uses the shared-memory
concurrency model, where tuples are protected by a lock and may be accessed by
multiple threads, which have to acquire the exclusive right to alter the data.
The π-calculus, on the other hand, uses the message-passing concurrency model,
where processes communicate by passing information to each other through
channels.
We start by describing the syntax, semantics, and type discipline of the
target language (MIL) in Chapter 1. Afterwards we show a few usage examples
that exercise the basic operations of MIL. Finally we present a runtime
library that implements π-calculus communication in the MIL language.
Compilers need to perform various operations over an abstract syntax
tree. The Visitor pattern is chosen to solve this problem. During our work
on the compiler we envisioned an extension to the classic Visitor pattern.
Chapter 2 explains the advantages of using the Extended Visitor pattern
over the usual approach.
The final chapter (Chapter 3) describes the translation process. It starts
by explaining the framing step, where variables are arranged and structured
according to a certain scope. Afterwards we give an overview of the
translation to MIL, followed by an in-depth architectural analysis of the
implementation.

Contents

1 MIL: Multi-threaded Typed Assembly Language
  1.1 Architecture
  1.2 Syntax
  1.3 Operational Semantics
  1.4 Type Discipline
  1.5 Examples
  1.6 The π-Calculus Runtime
    1.6.1 Architecture
    1.6.2 Implementation
    1.6.3 Usage Examples

2 Extending The Visitor Pattern
  2.1 Intent
  2.2 Motivation
  2.3 Applicability
  2.4 Structure
  2.5 Participants
  2.6 Collaborations
  2.7 Consequences
  2.8 Implementation
  2.9 Sample Code
  2.10 Known Uses

3 The Backend
  3.1 Framing
    3.1.1 Introduction
    3.1.2 Implementation
  3.2 Translation
    3.2.1 Overview
    3.2.2 Frame Indexing
    3.2.3 Type Adapting
    3.2.4 Code Generation Helper

4 Conclusion

5 Appendix

Chapter 1

MIL: Multi-threaded Typed Assembly Language

Chip multiprocessors (CMP) are becoming an increasingly realistic choice for
the microprocessor market, because further technological advances in single
processors are becoming more complex, power hungry, and expensive. However,
to take full advantage of this new architecture, we need the software
industry to adopt the multi-threaded programming paradigm.
MIL [8] proposes a solution to this problem [5] by combining a typed
assembly language with multi-threaded programming.
A typed assembly language provides the possibility for “executing trusted
code safely and efficiently” [4]. It ensures that foreign code never accesses
hidden resources of the host, allowing programs to provide safe program
extensions. It also allows untrusted compilers to generate assembly code
that can be compiled with a single trusted compiler. Because types exist,
pointers cannot be fabricated or forged, and jumps can only be made to
checked code. Checks are also made on the contents of registers before a
code block is run.
Multi-threaded programming at the assembly level makes it possible to
structure inter-thread synchronisation correctly. The type system of this
language enforces the absence of race conditions.

1.1 Architecture
MIL envisages an abstract CMP with a shared main memory. Each processor
core owns a number of registers and an instruction cache. The main memory
is divided into a heap (for storing data and code blocks) and a run pool
(for storing suspended threads). Data blocks, kept in the main memory, are
represented by tuples and are protected by a lock. Code blocks declare the
registers they need (including the type of each one), a list of required
locks, and a sequence of instructions. The run pool contains all the idle
threads; there may be more threads ready to run than there are processors.

1.2 Syntax
The syntax of our language is described by the grammar in Figures 1.1, 1.2,
and 1.7. We postpone the presentation of types to Section 1.4. We rely on a
set of heap labels ranged over by l, and a disjoint set of type variables
ranged over by α, β.
Most of the proposed instructions, shown in Figure 1.1, are standard
in assembly languages. Instructions are organised in sequences, ending with
a jump or with a yield. The instruction yield frees the processor to execute
another thread from the thread pool.
The abstract machine is parameterised by the number of processors
available (N) and the number of registers (R), as depicted in Figure 1.2.
An abstract machine can be in one of two states: halted or running. A
running machine comprises a heap, a thread pool, and an array of processors.
Heaps are maps from labels to heap values, which are either tuples or code
blocks. Tuples are vectors of values protected by some lock. Code blocks
comprise a signature and a body. The signature of a code block, which the
type system enforces to be of the form ∀[α⃗].(Γ requires Λ), comprises a
universal operator ∀[α⃗] that abstracts the type variables used in the
signature and in the body, a register file type Γ that describes the type of
each register, and the locks Λ held by the processor when jumping to this
code block. The body is a sequence of instructions to be executed by a
processor.
A thread pool is a multiset of pairs, each of which contains a pointer
(i.e., a label) to a code block and a register file. A processor array
contains N processors, where each is composed of a register file, a set of
locks, and a sequence of instructions.

registers         r ::= r1 | ... | rR
integer values    n ::= ... | -1 | 0 | 1 | ...
lock values       b ::= -1 | 0 | 1 | ...
values            v ::= r | n | b | l | pack τ, v as τ | packL α, v as τ |
                        v[τ] | ?τ
instructions      ι ::=
  control flow          r := v | r := r + v | if r = v jump v |
  memory                r := malloc [τ⃗] guarded by α |
                        r := v[n] | r[n] := v |
  unpack                α, r := unpack v |
  lock                  α, r := newLock b | α := newLockLinear |
                        r := tslE v | r := tslS v | unlockE v | unlockS v |
  fork                  fork v
inst. sequences   I ::= ι; I | jump v | yield

Figure 1.1: Instructions

lock sets          λ ::= α1, ..., αn
permissions        Λ ::= (λ, λ, λ)
register files     R ::= {r1 : v1, ..., rR : vR}
processor          p ::= ⟨R; Λ; I⟩
processors array   P ::= {1 : p1, ..., N : pN}
thread pool        T ::= {⟨l1, R1⟩, ..., ⟨ln, Rn⟩}
heap values        h ::= ⟨v1 ... vn⟩^α | τ{I}
heaps              H ::= {l1 : h1, ..., ln : hn}
states             S ::= ⟨H; T; P⟩ | halt

Figure 1.2: Abstract machine

1.3 Operational Semantics

Thread pools are managed by the rules illustrated in Figure 1.3. Rule
R-halt stops the machine when it finds an empty thread pool and, at the same
time, all processors are idle, changing the machine state to halt. Otherwise,
if there is an idle processor and a pair in the thread pool, rule
R-schedule assigns a new thread to the idle processor. Rule R-fork
places a new thread in the pool, which takes ownership of the locks required
by the forked code block.
The operational semantics of locks is depicted in Figure 1.4. The
instruction newLock creates a new lock in one of three possible states,
according to its parameter: locked exclusively (when the parameter is -1),
locked shared (when the parameter is 1), and unlocked (when the parameter
is 0). The scope of α is the rest of the code block. A tuple holding the
value of the parameter of newLock is allocated in the heap, and register r is
made to point to it. For example, a new lock in the unlocked state allocates
the tuple ⟨0⟩^β. When the lock is created in the exclusive lock state, the
new lock variable β is added to the set of exclusive locks held by the
processor. Similarly, when the lock is created in the shared lock state, the
new lock variable β is added to the set of shared locks held by the
processor.
Linear locks are created by newLockLinear. They are initialised in the
locked state. The new lock variable β is added to the set of linear locks.
Test and Set Lock, present in many machines designed with multiple
processes in mind, is an atomic operation that loads the contents of a word
into a register and then stores another value in that word. These two
operations (the load and the store) are indivisible. There are two variants
of Test and Set Lock in our language: tslE and tslS. When tslE is applied to
a lock in the unlocked state, the type variable α is added to the set of
exclusive locks and the value becomes ⟨-1⟩^α. Several threads may read values
from a tuple locked in the shared state; hence, when tslS is applied to a
shared or to an unlocked lock, the value contained in the tuple is
incremented, reflecting the number of readers holding the shared lock, and
the type variable α is added to the set of held shared locks. When tslS is
applied to a tuple with a number greater than −1, it places a 0 in the target
register (rule R-tslS-acq).
Shared locks are unlocked with unlockS, and the number of readers is
decremented. The running processor must hold the shared lock. Exclusive
locks are unlocked with unlockE, while the running processor holds the
exclusive lock.

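The lock rules of Figure 1.4 can be mimicked outside MIL. The following Java sketch is only an illustration under our own assumptions: the class name LockWord and its methods are hypothetical, the Java monitor stands in for the atomicity of Test and Set Lock, and the bookkeeping of the permission sets (which the type system performs) is left out.

```java
/** One-word lock tuple <b>: -1 = locked exclusively, 0 = unlocked, n > 0 = n readers. */
final class LockWord {
    private int value;

    LockWord(int initial) {
        this.value = initial; // mirrors the parameter of newLock
    }

    /** tslE: succeeds only on an unlocked word; returns 0 on success, the old value otherwise. */
    synchronized int tslE() {
        if (value == 0) {
            value = -1;
            return 0;
        }
        return value;
    }

    /** tslS: succeeds on an unlocked or shared word; returns 0 on success, -1 otherwise. */
    synchronized int tslS() {
        if (value >= 0) {
            value++; // one more reader
            return 0;
        }
        return -1;
    }

    /** unlockE: the word becomes unlocked. The caller is assumed to hold the lock. */
    synchronized void unlockE() {
        value = 0;
    }

    /** unlockS: one reader fewer. The caller is assumed to hold the lock. */
    synchronized void unlockS() {
        value--;
    }

    synchronized int peek() {
        return value;
    }
}

final class LockWordDemo {
    public static void main(String[] args) {
        LockWord w = new LockWord(0);
        System.out.println("tslS -> " + w.tslS() + ", word = " + w.peek());
        System.out.println("tslE -> " + w.tslE() + ", word = " + w.peek());
        w.unlockS();
        System.out.println("tslE -> " + w.tslE() + ", word = " + w.peek());
    }
}
```

Note that a failed tslE returns the current contents of the word rather than a fixed value, as in rule R-tslE-fail.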
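\[
\frac{\forall i.\ P(i) = \langle \_;\ \_;\ \mathsf{yield} \rangle}
     {\langle \_;\ \emptyset;\ P \rangle \rightarrow \mathsf{halt}}
\quad \text{(R-halt)}
\]

\[
\frac{H(l) = \forall[\_].(\_\ \mathsf{requires}\ \Lambda)\{I\}}
     {\langle H;\ T \uplus \{\langle l, R \rangle\};\ P\{i : \langle \_;\ \_;\ \mathsf{yield} \rangle\} \rangle
      \rightarrow
      \langle H;\ T;\ P\{i : \langle R;\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-schedule)}
\]

\[
\frac{\hat{R}(v) = l \qquad H(l) = \forall[\_].(\_\ \mathsf{requires}\ \Lambda)\{\_\}}
     {\langle H;\ T;\ P\{i : \langle R;\ \Lambda \uplus \Lambda';\ (\mathsf{fork}\ v;\ I) \rangle\} \rangle
      \rightarrow
      \langle H;\ T \cup \{\langle l, R \rangle\};\ P\{i : \langle R;\ \Lambda';\ I \rangle\} \rangle}
\quad \text{(R-fork)}
\]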

Figure 1.3: Operational semantics (thread pool)

Rules related to memory instructions are illustrated in Figure 1.5. Values
can be stored in a tuple when the lock that guards the tuple is held by the
processor in the set of exclusive locks or in the set of linear locks. A value
can be loaded from a tuple if the lock guarding it is held by the processor
in any of its sets of locks. The rule for malloc allocates a new tuple in the
heap and makes r point to it. The size of the tuple is that of the sequence
of types [τ⃗]; its values are uninitialised.
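The transition rules for the control flow instructions, illustrated in
Figure 1.6, are straightforward [6]. They rely on the function R̂, which works
on registers or on values, looking up values in registers, in packs, and in
universal concretions.

\[
\hat{R}(v) =
\begin{cases}
R(v) & \text{if } v \text{ is a register} \\
\mathsf{pack}\ \tau, \hat{R}(v')\ \mathsf{as}\ \tau' & \text{if } v \text{ is } \mathsf{pack}\ \tau, v'\ \mathsf{as}\ \tau' \\
\mathsf{packL}\ \alpha, \hat{R}(v')\ \mathsf{as}\ \tau & \text{if } v \text{ is } \mathsf{packL}\ \alpha, v'\ \mathsf{as}\ \tau \\
\hat{R}(v')[\tau] & \text{if } v \text{ is } v'[\tau] \\
v & \text{otherwise}
\end{cases}
\]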
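The recursive structure of R̂ is easy to see when written as code. The Java sketch below is an illustration under our own assumptions: the classes Value, Reg, IntLit, Label, Pack, TypeApp and RegisterFile are hypothetical names, types are kept as plain strings, and packL is omitted because it has the same shape as pack.

```java
import java.util.HashMap;
import java.util.Map;

abstract class Value {}

final class Reg extends Value {
    final String name;
    Reg(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

final class IntLit extends Value {
    final int n;
    IntLit(int n) { this.n = n; }
    @Override public String toString() { return Integer.toString(n); }
}

final class Label extends Value {
    final String name;
    Label(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

/** pack type, inner as asType */
final class Pack extends Value {
    final String type;
    final Value inner;
    final String asType;
    Pack(String type, Value inner, String asType) {
        this.type = type;
        this.inner = inner;
        this.asType = asType;
    }
    @Override public String toString() { return "pack " + type + ", " + inner + " as " + asType; }
}

/** Universal concretion inner[type] */
final class TypeApp extends Value {
    final Value inner;
    final String type;
    TypeApp(Value inner, String type) {
        this.inner = inner;
        this.type = type;
    }
    @Override public String toString() { return inner + "[" + type + "]"; }
}

final class RegisterFile {
    private final Map<String, Value> regs = new HashMap<>();

    void set(String register, Value v) { regs.put(register, v); }

    /** The function written R-hat in the text: replaces registers by their contents. */
    Value resolve(Value v) {
        if (v instanceof Reg) {
            return regs.get(((Reg) v).name);
        }
        if (v instanceof Pack) {
            Pack p = (Pack) v;
            return new Pack(p.type, resolve(p.inner), p.asType);
        }
        if (v instanceof TypeApp) {
            TypeApp a = (TypeApp) v;
            return new TypeApp(resolve(a.inner), a.type);
        }
        return v; // integers, lock values and labels are already values
    }
}

final class ResolveDemo {
    public static void main(String[] args) {
        RegisterFile rf = new RegisterFile();
        rf.set("r1", new Label("l"));
        System.out.println(rf.resolve(new TypeApp(new Reg("r1"), "int")));
    }
}
```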

1.4 Type Discipline

The syntax of types is shown in Figure 1.7. A type of the form ⟨σ⃗⟩^α
describes a tuple that resides in the heap and is protected by lock α.
Each type in σ⃗ is either initialised (τ) or uninitialised (?τ). A type of
the form ∀[α⃗].(Γ requires Λ) describes a code block; a thread jumping into
such a block must instantiate all the universal variables α⃗, and it must
also hold a register file of type Γ as well as the locks in Λ. The types
lock(α), lockE(α), and lockS(α) are singleton types: respectively the lock
type, the exclusive lock type, and the shared lock type. The types ∃α.τ and
∃^l α.τ use the existential operator of [2]. The recursive type µα.τ allows
the definition of recursive data structures.
The type system is presented in Figures 1.8 to 1.11. Typing for values is
illustrated in Figure 1.8. Heap values are distinguished from operands (which
include registers as well) by the form of the sequent. Notice that lock values
(-1, 0, and 1) have any lock type. Also, the uninitialised value ?τ has type
?τ; we use the same syntax for an uninitialised value (at the left of the
colon) and its type (at the right of the colon). A formula σ <: σ′ allows us
to “forget” initialisations.
Instructions are checked against a typing environment Ψ (mapping labels
to types, and type variables to the kind Lock: the kind of singleton lock
types), a register file type Γ holding the current types of the registers, and
a tuple Λ that comprises three sets of lock variables (the permission of the
code block), which are, respectively, exclusive, shared, and linear.
Rule T-yield requires that all shared and all exclusive locks have been
released before the thread ends. Only the thread that acquired a lock may
release it.
Rule T-fork splits the permission into two tuples, Λ and Λ′: one goes
with the forked thread, the other remains with the current thread, according
to the permissions required by the target code block.
Rules T-new-lock 1, T-new-lock -1, and T-new-lockL each add
the type variable to the respective set of locks. Rules T-new-lock 0,
T-new-lock 1, and T-new-lock -1 assign a lock type to the register. Rules
T-tslE and T-tslS require that the value under test is a tuple holding a
lock, and disallow testing a lock already held by the thread. Rules
T-unlockE and T-unlockS make sure that only held locks are unlocked.
Finally, the rules T-criticalE and T-criticalS ensure that the current
thread holds the exact number of locks required by the target code block.
Each of these rules also adds the lock under test to the respective set of
locks of the thread. A thread is guaranteed to hold the lock only after
(conditionally) jumping to a critical region. A previous test and set lock
instruction may have obtained the lock but, as far as the type system goes,
the thread holds the lock only after the conditional jump.

The typing rules for memory and control flow are depicted in Figure 1.10.
The rule for malloc makes sure that the lock α is in scope, meaning that it
must be introduced by a preceding newLock in the same code block, or that the
type variable must be abstracted by the universal operator of the code
block's signature. Values can be loaded from tuples if the guarding type
variable is in one of the sets of locks. Values can be stored in tuples if
the guarding type variable is in the set of exclusive locks or in the set of
linear locks.
The rules for typing machine states are illustrated in Figure 1.11. They
are straightforward; the only remark concerns heap tuples, where we make sure
that all locks protecting the tuples are in the domain of the typing
environment.
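The permission checks described in this section (T-yield, T-fork, and the load and store conditions) can be pictured with a small Java sketch. It is an illustration only, with hypothetical names (Permission, mayYield, fork, mayLoad, mayStore); lock variables are represented by strings, and the real checks are of course performed statically by the type system, not at run time.

```java
import java.util.HashSet;
import java.util.Set;

/** A permission (exclusive, shared, linear): three sets of lock variables. */
final class Permission {
    final Set<String> exclusive = new HashSet<>();
    final Set<String> shared = new HashSet<>();
    final Set<String> linear = new HashSet<>();

    /** T-yield: no exclusive and no shared lock may still be held. */
    boolean mayYield() {
        return exclusive.isEmpty() && shared.isEmpty();
    }

    /**
     * T-fork: the locks required by the target go with the forked thread and
     * the rest stay here. Returns false (and changes nothing) if a required
     * lock is not held.
     */
    boolean fork(Permission required) {
        if (!exclusive.containsAll(required.exclusive)
                || !shared.containsAll(required.shared)
                || !linear.containsAll(required.linear)) {
            return false;
        }
        exclusive.removeAll(required.exclusive);
        shared.removeAll(required.shared);
        linear.removeAll(required.linear);
        return true;
    }

    /** Loading needs the guarding lock in any of the three sets. */
    boolean mayLoad(String lock) {
        return exclusive.contains(lock) || shared.contains(lock) || linear.contains(lock);
    }

    /** Storing needs the guarding lock in the exclusive or in the linear set. */
    boolean mayStore(String lock) {
        return exclusive.contains(lock) || linear.contains(lock);
    }
}

final class PermissionDemo {
    public static void main(String[] args) {
        Permission p = new Permission();
        p.exclusive.add("a");
        Permission required = new Permission();
        required.exclusive.add("a");
        System.out.println("may yield before fork: " + p.mayYield());
        System.out.println("fork accepted: " + p.fork(required));
        System.out.println("may yield after fork: " + p.mayYield());
    }
}
```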

1.5 Examples
To exemplify how MIL works, we show inter-processor communication. We
create a tuple in shared memory and then start two threads that try to
write to it concurrently. The lock α is passed explicitly to each code block,
because it is not in the scope of the forked threads.
main () {
  α, r1 := newLock 0
  r2 := malloc [int] guarded by α
  fork thread1[α]
  fork thread2[α]
  yield -- an instruction sequence ends with a jump or a yield
}
Each thread tries to acquire lock α with a different strategy. The first
thread (thread1) uses a technique called spin lock:
thread1 ∀[α] (r1 : ⟨lock(α)⟩^α, r2 : ⟨?int⟩^α) {
  r3 := tslE r1 -- exclusive because we want to write
  if r3 = 0 jump criticalRegion1[α]
  jump thread1[α]
}
The code block loops actively, not releasing the processor, until it
eventually grabs the lock.
criticalRegion1 ∀[α] (r1 : ⟨lock(α)⟩^α, r2 : ⟨?int⟩^α)
    requires (α; ; ) {
  r2[0] := 1
  unlockE r1
  yield
}
The second thread (thread2) uses a different technique called a sleep lock:
thread2 ∀[α] (r1 : ⟨lock(α)⟩^α, r2 : ⟨?int⟩^α) {
  r3 := tslE r1 -- exclusive because we want to write
  if r3 = 0 jump criticalRegion2[α]
  fork thread2[α]
  yield
}
This strategy features a cooperative approach. Instead of actively trying
to grab the lock, it forks a copy of that thread and yields the processor to
another thread in the pool.
criticalRegion2 ∀[α] (r1 : ⟨lock(α)⟩^α, r2 : ⟨?int⟩^α)
    requires (α; ; ) {
  r2[0] := 2
  unlockE r1
  yield
}
Each of these two techniques has advantages over the other. A spin lock is
faster, and should be used when there is a reasonable expectation that the
lock will become available within a short period of time. A shortcoming of
the spin lock is demonstrated in this example:
main () {
  α, r1 := newLock -1
  fork release[α]
  jump spinLock[α]
}
The code block main creates a lock in the exclusive state and forks a thread
that unlocks it; the forked thread takes ownership of the lock when forked:
release ∀[α] (r1 : ⟨lock(α)⟩^α) requires (α; ; ) {
  unlockE r1
  yield
}
The code block spinLock uses the spin lock technique to acquire the lock's
permission, as exemplified in the first example. The problem with this
program is that it only works on machines with more than one processor.
Otherwise, because the spin-locking thread does not relinquish the processor,
the forked thread that would unlock α never gets the chance to do so. The
sleep lock technique, however, relies on context switching, which is an
expensive operation (i.e., it degrades performance).
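The single-processor scenario can be reproduced by analogy in Java. In the sketch below, which is our own illustration and not part of MIL, a single-threaded executor plays the role of a machine with one processor, re-submitting a task plays the role of fork followed by yield, and the names SleepLockDemo and run are hypothetical. The lock starts in the locked state, as with newLock -1.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class SleepLockDemo {
    /** Runs the scenario on one worker thread; returns the value written, or -1 on timeout. */
    static int run() {
        final ExecutorService pool = Executors.newSingleThreadExecutor();
        final AtomicBoolean locked = new AtomicBoolean(true); // created in the locked state
        final AtomicInteger cell = new AtomicInteger(0);
        final CountDownLatch done = new CountDownLatch(1);

        // Sleep lock: test once; on failure re-submit this task and return,
        // which frees the only worker for other tasks.
        final Runnable[] sleeper = new Runnable[1];
        sleeper[0] = () -> {
            if (locked.compareAndSet(false, true)) {
                cell.set(2);          // critical region
                locked.set(false);    // unlock
                done.countDown();
            } else {
                pool.execute(sleeper[0]);
            }
        };
        pool.execute(sleeper[0]);
        pool.execute(() -> locked.set(false)); // the thread that releases the lock

        // A spin lock in place of the sleeper would loop on compareAndSet inside
        // a single task and the releasing task would never be scheduled.
        boolean finished;
        try {
            finished = done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        pool.shutdownNow();
        return finished ? cell.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each failed attempt costs a trip through the executor queue, which is the Java counterpart of the context switch mentioned above.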
Libraries written in MIL use continuation-passing style. In this model of
programming the user passes a continuation (a label pointing to a code block)
to the library's procedure. When the computation is finished, the procedure
runs the continuation label (either by forking or by jumping).
In continuation-passing style, it is useful to pass user data to the
continuation code. With existential types, it is possible to abstract the
type of the user data. A data structure (a tuple) is created to keep the
continuation label and the user data. Let ContinuationType stand for
∀[α].((r1 : ⟨?int⟩^α) requires (; ; α))
Let PackedUserData stand for
∃X.⟨∀[α].((r1 : X) requires (; ; α)), X⟩^α
A sketch of this usage is:
main () {
  α := newLockLinear
  r2 := malloc [int] guarded by α
  r1 := malloc [ContinuationType, ⟨?int⟩^α] guarded by α
  r1[0] := continuation
  r1[1] := r2
  r1 := pack r1, ⟨?int⟩^α as PackedUserData
  jump library[α]
}

library [α] (r1 : PackedUserData) {
  -- do some computation...
  x, r1 := unpack r1 -- we do not need the packed type here
  r2 := r1[0] -- the continuation
  r1 := r1[1] -- the user data
  jump r2[α]
}

continuation ContinuationType {
  -- do some work
}

The code block main allocates the user data h?intiα and places it into a
tuple, along with the label pointing to the continuation. The tuple is then
packed and passed to the library, which eventually calls the continuation by
unpacking the packed data and jumping to the callback.
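Readers more familiar with Java may find the following analogy helpful. It is a sketch under our own assumptions, with hypothetical names (Packed, Library, compute, resume): a generic class pairs the continuation with its user data, and the wildcard type Packed<?> hides the type of the user data from the library, much as ∃X does in PackedUserData.

```java
import java.util.function.Consumer;

/** Pairs a continuation with its user data; both share the same type X. */
final class Packed<X> {
    private final Consumer<X> continuation;
    private final X userData;

    Packed(Consumer<X> continuation, X userData) {
        this.continuation = continuation;
        this.userData = userData;
    }

    /** "Unpack and jump": well typed because both fields agree on X. */
    void resume() {
        continuation.accept(userData);
    }
}

final class Library {
    /** The library never learns the type of the user data. */
    static void compute(Packed<?> k) {
        // ... do some computation ...
        k.resume();
    }
}

final class ContinuationDemo {
    public static void main(String[] args) {
        int[] cell = new int[1]; // the user data: a one-element tuple
        Library.compute(new Packed<int[]>(c -> c[0] = 42, cell));
        System.out.println(cell[0]);
    }
}
```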

1.6 The π-Calculus Runtime


1.6.1 Architecture
The π-calculus runtime is a MIL library based on the virtual machine
defined in [7]. The runtime implements the communication between
processes. The library defines two procedures: writeMessage and
readMessage. A channel is defined by a data structure that keeps either
callbacks waiting to read a message or messages waiting to be consumed.
Figure 1.12 shows the abstract representation of the runtime. This runtime
supports only the monadic π-calculus, but since the polyadic π-calculus can
be encoded in the monadic π-calculus, no expressiveness is lost.
The writeMessage and readMessage procedures work in nearly the same way.
When a message is sent with writeMessage, we check whether there are input
continuations; if so, the message is delivered to one of them, otherwise the
message is enqueued. When readMessage is called, we supply it with a callback
(to handle the transmitted message) and check whether there are enqueued
messages. If there are, the callback is invoked with an enqueued message as
its parameter; otherwise the callback is enqueued until a message is written.
Using Java to describe the algorithm, this is a sketch of the implementation:
void readMessage(InputContinuation cont) {
    if (messages.size() > 0) {
        Message msg = messages.remove(0);
        cont.reduce(msg);
    } else {
        continuations.add(cont);
    }
}
The method writeMessage:
void writeMessage(Message msg) {
    if (continuations.size() > 0) {
        InputContinuation cont = continuations.remove(0);
        cont.reduce(msg);
    } else {
        messages.add(msg);
    }
}
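For experimentation, the two methods can be completed into a small self-contained class. This is a sketch under our own assumptions and not the code of the runtime: the names Channel and ChannelDemo are hypothetical, java.util.function.Consumer stands in for InputContinuation, and the methods are synchronized to play the role of the channel lock.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** A channel holds either pending messages or pending readers, never both. */
final class Channel<M> {
    private final Queue<M> messages = new ArrayDeque<>();
    private final Queue<Consumer<M>> continuations = new ArrayDeque<>();

    synchronized void readMessage(Consumer<M> cont) {
        if (!messages.isEmpty()) {
            cont.accept(messages.remove());   // a message was waiting: deliver it
        } else {
            continuations.add(cont);          // otherwise wait for a writer
        }
    }

    synchronized void writeMessage(M msg) {
        if (!continuations.isEmpty()) {
            continuations.remove().accept(msg); // a reader was waiting: deliver
        } else {
            messages.add(msg);                  // otherwise wait for a reader
        }
    }
}

final class ChannelDemo {
    public static void main(String[] args) {
        Channel<Integer> x = new Channel<>();
        List<Integer> seen = new ArrayList<>();
        x.writeMessage(10);        // no reader yet: the message is enqueued
        x.readMessage(seen::add);  // receives the enqueued message
        x.readMessage(seen::add);  // no message yet: the reader is enqueued
        x.writeMessage(20);        // delivered to the waiting reader
        System.out.println(seen);
    }
}
```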

1.6.2 Implementation
A channel needs to keep two attributes: a list of messages and a list of input
continuations. MIL does not allow the use of computed values as indexes in
load operations (only literals), and it does not permit allocating memory of
dynamic size; hence we cannot use a tuple of dynamic length to implement
a list of elements. Our implementation uses locks to impose the queueing
of messages and of continuations. A Channel is the type:
⟨int, Message, InputContinuation⟩^α
The first element of the tuple specifies the contents of the channel:
• 0 if it has no messages and no input continuations;
• 1 if it has at least one queued message;
• 2 if it has at least one queued input continuation;
The second element of the tuple is a queued message and the third element
is a queued input continuation.
MIL has no notion of classes, so we must adapt the code to
the language. A Continuation class is simply a callback with user data
attached to it (the implementation of the class). Using the
continuation-passing style explained in Section 1.5, the input callback is
the data structure InputContinuation:

∃X.⟨∀[α, Message].((r1 : X,
                    r2 : Channel,
                    r3 : ⟨lock(α)⟩^α) requires (α; ; )),
    X⟩^α

Because messages can be of any type, we abstract them with the universal
operator in both procedures. Let the signature of readMessage stand for
∀[α, Message].((r1 : InputContinuation,
                r2 : Channel,
                r3 : ⟨lock(α)⟩^α) requires (α; ; ))
Let the signature of writeMessage stand for
∀[α, Message].((r1 : Message,
                r2 : Channel,
                r3 : ⟨lock(α)⟩^α) requires (α; ; ))
The initialisation of a channel is the responsibility of the client. Since no
encapsulation is possible in an assembly language, keeping Channel consistent
is also the responsibility of the user of the library.

1.6.3 Usage Examples


To use either of the two procedures we first need to initialise the Channel
data structure. To do so we need to create a dummy message and a dummy input
continuation. Our implementation provides a dummy continuation (sink);
the dummy message must be created by the user. This example shows the
initialisation of a channel of type (int). Let the type IntContinuation
stand for
∃X.⟨∀[α](r1 : X,
         r2 : int,
         r3 : ⟨lock(α)⟩^α) requires (α; ; ),
    X⟩^α
First we initialise a dummy InputContinuation:
r2 := malloc [∀[α] (r1 : int,          -- the type of the user data
                    r2 : int,          -- the type of the message
                    r3 : ⟨lock(α)⟩^α   -- the lock of the channel
                   ) requires (α; ; ), -- we hold the lock
              int] guarded by α        -- the user data (an int)
r2[0] := sink[int][int]
r2[1] := 0
The label sink abstracts both the type of the message and the type of the
environment, so it can be used to fill a dummy InputContinuation.
The code at sink just unlocks the lock α and yields the processor. The user
data we use is an integer, for convenience. Now we need to
hide the user data with the existential operator:
r2 := pack r2, int as IntContinuation
Now, with the InputContinuation in register r2, we are able to create the
channel:
r1 := malloc [
  int,             -- the status of the channel
  int,             -- the type of the message
  IntContinuation  -- the packed continuation
] guarded by α
r1[0] := 0   -- '0' marks an empty channel
r1[1] := 0   -- the dummy message
r1[2] := r2  -- the dummy input continuation
To send a message with the literal 10, emulating the process x̄⟨10⟩, we
use the channel initialised in r1.
r2 := r1 -- move the channel to the second parameter
r1 := 10 -- move the message to the first parameter
jump writeMessage
If we wish to read a message through channel x, emulating the process
x(a).P, we use the channel initialised in r1. Let PType stand for the type:

∀[α](r1 : ⟨⟩^α,
     r2 : int,
     r3 : ⟨lock(α)⟩^α) requires (α; ; )

Process P can be sketched as follows:

P PType {
  -- process P
}
To receive a message in the code block P we do:
r2 := r1 -- move channel 'x' to the second parameter
r1 := malloc [PType, ⟨⟩^α] guarded by α
r4 := malloc [] guarded by α
r1[0] := P -- the continuation
r1[1] := r4 -- empty user data
r1 := pack r1, ⟨⟩^α as IntContinuation
jump readMessage

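In the rules below, Λ abbreviates the permission (λ_E, λ_S, λ_L).

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\alpha, r := \mathsf{newLock}\ 0;\ I) \rangle \qquad l \notin \mathrm{dom}(H) \qquad \beta \notin \Lambda}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle 0 \rangle^{\beta}\};\ T;\ P\{i : \langle R\{r : l\};\ \Lambda;\ I[\beta/\alpha] \rangle\} \rangle}
\quad \text{(R-new-lock 0)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\alpha, r := \mathsf{newLock}\ 1;\ I) \rangle \qquad l \notin \mathrm{dom}(H) \qquad \beta \notin \Lambda}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle 1 \rangle^{\beta}\};\ T;\ P\{i : \langle R\{r : l\};\ (\lambda_E, \lambda_S \uplus \{\beta\}, \lambda_L);\ I[\beta/\alpha] \rangle\} \rangle}
\quad \text{(R-new-lock 1)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\alpha, r := \mathsf{newLock}\ {-1};\ I) \rangle \qquad l \notin \mathrm{dom}(H) \qquad \beta \notin \Lambda}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle {-1} \rangle^{\beta}\};\ T;\ P\{i : \langle R\{r : l\};\ (\lambda_E \uplus \{\beta\}, \lambda_S, \lambda_L);\ I[\beta/\alpha] \rangle\} \rangle}
\quad \text{(R-new-lock -1)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\alpha := \mathsf{newLockLinear};\ I) \rangle \qquad \beta \notin \Lambda}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R;\ (\lambda_E, \lambda_S, \lambda_L \uplus \{\beta\});\ I[\beta/\alpha] \rangle\} \rangle}
\quad \text{(R-new-lockL)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := \mathsf{tslS}\ v;\ I) \rangle \qquad \hat{R}(v) = l \qquad H(l) = \langle b \rangle^{\alpha} \qquad b \geq 0}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle b + 1 \rangle^{\alpha}\};\ T;\ P\{i : \langle R\{r : 0\};\ (\lambda_E, \lambda_S \uplus \{\alpha\}, \lambda_L);\ I \rangle\} \rangle}
\quad \text{(R-tslS-acq)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := \mathsf{tslS}\ v;\ I) \rangle \qquad H(\hat{R}(v)) = \langle {-1} \rangle^{\alpha}}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : -1\};\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-tslS-fail)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := \mathsf{tslE}\ v;\ I) \rangle \qquad \hat{R}(v) = l \qquad H(l) = \langle 0 \rangle^{\alpha}}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle {-1} \rangle^{\alpha}\};\ T;\ P\{i : \langle R\{r : 0\};\ (\lambda_E \uplus \{\alpha\}, \lambda_S, \lambda_L);\ I \rangle\} \rangle}
\quad \text{(R-tslE-acq)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := \mathsf{tslE}\ v;\ I) \rangle \qquad H(\hat{R}(v)) = \langle b \rangle^{\alpha} \qquad b \neq 0}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : b\};\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-tslE-fail)}
\]

\[
\frac{P(i) = \langle R;\ (\lambda_E, \lambda_S \uplus \{\alpha\}, \lambda_L);\ (\mathsf{unlockS}\ v;\ I) \rangle \qquad \hat{R}(v) = l \qquad H(l) = \langle b \rangle^{\alpha}}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle b - 1 \rangle^{\alpha}\};\ T;\ P\{i : \langle R;\ (\lambda_E, \lambda_S, \lambda_L);\ I \rangle\} \rangle}
\quad \text{(R-unlockS)}
\]

\[
\frac{P(i) = \langle R;\ (\lambda_E \uplus \{\alpha\}, \lambda_S, \lambda_L);\ (\mathsf{unlockE}\ v;\ I) \rangle \qquad \hat{R}(v) = l \qquad H(l) = \langle \_ \rangle^{\alpha}}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle 0 \rangle^{\alpha}\};\ T;\ P\{i : \langle R;\ (\lambda_E, \lambda_S, \lambda_L);\ I \rangle\} \rangle}
\quad \text{(R-unlockE)}
\]

Figure 1.4: Operational semantics (locks)
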
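\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := \mathsf{malloc}\ [\vec{\tau}]\ \mathsf{guarded\ by}\ \alpha;\ I) \rangle \qquad l \notin \mathrm{dom}(H)}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle \vec{?\tau} \rangle^{\alpha}\};\ T;\ P\{i : \langle R\{r : l\};\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-malloc)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := v[n];\ I) \rangle \qquad H(\hat{R}(v)) = \langle v_1 .. v_n .. v_{n+m} \rangle^{\alpha}}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : v_n\};\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-load)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r[n] := v;\ I) \rangle \qquad R(r) = l \qquad H(l) = \langle v_1 .. v_n .. v_{n+m} \rangle^{\alpha}}
     {\langle H; T; P \rangle \rightarrow \langle H\{l : \langle v_1 .. \hat{R}(v) .. v_{n+m} \rangle^{\alpha}\};\ T;\ P\{i : \langle R;\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-store)}
\]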

Figure 1.5: Operational semantics (memory)

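\[
\frac{P(i) = \langle R;\ \Lambda;\ \mathsf{jump}\ v \rangle \qquad H(\hat{R}(v)) = \_\{I\}}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R;\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-jump)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := v;\ I) \rangle}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : \hat{R}(v)\};\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-move)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (r := r' + v;\ I) \rangle}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : R(r') + \hat{R}(v)\};\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-arith)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\mathsf{if}\ r = v\ \mathsf{jump}\ v';\ \_) \rangle \qquad R(r) = v \qquad H(\hat{R}(v')) = \_\{I\}}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R;\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-branchT)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\mathsf{if}\ r = v\ \mathsf{jump}\ \_;\ I) \rangle \qquad R(r) \neq v}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R;\ \Lambda;\ I \rangle\} \rangle}
\quad \text{(R-branchF)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\alpha, r := \mathsf{unpack}\ v;\ I) \rangle \qquad \hat{R}(v) = \mathsf{pack}\ \tau, v'\ \mathsf{as}\ \_}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : v'\};\ \Lambda;\ I[\tau/\alpha] \rangle\} \rangle}
\quad \text{(R-unpack)}
\]

\[
\frac{P(i) = \langle R;\ \Lambda;\ (\alpha, r := \mathsf{unpack}\ v;\ I) \rangle \qquad \hat{R}(v) = \mathsf{packL}\ \beta, v'\ \mathsf{as}\ \_}
     {\langle H; T; P \rangle \rightarrow \langle H;\ T;\ P\{i : \langle R\{r : v'\};\ \Lambda;\ I[\beta/\alpha] \rangle\} \rangle}
\quad \text{(R-unpackL)}
\]

Figure 1.6: Operational semantics (control flow)
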
types                 τ ::= int | ⟨σ⃗⟩^α | ∀[α⃗].(Γ requires Λ) | lock(α) |
                            lockE(α) | lockS(α) | ∃α.τ | ∃^l α.τ | µα.τ | α
init types            σ ::= τ | ?τ
register file types   Γ ::= r1 : τ1, ..., rn : τn
typing environment    Ψ ::= ∅ | Ψ, l : τ | Ψ, α :: Lock

Figure 1.7: Types

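\[
\vdash \langle\sigma_1, \ldots, \tau_n, \ldots, \sigma_{n+m}\rangle^{\alpha} <: \langle\sigma_1, \ldots, {?\tau_n}, \ldots, \sigma_{n+m}\rangle^{\alpha}
\quad \text{(S-uninit)}
\]
\[
\frac{n \leq m}
     {\vdash r_0 : \tau_0, \ldots, r_m : \tau_m <: r_0 : \tau_0, \ldots, r_n : \tau_n}
\quad \text{(S-reg-file)}
\]
\[
\vdash \sigma <: \sigma
\qquad
\frac{\vdash \sigma <: \sigma' \qquad \vdash \sigma' <: \sigma''}
     {\vdash \sigma <: \sigma''}
\quad \text{(S-ref, S-trans)}
\]
\[
\frac{\vdash \tau' <: \tau}
     {\Psi, l : \tau' \vdash l : \tau}
\qquad
\Psi \vdash n : \mathsf{int}
\qquad
\Psi \vdash b : \mathsf{lock}(\alpha)
\qquad
\Psi \vdash {?\tau} : {?\tau}
\quad \text{(T-label, T-int, T-lock, T-uninit)}
\]
\[
\frac{\Psi \vdash v : \tau[\tau'/\alpha] \qquad \alpha \notin \tau', \Psi}
     {\Psi \vdash \mathsf{pack}\ \tau', v\ \mathsf{as}\ \exists\alpha.\tau : \exists\alpha.\tau}
\qquad
\frac{\Psi \vdash v : \tau[\beta/\alpha] \qquad \alpha \notin \beta, \Psi}
     {\Psi \vdash \mathsf{packL}\ \beta, v\ \mathsf{as}\ \exists^{l}\alpha.\tau : \exists^{l}\alpha.\tau}
\quad \text{(T-pack, T-packL)}
\]
\[
\Psi; \Gamma \vdash r : \Gamma(r)
\qquad
\frac{\Psi \vdash v : \tau}
     {\Psi; \Gamma \vdash v : \tau}
\quad \text{(T-reg, T-val)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \forall[\alpha\vec{\beta}].(\Gamma'\ \mathsf{requires}\ \Lambda)}
     {\Psi; \Gamma \vdash v[\tau] : \forall[\vec{\beta}].(\Gamma'[\tau/\alpha]\ \mathsf{requires}\ \Lambda[\tau/\alpha])}
\quad \text{(T-val-app)}
\]

Figure 1.8: Typing rules for values Ψ ⊢ v : σ and for operands Ψ; Γ ⊢ v : σ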

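\[
\Psi; \Gamma; (\emptyset, \emptyset, \lambda_L) \vdash \mathsf{yield}
\quad \text{(T-yield)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \forall[].(\Gamma'\ \mathsf{requires}\ \Lambda) \qquad \Psi; \Gamma; \Lambda' \vdash I \qquad \vdash \Gamma <: \Gamma'}
     {\Psi; \Gamma; \Lambda \uplus \Lambda' \vdash \mathsf{fork}\ v; I}
\quad \text{(T-fork)}
\]
\[
\frac{\Psi, \alpha :: \mathsf{Lock}; \Gamma\{r : \langle\mathsf{lock}(\alpha)\rangle^{\alpha}\}; \Lambda \vdash I \qquad \alpha \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \alpha, r := \mathsf{newLock}\ 0; I}
\quad \text{(T-new-lock 0)}
\]
\[
\frac{\Psi, \alpha :: \mathsf{Lock}; \Gamma\{r : \langle\mathsf{lock}(\alpha)\rangle^{\alpha}\}; (\lambda_E, \lambda_S \uplus \{\alpha\}, \lambda_L) \vdash I \qquad \alpha \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \alpha, r := \mathsf{newLock}\ 1; I}
\quad \text{(T-new-lock 1)}
\]
\[
\frac{\Psi, \alpha :: \mathsf{Lock}; \Gamma\{r : \langle\mathsf{lock}(\alpha)\rangle^{\alpha}\}; (\lambda_E \uplus \{\alpha\}, \lambda_S, \lambda_L) \vdash I \qquad \alpha \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \alpha, r := \mathsf{newLock}\ {-1}; I}
\quad \text{(T-new-lock -1)}
\]
\[
\frac{\Psi, \alpha :: \mathsf{Lock}; \Gamma; (\lambda_E, \lambda_S, \lambda_L \uplus \{\alpha\}) \vdash I \qquad \alpha \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \alpha := \mathsf{newLockLinear}; I}
\quad \text{(T-new-lockL)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \langle\mathsf{lock}(\alpha)\rangle^{\alpha} \qquad \Psi; \Gamma\{r : \mathsf{lockS}(\alpha)\}; \Lambda \vdash I \qquad \alpha \notin \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r := \mathsf{tslS}\ v; I}
\quad \text{(T-tslS)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \langle\mathsf{lock}(\alpha)\rangle^{\alpha} \qquad \Psi; \Gamma\{r : \mathsf{lockE}(\alpha)\}; \Lambda \vdash I \qquad \alpha \notin \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r := \mathsf{tslE}\ v; I}
\quad \text{(T-tslE)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \langle\mathsf{lock}(\alpha)\rangle^{\alpha} \qquad \alpha \in \lambda_S \qquad \Psi; \Gamma; (\lambda_S \setminus \{\alpha\}, \lambda_E, \lambda_L) \vdash I}
     {\Psi; \Gamma; (\lambda_S, \lambda_E, \lambda_L) \vdash \mathsf{unlockS}\ v; I}
\quad \text{(T-unlockS)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \langle\mathsf{lock}(\alpha)\rangle^{\alpha} \qquad \alpha \in \lambda_E \qquad \Psi; \Gamma; (\lambda_S, \lambda_E \setminus \{\alpha\}, \lambda_L) \vdash I}
     {\Psi; \Gamma; (\lambda_S, \lambda_E, \lambda_L) \vdash \mathsf{unlockE}\ v; I}
\quad \text{(T-unlockE)}
\]
\[
\frac{\begin{array}{c}
\Psi; \Gamma \vdash r : \mathsf{lockS}(\alpha) \qquad \Psi; \Gamma \vdash v : \forall[].(\Gamma'\ \mathsf{requires}\ (\lambda_E, \lambda_S \uplus \{\alpha\}, \lambda'_L)) \\
\Psi; \Gamma; \Lambda \vdash I \qquad \vdash \Gamma <: \Gamma' \qquad \lambda'_L \subseteq \lambda_L
\end{array}}
     {\Psi; \Gamma; (\lambda_E, \lambda_S, \lambda_L) \vdash \mathsf{if}\ r = 0\ \mathsf{jump}\ v; I}
\quad \text{(T-criticalS)}
\]
\[
\frac{\begin{array}{c}
\Psi; \Gamma \vdash r : \mathsf{lockE}(\alpha) \qquad \Psi; \Gamma \vdash v : \forall[].(\Gamma'\ \mathsf{requires}\ (\lambda_E \uplus \{\alpha\}, \lambda_S, \lambda'_L)) \\
\Psi; \Gamma; \Lambda \vdash I \qquad \vdash \Gamma <: \Gamma' \qquad \lambda'_L \subseteq \lambda_L
\end{array}}
     {\Psi; \Gamma; \Lambda \vdash \mathsf{if}\ r = 0\ \mathsf{jump}\ v; I}
\quad \text{(T-criticalE)}
\]

Figure 1.9: Typing rules for instructions (thread pool and locks) Ψ; Γ; Λ ⊢ I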

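\[
\frac{\Psi, \alpha :: \mathsf{Lock}; \Gamma\{r : \langle ?\vec{\tau}\rangle^{\alpha}\}; \Lambda \vdash I \qquad \vec{\tau} \neq \mathsf{lock}(\_), \mathsf{lockS}(\_), \mathsf{lockE}(\_)}
     {\Psi, \alpha :: \mathsf{Lock}; \Gamma; \Lambda \vdash r := \mathsf{malloc}\ [\vec{\tau}]\ \mathsf{guarded\ by}\ \alpha; I}
\quad \text{(T-malloc)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \langle\sigma_1 .. \tau_n .. \sigma_{n+m}\rangle^{\alpha} \qquad \Psi; \Gamma\{r : \tau_n\}; \Lambda \vdash I \qquad \tau_n \neq \mathsf{lock}(\_) \qquad \alpha \in \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r := v[n]; I}
\quad \text{(T-load)}
\]
\[
\frac{\begin{array}{c}
\Psi; \Gamma \vdash v : \tau_n \qquad \Psi; \Gamma \vdash r : \langle\sigma_1 .. \sigma_n .. \sigma_{n+m}\rangle^{\alpha} \qquad \tau_n \neq \mathsf{lock}(\_) \\
\Psi; \Gamma\{r : \langle\sigma_1 .. \mathrm{type}(\sigma_n) .. \sigma_{n+m}\rangle^{\alpha}\}; \Lambda \vdash I \qquad \alpha \in \lambda_E \cup \lambda_L
\end{array}}
     {\Psi; \Gamma; \Lambda \vdash r[n] := v; I}
\quad \text{(T-store)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \tau \qquad \Psi; \Gamma\{r : \tau\}; \Lambda \vdash I}
     {\Psi; \Gamma; \Lambda \vdash r := v; I}
\quad \text{(T-move)}
\]
\[
\frac{\Psi; \Gamma \vdash r' : \mathsf{int} \qquad \Psi; \Gamma \vdash v : \mathsf{int} \qquad \Psi; \Gamma\{r : \mathsf{int}\}; \Lambda \vdash I}
     {\Psi; \Gamma; \Lambda \vdash r := r' + v; I}
\quad \text{(T-arith)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \exists\alpha.\tau \qquad \Psi; \Gamma\{r : \tau\}; \Lambda \vdash I \qquad \alpha \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \alpha, r := \mathsf{unpack}\ v; I}
\quad \text{(T-unpack)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \exists^{l}\alpha.\tau \qquad \Psi, \beta :: \mathsf{Lock}; \Gamma\{r : \tau\}; \Lambda \vdash I \qquad \alpha \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \alpha, r := \mathsf{unpack}\ v; I}
\quad \text{(T-unpackL)}
\]
\[
\frac{\begin{array}{c}
\Psi; \Gamma \vdash r : \mathsf{int} \qquad \Psi; \Gamma \vdash v : \forall[].(\Gamma'\ \mathsf{requires}\ (\lambda_E, \lambda_S, \lambda'_L)) \\
\Psi; \Gamma; \Lambda \vdash I \qquad \vdash \Gamma <: \Gamma' \qquad \lambda'_L \subseteq \lambda_L
\end{array}}
     {\Psi; \Gamma; \Lambda \vdash \mathsf{if}\ r = 0\ \mathsf{jump}\ v; I}
\quad \text{(T-branch)}
\]
\[
\frac{\Psi; \Gamma \vdash v : \forall[].(\Gamma'\ \mathsf{requires}\ (\lambda_E, \lambda_S, \lambda'_L)) \qquad \vdash \Gamma <: \Gamma' \qquad \lambda'_L \subseteq \lambda_L}
     {\Psi; \Gamma; \Lambda \vdash \mathsf{jump}\ v}
\quad \text{(T-jump)}
\]

where \(\mathrm{type}(\tau) = \mathrm{type}(?\tau) = \tau\).

Figure 1.10: Typing rules for instructions (memory and control flow) Ψ; Γ; Λ ⊢ I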

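\[
\frac{\forall i.\ \Psi \vdash R(r_i) : \Gamma(r_i)}
     {\Psi \vdash R : \Gamma}
\quad \text{(reg file, } \Psi \vdash R : \Gamma \text{)}
\]
\[
\frac{\forall i.\ \Psi \vdash P(i)}
     {\Psi \vdash P}
\qquad
\frac{\Psi \vdash R : \Gamma \qquad \Psi; \Gamma; \Lambda \vdash I}
     {\Psi \vdash \langle R; \Lambda; I\rangle}
\quad \text{(processors, } \Psi \vdash P \text{)}
\]
\[
\frac{\forall i.\ \Psi \vdash l_i : \forall[\vec{\alpha}_i].(\Gamma_i\ \mathsf{requires}\ \_)\{\_\} \qquad \Psi \vdash R_i : \Gamma_i[\vec{\beta}_i/\vec{\alpha}_i]}
     {\Psi \vdash \{\langle l_1, R_1\rangle, \ldots, \langle l_n, R_n\rangle\}}
\quad \text{(thread pool, } \Psi \vdash T \text{)}
\]
\[
\frac{\Psi, \vec{\alpha} :: \mathsf{Lock}; \Gamma; \Lambda \vdash I}
     {\Psi \vdash \forall[\vec{\alpha}].(\Gamma\ \mathsf{requires}\ \Lambda)\{I\} : \forall[\vec{\alpha}].(\Gamma\ \mathsf{requires}\ \Lambda)}
\qquad
\frac{\forall i.\ \Psi, \alpha :: \mathsf{Lock} \vdash v_i : \sigma_i}
     {\Psi, \alpha :: \mathsf{Lock} \vdash \langle\vec{v}\rangle^{\alpha} : \langle\vec{\sigma}\rangle^{\alpha}}
\quad \text{(heap value, } \Psi \vdash h : \tau \text{)}
\]
\[
\frac{\forall l.\ \Psi \vdash H(l) : \Psi(l)}
     {\Psi \vdash H}
\quad \text{(heap, } \Psi \vdash H \text{)}
\]
\[
\vdash \mathsf{halt}
\qquad
\frac{\Psi \vdash H \qquad \Psi \vdash T \qquad \Psi \vdash P}
     {\vdash \langle H; T; P\rangle}
\quad \text{(state, } \vdash S \text{)}
\]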

Figure 1.11: Typing rules for machine states

[Class diagram: Channel (+messages: List<Message>, +continuations:
List<InputContinuation>, +readMessage(input:InputChannel),
+writeMessage(msg:Message)) and InputContinuation (+reduce(msg:Message)).]

Figure 1.12: Class diagram of the π-calculus runtime

Chapter 2

Extending The Visitor Pattern

2.1 Intent
The Visitor [3] encapsulates the application of an operation over an object
structure. This pattern facilitates the addition of new operations without
modifying the classes on which they operate. Our extension separates three
concerns: traversal, object structure, and operation.

2.2 Motivation
The object structure traversed by the visitor is usually defined by a class
hierarchy, using inheritance and composition. Consider Figure 2.1: one object
structure for this class hierarchy is a tree whose inner nodes are instances
of Addition and whose leaves are instances of Number.
This is an ad hoc object structure; thus it cannot be predicted or inferred
automatically. The convention is that the attributes of an object that
share the same base class are the children of that object (as in Addition),
but this holds only for a subset of problems. By using this convention, it
is possible to create tools that generate visitors automatically. Conventions,
however, have drawbacks, such as the lack of flexibility and the absence of
introspection. Object structures that differ from this convention cannot
be the target of code generation tools.
An alternative approach to specifying the object structure is the
implementation of a common interface that embodies the relation between
objects, like the Composite pattern [3]. This approach provides introspection,
generalising the retrieval of the children of a class of objects, but imposes
the implementation of an interface on every object that needs to be navigated,
which may not be possible.

[Class diagram: IntegerExpression is the base class of Number and Addition;
Addition has the attributes +left: IntegerExpression and
+right: IntegerExpression.]

Figure 2.1: An example of a class hierarchy.

Two common techniques for making the Visitor reusable are the Decorator
pattern, where the decorator implements the traversal of the tree and the
decorated object implements the code logic, and the Template Method,
where inheritance is used to separate the traversal from the code logic. Both
techniques, however, are hindered by a hard-coded object structure, making
the generic implementations brittle to change. The concepts of traversal and
code logic are mixed at the interface level. To work around this, the
separation of responsibilities is performed using the Decorator pattern or
inheritance, but the interface for both concerns is still the same (i.e., the
Visitor).
The Guided Visitor [1] proposes the separation of navigation from com-
putation code. With this kind of visitor, the object structure is obtained
in the class where the traversal algorithm is performed, in the Guide. By
decoupling these two concepts it is possible to reuse traversal strategies and
apply them to different object structures.
We propose breaking the visitor into three concerns, each represented by
an interface, solving an isolated problem: object structure, traversal, and
computation code.

2.3 Applicability
Use the extended visitor when:

• the object structure is dynamic. By having an object structure that is
not hardcoded into a class hierarchy, it is possible to make it dynamic.

• you are dealing with complex object structures. Object structures are
objects, hence they can be composed or extended, as any other class.

• your requirements change. The extended visitor is adaptable. The
object structure can be altered without affecting the class hierarchy or
the visitor itself. The same goes for the traversal strategy.

2.4 Structure

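[Class diagram: the interfaces ObjectStructure, Traversal, and Visitor, with
the implementations IntObjectStructure, DepthFirst, and Calculator; a Client;
and the node hierarchy IntegerExpression with subclasses Number and Addition.]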

Figure 2.2: The structure of the extended visitor.

[Sequence diagram: participants :Traversal, :ObjectStructure, and :Visitor;
messages traverse(node), before(node), getChildren(node),
traverse(childNode), and after(node).]

Figure 2.3: Sequence diagram of a visitor.

2.5 Participants
The Visitor has two methods that represent two events: one before traversing
the children of a node and one after traversing them. The Visitor defines an
abstract interface where the operation that is applied to the object structure
is implemented. The ConcreteVisitor (Calculator) is the actual implementation
of the Visitor interface. A Node is any element that may be traversed. It can
be of any type. The ObjectStructure is an interface for the retrieval of the
children of a node, if possible. A ConcreteObjectStructure (IntObjectStructure)
is an implementation of the previous interface. The Traversal is the interface
for the controller of the application of the operation (the visitor). A
ConcreteTraversal (DepthFirst) is an implementation of that interface.

2.6 Collaborations
A client that uses the Extended Visitor pattern must create three objects:
an object structure, a traversal strategy, and the visitor that applies the
operation. When an element is visited, two methods, corresponding to two
events, are called: one before its children are visited and one after its
children are visited. Figure 2.3 illustrates a depth-first traversal.

[Class diagram: Visitor (+before(node:Object), +after(node:Object));
ObjectStructure (+getChildren(node:Object)); Traversal
(+traverse(struct:ObjectStructure, visitor:Visitor, node:Object)).
The Traversal queries the ObjectStructure and guides the Visitor.]

Figure 2.4: Class diagram of the implementation.

2.7 Consequences
Adding new Nodes is easy. Contrary to the usual Visitor pattern, adding a
new Node to the object structure is just a matter of updating the Object-
Structure’s implementation.
Visiting a class is more verbose. In the classic Visitor, because all the
concerns are mixed within the same class, a single parameter is needed, the
Node to visit. The Extended Visitor needs the user to explicitly create a
traversal object, an object structure object, and a visitor object.
Leverages encapsulation. The classic Visitor demands a close coupling
between the Visitor and the Nodes, since the Node calls methods of the
Visitor and vice-versa. With the Extended Visitor, Nodes are not aware,
even at interface level, of the visitor. The visitor may, or may not, be aware
of the interface of the Node, since it receives instances cast to Object.
Code reuse. Because of the separation of concerns and cohesion of classes,
it is possible to reuse parts of the Extended Visitor in various situations, with
different classes of Nodes.

2.8 Implementation
We define three classes that embody the separation of concerns. The class
diagram is presented in Figure 2.4. The Traversal is the controller class
responsible for guiding the Visitor through the tree generated by the
ObjectStructure. The implementation of the ObjectStructure describes the
children of a group of objects, which may or may not share the same base
class. The implementation of the Visitor depends on the operation that needs
to be applied over a structure of objects.

2.9 Sample Code


We now show the implementation of the Extended Visitor, to implement a
calculator of integer expressions, using the class hierarchy depicted in Fig-
ure 2.1. First we start by defining the IntObjectStructure :
    Iterable getChildren(Object node) {
        if (node instanceof Addition) {
            Addition add = (Addition) node;
            return Arrays.asList(add.left, add.right);
        }
        throw new UnsupportedNodeException(node);
    }
The only nodes that have children are those of type Addition.
Next we implement a generic depth-first traversal strategy:
    public void traverse(ObjectStructure struct, Visitor visitor,
                         Object node) {
        visitor.before(node);
        try {
            for (Object child : struct.getChildren(node)) {
                traverse(struct, visitor, child);
            }
        } catch (UnsupportedNodeException e) {
            // no children, skip it
        }
        visitor.after(node);
    }
Finally the visitor Calculator , that calculates an integer expression, is
implemented with this method:
    int total = 0;

    public void before(Object node) {
        if (node instanceof Number) {
            // cast the visited node (not an undefined variable) to Number
            total += ((Number) node).value;
        }
    }
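
The three fragments above can be assembled into one small program. The following is only a sketch under a few assumptions that are not in the original text: the wrapper class ExtendedVisitorSketch and the helper evaluate are hypothetical names, the leaf class is called Num to avoid a clash with java.lang.Number, and leaves report an empty list of children instead of throwing UnsupportedNodeException.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ExtendedVisitorSketch {
    // Node hierarchy in the spirit of Figure 2.1; nodes know nothing about visitors.
    static abstract class IntegerExpression {}

    static final class Num extends IntegerExpression {
        final int value;
        Num(int value) { this.value = value; }
    }

    static final class Addition extends IntegerExpression {
        final IntegerExpression left;
        final IntegerExpression right;
        Addition(IntegerExpression left, IntegerExpression right) {
            this.left = left;
            this.right = right;
        }
    }

    // The three concerns, each behind its own interface.
    interface ObjectStructure { List<?> getChildren(Object node); }
    interface Visitor { void before(Object node); void after(Object node); }
    interface Traversal { void traverse(ObjectStructure struct, Visitor visitor, Object node); }

    static final class IntObjectStructure implements ObjectStructure {
        public List<?> getChildren(Object node) {
            if (node instanceof Addition) {
                Addition add = (Addition) node;
                return Arrays.asList(add.left, add.right);
            }
            return Collections.emptyList(); // assumption: leaves have no children
        }
    }

    static final class DepthFirst implements Traversal {
        public void traverse(ObjectStructure struct, Visitor visitor, Object node) {
            visitor.before(node);
            for (Object child : struct.getChildren(node)) {
                traverse(struct, visitor, child);
            }
            visitor.after(node);
        }
    }

    static final class Calculator implements Visitor {
        int total = 0;
        public void before(Object node) {
            if (node instanceof Num) {
                total += ((Num) node).value;
            }
        }
        public void after(Object node) { /* nothing to do after the children */ }
    }

    // Hypothetical helper: wires the three objects together, as a client would.
    static int evaluate(IntegerExpression expr) {
        Calculator calculator = new Calculator();
        new DepthFirst().traverse(new IntObjectStructure(), calculator, expr);
        return calculator.total;
    }

    public static void main(String[] args) {
        IntegerExpression expr =
                new Addition(new Num(1), new Addition(new Num(2), new Num(3)));
        System.out.println(evaluate(expr)); // prints 6
    }
}
```

Swapping DepthFirst for another Traversal, or IntObjectStructure for another ObjectStructure, requires no change to Calculator or to the node classes, which is the point of separating the three concerns.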

2.10 Known Uses


Compilers use the visitor pattern extensively. Having a powerful tool that
supports more expressive algorithms and allows code reuse lessens the
burden of their implementation.
All previous variants of the visitor dismiss a feature that we believe gives
our extension the edge: metadata. With metadata it is possible to empower a
visitor with knowledge. For example, we may be able to know the parent/child
relationship of a node, or the name of the attribute that references the
node. This opens up new possibilities, like rule-based visitors, where the
visitor contains a rule-based dispatching algorithm for a client class,
leveraging the expressive power of the client code. By decorating the
traversed elements with metadata it is possible to take advantage of these
benefits.
Metadata also helps when different dispatching algorithms are needed for the
same traversal and object structure: with metadata, it is possible to create
visitors that dispatch method calls to a specialised client class.

Chapter 3

The Backend

3.1 Framing
3.1.1 Introduction
A frame is a data structure that holds the variables available in a certain
scope. Frames are used to specify the parameters, local variables, and global
variables defined in a function. The frame of a function is allocated when the
function is called and deallocated when the function exits.
The parameters and local variables are instantiated when a function is
called. Local variables, parameters, and global variables should only exist
whilst they are needed. If a local variable is only used for the computation
of a temporary value, the memory associated with it should be freed when
the computation is over or, at least, when the function finishes. When a
function (the nested function) is defined inside another function (the nesting
function), it may reference variables defined in the nesting function. These
variables, however, must not be freed when the nesting function exits; their
values must be stored inside the nested function’s frame.
In the π-calculus there are no functions or local variables. Input processes
are analogous to functions, as far as framing is concerned. Frames define the
names known in a certain scope. Names used by processes must be present
in a certain scope, defined either as a global variable or as a parameter
of an input channel. Frames specify the free names available to a group of
processes.
The rules for defining a variable in a frame are the following:

a(x).(a<x> | b(y).y<x>)      [Frame name: ’a’; Parameters: ’x’; Globals: ’a’]
  a<x> | b(y).y<x>
    a<x>
    b(y).y<x>                [Frame name: ’b’; Parameters: ’y’; Globals: ’x’]
      y<x>

Figure 3.1: How frames annotate an AST

(F1) the names used in the arguments of an output are variables;

(F2) the name of the input channel used in a replication is a variable;

(F3) the global variables defined in a frame are variables in the parent frame.

A variable is a global variable of a frame if it is not a parameter of that
frame.
Figure 3.1 illustrates framing applied to a simple AST. We explain the
process from the leaves to the root process.
The output process y<x> uses a name as an argument; by rule F1 it
becomes a variable in the enclosing frame. Moving to the parent process
b(y).y<x>, because it is an input process, a new frame is defined. The variable
x is global in that frame, because it is not a parameter of the input process.
The output process a<x>, similarly, uses a name (x) as an argument, so x
becomes a variable in the enclosing frame. Both processes, a<x> and b(y).y<x>,
declare the variable x as global; hence, by rule F3, x is a variable in the
frame defined by the input process a(x).(a<x> | b(y).y<x>). Since the input
process has a parameter named x, the variable x is not global in that frame.
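
Rules F1 to F3 can be sketched as a small recursive walk over a toy AST. This is an illustration only, not the compiler's code: the classes Frame, Output, Input, and Parallel, and the method frames, are hypothetical and much simpler than the real AST. The sketch applies the rules literally, and it assumes that rule F2 declares the channel name in the frame of the replicated input itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class FramingSketch {
    /** The names known inside one input process. */
    static final class Frame {
        final String name;
        final Frame parent;
        final Set<String> parameters = new LinkedHashSet<>();
        final Set<String> globals = new LinkedHashSet<>();

        Frame(String name, Frame parent) {
            this.name = name;
            this.parent = parent;
        }

        /** A non-parameter is global here and is a variable in the parent frame (F3). */
        void declare(String variable) {
            if (parameters.contains(variable)) {
                return;
            }
            globals.add(variable);
            if (parent != null) {
                parent.declare(variable);
            }
        }
    }

    interface Process {}

    static final class Output implements Process {
        final String channel;
        final String argument;
        Output(String channel, String argument) {
            this.channel = channel;
            this.argument = argument;
        }
    }

    static final class Input implements Process {
        final String channel;
        final String parameter;
        final boolean replicated;
        final Process body;
        Input(String channel, String parameter, boolean replicated, Process body) {
            this.channel = channel;
            this.parameter = parameter;
            this.replicated = replicated;
            this.body = body;
        }
    }

    static final class Parallel implements Process {
        final Process left;
        final Process right;
        Parallel(Process left, Process right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Creates one frame per input process and returns them in creation order. */
    static List<Frame> frames(Process root) {
        List<Frame> created = new ArrayList<>();
        visit(root, null, created);
        return created;
    }

    private static void visit(Process process, Frame current, List<Frame> created) {
        if (process instanceof Output) {
            if (current != null) {
                current.declare(((Output) process).argument); // F1
            }
        } else if (process instanceof Parallel) {
            Parallel par = (Parallel) process;
            visit(par.left, current, created);
            visit(par.right, current, created);
        } else if (process instanceof Input) {
            Input in = (Input) process;
            Frame frame = new Frame(in.channel, current);
            frame.parameters.add(in.parameter);
            created.add(frame);
            if (in.replicated) {
                frame.declare(in.channel); // F2 (assumed to target the input's own frame)
            }
            visit(in.body, frame, created);
        }
    }

    public static void main(String[] args) {
        // a(x).(a<x> | b(y).y<x>), the process of Figure 3.1
        Process example = new Input("a", "x", false,
                new Parallel(new Output("a", "x"),
                        new Input("b", "y", false, new Output("y", "x"))));
        for (Frame f : frames(example)) {
            System.out.println(f.name + " parameters=" + f.parameters
                    + " globals=" + f.globals);
        }
    }
}
```

For the process of Figure 3.1, the walk yields x as the only global of frame b and no x among the globals of frame a, matching the explanation above.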

3.1.2 Implementation
Our compiler creates a representation of frames in the same step it performs
semantic checking. A new frame is mapped to each input process. Frames
are named after the name of the channel. Figure 3.2 illustrates the class
diagram of frames.

[Class diagram: Frame (+parent: Frame, +name: Symbol, +depth: int), linked by
the associations globals and parameters to Variable (+type: PiType,
+position: int).]

Figure 3.2: The class diagram of Frame

Stacks are used in visitors to allow communication between nodes at different
depths. The visitor navigating the AST holds a stack that defines the current
frame. Each time it visits an input process, a new frame is created and the
names of the arguments of the channel are defined as parameters of that
frame. This frame is pushed onto a stack of frames. The top of the stack
is the current frame enclosing a group of processes. When the visitor leaves
the input process, the current frame is removed from the top of the stack of
frames.
When the visitor enters an output process, it adds all the instances of type
NameValue present in the arguments as variables to the current frame.
There is also a stack in the visitor to hold Replication objects, allowing the
visitor to know whether an InputPrefix instance is contained in a Replication
instance. When the visitor enters an InputPrefix that is a child of a
Replication instance, it adds a variable (with the name of the channel) to the
current frame.

3.2 Translation
After semantic analysis we translate the AST to an abstract assembly
language. The translation can be performed in the same traversal where
semantic analysis is done, but this usually makes the code more complex.
Translating directly to a concrete assembly language reduces the portability
of the compiler, since it binds one compiler to one platform.
Abstract assembly languages generalise concepts that exist in various
platforms. Various compilers may target the same abstract assembly language,
which enables code generation for various architectures.
We show an overview of the implementation of the translation step, followed
by an in-depth analysis of each component that is part of the backend.

[Sequence diagram: inputPrefix(input) arrives at :SymbolTableProcessVisitor;
1: pushFrame(input) on :SymbolTable; 1.1: <<create>> frame:Frame;
1.2: put(input, frame) on :Map<InputPrefix,Frame>; 1.3: push(frame) on
:Stack<Frame>.]

Figure 3.3: Sequence diagram of how frames are created.

3.2.1 Overview
Our compiler targets MIL, an abstract typed assembly language that has
the concept of threads. In our compiler, code generation is performed after
semantic checking. The code generation traverses the AST twice: first to
associate a MIL Label with each process and then to translate the nodes. The
generated code uses the runtime library defined in Chapter 1.6.
By using composite visitors [9], through the Decorator pattern [3], we
apply various operations separately, i.e., each in its own class, in the same
traversal. For example, the class FramePusherVisitor gives the decorated
visitor access to the current frame while traversing the AST. This visitor is
used both in the process labelling traversal and in the code generation
traversal. The ScopeVisitor keeps the symbol table consistent for the
decorated visitor. In this case, the class is used only once, but separating
the code that handles the symbol table from the code generation is motivation
enough to create a new visitor.

[Package diagram: the package pi contains the packages absyn, semant,
framing, and translate.]

Figure 3.4: The package diagram of the compiler

The basic idea of the translation is: sequential operations are performed
sequentially in a single thread, and processes running in parallel each run
in a separate thread.
So, considering that each process has a label attached to it, the process
P | Q roughly translates to:
    fork P[α]
    fork Q[α]
The code generation for the nil process, 0, is straightforward:
    yield
The process x<10> has this translation sketch:
    r1 := 10
    r2 := x
    jump writeMessage[α]
The input process x(a).P is a bit trickier, since we need to pack the user
data (which we will address further), but we can sketch it as:
    r1 := malloc [ContinuationType, UserDataType] guarded by α
    r1[0] := continuation
    r1[1] := userdata
    r1 := pack UserDataType, r1 as PackedContinuation
    r2 := x
    jump readMessage[α]
where the continuation is something like:

[Class diagram: the classes Translate, ScopeVisitor, CodeHelper,
ProcessLabelerVisitor, EnvironmentCreator, RegisterPool, FramePusherVisitor,
TypeAdapter, FrameIndexerFactory, ChannelRepresentation, FrameIndexer,
UniqueSymbolFactory, MilConstants, and GenerateMilUtil, grouped under the
labels "Tree preprocessing", "Code generation", "Pi to MIL type adapter",
"MIL’s utility functions", and "MIL’s frame adapter".]

Figure 3.5: The class diagram of the package translate.

    a := r2 -- receive the value of the parameter
    jump P[α]
Finally, because we only allow replicated input processes, the translation
of the replication is similar to that of the input, but with a different
continuation. Consider the continuation of the process !x(a).P:
    a := r1
    fork P
    jump grabLock[α]
where the generated output of the code fragment grabLock is something like:
    r5 := tslE r1
    if r5 = 0 jump tryAgain
    fork grabLock[α] -- sleep spin lock
    yield
The last block (tryAgain) tries to read the message again:
    r1 := malloc [ContinuationType, UserDataType] guarded by α
    r1[0] := continuation
    r1[1] := userdata
    r1 := pack UserDataType, r1 as PackedContinuation
    r2 := x
    jump readMessage[α]

3.2.2 Frame Indexing


Frame indexing is where we map an abstract frame to a concrete data
structure in MIL. Concrete frames are tuples that arrange the global
variables and then the parameters sequentially, as illustrated by Figure 3.6.
We also implement operations to retrieve the concrete location in the data
structure of a certain variable.
A frame indexer is implemented in the class FrameIndexer, depicted in
Figure 3.7. We are able to obtain the π-calculus type for a certain index.
We can also retrieve the name of the variable at a certain index. Finally, it
is possible to find the index associated with a certain variable.
The FrameIndexerFactory is a Flyweight factory [3]. We use this pattern to
optimize the creation speed and the memory usage of instances of
FrameIndexer.
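
A minimal sketch of the indexing scheme follows. It is not the compiler's FrameIndexer: the Variable class, the constructor, the accessors getNameAt and getTypeAt, and the factory method indexerFor are hypothetical, types are plain strings instead of PiType, the cache is keyed by frame name, and indices are assumed to start at 0 as in MIL tuple accesses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FrameIndexingSketch {
    static final class Variable {
        final String name;
        final String type;
        Variable(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Layout of one concrete frame: the globals first, then the parameters. */
    static final class FrameIndexer {
        private final List<Variable> slots = new ArrayList<>();

        FrameIndexer(List<Variable> globals, List<Variable> parameters) {
            slots.addAll(globals);
            slots.addAll(parameters);
        }

        int getIndexFor(String name) {
            for (int i = 0; i < slots.size(); i++) {
                if (slots.get(i).name.equals(name)) {
                    return i;
                }
            }
            throw new IllegalArgumentException("unknown variable: " + name);
        }

        String getNameAt(int index) { return slots.get(index).name; }

        String getTypeAt(int index) { return slots.get(index).type; }
    }

    /** Flyweight factory: at most one indexer is built per frame name. */
    static final class FrameIndexerFactory {
        private final Map<String, FrameIndexer> cache = new HashMap<>();

        FrameIndexer indexerFor(String frameName, List<Variable> globals,
                                List<Variable> parameters) {
            return cache.computeIfAbsent(frameName,
                    key -> new FrameIndexer(globals, parameters));
        }
    }

    public static void main(String[] args) {
        // The frame of Figure 3.6: globals x:Str, y:Int and parameter a:Int.
        List<Variable> globals =
                Arrays.asList(new Variable("x", "Str"), new Variable("y", "Int"));
        List<Variable> parameters = Arrays.asList(new Variable("a", "Int"));
        FrameIndexer indexer =
                new FrameIndexerFactory().indexerFor("a", globals, parameters);
        System.out.println(indexer.getIndexFor("a") + " " + indexer.getTypeAt(0));
    }
}
```

Sharing one indexer per frame means that repeated lookups during code generation never rebuild the layout.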

Frame: a
Global Variables: x:Str, y:Int
Parameters: a:Int

Indexed Frame:   x     y     a
                 Str   Int   Int

Figure 3.6: Frame indexing example.

FrameIndexer
+names: Symbol[]
+types: PiType[]
+getIndexFor(name:Symbol): int

Figure 3.7: The class diagram of the FrameIndexer

3.2.3 Type Adapting


Throughout code generation, adapting π-calculus types to MIL types is a
recurring task. The class TypeAdapter implements this operation. Adapting
primitive types is straightforward: an integer type in the π-calculus maps
to an integer type in MIL, and the same holds for the string type. Link types
are mapped to the Channel type specified in the π-calculus runtime, see
Chapter 1.6.
To generate the appropriate Channel for a certain link type we use the
class ChannelRepresentation. The TypeAdapter is also a Flyweight factory of
instances of ChannelRepresentation, since these are used very frequently in
code generation.
The class ChannelRepresentation generates a MIL Channel according to the
argument (the Message) of the link type; recall that we are dealing with the
monadic π-calculus. This is a recursive operation, since the type of the
parameter must also be adapted from the π-calculus to MIL. Each channel
representation creates a TypeFragment, associating a type variable with the
adapted type. This way, the generated code is easier for a human to read and
is smaller, since a type variable is considerably shorter than the
Channel type.

3.2.4 Code Generation Helper
To aid the translation we use the class CodeHelper, a Façade [3] to type
adapting, to frame indexing, to register pooling, and to environment creation.
Register pooling allows the reuse of registers. Environment creation is the
initialisation of a frame in MIL.
Register pooling is performed in the class RegisterPool. This class has two
methods, one to allocate a register and another to free a register. When a
register is freed, it is placed in the pool. When a register is allocated and
the pool is not empty, a register from the pool is returned, so the number of
registers in use does not grow; only when the pool is empty is a fresh
register returned. It is possible to know how many registers are being used
at the same time.
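
The pooling policy can be sketched in a few lines. The method names allocate, free, and registersUsed, and the numbering of registers from 1, are assumptions made for this illustration; the real RegisterPool may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RegisterPoolSketch {
    static final class RegisterPool {
        private final Deque<Integer> pool = new ArrayDeque<>();
        private int highest = 0; // number of distinct registers handed out so far

        /** Reuses a freed register when one is available, otherwise creates a fresh one. */
        int allocate() {
            if (!pool.isEmpty()) {
                return pool.pop();
            }
            highest++;
            return highest;
        }

        /** Returns a register to the pool so that a later allocation can reuse it. */
        void free(int register) {
            pool.push(register);
        }

        /** The largest number of registers that were live at the same time. */
        int registersUsed() {
            return highest;
        }
    }

    public static void main(String[] args) {
        RegisterPool registers = new RegisterPool();
        int first = registers.allocate();
        int second = registers.allocate();
        registers.free(first);
        int reused = registers.allocate(); // reuses the register that was freed
        System.out.println(first + " " + second + " " + reused
                + " used=" + registers.registersUsed());
    }
}
```

Because a fresh register is created only when the pool is empty, the counter equals the peak number of simultaneously live registers, which is the kind of figure a declaration such as "registers 7" in the generated code needs.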
Environment creation is implemented in the class EnvironmentCreator. It
generates a MIL type for a given frame. First, it uses the FrameIndexerFactory
to index the variables. Afterwards, the EnvironmentCreator adapts each π-
calculus type to a MIL type. Finally, a tuple is created and bound to a type
variable. This tuple comprises the adapted types (from the indexed frame).
The environment is passed between processes and contains the values of the
existing variables. When communicating with the runtime, it is the user data
sent to the input continuation.
The CodeHelper contains a group of methods to generate blocks of code
(code fragments). There is a concept of a current code block where instruc-
tions are appended to. There is a method to close the current code block,
where we free the registers not freed explicitly and generate the code signa-
ture. The closed code block is then added as a new code fragment to the
generated tree.
There are two methods in the CodeHelper that are used to generate code
for a process. These were not placed in the Translate class because unit testing
was easier, and because the idea is that most code generation is handled by
the CodeHelper, not the Translate , which should be used more like a mediator
between the traversal and the code generation.
Code generation for the output process is straightforward. If the parameter
is a literal, its value is converted to a MIL value. If the parameter is a
name, then we use frame indexing to find where the name is located in
the environment variable. The value is loaded from the environment into the
register that passes the message. Afterwards, a jump to writeMessage is
performed.
When the input process is run, a new frame is created; this corresponds
to the allocation and initialisation of a new frame object. The initialisation
consists of copying the global variables present in the new frame from the
old environment. After frame switching, we prepare the call to readMessage,
defined in the runtime.

Chapter 4

Conclusion

In our work we describe MIL. We implement a π-calculus runtime library
written in that language. Afterwards, we introduce an extension of the Visitor
design pattern, outlining its uses and advantages over the classical
implementation. By separating the traversal, the object structure, and the
visitor, we increase the reusability and maintainability of the code. Finally,
we show the backend of the compiler. Our backend translates the π-calculus
to MIL.

Chapter 5

Appendix

This is an example of a π-calculus program:

(new a:(int))(!a(x).a<x> | a<1>)

This is the generated MIL code:

version error 0.0

registers 7

packed: exists Env. <[l](r1: Env, r2: int, r3: <lock(l)>^l) requires (l;;), Env>^l

channel: <int, int, packed>^l

mainEnv: <channel>^l

main []() requires (;;) {


l, r3 := newLock -1 -- Create the closure lock.
r5 := malloc [channel] guarded by l
r6 := malloc [[l](r1: int, r2: int, r3: <lock(l)>^l) requires (l;;), int] guarde
-- allocate channel: (int)
r6[0] := sink[int][int] -- initialize the input channel continuation with a dumm
r6[1] := 0 -- store the dummy environment into the input channel
r6 := pack int, r6 as packed -- pack the input channel
r7 := malloc [int, int, packed] guarded by l -- allocate data for channel: (int)
r7[0] := 0 -- the initial status is 0 (empty)

r7[2] := r6 -- move the dummy input channel
r7[1] := 0 -- move the dummy output message
r5[0] := r7 -- initialize (int)
r1 := r5 -- Move the environemnt to the first register
jump main_new_a[l]
}

main_new_a [l](r1: mainEnv, r3: <lock(l)>^l) requires (l;;) {


jump main_parallel[l]
}

main_parallel [l](r1: mainEnv, r3: <lock(l)>^l) requires (l;;) {


unlockE r3 -- Unlock the closure lock.
fork main_parallel_left[l] -- Fork left
fork main_parallel_right[l] -- Fork right
yield
}

main_parallel_left [l](r1: mainEnv, r3: <lock(l)>^l) requires (;;) {


r5 := tslE r3 -- Try to grab the lock.
if r5 = 0 jump main_replication[l] -- Got the lock. Run the left process.
fork main_parallel_left[l] -- Try to grab the lock later.
yield
}

main_parallel_right [l](r1: mainEnv, r3: <lock(l)>^l) requires (;;) {


r5 := tslE r3 -- Try to grab the lock.
if r5 = 0 jump main_out_a[l] -- Got the lock. Run the right process.
fork main_parallel_right[l] -- Try to grab the lock later.
yield
}

main_replication [l](r1: mainEnv, r3: <lock(l)>^l) requires (l;;) {


jump a_in_a[l]
}

aEnv: <channel, int>^l

a_in_a [l](r1: mainEnv, r3: <lock(l)>^l) requires (l;;) {
r5 := malloc [channel, int] guarded by l -- alloc space for the new env ’a’
r7 := r1[0]
r5[0] := r7
r5[1] := 0 -- initialize ’x’
r2 := r1[0] -- move the channel a as 2nd arg
r7 := malloc [[l](r1: aEnv, r2: int, r3: <lock(l)>^l) requires (l;;), aEnv] guar
r7[0] := read_replicate_a
r7[1] := r5
r1 := pack aEnv, r7 as exists Env. <[l](r1: Env, r2: int, r3: <lock(l)>^l) requi
jump inputMessage[l][int]
}

read_replicate_a [l](r1: aEnv, r2: int, r3: <lock(l)>^l) requires (l;;) {


r1[1] := r2
fork a_out_a[l]
jump try_reading_again_a[l]
}

try_reading_again_a [l](r1: aEnv, r3: <lock(l)>^l) requires (;;) {


r7 := tslE r3
if r7 = 0 jump read_again_a[l]
fork try_reading_again_a[l]
yield
}

read_again_a [l](r1: aEnv, r3: <lock(l)>^l) requires (l;;) {


r2 := r1[0]
r7 := malloc [[l](r1: aEnv, r2: int, r3: <lock(l)>^l) requires (l;;), aEnv] guar
r7[0] := read_replicate_a
r7[1] := r1
r1 := pack aEnv, r7 as exists Env. <[l](r1: Env, r2: int, r3: <lock(l)>^l) requi
jump inputMessage[l][int]
}

a_out_a [l](r1: aEnv, r3: <lock(l)>^l) requires (l;;) {


r2 := r1[0] -- the channel ’a’
r1 := r1[1] -- the output message is the variable ’x’

jump outputMessage[l][int]
}

main_out_a [l](r1: mainEnv, r3: <lock(l)>^l) requires (l;;) {


r2 := r1[0] -- the channel ’a’
r1 := 1 -- the output message is an integer literal
jump outputMessage[l][int]
}

InputContinuation: exists Env. <[l](r1: Env, r2: Message, r3: <lock(l)>^l) require

Channel: <int, Message, InputContinuation>^l

ReduceCode: [l,Message] (r2: Channel,


r3: <lock(l)>^l) requires (l;;)

OutputCode: [l,Message] (r1: Message,


r2: Channel,
r3: <lock(l)>^l) requires (l;;)

OutputUnlockedCode: [l,Message] (r1: Message,


r2: Channel,
r3: <lock(l)>^l) requires (;;)

InputCode: [l,Message] (r1: InputContinuation,


r2: Channel,
r3: <lock(l)>^l) requires (l;;)

InputUnlockedCode: [l,Message] (r1: InputContinuation,


r2: Channel,
r3: <lock(l)>^l) requires (;;)

sink [Env,Message,l](r1: Env, r2: Message, r3: <lock(l)>^l) requires (l;;) {


unlockE r3
yield
}

outputMessage OutputCode {

r4 := r2[0] -- Grab the status of the channel.
if r4 = 0 -- Empty. Place our message into the channel.
jump outputFill[l][Message]
r4 := r4 - 1 -- if r4 = 1
if r4 = 0 -- Has a message already. Try again.
jump outputScheduleMessage[l][Message]
r4 := r4 - 1 -- if r4 = 2
if r4 = 0 -- Has an input channel. Redux.
jump outputReduce[l][Message]

-- should never reach this code


jump outputMessage[l][Message]
}

outputFill OutputCode {
r2[0] := 1
r2[1] := r1
unlockE r3
yield
}

outputScheduleMessage OutputCode {
unlockE r3
jump outputGrabLock[l][Message]
}

outputGrabLock OutputUnlockedCode {
r4 := tslE r3
if r4 = 0 -- we grab the lock, back to the begining
jump outputMessage[l][Message]

-- try again:
fork outputGrabLock[l][Message]
yield
}

outputReduce OutputCode {
r2[1] := r1

jump reduce[l][Message]
}

reduce ReduceCode {
r2[0] := 0 -- flag it as empty
r4 := r2[2] -- the input
r2 := r2[1] -- the message

x, r4 := unpack r4 -- unpack the tuple


r1 := r4[1] -- the env
r4 := r4[0] -- the continuation
jump r4[l]
}

inputMessage InputCode {
r4 := r2[0]
if r4 = 0
jump inputFill[l][Message]

r4 := r4 - 1 -- if r4 = 1
if r4 = 0
jump inputReduce[l][Message]

r4 := r4 - 1 -- if r4 = 2
if r4 = 0
jump inputSchedule[l][Message]
-- should never reach this code
jump inputMessage[l][Message]
}

inputFill InputCode {
r2[0] := 2 -- has an input
r2[2] := r1 -- put the input into the pool
unlockE r3 -- we are done; yield
yield
}

inputReduce InputCode {

r2[2] := r1
jump reduce[l][Message]
}

inputSchedule InputCode {
unlockE r3
jump inputGrabLock[l][Message]
}

inputGrabLock InputUnlockedCode {
r4 := tslE r3
if r4 = 0 jump inputMessage[l][Message]
fork inputGrabLock[l][Message]
yield
}

Bibliography

[1] Martin Bravenboer and Eelco Visser. Guiding visitors: Separating
navigation from computation, November 29 2001.

[2] Cormac Flanagan and Martín Abadi. Types for safe locking, February 02
1999.

[3] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design
Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley,
1994.

[4] Greg Morrisett, David Walker, Karl Crary, and Neal Glew. From System F
to typed assembly language. ACM Transactions on Programming Languages and
Systems, 21(3):527–568, 1999.

[5] Kunle Olukotun and Lance Hammond. The future of microprocessors.
Queue, 3(7):26–29, 2005.

[6] Benjamin C. Pierce. Advanced Topics In Types And Programming
Languages. MIT Press, November 2004.

[7] David N. Turner. The Polymorphic Pi-Calculus: Theory and
Implementation. PhD thesis, LFCS, University of Edinburgh, June 1996.
CST-126-96 (also published as ECS-LFCS-96-345).

[8] Vasco T. Vasconcelos and Francisco Martins. A multithreaded typed
assembly language. In Proceedings of TV06 - Multithreading in Hardware
and Software: Formal Approaches to Design and Verification, 2006.

[9] J. M. W. Visser and Joost Visser. Visitor combination and traversal
control. Technical report, July 11 2001.
