Article · December 1996


DOI: 10.1007/978-1-4471-3573-9_7 · Source: CiteSeer

Unboxing using Specialisation*
Cordelia Hall, Simon L. Peyton Jones and Patrick M. Sansom
University of Glasgow†

Abstract
In performance-critical parts of functional programs substantial perfor-
mance improvements can be achieved by using unboxed, instead of boxed,
data types. Unfortunately, polymorphic functions and data types cannot
directly manipulate unboxed values, precisely because they do not con-
form to the standard boxed representation. Instead, specialised, monomor-
phic versions of these functions and data types, which manipulate the
unboxed values, have to be created. This can be a very tiresome and
error prone business, since specialising one function often requires the
functions and data types it uses to be specialised as well.
In this paper we show how to automate these tiresome consequential
changes, leaving the programmer to concentrate on where to introduce
unboxed data types in the first place.

1 Introduction
Non-strict semantics certainly add to the expressive power of a language [8].
Sometimes the performance cost of this extra expressiveness is slight, but not
always. It can happen that an inner loop of a program is made seriously less
efficient by non-strictness. For example, consider the following fragment of a
complex-number arithmetic package:
data Complex = Cpx Float Float

addCpx :: Complex -> Complex -> Complex
addCpx (Cpx r1 i1) (Cpx r2 i2) = Cpx (r1+r2) (i1+i2)

In a strict language, a complex number would be represented as a pair of
unboxed floating point numbers, and addCpx would actually perform the
additions of the components. In a non-strict language such as Haskell [7], though,
a complex number is a pair of pointers to possibly-unevaluated thunks. The
function addCpx cannot actually force these thunks, since they might be bot-
tom, but rather must build further thunks representing (r1+r2) and (i1+i2)
respectively. If complex arithmetic is in the inner loop of the program, the
performance penalty is quite substantial.

* This paper appeared in Functional Programming, Glasgow 1994, Workshops in
Computing, Springer Verlag, 1995.
† Authors' address: Dept of Computing Science, University of Glasgow, Glasgow G12 8QQ,
Scotland. E-mail: fsimonpj,[email protected].

Let us suppose, then, that a profiler has focussed the programmer's attention
on this arithmetic package. What can be done to make it more efficient?
Peyton Jones & Launchbury [16] suggested that unboxed data types could be
made "first class citizens", and that the programmer be allowed to declare data
types involving them, thus:
data Complex = Cpx Float# Float#

Here, Float# is the type of unboxed floating-point numbers. (An immediate
consequence is that the components of a complex number must be evaluated
before the complex number itself is constructed, so that the representation is
stricter than before.) Modifications of this kind can have a dramatic impact
on the performance of some programs.
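As a rough illustration, the unboxed declaration can be tried in present-day GHC, which is an assumption on our part: the `MagicHash` extension, the `GHC.Exts` module and the primitive `plusFloat#` are GHC names, not notation from this paper, and `mkCpx` and `parts` are hypothetical helpers added only so that values can be built and inspected from boxed code.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Float (F#), Float#, plusFloat#)

-- Complex numbers whose components are unboxed floats, as in the text.
data Complex = Cpx Float# Float#

-- Both additions are performed when the result is constructed;
-- no thunks are built for the two sums.
addCpx :: Complex -> Complex -> Complex
addCpx (Cpx r1 i1) (Cpx r2 i2) = Cpx (plusFloat# r1 r2) (plusFloat# i1 i2)

-- Hypothetical helper: unbox two ordinary Floats into a Complex.
mkCpx :: Float -> Float -> Complex
mkCpx (F# r) (F# i) = Cpx r i

-- Hypothetical helper: box the components again for inspection.
parts :: Complex -> (Float, Float)
parts (Cpx r i) = (F# r, F# i)
```

Loaded into GHCi, `parts (addCpx (mkCpx 1 2) (mkCpx 3 4.5))` evaluates the component sums eagerly. Note that the arithmetic here is written with the primitive `plusFloat#` rather than the overloaded `+`.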
There is a catch, though. Suppose we wanted to transform a list of complex
numbers to a list of their imaginary components. We might try to write:
imags :: [Complex] -> [Float#]
imags cs = [im | Cpx re im <- cs]

Unfortunately, we cannot form a list of unboxed floating point numbers, because
both the size and the pointer-hood of a Float# differ from those of a pointer.
Instead, a new data type must be declared for lists of Float#:

data LFloat = NilF
            | ConsF Float# LFloat

Alas, none of the usual list-manipulating functions (map, filter etc) work over
LFloat, so new versions of them have to be defined, and so it goes on.
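The extra work this causes can be sketched in present-day GHC (an assumption; `MagicHash`, `GHC.Exts` and `timesFloat#` are GHC names rather than this paper's). The names `mapLF`, `fromList` and `toList` are hypothetical; they stand for the kind of hand-written duplicates the programmer is forced to provide.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Float (F#), Float#, timesFloat#)

-- Monomorphic list of unboxed floats, as declared in the text.
data LFloat = NilF
            | ConsF Float# LFloat

-- Hand-written counterpart of map; the ordinary map cannot be used
-- because its elements must be boxed.
mapLF :: (Float# -> Float#) -> LFloat -> LFloat
mapLF _ NilF         = NilF
mapLF f (ConsF x xs) = ConsF (f x) (mapLF f xs)

-- Hypothetical conversions at the boxed/unboxed boundary.
fromList :: [Float] -> LFloat
fromList = foldr (\(F# x) acc -> ConsF x acc) NilF

toList :: LFloat -> [Float]
toList NilF         = []
toList (ConsF x xs) = F# x : toList xs
```

Every further list function needed over LFloat (filter, foldr, length and so on) would have to be duplicated in the same way.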
In general, Peyton Jones & Launchbury [16] put forward the restriction
that: polymorphic functions (and data constructors) cannot be used at unboxed
types. We use the term "creeping monomorphism" to describe the sad necessity
to declare new functions simply because of this restriction. The goal of this
paper is to lift the restriction, by automating the production of new versions
of existing functions and data constructors.
Even the humble + function in addCpx is an example. When the Complex
data type is changed, type inference will find that r1 and r2 are of type Float#.
Since + is an overloaded function, with type

(+) :: Num a => a -> a -> a

Peyton Jones & Launchbury would prohibit + from being applied to a value
of type Float#. However, if the restriction is lifted, and Float# is made an
instance of class Num, then the code for addCpx will compile without modification.
Our goal is to allow the programmer to use profiling information [19] to
improve run-time performance by making minimal changes to data type decla-
rations and type signatures. The system we describe in this paper propagates
these changes throughout the program, compiling specialised versions of poly-
morphic functions and constructors where they are now used at unboxed types.
We begin by describing our Core language (Section 2). We then examine
the use of polymorphic functions and data types in the presence of unboxed val-
ues (Section 3), formalising Peyton Jones & Launchbury's unboxing restriction,
describing the process of specialisation which enables us to relax this restriction
(Section 3.1), and presenting a partial evaluator which performs this speciali-
sation (Section 3.2). In Section 4 we discuss the practical implications, before
presenting some preliminary results (Section 5) and discussing related work
(Section 6).

2 The Core Language


Our source language is Haskell, but the language we discuss in this paper is the
intermediate language used by our compiler, the Core language [15]. There are
three reasons for studying this intermediate language. First, it allows us to focus
on the essential aspects of the algorithm, without being distracted by Haskell's
syntactic sugar. Second, Haskell's implicit overloading is translated into explicit
function abstractions and applications so that no further special treatment of
overloading is necessary. Third, the type abstractions and applications which
are implicit in a Haskell program, are made explicit in the Core program.
Thus, each polymorphic application which manipulates unboxed values can
easily be identified by looking at the type arguments in the application. Our
transformation identifies any applications involving unboxed types and replaces
each of them with an appropriately specialised version of the function.
The syntax of the Core language is given in Figure 1. It is an explicitly
typed second-order functional language with (recursive) let, (saturated) data
constructors, algebraic case and explicit boxing. This language combines the
higher-order explicit typing of Core-XML [11] with the explicit boxing and type
structure of Peyton Jones & Launchbury [16].
In this language the argument of an application is always a simple variable.
A non-atomic argument is handled by first binding it to a variable using a
let-expression. This restriction reflects the fact that a non-atomic argument must
be bound to a heap-allocated closure before the function is applied. In the case
of a strict, unboxed value ($\tau \in \mathit{UnboxedType}$), the let-expression evaluates
the bound expression and binds the result value before the function is applied.
A type normalised expression [6] is one that satisfies the following two
conditions. 1) Type abstractions occur only as the bound expression of
let-expressions; i.e. $\mathtt{let}\ x{:}\sigma = \Lambda\alpha_1.\Lambda\alpha_2.\cdots\Lambda\alpha_n.e_1\ \mathtt{in}\ e_2$, which we will abbreviate
with $\mathtt{let}\ x{:}\sigma = \Lambda\alpha_1 \cdots \alpha_n.e_1\ \mathtt{in}\ e_2$. 2) Type applications are only allowed for
variables; i.e. $x\ \{\tau_1\} \cdots \{\tau_n\}$, which we will write $x\ \{\tau_1 \cdots \tau_n\}$. Henceforth we
will assume that all expressions are type normalised.
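\[
\begin{array}{llcl}
\text{Expression} & e & ::= & x \\
 & & \mid & \lambda x{:}\tau.e \\
 & & \mid & e\ x \\
 & & \mid & \mathtt{let}\ x{:}\sigma = e_1\ \mathtt{in}\ e_2 \\
 & & \mid & \Lambda\alpha.e \\
 & & \mid & e\ \{\tau\} \\
 & & \mid & C\ \{\tau_1 \cdots \tau_n\}\ x_1 \cdots x_a \\
 & & \mid & \mathtt{case}\ e\ \mathtt{of}\ \{C_j\ x_{j1}{:}\tau_{j1} \cdots x_{ja_j}{:}\tau_{ja_j} \to e_j\}_{j=1}^{m} \\[4pt]
\text{PolyType} & \sigma & ::= & \forall\alpha.\sigma \mid \tau \\[4pt]
\text{MonoType} & \tau & ::= & \pi \mid \nu \\[4pt]
\text{BoxedType} & \pi & ::= & \alpha \mid \tau_1 \to \tau_2 \mid \chi\ \tau_1 \cdots \tau_n \\[4pt]
\text{UnboxedType} & \nu & ::= & \mathtt{int\#} \mid \mathtt{float\#} \mid \mathtt{char\#} \mid \chi\#\ \tau_1 \cdots \tau_n
\end{array}
\]

Figure 1: Core Language Syntax

2.1 Data constructors
In this second-order language a data constructor must be (fully) applied to
both the type arguments of the data type and the value arguments of the data
object being constructed. For example, the standard list data type

data List $\alpha$ = Nil | Cons $\alpha$ (List $\alpha$)

has two constructors, Nil and Cons. The implied constructor declarations
might be expressed in the higher-order calculus as follows:

\[
\begin{array}{lcl}
\mathtt{Nil} & : & \forall\alpha.\mathtt{List}\ \alpha \\
 & = & \Lambda\alpha.[\mathtt{Nil}] \\
\mathtt{Cons} & : & \forall\alpha.\alpha \to \mathtt{List}\ \alpha \to \mathtt{List}\ \alpha \\
 & = & \Lambda\alpha.\lambda v_1{:}\alpha.\lambda v_2{:}\mathtt{List}\ \alpha.[\mathtt{Cons}\ v_1\ v_2]
\end{array}
\]

where [ ] indicates actual construction of the data object. Even though the
constructor Nil has an arity of zero the higher-order constructor still requires
a type parameter to indicate what type it is being used at, e.g. Nil {Int}. In
general, a data declaration has the form

\[
\mathtt{data}\ \chi\ \alpha_1 \cdots \alpha_n = C_1\ \tau_{11} \cdots \tau_{1a_1} \mid \cdots \mid C_m\ \tau_{m1} \cdots \tau_{ma_m}
\]

which gives rise to m higher-order constructors with the form

\[
\begin{array}{lcl}
C_j & : & \forall\alpha_1 \cdots \alpha_n.\tau_{j1} \to \cdots \to \tau_{ja_j} \to \chi\ \alpha_1 \cdots \alpha_n \\
 & = & \Lambda\alpha_1 \cdots \alpha_n.\lambda v_{j1}{:}\tau_{j1}.\cdots\lambda v_{ja_j}{:}\tau_{ja_j}.[C_j\ v_{j1} \cdots v_{ja_j}]
\end{array}
\]

where $n$ is the number of type parameters of the data type and $a_j$ is the arity
of the data constructor. Since the number of type parameters is determined by
the arity of the type constructor, $\chi$, it is the same for all data constructors of
that type.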
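The Core syntax of Figure 1 and the higher-order constructors above can be transcribed into a small Haskell data type. This is only a sketch: every constructor and function name below (`TVar`, `TyLam`, `Con`, `isUnboxed`, `listOf`, and so on) is our own illustrative choice and not part of the compiler described in this paper.

```haskell
-- Types of Figure 1. All names here are illustrative assumptions.
data Type
  = TVar String          -- type variable (always boxed)
  | TFun Type Type       -- function type
  | TCon String [Type]   -- boxed data type applied to its type arguments
  | TIntU                -- int#
  | TFloatU              -- float#
  | TCharU               -- char#
  | TConU String [Type]  -- unboxed data type applied to its type arguments
  | TForall String Type  -- polymorphic type
  deriving (Eq, Show)

-- One case alternative: constructor, typed binders, right-hand side.
data Alt = Alt String [(String, Type)] Expr
  deriving (Eq, Show)

-- Expressions of Figure 1.
data Expr
  = Var String
  | Lam String Type Expr
  | App Expr String             -- the argument is always a variable
  | Let String Type Expr Expr
  | TyLam String Expr           -- type abstraction
  | TyApp Expr [Type]           -- type application
  | Con String [Type] [String]  -- saturated constructor application
  | Case Expr [Alt]
  deriving (Eq, Show)

-- Membership test for the UnboxedType syntactic class.
isUnboxed :: Type -> Bool
isUnboxed TIntU       = True
isUnboxed TFloatU     = True
isUnboxed TCharU      = True
isUnboxed (TConU _ _) = True
isUnboxed _           = False

-- The boxed list type applied to an element type.
listOf :: Type -> Type
listOf t = TCon "List" [t]

-- The higher-order constructors for Nil and Cons shown above.
nilCon, consCon :: Expr
nilCon  = TyLam "a" (Con "Nil" [TVar "a"] [])
consCon = TyLam "a" (Lam "v1" (TVar "a")
            (Lam "v2" (listOf (TVar "a"))
              (Con "Cons" [TVar "a"] ["v1", "v2"])))
```

With this encoding a list of unboxed floats, `listOf TFloatU`, is itself a boxed type even though its element type is unboxed, which is the distinction the specialiser relies on later.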
2.2 Well-formed expressions
We call an expression $e$ well-formed under type assumption $\Gamma$ if we can derive
the typing judgement $\Gamma \vdash e : \sigma$. The typing rules are quite standard [11] and
are not given here (but see Section 3).
2.3 Notation
For notational convenience we abbreviate sequences such as $x_1 \cdots x_n$ with $\bar{x}$,
where $n = \mathit{length}\ \bar{x}$. This is extended to sequences of pairs which are
abbreviated with paired sequences, e.g. $x_{j1}{:}\tau_{j1} \cdots x_{ja_j}{:}\tau_{ja_j}$ is abbreviated with $\bar{x}_j{:}\bar{\tau}_j$.
We use $\equiv$ for syntactical identity of expressions.


3 Polymorphism and Unboxed Values


A pure polymorphic function is usually compiled by treating all polymorphic
values in a uniform way. Typically all such values are required to be represented
as a pointer to a heap allocated closure i.e. they must be boxed. For example,
consider the permuting combinator C:
C f x y = f y x    (Haskell)
$C = \Lambda a\ b\ c.\lambda f{:}a \to b \to c.\lambda x{:}b.\lambda y{:}a.f\ y\ x$    (Core)
To generate the code for C we must know¹ the representation of the polymorphic
values, x and y, being manipulated. By insisting that such polymorphic values
are always boxed we can compile code which assumes that such values are
always represented by a single pointer into the heap.
It follows that a polymorphic function can only be used at boxed types,
since the representation of an unboxed type violates the assumption above.
We impose a restriction in the typing rules for expressions which prevents a
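polymorphic function being applied to an unboxed type.²

\[
\frac{\Gamma \vdash e : \forall\alpha.\sigma \qquad \tau \notin \mathit{UnboxedType}}{\Gamma \vdash e\ \{\tau\} : \sigma[\tau/\alpha]} \quad (\ast)
\]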

A similar restriction is imposed in the typing rule for data constructors. This
prohibits the construction of polymorphic data objects with unboxed compo-
nents, e.g. List Float#.
¹ The representation information that is typically required is the size of a value and the
position of any heap pointers (so that all roots can be identified during garbage collection).
When more sophisticated calling conventions are used, such as passing arguments in registers,
the actual type may also affect the treatment of a value. For example a boxed value may
be passed in a pointer register, an Int# in an integer register, and a Float# in a dedicated
floating point register.
² This restriction is equivalent to "Restriction 1: loss of polymorphism" in Peyton Jones
& Launchbury [16].

These restrictions cause the "creeping monomorphism" described in Section 1,
since the programmer must declare suitable monomorphic versions of
any polymorphic functions and data types used at unboxed types. This can be
exceedingly tedious and error prone.
To address this problem we propose to relax the unboxing restriction ($\ast$),
allowing the programmer unrestricted use of unboxed values. During
compilation we undertake to generate the necessary monomorphic function
versions automatically, converting the unrestricted program into one which satisfies
the unboxing restriction ($\ast$). We can then generate code which directly
manipulates the unboxed values since their type, and hence their representation,
is known at compile time. For example, here is the monomorphic version of C
which manipulates Float#s:

$C' = \lambda f{:}\mathtt{Float\#} \to \mathtt{Float\#} \to \mathtt{Float\#}.\lambda x{:}\mathtt{Float\#}.\lambda y{:}\mathtt{Float\#}.f\ y\ x$

Since the code generator knows that x and y have type Float# it produces code
which manipulates floating point numbers, instead of pointers. This is the only
difference between the code produced for C and C'.
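The pair C and C' can be written out by hand in present-day GHC as a minimal sketch. GHC, `MagicHash`, `GHC.Exts` and `minusFloat#` are assumptions not taken from this paper, and the names `c`, `cFloat` and `demo` are hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Float (F#), Float#, minusFloat#)

-- Polymorphic version: x and y are represented uniformly as boxed values.
c :: (a -> b -> r) -> b -> a -> r
c f x y = f y x

-- Monomorphic version, written by hand: x and y are unboxed floats.
-- The body is textually identical; only the types differ.
cFloat :: (Float# -> Float# -> Float#) -> Float# -> Float# -> Float#
cFloat f x y = f y x

-- Hypothetical boxed wrapper so that the result can be inspected.
demo :: Float -> Float -> Float
demo (F# x) (F# y) = F# (cFloat minusFloat# x y)
```

Both `c (-) 1 3` and `demo 1 3` compute `3 - 1`. The duplication of an otherwise identical body is exactly what the specialiser is meant to produce automatically.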
3.1 Specialisation
The transformation of a program with unrestricted use of unboxed types into
one which satisfies the unboxing restriction above is performed using a partial
evaluator. The idea is to remove all type applications involving unboxed types
by creating new versions of the functions being applied, specialised on the
unboxed types. These specialised versions are created by partially evaluating
the unboxed type applications.
Before launching into the definition of the partial evaluator itself, we give an
overview of the algorithm. Each time a function (or constructor) is applied to
a sequence of types, a new version of the function (or constructor), specialised
on any unboxed types in the application, is created, unless such a version has
already been created. For example, given the code³

append {Int#} xs (map {[Int#] Int#} (sum {Int#}) (append {[Int#]} yss zss))

a version of append, specialised at type Int#, is created. Given the definition
of append:

$\mathtt{append} = \Lambda\alpha.\lambda xs{:}[\alpha].\lambda ys{:}[\alpha].e$

the specialised version, append_Int#, is:

\[
\begin{array}{rcl}
\mathtt{append\_Int\#} & = & \mathtt{append}\ \{\mathtt{Int\#}\} \\
 & = & (\Lambda\alpha.\lambda xs{:}[\alpha].\lambda ys{:}[\alpha].e)\ \{\mathtt{Int\#}\} \\
 & = & (\lambda xs{:}[\alpha].\lambda ys{:}[\alpha].e)[\mathtt{Int\#}/\alpha] \\
 & = & \lambda xs{:}[\mathtt{Int\#}].\lambda ys{:}[\mathtt{Int\#}].e[\mathtt{Int\#}/\alpha]
\end{array}
\]
The name of the specialised version, append_Int#, is constructed by appending
the specialising type(s) to the original name.
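When a function is applied to a boxed type, there is no need to specialise
on that type argument since the polymorphic version, which assumes a boxed
type, will suffice. Consequently, we make the specialisation polymorphic in any
boxed type arguments. For example, the application of map is only specialised
on the second type argument, Int#, since the first type argument, [Int#], is
a boxed type.

\[
\begin{array}{rcl}
\mathtt{map} & = & \Lambda\alpha\ \beta.\lambda f{:}\alpha \to \beta.\lambda xs{:}[\alpha].e \\
\mathtt{map\_*\_Int\#} & = & \Lambda\alpha^{\ast}.\mathtt{map}\ \{\alpha^{\ast}\ \mathtt{Int\#}\} \\
 & = & \Lambda\alpha^{\ast}.(\Lambda\alpha\ \beta.\lambda f{:}\alpha \to \beta.\lambda xs{:}[\alpha].e)\ \{\alpha^{\ast}\ \mathtt{Int\#}\} \\
 & = & \Lambda\alpha^{\ast}.(\lambda f{:}\alpha \to \beta.\lambda xs{:}[\alpha].e)[\alpha^{\ast}/\alpha, \mathtt{Int\#}/\beta] \\
 & = & \Lambda\alpha^{\ast}.\lambda f{:}\alpha^{\ast} \to \mathtt{Int\#}.\lambda xs{:}[\alpha^{\ast}].e[\alpha^{\ast}/\alpha, \mathtt{Int\#}/\beta]
\end{array}
\]

³ For notational convenience we use the standard [ ] list notation, where $[\alpha] \equiv \mathtt{List}\ \alpha$.
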

A * is used to indicate a boxed type argument in which the specialised version
remains polymorphic. This reduces the number of specialised versions created
since all boxed type arguments will be treated as a * type when determining the
specialisation required. For example, the application map {Bool Int#} would
also use the specialised version map_*_Int#.
The applications can now be modified to use the specialised versions, with
all unboxed types now removed from the application. The final version of the
code for the example above is:

append_Int# xs (map_*_Int# {[Int#]} sum_Int# (append_* {[Int#]} yss zss))

In summary the specialisation algorithm is:

while the unboxing restriction ($\ast$) is not satisfied:
1. Find a type application, $f\ \{\bar{\tau}\}$, involving an unboxed type.
2. Create a suitably specialised version of f (if it does not
already exist).
3. Use the specialised version at this application site, removing
the unboxed types from the application.

Since all polymorphic values must be let-bound (see Figure 1), the definition
of f, which has to be specialised, will always be visible in the enclosing scope.
Notice that the specialised versions must themselves be specialised since the
substitution of the unboxed type over the body of the function may introduce
further unboxed type applications. To ensure termination in the presence of
recursive functions we rely on Hindley-Milner type inference having guaranteed
that all recursive references occur at the same type; thus no new versions of a
function will be created while specialising its body, since we must be creating
the specialised version required.
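The naming scheme and the rewriting of a call site described in this section can be sketched as follows. The representation of type arguments (`Boxed`, `Unboxed`) and the function names are hypothetical simplifications chosen for this example; they are not the compiler's data structures.

```haskell
import Data.List (intercalate)

-- A type argument, tagged with whether it is boxed or unboxed.
-- The string is only the printed form of the type.
data Ty = Boxed String | Unboxed String
  deriving (Eq, Show)

-- An unboxed argument contributes its own name to the specialised name;
-- a boxed argument contributes "*".
specSuffix :: Ty -> String
specSuffix (Unboxed n) = n
specSuffix (Boxed _)   = "*"

-- Name of the specialised version: the original name followed by one
-- suffix per type argument, separated by underscores.
specName :: String -> [Ty] -> String
specName f tys = intercalate "_" (f : map specSuffix tys)

-- The type arguments left at the call site after specialisation:
-- only the boxed ones survive.
residualArgs :: [Ty] -> [Ty]
residualArgs tys = [t | t@(Boxed _) <- tys]

-- Rewrite one type application f {tys} into f_spec {boxed tys}.
rewriteCall :: String -> [Ty] -> (String, [Ty])
rewriteCall f tys = (specName f tys, residualArgs tys)
```

For the running example, `rewriteCall "map" [Boxed "[Int#]", Unboxed "Int#"]` yields the name `map_*_Int#` with the single remaining argument `[Int#]`, and `map {Bool Int#}` maps to the same specialised name.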
3.2 The partial evaluator
The specialisation algorithm is efficiently implemented using a partial evaluator.
The partial evaluator, $\mathcal{T}$, takes an expression with unrestricted use of
unboxed types, and two environments: one containing the polymorphic
let-bindings and the other the specialised versions of those let-bindings which
have been created so far. It returns a triple containing an equivalent expression
which satisfies the unboxing restriction, a modified environment of specialised
versions, and a set of specialised data types required.

\[
\begin{array}{rcl}
\mathcal{T} & :: & \mathit{Exp} \to \mathit{BEnv} \to \mathit{SEnv} \to (\mathit{Exp}, \mathit{SEnv}, \mathit{TSet}) \\[4pt]
\rho \in \mathit{BEnv} & :: & \mathit{Name} \to (\mathit{Type}, \mathit{Exp}) \\
\psi \in \mathit{SEnv} & :: & \mathit{Name} \to ((\mathit{Name}, [\mathit{Type}]) \to (\mathit{Type}, \mathit{Exp})) \\
\theta \in \mathit{TSet} & :: & \{(\mathit{Name}, [\mathit{Type}])\}
\end{array}
\]
The environments are partial maps with suitable domain and lookup functions.
BEnv simply maps a variable name to its type and unrestricted expression.
SEnv is a nested environment, mapping a variable name to an environment
containing the specialised versions for that variable. The domain of this
specialised environment is the variable name and the specialising types (a vector
containing unboxed types and *s), which uniquely identifies the specialised
version. We use a subscript notation $x_{\bar{\nu}}$ to refer to the specialised version. We
also use the notation $\{\}$ for the empty environment and $\rho[x \mapsto v]$ to extend (or
modify) an environment $\rho$ with the mapping $x \mapsto v$.
TSet is the set of specialised data types required by the expression. The
partial evaluator does not explicitly specify the data type transformation;
it just collects the data types required. These are subsequently given to the
code generator which creates the required constructor functions directly from
the data type specifications.
The partial evaluator is defined in Figure 2. The equations for simple
variables (1), $\lambda$-abstraction (2), application (3), monomorphic let-binding (4), and
case (8) are quite straightforward.
For a polymorphic let-binding (equation 5) $\mathtt{let}\ x{:}\sigma = e_1\ \mathtt{in}\ e$ the body $e$
is evaluated using the following environments: $\rho$ extended with the binding for
$x$; and $\psi$ extended with an empty set of specialised bindings for $x$. (We assume
that all bound variables have unique names.) The set of specialised bindings
for $x$, returned in the modified specialisation environment $\psi_1$, are then
let-bound and returned. For simplicity, we assume that the target form of the core
language allows the set of specialised bindings to be bound in a single let.
A polymorphic application (equation 6) is replaced with an application of
an appropriately specialised version of the binding. The auxiliary function
spectys (Figure 3) determines:

$\bar{\nu}$: the unboxed types on which the binding must be specialised. A * type
indicates that the specialised version is still polymorphic in that type
parameter.
$\bar{\pi}$: the boxed types the specialised version remains polymorphic in. These
correspond to the * types in $\bar{\nu}$.

The specialised version $x_{\bar{\nu}}$ is then applied to the remaining boxed type
arguments $\bar{\pi}$ and returned. The auxiliary function specfn (Figure 3) is used to
extend the environment $\psi$ with a newly created specialisation (if it does not
already contain it). The original binding is extracted from $\rho$ and the specialising
types substituted for the corresponding type variables. (The usual alpha
substitution to avoid capture is assumed.) The partial evaluator is then applied
to the specialised body, $e'$, in an environment, $\psi'$, extended with the
specialisation being created. This ensures termination in the presence of recursion since
recursive references will assume that the required specialisation already exists
(see Section 3.1).
Finally, a constructor application (equation 7) is replaced with an
application of an appropriately specialised version of the constructor and returned with
a specification of the specialised data type required. The global environment $\Gamma$
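maps constructors to their data type.

\[
\begin{array}{lr}
\mathcal{T}[\![x]\!]\ \rho\ \psi = ([\![x]\!],\ \psi,\ \{\}) & (1) \\[6pt]
\mathcal{T}[\![\lambda x{:}\tau.e]\!]\ \rho\ \psi & (2) \\
\quad = \mathbf{let}\ (e', \psi_1, \theta_1) = \mathcal{T}\ e\ \rho\ \psi \\
\quad \phantom{=}\ \mathbf{in}\ ([\![\lambda x{:}\tau.e']\!],\ \psi_1,\ \theta_1) \\[6pt]
\mathcal{T}[\![e\ x]\!]\ \rho\ \psi & (3) \\
\quad = \mathbf{let}\ (e', \psi_1, \theta_1) = \mathcal{T}\ e\ \rho\ \psi \\
\quad \phantom{=}\ \mathbf{in}\ ([\![e'\ x]\!],\ \psi_1,\ \theta_1) \\[6pt]
\mathcal{T}[\![\mathtt{let}\ x{:}\sigma = e_1\ \mathtt{in}\ e_2]\!]\ \rho\ \psi \quad \mid\ e_1 \not\equiv \Lambda\bar{\alpha}.e;\ \sigma \in \mathit{MonoType} & (4) \\
\quad = \mathbf{let}\ (e_1', \psi_1, \theta_1) = \mathcal{T}\ e_1\ \rho\ \psi \\
\quad \phantom{= \mathbf{let}}\ (e_2', \psi_2, \theta_2) = \mathcal{T}\ e_2\ \rho\ \psi_1 \\
\quad \phantom{=}\ \mathbf{in}\ ([\![\mathtt{let}\ x{:}\sigma = e_1'\ \mathtt{in}\ e_2']\!],\ \psi_2,\ \theta_1 \cup \theta_2) \\[6pt]
\mathcal{T}[\![\mathtt{let}\ x{:}\sigma = e_1\ \mathtt{in}\ e]\!]\ \rho\ \psi \quad \mid\ e_1 \equiv \Lambda\bar{\alpha}.e'';\ \sigma \in \mathit{PolyType} & (5) \\
\quad = \mathbf{let}\ (e', \psi_1, \theta_1) = \mathcal{T}\ e\ \rho[x \mapsto (\sigma, e_1)]\ \psi[x \mapsto \{\}] \\
\quad \phantom{=}\ \mathbf{in}\ ([\![\mathtt{let}\ (\psi_1\ x)\ \mathtt{in}\ e']\!],\ \psi_1[x \mapsto \bot],\ \theta_1) \\[6pt]
\mathcal{T}[\![x\ \{\bar{\tau}\}]\!]\ \rho\ \psi & (6) \\
\quad = \mathbf{let}\ (\bar{\nu}, \bar{\pi}) = \mathit{spectys}\ \bar{\tau} \\
\quad \phantom{=}\ \mathbf{in}\ \mathbf{case}\ x_{\bar{\nu}} \in \mathit{dom}\ (\psi\ x)\ \mathbf{of} \\
\quad \qquad \mathit{True} \to ([\![x_{\bar{\nu}}\ \{\bar{\pi}\}]\!],\ \psi,\ \{\}) \\
\quad \qquad \mathit{False} \to \mathbf{let}\ (\psi_1, \theta_1) = \mathit{specfn}\ x\ \bar{\nu}\ \rho\ \psi \\
\quad \qquad \phantom{\mathit{False} \to}\ \mathbf{in}\ ([\![x_{\bar{\nu}}\ \{\bar{\pi}\}]\!],\ \psi_1,\ \theta_1) \\[6pt]
\mathcal{T}[\![C\ \{\bar{\tau}\}\ \bar{x}]\!]\ \rho\ \psi & (7) \\
\quad = \mathbf{let}\ (\bar{\nu}, \bar{\pi}) = \mathit{spectys}\ \bar{\tau} \\
\quad \phantom{= \mathbf{let}}\ \chi = \Gamma\ C \\
\quad \phantom{=}\ \mathbf{in}\ ([\![C_{\bar{\nu}}\ \{\bar{\pi}\}\ \bar{x}]\!],\ \psi,\ \{\chi_{\bar{\nu}}\}) \\[6pt]
\mathcal{T}[\![\mathtt{case}\ e\ \mathtt{of}\ \{C_j\ \bar{x}_j{:}\bar{\tau}_j \to e_j\}_{j=1}^{m}]\!]\ \rho\ \psi & (8) \\
\quad = \mathbf{let}\ (e', \psi_0, \theta_0) = \mathcal{T}\ e\ \rho\ \psi \\
\quad \phantom{= \mathbf{let}}\ (e_1', \psi_1, \theta_1) = \mathcal{T}\ e_1\ \rho\ \psi_0 \\
\quad \phantom{= \mathbf{let}}\ \cdots \\
\quad \phantom{= \mathbf{let}}\ (e_m', \psi_m, \theta_m) = \mathcal{T}\ e_m\ \rho\ \psi_{m-1} \\
\quad \phantom{=}\ \mathbf{in}\ ([\![\mathtt{case}\ e'\ \mathtt{of}\ \{C_j\ \bar{x}_j{:}\bar{\tau}_j \to e_j'\}_{j=1}^{m}]\!],\ \psi_m,\ \theta_0 \cup \cdots \cup \theta_m)
\end{array}
\]

Figure 2: The partial evaluator $\mathcal{T}$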


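\[
\begin{array}{l}
\mathit{spectys}\ \bar{\tau} \\
\quad = \mathbf{let}\ \bar{\nu} = [\,\nu \mid \tau \leftarrow \bar{\tau};\ \nu = \mathbf{if}\ \tau \in \mathit{BoxedType}\ \mathbf{then}\ \ast\ \mathbf{else}\ \tau\,] \\
\quad \phantom{= \mathbf{let}}\ \bar{\pi} = [\,\tau \mid \tau \leftarrow \bar{\tau};\ \tau \in \mathit{BoxedType}\,] \\
\quad \phantom{=}\ \mathbf{in}\ (\bar{\nu}, \bar{\pi}) \\[8pt]
\mathit{specfn}\ x\ \bar{\nu}\ \rho\ \psi \\
\quad = \mathbf{let}\ (\forall\bar{\alpha}.\tau,\ \Lambda\bar{\alpha}.e) = \rho\ x \\
\quad \phantom{= \mathbf{let}}\ n = \mathit{length}\ \bar{\alpha} \\
\quad \phantom{= \mathbf{let}}\ e' = e[(\mathbf{if}\ \nu_i \not\equiv \ast\ \mathbf{then}\ \nu_i\ \mathbf{else}\ \alpha_i)/\alpha_i]_{i=1}^{n} \\
\quad \phantom{= \mathbf{let}}\ (e'', \psi_1, \theta_1) = \mathcal{T}\ e'\ \rho\ \psi' \\
\quad \phantom{= \mathbf{let}}\ \psi' = \psi[x \mapsto (\psi\ x)[x_{\bar{\nu}} \mapsto (\sigma', \hat{e})]] \\
\quad \phantom{= \mathbf{let}}\ \bar{\alpha}' = [\,\alpha_i \mid i \leftarrow [1..n];\ \nu_i \equiv \ast\,] \\
\quad \phantom{= \mathbf{let}}\ \hat{e} = \Lambda\bar{\alpha}'.e'' \\
\quad \phantom{= \mathbf{let}}\ \tau' = \tau[(\mathbf{if}\ \nu_i \not\equiv \ast\ \mathbf{then}\ \nu_i\ \mathbf{else}\ \alpha_i)/\alpha_i]_{i=1}^{n} \\
\quad \phantom{= \mathbf{let}}\ \sigma' = \forall\bar{\alpha}'.\tau' \\
\quad \phantom{=}\ \mathbf{in}\ (\psi_1, \theta_1)
\end{array}
\]

Figure 3: Specialisation functions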
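The function spectys, together with the type-level half of specfn, can be sketched in Haskell under some simplifying assumptions: the `Type` representation below is our own, `Nothing` plays the role of `*`, the expression-level substitution and the recursive call to the partial evaluator are omitted, and `spectysMono` is a hypothetical variant that specialises on every type argument.

```haskell
-- A simplified type representation (an assumption of this sketch).
data Type
  = TVar String         -- type variable (boxed)
  | TFun Type Type      -- function type (boxed)
  | TCon String [Type]  -- boxed data type
  | TPrimU String       -- unboxed primitive type such as Int#
  deriving (Eq, Show)

isBoxed :: Type -> Bool
isBoxed (TPrimU _) = False
isBoxed _          = True

-- spectys: the specialising vector (Nothing stands for *) and the
-- boxed type arguments that stay at the call site.
spectys :: [Type] -> ([Maybe Type], [Type])
spectys tys =
  ( [if isBoxed t then Nothing else Just t | t <- tys]
  , [t | t <- tys, isBoxed t] )

-- Variant that specialises on every argument, boxed or not.
spectysMono :: [Type] -> ([Maybe Type], [Type])
spectysMono tys = (map Just tys, [])

-- Substitute types for type variables.
subst :: [(String, Type)] -> Type -> Type
subst s (TVar a)    = maybe (TVar a) id (lookup a s)
subst s (TFun t u)  = TFun (subst s t) (subst s u)
subst s (TCon c ts) = TCon c (map (subst s) ts)
subst _ t           = t

-- Type part of specfn: from (quantified variables, body type) and a
-- specialising vector, compute the remaining quantifiers and the
-- specialised body type.
specType :: ([String], Type) -> [Maybe Type] -> ([String], Type)
specType (as, t) vs =
  ( [a | (a, Nothing) <- zip as vs]
  , subst [(a, v) | (a, Just v) <- zip as vs] t )

-- The boxed list type applied to an element type.
listOf :: Type -> Type
listOf t = TCon "List" [t]
```

Applied to the type of map and the type arguments `[Int#]` and `Int#`, `spectys` gives the vector `[Nothing, Just Int#]`, and `specType` then leaves one quantified variable with the result type specialised to lists of `Int#`.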

4 Practical Considerations
The specialiser described above interacts with a number of other language fea-
tures including: overloading, and separate module compilation. We address
these issues and discuss the process of introducing unboxing below.
4.1 Overloaded functions
In Haskell many primitive functions, such as comparison and addition, are
overloaded. This allows these operations to be applied to a number of different
types. For example, addition belongs to the class Num

class Num a where
  (+) :: a -> a -> a
  ...

which has instances for types such as Int, Integer, Float and Complex. Each
of these instances provides a definition of the function which is called when it
is used at that type. For example:

instance Num Int where
  (+) x y = plusInt x y
  ...

Since an overloaded function can now be applied to an unboxed type (it was
prohibited by the unboxing restriction ($\ast$) before), it makes sense to introduce
new instance declarations for these unboxed types. For example:

instance Num Int# where
  (+) x y = plusInt# x y
  ...

This allows us to manipulate unboxed values in the same way as we manipulate
their boxed counterparts, greatly reducing the code modifications required
when introducing unboxing. It also overloads the literals, allowing us to write
1 instead of 1#, where an Int# is required.

4.2 Character I/O


In Haskell I/O is often a major performance bottleneck. One reason for this is
that the I/O operations read and write strings i.e. [Char]. Given the ability to
manipulate unboxed values directly, it would be nice to extend the I/O system
to provide the ability to read and write strings containing unboxed characters
i.e. [Char#], as well. One approach would be to introduce a parallel set of I/O
operations, such as appendChan#, which read and write [Char#]. This would
give the programmer the ability to choose unboxed I/O if desired.
Unfortunately these unboxed I/O operations must be used explicitly (since
they require the # in the name). We are currently exploring an alternative
approach which overloads the original I/O operations, enabling them to output
lists of Char or lists of Char#.
4.3 Separate module compilation
In a language with separate module compilation type information flows from
the defining module to the importing module. However, specialisation requires
information about the use of a function to flow from the importing module back
to the defining module. For example, consider the module structure:

module Tree (Tree(..), maptree) where
data Tree k a = Leaf k a | Branch k (Tree k a) (Tree k a)
maptree :: (a->b) -> Tree k a -> Tree k b
maptree f t = ...

module Use where
import Tree (Tree, maptree)
unbox_inttree :: Tree Int# Int -> Tree Int# Int#
unbox_inttree inttree = maptree int_to_int# inttree
In this example, module Use requires the maptree_*_Int#_Int# version of the
imported function maptree. However, since Use imports Tree, Tree must be
compiled before Use. When we compile Tree there is no requirement to create
the maptree_*_Int#_Int# version of maptree since we have no information
about module Use. When we subsequently compile Use we are faced with the
problem that the required version of maptree has not been created.
One simple solution is to place this responsibility on the programmer: re-
quiring them to request any specialised versions, which are not automatically
generated, using pragmas. For example:
{-# SPECIALISE maptree :: (Int->Int#) -> Tree Int# Int
-> Tree Int# Int# #-}
The SPECIALISE pragma is converted to the corresponding second-order type
application: maptree {Int Int# Int#}. This is then processed by the partial
evaluator and the specialised version maptree_*_Int#_Int# is produced. The
specialised versions of the Tree data type, Tree_Int#_* and Tree_Int#_Int#,
will also be created.
The existence of all specialised versions created is recorded in the mod-
ule's interface. If any specialised versions required by an importing module are
not in the interface an error message is generated and the programmer has to
add the appropriate specialise pragma to the declaring module and recompile.
Unfortunately, the amount of programmer intervention and recompilation re-
quired is very unsatisfactory. To reduce these overheads, we plan to develop a
scheme which automatically propagates the SPECIALISE pragmas back to the
appropriate source modules and only recompiles once.
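For comparison, present-day GHC accepts a pragma with the same name in the defining module. The sketch below is an illustration of that GHC feature only, not of the scheme in this paper: GHC's pragma asks for a copy of an overloaded function specialised on its class dictionary, whereas the pragma described above requests a version specialised on unboxed type arguments. The function `sumSquares` is a hypothetical example.

```haskell
-- An overloaded function exported from its defining module.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0

-- Request a copy specialised to Int in the defining module, so that
-- importing modules using it at Int can call the specialised copy.
{-# SPECIALISE sumSquares :: [Int] -> Int #-}
```

As in the text, the request is written next to the definition because that is where the function body is available to the compiler.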
4.4 Introducing unboxing
In a lazy language, the programmer has to be careful when introducing unbox-
ing, since an unboxed value is also strict. It is only safe to introduce unboxing
where the implied strictness does not cause the program to bottom. This is
normally not a problem, since the programmer is usually aware of the strictness
implications.
We suggest that the programmer ensure that any intended unboxing is
made explicit by introducing data type declarations with unboxed components
or explicit type signatures for unboxing polymorphic data types. For example,
the intention to unbox the list of prime numbers could be specified using the
type signature:
primes :: [Int#]

After introducing this unboxing signature, type errors may occur where the
unboxed data structure is created and used.⁴ In modifying the code to correct
these type errors the programmer has to introduce explicit unboxing/boxing
coercions at the "boundaries" of the unboxed values. We believe this is a "good
thing", since the programmer is forced to identify these boundaries and consider
the strictness implications. If the unboxing does cause the program to bottom
the boundaries can be moved or the unboxing modifications abandoned.
It remains to be seen what the practical overheads of introducing this form
of explicit unboxing are. However, we believe that when performance is an issue,
and resources are allocated to improving it,⁵ it is essential that the programmer
has access to language features, such as this, which enable them to optimise
the execution.
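A boundary of this kind can be sketched in present-day GHC, where the constructor `I#` from `GHC.Exts` serves as the explicit coercion between Int and Int#. GHC and these names are assumptions, not part of this paper, and `sumTo` is a hypothetical example that assumes a non-negative argument.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#), (-#))

-- The pattern I# n is the inward boundary: it forces the boxed argument
-- and exposes the unboxed integer. The outer I# is the outward boundary.
-- Assumes a non-negative argument; a negative one would not terminate.
sumTo :: Int -> Int
sumTo (I# n) = I# (go 0# n)
  where
    -- The loop itself only ever sees unboxed integers.
    go :: Int# -> Int# -> Int#
    go acc 0# = acc
    go acc k  = go (acc +# k) (k -# 1#)
```

The strictness implication is visible at the inward boundary: `sumTo undefined` diverges as soon as the pattern is matched, so the programmer must decide that forcing the argument at this point is acceptable.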

⁴ These boundary type errors will not occur where the unboxed values are created/used
by overloaded functions which have instances for the unboxed type (see Section 4.1).
⁵ We would also advocate that such improvements are carefully directed at the actual
hot-spots identified by an execution profiler [18].

5 Preliminary Results
We have not yet completed the implementation of the specialiser. However,
we do have some preliminary results for programs in which we have introduced
some unboxing and performed the necessary specialisation by hand. These are
summarised in Figure 4. The Unbox column reports the modifications required
to introduce the unboxing while the Boundary column reports the modifications
required to coerce data at the boundaries of the unboxing (see Section 4.4). The
Specialise and Overload columns report the modifications which we expect to
be automated (either by specialisation or as a result of extending the class
operations to the unboxed types). The small number of changes required to
introduce the unboxing is very encouraging.

Program      Brief Description                        Unboxing Modifications
clausify     converts logical formulae to their       unbox the character symbols
             clausal form [17,18]
life         list based implementation of             unbox the integers used in the
             Conway's Life algorithm [2]              board representation
pseudoknot   floating point intensive molecular       unbox all integers and floating
             biology application [5]                  point numbers

             #lines   #lines modified/added
Program      code     Unbox   Boundary   Specialise   Overload   Speedup
clausify      112       1        3          25            3      1.25x
life           75       1        1          42            9      1.06x
pseudoknot   3146       3        1          10         2366⁶     4.42x

Figure 4: Preliminary results

6 Related Work
Other treatments of polymorphism in the presence of unboxed values fall into
two categories. The first automatically introduces coercions which box/unbox
values when they are passed to/from a polymorphic function [6,10,13,20]. The
cost of creating and manipulating boxed values is only incurred when
polymorphic code is used. This has the unfortunate consequence that it penalises
the performance of polymorphic code, since unboxing is not possible.
The second approach compiles each polymorphic function in such a way
that it can manipulate unboxed values. This is done by passing enough addi-
tional information at runtime to describe the representation of the values being
manipulated [4,14]. This scheme also penalises the performance of polymor-
phic code since it must interpret the representation information. It has the
unfortunate property that the performance penalty is paid even when the code
is manipulating boxed values.
In contrast, our scheme generates specialised versions of polymorphic
functions and data types which directly manipulate unboxed values. The
performance of polymorphic code is not penalised since the polymorphism is removed
precisely where it would impose a performance penalty. It also enables
arbitrary data types to contain unboxed components. Traded off against this is the
resulting code expansion and the difficulties associated with separate module
compilation.

⁶ The large number of modifications for pseudoknot was due to the amount of literal data
(about 70% of the program) which had to be unboxed by adding #s.

In a strict language, such as ML, both boxed and unboxed values have the
same semantics. Consequently, the approaches to unboxing in strict languages
focus on automatically unboxing values, because doing so is always possible
when the type is known at compile time [4,6,10,14,20]. In a non-strict language,
such as Haskell, unboxed values can only be introduced if we can be sure that
the implied strictness will not change the behaviour of the program. Rather
than relying on the often poor results of a strictness analyser, we ask the
programmer to indicate where the unboxing is to be introduced. A similar
approach is taken by Nöcker & Smetsers [13]. They require the programmer to
introduce explicit strictness annotations which they then use to safely unbox
values.
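The semantic difference that makes unboxing unsafe in a non-strict language can be seen in a small sketch. It uses GHC's strictness annotation (!) on a constructor field as a stand-in for the strictness implied by an unboxed field; the types Lazy and Strict and the two functions are hypothetical names chosen for the example.

```haskell
module Main where

import Control.Exception (ErrorCall, evaluate, try)

-- A lazy field may hold an unevaluated (possibly undefined) value.
data Lazy = Lazy Int

-- A strict field is evaluated when the constructor is built, which is
-- the behaviour an unboxed field would also have.
data Strict = Strict !Int

ignoreLazy :: Lazy -> String
ignoreLazy (Lazy _) = "field never demanded"

ignoreStrict :: Strict -> String
ignoreStrict (Strict _) = "field never demanded"

main :: IO ()
main = do
  -- The lazy version never looks at the field, so the error is not raised.
  putStrLn (ignoreLazy (Lazy (error "bottom")))
  -- Building the strict constructor forces the field and raises the error.
  r <- try (evaluate (ignoreStrict (Strict (error "bottom"))))
         :: IO (Either ErrorCall String)
  putStrLn (either (const "strict field forced the error") id r)
```

The two functions are textually identical, yet only the lazy one terminates normally on an undefined field, which is why the decision about where to introduce unboxing is left to the programmer.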
The use of partial evaluation to produce specialised code is not new. Hall [3]
uses partial evaluation of special type arguments to create specialised versions
which produce and consume an optimised list representation, while both Augustsson
[1] and Jones [9] use partial evaluation to eliminate the overheads
of overloading by creating versions which are specialised on their dictionary
arguments.7 Jones [9] also proposes an overloaded implementation of data
types which causes the specialisation of overloading to specialise the data types
as well. Unfortunately this requires a more powerful system of type classes.
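Specialising on a dictionary argument can be sketched as follows. The record NumDict is a hypothetical, explicit stand-in for the dictionary that the translation of a type class introduces, and sumInt is written by hand to show the kind of result that partially evaluating sumWith at intDict would give; it is not the output of any of the cited systems.

```haskell
module Main where

-- Explicit dictionary standing in for an overloaded class (hypothetical).
data NumDict a = NumDict
  { plus :: a -> a -> a
  , zero :: a
  }

intDict :: NumDict Int
intDict = NumDict { plus = (+), zero = 0 }

-- An overloaded function after translation: the dictionary is an
-- ordinary argument that is consulted at run time.
sumWith :: NumDict a -> [a] -> a
sumWith d = foldr (plus d) (zero d)

-- The version specialised on intDict: the dictionary argument and the
-- field selections have been eliminated.
sumInt :: [Int] -> Int
sumInt = foldr (+) 0

main :: IO ()
main = print (sumWith intDict [1, 2, 3], sumInt [1, 2, 3])
```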

7 Future Work
Our immediate goal is to complete the implementation of the specialiser, and
develop a scheme for automatically propagating information about the required
specialisations back to the declaring module. This should enable us to exper-
iment with the unboxing of large programs by examining the practicalities of
introducing the unboxing and the performance improvements which result.
We also plan to experiment with monomorphisation in general. By modifying
the definition of spectys (Figure 3) the partial evaluator can be directed to
introduce an arbitrary degree of monomorphisation. For example, if we define
spectys  = (; [ ]) we get a completely monomorphic program. Our intention
is to explore the practical benefits of optimisations which require monomorphic
code to produce good results.

Bibliography
[1] L Augustsson, "Implementing Haskell overloading," Conference on Functional
Programming Languages and Computer Architecture, Copenhagen,
Denmark, June 1993, 65–73.
[2] M Gardner, "Wheels, Life and Other Mathematical Amusements," W.H.
Freeman and Company, New York, 1993.

7 Within our compiler we also use our partial evaluator to eliminate overloading by spe-
cialising on all overloaded type arguments, in addition to any unboxed type arguments. Care
must be taken to ensure that the dictionary argument(s), introduced by the translation into
the Core language, are also eliminated.
[3] CV Hall, "Using Hindley-Milner type inference to optimise list representation,"
Conference on Lisp and Functional Programming, Orlando,
Florida, June 1994.
[4] R Harper & G Morrisett, "Compiling Polymorphism Using Intensional
Type Analysis," Technical Report CMU-CS-94-185, School of Computer
Science, Carnegie Mellon University, Sept 1994.
[5] PH Hartel et al., "Pseudoknot: a float-intensive benchmark for functional
compilers," in Proc Sixth International Workshop on the Implementation
of Functional Languages, Norwich, JRW Glauert, ed., University of East
Anglia, Norwich, Sept 1994.
[6] F Henglein & J Jørgensen, "Formally optimal boxing," 21st ACM Symposium
on Principles of Programming Languages, Portland, Oregon, Jan
1994, 213–226.
[7] P Hudak, SL Peyton Jones, PL Wadler, Arvind, B Boutel, J Fairbairn, J
Fasel, M Guzman, K Hammond, J Hughes, T Johnsson, R Kieburtz, RS
Nikhil, W Partain & J Peterson, "Report on the functional programming
language Haskell, Version 1.2," ACM SIGPLAN Notices 27(5), May 1992.
[8] John Hughes, "Why functional programming matters," The Computer
Journal 32(2), April 1989.
[9] MP Jones, "Partial evaluation for dictionary-free overloading," Research
Report YALE/DCS/RR-959, Dept of Computer Science, Yale University,
April 1993.
[10] X Leroy, "Unboxed objects and polymorphic typing," 19th ACM Symposium
on Principles of Programming Languages, Albuquerque, New Mexico,
Jan 1992, 177–188.
[11] JC Mitchell & R Harper, "On the type structure of Standard ML," ACM
Transactions on Programming Languages and Systems 15(2), April 1993,
211–252.
[12] R Morrison, A Dearle, RCH Conner & AL Brown, "An ad-hoc approach to
the implementation of polymorphism," ACM Transactions on Programming
Languages and Systems 13(3), July 1991, 342–371.
[13] E Nöcker & S Smetsers, "Partially strict non-recursive data types," Journal
of Functional Programming 3(2), April 1993, 191–217.
[14] A Ohori & T Takamizawa, "A polymorphic unboxed calculus and efficient
compilation of ML," Research Institute for Mathematical Sciences, Kyoto
University, Japan, 1994.
[15] SL Peyton Jones, CV Hall, K Hammond, WD Partain & PL Wadler,
"The Glasgow Haskell compiler: a technical overview," Joint Framework
for Information Technology (JFIT) Technical Conference Digest, Keele,
March 1993, 249–257.
[16] SL Peyton Jones & J Launchbury, "Unboxed values as first class citizens,"
Conference on Functional Programming Languages and Computer
Architecture, Cambridge, Massachusetts, Sept 1991.
[17] C Runciman & D Wakeling, "Heap profiling of lazy functional programs,"
Journal of Functional Programming 3(2), April 1993, 217–245.
[18] PM Sansom, "Execution profiling for non-strict functional languages,"
PhD thesis, Research Report FP-1994-09, Dept of Computing Science,
University of Glasgow, Sept 1994.
[19] PM Sansom & SL Peyton Jones, "Time and space profiling for non-strict,
higher-order functional languages," 22nd ACM Symposium on Principles
of Programming Languages, San Francisco, California, Jan 1995.
[20] PJ Thiemann, "Unboxed values and polymorphic typing revisited," in
Proc Sixth International Workshop on the Implementation of Functional
Languages, Norwich, JRW Glauert, ed., University of East Anglia, Norwich,
Sept 1994.
