0% found this document useful (0 votes)
506 views

Domain Specific Language Implementation With Dependent Types

The document discusses a talk about implementing domain specific languages with dependent types in Idris. It begins with an introduction to Idris and dependent types, showing how types can be parameterized by values. It discusses functions over dependent types and how their types can specify invariants. It also covers dependent pairs, the with rule, and theorem proving in Idris. The document provides an example of representing natural numbers as binary values and proving the representation is accurate using the type system.

Uploaded by

Edwin Brady
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
506 views

Domain Specific Language Implementation With Dependent Types

The document discusses a talk about implementing domain specific languages with dependent types in Idris. It begins with an introduction to Idris and dependent types, showing how types can be parameterized by values. It discusses functions over dependent types and how their types can specify invariants. It also covers dependent pairs, the with rule, and theorem proving in Idris. The document provides an example of representing natural numbers as binary values and proving the representation is accurate using the type system.

Uploaded by

Edwin Brady
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 34

Domain Specific Language

Implementation with Dependent Types


Edwin Brady
[email protected]

University of St Andrews
FFLunch, October 26th 2009

FFLunch, October 26th 2009 – p.1/28


Introduction

This talk will be in two parts:

 An introduction to functional programming with


dependent types, using the language Idris.
 Data types and functions
 Techniques: Dependent pairs, with rule, views,
theorem proving
 Domain Specific Language (DSL) implementation.
 A type safe interpreter
 Code generation via specialisation
 Verification of resource usage
 Preliminary results
FFLunch, October 26th 2009 – p.2/28
Idris

Idris is an experimental purely functional language with


dependent types
(http://www. s.st-and.a .uk/~eb/Idris).

 Compiled, via C, with reasonable performance


(more on this later).
 Loosely based on Haskell, similarities with Agda,
Epigram.
 Some features:
 Primitive types (Int, String, Char, . . . )
 Interaction with the outside world via a C FFI.
 Integration with a theorem prover, Ivor.

FFLunch, October 26th 2009 – p.3/28


What are Dependent Types?

Dependent types allow types to be parameterised by


values, giving a more precise description of data.
Some data types in Idris:

FFLunch, October 26th 2009 – p.4/28


What are Dependent Types?

Dependent types allow types to be parameterised by


values, giving a more precise description of data.
Some data types in Idris:

data ListInt = Nil | Cons Int ListInt; -- Monomorphi list.

FFLunch, October 26th 2009 – p.4/28


What are Dependent Types?

Dependent types allow types to be parameterised by


values, giving a more precise description of data.
Some data types in Idris:

data List a = Nil | Cons a (List a); -- Polymorphi list.

FFLunch, October 26th 2009 – p.4/28


What are Dependent Types?

Dependent types allow types to be parameterised by


values, giving a more precise description of data.
Some data types in Idris:

data List : # -> # where -- Alternative syntax.


-- # is the type of types.
Nil : List a
| Cons : a -> List a -> List a;

FFLunch, October 26th 2009 – p.4/28


What are Dependent Types?

Dependent types allow types to be parameterised by


values, giving a more precise description of data.
Some data types in Idris:

data Nat = O | S Nat;

data Ve t : # -> Nat -> # where -- List with size (ve tor).
VNil : Ve t a O
| VCons : a -> Ve t a k -> Ve t a (S k);

FFLunch, October 26th 2009 – p.4/28


What are Dependent Types?

Dependent types allow types to be parameterised by


values, giving a more precise description of data.
Some data types in Idris:

data Nat = O | S Nat;


infixr 5 :: ; -- Define an infix operator

data Ve t : # -> Nat -> # where -- List with size (ve tor).
VNil : Ve t a O
| (::) : a -> Ve t a k -> Ve t a (S k);

We say that Ve t is parameterised by the element type


and indexed by its length.
FFLunch, October 26th 2009 – p.4/28
Functions

The type of a function over vectors describes invariants


of the input/output lengths.
e.g. the type of vAdd expresses that the output length is
the same as the input length:
vAdd : Ve t Int n -> Ve t Int n -> Ve t Int n;
vAdd VNil VNil = VNil;
vAdd (x :: xs) (y :: ys) = x + y :: vAdd xs ys;

The type checker works out the type of n implicitly, from


the type of Ve t.

FFLunch, October 26th 2009 – p.5/28


Dependent Pairs

Sometimes, we don’t know what the index of a return


type should be — especially if it is read from user input.
readVe reads user input and adds to an accumulator:
readVe : Ve t Int n -> IO ( p | Ve t Int p );
readVe xs = do { putStr "Number: ";
val <- getInt;
if val == -1 then return << _, xs >>
else (readVe (val :: xs));
};

Dependent pairs pair a value with a predicate on that


value.

FFLunch, October 26th 2009 – p.6/28


Dependent Pairs

Syntax:

 (p : T | P) — dependent pair type, where p, of type T


is in scope in P. Type annotation T is optional.
 << x , y >> — dependent pair value.
Examples:
x : (n : Nat | LT (S (S O)) n); -- a number greater than 2
x = << (S (S (S O))), ltO >>;

y : (n:Nat)->(p | LE (mult n n) p /\ LT p (mult (S n) (S n)));


-- approximate square root fun tion
-- (good lu k writing this...)
FFLunch, October 26th 2009 – p.7/28
The with Rule

The with rule allows dependent pattern matching on


intermediate values:
vfilter : (a -> Bool) -> Ve t a n -> (p | Ve t a p);
vfilter f VNil = << _, VNil >>;
vfilter f (x :: xs) with (f x, vfilter xs f) {
| (True, << _, xs' >>) = << _, x :: xs' >>;
| (False, << _, xs' >>) = << _, xs' >>;
}

The underscore _ means either match anything (on the


left of a clause) or infer a value (on the right).

FFLunch, October 26th 2009 – p.8/28


Libraries

Libraries can be imported via in lude "lib.idr". All


programs automatically import prelude.idr which
includes, among other things:
 Primitive types Int, String and Char, plus Nat, Bool
 Tuples, dependent pairs.
 Fin, the finite sets.
 List, Ve t and related functions.
 Maybe and Either
 The IO monad, and foreign function interface.

FFLunch, October 26th 2009 – p.9/28


An example

Consider this problem:

Given a natural number, convert it to binary.


How can we use the type system to confirm
that the binary representation is an accurate
representation of the original?

FFLunch, October 26th 2009 – p.10/28


Binary representation

We can write a type for binary numbers which expresses


its relationship to a natural number:
data Binary : Nat -> # where
bEnd : Binary O
| bO : Binary n -> Binary (plus n n)
| bI : Binary n -> Binary (S (plus n n));

The conversion function’s type will guarantee that the


result is a representation of the input. . .
natToBin : (n:Nat) -> Binary n;

. . . but how do we write it?

FFLunch, October 26th 2009 – p.11/28


Building Binary Numbers

Perhaps we could repeatedly halve a number, and record


its parity to get each bit in turn:
half : (n:Nat) -> (Nat & Bool);
half O = (O, False);
half (S O) = (O, True);
half (S (S k)) with half k {
| (j, parity) = (S j, parity);
}

But. . .

FFLunch, October 26th 2009 – p.12/28


Building Binary Numbers

. . . why does this not type check?


natToBin : (n:Nat) -> Binary n;
natToBin O = bEnd;
natToBin (S k) with half k {
| (j, False) = bI (natToBin j);
| (j, True) = bO (natToBin (S j));
}

FFLunch, October 26th 2009 – p.13/28


Building Binary Numbers

. . . why does this not type check?


natToBin : (n:Nat) -> Binary n;
natToBin O = bEnd;
natToBin (S k) with half k {
| (j, False) = bI (natToBin j);
| (j, True) = bO (natToBin (S j));
}

Idris reports:
Can't unify Binary (S k) and Binary (S (plus j j))

But we know that k=plus j j, because that’s how we


wrote half. How can we convince the typechecker?
FFLunch, October 26th 2009 – p.13/28
The Trouble with Bool. . .

. . . is that a value of type Bool throws away the


knowledge gained in computing it. Instead, we create a
more informative type for holding the result of halving a
number:
data Half : Nat -> # where
even : (n:Nat) -> Half (plus n n)
| odd : (n:Nat) -> Half (S (plus n n));

half : (n:Nat) -> Half n;

Now whenever we halve a number, the type explains


where the result came from.

FFLunch, October 26th 2009 – p.14/28


Halving a Nat

Half is a view of Nat. Any Nat can be obtained by


doubling another Nat, and possibly adding one:
half : (n:Nat) -> Half n;
half O = even O;
half (S O) = odd O;
half (S (S k)) with half k {
half (S (S (plus j j))) | even j ?= even (S j); [hSe℄
half (S (S (S (plus j j)))) | odd j ?= odd (S j); [hSo℄
}

The ?= indictes that the indices of the left and right are
different. [hSe℄ and [hSo℄ are lemmas to be proved.

FFLunch, October 26th 2009 – p.15/28


Theorem Proving

To complete the definition, we’ll need to prove hSe and


hSo. Their types show how the LHS and RHS differ:
Idris> :t hSe
(j : Nat) -> (value : Half (S (plus j (S j)))) ->
Half (S (S (plus j j)))
Idris> :t hSo
(j : Nat) -> (value : Half (S (S (plus j (S j))))) ->
Half (S (S (S (plus j j))))

We can do these interactively [quick demo], and paste


the result into the program.

FFLunch, October 26th 2009 – p.16/28


Theorem Proving

When we complete the proof, we paste the script into the


program. In this case:
hSe proof {
%intro;
%use value;
%rewrite plus_nSm {n=j} {m=j};
%refl;
%qed;
};

I generally put such things at the end — much like an


appendix to a paper!

FFLunch, October 26th 2009 – p.17/28


Building Binary Numbers

Using the new version of half, we have enough


information in the type to build a binary number, and
prove it’s the right one:
natToBin : (n:Nat) -> Binary n;
natToBin O = bEnd;
natToBin (S k) with half k {
natToBin (S (plus j _)) | even _ = bI (natToBin j);
natToBin (S (S (plus j _))) | odd _ ?= bO (natToBin (S j));
[ntbOdd℄
}

[ntbOdd℄ has the same structure as [hSe℄ and [hSo℄.

FFLunch, October 26th 2009 – p.18/28


Aside — Termination

Many dependently typed languages require that all


programs are total (i.e. terminate, with an answer) for
good reasons:

 Decidability of type checking


 Strong correctness proofs

Idris checks for totality (structural recursion, strictly


positive data types, coverage), but does not require it.
 If it did, natToBin would have been harder to write!
The type checker will not unfold non-terminating
definitions, and we cannot make total correctness claims.
But partial correctness may often be good enough!
FFLunch, October 26th 2009 – p.19/28
A Type Safe Interpreter

A common introductory example to dependent types is


the type safe interpreter. The pattern is:

 Define a data type which represents the language


and its typing rules.
 Write an interpreter function which evaluates this
data type directly.

[demo: interp.idr]

FFLunch, October 26th 2009 – p.20/28


A Type Safe Interpreter

Notice that when we run the interpreter on functions


without arguments, we get a translation into Idris:
Idris> interp Empty test
\ x : Int . \ x0 : Int . x0 + x
Idris> interp Empty double
\ x : Int . x+x

Idris implements %spe and %freeze annotations which


control the amount of evaluation at compile time.

[demo: interp.idr again]

FFLunch, October 26th 2009 – p.21/28


A Type Safe Interpreter

We have partially evaluated these programs. If we can


do this reliably, and have reasonable control over, e.g.,
inlining, then we have a good recipe for efficient DSL
implementation:

 Define the language data type


 Write the interpreter
 Specialise the interpreter w.r.t. real programs

If we trust the host language’s type checker and code


generator — admittedly we still have to prove this, but
only once! — then we can trust the DSL implementation.

FFLunch, October 26th 2009 – p.22/28


Resource Usage Verification

We have applied the type safe interpreter approach to a


family of domain specific languages with resource usage
properties, in their type:
 File handling
 Memory usage
 Concurrency (locks)
 Network protocol state

FFLunch, October 26th 2009 – p.23/28


Example — file handling

The resource aware DSLs track resource usage in their


type:
data Lang : (Ve t FileState n) -> (Ve t FileState n') ->
Ty -> # where
-- ...
| OPEN : (p:Purpose) -> (fd:Filepath) ->
(Lang ts (sno ts (Open p)) (TyHandle (S n)))
| GETLINE : (i:Fin n) -> (p:OpenH i Reading ts) ->
(Lang ts ts (TyLift String));

The interpreter tracks an environment of resources:


interp : (Env ts) -> (Lang ts ts' T) ->
(IO (Env ts' & interpTy T));
FFLunch, October 26th 2009 – p.24/28
Example — file handling

For example, here is a program in the file handling DSL


which copies the contents of a file:
opy : Filepath -> Filepath -> FileSafe TyUnit;
opy infile outfile = program 2 -- requires 2 handles
do { fh1 <- OPEN Reading infile;
fh2 <- OPEN Writing outfile;
WHILE (do { e <- EOF (handle fh1) ?;
return (not e); })
(do { str <- GETLINE (handle fh1) ?;
PUTLINE (handle fh2) str ?;
});
CLOSE (handle fh1) ?; -- proof obligations
CLOSE (handle fh2) ?;
}; FFLunch, October 26th 2009 – p.25/28
Example — file handling

By specialising the interpreter for this language, we


obtain a version of opy without the interpreter overhead:
opyS = \inf, outf => interp Empty ( opy inf outf) %spe ;

In a later talk (and a paper I’ve nearly started writing) I’ll


explain how we can handle. . .

 Control structures
 Recursive definitions
 Code duplication

. . . by following some simple, concrete rules about how to


define the interpreter.
FFLunch, October 26th 2009 – p.26/28
Preliminary Results

. . . are very preliminary. but promising. For the file


example, copying a large file (/usr/share/di t/words)
ten times:
Version Time
Idris, specialised 0.571s
Idris, interpreted 2.762s
C implementation 0.329s
Java implementation 1.833s
The overhead over the C version appears to be caused
by the garbage collector. I’m investigating further. . .
More results to come: file processing, parsing, network
transport protocol.
FFLunch, October 26th 2009 – p.27/28
Conclusion

Dependent types allow us to implement embedded DSLs


with rich specification/verification. Also:

 We need an evaluator for type checking anyway, so


why not use it for specialisation?
 Related to MetaOCaml/Template Haskell, but
free!
 If (when?) we trust the Idris type checker and
code generator, we can trust our DSL.
 DSL programs will be as efficient as we can
make Idris.
 More experimental data needed: what about
programs with more computation than IO?
FFLunch, October 26th 2009 – p.28/28

You might also like