Gregory Meredith

L.G. Meredith

© Draft date June 9, 2010
Contents

Preface

2 Toolbox
  2.1 Introduction to notation and terminology
    2.1.1 Scala
    2.1.2 Maths
  2.2 Introduction to core design patterns
    2.2.1 A little history
Preface
The book you hold in your hands, Dear Reader, is not at all what you expected...
Chapter 1
Where are we; how did we get here; and where are we going?
now. Or maybe you've got a nasty perf issue you want to address and are looking here for a resolution. If that's the case, then maybe this book isn't for you, because this book is really about a point of view, a way of looking at programming and computation. In some sense this book is all about programming and complexity management, because that's really the issue the professional programmer is up against today. On average, the modern programmer building an Internet-based application is dealing with no fewer than a dozen technologies. They are attempting to build applications with nearly continuous operation, 24x7 availability, servicing hundreds to thousands of concurrent requests. They are overwhelmed by complexity. What the professional programmer really needs are tools for complexity management. The principal aim of this book is to serve that need in that community.
The design patterns expressed in this book have been developed for nearly fifty
years to address exactly those concerns. Since Scala isn’t nearly fifty years old you
can guess that they have origins in older technologies, but Scala, it turns out, is
an ideal framework in which both to realize them and to talk about their ins and
outs and pros and cons. However, since they don’t originate in Scala, you can also
guess that they have some significant applicability to the other eleven technologies
the modern professional programmer is juggling.
of dealing with these different aspects of simultaneous execution are not up to the task of supporting development at this scale. The core issue is complexity. The modern application developer is faced with a huge range of concurrency and concurrency-control models, from transactions in the database to message-passing between server components. Whether to partition her data is no longer the question; she's thinking hard about how to partition her data and whether or not this "eventual consistency" thing is going to liberate her or bring on a new host of programming nightmares. By comparison, threads packages seem like quaint relics from a time when concurrent programming was a little hobby project she did after hours. The modern programmer needs to simplify her life in order to maintain a competitive level of productivity.
Functional programming provides a sort of transition technology. On the one hand, it's not that much of a radical departure from mainstream programming like Java. On the other, it offers a simple, uniform model that introduces a number of key features that considerably improve productivity and maintainability. Java brought the C/C++ programmer several steps closer to a functional paradigm, introducing garbage collection, type abstractions such as generics, and other niceties. Languages like OCaml, F# and Scala go a step further, bringing the modern developer into contact with higher-order functions, the relationship between types and pattern matching, and powerful abstractions like monads. Yet functional programming does not embrace concurrency and distribution in its foundations. It is not based on a model of computation, like the actor model or the process calculi, in which the notion of execution is fundamentally concurrent. That said, it meshes nicely with a variety of concurrency programming models. In particular, the combination of higher-order functions (with the ability to pass functions as arguments and return functions as values) together with the structuring techniques of monads makes models such as software transactional memory or data-flow parallelism quite easy to integrate, while pattern matching additionally makes message-passing style easier to incorporate.
To illustrate the point, note that these changes in hardware have impacted hardware memory models. This has a much greater impact on the C/C++ family of languages than on Java, because the latter is built on an abstract machine that not only hides the underlying hardware memory model but, more importantly, can hide changes to the model. One may, in fact, contemplate an ironic future in which this abstraction alone causes managed code to outperform C/C++ code, because of C/C++'s faulty assumptions about the best use of memory that percolate all through application code. Secondly, it completely changes the landscape for language development. By providing a much higher-level and more uniform target for language execution semantics, it lowers the barrier to entry for contending language designs. It is not surprising, therefore, that we have seen an explosion in language proposals in the last several years, including Clojure, Fortress, Scala, F# and many others. It should not escape notice that all of the languages in that list are either functional or object-functional languages, and the majority of the proposals coming out are either functional, object-functional or heavily influenced by functional language design concepts.
to do with side-effecting operations and I/O, the underlying semantic model did not seem well-suited to address those kinds of computations. And yet, not only are side-effecting computations, and especially I/O, ubiquitous, using them led (at least initially) to considerably better performance. Avoiding those operations (sometimes called functional purity) seemed to be an academic exercise not well suited to writing "real world" applications.

However, while many industry shops were throwing out functional languages, except for niche applications, work was going on that would reverse this trend. One of the key developments in this was an early bifurcation of functional language designs at a fairly fundamental level. The Lisp family of languages are untyped and dynamic. In the modern world the lack of typing might seem egregiously unmaintainable, but by comparison to C it was more than made up for by the kind of dynamic meta-programming that these languages made possible. Programmers enjoyed a certain kind of productivity because they could "go meta" – writing programs to write programs (even dynamically modifying them on the fly) – in a uniform manner. This sort of feature has become mainstream, as found in Ruby or even Java's reflection API, precisely because it is so extremely useful. Unfortunately, the productivity gains of meta-programming available in Lisp and its derivatives were not enough to offset the performance shortfalls at the time.
There was, however, a statically typed branch of functional programming that
began to have traction in certain academic circles with the development of the ML
family of languages – which today includes OCaml, the language that can be consid-
ered the direct ancestor of both Scala and F#. One of the very first developments
in that line of investigation was the recognition that data description came in not
just one but two flavors: types and patterns. The two flavors, it was recognized,
are dual. Types tell the program how data is built up from its components while
patterns tell a program how to take data apart in terms of its components. The
crucial point is that these two notions are just two sides of the same coin and can
be made to work together and support each other in the structuring and execution
of programs. In this sense the development – while an enrichment of the language
features – is a reduction in the complexity of concepts. Both language designer and
programmer think in terms of one thing, description of data, while recognizing that
such descriptions have uses for structuring and de-structuring data. These are the
origins of elements in Scala’s design like case classes and the match construct.
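To make the duality concrete, here is a minimal sketch of our own (the Shape types are purely illustrative): the case class declarations tell the compiler how the data is built up, while the match clauses tell it how the data is taken apart.

sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Rect(width: Double, height: Double) extends Shape

def area(s: Shape): Double = s match {
  case Circle(r)  => math.Pi * r * r // the pattern takes apart what the constructor built up
  case Rect(w, h) => w * h
}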
The ML family of languages also gave us the first robust instantiations of para-
metric polymorphism. The widespread adoption of generics in C/C++, Java and C#
says much more about the importance of this feature than any impoverished account
the author can conjure here. Again, though, the moral of the story is that this
represents a significant reduction in complexity. Common container patterns, for
example, can be separated from the types they contain, allowing for programming
that is considerably DRYer. (DRY is the pop culture term for the "Don't Repeat Yourself" principle. Don't make me say it again.)
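For instance, a container pattern written once, for any element type (a minimal sketch; the Stack is illustrative):

case class Stack[A](items: List[A] = Nil) {
  def push(a: A): Stack[A] = Stack(a :: items)
  def pop: Option[(A, Stack[A])] = items match {
    case h :: t => Some((h, Stack(t)))
    case Nil    => None
  }
}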
Still these languages suffered when it came to a compelling and uniform treat-
ment of side-effecting computations. That all changed with Haskell. In the mid-80’s
a young researcher by the name of Eugenio Moggi observed that an idea previ-
ously discovered in a then obscure branch of mathematics (called category theory)
offered a way to structure functional programs to allow them to deal with side-
effecting computations in a uniform and compelling manner. Essentially, the notion
of a monad (as it was called in the category theory literature) provided a language
level abstraction for structuring side-effecting computations in a functional setting.
In today’s parlance, he found a domain specific language, a DSL, for organizing
side-effecting computations in an ambient (or hosting) functional language. Once
Moggi made this discovery another researcher, Phil Wadler, realized that this DSL
had a couple of different “presentations” (different concrete syntaxes for the same
underlying abstract syntax) that were almost immediately understandable by the
average programmer. One presentation, called comprehensions (after its counterpart in set theory), could be understood directly in terms of a very familiar con-
struct SELECT ... FROM ... WHERE ...; while the other, dubbed do-notation
by the Haskell community, provided operations that behaved remarkably like se-
quencing and assignment. Haskell offers syntactic sugar to support the latter while
the former has been adopted in both XQuery’s FLWOR-expressions and Microsoft’s
LINQ.
Of course, to say that Haskell offers syntactic sugar hides the true nature of
how monads are supported in the language. There are actually three elements that
come together to make this work. First, expressing the pattern at all requires sup-
port for parametric polymorphism, generics-style type abstraction. Second, Haskell's typeclass mechanism (the Haskell equivalent of Scala's trait) is required to make the pattern itself polymorphic. Then there is the do-
notation itself and the syntax-driven translation from that to Haskell’s core syntax.
Taken together, these features allow the compiler to work out which interpretations
of sequencing, assignment and return are in play – without type annotations. The
simplicity of the design sometimes makes it difficult to appreciate the subtlety, or
the impact it has had on modern language design, but this was the blueprint for the
way Scala’s for-comprehensions work.
With this structuring technique (and others like it) in hand it becomes a lot
easier to spot (often by type analysis alone) situations where programs can be rewrit-
ten to equivalent programs that execute much better on existing hardware. This
is one of the central benefits of the monad abstraction, and these sorts of powerful
abstractions are among the primary reasons why functional programming has made
such progress in the area of performance. As an example, not only can LINQ-based
expressions be retargeted to different storage models (from relational database to
XML database), they can be rewritten to execute in a data-parallel fashion. Results of
this type suggest that we are really just at the beginning of understanding the kinds
of performance optimizations available through the use of monadic programming
structuring techniques.
It turns out that side-effecting computations are right at the nub of strategies
for using concurrency as a means to scale up performance and availability. In some
sense a side-effect really represents an interaction between two systems (one of which
is viewed as “on the side” of the other, i.e. at the boundary of some central locus of
computation). Such an interaction, say between a program in memory and the I/O
subsystem, entails some sort of synchronization. Synchronization constraints are the
central concerns in using concurrency to scale up both performance and availabil-
ity. Analogies to traffic illustrate the point. It’s easy to see the difference in traffic
flow if two major thoroughfares can run side-by-side versus when they intersect and
have to use some synchronization mechanism like a traffic light or a stop sign. So,
in a concurrent world, functional purity – which insists on no side-effects, i.e. no
synchronization – is no longer an academic exercise with unrealistic performance
characteristics. Instead computation which can proceed without synchronization,
including side-effect-free code, becomes the gold standard. Of course, it is not real-
istic to expect computation never to synchronize, but now this is seen in a different
light, and is perhaps the most stark way to illustrate the promise of monadic struc-
turing techniques in the concurrent world programmers find themselves in. They allow
us to write in a language that is at least notionally familiar to most programmers
and yet analyze what’s written and retarget it for the concurrent setting.
In summary, functional language design improved in terms of
• extending the underlying mechanism of how types work on data, exposing the duality between type conformance and pattern-matching
Taken together with the inherent simplicity of functional language design and
its compositional nature we have the makings of a revolution in complexity man-
agement. This is the real dominating trend in the industry. Once Java was within
1.4X the speed of C/C++ the game was over because Java offered such a significant
reduction in application development complexity which turned into gains in both
encapsulated in the notion of monad gives us a language for talking about both kinds
of scaling and connecting the two ideas. It provides a language for talking about
the interplay between the composition of structure and the composition of the flow
of control. It encapsulates stateful computation. It encapsulates data structure. In
this sense the notion of monad is poised to be the rational reconstruction of the
notion of object. Telling this story was my motivation for writing this book.
It has become the buzz-word du jour to talk about DSL-based design. So much so that
it’s becoming hard to understand what the term means. In the functional setting the
meaning is really quite clear and since the writing of the Structure and Interpretation
of Computer Programs (one of the seminal texts of functional programming and
one of the first to pioneer the idea of DSL-based design) the meaning has gotten
considerably clearer. In a typed functional setting the design of a collection of types
tailor-made to model and address the operations of some domain is effectively the design of an abstract syntax of a language for computing over the
domain.
To see why this must be so, let’s begin from the basics. Informally, DSL-based
design means we express our design in terms of a little mini-language, tailor-made
for our application domain. When push comes to shove, though, if we want to
know what DSL-based design means in practical terms, eventually we have to ask
what goes into the specification of a language. The commonly received wisdom
is that a language is comprised of a syntax and a semantics. The syntax carries
the structure of the expressions of the language while the semantics says how to
evaluate those expressions to achieve a result – typically either to derive a meaning
for the expression (such as this expression denotes that value) or perform an action
or computation indicated by the expression (such as print this string on the console).
Focusing, for the moment, on syntax as the more concrete of the two elements, we
note that syntax is governed by grammar. Whether we’re building a concrete syntax,
like the ASCII strings one types to communicate Scala expressions to the compiler
or building an abstract syntax, like the expression trees of LINQ, syntax is governed
by grammar.
What we really want to call out in this discussion is that a collection of types
forming a model of some domain is actually a grammar for an abstract syntax.
This is most readily seen by comparing the core of the type definition language of
modern functional languages with something like EBNF, the most prevalent language
for defining context-free grammars. At their heart the two structures are nearly the
same. When one is defining a grammar one is defining a collection of types that
model some domain and vice versa. This is blindingly obvious in Haskell, and
is the essence of techniques like the application of two-level type decomposition to
model grammars. Moreover, while a little harder to see in Scala, it is still there.
It is in this sense that typed functional languages like Scala are very well suited
for DSL-based design. To the extent that the use of Scala relies on the functional
core of the language (not the object-oriented bits) virtually every domain model is
already a kind of DSL in that its types define a kind of abstract syntax.
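To see the correspondence at a glance, compare a toy EBNF production with the Scala types that model it (a minimal illustrative sketch, not the book's running example):

// EBNF:  Expr ::= IntLit | Add Expr Expr
sealed trait Expr
case class IntLit(value: Int) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
// each alternative of the production becomes a case class;
// each occurrence of Expr on the right becomes a field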
Taking this idea a step further, in most cases such collections of types are
actually representable as a monad. Monads effectively encapsulate the notion of
an algebra – which in this context is a category theorist’s way of saying a certain
kind of collection of types. If you are at all familiar with parser combinators and
perhaps have heard that these too are facilitated with monadic composition then the
suggestion that there is a deeper link between parsing, grammars, types and monads
might make some sense. On the other hand, if this seems a little too abstract it
will be made much more concrete in the following sections. For now, we are simply
planting the seed of the idea that monads are not just for structuring side-effecting
computations.
These core capabilities wrap around our little toy programming language in
much the same way a modern IDE might wrap around development in a more
robust, full-featured language. Hence, we want the capabilities of the application to
be partially driven from the specification of our toy language. For example, if we
support some syntax-highlighting, or syntax-validation on the client, we want that
to be driven from that language spec to the extent that changes to the language
spec ought to result in changes to the behavior of the highlighting and validation.
Thus, at the center of our application is the specification of our toy language.
Abstract syntax Fittingly for a book about Scala, we'll use the λ-calculus as our toy language. The core abstract syntax of the lambda calculus is given by the following EBNF grammar.

    expression     mention   abstraction   application
    M, N ::=       x       | λx.M        | M N
It doesn’t take much squinting to see that this looks a lot like a subset of
Scala, and that’s because – of course! – functional languages like Scala all share a
common core that is essentially the λ-calculus. Once you familiarize yourself with
the λ-calculus as a kind of design pattern you’ll see it poking out everywhere: in
Clojure and OCaml and F# and Scala. In fact, as we’ll see later, just about any
DSL you design that needs a notion of variables could do worse than simply to crib
from this existing and well understood design pattern.
A word to the wise: even if you are an old hand at programming language semantics, even if you know the λ-calculus like the back of your hand, you are likely to be surprised by some of the things you see in the next few sections. Just to make sure that everyone gets a chance to look at the formalism as if it were brand new, a few recent theoretical developments have been thrown in. So, watch out!
[Figure: the application user interface — a project directory tree (directories, subdirectories, files, status) alongside a code editor, a project editor, and advanced features.]
• Chapter two introduces terminology, notation and concepts necessary for the
rest of the book.
• Chapter four investigates parsing the transport and application level requests.
[Figure: chapter map — how the chapters of the book relate to the components of the application (user, query model, store).]

Chapter 2

Toolbox

2.1 Introduction to notation and terminology

TBD
2.1.1 Scala
2.1.2 Maths
2.2 Introduction to core design patterns
Haskell’s monad API Given such a type constructor, you only need a pair of
maps (one of which is higher order). Thus, in Haskell a monad is presented in
terms of the following data
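In the standard Haskell presentation that data amounts to the following type class (shown here in minimal form; the full prelude version carries a few extra members):

class Monad m where
  return :: a -> m a                   -- put a value into the monad
  (>>=)  :: m a -> (a -> m b) -> m b   -- "bind"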
Now, it’s not enough to simply have this collection of pieces. The pieces have
to fit together in a certain way; that is, they are subject to the following laws:
• bind (return a) f ≡ f a (left identity)
• bind m return ≡ m (right identity)
• bind (bind m f) g ≡ bind m (\ x -> bind (f x) g) (associativity)
Do-notation One of the driving motivations for this particular formulation of the
concept is that it makes it very easy to host a little DSL inside the language. The
syntax and semantics of the DSL are simultaneously given by the following procedure
for de-sugaring, i.e. translating expressions in the DSL back to core Haskell.
do { x } = x
do { x ; <stmts> }
    = bind x (\ _ -> do { <stmts> })
do { v <- x ; <stmts> }
    = bind x (\ v -> do { <stmts> })
On the face of it, the notation provides both a syntax and a semantics reminiscent of the standard side-effecting operations of mainstream imperative languages. In the presence of polymorphism, however, these instruments are much more powerful. These operations can be systematically "overloaded" (meaning the overloaded definitions satisfy the laws above). This allows one to use the notation systematically for a wide variety of computations that all have some underlying commonality. Typical examples include I/O, state management and control flow (all three of which bundle up in parsing), and also container navigation and manipulation. It gets better: many of the tools of mathematics that are regularly the subject of computer programs, such as probability distributions, integration, etc., also have presentations as monads. Thus, innocent examples like this one
do { putStrLn "Enter a line of text:";
     x <- getLine;
     putStrLn ("you wrote: " ++ x) }
as might be found in some on-line tutorial on monads belie the potency of this
combination of ideas.
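Hand-applying the desugaring rules above to this example gives:

bind (putStrLn "Enter a line of text:")
     (\ _ -> bind getLine
                  (\ x -> putStrLn ("you wrote: " ++ x)))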
for-comprehensions Unlike Haskell, Scala does not reify the notion of monad
under a trait, the language’s equivalent of Haskell’s typeclass. Instead the system-
atic means of de-sugaring for-notation, together with the polymorphic interpretations of flatMap and friends, are the effective definition of the notion in Scala.
expr1 filter {
  case p => true
  case _ => false
} flatMap {
  p => for ( <stmts> ) yield expr2
}
This means, therefore, that inside the appropriate code context (i.e., a do-block or a for-comprehension, respectively) we have a correspondence between the two notations, with a kind of spiritual kinship between expr1 >> expr2 and expr1 ; expr2.
• intuition
• correspondence to previously existing structures
• decomposition of the requirements
As we will see the notion of monad maps nicely onto an appropriately parametric
notion of container. From this point of view we can imagine a container “API” that
has three basic operations.
Putting things into the container The next operation is very basic: it says how to put things into the container. To align with a very long history, we will refer to this operation by the name unit. Since the operation is supposed to allow us to put elements of type A into containers of shape S[A], we expect the signature of this operation to be unit : A => S[A].
Programmers are very aware of data structures that support a kind of concatenation operation. The data type String is a perfect example. Every programmer expects that the concatenation of a given String, say s, with the empty String, "", will return a result string equal to the original. In code, s.equals( s + "" ) == true. Likewise, string concatenation is insensitive to the order of grouping. Again, in code, (( s + t ) + u).equals( s + ( t + u ) ) == true.

Most programmers have noticed that these very same laws survive polymorphic interpretations of +, equals and the "empty" element. For example, if we substituted the data type Integer as the base type and used integer addition, integer equality, and 0 as the empty element, these same code snippets (amounting to assertions) would still work.
Many programmers are aware that there is a very generic underlying data type,
historically referred to as a monoid, defined by these operations and laws. In code,
we can imagine defining a trait in Scala something like
trait Monoid {
  def unit: Monoid
  def mult(that: Monoid): Monoid
}
except for the small problem that Int is final (illustrating an important difference between the ad hoc polymorphism of Haskell's typeclass and Scala's trait).
Any solution will depend on type parametrization. For example

trait Monoid[Element] {
  def unit: Element
  def mult(a: Element, b: Element): Element
}
This parametric way of viewing some underlying data structure is natural both to the modern programmer and the modern mathematician. Both are quite familiar with, and make extensive use of, overloading of this kind. Both are very happy to find higher levels of abstraction that allow them to remain DRY when the programming demands might cause some perspiration. One of the obvious places where repetition is happening is in the construction of views. Consider another view of Int
class MAddInt extends Monoid[Int] {
  override def unit: Int = 0
  override def mult(a: Int, b: Int) = a + b
}
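A quick check that this view satisfies the monoid laws (a usage sketch against the definitions above):

val m = new MAddInt
assert( m.mult( 5, m.unit ) == 5 )                                    // identity
assert( m.mult( m.mult( 1, 2 ), 3 ) == m.mult( 1, m.mult( 2, 3 ) ) ) // associativity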
It turns out that there is a lot of machinery that is common to defining a view
like this for any given data type. Category theorists realized this and recognized that
you could reify the view, which not only provides a place to refactor the common machinery but also gives it another level of polymorphism. Thus, a category theorist's view of the monad API might look something like this.
trait Monad[Element, M[_]] {
  def unit(e: Element): M[Element]
  def mult(mme: M[M[Element]]): M[Element]
}
The family resemblance to the Monoid API is not accidental. The trick is to
bring syntax back into the picture. Here’s an example.
case class MonoidExpr[Element](val e: List[Element])

class MMInt extends Monad[Int, MonoidExpr] {
  override def unit(e: Int) = MonoidExpr(List(e))
  override def mult(mme: MonoidExpr[MonoidExpr[Int]]) =
    mme match {
      case MonoidExpr(Nil) =>
        MonoidExpr(Nil)
      case MonoidExpr(mes) =>
        MonoidExpr(
          (List[Int]() /: mes)(
            { (acc, me) => me match {
                case MonoidExpr(es) => acc ++ es
              }
            }
          )
        )
    }
}
While it’s clear that unit turns Ints into integer expressions, what the opera-
tion mult is doing is canonically flattening nested expressions in a way the exactly
parallels the flattening of nest arithmetic addition expressions. For a broad class
of monads, this is the paradigmatic behavior of mult. The fact that monads are
characterized by a generic interpretation of flattening of nested structure, by the
way, makes the choice of the term flatMap particularly appropriate.
Associativity as flattening Looking at it the other way around, one of the properties of a monoid is that its binary operation, its mult, is associative. The actual content of the notion of associativity is that the order of grouping doesn't make any difference. In symbols, a binary operation, ∗, is associative when a ∗ (b ∗ c) = (a ∗ b) ∗ c. This fact gives us the right to erase the parens and simply write a ∗ b ∗ c. In other words, associativity is flattening. A similar connection can be made for unit and the identity of a monoid. One quick and dirty way to see this is that since we know that a ∗ e = a (when e is the unit of the monoid), the expression a ∗ e effectively nests a in a MonoidExpr. That's the "moral" content of the connection between the two notions of unit.
Bracing for XML In this connection it is useful to make yet another connection to
a ubiquitous technology, namely XML. As a segue, notice that we can always write
a binary operation in prefix notation as well as infix. That is, whatever we could
write as a ∗ b we could just as easily write as ∗(a, b). The flattening property of
associativity says we can drop nesting such as ∗(a, ∗(b, c)) in favor of ∗(a, b, c). In
this sense, the syntax of braces is a kind of generic syntax for monoids and monads.
If we introduce the notion of “colored” braces, this becomes even more clear at the
lexicographic or notational level. So, instead of ∗(a, b, c) we’ll mark the “color” of the
braces like so: (∗|...|∗), where ∗ can be any color. Then, at the level of monoid the
unit is the empty braces, (∗||∗), while at the level of the monad the unit places the
element, say a, in between the braces: (∗|a|∗). The conceptual connection between
the two variations of the operation now becomes clear: writing a ∗ e is the same as
writing ∗(a, e) which is the same as writing (∗|a, (∗||∗)|∗), which canonically flattens
into (∗|a|∗).
Now, anyone who’s spent any time around XML can see where this is headed. At
a purely syntactic, lexicographic level we replace round brackets with angle brackets
and we have exactly XML notation for elements. In this sense, XML is a kind of uni-
versal notation for monads. The only thing missing from the framework is a means
30 CHAPTER 2. TOOLBOX
to associate operations with unit and mult, i.e. with inserting content into elements and
flattening nested elements. Scala’s specific support for XML puts it in an interesting
position to rectify this situation.
The connection with set-comprehensions Finally, since we’ve gone this far
into it, we might as well make the connection to comprehensions. Again, let’s let
notation support our intuitions. The above discussion should make it clear that it's
not the particular shape of the brace that matters, but the action of “embracing” a
collection of elements that lies at the heart of the notion. So, it’s fine if we shift to
curly braces to be suggestive. Thus, we are looking at a formalism that allows us to
polymorphically “collect” elements between braces, like {∗|a, b, c|∗}.
This is fine for finite collections, but what about infinitary collections, or collections of elements selected programmatically rather than given explicitly? The set-theoretic notation was designed specifically for this purpose. When we have an extant set of elements that we can give explicitly, we simply write {a1, a2, a3, ...}.
When we have a potentially infinitary collection of elements, or elements that are se-
lected on the basis of a condition, then we write {pattern ∈ S | condition}. The idea
of monad as comprehension recognizes that these operations of collecting, pattern
matching and selection on the basis of a condition can be made polymorphic using
monads. Notationally, we can denote the different polymorphic interpretations by
the "color" of the brace. In other words, we are looking at a shift of the form

    {pattern ∈ S | condition}   =>   {∗| pattern ∈ S | condition |∗}

to build into our notation an explicit representation of the fact that the operations of collection, pattern matching and filtering on the basis of a predicate are polymorphic.
Often times, good mathematics, like good programming is really about the
design of good notation – it’s about DSLs! In this case, the notation is particularly
useful because it begs the question of the language of patterns and the language
of conditions – something that Wadler’s original paper on monads as generalized
comprehensions did not address. This is a theme to which we will return at the end of the book, when we address search on a semantic basis. (This demarcation between extensionally and intensionally given expressions is also reflected in the notation used for arithmetic, or for monoids more generally: when we have a finite and explicitly given set of operands we can write expressions like a1 + a2 + ... + an, but when we have an infinite expression, like an infinite series, or an expression whose operands are given programmatically, we write expressions like Σi∈S e(i).) For now, the central point
is to understand how monad as container and monad as generalization of monoid
are actually views of the same underlying idea.
Now, just to make sure the connection is absolutely explicit, there is a one-
for-one correspondence between the polymorphic set-comprehension notation and
the for-comprehension notation of Scala. The correspondence takes {∗| pattern ∈ S | condition |∗} to

    for ( pattern <- S if condition ) yield pattern
As the Scala type checker will explain, this translation is only approximate. If the pattern is refutable, then we need to handle the case when the match is not possible. Obviously, we just want to throw those cases away, so a fold might be a better choice, but then that obscures the correspondence.
Syntax and containers The crucial point in all of this is that syntax is the only
container we have for computation. What do we mean by this? Back when Moggi
was crafting his story about the application of the notion of monad to computing
he referred to monads as “notions of computation”. What he meant by that was
that monads reify computation (such as I/O or flow of control or constructing data
structures) into “objects”. Computation as a phenomenon, however, is both dy-
namic and (potentially) infinitary. At least as we understand it today, it’s not in
the category of widgets we can hold in our hand like an apple or an Apple™ com-
puter. All we can do is point to it, indicate it in some way. Syntax, it turns out, is
our primary means of signifying computation. That’s why many monads factor out
as a reification of syntax, and why they are so key to DSL-based design.
In the presentation of the monad API that we've discussed here, the constraints on any given monad candidate are well factored into three different kinds of requirements, operating at different levels of the "API" and dubbed, in order of abstraction:
functoriality, naturality and coherence. Often these can be mechanically verified,
and when they can’t there are natural ways to generate spot-checks that fit well
with tools such as ScalaCheck.
One of the principal challenges of presenting the categorical view of monads is the
dependencies on the ambient theory. In some sense the categorical view of the
monad API is like a useful piece of software that drags in a bunch of other libraries. A
complete specification of monad from the categorical point of view requires providing
definitions for
• category
• functor
• natural transformation
This book is not intended to be a tutorial on category theory. There are lots
of those and Google and Wikipedia are your friends. Rather, this book is about a
certain design pattern that can be expressed, and originally was expressed within
that theory, but is to a great extent an independent notion. On the other hand, for
the diligent and/or curious reader a pointer to that body of work has the potential
to be quite rewarding. There are many treasures there waiting to be found. For
our purposes, we strike a compromise. We take the notion of category to be given
in terms of the definable types within the Scala type system and the definable
programs (sometimes called maps) between those types. Then a functor, say F, is a
pair consisting of a parametric type constructor, FT , together with a corresponding
action, say FM , on programs, that respects certain invariants. Specifically,
• A functor must preserve identity. That is, for any type, A, we can define an identity map, given canonically by the program ( x : A ) => x. Then FM( ( x : A ) => x ) = ( x : FT[A] ) => x.

• A functor must preserve composition. That is, given two programs, f : A => B and g : B => C, FM( f ◦ g ) = FM( f ) ◦ FM( g ), where ( f ◦ g )( x ) = g( f( x ) ).
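For instance, taking FT to be the type constructor List and FM to be the familiar map gives a functor in exactly this sense (a minimal sketch of our own):

def listFM[A, B](f: A => B): List[A] => List[B] =
  (la: List[A]) => la map f

// preserves identity:     listFM((x: Int) => x)(xs) == xs
// preserves composition:  listFM((x: Int) => g(f(x)))(xs)
//                           == listFM(g)(listFM(f)(xs))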
As you might have guessed, this constraint is dubbed naturality. Category the-
orists have developed a nice methodology for reasoning about such constraints. They
draw them as diagrams. For example, the diagram below represents the naturality
equation.
              nA
    FT[A] ----------> GT[A]
      |                 |
 FM(f)|                 |GM(f)
      v                 v
    FT[B] ----------> GT[B]
              nB
You can read the diagram as stating that the two paths from the upper left corner of the diagram to the lower right corner (one along the top and down the right, the other down the left and along the bottom) must be equal as functions. In general, when all the paths between two vertices in such a diagram are equal as functions, the diagram is said to commute. This sort of tool is really invaluable for people doing systems-level design.
          BNFC         trang        scalaxb
  EBNF --------> DTD --------> XSD --------> case classes
Chaining through the open source components (maps in our category) to find a
way to wire in the Kiama functionality is a lot like diagram chasing, which feels like it
was made for an open source world. Moreover, when BNFC eventually targets Scala
directly, we have a quality assurance constraint. Up to some accepted variance in
output format we want
          BNFC         trang
  EBNF --------> DTD --------> XSD
      \                         |
       \  BNFC+                 |  scalaxb
        \                       v
         +---------------> case classes
Monads are triples Returning to the topic at hand, a monad is really given by a triple, (S, unit, mult), where

• S is a functor,
• unit : Identity ⇒ S is a natural transformation, and
• mult : S ◦ S ⇒ S is a natural transformation,

subject to coherence conditions: mult is associative, and unit is a left and right identity for it.
Or in pictures: the associativity square, mult ◦ (S mult) = mult ◦ (mult S),

               S mult
      S³ --------------> S²
       |                  |
 mult S|                  |mult
       v                  v
      S² --------------> S
               mult

and the unit triangles, mult ◦ (unit S) = id = mult ◦ (S unit),

          unit S        S unit
     S ----------> S² <---------- S
       \            |            /
        \ id        |mult    id /
         v          v          v
                    S
• At the level of Scala, which – if you recall – is our ambient category, we find
types and maps between them.
• Though this is harder to see because we have restricted our view to just one
category, at the level of functors, categories play in the role of types, while
functors play in the role of maps between them.
• At the level of natural transformations, functors play in the role of types while
natural transformations play in the role of maps between them.
• functoriality
• naturality
• coherence
Monads bring all three levels together into one package. Monads operate on a
category via a functor and a pair of natural transformations that interact coherently.
This food chain arrangement points the way toward an extremely promising recon-
struction of the notion of interface. One way to think about it is in terms of the
recent trend away from inheritance and towards composition. In this trend the no-
tion of interface is still widely supported, but it really begs the question: what is an
interface? What makes a collection of functions cohere enough to be tied together
under an interface?
One way to go about answering that question is to assume there’s nothing
but the interface name that collects the functions it gathers together. In that case,
how many interfaces are there? One way to see that is just to consider all the sub-interfaces of a single interface with n methods on it: that's 2^n interfaces. That's a lot. Does that give us any confidence that any one way of carving up functionality via interfaces is going to be sane? Further, in practice, do we see a random distribution through this very large space?
What we see over and over again in practice is that the answer to the latter
question is “no!” Good programmers invariably pick out just a few factorizations
of possible interfaces – from the giant sea of factorizations. That means that there
is something in the mind of a good programmer that binds a collection of methods
together. What might that something be? I submit that in their minds there are
some constraints they know or at least intuit must hold across these functions. The
evidence from category theory is that these are not just arbitrary constraints, but
that the space of constraints that bind together well factored interfaces is organized
along the lines of functoriality, naturality and coherence. There may yet be higher-
order levels of organization beyond that, but these – at least – provide a well-vetted
and practical approach to addressing the question of what makes a good interface. If
monad is the new object, then these sorts of categorical situations (of which monad
is but one instance) are the basis for a re-thinking of what we mean when we say
“interface”.
All of this discussion leads up to the context in which to understand the
correspondence between the Haskell variation of the monad laws and their original
presentation.
trait Monad[M[_]] {
  // map part of M
  // part of the requirement of M's functoriality
  // M : Scala => Scala
  def map[A, B](a2b: A => B): M[A] => M[B]
  // the unit natural transformation, unit : Identity => M[A]
  def unit[A](a: A): M[A]
  // the mult natural transformation, mult : M[M[A]] => M[A]
  def mult[A](mma: M[M[A]]): M[A]
  // flatMap, aka bind, is a derived notion
  def flatMap[A, B](ma: M[A], a2mb: A => M[B]): M[B] = {
    mult(map(a2mb)(ma))
  }
}
Listing 2.4: categorical presentation of monad as Scala trait
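To see the trait in action, here is a sketch of an instance (the object is ours, for illustration): the List functor, with flatMap derived exactly as in the listing.

object ListMonad extends Monad[List] {
  def map[A, B](a2b: A => B): List[A] => List[B] =
    (la: List[A]) => la map a2b
  def unit[A](a: A): List[A] = List(a)
  def mult[A](mma: List[List[A]]): List[A] = mma.flatten
}

// ListMonad.flatMap(List(1, 2), (i: Int) => List(i, i * 10))
//   == List(1, 10, 2, 20)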
Chapter 3

An IO-monad for HTTP streams

3.1 Code first, questions later
The following code is adapted from Tiark Rompf’s work using delimited con-
tinuations for handling HTTP streams.
import scala.concurrent._
import scala.concurrent.cpsops._
import java.net.InetSocketAddress
import java.net.InetAddress
import java.nio.ByteBuffer
import java.nio.CharBuffer
import java.nio.charset.Charset
import java.nio.charset.CharsetDecoder
import java.nio.charset.CharsetEncoder
import java.util.regex.Pattern
import java.util.regex.Matcher
import java.util.Set
import scala.collection.JavaConversions._
object DCWebserver
  extends FJTaskRunners {

  case class Generator[+A, -B, +C](
    val fun: (A => (B @cps[Any, Any])) => (C @cps[Any, Any])
  )

  def selections(selector: Selector)
    : ControlContext[Set[SelectionKey], Unit, Unit] =
    shiftR {
      k: (Set[SelectionKey] => Any) =>

  def createAsyncSelector() = {
    val selector = SelectorProvider.provider().openSelector()
        println("handling: " + handler)
        handler(key)
      }
      keySet.clear()
    }
  }
    selector
  }
  def callbacks(channel: SelectableChannel, selector: Selector, ops: Int) =
    Generator {
      k: (SelectionKey => Unit @cps[Any, Any]) =>
        shift {
          outerk: (Unit => Any) =>
            def callback(key: SelectionKey) = {
              key.interestOps(0)
              spawn {
                println("before continuation in callback")
                k(key)
                if (key.isValid()) {
                  key.interestOps(ops)
                  selector.wakeup()
                } else {
                  outerk()
                  // return to .gen();
                }
              }
            }
            val selectionKey = channel.register(selector, ops, callback)
  def acceptConnections(selector: Selector, port: Int) =
    Generator {
      k: (SocketChannel => Unit @cps[Any, Any]) =>
        val serverSocketChannel = ServerSocketChannel.open()
        serverSocketChannel.configureBlocking(false)
        val isa = new InetSocketAddress(port)
        serverSocketChannel.socket().bind(isa)
        for (
          key <- callbacks(serverSocketChannel, selector, SelectionKey.OP_ACCEPT)
        ) {
          val serverSocketChannel =
            key.channel().asInstanceOf[ServerSocketChannel]
          val socketChannel = serverSocketChannel.accept()
          socketChannel.configureBlocking(false)
          k(socketChannel)
        }
  def readBytes(selector: Selector, socketChannel: SocketChannel) =
    Generator {
      k: (ByteBuffer => Unit @cps[Any, Any]) =>
        shift {
          outerk: (Unit => Any) =>
            reset {
              val bufSize = 4096 // for example...
              val buffer = ByteBuffer.allocateDirect(bufSize)
              println("about to read")
              for (
                key <- callbacks(socketChannel, selector, SelectionKey.OP_READ)
              ) {
                println("about to actually read")
                // read into the buffer; a count < 0 signals end-of-stream
                val count = socketChannel.read(buffer)
                if (count < 0) {
                  println("should close connection")
                  socketChannel.close()
                  println("result of outerk " + outerk())
                  // return to .gen() should cancel here!
                } else {
                  buffer.flip()
                  k(buffer)
                  buffer.clear()
                  shift { k: (Unit => Any) => k() }
                }
              }
              println("readBytes returning")
              outerk()
            }
        }
    }
  def readRequests(selector: Selector, socketChannel: SocketChannel) =
    Generator {
      k: (String => Unit @cps[Any, Any]) =>
        var s: String = ""
  def writeResponse(
    selector: Selector,
    socketChannel: SocketChannel,
    res: String
  ) = {
    val reply = res

  def handleRequest(req: String) = req
  def test() = {
    val sel = createAsyncSelector()
    for (socketChannel <- acceptConnections(sel, 8080)) {
      spawn {
        println("Connect: " + socketChannel)
        for (req <- readRequests(sel, socketChannel)) {
          val res = handleRequest(req)
          writeResponse(sel, socketChannel, res)
          println("Disconnect: " + socketChannel)
        }

  // def main(args: Array[String]) = {
  //   Thread.sleep(1000*60*60) // 1h!
  //   // test.mainTaskRunner.waitUntilFinished()
  // }
Chapter 4

Parsing requests, monadically

TBD

4.3.3 Maintainability
# line endings
CRLF = "\r\n";

# character types
CTL = (cntrl | 127);
safe = ("$" | "-" | "_" | ".");
extra = ("!" | "*" | "'" | "(" | ")" | ",");
reserved = (";" | "/" | "?" | ":" | "@" | "&" | "=" | "+");
sorta_safe = ("\"" | "<" | ">");
unsafe = (CTL | " " | "#" | "%" | sorta_safe);
national = any -- (alpha | digit | reserved | extra | safe | unsafe);
unreserved = (alpha | digit | safe | extra | national);
escape = ("%" xdigit xdigit);
uchar = (unreserved | escape | sorta_safe);
pchar = (uchar | ":" | "@" | "&" | "=" | "+");
tspecials = ("(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\\" | "\"");

# elements
token = (ascii -- (CTL | tspecials));
http_number = (digit+ "." digit+);
HTTP_Version = ("HTTP/" http_number) >mark %http_version;
Request_Line = (Method " " Request_URI ("#" Fragment){0,1} " " HTTP_Version CRLF);
field_name = (token -- ":")+ >start_field $snake_upcase_field %write_field;
field_value = any* >start_value %write_value;
message_header = field_name ":" " "* field_value :> CRLF;

main := Request;
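For contrast with this machine-generated recognizer, here is a sketch of our own (under deliberately simplified assumptions about methods, URIs and versions) of how the Request-Line production might be expressed with Scala's parser combinators, which are themselves monadic:

import scala.util.parsing.combinator.RegexParsers

object RequestLineParser extends RegexParsers {
  override def skipWhitespace = false

  case class RequestLine(method: String, uri: String, version: String)

  def method: Parser[String]  = "[A-Z]+".r
  def uri: Parser[String]     = "[^ ]+".r
  def version: Parser[String] = "HTTP/" ~> "[0-9]+\\.[0-9]+".r

  def requestLine: Parser[RequestLine] =
    method ~ (" " ~> uri) ~ (" " ~> version) ^^ {
      case m ~ u ~ v => RequestLine(m, u, v)
    }
}

// RequestLineParser.parseAll(
//   RequestLineParser.requestLine, "GET /index.html HTTP/1.1")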
Chapter 5

The domain model as abstract syntax

TBD
trait Expressions {
  type Nominal

  // M, N ::=
  abstract class Expression

  // x
  case class Mention(reference: Nominal)
    extends Expression

  // λ x1, ..., xn . M
  case class Abstraction(
    formals: List[Nominal],
    body: Expression
  ) extends Expression

  // M N1 ... Nn
  case class Application(
    operation: Expression,
    actuals: List[Expression]
  ) extends Expression
}
Currying The attentive reader will have noticed that there’s a difference between
the abstract syntax and our Scala model. The abstract syntax only supports a single
formal parameter under λ-abstraction, while the Scala model declares the formals
to be of type List[Nominal]. The model anticipates the encoding λ x y.M =def λ x.λ y.M. Given that abstractions are first-class values, in the sense that they can
be returned as values and passed as parameters, this is a fairly intuitive encoding.
It has some pleasant knock-on effects. For example, when there is an arity shortfall,
i.e. the number of actual parameters is less than the number of formal parameters,
then it is both natural and useful simply to return an abstraction. Thus, (λxy.f xy)u
can be evaluated to return (λy.f uy). This is an extremely convenient mechanism
to support partial evaluation.
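The knock-on effect can be sketched directly against the model (a hypothetical helper of ours, assuming the Expressions trait above and the substitute function defined later in this chapter):

def applyPartial(abs: Abstraction, actuals: List[Expression]): Expression =
  if (actuals.length < abs.formals.length)
    // arity shortfall: consume the actuals we have, return a smaller abstraction
    Abstraction(
      abs.formals.drop(actuals.length),
      substitute(abs.body, actuals, abs.formals.take(actuals.length))
    )
  else
    substitute(abs.body, actuals, abs.formals)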
There is a deeper principle lurking here, called two-level type decomposition, enabled by type-level parametricity. We'll talk more about this in upcoming chapters; for now we just want to put it on the backlog.
Some syntactic sugar To this core let us add some syntactic sugar.
It doesn’t take much squinting to see that this looks a lot like a subset of
Scala, and that’s because – of course! – functional languages like Scala all share a
common core that is essentially the λ-calculus. Once you familiarize yourself with
the λ-calculus as a kind of design pattern you’ll see it poking out everywhere: in
Clojure and OCaml and F# and Scala. In fact, as we’ll see later, just about any
DSL you design that needs a notion of variables could do worse than simply to crib
from this existing and well understood design pattern.
If you’ve been following along so far, however, you will spot that something is
actually wrong with this grammar. We still don’t have an actual terminal! Concrete
syntax is what “users” type, so as soon as we get to concrete syntax we can no longer
defer our choices about identifiers. Let’s leave open the door for both ordinary
identifiers – such as we see in Scala – and our funny quoted terms. This means we
need to add the following productions to our grammar.
(The reason we use the @ for quotation – as will become clear later – is that
when we have both quote and dequote, the former functions a lot like asking for a
pointer to a term while the latter is a lot like dereferencing the pointer.)
// [[ x ]] = x
def compileExpr(mentionExpr: Absyn.Mention)
  : Expression = {
  new Mention(intern(mentionExpr.variableexpr_))
}

// [[ ( x ) => expr ]] = λ x.[[ expr ]]
def compileExpr(abstractionExpr: Absyn.Abstraction)
  : Expression = {
  val fmls: List[Nominal] =
    abstractionExpr.listvariableexpr_.map(
      { (vExpr: Absyn.VariableExpr) => intern(vExpr) }
    ).toList
  new Abstraction(fmls, compile(abstractionExpr.expression_))
}

// [[ expr( expr1, ..., exprn ) ]]
//   = [[ expr ]] [[ expr1 ]] ... [[ exprn ]]
def compileExpr(applicationExpr: Absyn.Application)
  : Expression = {
  new Application(
    compile(applicationExpr.expression_1),
    List(compile(applicationExpr.expression_2))
  )
}
}

  case abstractionExpr: Absyn.Abstraction => {
    compileExpr(abstractionExpr)
  }
  case applicationExpr: Absyn.Application => {
    compileExpr(applicationExpr)
  }
  }
}

def parse(str: String): Absyn.Expression = {
  (new parser(
    new Yylex(new StringReader(str))
  )).pExpression()
}

def compile(str: String): Expression = {
  try {
    compile(parse(str))
  }
  catch {
    case e => { // log error
      throw e
    }
  }
}
}
The first thing to notice about this translation is how faithfully it follows
the equational specification. This aspect of functional programming in general and
Scala in particular is one of the things that sets it apart. In a development culture
where AGILE methodologies rightfully demand a justification thread running from
feature to line of code, a means of tracing specification to implementation is of prac-
tical importance. Of course, rarely do today’s SCRUM meetings result in equational
specifications; however, they might result in diagrammatic specification which, as
we will see in subsequent sections, can be given equational interpretations that then
guide functional implementation. Of equal importance: it cannot have escaped no-
tice how much more compact the notations we have used for specification actually
are. In a context where brevity and complexity management are paramount, tools
– such as these specification techniques – that help us gain a higher vantage point
ought to carry some weight. This is another aim of this book, to provide at least
some exposure to these higher-level techniques. One of the central points to be made
is that if she’s not already using them, the pro Scala programmer is primed and
ready to take advantage of them.
def substitute(
  term: Expression,
  actuals: List[Expression], formals: List[Nominal]
): Expression = {
  term match {
    case Mention(ref) => {
      formals.indexOf(ref) match {
        case -1 => term
        case i => actuals(i)
      }
    }
    case Abstraction(fmls, body) => {
      val fmlsN = fmls.map(
        {
          (fml) => {
            formals.indexOf(fml) match {
              case -1 => fml
              case i => fresh(List(body))
            }
          }
        }
      )
      val bodyN =
        substitute(
          body,
          fmlsN.map((fml) => Mention(fml)),
          fmls // α-rename the original bound names to the fresh ones
        )
      Abstraction(
        fmlsN,
        substitute(bodyN, actuals, formals)
      )
    }
    case Application(op, actls) => {
      Application(
        substitute(op, actuals, formals),
        actls.map((actl) => substitute(actl, actuals, formals))
      )
    }
  }
}
With this code in hand we have what we need to express the structural equiv-
alence of terms.
def `=a=`(
  term1: Expression, term2: Expression
): Boolean = {
  (term1, term2) match {
    case (
      Mention(ref1),
      Mention(ref2)
    ) => {
      ref1 == ref2
    }
    case (
      Abstraction(fmls1, body1), Abstraction(fmls2, body2)
    ) => {
      if (fmls1.length == fmls2.length) {
        val freshFmls =
          fmls1.map(
            { (fml) => Mention(fresh(List(body1, body2))) }
          )
        `=a=`(
          substitute(body1, freshFmls, fmls1),
          substitute(body2, freshFmls, fmls2)
        )
      }
      else false
    }
    case (
      Application(op1, actls1),
      Application(op2, actls2)
    ) => {
      (`=a=`(op1, op2) /: actls1.zip(actls2))(
        { (acc, actlPair) =>
          acc && `=a=`(actlPair._1, actlPair._2)
        }
      )
    }
    case _ => false // differing head constructors are never equivalent
  }
}
what we mean when we write M [N/x] and M ≡ N . People have wondered if this
sort of machinery could be reasonably factored so that it could be mixed into a
variety of variable-binding capabilities. It turns out that this is possible and is
at the root of a whole family of language design proposals that began with Jamie
Gabbay’s FreshML.
Beyond this separation of concerns, the introduction of abstract syntax affords another kind of functionality. While we will look at this in much more detail in subsequent chapters, and especially the final chapter of the book, it is worthwhile setting up the discussion at the outset. A computationally effective notion of structural equivalence enables programmatic investigation of structure. In the context of our story, users not only write programs, but store them and expect to retrieve them later for further editing. In such a system it is easy to imagine they might want to search for structurally equivalent programs. In looking for patterns in their own code that they might want to abstract, it is easy to imagine them searching for programs structurally equivalent to one they've found themselves writing for the umpteenth time. Further, structural equivalence is one of the pillars of a system that supports automated refactoring.
β-reduction

    (λx.M)N → M[N/x]

In terms of our concrete syntax this means that we can expect expressions of the form ((x1,...,xn) => e)(e1,...,en) to evaluate to e[e1/x1,...,en/xn].
It is perhaps this last expression that brings home a point: we need to manage variable bindings, called environments in this discussion. The lambda calculus is silent on how this is done. There are a variety of strategies for implementing environments, among them:

• ordinary maps
• DeBruijn notation

A sketch of the first strategy follows the value types below.
type Expression
abstract class Value
case class Closure(
  fn : List[Value] => Value
) extends Value
case class Quantity( quantity : Int )
  extends Value
}
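With Value in hand, the ordinary-maps strategy can be little more than a wrapper around an immutable Map. The following is a minimal sketch under that assumption; MapEnvironment is a name invented here for illustration:

// an environment backed by an ordinary immutable map:
// apply looks a variable up; extend binds formals to actuals
case class MapEnvironment( bindings : Map[Mention,Value] ) {
  def apply( m : Mention ) : Value = bindings( m )
  def extend(
    keys : List[Mention], values : List[Value]
  ) : MapEnvironment =
    MapEnvironment( bindings ++ keys.zip( values ) )
}

This shape lines up with the two ways reduce, below, uses its environment: application for lookup and extend for binding formals to actuals.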
val initialApplicator : Applicator =
  { ( xpr : Expression ) => {
      ( actls : List[Value] ) => {
        xpr match {
          case IntegerExpression( i ) => Quantity( i )
          case _ => throw new Exception( "why are we here?" )
        }
      }
    }
  }
def reduce(
  applicator : Applicator,
  environment : Environment
) : Expression => Value = {
  case IntegerExpression( i ) => Quantity( i )
  case Mention( v ) => environment( Mention( v ) )
  case Abstraction( fmls, body ) =>
    Closure(
      { ( actuals : List[Value] ) => {
          val keys : List[Mention] =
            fmls.map( { ( fml : Nominal ) => Mention( fml ) } );
          reduce(
            applicator,
            environment.extend(
              keys,
              actuals ).asInstanceOf[Environment]
          )( body )
        }
      }
    )
  case Application(
    operator : Expression,
    actuals : List[Expression]
  ) => {
    reduce( applicator, environment )( operator ) match {
      case Closure( fn ) => {
        fn.apply(
          ( actuals
              map
              { ( actual : Expression ) =>
                ( reduce( applicator, environment ) )( actual ) } )
        )
      }
      case _ =>
        throw new Exception( "attempt to apply non function" )
    }
  }
  case _ => throw new Exception( "not implemented, yet" )
}
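A hypothetical run, assuming a Nominal value x and an empty Environment value initialEnvironment, both invented here for illustration:

// evaluating ( ( x ) => x )( 42 )
val term =
  Application(
    Abstraction( List( x ), Mention( x ) ),
    List( IntegerExpression( 42 ) )
  )
reduce( initialApplicator, initialEnvironment )( term )
// => Quantity( 42 )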
Before moving to the next chapter it is important to digest what we've done here. Since we've called out DSL-based design as a methodology worthy of attention, what does our little foray into defining a language tell us about language definition? It turns out that this is really part of the folklore in the programming language semantics community. At this point in time one of the commonly accepted presentations of a language definition has three components:

• a syntax, i.e. a grammar of terms;
• a structural equivalence, identifying terms whose syntactic differences make no computational difference;
• an operational semantics, a set of (possibly conditional) rewrite rules expressing the computational content of the language.

That's exactly what we see here. Our toy language can be completely characterized by the following half-page specification.
Syntax

    expression    M, N ::= x | λx.M | M N    (mention | abstraction | application)

Structural equivalence

    α-equivalence:  λx.M = λy.M[y/x]  where y ∉ FN(M)

Operational semantics

    β-reduction:  (λx.M)N → M[N/x]

    struct:  if M ≡ M′, M′ → N′, and N′ ≡ N, then M → N
Discussion This specification leaves open some questions regarding order of evaluation. In this sense it's a kind of proto-specification. For example, to get a left-most evaluation order you could add the rule

    leftmost:  if M → M′, then M N → M′ N
Huet’s zipper
[Figure: chapter map – the User, the store, and the query model of Chapter 10 link Chapters 2, 3, 4, 6, 7, and 9]
// Branches
class TreeSection[A](
  val section : List[Tree[A]]
) extends Tree[A]
Since it is clear how this boilerplate is made, we will dispense with it in subsequent discussion; but note that the cost in boilerplate may not have been worth deprecating inheritance in case classes.

Now we have the types necessary to model our intuitions as to what a location is. It's a pair of a context and a tree that plugs into the context. Note that neither of these data is sufficient in and of itself to identify a location in a tree. The subtree could occur in any number of trees. Likewise, the context could be filled with any number of subtrees. It takes the pair to identify a location in a tree. For those with some experience in mathematics, this idea is strongly reminiscent of both Dedekind cuts and Conway's models of games as numbers.
class Location[A](
  val tree : Tree[A],
  val ctxt : Context[A]
)
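For reference in the navigation code below, the context type can be sketched as a sum of "top" and "one step down into a branch". This is an assumption about the definitions given earlier in the text, reconstructed from how the navigation functions use them:

// either we are at the top of the tree, or we remember our siblings
// to the left and right together with the context above us
trait Context[A]
case class Top[A]() extends Context[A]
case class TreeContext[A](
  left : List[Tree[A]],   // siblings to the left (nearest first)
  up : Context[A],        // the context of the enclosing section
  right : List[Tree[A]]   // siblings to the right
) extends Context[A]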
            Token[String]( "*" ),
            Token[String]( "b" )
          )
        )
      ),
      Top(),
      List()
    ),
    List( Token[String]( "d" ) )
  )
)
The navigation functions With this structure we can define generic navigation
functions.
trait ZipperNavigation[A] {
  def left( location : Location[A] ) : Location[A] = {
    location match {
      case Location( _, Top() ) => {
        throw new Exception( "left of top" )
      }
      case Location( t, TreeContext( l :: left, up, right ) ) => {
        Location( l, TreeContext( left, up, t :: right ) )
      }
      case Location( t, TreeContext( Nil, up, right ) ) => {
        throw new Exception( "left of first" )
      }
    }
  }
  def right( location : Location[A] ) : Location[A] = {
    location match {
      case Location( _, Top() ) => {
        throw new Exception( "right of top" )
      }
      case Location( t, TreeContext( left, up, r :: right ) ) => {
        Location( r, TreeContext( t :: left, up, right ) )
      }
      case Location( t, _ ) => {
        throw new Exception( "right of last" )
      }
    }
  }
  def up( location : Location[A] ) : Location[A] = {
    location match {
      case Location( _, Top() ) => {
        throw new Exception( "up of top" )
      }
      case Location( t, TreeContext( left, up, right ) ) => {
        Location( TreeSection[A]( left.reverse ::: ( t :: right ) ),
          up )
      }
    }
  }
  def down( location : Location[A] ) : Location[A] = {
    location match {
      case Location( TreeItem( _ ), _ ) => {
        throw new Exception( "down of item" )
      }
      case Location( TreeSection( u :: trees ), ctxt ) => {
        Location( u, TreeContext( Nil, ctxt, trees ) )
      }
    }
  }
}
Exercising the zipper We can exercise the zipper navigation functions using the
two examples from above.
object E x e r c i s e extends Z i p p e r N a v i g a t i o n [ S t r i n g ] {
val a r i t h m e t i c E x p r 1 = . . .
val l o c a t i o n O f 2 n d M u l t = . . .
}
}
}
}
scala> import Exercise._
import Exercise._

scala> show( 0 )( arithmeticExpr1 )
Leaf : a
Leaf : *
Leaf : b
Leaf : +
Leaf : c
Leaf : *
Leaf : d

scala> show( 0 )( locationOf2ndMult.tree )
Leaf : *

scala> show( 0 )( up( locationOf2ndMult ).tree )
Leaf : c
Leaf : *
Leaf : d

scala> show( 0 )( up( up( locationOf2ndMult ) ).tree )
Leaf : a
Leaf : *
Leaf : b
Leaf : +
Leaf : c
Leaf : *
Leaf : d

scala> show( 0 )( up( up( up( locationOf2ndMult ) ) ).tree )
java.lang.Exception: up of top
...

scala>
        curr, TreeContext( tree :: left, up, right )
      )
    }
  }
}
def insertDown(
  location : Location[A], tree : Tree[A]
) : Location[A] = {
  location match {
    case Location( TreeItem( _ ), _ ) => {
      throw new Exception( "down of item" )
    }
    case Location(
      TreeSection( progeny ), ctxt
    ) => {
      Location(
        tree, TreeContext( Nil, ctxt, progeny )
      )
    }
  }
}
def delete(
  location : Location[A], tree : Tree[A]
) : Location[A] = {
  location match {
    case Location( _, Top() ) => {
      throw new Exception( "delete of top" )
    }
    case Location(
      _, TreeContext( left, up, r :: right )
    ) => {
      Location(
        r, TreeContext( left, up, right )
      )
    }
    case Location(
      _, TreeContext( l :: left, up, Nil )
    ) => {
      Location(
        l, TreeContext( left, up, Nil )
      )
    }
    case Location(
      _, TreeContext( Nil, up, Nil )
    ) => {
      Location( TreeSection( Nil ), up )
    }
  }
}
Zippers generically

Two kinds of genericity It turns out that Huet's discovery can be made to work on a much wider class of structures than "just" trees. Intuitively speaking, if their type arguments are "zippable", then virtually all of the common functional data type constructors – including sequencing constructors, like product, and branching constructors, like summation or "casing" – result in "zippable" types. That is, there are procedures for deriving a notion of zipper capable of traversing and mutating the structure. Essentially, there are two strategies to achieve this genericity: one is based on structural genericity and the other on procedural genericity.
we have been observing since Chapter 1. Obviously, in the context of the web, this particular use case is of considerable interest. Nearly every web application is of this form: navigating a tree or graph of pages. Usually, that graph of pages is somehow homomorphic to, i.e. an image of, the graph of some underlying domain data structure, like the data structures of employee records in a payroll system, or the social graph of a social media application like Twitter. Many web applications, such as so-called content management systems, also support the mutation of the graph of pages. So, having a method of generating this functionality from the types of the underlying data domain, be they web pages or some other domain data type, is clearly pertinent to even the most focused of application developers.
And yet, the notion of a derivative of data types is irresistibly intriguing. It's not simply that it has many other applications besides web navigation and update. That a calculational device an Englishman discovered over three centuries ago in his investigations into a mathematical framework for gravitation and other physical phenomena should be applicable to structuring computer programs is as surprising as it is elegant – and that makes it cool.
trait Prompt[R,A] {
  def level : Int
}
class CPrompt[R,A](
  override val level : Int
) extends Prompt[R,A] {
}
trait P[R,M[_],A] {
  self : StateT[Int,M,A] =>
  def stateT : StateT[Int,M,A]
  def runP() : M[(Int,A)]
  def newPrompt() = {
    for( n <- get ) yield { put( n+1 ); new CPrompt( n ) }
  }
}
trait Frame[M[_],R,A,B] {
  def a2CC : A => CC[R,M,B]
}
trait K[R,M[_],A,B] {
  def frame : Frame[M,R,A,B]
  def r : R
  def a : A
  def b : B
}
Essentially, a zipper in this new style wraps a term. It may also contain a traversal function.

trait Zipper[R,M[_],T,D] {
  def term : T
}

We then provide a basic factory mechanism for constructing zippers and then using them.
T[[ newPrompt ]] = newPrompt
T[[ pushPrompt e1 e2 ]] =
  T[[ e1 ]] flatMap {
    ( p ) => pushPrompt p T[[ e2 ]]
  }
T[[ withSubCont e1 e2 ]] =
  T[[ e1 ]] flatMap {
    ( p ) => T[[ e2 ]] flatMap {
      ( f ) => withSubCont p f
    }
  }
T[[ pushSubCont e1 e2 ]] =
  T[[ e1 ]] flatMap {
    ( s ) => pushSubCont s T[[ e2 ]]
  }
One way to implement this is that the "daemon", Pooh, is really just the act of wrapping either client's access to the box in code that grabs the current continuation, call it kg (or kt, respectively), and then does the following.

• Giver side:

  – Check to see if there is a matching taker, kt (in a queue of taker requests packaged as continuations).
  – If there is, invoke (kt v), passing it the value, v, that came in on the giver's call, and invoke (kg unit), passing it unit.
  – Otherwise, queue (v, kg) in a givers' queue.

• Taker side:

  – Check to see if there is a matching giver, (v, kg) (in a queue of giver requests packaged as continuations).
  – If there is, invoke (kt v), passing v to the taker's continuation, and (kg unit), passing unit to the giver's continuation.
  – Otherwise, queue kt in a takers' queue.

This protocol is rendered in code just below. If these look strangely like the put and get operations of the State monad – that's because they are. They've been coordinated around a state cell that is "located" at a rendezvous point for a pair of coroutines to exchange data.
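A minimal sketch of that protocol, assuming plain Scala functions stand in for captured continuations; Rendezvous and its members are names invented here for illustration:

import scala.collection.mutable.Queue

class Rendezvous[V] {
  private val givers = new Queue[(V, Unit => Unit)]()  // queued (v, kg)
  private val takers = new Queue[V => Unit]()          // queued kt

  def give( v : V, kg : Unit => Unit ) : Unit = {
    if ( takers.nonEmpty ) {
      val kt = takers.dequeue()
      kt( v )      // pass the value to the waiting taker
      kg( () )     // resume the giver with unit
    }
    else givers.enqueue( ( v, kg ) )
  }

  def take( kt : V => Unit ) : Unit = {
    if ( givers.nonEmpty ) {
      val ( v, kg ) = givers.dequeue()
      kt( v )      // pass the queued value to the taker
      kg( () )     // resume the queued giver with unit
    }
    else takers.enqueue( kt )
  }
}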
For the adventurous, it is possible to develop a further connection to Milner’s
π-calculus. Roughly speaking, this is the way to implement synchronous-IO-style in
[Figure: a zipper pairs a context with the subterm that plugs into it]
in which it occurs. Using types to guide our intuition, we see that the subterm must have the same type as a term, while the type of a context is determined by a calculation that perfectly matches a version of the derivative one might have learned in high school calculus – but applied to data structures.
6.6.1 Contexts

    ∂ ConstA = 0
    ∂ Id = 1
    ∂ (F + G) = ∂F + ∂G
    ∂ (F × G) = F × ∂G + ∂F × G
    ∂ (F ∘ G) = (∂F ∘ G) × ∂G
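A quick worked example of these rules: for pairs, taking F = Id × Id, we get ∂F = Id × ∂Id + ∂Id × Id = X × 1 + 1 × X ≅ X + X. That is, a one-hole context in a pair either has the hole on the left (remembering the right component) or the hole on the right (remembering the left component) – precisely the two shapes one would write by hand.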
6.6.2 Zippers
),
support
),
support
)
)
}
}
}
}
// the product rule: ∂( s(0) × rest ) = ∂s(0) × rest + s(0) × ∂rest,
// so "right" collects the factors to the right of the head
val right = s.drop( 1 )
RegularSum[Name,NSeq](
  List(
    RegularProduct[Name,NSeq](
      List(
        partial( x, s( 0 ) ),
        RegularProduct[Name,NSeq](
          right,
          supp
        )
      ),
      supp
    ),
    RegularProduct[Name,NSeq](
      List(
        s( 0 ),
        partial(
          x,
          RegularProduct[Name,NSeq]( right, supp )
        )
      ),
      supp
    )
  ),
  supp
)
}
case RegularFixPt( v, e, supp ) => {
  val z = fresh match {
    case None => throw new Exception( "out of names" )
    case Some( fn ) => fn
  }
  RegularSum[Name,NSeq](
    List(
      RegularFixPt(
        z,
        partial(
          x,
          RegularWeakening(
            z,
            RegularFPEnv( v, e, rtype, supp ),
            supp
          )
        ),
        supp
      ),
      RegularProduct(
        List(
          partial(
            v,
            RegularFPEnv(
              v,
              e,
              rtype,
              supp
            )
          ),
          RegularMention( z, supp )
        ),
        supp
      )
    ),
    supp
  )
}
case RegularFPEnv( v, e, s, supp ) => {
  RegularSum(
    List(
      RegularFPEnv(
        v,
        partial( x, e ),
        s,
        supp
      ),
      // BUGBUG -- lgm -- have i got the association correct
      RegularProduct(
        List(
          RegularFPEnv(
            v,
            partial( v, e ),
            s,
            supp
          ),
          partial( x, s )
        ),
        supp
      )
    ),
    supp
  )
}
case RegularWeakening( v, e, supp ) => {
  if ( x == v ) {
    regularNull( supp )
  }
  else {
    RegularWeakening( v, partial( x, e ), supp )
  }
}
}
}
}
[Figure: a directory tree – a directory containing a subdirectory and several files – annotated with the term ((lambda f. (lambda x. (f x x)) (lambda x. (f x x))) m) u and a status field]
Where are we; how did we get here; and where are we going?
As we saw in chapter two, one role of the monad is to provide the bridge between "flattenable" collections and the models of binary operators. Investigating two paradigmatic kinds of collections and, more importantly, their interaction exposes some of the necessary interior structure of a wide range of species of monad. It also prepares us for an investigation of the new Scala collections library. Hence, in this section we investigate, in detail, the Set and List monads as well as their combinations.

Recalling our basic encapsulation of the core of the monad structure in Scala
trait Monad[M[_]] {
  // map part of the functor M
  def map[A,B]( a2b : A => B ) : M[A] => M[B]
  // the unit natural transformation, unit : Identity => M[A]
  def unit[A]( a : A ) : M[A]
  // the mult natural transformation, mult : M[M[A]] => M[A]
  def mult[A]( mma : M[M[A]] ) : M[A]
  // derived
  def flatMap[A,B]( ma : M[A], a2mb : A => M[B] ) : M[B] = {
    mult( map( a2mb )( ma ) )
  }
}
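Here is a minimal Set instance of this trait – a sketch, with the fold-based mult being one of several reasonable implementations:

object SetM extends Monad[Set] {
  // lifting f : A => B applies it to each element, collecting
  // the results in another Set
  def map[A,B]( a2b : A => B ) : Set[A] => Set[B] =
    ( sa : Set[A] ) => sa map a2b
  // unit embraces a single element in a singleton Set
  def unit[A]( a : A ) : Set[A] = Set( a )
  // mult flattens a Set of Sets by folding union over the inner Sets
  def mult[A]( mma : Set[Set[A]] ) : Set[A] =
    ( ( Set.empty[A] ) /: mma )( _ ++ _ )
}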
The definition suggests we have named map well: our map means Scala's map. This is a fairly general recipe: in a preponderance of cases lifting a function, say f : A => B, to a function, M[f] : M[A] => M[B], means calculating the function on each of the "elements" of M[A] and collecting the results in an M-like collection, namely M[B]. In the case above, M just happens to be Set.
In a similar manner, the recipe for the implementation of unit is ... well...
paradigmatic. If the meaning of unit is the construction of a container embracing
a single element, say a, then calling the constructor of the M collection feels like a
natural choice. This is yet another view on the discussion in chapter 2 on monads as
a kind of generic brace notation. If that was the syntactic view, this is the semantic
view of the very same concept.
Finally, while there are several ways to implement mult, we choose fold because the genericity of this implementation is a quick and dirty demonstration of the universality of fold. In some very real sense, all "flattening" of structure is representable as a fold.
To illustrate the genericity of these definitions, we compare them with a simple implementation of the Set monad. The implementations are nearly identical, which begs for a DRYer expression of these instantiations – a topic we defer to a later section.
to fewer constraints on the operation ++. Inversely, Set records less information about order and multiplicity of the elements inhabiting the type; yet this corresponds to more properties imposed on the operation ++. To wit, on the data type Set, the operation ++ is required to be commutative, i.e. if s1 : Set[A] and s2 : Set[A], then (s1 ++ s2) == (s2 ++ s1). Likewise, it is required to be idempotent: if s : Set[A], then (s ++ s) == s.
This is a general principle worth internalizing. When the operations associated with a collection acquire more structure, i.e. enjoy more properties, the collection remembers less information about the individual inhabitants of the type, precisely because the operation associated with "collecting" identifies more inhabitants of the type. In some sense the assumption of properties drops a kind of veil down over individual structure. Contrapositively, "freedom" means that individual structure is the only carrier of information, or that all inhabitants of the type are "perfectly" individuated.
As seen below, the structure underlying the monadic view of List and Set is the data type we called a Monoid in chapter two. More specifically, it is the free monoid. It turns out that List is really just another syntax for the free monoid, while Set is a characterization of the smallest version of the monoid whose binary operation is commutative and idempotent. For those in the know, this means that Set is a model of Boolean algebra. In terms of our discussion of DSLs, this means that there is an isomorphism between the DSL of Boolean algebra and the data type Set.
Why go to such lengths to expose truths that most programmers know in their bones, even if they don't know that they know them? We return to our aim: complexity management. What we have seen is that there is a deep simplicity, in fact one common structure, underlying these data types. Moreover, the notion of monad provides a specific framework for factoring this common structure in a way that aligns both with the principles of DSL-based design and with mathematical wisdom now vetted over 50 years. Looked at from another point of view, it provides justification for the intuitions guiding proposals for DSL-based design. Language-oriented design hooks into and makes available a wide range of tools that actually can simplify code and encourage reuse.

Moreover, like the language design view, the categorical view also provides a factorization of the free structure, aka the grammar, and the identities on terms, aka the relations. In categorical language the addition of identities takes place in what's called the Eilenberg-Moore algebras of the monad. As we will see below, in a computational universe such as Scala this is just a four-syllable name for the action of pairing the grammar with the relations. As we will see in the last chapter, on semantic search, holding a place for the relations widens the scope of the applicability of this technology. Specifically, it provides a unified framework for constraint-based programming, significantly expanding the reach of LINQ-like technologies.
type SetList[X] = Set[List[X]]
trait SetListM extends Monad[SetList] {
  // map part of the SetList functor: lift through both layers
  def map[A,B]( a2b : A => B ) = {
    ( sa : Set[List[A]] ) => sa map ( ( la : List[A] ) => la map a2b )
  }
  // the unit natural transformation of the SetList monad
  def unit[A]( a : A ) = Set( List( a ) )
  // the mult natural transformation of the SetList monad
  def mult[A]( mma : Set[List[Set[List[A]]]] ) =
    ( ( Set.empty[List[A]] ) /: mma )(
      { ( acc : Set[List[A]], elem : List[Set[List[A]]] ) => ... }
    )
}
addition
| m & n
multiplication
| m ∗ n
7.3 Algebras
7.3.1 Kleisli
7.3.2 Eilenberg-Moore
Intuitionistic discipline
Linear discipline
TBD
8.1.3 ORM
ScalaQuery
Squeryl
8.3.2 Transactions
Chapter 9
TBD
Where are we; how did we get here; and where are we going?
monadic design pattern not only makes it clear that such a feature is a possibility, it makes the organization of the code to do it perfectly tractable. I cannot imagine a more powerful argument for the efficacy of this technique for structuring functional programs.
A little motivation The next couple of sections will introduce a little more apparatus. Hopefully, by now, the reader is convinced of the value of the more standard theoretical presentations of this kind of material, if for no other reason than the evident compression it affords. That said, we recognize the need to ground the introduction of new apparatus in good use cases. The discussion above can be turned directly into a use case. The central point of this chapter is to develop a query language for searching for programs in our toy language. Following the analogy we established at the outset of this book between select ... from ... where ... and for-comprehensions, this query language will allow users to write queries of the form

for( p <- d if c ) yield e
In all of the preceding chapters we deferred one of the most important questions: do monads themselves compose? After all, if the monad is the proposal to replace the notion of object, and the primary criticism of the notion of object is its lack of support for composition, hadn't we better check that monads compose?

Intriguingly, monads do not automatically compose. That is, if F = (F, unitF, multF) and G = (G, unitG, multG) are monads, it does not necessarily follow that

    F ∘ G ≜ (F ∘ G, unitF ∘ unitG, multF ∘ multG)

is a monad. What is needed in addition is a distributive map, a natural transformation d : G ∘ F ⇒ F ∘ G, with which the composite mult can be defined:
    multF∘G : F G F G --( F d G )--> F F G G --( multF multG )--> F G
[Diagram: the coherence (hexagon) condition for three monads F, G, H with distributive maps d1, d2, d3 – the two composite paths from H G F to F G H, one through G H F and G F H, the other through H F G and F H G, built from the di suitably whiskered, must agree]
These are the coherence conditions, the conditions of good interaction amongst the distributive maps. In fact, this is sufficient to scale out to arbitrary collections of monads. That is, if for any pair of monads in the collection we have a distributive map, and for any three we have the switching condition above, then composition is completely coherent and well defined. To illustrate that this is not just some abstract mathematical gadget, let's put it to work.
Preliminary

First we will consider a single distributive map. We will look at this in terms of two extremely simple monads: a DSL for forming arithmetic expressions involving only addition, i.e. a monoid, and a monad for collections, in this case Set.
case class MonoidExpr[Element]( val e : List[Element] )
class MMExpr[A] extends Monad[A,MonoidExpr] {
  override def unit( e : A ) = MonoidExpr( List( e ) )
  override def mult( mme : MonoidExpr[MonoidExpr[A]] ) =
    mme match {
      case MonoidExpr( Nil ) =>
        MonoidExpr( Nil )
      case MonoidExpr( mes ) =>
        MonoidExpr(
          ( ( Nil : List[A] ) /: mes )(
            { ( acc, me ) => me match {
                case MonoidExpr( es ) => acc ++ es
              }
            }
          )
        )
    }
}
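A quick hypothetical check of mult's flattening behavior:

val mm = new MMExpr[Int]()
mm.mult(
  MonoidExpr(
    List( MonoidExpr( List( 1, 2 ) ), MonoidExpr( List( 3 ) ) )
  )
)
// => MonoidExpr( List( 1, 2, 3 ) )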
Now, what we need to construct is a map d that takes elements inhabiting the type MMExpr[Set[A]] to elements inhabiting the type Set[MMExpr[A]]. The primary technique is what's called point-wise lifting of operations. Consider a simple example, such as the element

    e = MMExpr( List( Set( a1, a2 ), Set( b1, b2, b3 ) ) )

This element represents the composition of two sets. We can turn this into a set of compositions by considering pairs of a's with b's. That is,
e match {
  case MMExpr( s1 :: s2 :: Nil ) =>
    // the for-comprehension over the two Sets already yields a Set
    for( a <- s1; b <- s2 )
      yield { MMExpr( List( a, b ) ) }
  case ...
}
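Generalizing the pairwise trick to any number of component sets, a sketch of the distributive map might read as follows; d is our name for it, per the discussion above, and MonoidExpr is used directly as the carrier:

def d[A]( mse : MonoidExpr[Set[A]] ) : Set[MonoidExpr[A]] =
  mse match {
    case MonoidExpr( sets ) =>
      // point-wise lifting: extend every partial composition in the
      // accumulator with every element of the next component set
      ( ( Set( List.empty[A] ) ) /: sets )(
        ( acc, s ) =>
          for ( partial <- acc; a <- s ) yield ( partial :+ a )
      ) map ( ( es : List[A] ) => MonoidExpr( es ) )
  }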
If you recall, there's an alternative way to present monads that are algebras, like our monoid monad. Algebras are presented in terms of generators and relations. In our case the generators presentation is really just a grammar for monoid expressions.

This is subject to the following constraints, meaning that we will treat syntactic expressions of certain forms as denoting the same element of the monoid. To emphasize the nearly purely syntactic role of these constraints we will use a different symbol for the constraints. We also use the same symbol, ≡, for the smallest equivalence relation respecting these constraints.
Now, if we had a specific set in hand, say L (which we'll call a universe in the sequel), we could interpret the expressions in this language, aka formulae, in terms of operations on subsets of that set. As with our compiler for the concrete syntax of the lambda-calculus in chapter 1, we can express this translation very compactly as
Now, what's happening when we pull the monoid monad through the set monad via a distributive map is this. First, the monoid monad furnishes the universe, L, as the set of expressions generated by the grammar. We'll denote this by L(m). Then, we enrich the set of formulae by the operations of the monoid acting on sets.

The identity element, e, and the generators of the monoid, g1, ..., gn, can be considered 0-ary operations in the same way that we usually consider constants as 0-ary operations. To avoid confusion between these elements and the logical formulae that pick them out of the crowd, we write the logical formulae in boldface.

Now we can write our distributive map. Surprisingly, it is exactly a meaning for our logic!
Patterns

The construction of a language of patterns for our monoid expressions is also completely determined by the monadic structure. All we are really doing is constructing the data type of 1-holed contexts. In chapter 6 we showed how the derivative of a given regular data type is exactly the 1-holed contexts for the data type. This provides our first example of how to calculate the pattern language for our for-comprehensions. After calculation we arrive at
In some sense, the story here, much like the Sherlock Holmes story, is that the dog didn't bark. The patterns we calculate from our term language are precisely the sorts of patterns we would expect had we modeled our term language via Scala case classes. We can now use these pieces to flesh out some examples of the kinds of queries we might build. The expression

for( x <- d if ¬(¬e ∗ ¬e) & ¬e ) yield x
The whole point of working in this manner is that, by virtue of its compositional structure, it provides a much higher level of abstraction and greater opportunities for reuse. To illustrate the point, we will now iterate the construction using our toy language, the lambda-calculus, as the term language. As we saw in chapter 1, the lambda-calculus also has a generators-and-relations presentation. Unlike a monoid, however, the lambda calculus has another piece of machinery: reduction! In addition to structural equivalence of terms (which is a bi-directional relation) there is the beta-reduction rule that captures the behavioral aspect of the lambda calculus.
probe
| ⟨d⟩c
The first category of formulae, included for completeness, is, again, just the language of Boolean algebra we get because our collection monad is Set. The next category comes directly from the abstract syntax of the λ-calculus. The next group is of interest because it shows that the construction faithfully supports syntactic sugar. The semantics of the "sugar" formulae is the semantics of desugaring factored through our distributive map. These latter two categories allow us to investigate the structure of terms. The final category of formulae, which has only one entry, PROBE, is the means of investigating the behavior of terms.
Examples Before we get to the formal specification of the semantics of our logic,
let’s exercise intuition via a few examples.
3 In some sense this is one of the central contributions of the theory of computation back to mathematics. Algebraists have known for a long time about generators-and-relations presentations of algebraic structures (of which algebraic data types are a subset). This collective wisdom is studied, for example, in the field of universal algebra. Computational models like the lambda-calculus and, more recently, the process calculi, like Milner's π-calculus or Cardelli and Gordon's ambient calculus, take this presentation one step further and add a set of conditional rewrite rules to express the computational content of the model. It was Milner who first recognized this particular decomposition of language definitions in his seminal paper, Functions as Processes, where he reformulated the presentation of the π-calculus along these lines.
• for( fixpt <- d
       if ⟨(f) => ((x) => f(x(x)))((x) => f(x(x)))⟩(true) )
    yield fixpt
The first of these will return the expressions in "function" position applied to actual parameters meeting the conditions ci, respectively. The second will return all actual parameters of expressions that calculate fixpoints. Both of these examples are representative of common code optimization schemes that are usually carefully hand-coded. The third example finds all elements in d that are already fixpoints of a given function, f.
Logical semantics

mention
| [[x]] = {m ∈ L(m) | m ≡ x}
abstraction
| [[(x1,...,xk) => c]] = {m ∈ L(m) | m ≡ (x1,...,xk) => m′, m′ ∈ [[c]]}
application
| [[c(c1,...,ck)]] = {m ∈ L(m) | m ≡ m′(m1,...,mk), m′ ∈ [[c]], mi ∈ [[ci]]}
probe
| [[⟨d⟩c]] = {m ∈ L(m) | ∃m′ ∈ [[d]]. m′(m) → m″, m″ ∈ [[c]]}
Stateful collections
{ form : form1 <- data1, ..., formK <- dataK, constraint1, ..., constraintN }
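Read against the book's running analogy between comprehensions and queries, this shape lines up directly with a Scala for-comprehension. A small concrete instance, with invented data, for illustration:

// pairs ( x, y ) drawn from two collections, filtered by a
// constraint, yielding a combined form
val data1 = List( 1, 2, 3 )
val data2 = List( 10, 20 )
val results =
  for {
    x <- data1        // form1 <- data1
    y <- data2        // formK <- dataK
    if x % 2 == 1     // constraint1
  } yield ( x, y )    // form
// => List( (1,10), (1,20), (3,10), (3,20) )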
10.4.2 Examples