Amr Sabry: (1) : 1 (000, January 1993 C 1993 Cambridge University Press
Functional Programming 1 (1): 1-000, January 1993 © 1993 Cambridge University Press
Abstract
Functional programming languages are informally classified into pure and impure languages. The precise meaning of this distinction has been a matter of controversy. We therefore investigate a formal definition of purity. We begin by showing that some proposed definitions that rely on confluence, soundness of the beta axiom, preservation of pure observational equivalences, and independence of the order of evaluation, do not withstand close scrutiny. We propose instead a definition based on parameter-passing independence. Intuitively, the definition implies that functions are pure mappings from arguments to results; the operational decision of how to pass the arguments is irrelevant. In the context of Haskell, our definition is consistent with the fact that the traditional call-by-name denotational semantics coincides with the traditional call-by-need implementation. Furthermore, our definition is compatible with the stream-based, continuation-based, and monad-based integration of computational effects in Haskell. Finally, we observe that call-by-name reasoning principles are unsound in compilers for monadic Haskell.
comp.lang.functional). Even in published papers we find varying statements regarding the definition of purity that refer to notions like the soundness of the $\beta$-axiom (Odersky et al., 1993), referential transparency (Launchbury & Peyton Jones, 1995), the confluence of a calculus for the language (Swarup et al., 1991), the preservation of pure observational equivalences (O'Hearn, 1995), and the independence of order of evaluation (Launchbury & Peyton Jones, 1995).
The investigation of a formal definition of purity goes beyond settling some differences in opinion. It is crucial at this time when a significant amount of current research aims for efficient realizations of stateful algorithms in functional languages while maintaining the purity of these functional languages. In the absence of a formal definition of purity, we cannot judge the correctness of such extensions. In fact, we cannot even state the correctness properties that need to be proven. This work is therefore a first step towards reasoning about the imperative extensions of functional languages.
Our main point is to propose the following definition of purity:
A language is purely functional if (i) it includes every simply typed $\lambda$-calculus term, and (ii) its call-by-name, call-by-need, and call-by-value implementations are equivalent (modulo divergence and errors).
We will formalize this statement in Section 4.3. To get to our main result, we proceed according to the following plan:
1. Since the notion of purity is an informal one, we begin with some assumptions. First, we assume that a language is functional (pure or not) if it includes the simply typed $\lambda$-calculus. Second, we assume that the following three languages are purely functional: the language $\Lambda$ that extends the call-by-name $\lambda$-calculus with numbers and addition, and the languages PCF and PPCF (Plotkin, 1977). These assumptions may be challenged but they appear to be consistent with the informal practice. Finally, two of the informal definitions of purity, referential transparency and independence of order of evaluation, do not have universally agreed-upon definitions and are not considered any further. (See, however, the treatment of referential transparency by Søndergaard and Sestoft (1990).)
2. In Section 2, we specify the semantics of our pure (by assumption) languages and study some of their properties. Any property that holds at this point is potentially relevant to the formal definition of purity.
3. In Section 3, we eliminate many potential definitions of purity using the following strategy. On one hand, we extend the pure language $\Lambda$ to $\Lambda_!$ by adding two expressions, inc and read, that perform side-effects on an implicit global location that contains a natural number. The language $\Lambda_!$ is reminiscent of canonical impure languages such as Scheme and SML, with the important caveat that it has call-by-name semantics rather than call-by-value semantics. We assume that it should not be classified as purely functional according to any proposed definition. We conclude that any property of $\Lambda$ that still holds in $\Lambda_!$ is insufficient to characterize purity. Two such properties are the soundness of the $\beta$ axiom, and the confluence of the associated calculus. On the other hand, we extend PCF to PPCF by adding one constant por which, by assumption, preserves the purity of the language. Hence, any property of PCF that no longer holds in PPCF is insufficient to characterize purity. In particular, since the addition of por breaks some PCF observational equivalences while retaining purity, we conclude that observational equivalence alone cannot characterize purity.
4. The elimination procedure gets rid of most candidate definitions, but it does leave one reasonable alternative: purity means that the language semantics is insensitive to the parameter-passing mechanism. This alternative is explored in Section 4.
5. Our proposed definition has a drawback: it requires the existence of a notion of value, and several evaluation functions (implementations) for the same syntax. Nevertheless, we demonstrate in Section 5 that it is compatible with several designs, some widely used and some less so, for including computational effects in purely functional languages.
Before concluding, we briefly discuss implementations of Haskell with monadic state and their correctness.
2.1

Definition 2.1 (Syntax of $\Lambda$)
The set of $\Lambda$-terms extends the language of the $\lambda$-calculus (variables, procedures, and applications) with basic and functional constants. For the sake of presentation, we consider a representative set of constants that contains numerals and addition. We do not make any assumptions about the type structure of the language. Let $x, y, z$ range over an infinite set of variables $\mathit{Vars}$, and $n$ range over the natural numbers:
\[ M, N, L \in \mathit{Term} ::= x \mid \lambda x.M \mid M\,N \mid n \mid M + N \]
The language has the following context-sensitive properties. In a procedure $(\lambda x.M)$, the variable $x$ is bound in the body $M$. A variable that is not bound is free. A term with no free variables is closed. Like Barendregt (1984, ch 2,3), we identify terms modulo bound variables and we assume that free and bound variables do not interfere in definitions or theorems. The term $M[N/x]$ is the result of the capture-free substitution of all free occurrences of $x$ in $M$ by $N$. A context $C$ is a term with a hole $[\,]$ in the place of one subterm. The operation of filling the context $C$ with a term $M$ yields the term $C[M]$, possibly capturing some free variables of $M$ in the process.
The semantics of $\Lambda$ is a partial function $\mathit{eval}_n$ from programs to observables.

Definition 2.2 (Programs and Observables)
A program is a closed term. An observable $B$ is either a number or the tag proc indicating a procedure.

Thus, as usual, the code of a procedure is not observable. We choose to specify the partial function $\mathit{eval}_n$ using a term rewriting machine since this approach does not require the introduction of many new concepts. The machine states are simply closed terms. To perform a step $\mapsto$, the machine decomposes the current term into an evaluation context $E$ and a redex, and then rewrites the redex. The full definition follows.

Definition 2.3 ($\mathit{eval}_n$)
The partial function $\mathit{eval}_n$ from terms to observables is defined as: $\mathit{eval}_n(M) = B$ if $M \mapsto^* A$ and $\mathit{obs}(A) = B$, where $\mapsto^*$ is the reflexive transitive closure of $\mapsto$ and:
Answers
\[ A ::= n \mid \lambda x.M \]
Evaluation contexts
\[ E ::= [\,] \mid E\,M \mid E + M \mid n + E \]
State transitions
\[ E[(\lambda x.M)\,N] \mapsto E[M[N/x]] \]
\[ E[n_1 + n_2] \mapsto E[n] \quad \text{where } n = n_1 + n_2 \]
Observing answers
\[ \mathit{obs}(n) = n \qquad \mathit{obs}(\lambda x.M) = \mathit{proc} \]
The following steps of the machine show that $\mathit{eval}_n((\lambda x.x + x)\,(4 + 2)) = 12$ and $\mathit{eval}_n((\lambda x.x\,x)\,(\lambda y.y)) = \mathit{proc}$:
\[ (\lambda x.x + x)\,(4 + 2) \mapsto (4 + 2) + (4 + 2) \mapsto 6 + (4 + 2) \mapsto 6 + 6 \mapsto 12 \]
\[ (\lambda x.x\,x)\,(\lambda y.y) \mapsto (\lambda y.y)\,(\lambda y.y) \mapsto (\lambda y.y) \]
Sometimes the term rewriting machine gets stuck and can no longer proceed.
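The call-by-name machine can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tagged-tuple encoding of terms and the names `subst`, `step`, and `eval_n` are choices made for this sketch, and substitution is simplified by relying on the fact that, for closed programs, the machine only ever substitutes closed arguments.

```python
# Terms are tagged tuples (an encoding chosen for this sketch):
#   ("var", x) | ("lam", x, body) | ("app", fun, arg) | ("num", n) | ("add", left, right)

def subst(term, x, closed):
    """Replace the free occurrences of variable x in term by a closed term."""
    tag = term[0]
    if tag == "var":
        return closed if term[1] == x else term
    if tag == "lam":
        # An inner binder of the same name shadows x. Because the substituted
        # term is closed, no variable capture can occur.
        if term[1] == x:
            return term
        return ("lam", term[1], subst(term[2], x, closed))
    if tag in ("app", "add"):
        return (tag, subst(term[1], x, closed), subst(term[2], x, closed))
    return term  # numerals contain no variables


def step(term):
    """Perform one machine transition; return None for answers and stuck terms."""
    tag = term[0]
    if tag == "app":
        fun, arg = term[1], term[2]
        if fun[0] == "lam":                      # E[(\x.M) N] -> E[M[N/x]]
            return subst(fun[2], fun[1], arg)
        reduced = step(fun)                      # evaluation context E M
        return None if reduced is None else ("app", reduced, arg)
    if tag == "add":
        left, right = term[1], term[2]
        if left[0] == "num" and right[0] == "num":   # E[n1 + n2] -> E[n]
            return ("num", left[1] + right[1])
        if left[0] != "num":                     # evaluation context E + M
            reduced = step(left)
            return None if reduced is None else ("add", reduced, right)
        reduced = step(right)                    # evaluation context n + E
        return None if reduced is None else ("add", left, reduced)
    return None


def eval_n(program):
    """Run the machine to an answer and observe it; loops on diverging programs."""
    term = program
    while True:
        reduced = step(term)
        if reduced is None:
            break
        term = reduced
    if term[0] == "num":
        return term[1]
    if term[0] == "lam":
        return "proc"
    raise ValueError("stuck term")
```

Applied to the two programs above, encoded as tuples, `eval_n` follows the same transition sequences as the machine; a term such as the application of a numeral raises `ValueError`, which corresponds to a stuck state.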
Definition 2.5 (Stuck)
A term $M$, where $M$ is an application or an addition, is stuck if the term rewriting machine has no transition from that term.
In the remainder of this paper, we will study the language $\Lambda$ and its extensions.
Definition 2.6 (Conservative Extension (Felleisen, 1991))
A language $L_2$ is a conservative extension of a language $L_1$ if the set of $L_2$-terms (programs) includes the set of $L_1$-terms (programs), the set of $L_2$-observables includes the set of $L_1$-observables, and the semantics of $L_2$ extends the semantics of $L_1$, i.e., for all $L_1$-programs $M$, we have that $\mathit{eval}_{L_2}(M) = B$ if and only if $\mathit{eval}_{L_1}(M) = B$.
The following are observational equivalences in $\Lambda$:
1. $(\lambda x.M)\,N = M[N/x]$
2. $M + N = N + M$
The proofs are tedious but straightforward. The idea is to set up a relation between machine states containing the left hand side of the equivalence and machine states containing the right hand side. Then it suffices to show that related states rewrite to related states.
2.3 Calculus
A $\lambda$-calculus is an equational theory over $\Lambda$ with a (finite) number of axiom schemas and inference rules. The inference rules extend the axioms to an equivalence relation compatible with contexts (a congruence). The set of axioms should be rich enough to specify the evaluation function but can otherwise include any equalities that are sound with respect to the observational equivalence relation of the language. We write $\vdash M = N$ when $M = N$ is provable in the calculus. By the congruence rules, if $\vdash M = N$ then $\vdash C[M] = C[N]$ for all contexts $C$.
Definition 2.9 (Axioms for $\Lambda$)
A typical calculus for $\Lambda$ could include the following axioms:
\[
\begin{array}{rcll}
(\lambda x.M)\,N & = & M[N/x] & (\beta) \\
(M + N) + L & = & M + (N + L) & (S) \\
n + M & = & M + n & (C) \\
n_1 + n_2 & = & n \quad \text{where } n = n_1 + n_2 & (A)
\end{array}
\]
The axioms are all sound with respect to the observational equivalence relation. This guarantees the consistency and correctness of the system. We do not generalize axiom $(C)$ to $N + M = M + N$ as the future extension of the language with assignments would make the more general axiom unsound. Furthermore, the current axioms are sufficient for evaluation.
The idea is to prove the following statement: If $M \mapsto^* N$ and $N$ is not stuck, then $\vdash M = N$. This latter statement follows because the machine's transitions can be performed using the axioms $\beta$ and $A$ given the compatibility of the relation $=$.
It is sound to non-deterministically apply the axioms to a program until it reaches an answer: the order of the reductions has no semantic significance. Another way to state this fact is to reason syntactically about the axioms.
Lemma 2.11
Proof Sketch
The statement is an immediate consequence of the Church-Rosser theorem. It is elementary to check that the Church-Rosser property holds using the following idea. We direct each axiom from left to right to yield a system of reductions. We then divide the reductions into three groups. The first group $G_1$ includes $C$ and is Church-Rosser because the reduction forms an orthogonal combinatory reduction system (Klop et al., 1993). The second group $G_2$ includes $\beta$ which is Church-Rosser (Barendregt, 1984). The third group $G_3$ includes the remaining reductions $S$ and $A$ and is Church-Rosser because the reflexive closure of the reductions satisfies the diamond property (Barendregt, 1984, Ch.3). The result follows by the Hindley-Rosen Lemma (Barendregt, 1984) since $G_2$ commutes with $G_3$, and the union of $G_2$ and $G_3$ commutes with $G_1$.
3 An Imperative Extension
To study the impact of computational effects on the properties of purely functional languages, we extend our language with expressions whose evaluation performs global side-effects. Any suggested definition of purity should identify such a language as not pure.

Definition 3.1 (Syntax of $\Lambda_!$)
The set of terms extends the set in Definition 2.1:
\[ M, N, L \in \mathit{Term} ::= \ldots \mid \mathit{inc} \mid \mathit{read} \]
The two new constructs act on an implicit global location which is initialized to 0. Informally speaking, the evaluation of inc returns the current value of the global location and increments it as a side-effect. The evaluation of read returns the current contents of the global location. The formal semantics is specified using an extension of the term rewriting machine. States are now pairs whose first component is the current program, and whose second component $\ell$ is the current value of the global location.
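Definition 3.2 ($\mathit{eval}_!$)
The partial function $\mathit{eval}_!$ from terms to observables is defined as: $\mathit{eval}_!(M) = B$ if $\langle M, 0 \rangle \mapsto^* \langle A, \ell \rangle$ and $\mathit{obs}(A) = B$. The definitions of answers, evaluation contexts, and $\mathit{obs}$ are identical to the ones in Definition 2.3. The state transitions are:
\[ \langle E[(\lambda x.M)\,N], \ell \rangle \mapsto \langle E[M[N/x]], \ell \rangle \]
\[ \langle E[n_1 + n_2], \ell \rangle \mapsto \langle E[n], \ell \rangle \quad \text{where } n = n_1 + n_2 \]
\[ \langle E[\mathit{inc}], \ell \rangle \mapsto \langle E[\ell], \ell + 1 \rangle \]
\[ \langle E[\mathit{read}], \ell \rangle \mapsto \langle E[\ell], \ell \rangle \]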
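Example 3.3
The following steps of the machine show that $\mathit{eval}_!((\lambda x.x + x)\,\mathit{inc}) = 1$:
\[ \langle (\lambda x.x + x)\,\mathit{inc}, 0 \rangle \mapsto \langle \mathit{inc} + \mathit{inc}, 0 \rangle \mapsto \langle 0 + \mathit{inc}, 1 \rangle \mapsto \langle 0 + 1, 2 \rangle \mapsto \langle 1, 2 \rangle \]
It is straightforward to verify that $\Lambda_!$ is a conservative extension of $\Lambda$. In other words, the result of evaluating a pure term using $\mathit{eval}_!$ coincides with the result of evaluating it using $\mathit{eval}_n$.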
Without further information about a language, the soundness of the $\beta$ axiom does not guarantee that the language is purely functional.
However, as generally expected from an imperative extension (Felleisen, 1991), the observational equivalence relation of $\Lambda_!$ differs from the one for $\Lambda$.
From Proposition 2.8, we have $x + y = y + x$. In $\Lambda_!$, this equivalence no longer holds as the terms can be distinguished by the context $((\lambda x.\lambda y.[\,])\ \mathit{inc}\ \mathit{read})$:
\[ \mathit{eval}_!((\lambda x.\lambda y.x + y)\ \mathit{inc}\ \mathit{read}) = 1 \]
\[ \mathit{eval}_!((\lambda x.\lambda y.y + x)\ \mathit{inc}\ \mathit{read}) = 0 \]
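The failure of commutativity can be replayed with a small Python sketch of the imperative machine. The tuple encoding of terms and the names `subst`, `step`, and `eval_bang` are assumptions of this sketch, and the rules for inc and read follow the informal description of the two constructs: inc yields the current value of the location and increments it, while read yields the current value and leaves it unchanged.

```python
# Terms (encoding assumed for this sketch):
#   ("var", x) | ("lam", x, body) | ("app", fun, arg) | ("num", n)
#   ("add", left, right) | ("inc",) | ("read",)

def subst(term, x, closed):
    """Replace the free occurrences of x by a closed term (no capture possible)."""
    tag = term[0]
    if tag == "var":
        return closed if term[1] == x else term
    if tag == "lam":
        if term[1] == x:          # inner binder shadows x
            return term
        return ("lam", term[1], subst(term[2], x, closed))
    if tag in ("app", "add"):
        return (tag, subst(term[1], x, closed), subst(term[2], x, closed))
    return term                   # numerals, inc, read


def step(term, loc):
    """One transition on the state <term, loc>; None for answers and stuck terms."""
    tag = term[0]
    if tag == "inc":              # return the current value, then increment
        return ("num", loc), loc + 1
    if tag == "read":             # return the current value, state unchanged
        return ("num", loc), loc
    if tag == "app":
        fun, arg = term[1], term[2]
        if fun[0] == "lam":       # call-by-name: the argument is passed unevaluated
            return subst(fun[2], fun[1], arg), loc
        nxt = step(fun, loc)
        return None if nxt is None else (("app", nxt[0], arg), nxt[1])
    if tag == "add":
        left, right = term[1], term[2]
        if left[0] == "num" and right[0] == "num":
            return ("num", left[1] + right[1]), loc
        if left[0] != "num":      # left operand first
            nxt = step(left, loc)
            return None if nxt is None else (("add", nxt[0], right), nxt[1])
        nxt = step(right, loc)
        return None if nxt is None else (("add", left, nxt[0]), nxt[1])
    return None


def eval_bang(program):
    """Run from the initial state <program, 0> and observe the answer."""
    term, loc = program, 0
    while True:
        nxt = step(term, loc)
        if nxt is None:
            break
        term, loc = nxt
    if term[0] == "num":
        return term[1]
    if term[0] == "lam":
        return "proc"
    raise ValueError("stuck term")


def in_context(body):
    """Fill the context ((\\x.\\y.[]) inc read) with the given body."""
    return ("app", ("app", ("lam", "x", ("lam", "y", body)), ("inc",)), ("read",))
```

Filling the context with `x + y` evaluates inc before read, whereas `y + x` evaluates read first, which is why the two observables differ.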
The relationship between the two observational equivalence relations for $\Lambda$ and $\Lambda_!$ may suggest that purity requires that the observational equivalence relation of a language coincide with that of an underlying purely functional subset. However, we show that a naive interpretation of this idea is incorrect. It is possible to break some observational equivalences of a purely functional language by extending it with a pure but non-expressible (Felleisen, 1991) construct. The standard illustration of this situation is the pair of purely functional languages PCF and PPCF (Plotkin, 1977). The language PCF extends the simply typed $\lambda$-calculus with constants for expressing recursion, conditionals, and operations on the natural numbers. The language PPCF extends PCF with a parallel (but deterministic) operator por. Consider the PCF terms $M(1)$ and $M(2)$ where $\Omega$ is a canonical diverging term:
\[ M(u) = \lambda f.\ \mathit{if}\ (f\ \mathit{True}\ \Omega)\ (\mathit{if}\ (f\ \Omega\ \mathit{True})\ (\mathit{if}\ (f\ \mathit{False}\ \mathit{False})\ \Omega\ u)\ \Omega)\ \Omega \]
It is a standard result that $M(1)$ and $M(2)$ are observationally equivalent in PCF but not in PPCF (Plotkin, 1977). The way to distinguish the terms in PPCF is to apply them to por. The latter construct bypasses the first two conditional tests as it returns True if either of its arguments is True even if the other argument diverges. This result motivates the following statement.
Fact 3.7
Without further information about a language, the non-preservation of pure observational equivalences does not imply that the language is not purely functional.
3.2 Calculus
As for the pure language, we can also realize the evaluation function for $\Lambda_!$ using a calculus. Our calculus includes all of the axioms of the pure language and some additional axioms that manipulate the imperative constructs. To conveniently express these imperative axioms, we extend the internal syntax of the language with a new construct $(\mathit{ref}\ \ell\ M)$. A source program $M$ is mapped to the internal term $(\mathit{ref}\ 0\ M)$ and the axioms apply to that latter term. This trick is only necessary because our source language is not a realistic language. Had the language been richer, for example, as rich as Scheme, then all the axioms would be expressible in the source language itself (Sabry & Field, 1993; Felleisen & Hieb, 1992).
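A possible set of axioms includes the axioms in Definition 2.9 and:
\[
\begin{array}{rcll}
\mathit{ref}\ \ell\ \mathit{read} & = & \mathit{ref}\ \ell\ \ell & (R_1) \\
\mathit{ref}\ \ell\ (\mathit{read} + M) & = & \mathit{ref}\ \ell\ (\ell + M) & (R_2) \\
\mathit{ref}\ \ell\ \mathit{inc} & = & \mathit{ref}\ (\ell + 1)\ \ell & (I_1) \\
\mathit{ref}\ \ell\ (\mathit{inc} + M) & = & \mathit{ref}\ (\ell + 1)\ (\ell + M) & (I_2)
\end{array}
\]
As before the calculus is confluent and the axioms are sufficient for evaluation.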
Lemma 3.9
In other words, we can still evaluate a program by non-deterministically applying the axioms until we reach an answer: the order of reductions does not affect the relative order of the imperative operations. (See Figure 1 for 4 different proofs that $\mathit{eval}_!((\mathit{inc} + \mathit{read}) + (\mathit{inc} + \mathit{inc})) = 4$.) This result motivates the following fact.
Fact 3.11
Without further information about a language, the confluence of a calculus for the language does not guarantee that the language is purely functional.
With all the negative results in the previous section, one might suspect that we have missed some fundamental property of purely functional languages. Indeed, we have not at all considered their implementations in practice and the connection between the semantics and the implementation. We therefore examine a practical implementation of the language $\Lambda$.
4.1 Call-by-Need
The call-by-need evaluator achieves an efficient realization of $\mathit{eval}_n$ by sharing the evaluation of non-trivial expressions. These expressions are easy to identify from our semantic specifications: any expression that is reduced when in the hole of an evaluation context is non-trivial. For example, in the language $\Lambda$, both applications and additions are non-trivial (see Definition 2.3). The other expressions, called syntactic values, have a trivial evaluation, and are allowed to be duplicated and hence re-evaluated several times.

Definition 4.1 (Syntactic Value)
In $\Lambda$, the following subset of terms are syntactic values (Plotkin, 1975):
\[ V ::= n \mid \lambda x.M \]
Definition 4.2
The call-by-need evaluator is defined (Ariola et al., 1995; Ariola & Felleisen, 1996) as follows: $\mathit{eval}_z(M) = B$ if $M \mapsto^* A$ and $\mathit{obs}(A) = B$, where:
Answers
\[ A ::= V \mid (\lambda x.A)\,M \]
Evaluation contexts
\[ E ::= [\,] \mid E\,M \mid E + M \mid n + E \mid (\lambda x.E)\,M \mid (\lambda x.E_1[x])\,E_2 \]
State transitions
[...]
Observing answers
\[ \mathit{obs}(n) = n \qquad \mathit{obs}(\lambda x.M) = \mathit{proc} \qquad \mathit{obs}((\lambda x.A)\,M) = \mathit{obs}(A) \]
The call-by-need implementation is correct since it defines the same partial function as the call-by-name implementation (Ariola et al., 1995; Ariola & Felleisen, 1996).

Theorem 4.3
If $\mathit{eval}_n(M) = B_1$ and $\mathit{eval}_z(M) = B_2$ then $B_1 = B_2$.
Furthermore, the call-by-need evaluation is expected to be much more efficient in practice.
To define the call-by-need evaluator for $\Lambda_!$, we must decide whether the new expressions read and inc are syntactic values or not. Again the answer is evident from the reductions in Definition 3.2. Both read and inc are reducible when in the hole of an evaluation context, and hence are not values. It is now easy to see that a call-by-need evaluator for $\Lambda_!$ would not be observationally equivalent to the call-by-name one. For example, we have:
\[ \mathit{eval}_!((\lambda x.x + x)\,\mathit{inc}) = \mathit{eval}_!(\mathit{inc} + \mathit{inc}) = 1 \]
But attempting to optimize the interpreter $\mathit{eval}_!$ by sharing the evaluation of the non-value inc would instead produce 0. Thus the equivalence of call-by-name and call-by-need that is crucial for the efficient implementation of $\Lambda$ does not hold for $\Lambda_!$. Implementations of $\Lambda_!$ cannot rely on laziness to implement the non-strict semantics.
This observation suggests that purity manifests itself in practice when we are trying to combine different parameter-passing mechanisms (in the specification of the semantics and in the implementation). It is therefore reasonable to conjecture that purity implies that these different parameter-passing mechanisms are equivalent.
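The disagreement between the parameter-passing mechanisms on $(\lambda x.x + x)\,\mathit{inc}$ can be reproduced with thunks. The following Python sketch is an illustration only: the class `Store` and the helpers `by_name`, `by_need`, and `by_value` are names invented here, and each helper models how a parameter is delivered to the body of a procedure as a zero-argument function.

```python
class Store:
    """The implicit global location, initialised to 0."""

    def __init__(self):
        self.loc = 0

    def inc(self):
        """Return the current contents and increment them as a side effect."""
        old = self.loc
        self.loc += 1
        return old


def by_name(thunk):
    """Call-by-name: every use of the parameter re-runs the argument."""
    return thunk


def by_need(thunk):
    """Call-by-need: the first use runs the argument, later uses share the result."""
    cache = []

    def force():
        if not cache:
            cache.append(thunk())
        return cache[0]

    return force


def by_value(thunk):
    """Call-by-value: the argument runs exactly once, before the call."""
    value = thunk()
    return lambda: value


def double(x):
    """Body of the procedure \\x. x + x; x is a function standing for the parameter."""
    return x() + x()


def run(passing):
    """Evaluate (\\x. x + x) inc in a fresh store under the given mechanism."""
    store = Store()
    return double(passing(store.inc))
```

Under these assumptions `run(by_name)` performs the increment twice and adds 0 and 1, while `run(by_need)` and `run(by_value)` perform it once and add 0 and 0.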
4.2 Call-by-Value
Having identified a possible connection between purity and parameter-passing, we study the role of call-by-value in this context. Consider a (malicious?) implementor who used a call-by-value evaluator to realize the semantic function in Definition 2.3. What would be wrong?
Definition 4.4
The call-by-value evaluator is defined as: $\mathit{eval}_v(M) = B$ if $M \mapsto^* A$ and $\mathit{obs}(A) = B$. Answers and $\mathit{obs}$ are identical to the ones in Definition 2.3:
Evaluation contexts
\[ E ::= [\,] \mid E\,M \mid E + M \mid n + E \mid V\,E \]
State transitions
\[ E[(\lambda x.M)\,V] \mapsto E[M[V/x]] \]
\[ E[n_1 + n_2] \mapsto E[n] \quad \text{where } n = n_1 + n_2 \]
The call-by-value evaluator is not correct in the sense that it defines a different partial function from $\mathit{eval}_n$. But could a user ever observe a difference? If the call-by-name semantics specifies that a program should terminate with an observable answer, then the call-by-value evaluator will either:
1. terminate with the same observable answer, or
2. not terminate.
In the first case, the user observes the same behavior. In the second case, the user does not observe anything and hence cannot ascertain that the evaluator is incorrect: maybe it is just slow.
Proposition 4.5
If $\mathit{eval}_n(M) = B$ then either $\mathit{eval}_v(M) = B$, or $\mathit{eval}_v(M)$ is undefined. Conversely, if $\mathit{eval}_v(M) = B$ then $\mathit{eval}_n(M) = B$.
4.3 Thesis
The previous two subsections motivate the following definition. Let $P$ range over a set of programs, $B$ range over a set of observables, and $\mathit{eval}_1$ and $\mathit{eval}_2$ be two partial functions (implementations) from programs to observables. We say $\mathit{eval}_1$ is weakly equivalent to $\mathit{eval}_2$ when the following conditions hold:
If $\mathit{eval}_1(P) = B$ then either $\mathit{eval}_2(P) = B$ or $\mathit{eval}_2(P)$ is undefined.
If $\mathit{eval}_2(P) = B$ then either $\mathit{eval}_1(P) = B$ or $\mathit{eval}_1(P)$ is undefined.
We can now formulate our thesis precisely.
A language is purely functional if:
1. it is a conservative extension of the simply typed $\lambda$-calculus,
2. it has well-defined call-by-value, call-by-need, and call-by-name evaluation functions (implementations), and
3. all three evaluation functions (implementations) are weakly equivalent.
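The weak-equivalence condition of the thesis amounts to saying that two implementations never produce different observables on the same program. A minimal Python sketch follows, with loud assumptions: an evaluator is modelled as a function that returns None where it is undefined (divergence cannot be detected in general, so this is a modelling device), the check runs over a finite sample of programs only, and the four tables are hypothetical evaluators written out by hand from the examples in the text.

```python
def weakly_equivalent(eval1, eval2, programs):
    """Check both weak-equivalence conditions on a finite sample of programs.

    Each evaluator maps a program to an observable, or to None where it is
    undefined. The two conditions together say: whenever both evaluators are
    defined on a program, they agree on the observable.
    """
    for program in programs:
        b1, b2 = eval1(program), eval2(program)
        if b1 is not None and b2 is not None and b1 != b2:
            return False
    return True


# Hypothetical evaluators, tabulated by hand for illustration.
PURE_BY_NAME = {"(\\x.x + x) (4 + 2)": 12, "(\\x.1) omega": 1}
PURE_BY_VALUE = {"(\\x.x + x) (4 + 2)": 12}   # undefined on the second program
IMPURE_BY_NAME = {"(\\x.x + x) inc": 1}
IMPURE_BY_NEED = {"(\\x.x + x) inc": 0}
```

With these tables, the pure evaluators are weakly equivalent even though call-by-value is undefined on a program whose argument diverges, whereas the two evaluators for the imperative language are not, since they return 1 and 0 on the same program.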
There are several important points to note:
• The first condition in the definition requires that the language be a conservative extension of the simply typed $\lambda$-calculus. This condition guards against languages with no functions, which would hence vacuously satisfy the second and third conditions.
• Among the many parameter-passing mechanisms we have selected call-by-value, call-by-name, and call-by-need as the relevant ones for the thesis. This choice appears to work well as it allows us to verify that the subset of SML (a call-by-value language) without assignments and exceptions is pure, and also that Haskell (a language with a call-by-name denotational semantics and a call-by-need implementation) is pure. It may be the case that the thesis could be formulated with only two of the parameter-passing mechanisms, for example by omitting call-by-value entirely. This new thesis would essentially be about sharing of computations since this is the fundamental difference between call-by-name and call-by-need. We leave this point as an open problem.
• A drawback of this definition is that it requires the existence of several evaluation functions (implementations) for the same syntax. Starting from a call-by-value language like Scheme, it is straightforward to devise a call-by-need or call-by-name evaluator. However, starting from a call-by-name language like $\Lambda_!$ or Idealized Algol (Reynolds, 1991; Reynolds, 1981; Reynolds, 1988), the design of the call-by-value or call-by-need variant first requires setting a notion of syntactic value. This latter decision affects the purity of the language. Indeed, as we will see in the next section, by varying the notion of value in $\Lambda_!$, we can design a new variant of the language that is purely functional.
• The thesis follows the convention that non-termination and errors are special kinds of computation whose effects are not observable. Hence expressions that diverge, or evaluate to a black hole, or an error are all considered equivalent. If errors become observable, then not even PCF would be pure (Cartwright & Felleisen, 1991; Cartwright et al., 1993).
5 Case Studies
Using our proposed definition it is straightforward to confirm some common claims. For example, the subsets of Scheme and SML excluding assignments, pointer equality, exceptions, and control operators are purely functional, and their extensions with assignments, call/cc, or eq? are not purely functional. Also Haskell is pure as long as one observes neither errors nor non-termination (black holes). To show the applicability of our definition beyond these simple examples, we study several extended languages in this section.
Given our definition of purity, the design of a purely functional variant of $\Lambda_!$ requires the construction of call-by-value, call-by-need, and call-by-name evaluation functions that behave similarly. These evaluation functions already differ on simple programs like $((\lambda x.x + x)\,\mathit{inc})$ as the program evaluates to 0 using call-by-value or call-by-need but evaluates to 1 using call-by-name. An obvious way of making the evaluation functions agree on the program is to treat the expressions inc and read as values, which we write as incM and readM for clarity. This implies that the program would be equivalent to (incM + incM) using any parameter-passing mechanism. But this only solves part of the problem. Consider now the term:
\[ (\lambda x.x + x)\,(\mathit{incM} + \mathit{readM}) \]
The argument is not a value and again we are in a situation where the program evaluates to different results under call-by-value and call-by-name. The solution is however as simple as before: treat the expression (incM + readM) as a (constructed) value, which we write as (Plus incM readM) for clarity. We have thus arranged for the evaluation of $((\lambda x.\mathit{Plus}\ x\ x)\ \mathit{incM})$ to produce the value (Plus incM incM) as its final answer using any parameter-passing mechanism. The evaluation does not perform any computational effects but just collects the demands for computational effects and propagates them to the top level of the program as the final answer. The mapping of answers to observables would need to perform the computational effects to print the expected answer of 1. Putting things together, the formal syntax and semantics of our language are now as follows.
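Definition 5.1 (Syntax of $\Lambda_s$)
The set of terms is defined as:
\[ M, N, L \in \mathit{Term} ::= x \mid \lambda x.M \mid M\,N \mid n \mid \mathit{Plus}\ M\ N \mid \mathit{incM} \mid \mathit{readM} \]
\[ V \in \mathit{Value} ::= n \mid \lambda x.M \mid \mathit{Plus}\ V_1\ V_2 \mid \mathit{incM} \mid \mathit{readM} \]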
Definition 5.2 ($\mathit{eval}_s$)
The partial function $\mathit{eval}_s$ from terms to observables is defined as: $\mathit{eval}_s(M) = B$ if $M \mapsto^* A$ and $\mathit{obs}(A) = B$, where:
Answers
[...]
Evaluation contexts
\[ E ::= [\,] \mid E\,M \mid \mathit{Plus}\ E\ M \mid \mathit{Plus}\ V\ E \]
State transitions
\[ E[(\lambda x.M)\,N] \mapsto E[M[N/x]] \]
The mapping of answers to observables is more complicated than usual since it needs to perform all the effects. We specify this mapping using an abstract machine of its own.

Definition 5.3 (Observing Answers)
Evaluation contexts
\[ E ::= [\,] \mid \mathit{Plus}\ E\ A \mid \mathit{Plus}\ n\ E \]
State transitions
[...] where $n = n_1 + n_2$

Example 5.4
We have $\mathit{eval}_s((\lambda x.\lambda y.\mathit{Plus}\ (\mathit{Plus}\ x\ y)\ (\mathit{Plus}\ x\ x))\ \mathit{incM}\ \mathit{readM}) = 4$. For clarity we use $\mapsto_f$ for the reduction steps of the main (functional) evaluator and $\mapsto_o$ for the reduction steps of the observer:
Functional evaluation
\[
\begin{array}{ll}
 & (\lambda x.\lambda y.\mathit{Plus}\ (\mathit{Plus}\ x\ y)\ (\mathit{Plus}\ x\ x))\ \mathit{incM}\ \mathit{readM} \\
\mapsto_f & (\lambda y.\mathit{Plus}\ (\mathit{Plus}\ \mathit{incM}\ y)\ (\mathit{Plus}\ \mathit{incM}\ \mathit{incM}))\ \mathit{readM} \\
\mapsto_f & \mathit{Plus}\ (\mathit{Plus}\ \mathit{incM}\ \mathit{readM})\ (\mathit{Plus}\ \mathit{incM}\ \mathit{incM})
\end{array}
\]
Observing the answer
\[
\begin{array}{ll}
 & \langle \mathit{Plus}\ (\mathit{Plus}\ \mathit{incM}\ \mathit{readM})\ (\mathit{Plus}\ \mathit{incM}\ \mathit{incM}), 0 \rangle \\
\mapsto_o & \langle \mathit{Plus}\ (\mathit{Plus}\ 0\ \mathit{readM})\ (\mathit{Plus}\ \mathit{incM}\ \mathit{incM}), 1 \rangle \\
\mapsto_o & \langle \mathit{Plus}\ (\mathit{Plus}\ 0\ 1)\ (\mathit{Plus}\ \mathit{incM}\ \mathit{incM}), 1 \rangle \\
\mapsto_o & \langle \mathit{Plus}\ 1\ (\mathit{Plus}\ \mathit{incM}\ \mathit{incM}), 1 \rangle \\
\mapsto_o & \langle \mathit{Plus}\ 1\ (\mathit{Plus}\ 1\ \mathit{incM}), 2 \rangle \\
\mapsto_o & \langle \mathit{Plus}\ 1\ (\mathit{Plus}\ 1\ 2), 3 \rangle \\
\mapsto_o & \langle \mathit{Plus}\ 1\ 3, 3 \rangle \\
\mapsto_o & \langle 4, 3 \rangle
\end{array}
\]
To justify our claim that the above language is purely functional, we should define call-by-value and call-by-need evaluation functions and show their weak equivalence to $\mathit{eval}_s$. Both variants of the semantics are as expected. Much like the pure call-by-value semantics (Definition 4.4), the call-by-value variant has one additional kind of evaluation context, $V\,E$, and it replaces the state transitions of Definition 5.2 with:
\[ E[(\lambda x.M)\,V] \mapsto E[M[V/x]] \]
The call-by-need variant is similarly defined following Definition 4.2. It is almost evident that both variants of the semantics are weakly equivalent to $\mathit{eval}_s$. Indeed, ignoring the mapping from answers to observables which does not involve any procedure calls (and hence does not depend on our notion of parameter-passing), the language is just an applied $\lambda$-calculus that includes simple constants and datatypes. The call-by-value, call-by-need, and call-by-name evaluation functions are known to be weakly equivalent for this language.
Proposition 5.5
The language $\Lambda_s$ is purely functional.
In summary, the idea of the language $\Lambda_s$ is to treat all expressions that perform effects as values, collect these expressions in some data structure as part of the answer, and perform the effects by a conceptually separate evaluator after all functions have disappeared. This idea originates with the design of Idealized Algol (Reynolds, 1988; Reynolds, 1991; Reynolds, 1981) where the evaluation proceeds as follows: in a first phase, perform all $\beta$-steps producing an imperative program, and in a second phase perform all the imperative operations. The only catch is that the imperative program resulting from an Idealized Algol program may be infinite, so this view is only conceptual (Weeks & Felleisen, 1993). In practice the two evaluators would be implemented as coroutines. It is interesting to note that O'Hearn (1995) shows that the observational equivalence of the full Idealized Algol language conservatively extends the observational equivalence of the functional sublanguage, which might be interpreted as evidence for the purity of Idealized Algol.
The idea is also reminiscent of the stream I/O model in Haskell (Hudak et al., 1992) where, for example, instead of having side-effecting expressions like writeFile, we have a datatype of Request that includes a data constructor (i.e., a value) WriteFile. These values that refer to I/O operations are accumulated in a stream and performed at the top level. Again the number of I/O operations in the stream is unbounded, so the phase separation is only conceptual.
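The two-phase reading of $\Lambda_s$ can be sketched in Python. The sketch is illustrative and rests on assumptions: answers are encoded as nested tuples, the names `plus`, `INC_M`, `READ_M`, and `observe` are invented here, and the observer is written as a left-to-right recursive traversal, which visits the effects in the same order as the observation steps of Example 5.4.

```python
# Effectful expressions are plain values; the functional phase only builds a tree.
INC_M = ("incM",)
READ_M = ("readM",)


def plus(left, right):
    """The value constructor Plus: it records a demand for an addition."""
    return ("Plus", left, right)


def program_by_value(x, y):
    """Body of \\x.\\y. Plus (Plus x y) (Plus x x), with arguments passed as values."""
    return plus(plus(x, y), plus(x, x))


def program_by_name(x, y):
    """The same body with arguments passed as thunks that are re-run on each use."""
    return plus(plus(x(), y()), plus(x(), x()))


def observe(answer):
    """Second phase: perform the recorded effects from location 0, left to right.

    Returns the observable number together with the final location contents.
    """
    loc = 0

    def go(node):
        nonlocal loc
        if isinstance(node, int):
            return node
        if node[0] == "incM":
            old = loc
            loc += 1
            return old
        if node[0] == "readM":
            return loc
        left = go(node[1])     # left operand is observed first
        right = go(node[2])    # then the right operand
        return left + right

    return go(answer), loc
```

Because `INC_M` and `READ_M` are values, passing them directly or wrapped in thunks yields the same tree, and only `observe` touches the location.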
depending on our interpretation. Note that the evaluation of each of the latter two terms is insensitive to the parameter-passing mechanism. To formulate the evaluation function, we need an additional construct (run M) that marks the top level of a program. The reason for this additional construct is that monadic operations on the state are only performed at top level. Like in the previous section, imperative operations embedded deep inside the program are not performed there but are propagated to the top level using the monadic combinator >>=, and only performed during a conceptually second phase of evaluation. This intuition is made precise in the definitions of evaluation contexts and standard reductions below.
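Definition 5.6 (Syntax of $\Lambda_m$)
The set of terms is defined as:
\[ T \in \mathit{TopTerm} ::= M \mid \mathit{run}\ M \]
\[ M, N, L \in \mathit{Term} ::= x \mid \lambda x.M \mid M\,N \mid n \mid M + N \mid \mathit{return}\ M \mid M \gg= N \mid \mathit{readM} \mid \mathit{incM} \]
\[ V \in \mathit{Value} ::= n \mid \lambda x.M \mid \mathit{return}\ M \mid M \gg= N \mid \mathit{readM} \mid \mathit{incM} \]
As explained above, if the construct run occurs in a term, it must occur at the top level. The (implicit) monadic state is just the value of the global location manipulated by readM and incM.

Definition 5.7 ($\mathit{eval}_m$)
The partial function $\mathit{eval}_m$ from terms to observables is defined as: $\mathit{eval}_m(T) = B$ if $T \mapsto^* A$ and $\mathit{obs}(A) = B$, where:
Answers
\[
\begin{array}{rcl}
A & ::= & V \mid \mathit{run}\ W \\
V & ::= & n \mid \lambda x.M \mid \mathit{return}\ M \mid M \gg= N \mid \mathit{readM} \mid \mathit{incM} \\
W & ::= & \mathit{readM} \mid \mathit{incM} \mid \mathit{return}\ I \mid W \gg= \lambda x.W \\
I & ::= & n \mid x \mid I + I
\end{array}
\]
Evaluation contexts
[the grammar of the contexts $G$ is not recoverable here; the surviving fragment reads $\ldots \gg= \ldots\ E \mid W \gg= \lambda x.F$]
State transitions
\[ G[(\lambda x.M)\,N] \mapsto G[M[N/x]] \]
\[ G[n_1 + n_2] \mapsto G[n] \quad \text{where } n = n_1 + n_2 \]
As in the previous case, the mapping of answers to observables is complicated since it performs all the effects. We specify this mapping using an abstract machine of its own.
Evaluation contexts
\[ E ::= [\,] \mid E + M \mid n + E \]
State transitions
\[
\begin{array}{rcl}
\langle \mathit{run}\ \mathit{readM}, \ell \rangle & \mapsto & \langle \ell, \ell \rangle \\
\langle \mathit{run}\ \mathit{incM}, \ell \rangle & \mapsto & \langle \ell, \ell + 1 \rangle \\
\langle \mathit{run}\ (\mathit{return}\ n), \ell \rangle & \mapsto & \langle n, \ell \rangle \\
\langle \mathit{run}\ (\mathit{return}\ E[n_1 + n_2]), \ell \rangle & \mapsto & \langle \mathit{run}\ (\mathit{return}\ E[n]), \ell \rangle \quad \text{where } n = n_1 + n_2 \\
\langle \mathit{run}\ (\mathit{readM} \gg= \lambda x.W), \ell \rangle & \mapsto & \langle \mathit{run}\ (W[x := \ell]), \ell \rangle \\
\langle \mathit{run}\ (\mathit{incM} \gg= \lambda x.W), \ell \rangle & \mapsto & \langle \mathit{run}\ (W[x := \ell]), \ell + 1 \rangle \\
\langle \mathit{run}\ ((\mathit{return}\ n) \gg= \lambda x.W), \ell \rangle & \mapsto & \langle \mathit{run}\ (W[x := n]), \ell \rangle \\
\langle \mathit{run}\ ((\mathit{return}\ E[n_1 + n_2]) \gg= \lambda x.W), \ell \rangle & \mapsto & \langle \mathit{run}\ ((\mathit{return}\ E[n]) \gg= \lambda x.W), \ell \rangle \quad \text{where } n = n_1 + n_2 \\
\langle \mathit{run}\ ((W_1 \gg= \lambda x.W_2) \gg= \lambda y.W_3), \ell \rangle & \mapsto & \langle \mathit{run}\ (W_1 \gg= \lambda x.(W_2 \gg= \lambda y.W_3)), \ell \rangle
\end{array}
\]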
Example 5.9
For example, the term (run (incM >>= $\lambda v_1$.incM >>= $\lambda v_2$.return ($v_1 + v_2$))) evaluates to 1. The functional evaluation terminates immediately and all the computation happens during the observation part:
\[
\begin{array}{rl}
 & \langle \mathtt{run}\ (\mathtt{incM} \gg= \lambda v_1.\mathtt{incM} \gg= \lambda v_2.\mathtt{return}\ (v_1 + v_2)),\ 0 \rangle \\
\longmapsto & \langle \mathtt{run}\ (\mathtt{incM} \gg= \lambda v_2.\mathtt{return}\ (0 + v_2)),\ 1 \rangle \\
\longmapsto & \langle \mathtt{run}\ (\mathtt{return}\ (0 + 1)),\ 2 \rangle \\
\longmapsto & \langle \mathtt{run}\ (\mathtt{return}\ 1),\ 2 \rangle \\
\longmapsto & \langle 1,\ 2 \rangle
\end{array}
\]
Why is the above language purely functional? The argument is similar to the one in the previous section. The evaluation is clearly divided into two separate phases. In the mapping from answers to observables, all substitutions involve values (and hence are valid in call-by-value, call-by-need, and call-by-name semantics). Abstracting from the way answers are observed, the language is just an applied $\lambda$-calculus in which answers are trees. Changing the evaluation contexts and standard reductions in Definition 5.7 to either call-by-value or call-by-need will either produce the same tree as the call-by-name semantics or diverge. The analogy to the previous case is not surprising since Peyton Jones and Wadler (1993) demonstrate that there is a close relationship among the stream-based, monad-based, and continuation-based integration of computational effects in Haskell.
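The state transitions and Example 5.9 above can be sketched as a small interpreter. This is only an illustration under stated assumptions, not the paper's formal definition: the class names `ReadM`, `IncM`, `Return`, `Bind` and the functions `step` and `observe` are hypothetical, the binder λx.W is represented by a Python function (so the substitution W[x := n] becomes function application), and arithmetic inside `return` is carried out by Python as soon as the variables are bound instead of by separate E[n1 + n2] steps.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ReadM:
    """readM: yields the current contents of the global location."""


@dataclass(frozen=True)
class IncM:
    """incM: yields the current contents, then increments the location."""


@dataclass(frozen=True)
class Return:
    """return I, with the arithmetic term I already reduced to a number."""
    value: int


@dataclass(frozen=True)
class Bind:
    """W >>= (lambda x. W'); the Python function `rest` stands for the binder."""
    first: object
    rest: Callable[[int], object]


def step(w, loc):
    """One transition from <run w, loc>; returns (tag, term_or_number, loc')."""
    if isinstance(w, ReadM):
        return ("done", loc, loc)
    if isinstance(w, IncM):
        return ("done", loc, loc + 1)
    if isinstance(w, Return):
        return ("done", w.value, loc)
    if isinstance(w, Bind):
        m, k = w.first, w.rest
        if isinstance(m, ReadM):
            return ("run", k(loc), loc)
        if isinstance(m, IncM):
            return ("run", k(loc), loc + 1)
        if isinstance(m, Return):
            return ("run", k(m.value), loc)
        if isinstance(m, Bind):
            # (W1 >>= \x.W2) >>= \y.W3  becomes  W1 >>= \x.(W2 >>= \y.W3)
            return ("run", Bind(m.first, lambda x: Bind(m.rest(x), k)), loc)
    raise TypeError(f"not an answer of the form run W: {w!r}")


def observe(w, loc=0):
    """Iterate the transitions from <run w, loc> until a final <n, loc'>."""
    tag = "run"
    while tag == "run":
        tag, w, loc = step(w, loc)
    return w, loc


# The term of Example 5.9: incM >>= \v1. incM >>= \v2. return (v1 + v2)
example_5_9 = Bind(IncM(), lambda v1: Bind(IncM(), lambda v2: Return(v1 + v2)))
```

Starting from location 0, `observe(example_5_9)` ends in the pair of the observable 1 and the final location 2, matching the trace in Example 5.9 up to the arithmetic step that Python performs implicitly.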
Proposition 5.10
The language $\lambda_m$ is purely functional.
The language $\lambda_m$ is a miniature version of the State in Haskell language (Launchbury & Peyton Jones, 1995), which Launchbury and Peyton Jones informally argue is pure:
A formal proof would necessarily involve some operational semantics, and a proof that no evaluation order could change the behaviour of the program. We have not yet undertaken such a proof (Launchbury & Peyton Jones, 1995, p.322).
We have already developed a call-by-name operational semantics for the full State in Haskell language (Launchbury & Sabry, 1997). To prove that the language is pure according to our definition, it remains to develop call-by-value and call-by-need variants of the semantics and show their weak equivalence.
5.3 Implementation
The language $\lambda_m$ can be implemented with the same tradeoffs as the language of State in Haskell (Launchbury & Peyton Jones, 1995). We describe two possible implementations: a functional one and an imperative one. The first phase of both implementations translates the source programs by expressing return, >>=, and run in store-passing style. For convenience, the target language of this translation includes, like Haskell, pairs, let-expressions, and pattern-matching with the usual semantics.
Definition 5.11
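The translation of $\lambda_m$, writing $M^{*}$ for the translation of $M$, is defined as follows:
\[
\begin{array}{rcl@{\qquad}rcl}
x^{*} & = & x & (\mathtt{return}\ M)^{*} & = & \lambda \ell.\langle M^{*},\ \ell \rangle \\
(\lambda x.M)^{*} & = & \lambda x.M^{*} & (M \gg= N)^{*} & = & \lambda \ell.\,\mathtt{let}\ \langle x,\ \ell' \rangle = M^{*}\,\ell\ \mathtt{in}\ N^{*}\,x\,\ell' \\
(M\,N)^{*} & = & M^{*}\,N^{*} & (\mathtt{run}\ M)^{*} & = & \mathtt{fst}\ (M^{*}\ 0) \\
n^{*} & = & n & \mathtt{readM}^{*} & = & \lambda \ell.\,\mathtt{readM}\ \ell \\
(M + N)^{*} & = & M^{*} + N^{*} & \mathtt{incM}^{*} & = & \lambda \ell.\,\mathtt{incM}\ \ell
\end{array}
\]
The second phases of the two implementations differ as follows: the functional implementation treats the operations incM and readM as state transformers, i.e., functions that take an input store as one of their arguments and return an output store as part of their result:
\[
\mathtt{readM} = \lambda \ell.\langle \ell,\ \ell \rangle \qquad \mathtt{incM} = \lambda \ell.\langle \ell,\ \ell + 1 \rangle
\]
This implementation is close to the semantics of the language but would be rather inefficient in practice as it implements updates by copying (parts of) the store data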
structure. The intermediate language of the functional implementation is clearly pure but not interesting as the basis for a compiler. The imperative implementation generates code for incM and readM that ignores the store argument and performs destructive updates that operate on a global location. This is clearly more efficient but is not evidently correct. Indeed, we show that if the semantics of the intermediate language is call-by-name then the implementation strategy based on destructive updates is incorrect. Consider the following term:
\[
\mathtt{run}\ (\mathtt{incM} \gg= \lambda x.\mathtt{return}\ (x + x))
\]
whose value according to the semantics is 0. The translation of the term into the intermediate language produces:
\[
\mathtt{fst}\ ((\lambda \ell.(\mathtt{let}\ \langle a,\ \ell \rangle = \mathtt{incM}\ \ell\ \mathtt{in}\ (\lambda x.\lambda \ell.\langle (x + x),\ \ell \rangle)\ a\ \ell))\ 0)
\]
which, if the intermediate language has call-by-name semantics, could be simplified as follows:
\[
\begin{array}{cl}
= & \mathtt{fst}\ (\mathtt{let}\ \langle a,\ \ell \rangle = \mathtt{incM}\ 0\ \mathtt{in}\ \langle (a + a),\ \ell \rangle) \\
= & \mathtt{fst}\ (\mathtt{let}\ p = \mathtt{incM}\ 0\ \mathtt{in}\ \langle (\mathtt{fst}\ p + \mathtt{fst}\ p),\ \mathtt{snd}\ p \rangle) \\
= & \mathtt{fst}\ \langle (\mathtt{fst}\ (\mathtt{incM}\ 0) + \mathtt{fst}\ (\mathtt{incM}\ 0)),\ \mathtt{snd}\ (\mathtt{incM}\ 0) \rangle \\
= & (\mathtt{fst}\ (\mathtt{incM}\ 0) + \mathtt{fst}\ (\mathtt{incM}\ 0))
\end{array}
\]
If incM is implemented as an expression that ignores its state argument and instead performs a global side effect, then the above term evaluates to 1 instead of 0, because the two copies of incM 0 are evaluated separately and return 0 and then 1.
Launchbury and Peyton Jones (1995) informally argue that the above evaluation strategy as realized in the Glasgow Haskell compiler (ghc) is correct. Clearly, as we demonstrate above, ghc cannot use arbitrary $\beta$-reductions on the intermediate representation of the program. Fortunately, even before the monadic extensions, ghc was careful not to duplicate work and hence refrained from using $\beta$ steps for performance reasons (Ariola et al., 1995). Consequently, the addition of assignments to the back end did not cause any immediate problems. The correctness of the destructive implementation of monadic state is however still an open problem.
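The difference between the two implementations can be made concrete with a short sketch. Everything named here is an assumption made for illustration (`ret`, `bind`, `run`, `inc_m`, `Cell`, `make_destructive_inc`, `shared`, `duplicated`); none of it is code from ghc or from the paper. The functions `shared` and `duplicated` mimic, respectively, the let-bound form of the term above and the form obtained after the call-by-name simplification steps.

```python
def ret(m):
    """(return M)* : pair the value with the unchanged store."""
    return lambda loc: (m, loc)


def bind(m, n):
    """(M >>= N)* : thread the store from M into N."""
    def transformer(loc):
        x, loc1 = m(loc)
        return n(x)(loc1)
    return transformer


def run(m):
    """(run M)* : start from store 0 and keep the first component."""
    return m(0)[0]


def inc_m(loc):
    """Functional incM: a state transformer that copies the store."""
    return (loc, loc + 1)


class Cell:
    """A single mutable global location, initially 0."""
    def __init__(self):
        self.value = 0


def make_destructive_inc(cell):
    """Imperative incM: ignores its store argument and updates `cell` in place."""
    def destructive_inc(_loc):
        old = cell.value
        cell.value = old + 1
        return (old, _loc)  # the store component is only a dummy token
    return destructive_inc


def shared(inc):
    """let p = incM 0 in fst p + fst p, with p evaluated once."""
    p = inc(0)
    return p[0] + p[0]


def duplicated(inc):
    """fst (incM 0) + fst (incM 0), the result of the call-by-name steps."""
    return inc(0)[0] + inc(0)[0]


# The term run (incM >>= \x. return (x + x)) under the functional implementation.
functional_result = run(bind(inc_m, lambda x: ret(x + x)))
```

With the functional `inc_m`, both `shared` and `duplicated` agree with `functional_result`. With a destructive increment, `shared` still yields 0 but `duplicated` yields 1, which is the discrepancy described above.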
6 Conclusion
The paper proposes a framework for reasoning about purely functional languages and their extensions with computational effects. We have put forward the thesis that purity can be determined by the (weak) equivalence of call-by-name, call-by-value, and call-by-need. This definition of purity naturally motivates and explains the various strategies used to integrate computational effects with purely functional languages. Building on the thesis, we propose a way to formally reason about the correctness of the destructive implementation of monadic operations. We also reveal the unsoundness of call-by-name reasoning principles in compilers for monadic Haskell and hence the importance of call-by-need theories that are rich enough to express imperative operations.
Acknowledgments
I would like to thank Lennart Augustsson, Matthias Felleisen, Robert Harper, Peter O'Hearn, and Uday Reddy for criticism and comments on an early (and rough) draft. I have also benefited from interesting discussions with Zena Ariola, Magnus Carlsson, Thomas Hallgren, John Hughes, John Launchbury, Johan Nordlander, Lars Pareto, Simon Peyton Jones, Miley Semmelroth, Thomas Streicher, and Walid Taha. The referees as well as the editor Philip Wadler offered valuable advice that sharpened both the ideas and the presentation.
References
Ariola, Zena M., & Felleisen, Matthias. (1996). The call-by-need lambda calculus. To appear in the Journal of Functional Programming.
Ariola, Zena M., Felleisen, Matthias, Maraist, John, Odersky, Martin, & Wadler, Philip. (1995). A call-by-need lambda calculus. Pages 233–246 of: ACM Symposium on Principles of Programming Languages.
Barendregt, H. P. (1984). The lambda calculus: Its syntax and semantics. Revised edn. Studies in Logic and the Foundations of Mathematics, vol. 103. North-Holland.
Cartwright, R., & Felleisen, Matthias. 1991 (August). Observable sequentiality and full abstraction. Tech. rept. 91-167. Rice University. Preliminary version in: Proc. 19th ACM Symposium on Principles of Programming Languages (1992), pp. 328–342.
Cartwright, R., Curien, P.-L., & Felleisen, Matthias. 1993 (December). Fully abstract semantics for observably sequential languages. Tech. rept. 93-219. Rice University. Also appears in Information and Computation 111 (2), 1994, 297–401.
Felleisen, Matthias. (1991). On the expressive power of programming languages. Pages 35–75 of: Science of Computer Programming, vol. 17. Preliminary version in: Proc. European Symposium on Programming, Lecture Notes in Computer Science, 432. Springer-Verlag (1990), 134–151.
Felleisen, Matthias, & Hieb, R. (1992). The revised report on the syntactic theories of sequential control and state. Theoretical Computer Science, 102, 235–271. Technical Report 89-100, Rice University.
Filinski, Andrzej. (1994). Representing monads. Pages 446–457 of: ACM Symposium on Principles of Programming Languages.
Filinski, Andrzej. (1996). Controlling effects. Ph.D. thesis, Carnegie Mellon University. Available as Technical Report CS-96-119.
Hudak, Paul, Peyton Jones, Simon L., & Wadler, Philip. (1992). Report on the programming language Haskell, a non-strict purely functional language (version 1.2). SIGPLAN Notices, 27(5).
Klop, J. W., van Oostrom, V., & van Raamsdonk, F. 1993 (June). Combinatory reduction systems: Introduction and survey. Tech. rept. IR-327. Vrije Universiteit Amsterdam.
Launchbury, John, & Peyton Jones, Simon L. (1995). State in Haskell. Lisp and Symbolic Computation, 8, 193–341.
Launchbury, John, & Sabry, Amr. (1997). Monadic state: Axiomatization and type safety. ACM SIGPLAN International Conference on Functional Programming.
Odersky, Martin, Rabin, Dan, & Hudak, Paul. 1993 (Jan.). Call by name, assignment, and the lambda calculus. Pages 43–56 of: ACM Symposium on Principles of Programming Languages.
O'Hearn, Peter W. (1995). Note on Algol and conservatively extending functional programming. Journal of Functional Programming. To appear.
Peterson, J., et al. (1996). Report on the programming language Haskell (version 1.3). Tech. rept. YALEU/DCS/RR-1106. Yale University.
Peyton Jones, Simon L., & Wadler, Philip. (1993). Imperative functional programming. Pages 71–84 of: ACM Symposium on Principles of Programming Languages.
Plotkin, Gordon D. (1975). Call-by-name, call-by-value, and the λ-calculus. Theoretical Computer Science, 1, 125–159.
Plotkin, Gordon D. (1977). LCF considered as a programming language. Theoretical Computer Science, 5, 223–255.
Reynolds, John C. (1972). Definitional interpreters for higher-order programming languages. Pages 717–740 of: Proceedings of the ACM annual conference.
Reynolds, John C. (1981). The essence of Algol. Pages 345–372 of: de Bakker, & van Vliet (eds), Algorithmic languages. Amsterdam: North-Holland.
Reynolds, John C. (1988). Preliminary design of the programming language Forsythe. Tech. rept. CMU-CS-88-159. Carnegie Mellon University.
Reynolds, John C. (1991). Replacing complexity with generality: The programming language Forsythe. Unpublished manuscript, Carnegie Mellon University.
Sabry, Amr, & Field, John. (1993). Reasoning about explicit and implicit representations of state. Tech. rept. YALEU/DCS/RR-968. Yale University. ACM SIGPLAN Workshop on State in Programming Languages, pages 17–30.
Søndergaard, H., & Sestoft, P. (1990). Referential transparency, definiteness and unfoldability. Acta Informatica, 27(6), 505–517.
Swarup, V., Reddy, Uday, & Ireland, E. (1991). Assignments for applicative languages. Pages 192–214 of: Conference on Functional Programming and Computer Architecture.
Wadler, Philip. (1990). Comprehending monads. Pages 61–78 of: ACM Conference on Lisp and Functional Programming.
Weeks, Stephen, & Felleisen, Matthias. (1993). On the orthogonality of assignments and procedures in Algol. Pages 57–70 of: ACM Symposium on Principles of Programming Languages.