Programming Language Implementation
Martin Henz
Chapter 12. oPL: A Simple Object-oriented Language
[Figure: flow of control over time between procedures P1, P2, P3]
called message sending. The latter case can be achieved either by a special case
of object application, called this application, or by method application; we
shall discuss these two possibilities in Section 12.2.3. In all cases, the operation
leads to execution of a corresponding method. The resulting refinement of the
functional control flow is depicted in Figure 12.2. It is convenient to group
the methods that operate on a certain kind of object into classes. Classes are
modules containing methods that all operate on the same kind of objects, which
are called the instances of the class.
Each instance of a class carries its own identity distinguishing it from all
other objects. We call this approach to equality token equality, as opposed to
structural equality, which defines two objects to be equal if they have the same
structure and all their components are equal.
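The difference between the two notions of equality can be sketched in Python, where the operator `is` compares identities and `==` can be defined to compare structure. The class `Point` and its fields are hypothetical names chosen for illustration; they do not appear in oPL.

```python
class Point:
    """Hypothetical class used only to contrast the two notions of equality."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Structural equality: same structure and equal components.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)   # a second object with the same components
alias = p         # a second name for the same object

token_equal = p is q          # identities differ
structurally_equal = p == q   # components agree
```

Here `p` and `q` are structurally equal but not token equal, whereas `alias` and `p` are token equal because they denote the same object.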
It is useful to group the values that can be referred to by identifiers into types.
With respect to a given type structure, an identifier occurrence is polymorphic,
if it can refer to values of different type at different points in time. Cardelli
and Wegner [CW85] distinguish between two ways an operation can be applied
to a polymorphic identifier occurrence. Either the operation can be performed
uniformly on all values of the different types (e.g. we can compute the length
of a list regardless of the type of its elements), or different operations will be
performed for different types (an addition x + y works differently if x and y
are integers or floats). Operations of the former kind are called universally
polymorphic and the latter ad-hoc polymorphic. Both kinds of polymorphism
play a central role in object-oriented programming.
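As a rough illustration of the two kinds of polymorphism, consider the following Python sketch. The function names `length` and `add` are assumptions made for this example, and `functools.singledispatch` selects an implementation from the type of the first argument only, which is a simplification of what a typed language would do for `x + y`.

```python
from functools import singledispatch


def length(xs):
    """Universal polymorphism: one piece of code for every element type."""
    count = 0
    for _ in xs:
        count += 1
    return count


@singledispatch
def add(x, y):
    """Ad-hoc polymorphism: a different operation is chosen per type."""
    raise TypeError(f"add is not defined for {type(x).__name__}")


@add.register(int)
def _(x, y):
    return f"integer addition: {x + y}"


@add.register(float)
def _(x, y):
    return f"floating-point addition: {x + y}"


lengths = (length([1, 2, 3]), length(["a", "b"]))
int_result = add(1, 2)
float_result = add(1.5, 2.0)
```

The single definition of `length` serves lists of numbers and lists of strings alike, while `add` runs different code for integers and floats.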
Object-oriented languages handle both kinds of polymorphism by using late
binding for object application. Late binding introduces an indirection between
12.2. SOFTWARE DEVELOPMENT VIEW 7
[Figure: flow of control over time between the methods M1, M2, M3 of objects O1 and O2]
[Figure: late binding; labels: class lookup, class, method lookup, code address, code execution]
[Figure: early binding; labels: executing code, procedure application, procedure execution]
the higher complexity of such hierarchies and later on the one hand, [inheritance]
enables developers to make extensive use of existing components when
coping with new requirements; conversely, clients can be exposed to a source of
instability that discourages them from depending on a hierarchy of classes.
Early Binding. Late binding enforces the use of the method given by the
class of the object being applied. Often, this is too restrictive. Consider the
frequent case in which an overriding method needs to call the overridden method.
Late binding would call the overriding method again. Instead, a mechanism is
needed to call the overridden method directly. The classical idiom for this
situation is the super call, which calls the method of the direct predecessor
of the class that defines the method in which the call appears. A super call in
a given method always calls the same method and thus implements early binding.
The term early binding refers to the fact that the class whose method matches
the method identifier is known at the time of definition of the method in which
the call occurs. A generalization of the super call is a construct that
directly calls the method of a given class; we call this method application.
Early binding can be used to ensure the execution of a particular method.
The programmer can limit the flexibility of designers of derived classes, and
thus rely on stronger invariants. In practice, early binding is often used for
efficiency reasons. Some object-oriented languages such as SIMULA [DN66]
and C++ [Str87] treat early binding as the default and require special user
annotations such as virtual for methods that may be overridden by descendant
classes.
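The interplay of late binding, super calls and method application can be sketched in Python. The classes `Stack` and `CountingStack` and their members are hypothetical. With single inheritance, as assumed here, Python's `super()` refers to the direct predecessor class; calling `Stack.push(s, x)` explicitly plays the role of method application.

```python
class Stack:
    def __init__(self):
        self.content = []

    def push(self, x):
        self.content.insert(0, x)

    def push_twice(self, x):
        # Late binding: which push runs depends on the class of self.
        self.push(x)
        self.push(x)


class CountingStack(Stack):
    def __init__(self):
        super().__init__()
        self.pushes = 0

    def push(self, x):
        # Overriding method that needs the overridden one.
        self.pushes += 1
        super().push(x)   # super call: always Stack.push, hence early bound


s = CountingStack()
s.push_twice(7)               # late binding reaches CountingStack.push twice
pushes_after_late = s.pushes

Stack.push(s, 9)              # method application: bypasses the override
pushes_after_early = s.pushes
```

After the two late-bound pushes the counter is 2. The explicit call to `Stack.push` adds an element without touching the counter, which shows how early binding pins down the method that is executed.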
[Figure: control flow in object-oriented languages; labels: executing code, object application (late binding), method application and method execution (early binding)]
Control Flow. We saw that object application changes the current this, that
this application does not change this, and that both use late binding, whereas
method application uses early binding. We contrast this somewhat sophisticated
control flow with the control flow in functional languages, which is depicted in
Figure 12.4. In these languages, control flows from one function to the next
through function application, with the possibility to pass parameters along.
The control flow in object-oriented languages is depicted in Figure 12.5.
Method application corresponds to function application. General object
application sets this and uses late binding, whereas this application does not
change this but also uses late binding.
12.2.4 Encapsulation
Software consists of different parts that interact with each other. Encapsulation
allows us to confine this interaction to a specified interface. No interaction
between the parts is possible unless this interface is used. Encapsulation is
crucial to software development for several reasons:
        top
      end
    end,
  Makeempty:
    fun this ->
      this.Content := [];
      this
    end
  ]
in ...
end
A new instance of the stack can now be created by the following expression.
Note that record property assignment adds a new property to the empty record.
We can push and pop numbers as in the following expression.
This expression first pushes the numbers 1 and 2 onto the stack, then pops and
adds them.
q        E1.E2        E1 hasproperty E2
Recall that in imPL, record property assignment was only allowed to change
properties that are already present in the given record. For example,
let r = [A: 1, B: 2]
in r.C := 3
...
end
let r = [A: 1, B: 2, C: 3]
in ...
end
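The two behaviours of record property assignment can be contrasted in a small Python sketch that models records as dictionaries. The function names `assign_impl` and `assign_opl` are hypothetical, and the use of `KeyError` to signal the imPL failure is an assumption; the text does not say how imPL reports the error.

```python
def assign_impl(record, prop, value):
    """imPL-style assignment: may only change a property that already exists."""
    if prop not in record:
        # Assumption: the failure is modeled as a KeyError.
        raise KeyError(f"record has no property {prop}")
    record[prop] = value


def assign_opl(record, prop, value):
    """oPL-style assignment: adds the property if it is not yet present."""
    record[prop] = value


r = {"A": 1, "B": 2}
assign_opl(r, "C", 3)          # r now also has property C

try:
    assign_impl({"A": 1, "B": 2}, "C", 3)
    impl_rejected = False
except KeyError:
    impl_rejected = True
```

Under the oPL-style rule, `r` ends up with the three properties A, B and C, while the imPL-style rule rejects the same assignment.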
We call properties in oPL first-class because they can appear anywhere
expressions are allowed, for example as arguments and return values of
functions. The following function makes use of first-class properties and the
modified record property assignment.
let new = fun theClass -> [Class: theClass] end in ... end
Instead of creating a stack by calling Makeempty, we insist on using new for all
object creations as in the following program.
After the object is created, the initialization method Makeempty can add the
object fields, in this case the field Content.
(stack.Makeempty mystack)
Since every object knows its class, we can now implement late binding through
the following function lookup.
Note that here we make use of first-class properties. Instead of early binding,
we can now use late binding for operations on objects as follows.
This application is done in this object system by passing this as the first
argument to a method. For example, the following method Pushtwice performs two
this applications.
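The object system of this section can be approximated in Python by modeling records as dictionaries. This is only a sketch: the Python names `new`, `lookup`, `push_twice` and so on are chosen to mirror the text, but the bodies are assumptions and not the oPL definitions themselves. In particular, `lookup` here simply reads the method from the class stored in the object.

```python
def new(the_class):
    # Mirrors: fun theClass -> [Class: theClass] end
    return {"Class": the_class}


def lookup(obj, prop):
    # Late binding: the method is found through the object's own class.
    # prop is an ordinary value, i.e. a first-class property.
    return obj["Class"][prop]


def make_empty(this):
    this["Content"] = []   # property assignment adds the field Content
    return this


def push(this, x):
    this["Content"] = [x] + this["Content"]
    return this


def pop(this):
    top = this["Content"][0]
    this["Content"] = this["Content"][1:]
    return top


def push_twice(this, x):
    # Two this applications: late-bound lookup on this, passing this along.
    lookup(this, "Push")(this, x)
    lookup(this, "Push")(this, x)
    return this


stack = {"Makeempty": make_empty, "Push": push, "Pop": pop,
         "Pushtwice": push_twice}

my_stack = new(stack)
lookup(my_stack, "Makeempty")(my_stack)
lookup(my_stack, "Pushtwice")(my_stack, 3)
total = lookup(my_stack, "Pop")(my_stack) + lookup(my_stack, "Pop")(my_stack)
```

Every operation goes through `lookup`, so the method that runs is determined by the class stored in the object at the time of the call.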
12.6 Inheritance
The final addition to our object system is the concept of inheritance. We achieve
inheritance by adding an extra property Parent to classes that indicates their
parent class.
...
let stackWithTop = [Parent: stack,
Top: fun this -> this.Content.First end]
in ...
end
The class stackWithTop extends the stack class by adding the method Top to
the methods defined by stack.
Note that this design allows for non-conservative extension. The new class can
override inherited methods and thereby non-conservatively change the semantics
of its instances compared to parent instances.
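A lookup that respects inheritance has to follow the Parent properties until some class defines the requested method. The following Python sketch shows one way to do this with dictionaries as records. The class `doubling_stack` and the method `push_doubled` are hypothetical and serve only to show an overriding method that changes behaviour non-conservatively; the membership test with `in` stands in for a property test such as hasproperty.

```python
def new(the_class):
    return {"Class": the_class}


def lookup(obj, prop):
    """Follow the ancestor line of the object's class until prop is found."""
    cls = obj["Class"]
    while prop not in cls:
        if "Parent" not in cls:
            raise KeyError(f"no method {prop} in the ancestor line")
        cls = cls["Parent"]
    return cls[prop]


def make_empty(this):
    this["Content"] = []
    return this


def push(this, x):
    this["Content"] = [x] + this["Content"]
    return this


def top(this):
    return this["Content"][0]


def push_doubled(this, x):
    # Overrides Push and changes its meaning (non-conservative extension).
    # The overridden method is reached by method application on stack.
    return stack["Push"](this, 2 * x)


stack = {"Makeempty": make_empty, "Push": push}
stack_with_top = {"Parent": stack, "Top": top}
doubling_stack = {"Parent": stack_with_top, "Push": push_doubled}

s = new(doubling_stack)
lookup(s, "Makeempty")(s)      # found two levels up, in stack
lookup(s, "Push")(s, 5)        # found in doubling_stack itself
result = lookup(s, "Top")(s)   # found one level up, in stack_with_top
```

Instances of `doubling_stack` respond to Push differently from instances of `stack`, which is exactly the kind of change in semantics that overriding permits.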
class M1 ... Mn end
stands for
[ M1, ..., Mn ]
and
class extends x M1 ... Mn end
stands for
[Parent: x, M1, ..., Mn ]
12.8. IMPLEMENTATION OF OPL 17
E.q(E1 ... En)
stands for
With this syntax in place, we can now write our stack class as follows.
let stack =
  class
    method Push(x) -> this.Content := x :: this.Content
    end
    method Pop() -> let top = this.Content.First in
                      this.Content := this.Content.Second;
                      top
                    end
    end
    method Makeempty()
      -> this.Content := []; this
    end
    method Pushtwice(x)
      -> this.Push(x);
         this.Push(x)
    end
  end
in
  let stackWithTop =
    class extends stack
      method Top() -> this.Content.First
      end
    end
  in
    let myStackWithTop = (new stackWithTop)
    in myStackWithTop.Makeempty();
       myStackWithTop.Push(1);
       myStackWithTop.Push(2);
       myStackWithTop.Push(myStackWithTop.Pop() + myStackWithTop.Pop());
       myStackWithTop.Top()
    end
  end
end
in the previous section. The resulting imPL program is then implemented using
an interpreter or compiler as described in previous chapters.
However, the efficient implementation of records described in the previous
chapter is not possible for oPL, since properties are first-class values, and new
properties can be added to records at runtime. Thus a virtual machine-based
implementation of oPL needs to revert to representing records using hashtables,
mapping property names to the stored values. Existing languages such as Java
avoid hashing for object fields through a type system that takes account of the
class hierarchy.
In addition to representing objects themselves, the efficient implementation
of object-oriented languages faces another challenge, namely the efficient
implementation of late binding. It lies in the nature of late binding that the
target function can only be determined at runtime. However, the process of
determining the target function can be sped up through specific optimizations.
Consider the function lookup given in Section 12.6. The function lookup
needs to follow the ancestor line of the class of the given object until a class
is found that contains a method under the given property. Since late binding
is encouraged in object-oriented languages, this process of method lookup can
become a bottleneck in the implementation of object-oriented languages.
Some virtual machine-based implementations of object-oriented languages
provide specific support for an efficient implementation of lookup. As an
example, we shall study a technique called inline caching [HCU91]. In this
optimization, the compiler translates (lookup obj q) to a machine instruction
sequence
LD <index of obj>
LDPS q
LOOKUP <class heap address> <cache>
Here, the LOOKUP instruction has two extra parameters, each of which can
accommodate a heap address. Initially, the LOOKUP instruction proceeds as given in
Section 12.6, determining which class in the ancestor line of the class of the
argument object has a method under property q. It then stores the heap address of
the class of the argument obj in the slot <class heap address>, and the
heap address of the lookup result in the slot <cache>. Subsequent invocations
of lookup first compare the heap address of the class of the current argument
obj with the address stored in <class heap address>. If these addresses are
the same, there is no need to conduct the lookup; we can immediately return
the address stored in the <cache> slot. Only if the addresses differ does
the lookup proceed up the ancestor line of the class of the current argument
object.
The rationale behind this optimization is that the actual argument objects
may change frequently between invocations of lookup, whereas the classes of
those argument objects change much less frequently. This rationale has been
confirmed by statistics on real-world programs [HCU91]. Since the technique
uses the machine code as the place to cache the lookup result, it is called inline
caching.
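The behaviour of a single LOOKUP instruction with its two slots can be simulated in Python. This is a sketch under stated assumptions: the class `CallSite`, its `misses` counter and the helper `full_lookup` are hypothetical, records are modeled as dictionaries, and object identity (`is`) stands in for the comparison of heap addresses.

```python
def full_lookup(obj, prop):
    """Uncached lookup along the ancestor line of the object's class."""
    cls = obj["Class"]
    while prop not in cls:
        cls = cls["Parent"]   # raises KeyError if no ancestor defines prop
    return cls[prop]


class CallSite:
    """Simulates one LOOKUP instruction together with its two extra slots."""

    def __init__(self, prop):
        self.prop = prop
        self.cached_class = None    # plays the role of <class heap address>
        self.cached_method = None   # plays the role of <cache>
        self.misses = 0             # counts full lookups, for illustration only

    def lookup(self, obj):
        cls = obj["Class"]
        if cls is self.cached_class:
            return self.cached_method        # hit: no walk up the ancestor line
        self.misses += 1
        method = full_lookup(obj, self.prop)
        self.cached_class = cls              # remember the class just seen
        self.cached_method = method          # and the result of the lookup
        return method


base = {"Greet": lambda this: "base"}
derived = {"Parent": base}
other = {"Greet": lambda this: "other"}

site = CallSite("Greet")
a, b, c = {"Class": derived}, {"Class": derived}, {"Class": other}
results = [site.lookup(o)(o) for o in (a, b, c)]
```

The objects `a` and `b` differ but share a class, so only the first of them triggers a full lookup. The object `c` belongs to another class, which causes a second miss and overwrites both slots.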
Bibliography

[AG96] Ken Arnold and James Gosling. The Java Programming Language.
    The Java Series. Addison-Wesley, Reading, MA, 1996.
[Boo94] Grady Booch. Object-Oriented Analysis and Design with Applications.
    Benjamin/Cummings Publishing, Redwood City, CA, second edition, 1994.
[Col93] Derek Coleman. Object-Oriented Development: The Fusion Method.
    Prentice Hall, Englewood Cliffs, NJ, 1993.
[CW85] Luca Cardelli and Peter Wegner. On understanding types, data
    abstraction, and polymorphism. ACM Computing Surveys, 17(4):471–522,
    December 1985.
[DN66] Ole-Johan Dahl and Kristen Nygaard. Simula, an Algol-based simulation
    language. Communications of the ACM, 9(9):671–678, 1966.
[GR83] Adele Goldberg and David Robson. Smalltalk-80: The Language and its
    Implementation. Addison-Wesley, Reading, MA, 1983.
[HCU91] Urs Hölzle, Craig Chambers, and David Ungar. Optimizing
    dynamically-typed object-oriented languages with polymorphic inline
    caches. In Pierre America, editor, Proceedings of the European Conference
    on Object-Oriented Programming, Lecture Notes in Computer Science 512,
    pages 21–38, Geneva, Switzerland, 1991. Springer-Verlag, Berlin.
[Ing78] Dan Ingalls. The Smalltalk-76 programming system design and
    implementation. In Proceedings of the ACM Symposium on Principles of
    Programming Languages, Tucson, AZ, 1978. The ACM Press, New York.
[KMMPN83] Bent Bruun Kristensen, Ole Lehrmann Madsen, Birger Møller-Pedersen,
    and Kristen Nygaard. Abstraction mechanisms in the Beta programming
    language. In Alan Demers, editor, Proceedings of the ACM Symposium on
    Principles of Programming Languages, pages 285–298, Austin, TX, 1983.
    The ACM Press, New York.
[Ste90] Guy Steele. Common LISP: The Language. Digital Press, second
    edition, 1990.