Chapter 15 Iterator
Contents
15.1 Introduction
15.3 Cohesion
15.6 Example
15.7 Exercises
References
15.1 Introduction
To iterate means to repeat. In software it may be implemented using recursion or loop
structures such as for-loops and while-loops. A class that provides the functionality to
support iteration is called an iterator.
The term aggregate is used to refer to a collection of objects. In software a collection may
be implemented in an array, a vector, a binary tree, or any other data structure of objects.
The iterator pattern is prescriptive about how aggregates and their iterators should be
implemented.
Two general principles are applied in the iterator design pattern. The first is a prominent
principle of good design, namely separation of concerns. The other is a fundamental
principle of generic programming, namely decoupling of data and operations. The iterator
design pattern suggests that the functionality to traverse an aggregate should be moved to
an iterator, while the functionality to maintain the aggregate remains the responsibility of
the aggregate itself. This way the principle of separation of concerns is applied, because the
functionality concerned with the maintenance of aggregates is separated from the functionality
concerned with the traversal of the aggregates. At the same time the operation to traverse is
decoupled from the data structures which are traversed, leading to the creation of a more
generic traversal algorithm.
In this lecture we discuss how separation of concerns leads to better cohesion before we
proceed to explain the design and implementation of the iterator design pattern.
15.2.1 Identification
Name       Classification   Strategy
Iterator   Behavioural      Delegation
Intent
Provide a way to access the elements of an aggregate object sequentially without
exposing its underlying representation ([3]:257).
15.2.2 Problem
A system has wildly different data structures that are often traversed for similar results.
Consequently traversal code is duplicated but with minor differences because each aggre-
gate has its own way to provide the functionality to access and traverse its objects. To
eliminate such duplication, we need to abstract the traversal of these data structures so
that algorithms can be defined that are capable of interfacing with them transparently [4].
15.2.3 Structure
15.2.4 Participants
Iterator
Concrete Iterator
Aggregate
Concrete Aggregate
15.3 Cohesion
When designing a system, it is important to keep its maintenance in mind. Making
changes should be easy. One of the design principles that can be applied to reduce the
need to change a class is separation of concerns. This means that functionality concerning
different aspects should be separated from one another by implementing them in different
classes. Thus, the functionality provided by each class in the design should be related to
one aspect only. If a single class implements various aspects of functionality, changes in
any one of these aspects will result in having to change the class. On the other hand,
if a class implements only one aspect of functionality, it will change if and only if that
specific aspect changes. Note that the total amount of change is not reduced; only the
chance of having to change each individual class is reduced, meaning that changes are
isolated to certain classes.
The term cohesion is used to refer to the internal consistency within parts of the design.
In object-oriented design the level of cohesion of a system is determined by the level
of cohesion of the classes that constitute the system. A metric to measure the lack of
cohesion in methods (LCOM) was proposed by [1]. It is recognised as the most used
metric when trying to measure the goodness of a class written in some object-oriented
language [5]. When calculating the LCOM of a class, two methods are considered to have
a lack of cohesiveness when they operate on disjoint sets of attributes in a class. While it
is valid to state that methods are not cohesive when they operate on different attributes,
it is not conclusive that they are cohesive if they operate on the same attributes.
Consequently the presence of cohesiveness is much harder to observe than the absence
thereof.
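The counting rule behind LCOM can be sketched in a few lines. The sketch below follows the definition in [1]: with P the number of method pairs that share no attributes and Q the number of pairs that share at least one, LCOM is P - Q when P > Q and 0 otherwise. The names MethodAttributes, sharesAttribute and lcom are hypothetical and chosen only for this illustration, and each method is assumed to be described simply by the set of attribute names it uses.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: one set of attribute names per method.
using MethodAttributes = std::vector<std::set<std::string>>;

// True when the two attribute sets have at least one name in common.
bool sharesAttribute(const std::set<std::string>& a,
                     const std::set<std::string>& b) {
    for (const std::string& name : a) {
        if (b.count(name) > 0) return true;
    }
    return false;
}

// LCOM = P - Q if P > Q, otherwise 0, where P counts disjoint method
// pairs and Q counts pairs sharing at least one attribute.
std::size_t lcom(const MethodAttributes& methods) {
    std::size_t disjointPairs = 0;  // P
    std::size_t sharingPairs = 0;   // Q
    for (std::size_t i = 0; i < methods.size(); ++i) {
        for (std::size_t j = i + 1; j < methods.size(); ++j) {
            if (sharesAttribute(methods[i], methods[j])) ++sharingPairs;
            else ++disjointPairs;
        }
    }
    // Sanity check: every unordered pair was counted exactly once.
    const std::size_t n = methods.size();
    assert(disjointPairs + sharingPairs == n * (n - 1) / 2);
    return disjointPairs > sharingPairs ? disjointPairs - sharingPairs : 0;
}
```

A result of 0 only says that no lack of cohesion was detected by this count, which matches the caveat above: sharing attributes does not prove that the methods are functionally related.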
We say a class has high cohesion when its methods are related. These methods should be
related not only by operating on the same attributes of a class, but more importantly they
must be related in terms of the functions they perform. If methods perform functions
that are related to different responsibilities, they should be separated by including them
in different classes. However, separating responsibility in design is one of the most difficult
things to do. Sometimes non-cohesiveness of a class is only realised when the class tends
to change more often or in more than one way as the system grows.
Separation of concerns will increase the overall cohesion of a system and may reduce the
number of classes that have to change when a change is needed. However, it will most
likely increase the number of classes in the system. This, in turn, will increase complexity
as well as coupling between the classes. Finding the best design is elusive. When improving
one thing one tends to worsen another! Through experience one learns how to find good
solutions that have low coupling without compromising too much in terms of high cohesion.
15.4.1 Design
The Iterator design pattern applies separation of concerns specifically to aggregates.
Usually aggregates have at least two functions: one is maintenance and the other is
traversal. Maintenance of aggregates includes methods to add and remove elements and
the like, while traversal of aggregates concerns only accessing the elements and knowing
an order in which they should be accessed. The iterator design pattern describes a design
that separates the mechanism to iterate through the aggregate from the other functions
an aggregate may have.
The Iterator design pattern moves the responsibility of traversing objects away from the
aggregate to another class called an iterator. The aggregate class, therefore, can have a
simpler interface and implementation because it needs only to cater for maintenance of
the aggregate and no longer for its traversal [2].
The iterator design pattern takes this good design a step further. Instead of just
implementing every aggregate in two classes (one for maintenance, and one for traversal),
this pattern is a design that provides a generic way to traverse the objects in aggregates
that is independent of the structure of the various aggregates. This is achieved by defining
two abstract interfaces: one for iteration and one for the rest of the functionality of
aggregates. This way the system is more flexible when either aggregates or their iterators
need maintenance.
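The two abstract interfaces described above might look as follows. This is a minimal sketch under stated assumptions, not the code of the example in Section 15.6: the method names first(), next(), isDone() and currentItem() follow the participants described in [3], while VectorAggregate, ListAggregate, their iterators and the free function sum() are hypothetical names introduced here. The element type is fixed to double to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Abstract iterator: traversal only.
class Iterator {
public:
    virtual ~Iterator() = default;
    virtual void first() = 0;
    virtual void next() = 0;
    virtual bool isDone() const = 0;
    virtual double currentItem() const = 0;
};

// Abstract aggregate: maintenance plus a factory method for iterators.
class Aggregate {
public:
    virtual ~Aggregate() = default;
    virtual void add(double value) = 0;
    virtual bool isEmpty() const = 0;
    virtual std::unique_ptr<Iterator> createIterator() const = 0;
};

class VectorAggregate : public Aggregate {
public:
    void add(double value) override { items_.push_back(value); }
    bool isEmpty() const override { return items_.empty(); }
    std::unique_ptr<Iterator> createIterator() const override;
private:
    friend class VectorIterator;  // the iterator may read the storage
    std::vector<double> items_;
};

class VectorIterator : public Iterator {
public:
    explicit VectorIterator(const VectorAggregate& a) : aggregate_(a) {}
    void first() override { index_ = 0; }
    void next() override { ++index_; }
    bool isDone() const override { return index_ >= aggregate_.items_.size(); }
    double currentItem() const override {
        assert(!isDone());
        return aggregate_.items_[index_];
    }
private:
    const VectorAggregate& aggregate_;
    std::size_t index_ = 0;
};

std::unique_ptr<Iterator> VectorAggregate::createIterator() const {
    return std::unique_ptr<Iterator>(new VectorIterator(*this));
}

class ListAggregate : public Aggregate {
public:
    void add(double value) override { items_.push_back(value); }
    bool isEmpty() const override { return items_.empty(); }
    std::unique_ptr<Iterator> createIterator() const override;
private:
    friend class ListIterator;
    std::list<double> items_;
};

class ListIterator : public Iterator {
public:
    explicit ListIterator(const ListAggregate& a)
        : aggregate_(a), pos_(a.items_.begin()) {}
    void first() override { pos_ = aggregate_.items_.begin(); }
    void next() override { ++pos_; }
    bool isDone() const override { return pos_ == aggregate_.items_.end(); }
    double currentItem() const override {
        assert(!isDone());
        return *pos_;
    }
private:
    const ListAggregate& aggregate_;
    std::list<double>::const_iterator pos_;
};

std::unique_ptr<Iterator> ListAggregate::createIterator() const {
    return std::unique_ptr<Iterator>(new ListIterator(*this));
}

// Client code written once against the two abstract interfaces.
double sum(const Aggregate& aggregate) {
    double total = 0.0;
    std::unique_ptr<Iterator> it = aggregate.createIterator();
    for (it->first(); !it->isDone(); it->next()) total += it->currentItem();
    return total;
}
```

Note that sum() never mentions a vector or a list: swapping the concrete aggregate requires no change to the traversal code, which is the flexibility the paragraph above refers to.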
• Iterators contribute to the flexibility of your code: if you change the underlying
container, it is easy to change the associated iterator. Thus, the code using
aggregates becomes much easier to maintain. Most changes to the internal structure of
an aggregate will have no impact on the code that uses the aggregate.
• Iterators contribute to the reusability of your code: algorithms that were written to
operate on containers that use an iterator can easily be reused on other containers,
provided that they use compatible iterators. Thus, the same code can be used to
traverse a variety of aggregate structures in the same application. This reduces
duplication of code in applications that manipulate multiple aggregates.
• It is easy to provide different ways to iterate through the same structure, for example
traversing a game tree breadth first or depth first, or having an iterator that
provides access only to those elements that satisfy specific constraints.
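The last point, an iterator that exposes only the elements satisfying a constraint, can be illustrated with a small sketch. FilteringIterator and collect are hypothetical names, the aggregate is assumed to be a plain std::vector<int>, and the constraint is passed in as a predicate. The iterator keeps a reference to the vector, so the vector must outlive the iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical iterator that visits only the elements accepted by a predicate.
class FilteringIterator {
public:
    FilteringIterator(const std::vector<int>& items,
                      std::function<bool(int)> accept)
        : items_(items), accept_(std::move(accept)) {
        first();
    }
    void first() { index_ = 0; skipRejected(); }
    void next() { assert(!isDone()); ++index_; skipRejected(); }
    bool isDone() const { return index_ >= items_.size(); }
    int currentItem() const { assert(!isDone()); return items_[index_]; }
private:
    // Advance until the current element is accepted or the end is reached.
    void skipRejected() {
        while (index_ < items_.size() && !accept_(items_[index_])) ++index_;
    }
    const std::vector<int>& items_;
    std::function<bool(int)> accept_;
    std::size_t index_ = 0;
};

// Gathers everything the iterator visits, in traversal order.
std::vector<int> collect(FilteringIterator it) {
    std::vector<int> result;
    for (it.first(); !it.isDone(); it.next()) result.push_back(it.currentItem());
    return result;
}
```

The aggregate itself is untouched: a second iterator with a different predicate can traverse the same vector at the same time.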
15.4.3 Disadvantage
A prominent disadvantage of the application of the iterator design pattern is that it
becomes complicated to synchronise an aggregate with its iterator. Because the aggregate
structure is completely independent of the iteration process, it is possible to apply
changes to the aggregate while an independent thread iterates through the structure.
Such a situation is prone to error.
In the example in Section 15.6, VectorSteppingTool creates a copy of the state of the
aggregate. In this case the iterator cannot malfunction even if the aggregate is changed
during the iteration process. However, the iteration will complete without reflecting the
changes. On the other hand, LinkedListIterator operates directly on its aggregate. If
the LinkedList is changed, the iterator will reflect such changes immediately. However,
it is prone to error if not synchronised properly. For example, if the current item of the
iterator is deleted, the next call to next() is likely to cause a segmentation fault. To
prevent this, the implementation should either disallow the deletion of the current item
in all iterations (which might be difficult to implement), or update the current item in all
iterations when an item is deleted. To implement this, iterators need to be registered as
observers of the delete action, which is also not trivial.
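The difference between the two strategies can be made concrete with a small sketch. The classes below are hypothetical stand-ins written for this illustration, not the VectorSteppingTool and LinkedListIterator of Section 15.6: SnapshotIterator copies the elements when it is created, while LiveIterator keeps a pointer to the aggregate (here simply a std::vector<double>) and reads it on every step. Both use a hasNext()/next() style interface, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies the state of the aggregate at construction time.
class SnapshotIterator {
public:
    explicit SnapshotIterator(const std::vector<double>& items) : copy_(items) {}
    bool hasNext() const { return index_ < copy_.size(); }
    double next() { assert(hasNext()); return copy_[index_++]; }
private:
    std::vector<double> copy_;
    std::size_t index_ = 0;
};

// Reads the aggregate directly on every call.
class LiveIterator {
public:
    explicit LiveIterator(const std::vector<double>& items) : items_(&items) {}
    bool hasNext() const { return index_ < items_->size(); }
    double next() { assert(hasNext()); return (*items_)[index_++]; }
private:
    const std::vector<double>* items_;
    std::size_t index_ = 0;
};

// Counts how many elements an iterator still yields.
template <typename It>
std::size_t countRemaining(It it) {
    std::size_t n = 0;
    while (it.hasNext()) {
        it.next();
        ++n;
    }
    return n;
}
```

If an element is appended after both iterators have been created, the snapshot iterator still yields the original number of elements, whereas the live iterator also yields the new one. The live iterator stays well defined here only because it stores an index rather than a pointer to an element; an iterator holding a pointer to a deleted list node would not.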
Memento
The memento pattern is often used in conjunction with the iterator pattern. An
iterator can use a memento to capture the state of the aggregate. This memento is
stored inside the iterator to be used for traversing the aggregate.
Adapter
Both patterns provide an interface through which operations are performed. They
differ in the reason for providing this interface. The adapter does it because the
existing interface would otherwise be unusable by the client, while the iterator does
it specifically to generalise iteration of aggregates.
Composite
Recursive structures such as composites usually need iterators to traverse them
sequentially. Although recursive traversal might be very easy to implement without
extending the composite pattern, it is strongly advised to create a composite
iterator as discussed in Section 3.
15.5 Implementation Issues
• Make a copy of the aggregate inside the iterator. This is the most robust
solution. This is execution-wise the most efficient, but memory-wise the least effi-
cient. It also has the drawback of not being able to reflect on-the-fly changes to the
aggregate.
• Create an object storing the state of the aggregate inside the iterator.
This more or less boils down to storing a memento (See the memento design pattern)
of the aggregate inside its iterator. This is also a robust solution. This might be
more efficient than making a copy of the whole aggregate, but not always easy to
implement. It suffers the same drawback of not being able to reflect changes to the
aggregate that are made after the iterator was created.
• Keep a pointer to the aggregate inside the iterator and use a callback
mechanism to access the elements of the aggregate. This solution is memory
efficient, yet not as robust as the other methods. In this case the methods that
need to be called should be public in the aggregate, or alternatively the iterator
can be declared a friend class of the aggregate and hence be
given access to its private/protected methods. This solution will be able to reflect
changes that are applied to the aggregate in real time; however, it is prone to error if
synchronisation between the aggregate and the iterator is not implemented properly.
Such close coupling between the aggregate and the iterator also compromises the
encapsulation of the aggregate.
• Use the pimpl¹ principle. This is the most efficient, both in terms of memory
and execution time. It is also robust. How this is done is beyond the scope of this
module.
• remove() – This method should remove the current item from the aggregate. It
provides the means to synchronise the maintenance of the aggregate with its iterator
by using a double dispatch².
¹ pointer to implementation
² More detail on the double dispatch mechanism is discussed in L34 Visitor.
• previous() – This method should step backwards instead of forwards to enable
iterations that can go in both directions. If this is supported, one should also implement
two different methods in place of the prescribed isDone(): one for reaching the end while
moving forward, and one for reaching the beginning while moving backwards. This
is usually implemented using method names like hasNext() and hasPrevious().
• skipTo() – This method should position the iterator to an object matching specific
criteria. This operation may be useful for sorted or indexed collections to enable
the implementation of more complicated algorithms to operate on the aggregate.
Examples of algorithms that may need this operation are binary search and quick
sort.
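The previous() and skipTo() operations listed above can be sketched together. BidirectionalIterator is a hypothetical class over a std::vector<int>; it models the position as a cursor between elements, so that next() returns the element after the cursor and previous() returns the element before it. The skipTo() shown here assumes that the vector is sorted in ascending order and uses std::lower_bound, which is one possible matching criterion and not one prescribed by the pattern.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-way iterator; the vector must outlive the iterator.
class BidirectionalIterator {
public:
    explicit BidirectionalIterator(const std::vector<int>& items) : items_(items) {}

    // Replaces isDone() for forward movement.
    bool hasNext() const { return position_ < items_.size(); }
    // Replaces isDone() for backward movement.
    bool hasPrevious() const { return position_ > 0; }

    int next() { assert(hasNext()); return items_[position_++]; }
    int previous() { assert(hasPrevious()); return items_[--position_]; }

    // Positions the cursor before the first element that is not less than
    // target (assumes ascending order). Returns true on an exact match.
    bool skipTo(int target) {
        std::vector<int>::const_iterator it =
            std::lower_bound(items_.begin(), items_.end(), target);
        position_ = static_cast<std::size_t>(it - items_.begin());
        return it != items_.end() && *it == target;
    }

private:
    const std::vector<int>& items_;
    std::size_t position_ = 0;
};
```

With this cursor model, calling next() and then previous() returns the same element twice, which mirrors the behaviour of list iterators in some other libraries; a design with an explicit current item would behave differently.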
When one uses one of these containers, a variable of the container type is declared by
including the type of the objects as a template parameter. For example, use the following
syntax to declare a vector of integers called myVector:
vector<int> myVector;
To declare an iterator appropriate for a particular STL template class, you use the
following syntax:
std::class_name<template_parameters>::iterator name;
where name is the name of the iterator variable you wish to create, class_name
is the name of the STL container you are using, and template_parameters are the
parameters to the template used to declare objects that will work with this iterator. Note
that because the STL classes are part of the std namespace, you will need to either prefix
every container class type with std::, or include using namespace std; at the top of
your program. For example, you can create an iterator for the vector myVector that was
declared in the example above as follows:
std::vector<int>::iterator myIterator;
The two loops in the following code fragment are functionally equivalent. The first uses
an integer counter to iterate through the vector that was declared in the above mentioned
example, while the second uses the iterator that is declared here:
for (int myCounter = 0; myCounter < myVector.size(); myCounter++)
    cout << myVector[myCounter] << '\t';
for (myIterator = myVector.begin(); myIterator < myVector.end(); myIterator++)
    cout << *myIterator << '\t';
Note how the elements of the vector are accessed by using the operator[] in the first loop,
while they are accessed by dereferencing the iterator in the second loop. To move from
one element to the next, the increment operator, ++, is used in both cases. Random-access
iterators, such as those of vector, overload the pointer-style operators: one can use the
standard arithmetic shortcuts such as --, += and -=, and also use !=, ==, <, >, <=, and >=
to compare iterator positions within the container.
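The two loops and the iterator arithmetic described above can be packaged into a self-contained sketch. The function names printWithCounter, printWithIterator and elementAt are hypothetical; the output is written to a string stream instead of cout so that the two results can be compared. The counter is declared as std::size_t here to match the unsigned type returned by size().

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Traverses the vector with an integer counter and operator[].
std::string printWithCounter(const std::vector<int>& myVector) {
    std::ostringstream out;
    for (std::size_t myCounter = 0; myCounter < myVector.size(); myCounter++)
        out << myVector[myCounter] << '\t';
    return out.str();
}

// Traverses the vector with an iterator that is dereferenced at each step.
std::string printWithIterator(const std::vector<int>& myVector) {
    std::ostringstream out;
    for (std::vector<int>::const_iterator myIterator = myVector.begin();
         myIterator != myVector.end(); ++myIterator)
        out << *myIterator << '\t';
    return out.str();
}

// Uses iterator arithmetic (+=) to jump directly to a position.
int elementAt(const std::vector<int>& myVector, std::size_t offset) {
    assert(offset < myVector.size());
    std::vector<int>::const_iterator it = myVector.begin();
    it += static_cast<std::vector<int>::difference_type>(offset);
    return *it;
}
```

The iterator loop compares with != rather than <, because != works for every standard container iterator, whereas < is available only for random-access iterators such as those of vector.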
The following are some pitfalls to watch out for when using STL iterators:
• Iterators can be invalidated if the underlying container (the container being iterated
over) is changed significantly, for example when elements are inserted or erased.
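A common way to stay clear of this pitfall when erasing from a vector during traversal is to continue from the iterator that erase() returns, since the erased position and everything after it are invalidated. The sketch below shows this idiom; the function name eraseEven is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Removes all even values while iterating over the same vector.
void eraseEven(std::vector<int>& values) {
    for (std::vector<int>::iterator it = values.begin(); it != values.end(); ) {
        if (*it % 2 == 0) {
            // erase() invalidates 'it'; it returns a valid iterator to the
            // element that followed the erased one, so continue from there.
            it = values.erase(it);
        } else {
            ++it;
        }
    }
    // Postcondition: no even value is left.
    assert(std::none_of(values.begin(), values.end(),
                        [](int v) { return v % 2 == 0; }));
}
```

Writing values.erase(it) followed by ++it instead would use an invalidated iterator, which is undefined behaviour.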
Figure 2: Class Diagram of a system illustrating the implementation of the iterator design pattern
15.6 Example
Figure 2 is a class diagram of an application that implements the iterator design
pattern. It implements two data structures and their respective iterators. The main
program uses the same code to manipulate any one of these data structures. It also shows
how two independent iterators can be used to traverse the same structure at the same
time. The two data structures are a vector and a singly linked list.
Iterator
Concrete Iterator
Aggregate
• It also defines methods to be able to maintain concrete objects of classes
derived from this interface. It supports only one insertion and a default deletion
of elements, as well as a means to determine if the collection is empty.
Concrete Aggregate
Client
15.7 Exercises
1. Change the test harness (main program) of the system given in Section 15.6 to allow
the insertion and deletion of nodes during iteration and observe the impact. (See
Section 15.4.3).
2. Add a structure that stores a binary tree of double values to the system given in
Section 15.6. The values should be inserted in such a way that the left child of
every parent is smaller than its parent and the right child is larger than its parent.
Duplicates should be ignored. Implement different concrete SteppingTools to allow
pre-order, in-order, and post-order traversal of your binary tree.
Write a new test harness. This program should insert random double values si-
multaneously into a VectorOfDoubles and into your binary tree in the order they
are generated. Use a VectorSteppingTool to display the vector and your different
binary iterators to show the different traversals of your binary tree.
References
[1] S.R. Chidamber and C.F. Kemerer. A metrics suite for object oriented design. IEEE
Transactions on Software Engineering, 20(6):476-493, June 1994.
[2] Eric Freeman, Elisabeth Freeman, Bert Bates, and Kathy Sierra. Head First Design
Patterns. O’Reilly Media, Sebastopol, CA 95472, 1st edition, 2004.
[3] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, Mass., 1995.
[5] Sami Mäkelä and Ville Leppänen. Observation on lack of cohesion metrics. In
Proceedings of the International Conference on Computer Systems and Technologies, 2006.