JENKOV Tutorial On Collections
The Java Collections API provides Java developers with a set of classes and interfaces that make it easier to handle
collections of objects. In a sense, collections work a bit like arrays, except that their size can change dynamically and
they have more advanced behaviour than arrays.
Rather than having to write your own collection classes, Java provides these ready-to-use collection classes for you.
This tutorial will look closer at the Java Collections, as they are also sometimes referred to, and more specifically
the Java Collections available in Java 6.
The purpose of this tutorial is to give you an overview of the Java Collection classes. Thus it will not describe each
and every little detail of the Java Collection classes. But, once you have an overview of what is there, it is much
easier to read the rest in the JavaDoc afterwards.
Most of the Java collections are located in the java.util package. Java also has a set of concurrent collections in
the java.util.concurrent package. This tutorial will not describe the concurrent collections. These will be described in
their own tutorial some time in the future.
In order to understand and use the Java Collections API effectively it is useful to have an overview of the interfaces
it contains. So, that is what I will provide here.
There are two "groups" of interfaces: Collections and Maps.
Here is a graphical overview of the Collection interface hierarchy:
for(Object o : list){
//do something with o;
}
The Iterable interface has only one method:
public interface Iterable<T> {
public Iterator<T> iterator();
}
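As a minimal sketch of what implementing Iterable can look like, the hypothetical IntRange class below (not part of the Java API, invented for this illustration) returns an Iterator over a range of integers, which is enough for it to work with the for-each loop:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example class: a half-open range of ints usable in a for-each loop.
class IntRange implements Iterable<Integer> {
    private final int start;
    private final int end; // exclusive

    IntRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            public boolean hasNext() {
                return next < end;
            }

            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Sums the range using the for-each loop, which calls iterator() behind the scenes.
    static int sum(IntRange range) {
        int total = 0;
        for (int value : range) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new IntRange(1, 4)));
    }
}
```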
How you implement this Iterable interface so that you can use it with the new for-loop, is
Collection Subtypes
The following interfaces (collection types) extend the Collection interface:
List
Set
SortedSet
NavigableSet
Queue
Deque
Java does not come with a usable implementation of the Collection interface, so you will have to use one of the
listed subtypes. The Collection interface just defines a set of methods (behaviour) that each of these Collection
subtypes shares. This makes it possible to ignore what specific type of Collection you are using, and just treat it as a
Collection. This is standard inheritance, so there is nothing magical about it, but it can still be a nice feature from time
to time. Later sections in this text will describe the most used of these common operations.
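Here is a method that operates on a Collection, followed by two calls that pass it a Set and a List:
public class MyCollectionUtil{
  public static void doSomething(Collection collection) { /* uses only Collection methods */ }
}
MyCollectionUtil.doSomething(set);
MyCollectionUtil.doSomething(list);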
Collection Size
You can check the size of a collection using the size() method. By "size" is meant the number of elements in the
collection. Here is an example:
int numberOfElements = collection.size();
Iterating a Collection
You can iterate all elements of a collection. This is done by obtaining an Iterator from the collection, and iterating
through that. Here is how it looks:
Collection collection = new HashSet();
//... add elements to the collection
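The iteration itself can be sketched as follows. The class and method names (IteratorSketch, visitAll) are made up for this illustration; only iterator(), hasNext() and next() come from the Collections API:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;

class IteratorSketch {
    // Visits every element through an Iterator and returns how many were seen.
    static int visitAll(Collection<?> collection) {
        int visited = 0;
        Iterator<?> iterator = collection.iterator();
        while (iterator.hasNext()) {
            Object element = iterator.next();
            // do something with element here...
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        Collection<String> collection = new HashSet<String>();
        collection.add("element 1");
        collection.add("element 2");
        System.out.println(visitAll(collection));
    }
}
```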
It is possible to generify the various Collection and Map types and subtypes in the Java collection API. This text will
not cover generics in detail. Java Generics is covered in my
Java Generics tutorial.
The Collection interface can be generified like this:
Collection<String> stringCollection = new HashSet<String>();
This stringCollection can now only contain String instances. If you try to add anything else, or cast the elements in
the collection to any other type than String, the compiler will complain.
Actually, it is possible to insert objects other than String objects if you cheat a little (for example by going through
a raw Collection reference), but this is not recommended.
You can iterate the above collection using the new for-loop, like this:
Collection<String> stringCollection = new HashSet<String>();
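The loop over such a generified collection might look like the sketch below. ForEachSketch and totalLength are illustrative names, not part of the API. Note that the loop variable is declared as String, so no cast is needed:

```java
import java.util.Collection;
import java.util.HashSet;

class ForEachSketch {
    // Adds up the lengths of all strings in the collection.
    static int totalLength(Collection<String> stringCollection) {
        int total = 0;
        for (String stringElement : stringCollection) {
            total += stringElement.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Collection<String> stringCollection = new HashSet<String>();
        stringCollection.add("ab");
        stringCollection.add("cde");
        System.out.println(totalLength(stringCollection));
    }
}
```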
List Implementations
Being a Collection subtype, all methods in the Collection interface are also available in the List interface.
Since List is an interface you need to instantiate a concrete implementation of the interface in order to use it. You
can choose between the following List implementations in the Java Collections API:
java.util.ArrayList
java.util.LinkedList
java.util.Vector
java.util.Stack
There are also List implementations in the java.util.concurrent package, but I will leave the concurrency utilities out
of this tutorial.
Here are a few examples of how to create a List instance:
List listA = new ArrayList();
List listB = new LinkedList();
List listC = new Vector();
List listD = new Stack();
listA.add("element 1");
listA.add("element 2");
listA.add("element 3");
listA.add("element 0");
listA.add("element 1");
listA.add("element 2");
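As a small sketch of adding and accessing elements, the example below appends two elements, inserts a third at a specific index with add(int, E), and reads elements back with get(int). The class name ListAccessSketch is an assumption made for the example:

```java
import java.util.ArrayList;
import java.util.List;

class ListAccessSketch {
    static List<String> build() {
        List<String> listA = new ArrayList<String>();
        listA.add("element 1");    // appended, index 0
        listA.add("element 2");    // appended, index 1
        listA.add(0, "element 0"); // inserted at index 0, the others shift to 1 and 2
        return listA;
    }

    public static void main(String[] args) {
        List<String> listA = build();
        // Access by index.
        System.out.println(listA.get(0));
        // Access by iteration, in index order.
        for (String element : listA) {
            System.out.println(element);
        }
    }
}
```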
Removing Elements
You can remove elements in two ways:
1. remove(Object element)
2. remove(int index)
remove(Object element) removes that element in the list, if it is present. All subsequent elements in the list are then
moved up in the list. Their index thus decreases by 1.
remove(int index) removes the element at the given index. All subsequent elements in the list are then moved up in
the list. Their index thus decreases by 1.
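The two remove() variants can be tried out with a sketch like this one (ListRemoveSketch is a made-up name for the illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListRemoveSketch {
    static List<String> demo() {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c", "d"));
        list.remove("b"); // remove(Object): the list is now [a, c, d]
        list.remove(0);   // remove(int): the list is now [c, d]
        return list;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

One thing to watch for: on a List<Integer>, a call like list.remove(1) selects remove(int index), not remove(Object element).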
Generic Lists
By default you can put any Object into a List, but from Java 5, Java Generics makes it possible to limit the types of
object you can insert into a List. Here is an example:
List<MyObject> list = new ArrayList<MyObject>();
This List can now only have MyObject instances inserted into it. You can then access and iterate its elements
without casting them. Here is how it looks:
MyObject myObject = list.get(0);
Set Implementations
Being a Collection subtype, all methods in the Collection interface are also available in the Set interface.
Since Set is an interface you need to instantiate a concrete implementation of the interface in order to use it. You can
choose between the following Set implementations in the Java Collections API:
java.util.EnumSet
java.util.HashSet
java.util.LinkedHashSet
java.util.TreeSet
Each of these Set implementations behaves a little differently with respect to the order of the elements when
iterating the Set, and the time (big O notation) it takes to insert and access elements in the sets.
HashSet is backed by a HashMap. It makes no guarantees about the sequence of the elements when you iterate them.
LinkedHashSet differs from HashSet by guaranteeing that the order of the elements during iteration is the same as
the order they were inserted into the LinkedHashSet. Reinserting an element that is already in
the LinkedHashSet does not change this order.
TreeSet also guarantees the order of the elements when iterated, but the order is the sorting order of the elements. In
other words, the order in which the elements would be sorted if you used Collections.sort() on a List
containing these elements. This order is determined either by their natural order (if they implement Comparable), or
by a specific Comparator implementation.
There are also Set implementations in the java.util.concurrent package, but I will leave the concurrency utilities out
of this tutorial.
Here are a few examples of how to create a Set instance:
Set setA = new HashSet();
Set setB = new LinkedHashSet();
Set setC = new TreeSet();
//EnumSet is abstract and cannot be created with new; use a factory method such as EnumSet.noneOf(), passing your enum class
setA.add("element 1");
setA.add("element 2");
setA.add("element 3");
Each of the three add() calls adds a String instance to the set.
When iterating the elements in the Set, the order of the elements depends on what Set implementation you use, as
mentioned earlier. Here is an iteration example:
Set setA = new HashSet();
setA.add("element 0");
setA.add("element 1");
setA.add("element 2");
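To make the difference in iteration order concrete, here is a sketch that fills different Set implementations with the same elements and records the order in which they come back. SetOrderSketch and iterationOrder are names invented for the example:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderSketch {
    // Adds the same elements to the given set and returns them in iteration order.
    static List<String> iterationOrder(Set<String> set) {
        set.add("banana");
        set.add("apple");
        set.add("cherry");
        set.add("banana"); // duplicate, ignored by every Set implementation

        List<String> order = new ArrayList<String>();
        for (String element : set) {
            order.add(element);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(iterationOrder(new HashSet<String>()));       // no guaranteed order
        System.out.println(iterationOrder(new LinkedHashSet<String>())); // insertion order
        System.out.println(iterationOrder(new TreeSet<String>()));       // sorted order
    }
}
```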
Removing Elements
You remove elements by calling the remove(Object o) method. There is no way to remove an object based on index
in a Set, since the order of the elements depends on the Set implementation.
Generic Sets
By default you can put any Object into a Set, but from Java 5, Java Generics makes it possible to limit the types of
object you can insert into a Set. Here is an example:
Set<MyObject> set = new HashSet<MyObject>();
This Set can now only have MyObject instances inserted into it. You can then access and iterate its elements without
casting them. Here is how it looks:
for(MyObject anObject : set){
//do something to anObject...
}
For more information about Java Generics, see the Java Generics Tutorial.
//this headset will contain "1", "2", and "3" because "inclusive"=true
NavigableSet headset = original.headSet("3", true);
The tailSet() method works the same way, except it returns all elements that are equal to or higher than the given parameter
element. (The one-parameter headSet() excludes the given element, whereas the one-parameter tailSet() includes it.)
The subSet() method allows you to pass two parameters demarcating the boundaries of the view set to return. Elements
matching the first boundary are included, whereas elements matching the last boundary are not. Here is an example:
NavigableSet original = new TreeSet();
original.add("1");
original.add("2");
original.add("3");
original.add("4");
original.add("5");
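//this subset will contain "2" and "3"
SortedSet subset = original.subSet("2", "4");
//pollFirst() removes and returns the first element, "1"
Object first = original.pollFirst();
//pollLast() removes and returns the last element, "5"
Object last = original.pollLast();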
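The view and poll methods can be compared side by side in a sketch like the following. NavigableSetSketch and oneToFive are illustrative names. The sketch assumes a TreeSet of the strings "1" to "5" as in the example above:

```java
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

class NavigableSetSketch {
    static NavigableSet<String> oneToFive() {
        NavigableSet<String> set = new TreeSet<String>();
        for (int i = 1; i <= 5; i++) {
            set.add(String.valueOf(i));
        }
        return set;
    }

    public static void main(String[] args) {
        NavigableSet<String> original = oneToFive();

        SortedSet<String> head = original.headSet("3");                   // [1, 2]
        NavigableSet<String> headInclusive = original.headSet("3", true); // [1, 2, 3]
        SortedSet<String> tail = original.tailSet("3");                   // [3, 4, 5]
        SortedSet<String> sub = original.subSet("2", "4");                // [2, 3]
        System.out.println(head + " " + headInclusive + " " + tail + " " + sub);

        // pollFirst() and pollLast() remove the element they return.
        System.out.println(original.pollFirst()); // 1
        System.out.println(original.pollLast());  // 5
    }
}
```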
Map Implementations
Since Map is an interface you need to instantiate a concrete implementation of the interface in order to use it. You
can choose between the following Map implementations in the Java Collections API:
java.util.HashMap
java.util.Hashtable
java.util.EnumMap
java.util.IdentityHashMap
java.util.LinkedHashMap
java.util.Properties
java.util.TreeMap
java.util.WeakHashMap
In my experience, the most commonly used Map implementations are HashMap and TreeMap.
Each of these Map implementations behaves a little differently with respect to the order of the elements when
iterating the Map, and the time (big O notation) it takes to insert and access elements in the maps.
HashMap maps a key to a value. It does not guarantee any order of the elements stored internally in the map.
TreeMap also maps a key to a value. Furthermore it guarantees the order in which keys or values are iterated -
which is the sort order of the keys. Check out the JavaDoc for more details.
Here are a few examples of how to create a Map instance:
Map mapA = new HashMap();
Map mapB = new TreeMap();
Adding and Accessing Elements
To add elements to a Map you call its put() method. Here are a few examples:
Map mapA = new HashMap();
// key iterator
Iterator keyIterator = mapA.keySet().iterator();
// value iterator
Iterator valueIterator = mapA.values().iterator();
Most often you iterate the keys of the Map and then get the corresponding values during the iteration. Here is how it
looks:
Iterator iterator = mapA.keySet().iterator();
while(iterator.hasNext()){
Object key = iterator.next();
Object value = mapA.get(key);
}
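Putting the pieces together, a sketch of adding elements with put(), reading one back with get(), and iterating via the key set could look like this. The keys and values ("key1", "element 1", ...) and the class name MapSketch are assumptions chosen for the example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class MapSketch {
    static Map<String, String> build() {
        Map<String, String> mapA = new HashMap<String, String>();
        mapA.put("key1", "element 1");
        mapA.put("key2", "element 2");
        mapA.put("key3", "element 3");
        return mapA;
    }

    // Iterates the keys and looks up the value for each key.
    static List<String> valuesByKeyIteration(Map<String, String> map) {
        List<String> values = new ArrayList<String>();
        Iterator<String> iterator = map.keySet().iterator();
        while (iterator.hasNext()) {
            String key = iterator.next();
            values.add(map.get(key));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> mapA = build();
        System.out.println(mapA.get("key1"));
        System.out.println(valuesByKeyIteration(mapA));
    }
}
```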
Removing Elements
You remove elements by calling the remove(Object key) method. You thus remove the (key, value) pair matching
the key.
Generic Maps
By default you can put any Object into a Map, but from Java 5, Java Generics makes it possible to limit the types of
object you can use for both keys and values in a Map. Here is an example:
Map<String, MyObject> map = new HashMap<String, MyObject>();
This Map can now only accept String objects for keys, and MyObject instances for values. You can then access and
iterate keys and values without casting them. Here is how it looks:
for(MyObject anObject : map.values()){
//do something to anObject...
}
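When both the key and the value are needed, the entrySet() method of Map is an alternative to iterating the keys and calling get(). The sketch below is an illustration with made-up names (EntrySetSketch, describe) and uses a TreeMap so that the iteration order is the sort order of the keys:

```java
import java.util.Map;
import java.util.TreeMap;

class EntrySetSketch {
    // Builds a "key=value;" string for every entry, with typed keys and values and no casts.
    static String describe(Map<String, Integer> map) {
        StringBuilder description = new StringBuilder();
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            String key = entry.getKey();
            Integer value = entry.getValue();
            description.append(key).append('=').append(value).append(';');
        }
        return description.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new TreeMap<String, Integer>();
        map.put("b", 2);
        map.put("a", 1);
        System.out.println(describe(map));
    }
}
```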
Queue Implementations
Being a Collection subtype, all methods in the Collection interface are also available in the Queue interface.
Since Queue is an interface you need to instantiate a concrete implementation of the interface in order to use it. You
can choose between the following Queue implementations in the Java Collections API:
java.util.LinkedList
java.util.PriorityQueue
LinkedList is a pretty standard queue implementation.
PriorityQueue stores its elements internally according to their natural order (if they implement Comparable), or
according to a Comparator passed to the PriorityQueue.
There are also Queue implementations in the java.util.concurrent package, but I will leave the concurrency utilities
out of this tutorial.
Here are a few examples of how to create a Queue instance:
Queue queueA = new LinkedList();
Queue queueB = new PriorityQueue();
queueA.add("element 1");
queueA.add("element 2");
queueA.add("element 3");
The order in which the elements added to the Queue are stored internally depends on the implementation. The same
is true for the order in which elements are retrieved from the queue. You should consult the JavaDoc for more
information about the specific Queue implementations.
You can peek at the element at the head of the queue without taking the element out of the queue. This is done via
the element() method. Here is how that looks:
Object firstElement = queueA.element();
To take the first element out of the queue, you use the remove() method which is described later.
You can also iterate all elements of a queue, instead of just processing one at a time. Here is how that looks:
Queue queueA = new LinkedList();
queueA.add("element 0");
queueA.add("element 1");
queueA.add("element 2");
Removing Elements
To remove elements from a queue, you call the remove() method. This method removes the element at the head of
the queue. In most Queue implementations the head and tail of the queue are at opposite ends. It is possible,
however, to implement the Queue interface so that the head and tail of the queue are at the same end. In that case you
would have a stack.
Here is a remove() example:
Object firstElement = queueA.remove();
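To illustrate how the retrieval order depends on the implementation, the sketch below adds the same elements to a LinkedList and a PriorityQueue and removes them from the head until the queue is empty. QueueSketch and fillAndDrain are names invented for the example:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class QueueSketch {
    // Fills the queue, then removes elements from the head until it is empty.
    static List<String> fillAndDrain(Queue<String> queue) {
        queue.add("c");
        queue.add("a");
        queue.add("b");

        List<String> removed = new ArrayList<String>();
        while (!queue.isEmpty()) {
            removed.add(queue.remove());
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println(fillAndDrain(new LinkedList<String>()));    // insertion order: [c, a, b]
        System.out.println(fillAndDrain(new PriorityQueue<String>())); // natural order: [a, b, c]
    }
}
```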
Generic Queue
By default you can put any Object into a Queue, but from Java 5, Java Generics makes it possible to limit the types
of object you can insert into a Queue. Here is an example:
Queue<MyObject> queue = new LinkedList<MyObject>();
This Queue can now only have MyObject instances inserted into it. You can then access and iterate its elements
without casting them. Here is how it looks:
MyObject myObject = queue.remove();
Deque Implementations
Being a Queue subtype all methods in the Queue and Collection interfaces are also available in the Deque interface.
Since Deque is an interface you need to instantiate a concrete implementation of the interface in order to use it. You
can choose between the following Deque implementations in the Java Collections API:
java.util.ArrayDeque
java.util.LinkedList
LinkedList is a pretty standard deque / queue implementation.
ArrayDeque stores its elements internally in an array. If the number of elements exceeds the space in the array, a
new array is allocated, and all elements are moved over. In other words, the ArrayDeque grows as needed, even if it
stores its elements in an array.
There are also Deque implementations in the java.util.concurrent package, but I will leave the concurrency utilities
out of this tutorial.
Here are a few examples of how to create a Deque instance:
Deque dequeA = new LinkedList();
Deque dequeB = new ArrayDeque();
dequeA.add("element 0");
dequeA.add("element 1");
dequeA.add("element 2");
Removing Elements
To remove elements from a deque, you call the remove(), removeFirst() and removeLast() methods. Here are a few
examples:
Object firstElement = dequeA.remove();
Object nextElement = dequeA.removeFirst();
Object lastElement = dequeA.removeLast();
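A deque can be worked from both ends. The sketch below (DequeSketch is a made-up name) uses addFirst() and addLast() from the Deque interface to build a small deque and then removes from each end:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeSketch {
    static String demo() {
        Deque<String> deque = new ArrayDeque<String>();
        deque.addLast("middle");
        deque.addFirst("first"); // goes in front of "middle"
        deque.addLast("last");   // goes behind "middle"

        String a = deque.removeFirst(); // "first"
        String b = deque.removeLast();  // "last"
        String c = deque.remove();      // remove() takes from the head: "middle"
        return a + "," + b + "," + c;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```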
Generic Deque
By default you can put any Object into a Deque, but from Java 5, Java Generics makes it possible to limit the types
of object you can insert into a Deque. Here is an example:
Deque<MyObject> deque = new LinkedList<MyObject>();
This Deque can now only have MyObject instances inserted into it. You can then access and iterate its elements
without casting them. Here is how it looks:
MyObject myObject = deque.remove();
Stack stack = new Stack();
stack.push("1");
stack.push("2");
stack.push("3");
Object obj3 = stack.pop(); //the string "3" is at the top of the stack.
Object obj2 = stack.pop(); //the string "2" is at the top of the stack.
Object obj1 = stack.pop(); //the string "1" is at the top of the stack.
The push() method pushes an object onto the top of the Stack.
The peek() method returns the object at the top of the Stack, but leaves the object on top of the Stack.
The pop() method returns the object at the top of the Stack, and removes the object from the Stack.
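The difference between peek() and pop() shows up in the size of the stack afterwards. A minimal sketch, with StackSketch as an illustrative name:

```java
import java.util.Stack;

class StackSketch {
    static String demo() {
        Stack<String> stack = new Stack<String>();
        stack.push("1");
        stack.push("2");
        stack.push("3");

        String peeked = stack.peek();     // "3", which stays on the stack
        int sizeAfterPeek = stack.size(); // 3
        String popped = stack.pop();      // "3", now removed
        int sizeAfterPop = stack.size();  // 2
        return peeked + ":" + sizeAfterPeek + ":" + popped + ":" + sizeAfterPop;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```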
stack.push("1");
stack.push("2");
stack.push("3");
equals()
equals() is used in most collections to determine if a collection contains a given element. For instance:
List list = new ArrayList();
list.add("123");
boolean containsElement = list.contains("123");
return true;
}
}
Which of these two implementations is "proper" depends on what you need to do. Sometimes you need to lookup
an Employee object from a cache. In that case perhaps all you need is for the employeeId to be equal. In other cases
you may need more than that - for instance to determine if a copy of an Employee object has changed from the
original.
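The two kinds of equals() implementation discussed above could be sketched as follows. This is an illustration under assumptions, not the tutorial's original code: the class names EmployeeById and EmployeeByAllFields are invented so that both variants fit in one example, the fields mirror the Employee class used in this text, and the names are assumed to be non-null in the second variant. Each class also overrides hashCode() so that equal objects get equal hash codes.

```java
import java.util.ArrayList;
import java.util.List;

// Variant 1 (hypothetical name): two employees are equal when their ids match.
class EmployeeById {
    protected long employeeId;
    protected String firstName;
    protected String lastName;

    EmployeeById(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeById)) return false;
        return employeeId == ((EmployeeById) o).employeeId;
    }

    @Override
    public int hashCode() {
        return (int) employeeId; // equal ids give equal hash codes
    }
}

// Variant 2 (hypothetical name): all fields must match; names are assumed non-null.
class EmployeeByAllFields {
    protected long employeeId;
    protected String firstName;
    protected String lastName;

    EmployeeByAllFields(long employeeId, String firstName, String lastName) {
        this.employeeId = employeeId;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EmployeeByAllFields)) return false;
        EmployeeByAllFields other = (EmployeeByAllFields) o;
        return employeeId == other.employeeId
                && firstName.equals(other.firstName)
                && lastName.equals(other.lastName);
    }

    @Override
    public int hashCode() {
        return (int) employeeId; // still consistent with equals()
    }
}

class EqualsSketch {
    public static void main(String[] args) {
        List<EmployeeById> cache = new ArrayList<EmployeeById>();
        cache.add(new EmployeeById(1, "Jane", "Doe"));
        // contains() relies on equals(), so a lookup object carrying only the id matches.
        System.out.println(cache.contains(new EmployeeById(1, null, null)));
    }
}
```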
hashCode()
The hashCode() method of objects is used when you insert them into a Hashtable, HashMap or HashSet. If you do
not know the theory of how a hashtable works internally, you can read about hashtables on Wikipedia.org.
When inserting an object into a hashtable you use a key. The hash code of this key is calculated, and used to
determine where to store the object internally. When you need to lookup an object in a hashtable you also use a key.
The hash code of this key is calculated and used to determine where to search for the object.
The hash code only points to a certain "area" (or list, bucket etc) internally. Since different key objects could
potentially have the same hash code, the hash code itself is no guarantee that the right key is found. The hashtable
then iterates this area (all keys with the same hash code) and uses the key's equals() method to find the right key.
Once the right key is found, the object stored for that key is returned.
So, as you can see, a combination of the hashCode() and equals() methods is used when storing and when looking
up objects in a hashtable.
Here are two rules that are good to know about implementing the hashCode() method in your own classes, if the
hashtables in the Java Collections API are to work correctly:
1. If object1 and object2 are equal according to their equals() method, they must also have the same hash
code.
2. If object1 and object2 have the same hash code, they do NOT have to be equal too.
In shorter words:
1. If equal, then same hash codes too.
2. Same hash codes no guarantee of being equal.
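Both rules can be observed with plain String keys. The sketch below relies on the fact that the strings "Aa" and "BB" happen to have the same hash code without being equal. HashLookupSketch is a name made up for the example:

```java
import java.util.HashMap;
import java.util.Map;

class HashLookupSketch {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<String, String>();
        map.put("Aa", "first");
        map.put("BB", "second"); // same hash code as "Aa", but not equal to it
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();

        // Rule 2: the same hash code is no guarantee of equality.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                   // false

        // equals() tells the two keys apart, so both entries are kept and found.
        System.out.println(map.get("Aa")); // first
        System.out.println(map.get("BB")); // second

        // Rule 1: a different but equal key object has the same hash code and finds the entry.
        String equalKey = new String("Aa");
        System.out.println(map.get(equalKey)); // first
    }
}
```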
Here are two example implementations of the hashCode() method matching the equals() methods shown earlier:
public class Employee {
  protected long employeeId;
  protected String firstName;
  protected String lastName;

  public int hashCode(){
    return (int) employeeId;
  }
}
Collections.sort(list);
When sorting a list like this the elements are ordered according to their "natural order". For objects to have a natural
order they must implement the interface java.lang.Comparable. In other words, the objects must be comparable to
determine their order. Here is how the Comparable interface looks:
public interface Comparable<T> {
int compareTo(T o);
}
The compareTo() method should compare this object to another object and return an int value. Here are the rules for
that int value:
Return a negative value if this object is smaller than the other object.
Return 0 (zero) if this object is equal to the other object.
Return a positive value if this object is larger than the other object.
There are a few more specific rules to obey in the implementation, but the above are the primary requirements. Check
out the JavaDoc for the details.
Let's say you are sorting a List of String elements. To sort them, each string is compared to the others according to
some sorting algorithm (not interesting here). Each string compares itself to another string by alphabetic
comparison. So, if a string is less than another string by alphabetic comparison it will return a negative number from
the compareTo() method.
When you implement the compareTo() method in your own classes you will have to decide how these objects
should be compared to each other. For instance, Employee objects can be compared by their first name, last name,
salary, start year or whatever else you think makes sense.
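As a sketch of one such choice, the hypothetical SortableEmployee class below compares employees by last name only, delegating to the compareTo() method of String. The class and its single field are assumptions made to keep the example short:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class: employees whose natural order is their last name.
class SortableEmployee implements Comparable<SortableEmployee> {
    final String lastName;

    SortableEmployee(String lastName) {
        this.lastName = lastName;
    }

    public int compareTo(SortableEmployee other) {
        // Negative, zero or positive, as required by the rules above.
        return lastName.compareTo(other.lastName);
    }

    // Sorts employees by natural order and returns their last names.
    static List<String> sortedLastNames(String... lastNames) {
        List<SortableEmployee> list = new ArrayList<SortableEmployee>();
        for (String name : lastNames) {
            list.add(new SortableEmployee(name));
        }
        Collections.sort(list);

        List<String> result = new ArrayList<String>();
        for (SortableEmployee employee : list) {
            result.add(employee.lastName);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedLastNames("Smith", "Adams", "Jones"));
    }
}
```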
Collections.sort(list, comparator);
Notice how the Collections.sort() method now takes a java.util.Comparator as parameter in addition to the List.
This Comparator compares the elements in the list two by two. Here is how the Comparator interface looks:
public interface Comparator<T> {
int compare(T object1, T object2);
}
The compare() method compares two objects to each other and should:
Return a negative value if object1 is smaller than object2.
Return 0 (zero) if object1 is equal to object2.
Return a positive value if object1 is larger than object2.
There are a few more requirements to the implementation of the compare() method, but these are the primary
requirements. Check out the JavaDoc for more specific details.
Here is an example Comparator that compares two fictional Employee objects:
public class MyComparator implements Comparator<Employee> {
If you want to compare objects by more than one factor, start by comparing by the first factor (e.g first name). Then,
if the first factors are equal, compare by the second factor (e.g. last name, or salary) etc.
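A Comparator that compares by two factors might be sketched like this. The Employee class here is a stripped-down stand-in with only a first and last name (both assumed non-null), defined just for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Stand-in Employee class for the example; only the two compared fields are included.
class Employee {
    final String firstName;
    final String lastName;

    Employee(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public String toString() {
        return firstName + " " + lastName;
    }
}

class MyComparator implements Comparator<Employee> {
    public int compare(Employee employee1, Employee employee2) {
        // First factor: first name.
        int byFirstName = employee1.firstName.compareTo(employee2.firstName);
        if (byFirstName != 0) {
            return byFirstName;
        }
        // Second factor, used only when the first names are equal: last name.
        return employee1.lastName.compareTo(employee2.lastName);
    }
}

class ComparatorSketch {
    static List<Employee> sorted() {
        List<Employee> list = new ArrayList<Employee>(Arrays.asList(
                new Employee("John", "Smith"),
                new Employee("Anna", "Zed"),
                new Employee("John", "Adams")));
        Collections.sort(list, new MyComparator());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(sorted());
    }
}
```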