CS 3500 Fall 2013

Lecture 8: Iterator Abstraction
-------------------------------

Topics:
-------

Why do we need iterators
Functional Iterator -- The Traversal interface
Mutable Iterator - the Iterator and Iterable interfaces
Handling the Concurrent Modification
Iterating over other data types


----------------------------

Why do we need iterators:
-------------------------

Many programs deal with a collection of data of the same type and perform some operation on the individual data elements. In DrRacket there are several functions that perform operations on lists: map, filter, foldl, foldr, andmap, ormap, even sort. In Java (and most object-oriented languages) the basic way to traverse the data is to design a method containing a loop that extracts the individual elements from the data structure and performs the desired task on them. But extracting the data elements directly exposes the underlying implementation and prevents us from designing abstractions that work for multiple representations.

The 'iterator abstraction' provides the bridge between the implementation of a data structure that represents a collection of data elements of the same type and the methods that process all elements of the collection, one element at a time. The 'iterator' object is an 'observer' of the collection, and provides methods that allow the client to ask whether there are any more elements to generate, methods that generate the next element, and methods that advance the iterator to observe the next element of the collection.

The 'iterator abstraction' can be implemented either as an immutable object, or in a mutable way. We will show both variants. The Java Collections Framework provides only the mutable variant, but equips it with a mechanism for detecting 'concurrent modification'.

A concurrent modification event happens when a program attempts to change the structure of a collection that is currently being observed by an 'active' iterator. Changing the structure (by adding or deleting an element) may affect which element should be generated next by the iterator, or even cause the iterator to have no more elements to generate, even though it has just promised that more elements are available.
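
The following small sketch shows the problem with the standard 'java.util.ArrayList'. The class and method names ('ConcurrentModificationDemo', 'addDuringIterationFails') are made up for this illustration; the library calls are the standard ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the lecture code.
class ConcurrentModificationDemo {

    // Returns true if adding an element while an iterator is in
    // mid-traversal makes the iterator's next() call fail.
    static boolean addDuringIterationFails() {
        List<String> names = new ArrayList<String>(Arrays.asList("a", "b", "c"));
        Iterator<String> it = names.iterator();
        it.next();        // the iterator is now in the middle of the list
        names.add("d");   // structural change made behind the iterator's back
        try {
            it.next();    // the iterator notices the change here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addDuringIterationFails());
    }
}
```

Note that the library reports the problem when the iterator is used again, not at the moment the list is changed.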

So our goal is to provide an observer without revealing the internal implementation. While this is a very useful and helpful abstraction, people used to the ease of handling lists in the functional programming languages claim that 'iterator is a sign of weakness of object-oriented programming'.


Functional Iterator -- The Traversal interface:
-----------------------------------------------

To traverse a recursively-built linked list we first need to know when the list is an empty list. In a non-empty list, the 'first' element is the one we want to deliver, and to advance to see the next element, we look at the 'rest' of the list. This leads us to define a functional iterator as the following interface:

// to represent a functional iterator
interface Traversal<T> {

    // are there no more elements to generate?
    boolean isEmpty();

    // generate the next element of the data set -
    // if available (throw NoSuchElementException otherwise)
    T getFirst();

    // produce a new Traversal that will generate 
    // the next element of the data set - if available
    // (throw NoSuchElementException otherwise)
    Traversal<T> getRest();
}

The file 'ListTraversal.java' shows the implementation of the 'Traversal' interface for both a recursively defined linked list and an 'ArrayList'.

The first implementation is easy. The methods 'getFirst' and 'getRest' in the class that represents the empty list throw the desired exception. In the 'Cons' class they just return the values of the corresponding fields.
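
A minimal sketch of what the list implementation might look like is given below. It is not the code from 'ListTraversal.java': the class name 'Empty', the client class 'TraversalClient', and the choice to store the rest of the list as a 'Traversal' are assumptions made to keep the example short.

```java
import java.util.NoSuchElementException;

// The functional iterator interface from these notes.
interface Traversal<T> {
    boolean isEmpty();
    T getFirst();
    Traversal<T> getRest();
}

// The empty list (hypothetical class name): nothing to generate.
class Empty<T> implements Traversal<T> {
    public boolean isEmpty() { return true; }

    public T getFirst() {
        throw new NoSuchElementException("no first element in an empty list");
    }

    public Traversal<T> getRest() {
        throw new NoSuchElementException("no rest of an empty list");
    }
}

// A non-empty list: the methods just return the corresponding fields.
class Cons<T> implements Traversal<T> {
    private final T first;
    private final Traversal<T> rest;

    Cons(T first, Traversal<T> rest) {
        this.first = first;
        this.rest = rest;
    }

    public boolean isEmpty() { return false; }
    public T getFirst() { return this.first; }
    public Traversal<T> getRest() { return this.rest; }
}

// A client (hypothetical) that uses only the Traversal methods,
// so it works for any implementation of the interface.
class TraversalClient {
    static <T> int count(Traversal<T> tr) {
        int n = 0;
        while (!tr.isEmpty()) {
            n = n + 1;
            tr = tr.getRest();
        }
        return n;
    }
}
```

The 'count' method never learns whether it is looking at a linked list or at some other representation.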

The second implementation keeps track of the index of the element to be generated next. The public constructor initializes this index to 0, and the 'getRest' method produces a new instance of this iterator, with the same 'ArrayList' data, but with the index advanced by one. It does so by invoking a private constructor, which prevents the client from manipulating the iterator object.

The 'ExamplesTraversals' class shows that each instance of the functional iterator generates the same element every time, even after it has produced the iterator for the next data element.
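
The 'ArrayList' variant described above could be sketched as follows. The class name 'ArrayListTraversal' is an assumption (the actual name in 'ListTraversal.java' may differ), as are the field names.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

// The functional iterator interface from these notes.
interface Traversal<T> {
    boolean isEmpty();
    T getFirst();
    Traversal<T> getRest();
}

// Hypothetical class name: a functional iterator over an ArrayList.
class ArrayListTraversal<T> implements Traversal<T> {
    private final ArrayList<T> data;
    private final int index;   // position of the element to generate

    // The only constructor a client can use: start at the beginning.
    public ArrayListTraversal(ArrayList<T> data) {
        this(data, 0);
    }

    // Private, so a client cannot create a traversal at an arbitrary index.
    private ArrayListTraversal(ArrayList<T> data, int index) {
        this.data = data;
        this.index = index;
    }

    public boolean isEmpty() {
        return this.index >= this.data.size();
    }

    public T getFirst() {
        if (this.isEmpty()) {
            throw new NoSuchElementException("no more elements");
        }
        return this.data.get(this.index);
    }

    // Produces a NEW traversal; this object is left unchanged.
    public Traversal<T> getRest() {
        if (this.isEmpty()) {
            throw new NoSuchElementException("no more elements");
        }
        return new ArrayListTraversal<T>(this.data, this.index + 1);
    }
}
```

Because both fields are final, calling 'getFirst' on the same traversal object always yields the same element, no matter how many times 'getRest' has been called on it.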


Mutable Iterator - the Iterator and Iterable interfaces:
--------------------------------------------------------

The Java Collections Framework uses two interfaces to iterate over the collections of data elements. The 'Iterator' interface generates the elements of a dataset, while the 'Iterable' interface has the task of creating a new iterator for the dataset.

Here are the definitions:

// to assure an Iterator is available for this dataset
interface Iterable<T> {

    // provide an Iterator for this dataset
    Iterator<T> iterator();
}

// to allow iteration over this dataset
interface Iterator<T> {

    // are there more elements to generate?
    boolean hasNext();

    // generate the next element in this dataset
    // (if available - otherwise throw NoSuchElementException)
    // and advance the state of this iterator
    // to observe the following element (if any)
    T next();
}
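
From the client's side, the two interfaces can be used either directly or through the for-each loop, which works on any 'Iterable'. The sketch below uses the library versions of the interfaces; the class name 'IteratorClient' and its method names are made up for this illustration.

```java
import java.util.Iterator;

// Hypothetical client class, not part of the lecture code.
class IteratorClient {

    // Uses the protocol explicitly: ask hasNext(), then call next().
    static int sumWithIterator(Iterable<Integer> data) {
        int total = 0;
        Iterator<Integer> it = data.iterator();
        while (it.hasNext()) {
            total = total + it.next();
        }
        return total;
    }

    // The for-each loop obtains an iterator from the Iterable
    // and makes the same hasNext()/next() calls behind the scenes.
    static int sumWithForEach(Iterable<Integer> data) {
        int total = 0;
        for (int n : data) {
            total = total + n;
        }
        return total;
    }
}
```

Both methods accept any 'Iterable<Integer>', so the same code works for an 'ArrayList', a 'LinkedList', or a class of our own design.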

The file 'Ariter.java' shows how we can design our own way of iterating over an 'ArrayList'. The class 'ArrayListIterator' implements the 'Iterable' interface and produces an iterator for the 'ArrayList' to be traversed. We added a couple of methods to illustrate how to handle the interactions between active Iterators and the concurrent modifications of the ArrayList.

The Iterator interface is implemented by an 'inner' class 'AListIterator'. This way the class 'ArrayListIterator' can produce several iterators that will traverse over the ArrayList independently (one may be generating the third element, while the other is generating the first one). The 'AListIterator' has access to the ArrayList that the 'ArrayListIterator' contains. All it needs is the index for the current element to generate, and the rest of the implementation is straightforward.

The 'iterator()' method in the 'ArrayListIterator' just produces a new instance of the 'AListIterator' with the 'current' value set to 0.
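
A stripped-down sketch of this design, without the concurrent modification handling, might look as follows. It is not the code from 'Ariter.java'; the field name 'data' is an assumption, and the sketch relies on Java 8 or later, where the library 'Iterator' interface supplies a default 'remove' method.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch only: wraps an ArrayList and hands out iterators for it.
class ArrayListIterator<T> implements Iterable<T> {
    private final ArrayList<T> data;   // assumed field name

    ArrayListIterator(ArrayList<T> data) {
        this.data = data;
    }

    // Every call produces a fresh, independent iterator.
    public Iterator<T> iterator() {
        return new AListIterator();
    }

    // Inner class: it can see the 'data' field of the enclosing object.
    private class AListIterator implements Iterator<T> {
        private int current = 0;   // index of the element to generate next

        public boolean hasNext() {
            return this.current < data.size();
        }

        public T next() {
            if (!this.hasNext()) {
                throw new NoSuchElementException("no more elements");
            }
            T result = data.get(this.current);
            this.current = this.current + 1;   // advance the iterator state
            return result;
        }
    }
}
```

Each 'AListIterator' carries its own 'current' index, which is why two iterators over the same list do not interfere with each other.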


Handling the Concurrent Modification:
-------------------------------------

To make sure that no structural changes are made to the ArrayList while it is being observed by an Iterator, the class 'ArrayListIterator' keeps track of all currently active Iterators. An Iterator is 'active' only when it is ready to generate the next element, i.e. when its 'hasNext()' method produces 'true'.

We have chosen to keep a count of the currently active iterators. The count is incremented every time we produce a new Iterator, and decremented after any invocation of the 'next()' method that leaves the Iterator in a state where 'hasNext()' returns 'false'. The same check is made when a new Iterator is produced, so that an Iterator with no elements to generate is not counted as active.

The test suite illustrates how several iterators can be active at the same time, and verifies that once the iterator has completed its traversal of the data, it no longer blocks the modification of the ArrayList.
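
The counting scheme can be sketched as below. This is an illustration under several assumptions, not the code from 'Ariter.java': the class names 'GuardedList' and 'GuardedIterator', the methods 'add' and 'activeCount', and the choice of 'ConcurrentModificationException' to reject a change are all hypothetical.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class: a list that refuses structural changes
// while any of its iterators is still active.
class GuardedList<T> implements Iterable<T> {
    private final ArrayList<T> data = new ArrayList<T>();
    private int activeIterators = 0;

    // Adds an element, but only if no iterator is in mid-traversal.
    void add(T item) {
        if (this.activeIterators > 0) {
            throw new ConcurrentModificationException("an iterator is active");
        }
        this.data.add(item);
    }

    int activeCount() {
        return this.activeIterators;
    }

    public Iterator<T> iterator() {
        return new GuardedIterator();
    }

    private class GuardedIterator implements Iterator<T> {
        private int current = 0;

        GuardedIterator() {
            // A new iterator counts as active only if it has
            // something to generate.
            if (this.hasNext()) {
                activeIterators = activeIterators + 1;
            }
        }

        public boolean hasNext() {
            return this.current < data.size();
        }

        public T next() {
            if (!this.hasNext()) {
                throw new NoSuchElementException("no more elements");
            }
            T result = data.get(this.current);
            this.current = this.current + 1;
            // The traversal is complete: this iterator is no longer active.
            if (!this.hasNext()) {
                activeIterators = activeIterators - 1;
            }
            return result;
        }
    }
}
```

One consequence of this design is that an iterator that is abandoned before it reaches the end stays counted as active, so the list remains blocked; a client has to run every iterator to completion.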


Iterating over other data types:
--------------------------------

The Java Collections Framework provides iterators that traverse over HashMap, TreeMap, LinkedList, and other collections of data. We can traverse over a binary tree by generating the data elements in one of several possible orders: pre-order (generating the root, then the elements of the left subtree, then the elements of the right subtree), in-order (where the root element is generated after all elements from the left subtree, and followed by the elements of the right subtree), post-order (where the root element comes after the elements of both subtrees), or in level order, where the elements at each level are generated in the left-to-right order.
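
As one example, an in-order Iterator for a binary tree can be sketched with an explicit stack of nodes that are still waiting to be generated. The classes 'TreeNode' and 'InOrderIterator' are hypothetical and were written only for this illustration; an empty subtree is represented by 'null' to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical minimal binary tree node; null stands for an empty subtree.
class TreeNode<T> {
    final T value;
    final TreeNode<T> left;
    final TreeNode<T> right;

    TreeNode(T value, TreeNode<T> left, TreeNode<T> right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

// Generates the elements of a tree in in-order:
// left subtree, then the root, then the right subtree.
class InOrderIterator<T> implements Iterator<T> {
    // Nodes whose left subtrees are done or in progress,
    // but whose own value has not been generated yet.
    private final Deque<TreeNode<T>> pending = new ArrayDeque<TreeNode<T>>();

    InOrderIterator(TreeNode<T> root) {
        this.pushLeftSpine(root);
    }

    // Pushes the node and all of its leftmost descendants.
    private void pushLeftSpine(TreeNode<T> node) {
        while (node != null) {
            this.pending.push(node);
            node = node.left;
        }
    }

    public boolean hasNext() {
        return !this.pending.isEmpty();
    }

    public T next() {
        if (!this.hasNext()) {
            throw new NoSuchElementException("no more elements");
        }
        TreeNode<T> node = this.pending.pop();
        // Everything in the right subtree comes before the node's ancestors.
        this.pushLeftSpine(node.right);
        return node.value;
    }
}
```

The stack plays the role that the call stack plays in a recursive in-order traversal, which lets the iterator pause after each element.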

Another useful implementation of the iterator abstraction is to provide an Iterator that generates the data read from a data file. The 'StringIterator.java' file provides for traversing a given file and extracting individual words from it (to use in some text analysis).
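
A minimal sketch of such an iterator, built on the library class 'java.util.Scanner', is shown below. It is not the code from 'StringIterator.java'; the class name 'WordIterator' is hypothetical, and a 'word' is assumed here to be any run of characters separated by whitespace.

```java
import java.util.Iterator;
import java.util.Scanner;

// Hypothetical class: generates the whitespace-separated words
// of any character source, one word at a time.
class WordIterator implements Iterator<String> {
    private final Scanner input;

    // Any Readable will do, e.g. a FileReader or a StringReader.
    WordIterator(Readable source) {
        this.input = new Scanner(source);
    }

    // Is there another word in the input?
    public boolean hasNext() {
        return this.input.hasNext();
    }

    // Generates the next word; the Scanner throws
    // NoSuchElementException when the input is exhausted.
    public String next() {
        return this.input.next();
    }
}
```

To read from a file, the client could pass a 'java.io.FileReader' for that file to the constructor; a 'java.io.StringReader' is convenient for testing.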