Notes by Gene Cooperman, © 2009 (may be freely copied as long as this copyright notice remains)
We have now seen depth-first search (DFS, Chapter 3), breadth-first search (BFS, Chapter 4), and best-first search (based on Dijkstra's algorithm in Chapter 4).
Recall that in breadth-first search, one maintains a queue of frontier vertices (visited, but whose neighbors have not yet been expanded) as a FIFO (first-in-first-out) queue. In choosing which vertex to explore next, we always explore the "oldest" vertex on the queue.
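As a concrete illustration (not from the text), here is a minimal Python sketch of breadth-first search using a FIFO queue; the adjacency-list representation and the names `bfs` and `adj` are my own choices:

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first search over a graph given as an adjacency-list dict,
    mapping each vertex to a list of its neighbors."""
    visited = {start}
    queue = deque([start])        # FIFO queue of the frontier
    order = []
    while queue:
        v = queue.popleft()       # always explore the "oldest" vertex
        order.append(v)
        for w in adj[v]:
            if w not in visited:  # expand each neighbor only once
                visited.add(w)
                queue.append(w)
    return order
```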
In best-first search, each vertex has an associated cost: the cost of the smallest-cost path back to the origin found so far. As we find multiple paths to a given vertex, we update the cost of that vertex to be the smallest cost seen so far.
As seen in the text, the innermost operation of best-first search is to find and remove from the queue the vertex of smallest cost, and then expand its neighbors.
The above algorithm is also known as Dijkstra's shortest path algorithm.
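A minimal Python sketch of this loop, assuming the graph is an adjacency-list dict of `(neighbor, edge_cost)` pairs; Python's `heapq` serves as the priority queue, with stale queue entries skipped rather than decreased in place:

```python
import heapq

def dijkstra(adj, start):
    """Best-first (Dijkstra) search.
    adj maps each vertex to a list of (neighbor, edge_cost) pairs."""
    dist = {start: 0}
    pq = [(0, start)]             # priority queue of (cost, vertex)
    while pq:
        d, v = heapq.heappop(pq)  # remove smallest-cost vertex
        if d > dist[v]:
            continue              # stale entry; a cheaper path was found
        for w, c in adj[v]:
            nd = d + c
            if w not in dist or nd < dist[w]:
                dist[w] = nd      # update to the smaller-cost path
                heapq.heappush(pq, (nd, w))
    return dist
```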
A FIFO queue of size n should have two especially efficient operations and one really slow operation:
1. Insert a new element (efficient, O(1))
2. Remove the "oldest" element (efficient, O(1))
3. Find and remove the smallest element (slow, O(n))
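These three costs can be seen with Python's `collections.deque` as the FIFO queue (the values below are purely illustrative):

```python
from collections import deque

q = deque()
q.append(5)           # 1. insert a new element: O(1)
q.append(3)
q.append(8)

oldest = q.popleft()  # 2. remove the "oldest" element: O(1)

# 3. find and remove the smallest element: must scan all n entries, O(n)
smallest = min(q)
q.remove(smallest)
```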
A priority queue of size n should have one especially efficient and two moderately efficient operations:
1. Insert a new element (moderately efficient, O(log n))
2. Remove the "oldest" element (moderately efficient, O(log n))
3. Find and remove the smallest element (efficient, O(log n))
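Operations 1 and 3 can be sketched with Python's `heapq` binary heap. (Note that `heapq` has no built-in remove-"oldest" operation; supporting it in O(log n) would require an extra index or insertion-order keys, which this sketch omits.)

```python
import heapq

h = []
heapq.heappush(h, 5)         # 1. insert a new element: O(log n)
heapq.heappush(h, 3)
heapq.heappush(h, 8)

smallest = heapq.heappop(h)  # 3. find and remove the smallest: O(log n)
```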
From this, clearly a FIFO queue is the right match for a breadth-first search algorithm, since it only needs the first two operations to process a vertex: insert new element, and remove "oldest" element. In contrast, a priority queue is the right match for a best-first search algorithm, since it needs all three operations.
In best-first search, processing each vertex must execute all three operations, which costs O(log n + log n + log n) = O(log n) in the case of a priority queue. If a best-first search were instead implemented using a FIFO queue, it would cost O(1 + 1 + n) = O(n) to process one vertex.
There are several data structures that one could consider to implement a FIFO queue and a priority queue.
(Note that if we know in advance the maximum number of vertices, a circular buffer is often the best implementation of a FIFO queue. With a large linked list, by contrast, we may incur a cache miss at almost every step, and a cache miss costs around 100 CPU cycles on current CPUs.)
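A minimal sketch of such a fixed-capacity circular buffer (the class and method names are my own, not the text's):

```python
class CircularBuffer:
    """Fixed-capacity FIFO queue stored in one contiguous array.
    Contiguous storage is cache-friendly, unlike a linked list."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0           # index of the oldest element
        self.size = 0

    def enqueue(self, x):
        assert self.size < len(self.buf), "buffer full"
        tail = (self.head + self.size) % len(self.buf)
        self.buf[tail] = x      # wraps around the end of the array
        self.size += 1

    def dequeue(self):
        assert self.size > 0, "buffer empty"
        x = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.size -= 1
        return x
```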
Sorted arrays and hash arrays have other nice features, however. Hash arrays are just what we want for maintaining the visited status of a vertex. (Given the name of the vertex, hash it to find where the vertex is stored, and then look up its visited status.) So, we always want to also use a hash array along with our FIFO queue (breadth-first search) or along with our priority queue (best-first search).
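In Python, such a hash array can be sketched as an ordinary dict keyed by vertex name (the helper names here are illustrative):

```python
# Hash table keyed by vertex name: O(1) expected-time lookup of visited status.
visited = {}

def mark_visited(v):
    visited[v] = True

def is_visited(v):
    # Unknown vertices default to "not visited".
    return visited.get(v, False)
```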
Sorted arrays often end up competing with hash arrays for efficiency. For small arrays, sorted arrays are sometimes better, because they avoid the higher overhead of a hash array.
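For instance, a small sorted array can be maintained and searched with Python's `bisect` module; the O(log n) binary search avoids hashing overhead and touches contiguous memory:

```python
import bisect

# Build a sorted array by inserting elements in sorted position.
arr = []
for v in [7, 2, 9, 4]:
    bisect.insort(arr, v)

def contains(sorted_arr, x):
    """Membership test by binary search: O(log n)."""
    i = bisect.bisect_left(sorted_arr, x)
    return i < len(sorted_arr) and sorted_arr[i] == x
```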
The textbook (exercise 4.16 at the end of Chapter 4) has an implementation of a binary heap.
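For comparison, here is one conventional array-based min-heap sketch (not necessarily the textbook's implementation from exercise 4.16):

```python
class BinaryHeap:
    """Array-based min-heap: insert and remove-smallest both cost O(log n)."""

    def __init__(self):
        self.a = []

    def insert(self, x):
        self.a.append(x)
        i = len(self.a) - 1
        while i > 0:                  # sift the new element up
            parent = (i - 1) // 2
            if self.a[parent] <= self.a[i]:
                break
            self.a[parent], self.a[i] = self.a[i], self.a[parent]
            i = parent

    def remove_smallest(self):
        a = self.a
        smallest = a[0]               # the root is the minimum
        last = a.pop()
        if a:
            a[0] = last
            i = 0
            while True:               # sift the moved element down
                l, r = 2 * i + 1, 2 * i + 2
                m = i
                if l < len(a) and a[l] < a[m]:
                    m = l
                if r < len(a) and a[r] < a[m]:
                    m = r
                if m == i:
                    break
                a[i], a[m] = a[m], a[i]
                i = m
        return smallest
```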