Preparation for the Final Exam.

If you print this out, set your font to constant width (e.g. Courier).
Sorry for lousy formatting.

You may not see the possible pitfalls unless you
actually try to implement your solution. It is therefore
advised that you spend some extra time in the lab, even
if you are sure you can write the code when needed. 

The code I refer to will be uploaded on Ambassador. Guess what --
I make no claim that it is 
                           a) in really good style
                           b) bug free
While I cannot do much about the style right now, anyone who spots a bug 
(and sends a timely e-mail to report it) gets EXTRA CREDIT :-). 

If you haven't implemented the solutions to the practice midterm 
problems, consider doing it now.

Quiz 1 and Quiz 4 are also useful for preparation (both now
have solutions on Ambassador in the Solutions folder).

0. Basics of C++:  Constructors, Destructors, Overloading,
   References, Templates.

I expect you to understand the purpose and the basic properties of

 * references        ( mostly for passing objects 'by reference' in and 
                          out of functions, see copy contr. )
               
 * copy constructors ( called whenever an object is passed into
                          or out of a function 'by value' 
                       used in declarations that declare and 
                          initialize an object from another object
                          of the same type:
                             X a( 1, 2, 3);   // X::X(int,int,int) called
                             X b = a;         // X::X(const X&), the c.c.
                             X c(a);          // and c.c. again
                       used to manipulate temporaries )               

 * operator=         ( used to copy contents, but not to create objects, 
                          unlike the c.c.
                       usually guards agains self-assignment:
                             X& operator= ( const X& other ){
                               if( this != &other ){ 
                               // do stuff ... }  
                               return *this;
                             }  
                       returns *this to allow multiple assignments )
                       
 * this              ( available only inside member functions of a class --
                       it's the pointer to the specific objects who calls 
                          the member, e.g. the overloaded operator= :
                             X a(1,2,3);
                             X b(3,4,5);  
                             a = b;       //see code above
                          calls  a.operator= (b), b is now 'other' , and
                          'this' is a pointer (type X*) to a . So smth like
                             a = a; 
                          makes this == &a true, stuff will not be done )  
                          
 * default           ( needed for smth like   X a;  or  X arr[100]; 
   constuctor          will be created automatically unless other 
                          constructors are declared )
                     
 * destructors       ( called when you use  delete  explicitly or when a 
                          local variable goes out of scope )
 
 * new and delete    ( call constructors and destructors
                       if you say  
                             X* some_ptr = new X[100]; //array of 100 X's 
                       you must say
                             delete[] some_ptr; //*not* delete some_ptr;  
                          
a) Implement the class Series which reads in a series of numbers from a file
   (the length of the series is the first number in the file, hence *not* 
   known at compile time), stores them and finds sums, averages etc. 
   A series object must be able to re-read from another file, replacing the 
   old data with the new (which may be of different length!).
   
   Since the size is not known at compile time, you will have to allocate
   memory on free store using new [] (i.e. dynamically) and free it up 
   with delete[], and keep a pointer to that memory in your class.
    
    Here's a series of doubles:
    
    class Series {
    private:
        // your stuff
    public:    
        Series( char* filename ); // read a file of the following format: 
                                  // length  num1  num2  num3 ...
        ~Series();
        
        void read_file( char* filename );  // read file and replace data 
        int size();                        // get the length of the series
        double sum(); 
        double average();
        //..     
    };
    
    The following must work (correctly):
    
    Series A( "data-1.dat" );
    Series B( "data-2.dat" ); 
    Series C( A );              //C get a copy of A's data
          
    cout << A.size() << " " << A.sum() << "\n"; //for data-1.dat          
    cout << B.size() << " " << B.sum() << "\n"; //for data-2.dat         
    
    B = A;                         //B gets a copy of A's data, loses its
                                   //own, which gets removed from memory
    A.read_file( "data-3.dat" );   //A's old data replaced and discarded  
                                                                       
    cout << B.size() << " " << B.sum() << "\n"; //they print the same  
    cout << C.size() << " " << C.sum() << "\n"; //numbers as the 1st cout,
                                                    //i.e. for data-1.dat

    cout << A.size() << " " << A.sum() << "\n"; //prints new stuff for
                                                    // data-3.dat    
    
    A = A;         //should not move any data -- self-assignment 
    B = C = A;     //everyone gets data-3 now 
    
Clearly, this code requires non-default copy constructor and operator=  .
What happens if these are not specified (i.e. the automatically generated
defaults are used?)

Put some messages in your constructors and operators, or step through
your code in the debugger so that you know what gets called and when.

b)  Write an overloaded operator[]  so that  cout << A[2]; prints the third
    number in the series and  A[2] = 100;  resets it to 100 . 
      
c)  Write an overloaded operator<  which compares series lexicographically
    (that is, 
            { 20 30 10 40 } < { 21 90 }
            { 21 30 10 40 } < { 21 30 10 90 }    
            { 10 11 12 13 } < { 11 }           
            { 10 11 12 13 } < { 10 11 12 14 }   
            { 10 11 12 13 } < { 10 11 12 13 14 }   etc, 
     i.e. compare the first entries, if they are equal compare the next pair
     and so on. Whoever has a smaller number on the first step when the 
     entries are different loses.)      

d)  Now make your class a template class, so that you can declare
    Series<int>,  Series<double> and Series<string>  (the latter
    is the most interesting -- what is the sum of two strings? You
    can add play around with this, if you have time. See Sec. 7.3 
    pp. 130-137 of the text.)

e)  Discuss the following "code":

    class Musor{ ... };
    void foo() { new Musor(0); }

f)  Sort an STL list<int> by any method (bubble-sort will do).
    Sort an STL list of any <class T> for which T::operator< is defined. 

                          
1. Binary Search Trees -- see the BST dictionary on Ambassador

This is an implementation of a binary tree of strings (the string
type in C++ has operator< defined). It is used to implement
a dictionary. I count and print out the list of all different
words in "Alice in Wonderland". In the following exercises
you can use the dictionary tree constructed by that code.

Write recursive functions to

a) count the number of leaves in the tree (about 5 
   lines o'code for the function body)
   
b) find the "deepest" word in the tree (i.e the word(s)
   with the longest path leading to them from the root)
   
c*) write a function to delete a word from the tree. The resulting
    tree must remain a proper BST, that is, the order should be 
    preserved at all nodes. When you delete a non-leaf node,
    you have to re-attach its left and right subtrees. Let's say
    the node you deleted was a right child, then its right subtree
    can be re-attached at the place of the deleted node (why?).
    As for the left subtree, you have to find a proper place for it.
    Design and code the algorithm (see also Sec. 12.6, pp. 287-288 of the text)

d)  On paper: draw a random binary tree and perform pre-order,
    post-order, in-order and level-order traversals. Which one
    will give you a completely ordered sequence of values in case
    of a BST?
    
2. Priority Queues

Under a priority queue we understand a data structure (a collection of elements
of some type for which operator< is defined) that supports the following 
operations: 
              * push  // add an element to the collection
              * top   // return the _largest_ element in the collection
              * pop   // remove the _largest_ element from the collection
              
This definition can be changed to returning and popping the _smallest_ element
instead. A priority queue is an _abstract data type_, i.e. we don't
want to care how it's implemented. Usually a heap is used.              
              
a) Implement a priority queue of ints on top of a list<int> or vector<int>. 
   This is going to be an inefficient but simple implementation. You will 
   have to do some sorting on every push and/or pop. 
   Test your code by pushing random integers into your priority queue and
   then poping them off. They should come out in decreasing order.        
      
b) Recall the extra-credit problem: 
   Print, in increasing order, all natural numbers <= N  
   such that their only prime divisors are 2, 3, and 5 (i.e. all  
   numbers one can get by multiplying together a certain number 
   of 2s, 3s, and 5s). Here's the beginning of this sequence: 
         2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, ... 
                  
   Read the buggy solution in  pqueue-example and fix it. Here's the text:

    const int N = 100;   //Buggy solution. Fix it. 
    main(){ 
        priority_queue<int, vector<int>, greater<int> > PQ;  
        int x = 1; 
        while( x <= N ){ 
            PQ.push( x*2 );  
            PQ.push( x*3 );   
            PQ.push( x*5 );   
            x = PQ.top();  
            cout << x << " ";  
            PQ.pop(); 
        } 
        cout << " finished.\n";  
    }  

3. Heaps and HeapSort -- see Heap on ambassador and 
                          read Sec. 15.2 pp. 362-369.

a) Make my heapsort in Heap into a template, so that it can
   heap-sort a vector of any type for which operator< is defined.

   Specifically, make it work for an array of strings (for
   strings operator< is defined as lexicographic, the same
   order you see in a dictionary).

b) On paper: Take an array of random ints and 
     -- make a heap out of it, as per heapsort algorithm
     -- apply adjust_heap repeatedly as per heapsort algorithm 
        to make the heap you've got into an ordered array
  
4. Hash functions -- Read Sec. 17 (you may skip 17.4, but read 17.6)

a) Do Ex. 1, 4, 5, 6

b) Look at the  hash-test  code. If we were doing a full-scale implementation
   of a hash table for strings (in this case, all different words found in
   "Alice in Wonderland" plus some fake words as a result of my lousy parsing),
   our hash_array would contain a word or a list of words whose indexes into 
   hash_array (calculated by hash_function) collide. In this example, we
   simply count the number of collisions and print a histogram.
   
   The file word_list contains approx 2,600 words. Experiment with different
   sizes of HASH_BASE to see how it affects the number of collisions. 
   Take HASH_BASE = 1009 or 997 and then take HASH_BASE = 1024. Can you
   explain the difference? Look at the number of 0's in the hash_array - these
   entries were not hit by any word.
   
   Write your own hash_functions as described in 17.6 ( pp. 423-424 ). 
   Compare them with mine. Then look in hash_function.h in STL sources to
   see their hash function for strings. Their code for hashing a 'string s' 
   looks like
                 long h = 0;
                 for(int i=0; i<s.length(); i++)
                   h = 5*h + s[i]; 
                 return h % HASH_TABLE_SIZE;
   
5.   Graphs  --  refer to the Graph project on Ambassador.
   
I have implemented Warshall's and Floyd's algorithms fro graphs represented
as adjacency matrices, and Dijkstra's algorithm (and traversals) for
graphs represented by edge lists. Both adj. matrix algorithms are in
matrix-alg.h , the supporting code for the matrix class is in  matrix.h
and the code for the graph (including the constructor that reads 
the data representing the graph from a file -- observe the expected format 
explained in comments!) are in  graph.h  .  The interaction between those 
is made clear by  test-graph.cc  . The code uses no STL data types, just 
my matrices.  matrix.h  has a constructor that reads a matrix from a file 
(observe the expected format!).  

The code for edge list representation is not yet on Ambassador, but you can 
get it off my page.
  
a) Write the code that reads an adjacency matrix from a file (you can use
   matrix::matrix( char* filename ) and constructs the _edge list_ 
   representation of the graph defined by adjacency matrix. 

   You can use any STL data structures (see Sec. 19.2 and Sec 19.3, 
   pp. 447-448, 451-453 for explanations of both implementations and 
   some STL code). Given the typedefs as follows, you could use a  
   list_of_vertex  to keep the edges going out of a vertex and a  
   vector_of_list_of_vertex  to keep the latter lists in one vector, 
   indexed by the vertex id number.
                
   typedef  int                                   vertex;             
   typedef  list< vertex, allocator<vertex> >     list_of_vertex;  
   typedef  vector< list_of_vertex,
                    allocator<list_of_vertex> >   vector_of_list_of_vertex;
    
   That way if you declare   vector_of_list_of_vertex  edge_list;  and fill
   it in properly, then  edge_list[2]  will be a list_of_vertex, the list of
   vertices you can get to by going along one edge from vertex 2 .
   
   Read in an adjacency matrix and print nicelythe resulting edge list.
                                                                       
Note: the type vertex is simply aliased to int. If you want your vertex to 
      be a more complicated struct, don't forget to add the macro 

                __MSL_FIX_ITERATORS__ (vertex);                      

      after your defnition of  struct vertex { ... };
   
b) Using either the existing matrix representation of a graph or your edge list
   represenattion, implement the depth-first and breadth-first traversals
   (see Sec. 19.3.1, pp. 453-455) If you decide to use the adjacency matrix
   representation, you will have to find all the neighbors of a a vertex  v  
   by something like this:                    
   
         graph  gr( "your_data_file.dat" );
         vertex v = 2;  //vertex is just an int
                          
         // find all neighbors of v                 
         for( int i=0; i < gr.size() ; i++){
           if ( gr.are_connected( v, i ) ) // edge  v --> i  exists, process it
              cout << "edge " << v << " --> " << i << " found\n";     //etc.
         }     
    
c) Construct the minimal spanning tree for the weighted graph on p. 456 using
   Kruskal's algortihm (you can wait for my handout on Thursday to do this). 
     
d) Dijkstra's algorithm finds the costs (distances, total weights) of the 
   minimal paths from the source to any vertex, BUT, what are these paths?
   Modify the code for Dijkstra's algorithm to keep track of the actual
   shortest paths so that a shortest path to a given target vertex can
   be recovered after the algorithm has finished. 
  
   Hint: Keep track of the predecessors along shortest paths, as in Floyd's.