LAB 5 -- Sorting arrays

In this lab you will learn Quicksort, one of the fastest known algorithms for sorting an array in-place, and Binary Search, the method for searching in a sorted array that is the ABC of many serious algorithms.

You will use templates, to make your code general enough to sort and search arrays of any type of values for which operator < is defined.

Preliminary facts

1. Let T be a type. The following lines cause the system to allocate an array of n contiguous values of that type. The starting address of this array is stored in ptr,

T * ptr;     // declare a pointer to T
ptr = new T[ n ];
or the same thing in one line,
T * ptr = new T[ n ]; 
After this, you will address the slots in the array as usual, ptr[0], ..., ptr[n-1]. The operating system will mark off the n * sizeof(T) bytes of RAM starting at that address as busy (that is, no other program will be allowed to use them). When you are finished with that memory, you should release it:
delete [] ptr; 
delete always takes a pointer. There are two forms of it, delete and delete [], the former is used to deallocate a single object, the latter to deallocate an array. In both case, the destructor ~T() is called for each object being deallocated, before the corresponding area of RAM is marked free. Forgetting to deallocate memory you allocated with new is a serious problem, called a memory leak.

2. Class string has all the standard comparison operators such as operator==, operator!=, operator < defined, so you can compare strings as you compare numbers. Strings are ordered, with respect to operator <, in lexicographic order, the one in which words are listed in a dictionary. Due to this, you will be able to sort and search arrays of strings with no extra effort.

A few helper functions

We suggest that you write and test a few helper functions for dealing with arrays before you start programming quicksort and binary search. Here is an example of a template functions:

template< class T >
void print_array( T * a, int len ){
  for( int i=0; i < len; i++ )
    cout << a[i] << " " ;
  cout << endl;
}
You have to pass the length of the array as a parameter, because the name of the of the array is just its starting address, no more. Continuing the previous example, you will write
print_array( ptr, n );  
to print out your array. Notice that the type T for the template is deduced by looking at the type of ptr.

We suggest that you write functions for filling your arrays randomly, checking if an array is sorted etc. Your tests should be as few lines of code as possible.

Task I

Write an implementation of Quicksort for an array of integers. If you are not entirely comfortable with templates, we suggest that you don't use them for this task (and add them later when you are sure your program is working. You will need to write the functions

int pivot( int a[], int start, int stop, int pos );  // one turn around the pivot

void quicksort( int a[], int low, int high );        // recursive  
Your can look at the implementation in Budd's book, pages 347--353. Keep in mind that he takes a slightly different approach to stopping his loops -- his last argument is always the first invalid index, not the last valid one. If you ignore this difference, you will get plenty of off-by-one errors that will eventually kill your algorithm. Understand the algorithm, do not try to adjust the given code hoping to hit upon the right adjustment.

To test these functions, create an array a of integers, fill it with random numbers, apply your quicksort(..) and check if a is indeed sorted now. Repeat the test 100 times :-). Here's what my test code looked like:

  int n;
  cout << "Enter array size: ";
  cin >> n ;

  int* arr = new int[n];

  for( int i=0; i < NUM_TESTS; i++ ){  
      fill_random( arr, n );
      quicksort( arr, 0, n-1 );  // quicksort invoked 
      // print_array( arr, n );  // commented out because it printed too much stuff.
      if( is_sorted( arr, n ) ) cout << "OK ";
      else cout << "ERROR! ";
  }
Most likely, your first version will contain bugs, and you will need to debug it by stopping after each swap in pivot and observing the results. User the debugger and/or a number or debugging prints-out on entry and on exit to your functions.

Task II

Implement the template versions of the functions pivot(..) and quicksort(..) and use them to sort an array of strings. You can enter the strings from the keyboard, or read them from a file. Recall that the following code code will read in a text file, word after word (spaces and newlines are considered word separators and skipped).

  ifstream f( "sometext.txt" );
  string s;
  
  while( f >> s ){  // f >> s fills s with the next word and returns true,
                    //   or fails (when at end of file) and returns false
     do_work_with_word( s );    
  } 

Task III

Implement binary search in the sorted array. Namely, write functions

template< class T > 
bool BinaryFind( T * array , int len, T elt ) 

template< class T > 
bool BinaryFindFirst( T * array , int len, T elt, int & pos ) 
template< class T > bool BinaryFindLast( T * array , int len, T elt, int & pos ) This function takes the array, its length, the element elt we are looking for and, for the last two functions, a reference to an integer variable pos. The function BinaryFind(..) returns true if elt is found in the array, false otherwise. The other two functions additionally fill pos with the position at which it occurs first/last. If elt does not occur in the array, their action on pos is not specified.

Test your functions interactively a number of times. Reading words from a file, storing them in an array, sorting them and searching for user input words in this array makes a good test. You can use the text fragment from "Alice in Wonderland" for your sample text: alice-proc.txt. See the appendix to the Hash Table extra credit assignment for details about how this text was obtained and prepared.

Task IV (extra credit)

Integrate your code with that for our vector class Table. Add a new method Table< T >::Sort(), which performs Quicksort on the contents of the table. Declare a Table<string>, read in a text file of words, and print the words sorted in alphabetic order.

Change the table by adding an extra array of counters and count how many times each word occurs in the text. Sort the words in the order of their frequency and print the list (now each word is printed once) so oredered.