Data organization is one of the most important parts of software design and implementation. The way you organize, access, or modify your data plays an important factor on your program's performance, i.e., memory used and speed.
We will go over some of the data structures you have already encountered in this class and provide some high level analysis on their performance. We will analyze, in terms of operations needed, what is the best, worst and average time needed for operations on our data structure.
       How long does it take us to retrieve an element from our data
       structure? We will not use a clock to measure "how long" an
       operation takes to complete. Instead we will measure how many
       operations the program has to perform. Our notion of
       operations will be abstract, e.g. retrieving the 5th
       element from our standard ConsList representation
       of lists takes 5 operations in total. 
       
Up to now we have seen two types of data structures.
ConsList
          ArrayList with indexes.
          | Data Structure | Number of Items | Best Case | Worst Case | 
|---|---|---|---|
| ConsList | 1000 | 1 | 1000 | 
| ArrayList | 1000 | 1 | 1000 | 
| SortedConsList | 1000 | 1 | 1000 | 
| SortedArrayList | 1000 | 1 | 10 | 
Let's look at the different cases that we have when we are looking for an element inside the data structures from the preceding table.
ConsList
          ConsList. One operation is
            all we need.
            ConsList (or it is not even in our
            ConsList). We have to traverse the whole list
            in order to find our element (or decide that it is not in
            our list). That takes 1000 operations.
            ArrayList. The elements in our
          ArrayList are not sorted. We can look in the array
          in a sequential manner, i.e., starting from index 0 until the
          end of the ArrayList. We can also pick indexes
          at random that are between 0 and the ArrayList's
          size. Each time we make sure we do not visit the same index
          twice.
          ArrayList. One
            operation is all we need.
            ArrayList (or it is not even
            in our ArrayList). We have looked at the whole
            ArrayList in order to find our element (or decide
            that it is not in our list). That takes 1000 operations.
            SortedConsList. The elements in our
          SortedConsList are sorted.
          SortedConsList. One operation is
            all we need.
            SortedConsList (or it is not even
            in our SortedConsList). Since we only have sequential
            access to reach the last element of our list (or even figure
            out that this element is not in our list) we have to traverse
            the whole list.  That takes 1000 operations.
            SortedArrayList. The elements in our
          SortedArrayList are sorted. In this case we'll
          be a bit more clever on how we access elements. Instead of
          looking randomly, or even in sequence from index 0 to the end,
          we will do the following.
          array.size() / 2. In the case were
              the length of our array is odd we can take the ceiling
              (or floor) after we divide by 2.
            array.size() / 2 with
            the element we are looking for. Then we check 
            array.size() / 2, then we
              need to look at the last half of our array. We go back to 
              step 1, but we consider the start of our array to be
              array.size() / 2 and the end of the array to
              be array.size().
              array.size() / 2, then
              we go back to step 1, but we consider the start of
              array to be index 0 and the end of our array to be
              array.size() / 2.
              array.size() / 2, we are done. 
              array.size() / 2. One operation is all we need. 
            Let's work out the worst case scenario as a game. Player A thinks of a number between 1 to 1000. Player B has to find this number by asking player A questions. Consider a list of integers from 1 to 1000.
 
              |-------------------------------------------------| 
              1                                                1000
              Player B : Is it 500? 
              Player A : NO. It's bigger.  
        
         
              |------------------------| 
              500                      1000
              Player B : Is it 750?  
              Player A : NO. Smaller 
        
         
              |-----------|
              500        750
              Player B : Is it 625? 
              Player A : NO. Bigger. 
        
         
              |-----|
              625   750
              Player B : Is it 687? 
              Player A : NO. Smaller
        
         
              |---|
              625 687
              Player B : Is it 657? 
              Player A : NO. Bigger
        
         
              |--| 
              657 687 
              Player B : Is it 672? 
              Player A : NO. Smaller
        
         
              |-|
              657 672 
              Player B : Is it 664? 
              Player A : NO. Bigger
        
         
              |-|
              664 672 
              Player B : Is is 668? 
              Player A : NO. Smaller
        
         
              |-|
              664 668  
              Player B : Is is 666? 
              Player A : Yes 
        
        We found it after 10 turns.
Let's think about inserting a new element to our data structure and how many operations it takes. This time we will assume we have n elements instead of 1000.
| Data Structure | Number of Items | Best Case | Worst Case | Average Case | 
|---|---|---|---|---|
| ConsList | n | 1 | n | n | 
| ArrayList | n | 1 | n | n | 
| SortedConsList | n | 1 | n | n/2 | 
| SortedArrayList | n | 1 | log2 n | log2 n | 
How does a binary search tree perform? Lets build a binary search tree by adding the numbers 2, 48, 36, 20, 67, 45, 46 in order.
                                 2
                                  \ 
                                   48
                                   / \ 
                                  36 67
                                  /  /
                                 20 45 
                                     \
                                     46
 
        To insert a new element we nee to
To find the appropriate place to add our new element in a binary search tree depends on the shape of our tree. For example adding 9 to
                             1
                              \ 
                               2
                                \ 
                                 3
                                  \
                                   4
                                    \ 
                                     5
                                      \ 
                                       6
                                        \
                                         7
                                          \ 
                                           8
        We have to walk down all 9 numbers to find the right position to add 10. On the other hand, with a balanced tree as below
                                    5
                                   / \
                                  2   7
                                 / \ / \
                                1  3 6  8
          only takes 3 operations. So what are the best and worst time for binary search trees.
| Number of nodes | Operations (balanced tree) | Operations (line tree) | 
|---|---|---|
| 7 | 3 | 7 | 
| 15 | 4 | 15 | 
| 31 | 5 | 31 |