Data organization is one of the most important parts of software design and implementation. The way you organize, access, or modify your data plays an important factor on your program's performance, i.e., memory used and speed.
We will go over some of the data structures you have already encountered in this class and provide some high level analysis on their performance. We will analyze, in terms of operations needed, what is the best, worst and average time needed for operations on our data structure.
How long does it take us to retrieve an element from our data
structure? We will not use a clock to measure "how long" an
operation takes to complete. Instead we will measure how many
operations the program has to perform. Our notion of
operations will be abstract, e.g. retrieving the 5th
element from our standard
of lists takes 5 operations in total.
Up to now we have seen two types of data structures.
|Data Structure||Number of Items||Best Case||Worst Case|
Let's look at the different cases that we have when we are looking for an element inside the data structures from the preceding table.
ConsList. One operation is all we need.
ConsList(or it is not even in our
ConsList). We have to traverse the whole list in order to find our element (or decide that it is not in our list). That takes 1000 operations.
ArrayList. The elements in our
ArrayListare not sorted. We can look in the array in a sequential manner, i.e., starting from index 0 until the end of the
ArrayList. We can also pick indexes at random that are between 0 and the
ArrayList's size. Each time we make sure we do not visit the same index twice.
ArrayList. One operation is all we need.
ArrayList(or it is not even in our
ArrayList). We have looked at the whole
ArrayListin order to find our element (or decide that it is not in our list). That takes 1000 operations.
SortedConsList. The elements in our
SortedConsList. One operation is all we need.
SortedConsList(or it is not even in our
SortedConsList). Since we only have sequential access to reach the last element of our list (or even figure out that this element is not in our list) we have to traverse the whole list. That takes 1000 operations.
SortedArrayList. The elements in our
SortedArrayListare sorted. In this case we'll be a bit more clever on how we access elements. Instead of looking randomly, or even in sequence from index 0 to the end, we will do the following.
array.size()/ 2. In the case were the length of our array is odd we can take the ceiling (or floor) after we divide by 2.
array.size()/ 2 with the element we are looking for. Then we check
array.size()/ 2, then we need to look at the last half of our array. We go back to step 1, but we consider the start of our array to be
array.size()/ 2 and the end of the array to be
array.size()/ 2, then we go back to step 1, but we consider the start of array to be index 0 and the end of our array to be
array.size()/ 2, we are done.
array.size()/ 2. One operation is all we need.
Let's work out the worst case scenario as a game. Player A thinks of a number between 1 to 1000. Player B has to find this number by asking player A questions. Consider a list of integers from 1 to 1000.
|-------------------------------------------------| 1 1000 Player B : Is it 500? Player A : NO. It's bigger.
|------------------------| 500 1000 Player B : Is it 750? Player A : NO. Smaller
|-----------| 500 750 Player B : Is it 625? Player A : NO. Bigger.
|-----| 625 750 Player B : Is it 687? Player A : NO. Smaller
|---| 625 687 Player B : Is it 657? Player A : NO. Bigger
|--| 657 687 Player B : Is it 672? Player A : NO. Smaller
|-| 657 672 Player B : Is it 664? Player A : NO. Bigger
|-| 664 672 Player B : Is is 668? Player A : NO. Smaller
|-| 664 668 Player B : Is is 666? Player A : Yes
We found it after 10 turns.
Let's think about inserting a new element to our data structure and how many operations it takes. This time we will assume we have n elements instead of 1000.
|Data Structure||Number of Items||Best Case||Worst Case||Average Case|
|SortedArrayList||n||1||log2 n||log2 n|
How does a binary search tree perform? Lets build a binary search tree by adding the numbers 2, 48, 36, 20, 67, 45, 46 in order.
2 \ 48 / \ 36 67 / / 20 45 \ 46
To insert a new element we nee to
To find the appropriate place to add our new element in a binary search tree depends on the shape of our tree. For example adding 9 to
1 \ 2 \ 3 \ 4 \ 5 \ 6 \ 7 \ 8
We have to walk down all 9 numbers to find the right position to add 10. On the other hand, with a balanced tree as below
5 / \ 2 7 / \ / \ 1 3 6 8
only takes 3 operations. So what are the best and worst time for binary search trees.
|Number of nodes||Operations (balanced tree)||Operations (line tree)|