#|

LECTURE 28: Binary Search Trees

A binary search tree is a data structure for quick access to elements. The
idea is to store elements in a tree, with the property that all elements less
than (or equal to) the root are in the left subtree, and all elements greater
than the root are in the right subtree.

If the tree is *balanced*, that is, as close to complete as possible, then
finding an element in the tree (or seeing that it is not in the tree) takes
worst-case time roughly log(N), where N is the number of elements in the tree.
In contrast, looking for an element in a list (or seeing that the element is
not in the list) takes worst-case time roughly N, where N is the number of
elements in the list. When N is large, log(N) is much, much smaller than N. So
binary search trees are more efficient when searching amongst large sets of
numbers.

(Why might you use binary search trees? If you wanted to implement, say, sets
of numbers, you could use a binary search tree to implement the set, and
seeing if an element is in the set corresponds to looking for an element in
the tree.)

We are interested in coming up with an ADT for binary search trees (or bst).
The key is to make sure that every bst we construct satisfies the property
that all elements in the left subtree of a node are less than or equal to the
root value, and all elements in the right subtree of a node are greater than
the root value, and this for every node in the bst. To ensure this, instead of
providing a constructor such as node (like we did for the tree ADT), we
provide a constructor insert that inserts an element in the right spot in the
tree.

  empty   : ()            -> bst
  insert  : (integer bst) -> bst
  isEmpty : (bst)         -> bool
  root    : (bst)         -> integer
  left    : (bst)         -> bst
  right   : (bst)         -> bst

What's the algebraic specification? That's more interesting. Again, we need to
come up with equations corresponding to the application of selectors to each
of the constructors (empty, and insert). The constructor insert is the more
interesting one. Here is the algebraic specification.

  (isEmpty (empty))      = true
  (isEmpty (insert a t)) = false

  (root (insert a t))  = a                     if (isEmpty t)=true
                         (root t)              if (isEmpty t)=false

  (left (insert a t))  = (empty)               if (isEmpty t)=true
                         (insert a (left t))   if (isEmpty t)=false and a<=(root t)
                         (left t)              if (isEmpty t)=false and a>(root t)

  (right (insert a t)) = (empty)               if (isEmpty t)=true
                         (right t)             if (isEmpty t)=false and a<=(root t)
                         (insert a (right t))  if (isEmpty t)=false and a>(root t)

Intuitively, the specifications for left and right tell you that to get the
left subtree of a bst, you keep all the inserts that inserted something that
was less than or equal to the root of the tree, and symmetrically for getting
the right subtree of a bst. If you're not convinced, try the specification on
a few examples.

So what's an implementation for the above? We use the same representation for
trees that we used last lecture, and the same implementations of empty,
isEmpty, left, right, and root.

|#

(defun empty ()
  NIL)

(defun isEmpty (tr)
  (endp tr))

(defun root (tr)
  (first tr))

(defun left (tr)
  (if (endp tr)
      NIL
    (second tr)))

(defun right (tr)
  (if (endp tr)
      NIL
    (third tr)))
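#|

The motivation above was fast search, but the ADT itself has no search
operation. As an aside, here is a minimal sketch of what one could look like,
written only in terms of the selectors just defined; the name bst-member is
not part of the lecture's ADT and is just for illustration. Note how the bst
property lets the search descend into only one subtree at each step, which is
where the roughly log(N) behavior on balanced trees comes from.

|#

(defun bst-member (a tr)
  (if (isEmpty tr)
      NIL                             ; not found in an empty tree
    (if (= a (root tr))
        T                             ; found at the root
      (if (<= a (root tr))
          (bst-member a (left tr))    ; smaller (or equal) values live on the left
        (bst-member a (right tr)))))) ; larger values live on the right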
#|

For insert, we simply compare the value to insert to the root of the tree we
are inserting into. If we're inserting into an empty tree, we create a node
for the value. If the value is smaller than or equal to the root value, we
recursively insert the value into the left subtree of the current tree, and if
the value is greater than the root value, we recursively insert the value into
the right subtree of the current tree. The value gently finds its right spot
in the tree.

|#

(defun insert (a tr)
  (if (endp tr)
      (list a NIL NIL)
    (if (<= a (root tr))
        (list (root tr) (insert a (left tr)) (right tr))
      (list (root tr) (left tr) (insert a (right tr))))))

#|

Let's write down a predicate bst? that checks that a structure is actually a
binary search tree, by checking the bst property that all values in the left
subtree are less than or equal to the root value, and all values in the right
subtree are greater than the root value. We need a couple of auxiliary
functions to make the definition of bst? more palatable: (all-<= a tr) checks
that every element in tree tr is <= a, and (all-> a tr) checks that every
element in tree tr is > a.

|#

(defun all-<= (a tr)
  (if (endp tr)
      T
    (and (<= (root tr) a)
         (all-<= a (left tr))
         (all-<= a (right tr)))))

(defun all-> (a tr)
  (if (endp tr)
      T
    (and (> (root tr) a)
         (all-> a (left tr))
         (all-> a (right tr)))))

(defun bst? (tr)
  (if (endp tr)
      (equal tr NIL)
    (and (all-<= (root tr) (left tr))
         (all-> (root tr) (right tr))
         (bst? (left tr))
         (bst? (right tr)))))

#|

As a sanity check, we may want to prove that inserting a value into a binary
search tree yields a binary search tree.

  (defthm insert-into-bst
    (implies (bst? tr)
             (bst? (insert a tr))))

That's not so obvious to prove. I'll leave it as a (hard) exercise for now. I
don't mind so much not having it, because we have the algebraic specification
for binary search trees, and we can prove that our implementation satisfies
the specification. Note how we translated the above algebraic specification
into ACL2.

|#

(defthm isempty-empty
  (isEmpty (empty)))

(defthm isempty-insert
  (implies (and (bst? tr)
                (integerp a))
           (not (isEmpty (insert a tr)))))

(defthm root-insert-1
  (implies (and (bst? tr)
                (integerp a)
                (isEmpty tr))
           (= (root (insert a tr)) a)))

(defthm root-insert-2
  (implies (and (bst? tr)
                (integerp a)
                (not (isEmpty tr)))
           (= (root (insert a tr)) (root tr))))

(defthm left-insert-1
  (implies (and (bst? tr)
                (integerp a)
                (isEmpty tr))
           (= (left (insert a tr)) (empty))))

(defthm left-insert-2
  (implies (and (bst? tr)
                (integerp a)
                (not (isEmpty tr))
                (<= a (root tr)))
           (= (left (insert a tr)) (insert a (left tr)))))

(defthm left-insert-3
  (implies (and (bst? tr)
                (integerp a)
                (not (isEmpty tr))
                (> a (root tr)))
           (= (left (insert a tr)) (left tr))))

(defthm right-insert-1
  (implies (and (bst? tr)
                (integerp a)
                (isEmpty tr))
           (= (right (insert a tr)) (empty))))

(defthm right-insert-2
  (implies (and (bst? tr)
                (integerp a)
                (not (isEmpty tr))
                (<= a (root tr)))
           (= (right (insert a tr)) (right tr))))

(defthm right-insert-3
  (implies (and (bst? tr)
                (integerp a)
                (not (isEmpty tr))
                (> a (root tr)))
           (= (right (insert a tr)) (insert a (right tr)))))

#|

There, so we have an implementation of binary search trees (which is, by the
way, the standard implementation), and we showed formally that the
implementation satisfies the algebraic specification. If you believe that the
algebraic specification truly captures what a binary search tree is doing,
then we're done.

|#
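#|

Before moving on, a quick sanity check at the prompt (this particular tree is
just an illustration). Inserting 5, then 7, then 4 into an empty tree puts 5
at the root, 4 in the left subtree, and 7 in the right subtree, exactly as the
specification predicts:

  ACL2 !>(insert 4 (insert 7 (insert 5 (empty))))
  (5 (4 NIL NIL) (7 NIL NIL))

|#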
#|

Let's look at a very interesting property of binary search trees. You remember
inorder traversals, right, from last time? Here's the code for a tree inorder
traversal.

|#

(defun inorder (tr)
  (if (isEmpty tr)
      NIL
    (append (inorder (left tr))
            (cons (root tr) (inorder (right tr))))))

#|

Binary search trees have an interesting property with respect to inorder
traversals. To get a sense for it, consider the following function that lets
us construct binary search trees easily, given a list of integers.

|#

(defun make-tree (L)
  (if (endp L)
      (empty)
    (insert (car L) (make-tree (cdr L)))))

#|

Note that the elements are inserted in the tree in the reverse order of that
in which they appear in the list. Thus, for instance:

  ACL2 !>(make-tree '(3 5 6 4 7 4 7 9))
  (9 (7 (4 (4 (3 NIL NIL) NIL) (7 (6 (5 NIL NIL) NIL) NIL)) NIL) NIL)

Now, let's take an inorder traversal of that tree:

  ACL2 !>(inorder (make-tree '(3 5 6 4 7 4 7 9)))
  (3 4 4 5 6 7 7 9)

Interesting. It produced a sort of the list of elements. Try it for several
other binary search trees; you'll see that you always get a sorted list of the
elements.

Let's prove at least one part of this conjecture, namely, that taking the
inorder traversal of a binary search tree always results in an ordered list,
ordered in increasing order. Let's define that predicate first.

|#

(defun ordered->= (L)
  (if (endp L)
      T
    (if (endp (cdr L))
        T
      (and (<= (car L) (car (cdr L)))
           (ordered->= (cdr L))))))

#|

The theorem we want is basically that

  (implies (bst? tr)
           (ordered->= (inorder tr)))

Before we can prove it, though, we need a few lemmas. And to express those
lemmas, we need a few auxiliary functions.

How did I know I needed to prove those lemmas? Well, I tried proving the above
directly, and it didn't work. ACL2 got stuck basically trying to prove that
the result of appending the inorder traversal of the left subtree with the
root and the inorder traversal of the right subtree yielded an ordered list.
By the induction hypothesis, we know that the inorder traversals of the left
and the right subtrees of a bst are ordered, but then you need a lemma that
says that if you append two ordered lists such that every element in the first
list is less than or equal to the first element of the second list, then the
result is ordered. Also, we need a lemma that says that consing an element in
front of an ordered list, where that element is less than every element of the
list, yields a list which is itself ordered.

With two auxiliary functions (list-all-<= a L), which says that every element
of L is <= a, and (list-all-> a L), which says that every element of L is > a,
we can state and prove those lemmas.

|#

(defun list-all-<= (a L)
  (if (endp L)
      T
    (and (<= (car L) a)
         (list-all-<= a (cdr L)))))

(defun list-all-> (a L)
  (if (endp L)
      T
    (and (> (car L) a)
         (list-all-> a (cdr L)))))

(defthm lemma-1
  (implies (and (ordered->= L)
                (list-all-> a L))
           (ordered->= (cons a L))))

(defthm lemma-2
  (implies (and (ordered->= L)
                (not (endp M))
                (ordered->= M)
                (list-all-<= (car M) L))
           (ordered->= (append L M))))

#|

The next two lemmas make the link between the all-> function on trees and the
list-all-> function on lists, and similarly for all-<= and list-all-<=.

|#

(defthm lemma-3
  (implies (all-> a tr)
           (list-all-> a (inorder tr))))

(defthm lemma-4
  (implies (all-<= a tr)
           (list-all-<= a (inorder tr))))

#|

We can now prove our main theorem.

|#

(defthm ordered-inorder
  (implies (bst? tr)
           (ordered->= (inorder tr))))

#|

Voila.

|#
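#|

To close the loop on the observation above, here is a small sketch of how this
property could be used for sorting: build a bst from a list, then read the
elements back out with an inorder traversal. The name bst-sort is not from the
lecture; it is just for illustration. On the earlier example it reproduces
exactly the sorted list we saw at the prompt:

  ACL2 !>(bst-sort '(3 5 6 4 7 4 7 9))
  (3 4 4 5 6 7 7 9)

|#

(defun bst-sort (L)
  ;; Insert every element of L into an (initially empty) bst, then read the
  ;; elements back in increasing order via an inorder traversal.
  (inorder (make-tree L)))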