11 Abstraction, Invariants, Testing
11.1 Invariants of Data Structures
Here’s an interface for a sorted list of numbers.
#lang class/1 ;; An ISorted implements ;; insert : Number -> Sorted ;; contains? : Number -> Boolean ;; ->list : -> [Listof Number] ;; empty? : -> Boolean ;; Invariant: The list is sorted in ascending order. ;; Precondition: the list must not be empty ;; max : -> Number ;; min : -> Number
How would we implement this interface?
#lang class/1 ;; A Sorted is one of ;; - (new smt%) ;; - (new scons% Number Sorted) (define-class smt% (check-expect ((new smt%) . contains? 5) false) (define/public (contains? n) false)) (define-class scons% (fields first rest) (check-expect ((new scons% 5 (new smt%)) . contains? 5) true) (check-expect ((new scons% 5 (new smt%)) . contains? 7) false) (check-expect ((new scons% 5 (new scons% 7 (new smt%))) . contains? 3) false) (check-expect ((new scons% 5 (new scons% 7 (new smt%))) . contains? 9) false) (define/public (contains? n) (or (= n (field first)) ((field rest) . contains? n))))
However, we can write a new implementation that uses our invariant to avoid checking the rest of the list when it isn’t necessary.
(define/public (contains? n) (cond [(= n (field first)) true] [(< n (field first)) false] [else ((field rest) . contains? n)]))
Because the list is always sorted in ascending order, if n is less than the first element, it must be less than every other element, and therefore can’t possibly be equal to any element in the list.
Now we can implement the remaining methods from the interface. First, insert
smt%
(check-expect ((new smt%) . insert 5) (new scons% 5 (new smt%))) (define/public (insert n) (new scons% n (new smt%)))
scons%
(check-expect ((new scons% 5 (new smt%)) . insert 7) (new scons% 5 (new scons% 7 (new smt%)))) (check-expect ((new scons% 7 (new smt%)) . insert 5) (new scons% 5 (new scons% 7 (new smt%)))) (define/public (insert n) (cond [(< n (field first)) (new scons% n this)] [else (new scons% (field first) ((field rest) . insert n))]))
Note that we don’t have to look at the whole list to insert the elements. This is again a benefit of programming using the invariant that we have a sorted list.
Next, the max method. We don’t have to do anything for the empty list, because we have a precondition that we can only call max when the list is non-empty.
scons%
(define real-max max) (check-expect ((new scons% 5 (new smt%)) . max) 5) (check-expect ((new scons% 5 (new scons% 7 (new smt%))) . max) 7) (define/public (max) (cond [((field rest) . empty?) (field first)] [else ((field rest) . max)]))
Again, this implementation relies on our data structure invariant. To make this work, though, we need to implement empty?.
smt%
(check-expect ((new smt%) . empty?) true) (define/public (empty?) true)
scons%
(check-expect ((new scons% 1 (new smt%)) . empty?) false) (define/public (empty?) false)
smt%
; no min method (define/public (->list) empty)
scons%
(define/public (min) (field first)) (define/public (->list) (cons (field first) ((field rest) . ->list)))
11.2 Properties of Programs and Randomized Testing
A property is a claim about the behavior of a program. Unit tests check particular, very specific properties, but often there are more general properties that we can state and check about programs.
Here’s a property about our sorted list library, which we would like to be true:
∀ sls : ISorted . ∀ n : Number . ((sls . insert n) . contains? n)
How would we check this? We can check a few instances with unit tests, but this property makes a very strong claim. If we were working in ACL2, as in the Logic and Computation class, we could provide a machine-checked proof of the property, verifying that it is true for every single Sorted and Number.
For something in between these two extremes, we can use randomized testing. This allows us to gain confidence that our property is true, with just a bit of programming effort.
First, we want to write a program that asks the question associated with this property.
; Property: forall sorted lists and numbers, this predicate holds ; insert-contains? : ISorted Number -> Boolean (define (insert-contains? sls n) ((sls . insert n) . contains? n))
Now we make lots of randomly generated tests, and see if the predicate holds. First, let’s build a random sorted list generator.
; build-sorted : Nat (Nat -> Number) -> Sorted (define (build-sorted i f) (cond [(zero? i) (new smt%)] [else (new scons% (f i) (build-sorted (sub1 i) f))])) (build-sorted 5 (lambda (x) x))
Oh no! We broke the invariant. The scons% constructor allows you to break the invariant, and now all of our methods don’t work. Fortunately, we can implement a fixed version that uses the insert method to maintain the sorted list invariant:
; build-sorted : Nat (Nat -> Number) -> Sorted (define (build-sorted i f) (cond [(zero? i) (new smt%)] [else ((build-sorted (sub1 i) f) . insert (f i))])) (check-expect (build-sorted 3 (lambda (x) x)) (new scons% 1 (new scons% 2 (new scons% 3 (new smt%)))))
Now build-sorted produces the correct answer, which we can easily verify at the Interactions window.
Using build-sorted, we can develop random-sorted, which generates a sorted list of random numbers.:
; Nat -> Sorted (define (random-sorted i) (build-sorted i (lambda (_) (random 100))))
Given these building blocks, we can write a test that checks our property.
(check-expect (insert-contains? (random-sorted 30) (random 50)) true)
Every time we hit the Run button, we generate a random sorted list of numbers, and check if a particular random integer behaves appropriately when inserted into it. But if we could repeatedly check this property hundreds or thousands of times, it would be even more unlikely that our program violates the property. After all, we could have just gotten lucky.
; Nat (Any -> Any) -> 'done ; run the function f i times (define (do i f) (cond [(zero? i) 'done] [else (f (do (sub1 i) f))]))
(do 1000 (lambda (_) (check-expect (insert-contains? (random-sorted 30) (random 50)) true)))
When this says that we’ve passed 1000 tests, we’ll be more sure that we’ve gotten our function right.
; Property: forall sorted lists and numbers, this predicate holds ; insert-contains? : ISorted Number -> Boolean (define (insert-contains? sls n) (sls . contains? n))
Now we get lots of test failures, but the problem is not in our
implementation of sorted lists, it’s in our property definition. If
we had instead had a bug in our implementation, we would have
similarly seen many failures. Thus, it isn’t always possible to tell
from a test failure, or even many failures, whether it’s the code or
the specification is wrong—
This is why it’s extremely important to get your specifications (like contracts, data definitions, and interface definitions) correct. Your program can only be correct if they are.
11.3 Abstraction Barriers and Modules
Recall that in our original version of build-sorted, we saw that the
scons% constructor allowed us to violate the invariant—
To address this, we set up an abstraction barrier, preventing other people from seeing the scons% constructor. To create these barriers, we use a module system. We will consider our implementation of sorted lists to be one module, and we can add a simple specification to allow other modules to see parts of the implementation (but not all of it).
Modules in our languages are very simple—
Here’s the module implementing our sorted list, which we save in a file called "sorted-list.rkt".
sorted-list.rkt
#lang class/1 ;; ... all of the rest of the code ... (define smt (new smt%)) (provide smt)
We’ve added two new pieces to this file. First, we define smt to be an instance of the empty sorted list. Then, we use provide to make smt, but not any other definition from our module, available to other modules.
Therefore, the only part of our code that the rest of the world can see is the smt value. To add new elements, the rest of the world has to use the insert method.
#lang class/1 (require "sorted-list.rkt") (smt . insert 4)
Here, we’ve used require, which we’ve used to refer to libraries that come with DrRacket. However, we can specify the name of a file, and we get everything that the module in that file provides, which here is just the smt definition. Everything else, such as the dangerous scons% constructor, is hidden, and our implementation of sorted lists can rely on its invariant.