7  Types and Predicates

Scheme is an untyped language. Programmers don't specify the types of variables or functions, and Scheme implementations don't check types. Hence, it is perfectly okay to write

(define (f x)
  (if (pair? x)
      (+ (car x) (cdr x))
      (+ (car x) 10)))      

The compiler won't complain, and as long as nobody applies f to a non-pair, the program works just fine. If you do evaluate (f 10), the function car should complain that it was applied to a non-pair and the evaluation should stop with an error.

To make up for the lack of types, Scheme supports run-time predicates for values. That is, each class C of Scheme values or structures that we define comes with a predicate with which the program can determine whether some value belongs to C. Here are the predicates and the questions that they answer for those classes that we have discussed so far:

  1. number? -- is the value a number?

    The class of numbers comes with a number of subsets: integers, rationals, reals, and complex numbers. Each category of numbers comes in two forms exact and inexact. This table summarizes the categorization:

    exact? inexact?

  2. boolean? -- is the value #t or #f?

  3. char? -- is the value a character?

  4. symbol? -- is the value a symbol?

  5. string? -- is the value a string?

  6. vector? -- is the value a vector?

  7. pair? (pair?) -- is the value a pair, i.e., a cons cell?

  8. null? (null?) -- is the value the empty list?

  9. procedure? -- is the value a procedure?

  10. Every structure definition introduces a disjoint class of values and a predicate that recognizes these values.

Conventional Scheme implementation use tags to support predicates. Each value is tagged with its class of origin. A predicate just checks this tag. These tags are sometimes called the run-time types of a value, which is of course a rather misleading name considering the purely syntactic nature of true types and their role in programming.

Because Scheme doesn't use one particular type system, Scheme programmers are free to superimpose their own type-oriented thinking. In this book, we use a informal type notation and annotate all function, variable, and structure definitions with such type statements. Our type notation suggests that Scheme programs manipulate classes of values and that programmers reason about these classes.

;Type is one of:
  (Vectorof Type) 
  (Listof Type)
  (constructor Type ...)
  (Type ...  -->  Type)
  (union Type ...)

;Constructor is one of:
  ; and any other constructor 
  ; created with define-struct  

Figure 13:  Types for Scheme

The syntax of our notation is simple; see figure 13. The first ``types'' five just correspond to the basic types we have used so far. A (Listof T) means a list of items of type T and the same holds for (Vectorof T). In contrast, (list T1 T2 T3) means a list of three items: one of type T1, one of type T2, and one of type T3. Using such constructed types and concrete values allows us to describe a function with much more precision than a conventional type system would allow.

Consider these examples:

We also define sets of values as needed. These definitions use stylized English that describe newly introduced classes of structure or recursively defined sets. Here is a data definition for a simple family tree:

A family tree -- also known as FT -- is one of:
  1. 'unknown or

  2. (list String FT FT).

The data definition specifies exactly how we can construct a family tree. According to clause 1, the simplest family tree is the symbol 'unknown. Clause 2 says that to form a larger family tree than that, we must form a three-item list: a string, presumably the name of the person, and two family trees, e.g., (list "Annette" 'unknown 'unknown). The two trees may represent the family trees of the father and mother, respectively. For that, we need to add comments to a data definition that explain its pragmatics. Of course, we can also use structures where the field names already clarify which role these values play.

In addition to plain data definitions, we also define parameterized sets of values. Indeed, we have seen this several times already: (Listof X), (Vectorof X), and in function definitions. The parameters of such definitions (and their uses) stand for fixed, but random sets of values. For example, X can stand for Number, String, or (Listof Y). Hence, if we say (Vectorof X) we may mean vectors that contain numbers, strings, or lists with unknown items.

Function definitions are another useful place for set variables. Take a look at the header of a typical sort function:

;; (Listof X) (X X  -->  Boolean)  -->  (Listof X)
(define (sort alox compare) ...)

The comment specifies that alox is a list of Xen, whatever X is. And the comparison function, compare, knows how to compare two X. Hence if the given list is a list of numbers, compare must be applicable to two numbers; if the given list is a list of structures, compare must work on the same kind of structures. The contract for map connects the domain of the function with the range:

;; (X  -->  Y) (Listof X)  -->  (Listof Y)
(define (map f alox) ...)

Here the given function maps X to Y, no matter what they are. If the given list is a list of numbers, the function must work on numbers; if it is a list of structures, the function must work on that class of structures. And whatever the function produces determines the result of map. Since the real map actually consumes an arbitrary number of lists, its contract is really something like this:

;; (X Z ...  -->  Y) (Listof X) (Listof Z) ...  -->  (Listof Y)
(define (map f alox) ...)

Here Z ... and (Listof Z) ... denote zero or more occurrences of these types. Again, the function type and the domain type are tightly connected and gives you a lot of information.

As we encounter various Scheme constructs, we will introduce additional notations for specifying functions contracts and sets of values.