The class of S-expressions gives a name to the fact that lists are heterogeneous and that, in particular, lists can contain lists:
An S-expression (short:S-expressions differ from lists in two ways. First, any value is an S-expression. Second, a
S-exp) is one of:
(cons S L)where
AnyValue(but not a
cons'ed S-expression may contain an arbitrary value in the second field, not just a list. Still, most people use
consin S-expressions so that there is really a list of S-expressions, which greatly simplifies the design of functions.
S-expressions are tree-shaped pieces of data. They are thus extremely useful for representing any structure form of information. People use them for collections of records, messages between machines, family trees, Web pages, and many more things.
Defining functions that process S-expressions is barely more complicated than defining functions that process lists. After all, we use the same fundamental constant and constructor. This function counts the number of chars inside of strings in an S-expression:
S-exp --> Number;; to count the number of characters in strings within
s(define (count-char s) (cond [(pair? s) (+ (count-char (car s)) (count-char (cdr s)))] [else (if (string? s) (string-length s) 0)]))
To use the function, we construct an S-expression and apply the function to it:
(count-char (cons (cons (cons 0 "hello") 'silly) "world"))
The S-expression in this example contains two strings, one number, and one symbol. The expected result is 10.
While processing structured pieces of data is important, many programs must also construct large pieces of structured data. To this end, Scheme provides a quotation mechanism for S-expressions that seeks its analog in any modern programming language.4
A quotation mechanism introduces a short-hand for defining constants in a
simple and readable manner. The string quotes are a primitive example of a
quotation mechanism. A more interesting mechanism is the
form for lists. The syntax for
(quote d) or
'd. Roughly speaking,
quote traverses its
single subexpression and inserts
list to the right of each left
parentheses and turns every token into a symbol if it looks like a
symbol.5 For example,
We also get a cute name for the empty list as a bonus:
quote, you can write down large pieces of data easily. Say
you are a teaching assistant and you want to keep track of your students'
grades. Defining the ``grade database'' as one large Scheme expression is
straightforward: see figure 11 We simply quote the deeply
nested parenthetical structure and have a collection of records. Because
of this, quoted S-expressions play a huge role in Scheme programming.
quote alone is a powerful way of writing down large
pieces of information as structured data, Scheme has even more powerful
tools for this purpose. In particular, Scheme has a mechanism for building
complex S-expression shapes from templates. Roughly speaking, a template
is an S-expressions with holes. The holes are filled with Scheme values.
The mechanism for defining S-expression templates are
(or backquote) and
unquote (or comma).
At first glance, a
quasiquote expression is just like
quote expression. The major difference is that a
quasiquote expression may contain subexpressions that are
unquote (comma) and that Scheme evaluates these
subexpressions rather than quoting them.
unquote permits programmers to write
down (large) nearly constant S-expressions in a concise manner. Imagine we
represent binary trees with S-expressions. The empty list
plays the role of the empty tree. A node is a list with three items: the
left branch, the information, the right branch. For readability, we may
annotate each piece of a node with a symbol that explains which role the
subexpression plays. Here is an example:
'((left: ()) (information: 5) (right: ((left: ()) (information: 7) (right: ()))))
This tree has two nodes, three empty subtrees, and two pieces of
information (one per node):
Naturally, we now want to define a function that consumes two trees and a piece of information to construct a new tree:
Tree Number Tree --> Tree(define (make-tree left-tree info right-tree) `((left: ,left-tree) (information: ,info) (right: ,right-tree)))
quasiquote expression contains three occurrences of
unquoted expressions. All of them are variables and simply craft
onto the tree template the subtrees and the new piece of data. To create
the above tree, we just write
(make-tree '() 5 (make-tree '() 7 '()))
Compare two simpler applications of the function:
(make-tree '() 3 '()) ; produces '((left: ()) (information: 3) (right: ()))
(make-tree '() 7 '()) ; produces '((left: ()) (information: 7) (right: ()))
Of course, the
unquoted subexpressions doesn't need to be a
variable. Suppose we wish to deal with numeric trees that also contain a
running sum field. The leaf looks now like this:
'((sum: 0) (tree: ()))
and an interior node has this shape:
'((sum: 12) (left: ((sum: 0) (tree: ()))) (information: 5) (right: ((sum: 7) (left: ((sum: 0) (tree: ()))) (information: 7) (right: ((sum: 0) (tree: ()))))))
Dealing with such trees requires functions that maintain the extra field properly:
Tree Number Tree --> Tree(define (make-sum-tree left info right) `((sum: ,(+ (tree-sum left) info (tree-sum right))) (left: ,left) (information: ,info) (right: ,right))) ;;
Tree --> Number(define (tree-number t) (second (car t))) ;;
Tree(define the-empty-tree '((sum: 0) (tree: ())))
In the definition of
make-sum-tree, we have the same occurrences
unquoted variables as in
make-tree, but we also have
unquoted arithmetic expression. When we apply
make-sum-tree, this expression is evaluated and the resulting
number becomes the
'sum field of the new tree.
(listof (cons String (listof Number))) --> S-exp(define (make-table grades) `(table ([cellspacing "10"] [border "1"]) (tr (th "Name") (th "Grades")) ,@(grades-rows grades))) ;;
(listof (cons String (listof Number))) --> S-exp(define (grades-rows grades) (map entry-row grades)) ;;
(cons String (listof Number)) --> S-exp(define (entry-row e) `(tr (td ,(car e)) ,@(make-cells (cdr e)))) ;;
(listof Number) --> (listof String)(define (make-cells grades) (map (lambda (n) `(td ,(number-string n))) grades))
Figure 12: Backquote, unquote, and splice
Finally, on some occasions, a program may have to splice a list of values into an S-expression. For example, you may wish to splice a list of grades into a header template that attaches some additional information to a grade listing:
String String (listof (cons String (listof Number))) --> S-exp(define (header course teacher grades) `((Course: ,course) (Teacher: ,teacher) (Number of students: ,(length grades)) ,@grades))
We could accomplish this insertion with
append instead, but this would seriously reduce the readability
of such templates.
header play an important role in Web programming
(among many other things). S-expressions provide a simple and effective
representation of HTML and XML inside of Scheme. Thus, if we wish to
produce HTML pages, it is often convenient to define just such
functions. Consider the specific example of converting a list of grades
into an (S-expression representation of an) HTML table. That is, say we
want a function
make-table such that
(make-table '(("fritz" 1 1 1) ("ursula" 2 2 2))) produces '(table ((cellspacing "10") (border "1")) (tr (th "Name") (th "Grades")) (tr (td "fritz") (td " 1 ") (td " 1 ") (td " 1 ")) (tr (td "ursula") (td " 2 ") (td " 2 ") (td " 2 ")))
Figure 12 shows a program fragment that accomplishes just this kind of task. The program consists of four functions. Two wrap headers around tables and rows; two process rows and cells. The latter are written with a powerful loop operator, which we get to know in the next section.
The class of S-expressions serves as a universal data representation language in Scheme. With S-expressions, it is easy to represent all forms of information conveniently. As a matter of fact, some are so common that they are built in via primitives that work on just those subsets.
One of these is the notion of an association list. An association list is a
cons cells where the
car field is always a
symbol. The latter serves as a key to the data in the
field. It is similar to a dictionary that translates symbols to meaningful
information. Take a look at this phone directory:
"Robert F. Findler"
The symbols in each list are roughly like nicknames, and the rest of each list is a formal name and a phone number.
Scheme also provides a function for looking up the data that comes with a symbol in an association list:
Symbol (Listof (cons Symbol X)) --> (union #f (List Symbol X));; find a pair in
lwhose first field is
s(if there is one) (define (assq s l) (cond [(null? l) #f] [else (if (eq? (car (car l)) s) (car l) (assq s (cdr l)))])) (equal? '(a 9) (assq 'a '((b 2) (c 3) (a 9) (e 11)))) (not (assq 'z '((b 2) (c 3) (a 9) (e 11))))
The function is another instance of our template for list-processing
functions. The slightly unusual aspect is that
assq works with
lists inside of lists.
assq we can compose a program for looking up phone numbers:
Symbol --> String(define (lookup nickname) (let ([r (assq nickname PHONE-DIRECTORY)]) (if r (string-append (second r) ": " (third r)) (string-append ( symbol--> string nichname " not found")))))
This function looks up the nickname inn
#f or some
cons cell, and creates
a string as an answer. Remember that
#f from all other values, not just
Unfortunately, it is not only easy to encode data as S-expressions, it also easy to forget which subset of the class of S-expressions was chosen for a data representation. Because of that, we propose to use data definitions that describe the structure of the chosen data explicitly. For example, an association list is defined as
An association list is aor, more verbose, as
(Listof (cons Symbol X)).
An association list is one of:
(cons (cons Symbol X) L)where
Xis any kind of value and
Lis an association list.
We have also dealt with trees as S-expressions already:
A tree (also known asThis is a generic tree definition, where proper data shows up at the leaves only.
(Treeof X)) is one of:
(list Tree Tree).
S-expressions can also be used to encode Scheme's syntax:
An AST is one of:Since this definition is somewhat unusual, let's construct some examples and let's see how they represent Scheme expressions:
(list '+ AST AST)
(list '- AST AST).
a number represents a Scheme number, and
a symbol by itself represents a Scheme variable,
an S-expression with a
'+ in the first position and two
numbers is an additio expression, and
an S-expression with a
'- as the first symbol is a
subtraction expression. If it involves a symbol, it is a so-called variable
In general, we can find the represented expression by just stripping off the quote.
When data definitions become as large as the one for AST (or even larger), it is also a good practice to define selectors for the various pieces similar to the selectors that a structure definition introduces:
(define (addition-left-operand a) (second a)) (define (addition-right-operand a) (third a)) (define (subtraction-left-operand a) (second a)) (define (subtraction-right-operand a) (third a))
In this particular world, it may even make sense to use the same selectors for both kinds of complex expressions and additional selector for the operator:
(define (operator a) (car a)) (define (left-operand a) (second a)) (define (right-operand a) (third a))
In either case, the functions that process
ASTs become more
readable and less dependent on the actually chosen data
representation. Switching to post-fix representations or structures later
on is almost no problem, as long as we use the selectors and not the
To design a function template for this subclass of S-expressions, we proceed the same way we did for lists in section 4.3. First, we observe that the data definition has four clauses, which suggests that the function should have four clauses:
(define (function-for-ast an-ast) (cond [(symbol? an-ast) ...] [(number? an-ast) ...] [(eq? (operator an-ast) '+) ...] [(eq? (operator an-ast) '-) ...]))
Second, we can also see that in clauses 3 and 4 the given
should have two more components:
(define (function-for-ast an-ast) (cond [(symbol? an-ast) ...] [(number? an-ast) ...] [(eq? (operator an-ast) '+) ... (left-operand an-ast) ... (right-operand an-ast) ...] [(eq? (operator an-ast) '-) ... (left-operand an-ast) ... (right-operand an-ast) ...]))
Last but not least, we see that the data definition refers to itself for
the four subexpressions that we just added. This suggests that functions
ASTs naturally recur for those four expressions:
(define (function-for-ast an-ast) (cond [(symbol? an-ast) ...] [(number? an-ast) ...] [(eq? (operator an-ast) '+) ... (function-for-ast (left-operand an-ast)) ... ... (function-for-ast (right-operand an-ast)) ...] [(eq? (operator an-ast) '-) ... (function-for-ast (left-operand an-ast)) ... ... (function-for-ast (right-operand an-ast)) ...]))
Equipped with this template we can now write some functions on
The first function determines whether the given symbolic representation of a Scheme expression is free of variables, which are represented as symbols. It is straightforward instance of the template.
AST --> Boolean;; determine whether
an-astis free of symbols as leaves (define (no-variables an-ast) (cond [(symbol? an-ast) #f] [(number? an-ast) #t] [(eq? (operator an-ast) '+) (and (no-variables (left-operand an-ast)) (no-variables (right-operand an-ast)))] [(eq? (operator an-ast) '-) (and (no-variables (left-operand an-ast)) (no-variables (right-operand an-ast))])))
The second function substitutes symbols with values, which mimics
what a high school student and the Scheme evaluator do during the
evaluation of function applications. Note the use of backquote and comma
in the last two
cond clauses for the re-creation of
(representations of) additions and subtractions.
AST Symbol Number --> AST;; replace the symbol
an-ast(define (substitute an-ast x val) (cond [(symbol? an-ast) (if (eq? an-ast x) val an-ast)] [(number? an-ast) an-ast] [(eq? (operator an-ast) '+) `(+ ,(substitute (left-operand an-ast)) ,(substitute (right-operand an-ast)))] [(eq? (operator an-ast) '-) `(- ,(substitute (left-operand an-ast)) ,(substitute (right-operand an-ast))])))
The last function determines the value of an arithmetic
that is, an
AST that has no variables. Compare how this definition
- functions rather than the
'- symbols like the preceding definition.
AST --> Number;; replace the symbol
an-ast(define (evaluate an-ast x val) (cond [(symbol? an-ast) (error 'evaluate "undefined identifier")] [(number? an-ast) an-ast] [(eq? (operator an-ast) '+) (+ (evaluate (left-operand an-ast)) (evaluate (right-operand an-ast)))] [(eq? (operator an-ast) '-) (- (evaluate (left-operand an-ast)) (evaluate (right-operand an-ast))])))
As before, it is a bad idea to follow the function template blindly. On
occasion, we need to compose functions on
ASTs rather than just
design new ones. Here is an example:
AST (Listof (list Symbol Number)) --> Number;; evaluate
an-astafter substituting all symbols in
env;; with their corresponding numbers (define (eval an-ast env) (evaluate (substitute* an-ast env))) ;;
AST (Listof (list Symbol Number)) --> AST;; substitute all symbols in
an-astaccording to their association ;; in
env(define (substitute* an-ast env) (cond [(null? an-ast) an-ast] [else (substitute (substitute* an-ast (cdr env)) (car (car env)) (cdr (car env)))])) (equal? '(+ (- 1 1) (+ 5 1)) (substitute* '(+ (- x 1) (+ y 1)) (list (cons x 1) (cons y 5)))) (equal? 6 (eval '(+ (- x 1) (+ y 1)) (list (cons x 1) (cons y 5))))
eval composes a new function,
with an ``old'' function,
evaluate. If the given association list
contains a number for every symbol that occurs in
eval will compute the same number as Scheme for the S-expression.
A function like
eval is a part of Scheme's repertoire. Look in
Help Desk for details, because Scheme's
eval is far more powerful
than the one we just discussed.
In conclusion, self-referential data definitions are easy to deal with in programs. All we need to do is write functions whose organization is similar to the organization of the data definition. Furthermore, you have just seen a very powerful, and a very self-serving, application of S-expressions. In other parts of this book, you will see more practical applications of this important and old Scheme idea.