The class of Sexpressions gives a name to the fact that lists are heterogeneous and that, in particular, lists can contain lists:
An Sexpression (short:Sexpressions differ from lists in two ways. First, any value is an Sexpression. Second, aSexp
) is one of:
(cons S L)
whereS
andL
areSexp
s
AnyValue
(but not acons
tructed value).
cons
'ed Sexpression may contain an
arbitrary value in the second field, not just a list. Still, most people
use cons
in Sexpressions so that there is really a list of
Sexpressions, which greatly simplifies the design of functions. Sexpressions are treeshaped pieces of data. They are thus extremely useful for representing any structure form of information. People use them for collections of records, messages between machines, family trees, Web pages, and many more things.
Defining functions that process Sexpressions is barely more complicated than defining functions that process lists. After all, we use the same fundamental constant and constructor. This function counts the number of chars inside of strings in an Sexpression:
;;Sexp > Number
;; to count the number of characters in strings withins
(define (countchar s) (cond [(pair? s) (+ (countchar (car s)) (countchar (cdr s)))] [else (if (string? s) (stringlength s) 0)]))
To use the function, we construct an Sexpression and apply the function to it:
(countchar (cons (cons (cons 0 "hello") 'silly) "world"))
The Sexpression in this example contains two strings, one number, and one symbol. The expected result is 10.
While processing structured pieces of data is important, many programs must also construct large pieces of structured data. To this end, Scheme provides a quotation mechanism for Sexpressions that seeks its analog in any modern programming language.^{4}
A quotation mechanism introduces a shorthand for defining constants in a
simple and readable manner. The string quotes are a primitive example of a
quotation mechanism. A more interesting mechanism is the quote
form for lists. The syntax for quote
is (quote d)
or
even shorter 'd
. Roughly speaking, quote
traverses its
single subexpression and inserts list
to the right of each left
parentheses and turns every token into a symbol if it looks like a
symbol.^{5} For example,
'(1 2 3) is short for (list 1 2 3)

'((a 1) (#t "two"))
expands to
(list (list 'a 1) (list #t "two"))

We also get a cute name for the empty list as a bonus:
'()
is a natural abbreviation for
null

With quote
, you can write down large pieces of data easily. Say
you are a teaching assistant and you want to keep track of your students'
grades. Defining the ``grade database'' as one large Scheme expression is
straightforward: see figure 11 We simply quote the deeply
nested parenthetical structure and have a collection of records. Because
of this, quoted Sexpressions play a huge role in Scheme programming.
Figure 11: Sexpressions for managing data
Still, while quote
alone is a powerful way of writing down large
pieces of information as structured data, Scheme has even more powerful
tools for this purpose. In particular, Scheme has a mechanism for building
complex Sexpression shapes from templates. Roughly speaking, a template
is an Sexpressions with holes. The holes are filled with Scheme values.
The mechanism for defining Sexpression templates are quasiquote
(or backquote) and unquote
(or comma).
At first glance, a quasiquote
expression is just like
quote
expression. The major difference is that a
quasiquote
expression may contain subexpressions that are
prefixed with unquote
(comma) and that Scheme evaluates these
subexpressions rather than quoting them.
Using quasiquote
and unquote
permits programmers to write
down (large) nearly constant Sexpressions in a concise manner. Imagine we
represent binary trees with Sexpressions. The empty list ()
plays the role of the empty tree. A node is a list with three items: the
left branch, the information, the right branch. For readability, we may
annotate each piece of a node with a symbol that explains which role the
subexpression plays. Here is an example:
'((left: ()) (information: 5) (right: ((left: ()) (information: 7) (right: ()))))
This tree has two nodes, three empty subtrees, and two pieces of
information (one per node): 5
and 7
.
Naturally, we now want to define a function that consumes two trees and a piece of information to construct a new tree:
;; Tree Number Tree > Tree
(define (maketree lefttree info righttree)
`((left: ,lefttree)
(information: ,info)
(right: ,righttree)))
The quasiquote
expression contains three occurrences of
unquote
d expressions. All of them are variables and simply craft
onto the tree template the subtrees and the new piece of data. To create
the above tree, we just write
(maketree '() 5 (maketree '() 7 '()))
Compare two simpler applications of the function:
(maketree '() 3 '()) ; produces '((left: ()) (information: 3) (right: ()))  and  (maketree '() 7 '()) ; produces '((left: ()) (information: 7) (right: ())) 
Of course, the unquote
d subexpressions doesn't need to be a
variable. Suppose we wish to deal with numeric trees that also contain a
running sum field. The leaf looks now like this:
'((sum: 0) (tree: ()))
and an interior node has this shape:
'((sum: 12) (left: ((sum: 0) (tree: ()))) (information: 5) (right: ((sum: 7) (left: ((sum: 0) (tree: ()))) (information: 7) (right: ((sum: 0) (tree: ()))))))
Dealing with such trees requires functions that maintain the extra field properly:
;;Tree Number Tree > Tree
(define (makesumtree left info right) `((sum: ,(+ (treesum left) info (treesum right))) (left: ,left) (information: ,info) (right: ,right))) ;;Tree > Number
(define (treenumber t) (second (car t))) ;;Tree
(define theemptytree '((sum: 0) (tree: ())))
In the definition of makesumtree
, we have the same occurrences
of unquote
d variables as in maketree
, but we also have
a complex unquote
d arithmetic expression. When we apply
makesumtree
, this expression is evaluated and the resulting
number becomes the 'sum
field of the new tree.
Finally, on some occasions, a program may have to splice a list of values into an Sexpression. For example, you may wish to splice a list of grades into a header template that attaches some additional information to a grade listing:
;; String String (listof (cons String (listof Number))) > Sexp
(define (header course teacher grades)
`((Course: ,course)
(Teacher: ,teacher)
(Number of students: ,(length grades))
,@grades))
We could accomplish this insertion with list
, cons
, and
append
instead, but this would seriously reduce the readability
of such templates.
Functions like header
play an important role in Web programming
(among many other things). Sexpressions provide a simple and effective
representation of HTML and XML inside of Scheme. Thus, if we wish to
produce HTML pages, it is often convenient to define just such
functions. Consider the specific example of converting a list of grades
into an (Sexpression representation of an) HTML table. That is, say we
want a function maketable
such that
(maketable '(("fritz" 1 1 1) ("ursula" 2 2 2))) produces '(table ((cellspacing "10") (border "1")) (tr (th "Name") (th "Grades")) (tr (td "fritz") (td " 1 ") (td " 1 ") (td " 1 ")) (tr (td "ursula") (td " 2 ") (td " 2 ") (td " 2 ")))
Figure 12 shows a program fragment that accomplishes just this kind of task. The program consists of four functions. Two wrap headers around tables and rows; two process rows and cells. The latter are written with a powerful loop operator, which we get to know in the next section.
The class of Sexpressions serves as a universal data representation language in Scheme. With Sexpressions, it is easy to represent all forms of information conveniently. As a matter of fact, some are so common that they are built in via primitives that work on just those subsets.
One of these is the notion of an association list. An association list is a
list of cons
cells where the car
field is always a
symbol. The latter serves as a key to the data in the cdr
field. It is similar to a dictionary that translates symbols to meaningful
information. Take a look at this phone directory:
(define PHONEDIRECTORY
'((matthias "Matthias Felleisen"
"713.333.2001"
)
(matthew "Matthew Flatt"
"218.432.1119"
)
(robby "Robert F. Findler"
"713.317.2278"
)
(shriram "Shriram Krishnamurthi"
"713.289.0003"
)
...
The symbols in each list are roughly like nicknames, and the rest of each list is a formal name and a phone number.
Scheme also provides a function for looking up the data that comes with a symbol in an association list:
;;Symbol (Listof (cons Symbol X)) > (union #f (List Symbol X))
;; find a pair inl
whose first field iss
(if there is one) (define (assq s l) (cond [(null? l) #f] [else (if (eq? (car (car l)) s) (car l) (assq s (cdr l)))])) (equal? '(a 9) (assq 'a '((b 2) (c 3) (a 9) (e 11)))) (not (assq 'z '((b 2) (c 3) (a 9) (e 11))))
The function is another instance of our template for listprocessing
functions. The slightly unusual aspect is that assq
works with
lists inside of lists.
With assq
we can compose a program for looking up phone numbers:
;; Symbol > String
(define (lookup nickname)
(let ([r (assq nickname PHONEDIRECTORY)])
(if r
(stringappend (second r) ": " (third r))
(stringappend ( symbol> string nichname " not found")))))
This function looks up the nickname inn PHONEDIRECTORY
, checks
whether r
is #f
or some cons
cell, and creates
a string as an answer. Remember that if
, like cond
,
distinguishes #f
from all other values, not just #t
.
Unfortunately, it is not only easy to encode data as Sexpressions, it also easy to forget which subset of the class of Sexpressions was chosen for a data representation. Because of that, we propose to use data definitions that describe the structure of the chosen data explicitly. For example, an association list is defined as
An association list is a (Listof (cons Symbol X))
.
or, more verbose, as
An association list is one of:
null
(cons (cons Symbol X) L)
whereX
is any kind of value andL
is an association list.
We have also dealt with trees as Sexpressions already:
A tree (also known asThis is a generic tree definition, where proper data shows up at the leaves only.(Treeof X)
) is one of:
X
(list Tree Tree)
.
Sexpressions can also be used to encode Scheme's syntax:
An AST is one of:Since this definition is somewhat unusual, let's construct some examples and let's see how they represent Scheme expressions:
Symbol
Number
(list '+ AST AST)
(list ' AST AST)
.

a number represents a Scheme number, and
a symbol by itself represents a Scheme variable,
an Sexpression with a '+
in the first position and two
numbers is an additio expression, and
an Sexpression with a '
as the first symbol is a
subtraction expression. If it involves a symbol, it is a socalled variable
expression.
In general, we can find the represented expression by just stripping off the quote.
When data definitions become as large as the one for AST (or even larger), it is also a good practice to define selectors for the various pieces similar to the selectors that a structure definition introduces:
(define (additionleftoperand a) (second a)) (define (additionrightoperand a) (third a)) (define (subtractionleftoperand a) (second a)) (define (subtractionrightoperand a) (third a))
In this particular world, it may even make sense to use the same selectors for both kinds of complex expressions and additional selector for the operator:
(define (operator a) (car a)) (define (leftoperand a) (second a)) (define (rightoperand a) (third a))
In either case, the functions that process AST
s become more
readable and less dependent on the actually chosen data
representation. Switching to postfix representations or structures later
on is almost no problem, as long as we use the selectors and not the
concrete representation.
To design a function template for this subclass of Sexpressions, we proceed the same way we did for lists in section 4.3. First, we observe that the data definition has four clauses, which suggests that the function should have four clauses:
(define (functionforast anast) (cond [(symbol? anast) ...] [(number? anast) ...] [(eq? (operator anast) '+) ...] [(eq? (operator anast) ') ...]))
Second, we can also see that in clauses 3 and 4 the given AST
should have two more components:
(define (functionforast anast) (cond [(symbol? anast) ...] [(number? anast) ...] [(eq? (operator anast) '+) ... (leftoperand anast) ... (rightoperand anast) ...] [(eq? (operator anast) ') ... (leftoperand anast) ... (rightoperand anast) ...]))
Last but not least, we see that the data definition refers to itself for
the four subexpressions that we just added. This suggests that functions
on AST
s naturally recur for those four expressions:
(define (functionforast anast) (cond [(symbol? anast) ...] [(number? anast) ...] [(eq? (operator anast) '+) ... (functionforast (leftoperand anast)) ... ... (functionforast (rightoperand anast)) ...] [(eq? (operator anast) ') ... (functionforast (leftoperand anast)) ... ... (functionforast (rightoperand anast)) ...]))
Equipped with this template we can now write some functions on
AST
s:
The first function determines whether the given symbolic representation of a Scheme expression is free of variables, which are represented as symbols. It is straightforward instance of the template.
;;AST > Boolean
;; determine whetheranast
is free of symbols as leaves (define (novariables anast) (cond [(symbol? anast) #f] [(number? anast) #t] [(eq? (operator anast) '+) (and (novariables (leftoperand anast)) (novariables (rightoperand anast)))] [(eq? (operator anast) ') (and (novariables (leftoperand anast)) (novariables (rightoperand anast))])))
The second function substitutes symbols with values, which mimics
what a high school student and the Scheme evaluator do during the
evaluation of function applications. Note the use of backquote and comma
in the last two cond
clauses for the recreation of
(representations of) additions and subtractions.
;;AST Symbol Number > AST
;; replace the symbolx
withval
throughoutanast
(define (substitute anast x val) (cond [(symbol? anast) (if (eq? anast x) val anast)] [(number? anast) anast] [(eq? (operator anast) '+) `(+ ,(substitute (leftoperand anast)) ,(substitute (rightoperand anast)))] [(eq? (operator anast) ') `( ,(substitute (leftoperand anast)) ,(substitute (rightoperand anast))])))
The last function determines the value of an arithmetic AST
,
that is, an AST
that has no variables. Compare how this definition
uses the +
and 
functions rather than the '+
and '
symbols like the preceding definition.
;;AST > Number
;; replace the symbolx
withval
throughoutanast
(define (evaluate anast x val) (cond [(symbol? anast) (error 'evaluate "undefined identifier")] [(number? anast) anast] [(eq? (operator anast) '+) (+ (evaluate (leftoperand anast)) (evaluate (rightoperand anast)))] [(eq? (operator anast) ') ( (evaluate (leftoperand anast)) (evaluate (rightoperand anast))])))
As before, it is a bad idea to follow the function template blindly. On
occasion, we need to compose functions on AST
s rather than just
design new ones. Here is an example:
;;AST (Listof (list Symbol Number)) > Number
;; evaluateanast
after substituting all symbols inenv
;; with their corresponding numbers (define (eval anast env) (evaluate (substitute* anast env))) ;;AST (Listof (list Symbol Number)) > AST
;; substitute all symbols inanast
according to their association ;; inenv
(define (substitute* anast env) (cond [(null? anast) anast] [else (substitute (substitute* anast (cdr env)) (car (car env)) (cdr (car env)))])) (equal? '(+ ( 1 1) (+ 5 1)) (substitute* '(+ ( x 1) (+ y 1)) (list (cons x 1) (cons y 5)))) (equal? 6 (eval '(+ ( x 1) (+ y 1)) (list (cons x 1) (cons y 5))))
The function eval
composes a new function, substitute*
,
with an ``old'' function, evaluate
. If the given association list
contains a number for every symbol that occurs in anast
,
eval
will compute the same number as Scheme for the Sexpression.
A function like eval
is a part of Scheme's repertoire. Look in
Help Desk for details, because Scheme's eval
is far more powerful
than the one we just discussed.
In conclusion, selfreferential data definitions are easy to deal with in programs. All we need to do is write functions whose organization is similar to the organization of the data definition. Furthermore, you have just seen a very powerful, and a very selfserving, application of Sexpressions. In other parts of this book, you will see more practical applications of this important and old Scheme idea.