II Arbitrarily Large Data
Every data definition in Fixed-Size Data describes data of a fixed
size. To us, Boolean values, numbers, strings, and images are atomic;
computer scientists say they have a size of one unit. With a structure,
you compose a fixed number of pieces of data. Even if you use the language
of data definitions to create deeply nested structures, you always know the
exact number of atomic pieces of data in any specific instance. Many
programming problems, however, deal with an undetermined number of pieces
of information that must be processed as one piece of data. For example,
one program may have to compute the average of a bunch of numbers and
another may have to keep track of an arbitrary number of objects in an
interactive game. Regardless, it is impossible with your knowledge to
formulate a data definition that can represent this kind of information as
This part revises the language of data definitions so that it becomes
possible to describe data of (finite but) arbitrary size. For a concrete
illustration, the first half of this part deals with lists, a form of data
that appears in most modern programming languages. In parallel to the
extended language of data definitions, this part also revises the design
recipe to cope with such data definitions. The latter chapters demonstrate
how these data definitions and the revised design recipe work in a variety
You have probably not encountered self-referential definitions before. Your
English teachers certainly stay away from these, and many mathematics
courses are vague when it comes to such definitions. Programmers cannot
afford to be vague. Their work requires precision. While a definition may
in general contain several references to itself, this chapter presents
useful examples that need just one, staring with the one for lists.
The introduction of lists also spices up the kind of applications we can
study. While this chapter carefully builds up your intuition with
examples, it also motivates the revision of the design recipe in
the next chapter, which explains how to systematically create functions
that deal with self-referential data definitions.
8.1 Creating Lists
All of us make lists all the time. Before we go grocery shopping, we write
down a list of items we wish to purchase. Some people write down a to-do
list every morning. During December, many children prepare Christmas wish
lists. To plan a party, we make a list of invitees. Arranging
information in the form of lists is a ubiquitous part of our life.
Given that information comes in the shape of lists, we must clearly learn
how to represent such lists as BSL data. Indeed, because lists are so
important BSL comes with built-in support for creating and manipulating
lists, similar to the support for Cartesian points (posn). In contrast to points,
the data definition for lists is always left to you. But first things
first. We start with the creation of lists.
When we form a list, we always start out with the empty list. In BSL, we
represent the empty list with
which is pronounced “empty,” short for “empty list.” Like
is just a constant. When we add
something to a list, we construct another list; in BSL, the
operation serves this purpose. For example,
constructs a list from the '()
list and the string
. Figure 43
presents this list in the
same pictorial manner we used for structures. The box for cons
has two fields: first
. In this specific example
field contains "Mercury"
and the rest
field contains '()
Figure 43: Building a list
Once we have a list with one item in it, we can construct lists
with two items by using cons
again. Here is one:
And here is another:
The middle row of figure 43
shows how you can imagine
lists that contain two items. It is also a box of two fields, but this
time the rest
field contains a box. Indeed, it contains the box
from the top row of the same figure.
Finally, we construct a list with three items:
The last row of figure 43
illustrates the list with three
items. Its rest
field contains a box that contains a box
again. So, as we create lists we put boxes into boxes into boxes,
etc. While this may appear strange at first glance, it is just like a set
of Chinese gift boxes or a set of nested drinking cups, which we sometimes
get for birthdays. The only difference is that BSL programs can nest
lists much deeper than any artist could nest physical boxes.
Figure 44: Drawing a list
Because even good artists would have problems with drawing deeply nested
structures, computer scientists resort to box-and-arrow diagrams
instead. Figure 44 illustrates how to re-arrange the last row
of figure 43. Each cons structure becomes a
separate box. If the rest field is too complex to be drawn inside of the
box, we draw a bullet instead and a line with an arrow to the box that it
contains. Depending on how the boxes are arranged you get two kinds of
diagrams. The first, displayed in the top row of figure 44,
lists the boxes in the order in which they are created. The second,
displayed in the bottom row, lists the boxes in the order in which they
are consed together. Hence the second diagram immediately tells
you what first would produced when applied to the list, no matter
how long the list is. For this reason, people tend to prefer the second
129. Create BSL lists that represent
a list of celestial bodies, say, at least all the planets in our solar system;
a list of items for a meal, for example, steak, French fries, beans, bread, water,
brie cheese, and ice cream; and
a list of colors.
Sketch box representations of these lists, similar to those in
and figure 44
. Which of the sketches
do you like better?
You can also make lists of numbers. Here is a list with the 10 digits:
To build this list requires 10 list constructions and one
'(). For a list of three arbitrary numbers, for example,
In general a list does not have to contain values of one kind, but may
contain arbitrary values:
The firstThen again, if this list is supposed to represent a
record with a fixed number of pieces, use a structure type instead. item
of this list is a string, the second one is a number, and the last one a
Boolean. You may consider this list as the representation of a personnel
record with three pieces of data: the name of the employee, the number of
years spent at the company, and whether the employee has health insurance
through the company. Or, you could think of it as representing a virtual
player in some game. Without a data definition, you just can’t know what
data is all about.
Here is a first data definition that involves cons
Of course, this data definition uses cons
like others use
constructors for structure instances, and in a sense, cons
just a special constructor. What this data definition fails to demonstrate
is how to form lists of arbitrary length: lists that contain
nothing, one item, two items, ten items, or perhaps 1,438,901 items.
So let’s try again:
|; A List-of-names is one of: |
|; – '()|
|; – (cons String List-of-names)|
|; interpretation a list of invitees, by last name|
Take a deep breath, read it again. This data definition is one
of the most unusual definitions you have ever encountered—you
have never before seen a definition that refers to itself. It isn’t
even clear whether it makes sense. After all, if you had told your
English teacher that “a table is a table” defines the word “table,”
the most likely response would have been “Nonsense!” because a
self-referential definition doesn’t explain what a word means.
In computer science and in programming, though, self-referential
definitions play a central role, and with some care, such definitions
actually do make sense. Here “making sense” means that we can use the
data definition for what it is intended for, namely, to generate examples of
data that belong to the class that is being defined or to check
whether some given piece of data belongs to the defined class of
data. From this perspective, the definition of List-of-names
complete sense. At a minimum, we can generate '()
one example, using the first clause in the itemization. Given
as an element of List-of-names
, it is easy to
make a second one:
Here we are using a String
and the only list from
to generate a piece of data according to the second
clause in the itemization. With the same rule, we can generate many more
lists of this kind:
|(cons "Flatt" '())|
|(cons "Felleisen" '())|
|(cons "Krishnamurthi" '())|
And while these lists all contain one name (represented as a
), it is actually possible to use the second line of the data
definition to create lists with more names in them:
Exercise 130. Create an element of List-of-names that
contains five Strings. Sketch a box representation of the list
similar to those found in figure 43.
Exercise 131. Provide a data definition for representing lists of
Boolean values. The class contains all arbitrarily long lists of
8.2 What Is '(), What Is cons
Let’s step back for a moment and take a close look at '() and
cons. As mentioned, '() is just a constant. When
compared to constants such as 5 or "this is a string",
it looks more like a function name or a variable; but when compared with
#true and #false, it should be easy to see that it really
is just BSL’s representation for empty lists.
As for our evaluation rules, '()
is a new kind of atomic value,
distinct from any other kind: numbers, Booleans, strings, and so on. It
also isn’t a compound value, like Posn
s. Indeed, '()
so unique, it belongs into a class of values all by itself. As such, this
class of values comes with a predicate that recognizes only '()
and nothing else:
Like all predicates, empty?
can be applied to any value
from the universe of BSL values. It produces #true
when it is applied to '()
Next we turn to cons
. Everything we have seen so far suggests that
is a constructor just like those introduced by structure
type definitions. More precisely, cons
appears to be the
constructor for a two-field structure: the first one for any kind of value
and the second one for an list-like value. The following definitions
translate this idea into BSL:
The only catch is that our-cons
accepts all possible BSL values
for its second argument and cons
doesn’t, as the following
|> (cons 1 2)|
cons: second argument must be a list, but received 1 and 2
is a checked constructor function, you may be wondering
how to extract the pieces from the resulting structure. After all,
says that programming with structures requires
selectors. Since a cons
structure has two fields, there are two
. They are also easily defined
in terms of our pair
Stop! Define our-rest.
If your program can access the structure type definition for pair,
it is easy to create pairs that don’t contain '() or
another pair in the right field. Whether such bad
instances are created intentionally or accidentally, they tend to break
functions and programs in strange ways. BSL therefore hides the actual
structure type definition for cons to avoid these
problems. Local Definitions demonstrates one way how your
programs can hide such definitions, too, but for now, you don’t need this
a special value, mostly to represent the empty list
a predicate to recognize '() and nothing else
a checked constructor to create two-field instances
the selector to extract the last item added
the selector to extract the extended list
a predicate to recognizes instances of cons
Figure 45: List primitives
Figure 45 summarizes this section. The key insight is that
'() is a unique value and that cons is a checked
constructor that produces list values. Furthermore, first,
rest, and cons? are merely distinct names for the usual
predicate and selectors. What this chapter teaches then is not a
new way of creating data but a new way of formulating data definitions.
8.3 Programming with Lists
Say you’re keeping track of your friends with some list, and say the list
has grown so large that you need a program to determine whether
someHere we use the word “friend” in the sense of social
networks, not the real world. name is on the list. To make this idea
concrete, let us state it as an exercise:
Sample Problem You are working on the contact list for some new cell
phone. The phone’s owner updates and consults this list on various
occasions. For now, you are assigned the task of designing a function that
consumes this list of contacts and determines whether it contains the name
Once we have a solution to this sample problem, we will generalize it to a
function that finds any name on a list.
The data definition for List-of-names
from the preceding is
appropriate for representing the list of names that the function is to
search. Next we turn to the header material:
While a-list-of-names is a good name for the list of names that
the function consumes, it is a mouthful and we therefore shorten it to
Following the general design recipe, we next make up some examples that
illustrate the purpose of contains-flatt?. First, we determine the
output for the simplest input: '(). Since this list does not
contain any strings, it certainly does not contain "Flatt":
Then we consider lists with a single item. Here are two examples:
In the first case, the answer is #false because the single item on the
list is not "Flatt"; in the second case, the single item is
"Flatt", so the answer is #true. Finally, here is a
more general example:
Again, the answer case must be #true because the list contains
"Flatt". Stop! Make a general example for which the answer must be
Take a breath. Run the program. The header is a “dummy” definition for
the function; you have some examples; they have been turned into tests; and
best of all, some of them actually succeed. They succeed for the wrong
reason but succeed they do. If things make sense now, read on.
The fourth step is to design a function template that matches the data
definition. Since the data definition for lists of strings has two
clauses, the function’s body must be a cond
expression with two
clauses. The two conditions determine which of the two kinds of
lists the function received:
Instead of (cons? alon)
, we could use else
the second clause.
We can add one more hint to the template by studying each clause of the
expression in turn. Specifically, recall that the design
recipe suggests annotating each clause with selector expressions if the
corresponding class of inputs consists of compounds. In our case, we know
does not have compounds, so there are no
components. Otherwise the list is constructed from a string and
another list of strings, and we remind ourselves of this fact by adding
and (rest alon)
Now it is time to switch to the programming task proper, the fifth step of
our design recipe. It starts from template and deals with each
-clause separately. If (empty? alon)
is true, the input is the empty list, in which case the function must
produce the result #false
. In the second case, (cons? alon)
is true. The annotations in the template remind us that
there is a first string and the rest of the list. So let us consider an
example that falls into this category:
The function, like a person, must compare the first item with
"Flatt". In this example, the first one is "A" and not
"Flatt", so the comparison yields #false. If we had
considered some other example instead, say,
the function would determine that the first item on the input is
, and would therefore respond with #true
implies that the second line in the cond
expression should contain
an expression that compares the first name on the list with
Furthermore, if the comparison yields #true
, the function must
. If the comparison yields #false
, we are
left with another list of strings: (rest alon)
. Clearly, the function can’t know the final answer in
this case, because the answer depends on what “...” represents. Put
differently, if the first item is not "Flatt"
, we need some way
to check whether the rest of the list contains "Flatt"
Fortunately, we have such a function: contains-flatt?. According
to its purpose statement, it determines whether a list contains
"Flatt". The statement implies that, (contains-flatt? l)
tells us whether the list of strings l contains
"Flatt". And, in the same vein, (contains-flatt? (rest alon)) determines whether "Flatt" is a member of (rest alon),
which is precisely what we need to know.
In short, the last line should be (contains-flatt? (rest alon))
The trick is now to combine the values of the two expressions in the
appropriate manner. As mentioned, if the first expression yields
#true, there is no need to search the rest of the list; if it is
#false, though, the second expression may still yield
#true, meaning the name "Flatt" is on the rest of the
list. All of this suggests that the result of (contains-flatt? alon) is #true if either the first expression in the last line
or the second expression yields #true.
Figure 46: Searching a list
Figure 46 then shows the complete definition. Overall
it doesn’t look too different from the definitions in the first chapter of
the book. It consists of a signature, a purpose statement, two examples,
and a definition. The only way in which this function definition differs
from anything you have seen before is the self-reference, that is, the
reference to contains-flatt? in the body of the
define. Then again, the data definition is self-referential, too,
so in some sense the self-reference in the function shouldn’t be be too surprising.
132. Use DrRacket to run contains-flatt?
in this example:
What answer do you expect?
133. Here is another way of formulating the second
Explain why this expression produces the same answers as the or
expression in the version of figure 46
. Which version is
Exercise 134. Develop the function contains?.
It determines whether some given string occurs on a given list of strings.
BSL actually comes with member?
, a function that
consumes two values and checks whether the first occurs in the second,
Don’t use member?
to define the contains?
8.4 Computing with Lists
Since we are still using BSL, the rules of algebra—see the first
intermezzo—tell us how to determine the value of expressions such as
(contains-flatt? (cons "Flatt" (cons "C" '())))
without DrRacket. Programmers must have an intuitive understanding of how
this kind of calculation works, so we step through the one for this
The first step uses the usual substitution rule—or the beta rule as the first
intermezzo calls it—to determine the value of an application. The result
is a conditional expression, because, as an algebra teacher would say, the
function is defined in a step-wise fashion. So here is how it continues:
To find the correct clause of the cond
expression, we must
determine the value of the conditions, one by one. Since a cons
list isn’t empty, the first condition’s result is #false
therefore eliminate the first cond
clause. Finally the condition
in the second clause evaluates to #true
ed list holds. From here, it is just three more steps of
arithmetic to get the final result:
First, (first (cons "Flatt" ...))
evaluates to "Flatt"
due to the laws for first
. Second, "Flatt"
is a string
and equal to "Flatt"
. Third, (or #true X)
regardless of what X
135. Use DrRacket’s stepper to check the
(contains-flatt? (cons "Flatt" (cons "C" '())))
Also use the stepper to determine the value of
What happens when "Flatt"
is replaced with "B"
136. Validate with DrRacket’s stepper
|(our-first (our-cons "a" '())) == "a"|
|(our-rest (our-cons "a" '())) == "a"|
9 Designing with Self-Referential Data Definitions
Figure 47: Arrows for self-references in data definitions and templates
At first glance, self-referential data definitions seem to be far more
complex than those for mixed data. But, as the example of
contains-flatt? shows, the six steps of the design recipe still
work. Nevertheless, in this section we generalize the design recipe so that
it works even better for self-referential data definitions. The new parts
concern the process of discovering when a self-referential data definition
is needed, deriving a template, and defining the function body:
If a problem statement is about information of arbitrary size, you
need a self-referential data definition to represent it. At this point,
you have seen only one such class, List-of-names. The left side of
figure 47 shows how to define
List-of-strings in the same way. Other lists of atomic data work
the same way.
Numbers also seem to be arbitrarily large. For inexact
numbers, this is an illusion. For precise integers, this is indeed the
case. Dealing with integers is therefore a part of this chapter.
For a self-referential data definition to be valid, it must satisfy two
conditions. First, it must contain at least two clauses. Second, at least
one of the clauses must not refer back to the class of data that is being
defined. It is good practice to identify the self-references explicitly with
arrows from the references in the data definition back to the term being
defined; see figure 47 for an example of such an
You must check the validity of self-referential data definitions with the
creation of examples. Start with the clause that does not refer to the data
definition; continue with the other one, using the first example where the
clause refers to the definition itself. For the data
definition in figure 47
, you thus get lists like
the following three:
by the first clause
|(cons "a" '())|
by the second clause, previous example
by the second clause, previous example
If it is impossible to generate examples from the data definition, it is
invalid. If you can generate examples but you can’t see how to generate
increasingly larger examples, the definition may not live up to its
Nothing changes about the header material: the signature, the purpose
statement, and the dummy definition. When you do formulate the purpose
statement, focus on what the function computes not how it goes
about it, especially not how it goes through instances of the given data.
Here is an example to make this design recipe concrete:
The purpose statement clearly states that the function just counts the
strings on the given input; there is no need to think ahead about how you
might formulate this idea as a BSL function.
When it comes to functional examples, be sure to work through inputs
that use the self-referential clause of the data definition several
times. It is the best way to formulate tests that cover the entire
function definition later.
For our running example, the purpose statement almost generates functional
examples by itself from the data examples:
(cons "a" '())
(cons "b" (cons "a" '()))
The first row is about the empty list, and we know that empty
list contains nothing. The second row is a list of one string, so
1 is the desired answer. The last row is about a list of two
At the core, a self-referential data definition looks like a data
definition for mixed data. The development of the template can therefore
proceed according to the recipe in Itemizations and Structures. Specifically, we
formulate a cond expression with as many cond clauses as
there are clauses in the data definition, match each recognizing condition
to the corresponding clause in the data definition, and write down
appropriate selector expressions in all cond lines that process
Does the data definition distinguish among different sub-classes of data?
Your template needs as many cond clauses as sub-classes that
the data definition distinguishes.
How do the sub-classes differ from each other?
Use the differences to formulate a condition per clause.
Do any of the clauses deal with structured values?
If so, add appropriate selector expressions to the clause.
Does the data definition use self-references?
Formulate “natural recursions” for the template to represent the
self-references of the data definition.
If the data definition refers to some other data definition,
where is this cross-reference to another data definition?
Specialize the template for the other data definition. Refer to this
template. See Designing with Itemizations, Again,
steps 4 and 5 of the design recipe.
Figure 48: How to translate a data definition into a template
Figure 48 expresses this idea as a question-and-answer
game. In the left column it states questions about the data definition for
the argument, and in the right column it explains what the answer means for
the construction of the template.
If you ignore the last row and apply the first three questions to any
function that consumes a List-of-strings
, you arrive at this shape:
Recall, though, that the purpose of a template is to express the data
definition as a function layout. That is, a template expresses as code what
the data definition for the input expresses as a mix of English and
BSL. Hence all important pieces of the data definition must find a
counterpart in the template, and this guideline should also hold when a
data definition is self-referential—contains an arrow from inside the
definition to the term being defined. In particular, when a data definition
is self-referential in the ith clause and the kth field of
the structure mentioned there, the template should be self-referential in
the ith cond clause and the selector expression for the
kth field. For each such selector expression, add an arrow back to
the function parameter. At the end, your template must have as many arrows
as we have in the data definition.
Figure 47 illustrates this idea with the template
for functions that consume List-of-strings shown side by side with
the data definition. Both contain one arrow that
originates in the second clause—the rest field and selector,
respectively—and points back to the top of the respective definitions.
Since BSL and most programming languages are text-oriented, you must use
an alternative to the arrow, namely, a self-applications of the function to the
appropriate selector expression:
We refer to a self-use of a function as recursion and in the
first four parts of the book as natural recursion.
For the function body we start with thoseFor the
curious among our readers, the design recipe for arbitrarily large data
corresponds to so-called “proofs by induction” in mathematics and the
“leap of faith” represents the use of the induction hypothesis
for the inductive step of such a proof. Logic proves the
validity of this proof technique with an Induction Theorem.
cond lines without recursive function calls, known as
base cases.The corresponding answers are typically easy
to formulate or already given as examples.
Then we deal with the self-referential cases. We start by reminding
ourselves what each of the expressions in the template line computes. For
the natural recursion we assume that the function already works as
specified in our purpose statement. This last step is a leap of faith, but
as you will see, it always works.
The rest is then a matter of combining the various values.
What are the answers for the non-recursive cond clauses?
The examples should tell you which values you need here. If not,
formulate appropriate examples and tests.
What do the selector expressions in the recursive clauses compute?
The data definitions tell you what kind of data these expressions
extract and the interpretations of the data definitions tell you what
this data represents.
What do the natural recursions compute?
Use the purpose statement of the function to determine what the
value of the recursion means not how it computes this answer. If
the purpose statement doesn’t tell you the answer, improve the purpose
How can the function combine these values to get the desired answer?
Find a function in BSL that combines the values. Or, if that
doesn’t work, make a wish for a helper function. For many functions, this
last step is straightforward. The purpose, the examples, and the template
together tell you which function or expression “combines” the available
values into the proper result. We refer to this function or expression as
combinator, slightly abusing existing terminology.
Figure 49: How to turn a template into a function definition
So, if you are stuck here, ...
... arrange the examples from the third step in a table.
Place the given input in the first column and the desired output
in the last column. In the intermediate columns enter the values of the
selector expressions and the natural recursion(s). Add examples until you
see a pattern emerge that suggests a combinator.
If the template refers to some other template,
what does the auxiliary function compute?
Consult the other function’s purpose statement and examples to
determine what it computes and assume you may use the result even if you
haven’t finished the design of this helper function.
Figure 50: Turning a template into a function, the table method
formulates the first four questions and
answers for this step. Let’s use this game to complete the definition of
. Renaming the fun-for-los
template to how-many
gives us this much:
As the functional examples already suggest, the answer for the base case
. The two expressions in the second clause compute the
item and the number of strings in (rest alos)
compute how many strings there are on all of alos
, the function just needs
to add 1
to the value of the latter expression:
Felix Klock suggested this table-based approach to
guessing the combinator.
Finding the correct way to combine the values into the desired answer isn’t
always as easy. Novice programmers often get stuck with this step. As figure 50
suggests, it is a good idea to arrange the
functional examples into a table that also spells out the values of the
expressions in the template. Figure 51
shows what this
table looks like for our how-many
example. The leftmost column
lists the sample inputs, while the rightmost column contains the desired
answers for these inputs. The three columns in between show the values of
the template expressions: (first alos)
, (rest alos)
, and (how-many (rest alos))
, which is the natural
recursion. If you stare at this table long enough, you recognize that the
result column is always one more than the values in the natural recursion
column. You may thus guess that
(+ (how-many (rest alos)) 1)
is the expression that computes the desired result. Since DrRacket is fast at
checking these kinds of guesses, plug it in and click RUN. If the
examples-turned-into-tests pass, think through the expression to convince
yourself it works for all lists; otherwise add more example rows to the
table until you have a different idea.
The table also points out that some selector expressions in the template
are possibly irrelevant for the actual definition. Here (first alos) is not needed to compute the final answer—which is quite a contrast
to contains-flatt?, which uses both expressions from the template.
As you work your way through the rest of this book, keep in mind that, in
many cases, the combination step can be expressed with BSL’s primitives,
say, +, and, or cons. In some cases, though, you
may have to make a wish, that is, design an auxiliary function. Finally, in
yet other cases, you may need nested conditions.
Finally, make sure to turn all examples into tests, that these tests
pass, and that running them covers all the pieces of the function.
Here are our examples for how-many turned into tests:
Remember it is best to formulate examples directly as tests and BSL
allows this. Doing so also helps if you need to resort to the table-based
guessing approach of the preceding step.
Figure 51: Tabulating arguments, intermediate values, and results
Figure 52 summarizes the design recipe of this section in a
tabular format.The first column names the steps
of the design recipe, the second the expectedYou may want to copy figure 52
onto one side of an index card and write down your favorite versions of
the questions and answers for this design recipe onto the back of it. Then
carry it with you for future reference. results of each step. In the
third column, we describe the activities that get you there. The figure is
tailored to the kind of self-referential list definitions we use in this
chapter. As always, practice helps you master the process, so we strongly
recommend that you tackle the following exercises, which ask you to apply
the recipe to several kinds of examples.
develop a data representation for the information;
create examples for specific items of information
and interpret data as information;
signature; purpose; dummy definition
write down a signature using defined names;
formulate a concise purpose statement;
create a dummy function that produces a constant value from the specified range
examples and tests
work through several examples, at least one per
clause in the data definition
translate the data definition into a template:
one cond clauses per data clause;
selectors where the condition identifies a structure;
one natural recursion per self-reference
find a function that combines the values of the expressions in the cond
clauses into the expected answer
turn them into check-expect tests and run them
Figure 52: Designing a function for self-referential data
9.1 Finger Exercises: Lists
Exercise 137. Compare the template for contains-flatt? with
the one for how-many. Ignoring the function name, they are the
same. Explain the similarity.
138. Here is a data definition for representing sequences
of amounts of money:
Create some examples to make sure you understand the data definition. Also add
an arrow for the self-reference.
Design the sum function, which consumes a List-of-amounts
and computes the sum of the amounts. Use DrRacket’s stepper to see how
(sum l) works for a short list l in
139. Now take a look at this data definition:
Some elements of this class of data are appropriate inputs for sum
and some aren’t.
Design the function pos?, which consumes a List-of-numbers
and determines whether all numbers are positive numbers. In other words,
if (pos? l) yields #true, then l is an element
of List-of-amounts. Use DrRacket’s stepper to understand how
pos? works for (cons 5 '()) and (cons -1 '()).
Also design checked-sum. The function consumes a
List-of-numbers. It produces their sum if the input also belongs to
List-of-amounts; otherwise it signals an error. Hint Recall
to use check-error.
What does sum compute for an element of List-of-numbers?
Exercise 140. Design the function all-true, which consumes
a list of Boolean values and determines whether all of them are
#true. In other words, if there is any #false on the
list, the function produces #false.
Now design one-true, a function that consumes a list of Boolean
values and determines whether at least one item on the list is
Employ the table-based approach to coding. It may help with the base case.
Use DrRacket’s stepper to see how these functions process the lists
(cons #true '()), (cons #false '()), and
(cons #true (cons #false '())).}
141. If you are asked to design the function
, which consumes a list of strings and appends them all into
one long string, you are guaranteed to end up with this partial
|(cat (rest l))|
Figure 53: A table for cat
Fill in the table in figure 53. Guess a function that can
create the desired result from the values computed by the sub-expressions.
Use DrRacket’s stepper to evaluate (cat (cons "a" '())).
Exercise 142. Design ill-sized?. The function consumes a
list of images loi and a positive number n. It produces
the first image on loi that is not an n by n
square; if it cannot find such an image, it produces #false.
|; ImageOrFalse is one of:|
|; – Image|
|; – #false |
for the result part of the signature.
9.2 Non-empty Lists
Now you know enough to use cons and to create data definitions for
lists. If you solved (some of) the exercises at the end of the preceding
section, you can deal with lists of various flavors of numbers, lists of
Boolean values, lists of images, and so on. In this section we continue to explore
what lists are and how to process them.
Let’s start with the simple-looking problem of computing the average of a list
of temperatures. To simplify, we provide the data definitions:
For our intentions, you should think of temperatures as plain numbers, but the
second data definition reminds you that in reality not all numbers are
temperatures and you should keep this in mind.
The header material is straightforward:
Making up examples for this problem is also easy, and so we just formulate
The expected result is of course the sum of the temperatures divided by the
number of temperatures.
A moment’s thought tells you that the template for average should
be similar to the ones we have seen so far:
The two cond
clauses mirror the two clauses of the
data definition; the questions distinguish empty lists from non-empty
lists; and the natural recursion is needed because of the self-reference
in the data definition.
It is way too difficult, however, to turn this template into a working
function definition. The first cond clause needs a number that
represents the average of an empty collection of temperatures, but there
is no such number. Even if we ignore this problem for a moment, the second
clause demands a function that combines a temperature and an average for
many other temperatures into another average. Although it is isn’t
impossible to compute this average, it is not what you learned to do and
it isn’t natural.
When we compute the average of a bunch of temperatures, we divide their sum
by the number of temperatures. We said so when we formulated our trivial
little example. This sentence, however, suggests that average
a function of three tasks: division, summing, and counting. Our guideline
from Fixed-Size Data
tells us to write one function per task and if
we do so, the design of average is obvious:
The last two function definitions are wishes of course for which we need
to design complete definitions. Doing so is fortunately easy because
from above works for List-of-strings
(why?) and because the design of sum
follows the same old routine:
Stop! Use the example for average to create one for sum
and ensure that the test runs properly. Then run the tests
When you read this definition of average now, it is obviously correct
simply because it directly corresponds to what everyone learns about
averaging in school. Still, programs run not just for us but for others. In
particular, others should be able to read the signature and use the function
and expect an informative answer. But, our definition of average
does not work for empty lists of temperatures.
Exercise 143. Determine how average behaves in DrRacket when
applied to the empty list. Then design
checked-average, a function that produces an informative error
message when it is applied to '().
In mathematics, we would say exercise 143 shows that
average is a partial function because it raises an error
An alternative solution is to inform future readers through the signature that
average doesn’t work for empty lists. For that, we need a
data representation for lists that excludes '(), something like this:
The questions is with what we should replace “???” so that
the '() list is excluded but all other lists of temperatures
are still constructable. One hint is that while the empty list is the
shortest list, any list of one temperature is the next shortest list. In
turn, this suggests that the first clause should describe all possible
lists of one temperature:
While this definition differs from the preceding list definitions, it
shares the critical elements: a self reference and a clause that does
use a self-reference. Strict adherence to the design recipe
demands that you make up some examples of NEList-of-temperatures
ensure that the definition makes sense. As always, you should start with
the base clause, meaning the example must look like this:
Let us now return to the problem of designing average
everyone knows it is for non-empty lists only. With the definition of
we now have the means to say what we want in
the signature:This alternative development explains that, in
this case, we can narrow down the domain of average and create a
Naturally the rest remains the same: the purpose statement, the
example-test, and the function definition. After all, the very idea of
computing the average assumes a non-empty collection of numbers, and that
was the entire point of our discussion.
Exercise 144. Will sum and how-many work for
NEList-of-temperatures even though they are designed for inputs from
List-of-temperatures? If you think they don’t work, provide
counter-examples. If you think they would, explain why.
Nevertheless the definition also raises the question how to design
because they consume instances of
now. Here is the obvious result of the first three
steps of the design recipe:
The example is adapted from the example for average; the dummy
definition produces a number, but the wrong one for the given test.
The fourth step is the most interesting part of the design of sum
. All preceding examples of design demand a
template that distinguishes empty lists from non-empty, i.e., cons
lists because the data definitions have an appropriate shape. This is not true for
. Here both clauses add cons
ed lists. The
two clauses differ, however, in the rest
field of these lists. In
particular, the first clause always uses '()
in the rest
and the second one uses cons
instead. Hence the proper condition
to distinguish data the first kind of data from the second extracts the
field and then uses empty?
As always, else
is just a replacement for (cons? (rest ne-l))
in the second clause.
Next you should inspect both clauses and determine whether one or both of them deal
as if it were a structure. This is of course the case, which
the unconditional use of rest
differently, add appropriate selector expressions to the two clauses:
Before you read on, explain why the first clause does not contain the selector
expression (rest ne-l)
The final question of the template design concerns self-references in the data
definition. As you know, NEList-of-temperatures contains one and therefore
the template for sum demands one recursive use:
is called on (rest ne-l)
in the second
clause because the data definition is self-referential at the analogous point.
For the fifth design step, let us understand how much we already
have. Since the first cond
clause looks significantly simpler
than the second one with its recursive function call, you should start
with that one. In this particular case, the condition says that
is applied to a list with exactly one temperature,
. Clearly, this one temperature is the sum of all
temperatures on the given list:
The second clause says that the list consists of a temperature and at
least one more; (first ne-l)
extracts the first
position and (rest ne-l)
the remaining ones. Furthermore, the
template suggests to use the result of (sum (rest ne-l))
is the function that you are defining, and you can’t possibly
it uses (rest ne-l)
. All you do know is what
the purpose statement says, namely, that sum
adds all the
temperatures on the given list, which is (rest ne-l)
. If this
statement is true, then
(sum (rest ne-l))
adds up all but one of the numbers of
. To get the total, the function just has to add the first
If you now run the example/test for this function, you will see that the
leap of faith is justified. Indeed, for reasons beyond this book, this
leap of faith is always justified, which is why it is an inherent
part of the design recipe.
Exercise 145. Design sorted>?. The function consumes a
NEList-of-temperatures. It produces #true if the
temperatures are sorted in descending order, that is, if the second is
smaller than the first, the third smaller than the second, and so
on. Otherwise it produces #false.
Hint This problem is another one where the table-based method for
guessing the combinator works well. Here is a partial table for a number of
examples in figure 54. Fill in the rest of the table. Then
try to create an expression that computes the result from the pieces.
|(cons 2 '())|
Figure 54: A table for sorted>?
Exercise 146. Design how-many for NEList-of-temperatures.
Doing so completes average, so ensure that average
passes all of its tests, too.
Exercise 147. Develop a data definition for
NEList-of-Booleans, a representation of non-empty lists of
Boolean values. Then re-design the functions all-true and
one-true from exercise 140.
Exercise 148. Compare the function definitions from this section
(sum, how-many, all-true, one-true) with
the corresponding function definitions from the preceding sections. Is it
better to work with data definitions that accommodate empty lists as opposed to
definitions for non-empty lists? Why? Why not?
9.3 Natural Numbers
The BSL programming language supplies many functions that consume lists and a
few that produce them, too. Among those is make-list
, which consumes a
together with some other value v
produces a list that contains v n
times. Here are some
|> (make-list 2 "hello")|
(cons "hello" (cons "hello" '()))
|> (make-list 3 #true)|
(cons #true (cons #true (cons #true '())))
|> (make-list 0 17)|
In short, even though this function consumes atomic data, it produces
arbitrarily large pieces of data. Your question should be how this is possible.
Our answer is that make-list
’s input isn’t just a number, it is a
special kind of number. In kindergarten you called these numbers “counting
numbers,” i.e., these numbers are used to count objects. In computer science,
these numbers are dubbed natural numbers
. Unlike regular numbers,
natural numbers come with a data definition:
|; A N is one of: |
|; – 0|
|; – (add1 N)|
|; interpretation represents the counting numbers|
The first clause says that 0
is a natural number; it is of course
used to say that there is no object to be counted. The second clause tells
you that if n
is a natural number, then n+
is one too,
is a function that adds 1
number it is given. We could write this second clause as (+ n 1)
but the use of add1
is supposed to signal that this addition is
What is special about this use of add1, is that it acts more like a
constructor from some structure type definition than a regular function. For
that reason, BSL also comes with the function sub1, which is the
“selector” corresponding to add1. Given any natural number m
not equal to 0, you can use sub1 to find out the number that
went into the construction of m. Put differently, add1 is like
cons and sub1 is like first and rest.
At this point you may wonder what the predicates are that distinguish 0
from those natural numbers that are not 0. There are two, just as for
lists: zero?, which determines whether some given number is
0, and positive?, which determines whether some number is
larger than 0.
Now you are in a position to design functions on natural numbers, such as
, yourself. The data definition is already available, so
let us add the header material:
Developing the template is the next step. The questions for the template suggest
’s body is a cond
expression with two clauses: one
and one for positive numbers. Furthermore, 0
considered atomic and positive numbers are considered structured values,
meaning the template needs a selector expression in the second clause. Last but
not least, the data definition for N
is self-referential in the second
clause. Hence the template needs a recursive application to the correspond
selector expression in the second clause:
Figure 55: Creating a list of copies
contains a complete definition of the copier
function, as obtained from its template. Let us reconstruct this step
carefully. As always, we start with the cond
clause that has no
recursive calls. Here the condition tells us that the (important) input is
and that means the function must produce a list with
items, that is, none
. Of course, working through the
second example has already clarified this case. Next we turn to the other
clause and remind ourselves what its expressions compute:
(sub1 n) extracts the natural number that went into the
construction of n, which we know is larger than 0;
(copier (sub1 n) s) produces a list of (sub1 n)
strings s according to the purpose statement ofcopier.
But the function is given n
and must therefore produce a list
. Given a list with one too few strings,
it is easy to see that the function must simply cons
onto the result of (copier (sub1 n) s)
. And that is
precisely what the second clause specifies.
At this point, you should run the tests to ensure that this function works
at least for the two worked examples. In addition, you may wish to use the
function on some additional inputs.
Exercise 149. Does copier function properly when you apply it to
a natural number and a Boolean or an image? Or do you have to design another
function? Read Similarities Everywhere for an answer.
An alternative definition of copier
might use else
How do copier
behave when you apply them to
? Explain. Use DrRacket’s stepper to confirm
150. Design the function add-to-pi
. It consumes a
natural number n
and adds it to pi without
the primitive +
operation. Here is a start:
Once you have a complete definition, generalize the function to
, which adds a natural number n
to some arbitrary
without using +
. Why does the skeleton use
Exercise 151. Design the function multiply. It consumes a
natural number n and multiplies it with a number x
without using *.
Use DrRacket’s stepper to evaluate (multiply 3 x) for any x
you like. How does multiply relate to what you know from grade
Exercise 152. Design two functions: col and row.
The function col consumes a natural number n and an image
img. It produces a column—a vertical arrangement—of n
copies of img.
The function row consumes a natural number n and an image
img. It produces a row—a horizontal arrangement—of n
copies of img.
Figure 56: Random attacks
Exercise 153. The goal of this exercise is to visualizes the
result of a 1968-style European student riot. Here is the rough idea. A
small group of students meets to make paint-filled balloons, enters some
lecture hall and randomly throws the balloons at the attendees. Your
program displays how the balloons color the seats in the lecture hall.
Use the two functions from exercise 152 to create a rectangle of 8 by
18 squares, each of which has size 10 by 10. Place it in an
empty-scene of the same size. This image is your lecture hall.
Design add-balloons. The function consumes a list of Posn
whose coordinates fit into the dimensions of the lecture hall. It produces
an image of the lecture hall with red dots added as specified by the
Figure 56 shows the output of our solution when
given some list of Posns. The leftmost is the clean lecture hall,
the second one is after two balloons have hit, and the last one is a
highly unlikely distribution of 10 hits. Where is the 10th?
9.4 Russian Dolls
Wikipedia defines a Russian doll as “ a set of dolls of decreasing sizes
placed one inside the other.” The paragraph is accompanied by an
appropriate picture of dolls:
Of course, here the dolls are taken apart so that the viewer can see them
The problem may strike you as abstract or even absurd;
it isn’t clear why you would want to represent Russian dolls or
what you would do with such a representation. Just play along for now.
Now consider the problem of representing such Russian dolls with BSL data.
With a little bit of imagination, it is easy to see that an artist can
create a nest of Russian dolls that consists of an arbitrary number of
dolls. After all, it is always possible to wrap another layer around some
given Russian doll. Then again, you also know that deep inside there is a
solid doll without anything inside.
For each layer of a Russian doll, we could care about many different
things: its size, though it is related to the nesting level; its color;
the image that is painted on the surface; and so on. Here we just pick
one, namely the color of the doll, which we represent with a string.
Given that, we know that each layer of the Russian doll has two
properties: its color and the doll that is inside. To represent pieces of
information with two properties, we always define a structure type:
And then we add a data definition:
|; An RD (short for Russian doll) is one of: |
|; – String |
|; – (make-layer String RD)|
Naturally, the first clause of this data definition represents the
innermost doll or, to be precise, its color. The second clause is for
adding a layer around some given Russian doll. We represent this with an
instance of layer, which obviously contains the color
of the doll and one other field: the doll that is nested immediately
inside of this doll.
Take a look at this doll:
It consists of three dolls. The red one is the innermost one, the green one
sits in the middle, and the yellow is the current outermost wrapper. To
represent this doll with an element of RD
, you start on either
end. We proceed from the inside out. The red doll is easy to represent as
. Since nothing is inside and since it is red, the string
will do fine. For the second layer, we use
(make-layer "green" "red")
which says that a green (hollow) doll contains a the red doll. Finally,
to get the outside we just wrap another layer around this last doll:
(make-layer "yellow" (make-layer "green" "red"))
This process should give you a good idea how to go from any set of colored
Russian dolls to a data representation. But keep in mind that a programmer
must also be able to do the converse, that is, go from a piece of data to
concrete information. In this spirit, draw a schematic Russian doll for
the following element of RD
(make-layer "pink" (make-layer "black" "white"))
You might even try this in BSL.
Now that we have a data definition and understand how to represent actual
dolls and how to interpret elements of RD
as dolls, we are ready to
design functions that consume RD
s. Specifically, let us design the
function that counts how many dolls a Russian doll set contains. This
sentence is a fine purpose statement and determines the signature, too:
|; RD -> Number|
|; how many dolls are part of an-rd |
As for data examples, let us start with (make-layer "yellow" (make-layer "green" "red")). The image above tells us that 3 is
the expected answer because there are three dolls: the red one, the green
one, and the yellow one. Just working through this one example, also tells
us that when the input is a representation of this doll
then the answer is 1.
Step four demands the development of a template. Using the standard
questions for this step produces this template:
The number of cond
clauses is determined by the number of clauses
in the definition of RD
. Each of the clauses specifically spells
out what kind of data it is about, and that tells us which predicates to
. While strings aren’t compound data,
instances of layer
contain two values. If the function needs
these values, it uses the selector expressions (layer-color an-rd)
and (layer-doll an-rd)
. Finally, the second clause of the data
definition contains a self-reference from the doll
field of the
structure to the definition itself. Hence we need a
recursive function call for the second selector expression.
The examples and the template almost dictate the function definition. For
the non-recursive cond
clause, the answer is obviously
. For the recursive clause, the template expressions compute the
(layer-color an-rd) extracts the string that describes the
color of the current layer;
(layer-doll an-rd) extracts the doll contained within the
current layer; and
(depth (layer-doll an-rd)) determines how many dolls are
part of (layer-doll an-rd), according to the purpose statement
This last number is almost the desired answer but not quite because the
difference between an-rd and (layer-doll an-rd) is one
layer meaning one extra doll. Put differently, the function must add
1 to the recursive result to obtain the actual answer:
Note how the function definition does not use (layer-color an-rd)
in the second clause. Once again, we see that the template is an
organization schema for everything we know about the data definition but
we may not need all of these pieces for the actual definition.
Let’s finally translate the examples into tests:
If you run these in DrRacket, you will see that their evaluation touches all
pieces of the definition of depth.
154. Design the function colors
. It consumes a Russian
doll and produces a string of all colors, separate by a comma and a
space. Thus our example should produce
Exercise 155. Design the function inner, which consumes an
RD and produces the (color of the) innermost doll. Use DrRacket’s stepper
to evaluate (inner rd) for you favorite rd.
9.5 Lists and World
With lists and self-referential data definitions in general, you can design
and run many more interesting world programs than with finite data. Just
imagine you can now create a version of the space invader program from
Itemizations and Structures
that allows the player to fire as many shots from the tank
as desired. Let us start with a simplistic version of this
problem:If you haven’t designed a world program in a while,
re-read Designing World Programs.
Sample Problem Design a world program that simulates firing shots. Every
time the “player” hits the space bar, the program adds a shot to the
bottom of the canvas. These shots rise vertically at the rate of one pixel
Designing a world program starts with a separation of information into
constants and elements of the ever-changing state of the world. For the
former we introduce physical and graphical constants; for the latter we
need to develop a data representation for world states. While the sample
problem is relatively vague about the specifics, it clearly assumes a
rectangular scenery with shots painted along a vertical line. Obviously
the locations of the shots change with every clock tick but the size of
the scenery and x-coordinate of the line of shots remain the same:
Nothing in the problem statement demands these particular choices, but as
long as they are easy to change—meaning changing by editing a single
definition—we have achieved our goal.
As for those aspects of the “world” that change, the problem statement
mentions two. First, hitting the space bar adds a shot. Second, all the
shots move straight up by one pixel per clock tick. Given that we cannot
predict how many shots the player will “fire,” we use a list to
|; A List-of-shots is one of: |
|; – '()|
|; – (cons Shot List-of-shots)|
|; interpretation the collection of shots fired |
The one remaining question is how to represent each individual shot. We
already know that all of them have the same x-coordinate and that
this coordinate stays the same throughout. Furthermore, all shots look
alike. Hence, their y-coordinates are the only property in which
they differ from each other. It therefore suffices to represent each shot
as a number:
|; A Shot is a Number.|
|; interpretation represents the shot's y-coordinate |
We could restrict the representation of shots to the interval of numbers
below HEIGHT because we know that all shots are launched from the
bottom of the canvas and that they then move up, meaning their
y-coordinate continuously decreases.
You can also use a data definition like this to represent this
|; A ShotWorld is List-of-numbers. |
|; interpretation each number on such a list|
|; represents the y-coordinate of a shot |
After all, the above two definitions describe all list of numbers; we
already have a definition for lists of numbers; and the name
tells everyone what this class of data is about.
Once you have defined constants and developed a data representation for
the states of the world, the key task is to pick which event handlers you
wish to employ and to adapt their signatures to the given problem. The running
example mentions clock ticks and the space bar, all of which translates
into a wish list of three functions:
the function that turns a world state into an image:
|; ShotWorld -> Image|
|; adds the image of a shot for each y on w |
|; at (MID,y} to the background image |
|(define (to-image w) BACKGROUND)|
because the problem demands a visual rendering;
one for dealing with tick events:
and one function for dealing with key events:
Don’t forget that in addition to the initial wish list, you also need to
define a main
function that actually sets up the world and
installs the handlers. Figure 57
includes this one function that
is not designed but defined as a modification of standard schema.
Let us start with the design of to-image
. We have its signature,
purpose statement and header, so we need examples next. Since the data
definition has two clauses, there should be at least two examples:
and a cons
ed list, say, (cons 9 '())
. The expected result for '()
; if there is a y-coordinate, though, the
function must place the image of a shot at MID
and the specified
Before you read on, work through an example that applies to-image
to a list of two Shot
s. Doing so helps understand
the function works.
The fourth step is about translating the data definition into a template:
The template for data definitions for lists is so familiar now that it
doesn’t need much explanation. If you have any doubts, read
over the questions in figure 48
and design the template on your
From here it is straightforward to define the function. The key is to
combine the examples with the template and to answer the questions from
. Following those, you start with the base case
of an empty list of shots and, from the examples, you know that the
expected answer is BACKGROUND
. Next you formulate what the
template expressions in the second cond
(first w) extracts the first coordinate from the list;
(rest w) is the rest of the coordinates; and
(to-image (rest w)) adds each shot on (rest w) to
the background image, according to the purpose statement of to-image.
In other words, (to-image (rest w))
renders the rest of the list
as an image and thus performs almost all the work. What is missing is the
first shot, (first w)
. If you now apply the purpose
statement to these two expressions, you get the desired expression for the
The added icon is the standard image for a shot; the two coordinates are
spelled out in the purpose statement; and the last argument to
is the image constructed from the rest of the list.
Figure 57 displays the complete function definition for
to-image and indeed the rest of the program, too. The design of
tock is just like the design of to-image and you should
work through it for yourself. The signature of the keyh handler,
though, poses one interesting question. It specifies that the handler
consumes two inputs with non-trivial data definitions. On one hand, the
ShotWorld is self-referential data definition. On the other hand,
the definition for KeyEvents is a large enumeration. For now, we
have you “guess” which of the two arguments should drive the development
of the template; later we will study such cases in depth.
Figure 57: A list-based world program
As far as a world program is concerned, a key handler such as keyh
is about the key event that it consumes. Hence, we consider it the main
argument and use its data definition to derive the template. Specifically,
following the data definition for KeyEvent
dictates that the function needs a cond
expression with numerous
clauses like this:
Of course, just like for functions that consume all possible BSL values, a
key handler usually does not need to inspect all possible cases. For our
running problem, you specifically know that the key handler reacts only
to the space bar and all others are ignored. So it is natural to collapse
all of the cond
clauses into an else
clause except for
the clause for " "
Exercise 156. Equip the program in figure 57 with tests and
make sure it passes those. Explain what main does. Then run the
program via main.
157. Experiment whether the arbitrary decisions concerning constants
are truly easy to change. For example, determine whether changing a single
constant definition achieves the desired outcome:
change the height of the canvas to 220 pixels;
change the width of the canvas to 30 pixels;
change the x location of the line of shots to “somewhere to
the left of the middle;”
change the background to a green rectangle; and
change the rendering of shots to a red elongated rectangle.
Also check whether it is possible to double the size of the shot without
changing anything else or change its color to black.
Exercise 158. If you run main, press the space bar (fire a
shot), and wait for a good amount of time, the shot disappears from the
canvas. When you shut down the world canvas, however, the result is a
world that still contains this invisible shot.
Design an alternative tock function, which not just moves shots
one pixel per clock tick but also eliminates those whose coordinates
places them above the canvas. Hint You may wish to consider the design of
an auxiliary function for the recursive cond clause.
Exercise 159. Turn the exercise of exercise 153 into a world
program. Its main function, dubbed riot, consumes how many
balloons the students want to throw; its visualization shows one balloon
dropping after another at a rate of one per second. The function produces
the list of Posns where the balloons hit.
Hints (1) Here is one possible data representation:
|(define-struct pair [balloon# lob])|
|; A Pair is a structure (make-pair N List-of-posns)|
|; A List-of-posns is one of: |
|; – '()|
|; – (cons Posn List-of-posns)|
|; interpretation (make-pair n lob) means n balloons |
|; must yet be thrown and added to lob|
(2) A big-bang expression is really just an expression. It is
legitimate to nest it within another expression.
(3) Recall that random creates random numbers.
9.6 A Note on Lists and Sets
This book relies on your intuitive understanding of sets as
collections of BSL values. The Universe of Data specifically says that a data definition
introduces a name for a set of BSL values. There is one question that
this book consistently asks about sets, and it is whether some element is
in some given set. For example, 4 is in Number, while
"four" is not. The book also shows how to use a data definition to
check whether some value is a member of some named set and how to use some
of the data definitions to generate sample elements of sets; but, these two
procedures are about data definitions not sets per se.
At the same time, lists represent collections of values.
Hence you might be wondering what the difference between a list and a set
is or whether this is a needless distinction. If so, this section is for you.
Right now the primary difference between sets and lists is that the former
is a concept we use to discuss steps in the design of code and the latter
is one of many forms of data in BSL, our chosen programming
language. The two ideas live at rather different levels in our
conversations. However, given that a data definition introduces a data
representation of actual information inside of BSL and given that sets
are collections of information, you may now ask yourself how sets are
represented inside of BSL as data.
Most full-fledged languages directly support data
representations of both lists and sets.
While lists have a special status in BSL, sets don’t, but at the same
time sets somewhat resemble lists. The key difference is the kind of
functions a program normally uses with either form of data. BSL provides
several basic constants and functions for lists—
and some functions that you could define yourself—
, and so on. Here is an example of a function you can
define but does not come with BSL
Stop! Finish the design of this function.
Let’s proceed in straightforward and possibly naive manner and say sets are
basically lists. And, to simplify further, let’s focus on lists of numbers
in this section. If we now accept that it merely matters whether a number
is a part of a set or not, it is almost immediately clear that we can use
lists in two different ways to represent sets.
Figure 58: Two data representations for sets
Figure 58 displays the two data definitions. Both
basically say that a set is represented as a list of numbers. The
difference is that the definition on the right comes with the constraint
that no number may occur more than once on the list. After all, the key
question we ask about a set is whether some number is in the set or not,
and whether it is in a set once, twice or three times makes no difference.
Regardless of which definition you choose, you can already define two
The first one is the empty set, which in both cases is represented
by the empty list. The second one is a membership test.
One way to build larger sets is to use cons
and the above
definitions. Say we wish to build a representation of the set that contains
, and 3
. Here is one such representation:
And it works for both data representations. But, is
really not a representation of the same set? Or how about
The answer has to be affirmative as long as the primary concern is whether
a number is in a set or not. Still, while the order of cons
matter, the constraint in the right-hand data definition rules out the last
list as a Son.R
because it contains 1
Figure 59: Functions for the two data representations of sets
The difference between the two data definitions shows up when we design
functions. Say we want a function that removes
a number from a set. Here is a wish list entry that applies to both
The purpose statement uses the word “subtract” because this is what
logicians and mathematicians use when they work with sets.
shows the results. The two columns
differ in two points:
The test on the left uses a list that contains 1 twice,
while the one on the right represents the same set with a single
Because of these differences, the set- on the left must use
remove-all, while the one on the right gets away with
Stop! Copy the code into the DrRacket definitions area and make sure the tests
pass. Then read on and experiment with the code as you do.
An unappealing aspect of figure 59
is that the
tests use es
, a plain list as the expected result. This problem
may seem minor at first glance. Consider the following example, however:
where set123 represents the set containing 1, 2,
and 3 in one of two ways:
Regardless of which representation we choose, (set- 1 set123)
evaluates to one of these two lists:
But, we cannot predict which of those two set- produces.
For the simple case of two alternatives, it is possible to use the
testing facility as follows:
If the expected set contains three elements, there are six possible
variations, not including representations with repetitions, which
the left-hand data definition allows.
Fixing this problem calls for the combination of two ideas. First, recall
that set- is really about ensuring that the given element does not
occur in the result. It is an idea that our way of turning the examples
into tests does not bring across. Second, with BSL’s
check-satisfied testing facility, it is possible to state just
briefly mentions check-satisfied
but, in a nutshell,
the facility determines whether an expression satisfies a certain property.
A property is a function from values to Boolean
. In our specific
case, we wish to state that 1
is not a member of some set:
|; Son -> Boolean|
|; #true if 1 a member of s; #false otherwise|
|(define (not-member-1? s)|
| (not (in? 1 s)))|
Using not-member-1?, we can formulate the test case as follows:
and this variant clearly states what the function is supposed to
accomplish. Better yet, this formulation simply does not depend on how the
input or output set is represented.
In sum, lists and sets are related in that both are about collections of
values but they also differ strongly:
one among many
# of occurrences
finite but arbitrary
finite or infinite
The last row in this table presents a new idea, though an obvious one,
too. Many of the sets mentioned in this book are infinitely large, for
, and also List-of-strings
contrast, a list is always
finite though it may contain an
arbitrarily large number of items.
In sum, this section explains the essential differences between sets and
lists and how to represent finite sets with finite lists in two different
ways. BSL is not expressive enough to represent infinite sets;
exercise 299 introduces a completely different representation
of sets, a representation that can cope with infinite sets, too. The
question of how actual programming languages represent sets is beyond the
scope of this book, however.
Exercise 160. Design the functions set+.L and
set+.R, which create a set by adding a number x to some
given set s for the left-hand and right-hand data definition,
10 More on Lists
Lists are a versatile form of data that come with almost all languages
now. Programmers have used them to build large applications, artificial
intelligences, distributed systems, and more. This chapter illustrates
some ideas from this world, including functions that create lists,
data representations that call for structures inside of lists, and
representing text files as lists.
10.1 Functions that Produce Lists
Here is a function for determining the wage of an hourly employee:
It consumes the number of hours worked and produces the wage. A company
that wishes to use payroll software isn’t interested in this function,
however. It wants a function that computes the wages for all of employees.
Call this new function wage*. Its task is to process all employee
work hours and to determine the wages due to each of them. For
simplicity, let us assume that the input is a list of numbers, each
representing the number of hours that one employee worked, and that
the expected result is a list of the weekly wages earned, also represented
with a list of numbers.
Since we already have a data definition for the inputs and outputs, we can
immediately move to the second design step:
Next you need some examples of inputs and the corresponding outputs. So you
make up some short lists of numbers that represent weekly hours:
In order to compute the output, you determine the weekly wage for each
number on the given input list. For the first example, there are no
numbers on the input list so the output is '(). Make sure you
understand why the second and third expected output is what you want.
Given that wage* consumes the same kind of data as several other
functions from Lists and given that a template depends only on
the shape of the data definition, you can reuse these template:
In case you want to practice the development of templates, use the
questions from figure 48
It is now time for the most creative design step. Following the design
recipe, we consider each cond-line of the template in isolation.
For the non-recursive case, (empty? whrs) is true, meaning the
input is '(). The examples from above specify the desired
answer, '(), and so we are done.
In the second case, the design questions tell us to state what each
expression of the template computes:
(first whrs) yields the first number on whrs, which
is the first number of hours worked;
(rest whrs) is the rest of the given list; and
(wage* (rest whrs)) says that the rest is processed by the
very function we are defining. As always we use its signature and its
purpose statement to figure out the result of this expression. The
signature tells us that it is a list of numbers, and the purpose statement
explains that this list represents the list of wages for its input, which
is the rest of the list of hours.
The key is to rely on these facts when you formulate the expression that
computes the result in this case, even if the function is not yet
Since we already have the list of wages for all but the first item of
, the function must perform two computations to produce the
expected output for the entire whrs
: compute the weekly
wage for (first whrs)
and construct the list that represents all
weekly wages for whrs
. For the first part, we reuse
. For the second, we cons
the two pieces of
information together into one list:
And with that, the definition is complete: see figure 60
Figure 60: Computing the wages of all employees
Exercise 161. Translate the examples into tests and make sure they
all succeed. Then change the function in figure 60 so that
everyone gets $14 per hour. Now revise the entire program so that changing
the wage for everyone is a single change to the entire program and
Exercise 162. No employee could possibly work more than 100 hours per
week. To protect the company against fraud, the function should check that
no item of the input list of wage* exceeds 100. If one of them
does, the function should immediately signal an error. How do we have to
change the function in figure 60 if we want to perform this basic
Show the products of the various steps in the design
recipe. If you are stuck, show someone how far you got according to the
design recipe. The recipe isn’t just a design tool for you to use; it is
also a diagnosis system so that others can help you help yourself.
Exercise 163. Design convertFC. The function converts
a list of measurements in Fahrenheit to a list of Celsius measurements.
Exercise 164. Design the function convert-euro, which
converts a list of US$ amounts into a list of € amounts. Look up
the current exchange rate on the web.
Generalize convert-euro to the function convert-euro*,
which consumes an exchange rate and a list of US$ amounts and converts
the latter into a list of € amounts.
Exercise 165. Design the function subst-robot, which consumes
a list of toy descriptions (one-word strings) and replaces all occurrences of
"robot" with "r2d2"; all other descriptions remain the same.
Generalize subst-robot to substitute. The latter
consumes two strings, called new and old, and a
list of strings. It produces a new list of strings by substituting all
occurrences of old with new.
10.2 Structures in Lists
Representing a work week as a number is a bad choice because the printing
of a paycheck requires more information than hours worked per week. Also,
not all employees earn the same amount per hour. Fortunately a list may
contain items other than atomic values; indeed, lists may contain whatever
values we want, especially structures.
Our running example calls for just such a data representation. Instead of
numbers, we use structures that represent employees plus their work hours
and pay rates:
|(define-struct work [employee rate hours])|
|; A (piece of) Work is a structure: |
|; (make-work String Number Number)|
|; interpretation (make-work n r h) combines the name |
|; with the pay rate r and the number of hours h.|
While this representation is still simplistic, it is just enough of an
additional challenge because it forces us to formulate a data definition
for lists that contain structures:
|; Low (short for list of works) is one of: |
|; – '()|
|; – (cons Work Low)|
|; interpretation an instance of Low represents the |
|; hours worked for a number of employees|
Here are three elements of Low
|(cons (make-work "Robby" 11.95 39)|
|(cons (make-work "Matthew" 12.95 45)|
| (cons (make-work "Robby" 11.95 39)|
Use the data definition to explain why these pieces of data belong to
Stop! Also use the data definition to generate two more examples.
When you work on real-world projects, you won’t use such
suffixes; instead you will use a tool for managing different versions
Now that you know that the definition of Low
makes sense, it is time
to re-design the function wage*
so that it consumes elements of
not just lists of numbers:
The suffix “.v2” at the end of the function name informs every reader of
the code that this is a second, revised version of the function. In this
case, the revision starts with a new signature and an adapted purpose
statement. The header is the same as above.
The third step of the design recipe is to work through an example. Let us
start with the second list above. It contains one work record, namely,
(make-work "Robby" 11.95 39). Its interpretation is that
"Robby" worked for 39 hours and that he is paid at the
rate of $11.95 per hour. Hence his wage for the week is $466.05, i.e., (* 11.95 39). The desired result for
wage*.v2 is therefore (cons 466.05 '()). Naturally,
if the input list contained two work records, we would perform this kind
of computation twice, and the result would be a list of two
numbers. Before you read on, determine the expected result for the third
data example above.
Note on Numbers Keep in mind that BSL—unlike most other programming
languages—understands decimal numbers just like you do, namely, as exact
fractions. A language such as Java, for example, would produce 466.04999999999995 for the expected wage of the first work record. Since you
cannot predict when operations on decimal numbers behave in this strange
way, you are better off writing down such examples as
just to prepare yourself for other programming languages. Then again,
writing down the example in this style also means you have really figured
out how to compute the wage. End
From here we move on to the development of the template. If you use
the template questions, you quickly get this much:
because the data definition consists of two clauses, because it introduces
in the first clause and cons
ed structures in the
second, and so on. But, you also realize that you know even more about the
input than this template expresses. For example, you know that
extracts a structure of three fields from the given
list. This seems to suggest the addition of three more expressions to the template:
This template lists all potentially interesting data.
We use a different strategy here. Specifically, we suggest to
create and to refer to a separate function template whenever you are
developing a template for a data definition that refers to other data
Splitting the templates leads to a natural partition of work into
functions and among functions; none of them grow too large, and
all of them relate to a specific data definition.
Finally, you are ready to program. As always you start with the
simple-looking case, which is the first cond
line here. If
is applied to '()
, you expect '()
back and that settles it. Next you move on to the second line and remind
yourself of what these expressions compute:
(first an-low) extracts the first work structure
from the list;
(for-work ...) says that you wish to design a
function that processes work structures;
(rest an-low) extracts the rest of the given list; and
(wage*.v2 (rest an-low)) determines the list of wages for
all the work records other than the first one, according to the
purpose statement of the function.
If you are stuck here, use the table method from figure 50
If you understand it all, you see that it is enough to cons
two expressions together:
assuming that for-work computes the wage for the first work
record. In short, you have finished the function by adding another entry
to your wish list of functions.
Since for-work is a name that just serves as a stand-in and since
it is a bad name for this function, let us call the function wage.v2
and write down its complete wish list entry:
|; Work -> Number|
|; computes the wage for the given work record w|
|(define (wage.v2 w)|
This design of this kind of function is extensively covered in
and thus doesn’t need any additional explanation
here. Figure 61
shows the final result of developing
Figure 61: Computing the wages from work records
Exercise 166. The wage*.v2 function consumes a list of work
records and produces a list of numbers. Of course, functions may also
produce lists of structures.
Develop a data representation for pay checks. Assume that a pay check
contains two distinctive pieces of information: the employee’s name
and an amount. Then design the function wage*.v3. It consumes a
list of work records and computes a list of pay checks from it, one per
In reality, a pay check also contains an employee number. Develop a data
representation for employee information and change the data definition for
work records so that it uses employee information and not just a string
for the employee’s name. Also change your data representation of pay
checks so that it contains an employee’s name and number, too. Finally,
design wage*.v4, a function that maps lists of revised work
records to lists of revised pay checks.
Note on Iterative Refinement This exercise demonstrates the
iterative refinement of a task. Instead of using data
representations that include all relevant information, we started from
simplistic representation of pay checks and gradually made the
representation realistic. For this simple program, refinement is overkill;
later we will encounter situations where iterative refinement is not just
an option but a necessity.
Exercise 167. Design the function sum, which consumes a list
of Posns and produces the sum of all of its x-coordinates.
Exercise 168. Design the function translate. It consumes and
produces lists of Posns. For each (make-posn x y) in the
former, the latter contains (make-posn x (+ y 1)).—We borrow the word
“translate” from geometry, where the movement of a point by a constant
distance along a straight line is called a translation.
Exercise 169. Design the function legal. Like
translate from exercise 168 the function consumes and produces
a list of Posns. The result contains all those Posns whose
x-coordinates are between 0 and 100 and whose y-coordinates
are between 0 and 200.
170. Here is one way to represent a phone number:
Design the function replace
. It consumes a list of Phone
and produces one. It replaces all occurrence of area code 713
10.3 Lists in Lists, Files
Functions and Programs introduces read-file, a function for
readingAdd (require 2htdp/batch-io) to your
an entire text file as a string. In other words, the creator of
read-file chose to represent text files as strings, and the
function creates the data representation for specific files (specified by
a name). Text files aren’t plain long texts or strings, however. They are
organized into lines and words, rows and cells, and many other ways. In
short, representing the content of a file as a plain string might work on
rare occasions but is usually a bad choice.
Put up in a place
where it's easy to see
the cryptic admonishment
When you feel how depressingly
slowly you climb,
it's well to remember that
Things Take Time.
Figure 62: Things take time
For concreteness, take a look at the sample file in figure 62
It contains a poem by Piet Hein, and it consists of many lines and
words. When you use the program
to turn this file into a BSL string, you get this:The
dots aren’t really a part of the result as you probably guessed.
"TTT\n \nPut up in a place\nwhere ...."
where the "\n" inside the string indicates line breaks.
While it is indeed possible to break apart this string with primitive
operations on strings, e.g., explode
, most programming
support many different representations of
files and functions that create such representations from existing
One way to represent this file is as a list of lines, where each line
is represented as one string:
Here the second item of the list is the empty string because the file
contains an empty line.
Another way is to use a list of words, again each word represented as
Note how the empty second line disappears with this representation. After
all, there are no words on the empty line.
And a third representation relies on lists of list of words:
This representation has an advantage over the second one in that it
preserves the organization of the file, including the emptiness of the second
line. The price is that all of a sudden lists contain lists.
While the idea of list-containing lists may sound frightening at first,
you need not worry. The design recipe helps even with such complications.
Figure 63: Reading files
Before we get started, take a look at figure 63. It introduces
a number of useful file reading functions. They are not comprehensive and
there are many other ways of dealing with text from files, and you will
need to know a lot more to deal with all possible text files. For our
purposes here—teaching and learning the principles of systematic program
design—they suffice, and they empower you to design reasonably
Figure 63 uses the names of two
data definitions that do not exist yet, including one involving
list-containing lists. As always, we start with a data definition,
but this time, we leave this task to you. Hence, before you read
on, solve the following exercises. The solutions are needed to make
complete sense out of the figure, and without working through the
solutions, you cannot really understand the rest of this section.
Exercise 171. You know what the data definition for
List-of-strings looks like. Spell it out. Make sure that you can
represent Piet Hein’s poem as an instance of the definition where each
line is a represented as a string and another one where each word is a
string. Use read-lines and read-words to confirm
your representation choices.
Next develop the data definition for List-of-list-of-strings.
Again, represent Piet Hein’s poem as an instance of the definition where
each line is a represented as a list of strings, one per word, and the
entire poem is a list of such line representations. You may use
read-words/line to confirm your choice.
As you probably know, operating systems come with programs that measure
files. One counts the number of lines, an others determines how many words
appear per line. Let us start with the latter to illustrate how the design
recipe helps with the design of complex functions.
The first step is to ensure that we have all the necessary data
definitions. If you solved the above exercise, you have a data definition
for all possible inputs of the desired function, and the preceding section
, which describes all possible inputs. To
keep things short, we use LLS
to refer to the class of lists of
lists of strings, and use it to write down the header material for the
We name the functions words-on-line, because it is appropriate
and captures the purpose statement in one phrase.
What is really needed though is a set of data examples:
The first two definitions introduce two examples of lines: one contains
two words, the other contains none. The last two definitions show how to
construct instances of LLS
from these line examples. Determine what
the expected result is when the function is given these two examples.
Once you have data examples, it is easy to formulate functional
examples; just imagine applying the function to each of the data
example. When apply words-on-line to lls0, you should
get the empty list back, because there are no lines. When you apply
words-on-line to lls1, you should get a list of two
numbers back, because there are two lines. The two numbers are
2 and 0, respectively, given that the two lines in
lls1 contain two and no words each.
Here is how you translate all this into test cases:
By doing it at the end of the second step, you have a complete program,
though running it just fails some of the test cases.
The development of the template is the interesting step for this sample
problem. By answering the template questions from figure 48
you get the usual list-processing template immediately:
As in the preceding section, we know that the expression (first lls)
extracts a List-of-strings
, which has a complex
organization, too. The temptation is to insert a nested template to express
this knowledge, but as you should recall, the better idea is to develop a
second auxiliary template and to change the first line in the second
condition so that it refers to this auxiliary template.
Since this auxiliary template is for a function that consumes a list, the
template looks nearly identical to the previous one:
The important differences are that (first ln)
extracts a string
from the list, and we consider strings as atomic values. With this
template in hand, we can change the first line of the second case in
which reminds us for the fifth step that the definition for
words-on-line may demand the design of an auxiliary function.
Now it is time to program. As always, we use the questions from
to guide this step. The first case, concerning
empty lists of lines, is the easy case. Our examples tell us that the
answer in this case is '()
, i.e., the empty list of
numbers. The second case, concerning cons
several expressions and we start with a reminder of what they compute:
(first lls) extracts the first line from the non-empty list
of (represented) lines;
(line-processor (first lls)) suggests that we may wish to
design an auxiliary function to process this line;
(rest lls) is the rest of the list of line;
(words-on-line (rest lls)) computes a list of words per line
for the rest of the list. How do we know this? We promised just that with
the signature and the purpose statement for words-on-line.
Assuming we can design an auxiliary function that consumes a line and
counts the words on one line—let’s call it words#—it is easy
to complete the second condition:
This expressions cons
es the number of words on the first line of
onto a list of numbers that represents the number of words on
the remainder of the lines of lls
It remains to design the words# function. Its template is dubbed
line-processor and its purpose is to count the number of words on
a line, which is just a list of strings. So here is the wish-list entry:
At this point, you may recall the example used to illustrate the design
recipe for self-referential data in Designing with Self-Referential Data Definitions
. The function is
, and it too counts the number of strings on a
list of strings. Even though the inputs for how-many
to represent a list of names, this difference simply doesn’t matter; as
long as it correctly counts the number of strings on a list of strings,
solves our problem.
Since it is good to reuse existing functions, you may define words# as
|(define (words# los)|
| (how-many los))|
In reality, however, programming languages come with functions that solve
such problems already. BSL calls this function length
, and it
counts the number of values on any list of values, no matter what the
Figure 64: Counting the words on a line
You may wish to look over the list of functions that come
with BSL. Some may look obscure but may become useful in one of the
upcoming problems. Using such functions saves your time, not ours.
Figure 64 summarizes the full design for our sample
problem. The figure includes two test cases. Also, instead of using the
separate function words#, the definition of
words-on-line simply calls the length function that
comes with BSL. Experiment with the definition in DrRacket and make sure that
the two test cases cover the entire function definition.
With one small step, you can now design your first file utility:
It merely composes a library function with
. The former reads a file as a
and hands this value to the latter.
This idea of composing a built-in function with a newly designed function
is common. Naturally, people don’t design functions randomly and expect to
find something in the chosen programming language to complement their
design. Instead, program designers plan ahead and design the function
to the output that available functions deliver. More generally
still and as mentioned above, it is common to think about a solution as a
composition of two computations and to develop an appropriate data
collection with which to communicate the result of one computation to the
second one, where each computation is each implemented with a function.
Figure 65: Encoding strings
Exercise 172. Design the function collapse, which converts
a list of lines into a string. The strings should be separated by blank
spaces (" "). The lines should be separated with a newline
Challenge When you are finished, use the program like this:
To make sure the two files "ttt.dat"
identical, remove all extraneous white spaces in your version of
Exercise 173. Design a program that removes all articles from a
text file. The program consumes the name n of a file, reads the
file, removes the articles, and writes the result out to a file whose name
is the result of concatenating "no-articles-" with n. For this
exercise, an article is one of the following three words: "a",
"an", and "the".
Use read-words/line so that the transformation retains the
organization of the original text into lines and words. When the program is
designed, run it on the Piet Hein poem.
Exercise 174. Design a program that encodes text files numerically.
Each letter in a word should be encoded as a numeric three-letter string
with a value between 0 and 256. Figure 65 shows our
encoding function for single letters. Before you start, explain these
Hints (1) Use read-words/line to preserve the organization of
the file into lines and words. (2) Read up on explode again.
Exercise 175. Design a BSL program that simulates the Unix command
wc. The purpose of the command is to count the number of
1Strings, words, and lines in a given file. That is, the command
consumes the name of a file and produces a value that consists of three
Figure 66: Transpose a matrix
176. Mathematics teachers may have introduced you to
matrix calculations by now. In principle, matrix just means rectangle of
numbers. Here is one possible data representation for matrices:
|; A Matrix is one of: |
|; – (cons Row '())|
|; – (cons Row Matrix)|
|; constraint all rows in matrix are of the same length|
|; An Row is one of: |
|; – '() |
|; – (cons Number Row)|
Note the constraints on matrices. Study the data definition and translate
the two-by-two matrix consisting of the numbers 11, 12, 21, 22 into this
data representation. Stop, don’t read on until you have figured out the
Here is the solution for the five-second puzzle:
If you didn’t create it yourself, study it now.
The function in figure 66 implements the important
mathematical operation of transposing the entries in a matrix. To
transpose means to mirror the entries along the diagonal, that is, the
line from the top-left to the bottom-right.
Stop! Transpose mat1 by hand, then read
figure 66. Why does transpose ask
(empty? (first lln))?
The definition assumes two auxiliary functions:
first*, which consumes a matrix and produces the first
column as a list of numbers;
rest*, which consumes a matrix and removes the first
column. The result is a matrix.
Even though you lack definitions for these functions, you should be able to
understand how transpose works. You should also understand that
you cannot design this function with the design recipes you have
seen so far. Explain why.
Design the two “wish list” functions. Then complete the design of the
transpose with some test cases.
10.4 A Graphical Editor, Revisited
A Graphical Editor is about the design of an interactive graphical one-line
editor. It suggests two different ways to represent the state of the
editor and urges you to explore both: a structure that contains pair of
strings or a structure that combines a string with an index to a current
position (see exercise 87).
A third alternative is to use structures that combine two lists of 1String
Before you wonder why, let us make up two data examples:
The two examples demonstrate how important it is to write down an
interpretation. While the two fields of an editor clearly represent the
letters to the left and right of the cursor, the two examples demonstrate
that there are at least two ways to interpret the structure types:
(make-editor pre post) could mean the letters in
pre precede the cursor and those in post succeed it and
that the combined text is
(make-editor pre post) could equally well mean that the
letters in pre precede the cursor in reverse order. If
so, we obtain the text in the displayed editor like this:
The function rev
must consume a list of 1String
Even without a complete definition for rev you can imagine how it
works. Use this understanding to make sure you understand that
translating the first data example into information according to the
first interpretation and treating the second data example according to the
second interpretation yields the same editor display:
Both interpretations are fine choices, but it turns out that using the
second one greatly simplifies the design of the program. The rest of this
section demonstrates this point, illustrating the use of lists inside
of structures at the same time. To appreciate the lesson properly, you
should have solved the exercises in A Graphical Editor.
Let’s start with rev, because we clearly need this function to
make sense out of the data definition. Its header material is
For good measure, we have added one “obvious” example as a test
case. You may want to add some extra examples just to make sure you
understand what is needed.
The template for rev is the usual list template:
There are two cases, and the second case comes with several selector
expressions and a self-referential one.
Filling in the template is easy for the first clause: the reverse version
of the empty list is the empty list. For the second clause, we once again
use the coding questions:
(first l) is the first item on the list of 1Strings;
(rest l) is the rest of the list; and
(rev (rest l)) is the reverse of the rest of the list.
Stop! Try to finish the design of rev with these hints.
Figure 67: Tabulating for rev
If these hints leave you stuck, remember to create a table from the
examples. Figure 67
shows the table for two examples:
(cons "a" '())
and (cons "a" (cons "b" (cons "c" '())))
. The second example is particularly illustrative. A look at the
next to last column shows that (rev (rest l))
of the work by producing (cons "c" (cons "b" '()))
. Since the
desired result is (cons "c" (cons "b" (cons "a" '())))
must somehow add "a"
to the end of the result of the
recursion. Indeed, because (rev (rest l))
is always the reverse of the
rest of the list, it clearly suffices to add (first l)
end. While we don’t have a function that adds items to the end of a list,
we can wish for it and use it to complete the function definition:
Here is the extended wish list entry for add-at-end:
It is “extended” because it comes with an example formulated as a test
case. The example is derived from the example for rev, and
indeed, it is precisely the example that motivates the wish list entry.
Make up an example where add-at-end consumes an empty list before
you read on.
Since add-at-end is also a list-processing function, the template
is just a renaming of the one you know so well now:
To complete it into a full function definition, we proceed according to
the recipe questions for step 5. Our first question is to formulate an
answer for the “basic” case, i.e., the first case here. If you did
worked through the suggested exercise, you know that the result of
is always (cons s '())
. After all, the result must be a list
and the list must contain the given 1String
The next two questions concern the “complex” or “self-referential”
case. We know what the expressions in the second cond
compute: the first expression extracts the first 1String
given list and the second expression “creates a new list by adding
to the end of (rest l)
.” That is, the purpose
statement dictates what the function must produce here. From here, it is
clear that the function must add (first l)
back to the result of
Run the tests-as-examples to reassure yourself that this function works
and that therefore rev
works, too. Of course, you shouldn’t be
surprised to find out that BSL already provides a function that reverses
any given list, including lists of 1String
s. And naturally, it is
Exercise 177. Design the function create-editor. The
function consumes two strings and produces an Editor. The first
string is the text to the left of the cursor and the second string is the
text to the right of the cursor. The rest of the section relies on this
At this point, you should have a complete understanding of our
data representation for the graphical one-line editor. Following the
design strategy for interactive programs from Designing World Programs
should define physical constants—
the width and height of the editor, for
and graphical constants—
e.g., the cursor. Here are ours:
The important point, however, is to write down the wish list for your
event handler(s) and your function that draws the state of the
editor. Recall that the 2htdp/universe library dictates the header
material for these functions:
|; Editor -> Image|
|; renders an editor as an image of the two texts |
|; separated by the cursor |
|(define (editor-render e) MT)|
|; Editor KeyEvent -> Editor|
|; deals with a key event, given some editor|
|(define (editor-kh ed ke) ed)|
Re-read exercise 177
to determine the initial editor for
While it does not matter which wish you tackle next, we choose to design
editor-kh first and editor-render second. Since we have
the header material, let us explain the functioning of the key event
handler with two examples:
|(check-expect (editor-kh (create-editor "" "") "e")|
| (create-editor "e" ""))|
| (editor-kh (create-editor "cd" "fgh") "e")|
| (create-editor "cde" "fgh"))|
Both of these examples demonstrate what happens when you press the letter
“e” on your keyboard. The computer runs the function editor-kh
on the current state of the editor and "e". In the first example,
the editor is empty, which means that the result is an editor with just
the letter "e" in it followed by the cursor. In the second
example, the cursor is between the strings "cd" and
"fgh", and therefore the result is an editor with the cursor
between "cde" and "fgh". In short, the function always
inserts any normal letter at the cursor position.
Before you read on, you should make up examples that illustrate how
editor-kh works when you press the backspace ("\b") key
to delete some letter, the "left" and "right" arrow keys
to move the cursor, or some other arrow keys. In all cases, consider what
should happen when the editor is empty, when the cursor is at the left end
or right end of the non-empty string in the editor, and when it is in the
middle. Even though you are not working with intervals here, it is still a
good idea to develop examples for the “extreme” cases.
Once you have test cases, it is time to develop the template. In the case
of editor-kh you are working with a function that consumes two
complex forms of data: one is a structure containing lists, the other one
is a large enumeration of strings. Generally speaking, this design case
calls for an improved design recipe; but in cases like these, it is also
clear that you should deal with one of the inputs first, namely, the
Having said that, the template is just a large cond
checking which KeyEvent
the function received:
expression doesn’t quite match the data definition for
because some KeyEvent
s need special attention
, and so on); some need to
be ignored because they are special ("\t"
some should be classified into one large group (ordinary keys).
Exercise 178. Explain why the template for editor-kh
deals with "\t" and "\r" before it checks for strings of
For the fifth step—
the definition of the function—
we tackle each clause
in the conditional separately. The first clause demands a result that
moves the cursor and leaves the string content of the editor alone. So
does the second clause. The third clause, however, demands the deletion of
a letter from the editor’s content—
if there is a letter. Last but not
least, the sixth cond
clause concerns the addition of letters at
the cursor position. Following the first basic guideline, we make
extensive use of a wish list and imagine one function per task:
As you can tell from the definition of editor-kh, three of the
four wish list functions have the same signature:
The last one takes two arguments instead of one:
We leave the proper formulation of wishes for the first three functions to
you and focus on the fourth one.
Let us start with a purpose statement and a function header:
|; insert the 1String k between pre and post|
|(define (editor-ins ed k)|
The purpose is straight out of the problem statement. For the construction
of a function header, we need an instance of Editor
are the pieces of the current one, we just
put them back together.
Next we derive examples for editor-ins from those for
You should work through these examples using the interpretation of
. That is, make sure you understand what the given editor
means as information and what the function call is supposed to achieve in
terms of information. In this particular case, it is best to draw the
visual representation of the editor because it is the information of that
we have in mind.
The fourth step demands the development of the template. The first argument
is guaranteed to be a structure, and the second one is a string, an atomic
piece of data. In other words, the template just pulls out the pieces from
the given editor representation:
Remember a template lists parameters because they are available, too.
From the template and the examples, it is relatively easy to conclude what
editor-ins is supposed to create an editor from the given
editor’s pre and post fields with k added to
the front of the former:
|(define (editor-ins ed k)|
| (make-editor (cons k (editor-pre ed))|
| (editor-post ed)))|
Even though both (editor-pre ed)
and (editor-post ed)
are list of 1String
s, there is no need to design auxiliary
functions. To get the desired result, it suffices to use cons
which creates lists.
At this point, you should do two things. First, run the tests for this
function. Second, use the interpretation of Editor and explain
abstractly why this function performs the insertion. And if this isn’t
enough, you may wish to compare this simple definition with the one from
exercise 84 and figure out why the other one needs an auxiliary
function while our definition here doesn’t.
179. Design the functions
Again, it is critical that you work through a good range of examples.
Designing the rendering function for Editor
s poses some new but
small challenges. The first one is to develop a sufficiently large number
of test cases. On one hand, it demands coverage of the possible
combinations: an empty string to the left of the cursor, an empty one on
the right, and both strings empty. On the other hand, it also requires
some experimenting with the functions that the image library
provides. Specifically, it needs a way to compose the two pieces of
strings rendered as text images; and it needs a way of placing the text
image into the empty image frame (MT
). Here is what we do to
create an image for the result of (create-editor "pre" "post")
If you compare this with the editor image above, you notice some
differences, which is fine because the exact layout isn’t essential to the
purpose of this exercise, and because the revised layout doesn’t
trivialize the problem. In any case, do experiment in the interactions
area of DrRacket to find your favorite editor display.
You are now ready to develop the template, and you should come up with this
|(define (editor-render e)|
| (... (editor-pre e) ... (editor-post e)))|
The given argument is just a structure type with two fields. Their values,
however, are lists of 1String
s, and you might be tempted to refine
the template even more. Don’t! Instead, keep in mind that when one
data definition refers to another complex data definition, you are better
off using the wish list.
If you have worked through a sufficient number of examples, you also know
what you want on your wish list: one function that turns a string into a
text of the right size and color. Let us call this function
. Then the definition of editor-render
twice and then composes the result with
Although this definition nests expressions three levels deep, the use of
the imaginary editor-text function renders it quite readable.
What remains is to design editor-text
. From the design of
, we know that editor-text
list of 1String
s and produces a text image:
This dummy definition produces an empty text image.
To demonstrate what editor-text is supposed to compute, we work
through an example. The example input is
(create-editor "pre" "post")
which was also used to explain editor-render and is equivalent to
We pick the second list as our sample input for editor-text, and
we know the expected result from the example for editor-render:
You may wish to make up a second example before reading on.
Given that editor-text
consumes a list of 1String
s, we can
write down the template without much ado:
After all, the template is dictated by the data definition that describes
the function input. But you don’t need the template if you understand and
keep in mind the interpretation for Editor
. It uses
to turn a string into a list of
s. Naturally, there is a function implode
performs the inverse computation, i.e.,
Using this function, the definition of editor-text is just a
small step from the example to the function body:
Exercise 180. Design editor-text without
The true surprise comes when you test the two functions. While our test for
editor-text succeeds, the test for editor-render
fails. An inspection of the failure shows that the string to the left of
the cursor— "pre"—is type-set backwards. We forgot that this
part of the editor’s state is represented in reverse. Fortunately, the
unit tests for the two functions pinpoint which function is wrong and even
tell us what is wrong with the function and suggests how to fix the problem:
It uses the reverse
function on the pre
Note Modern applications allow users to position the cursor with the
mouse (or other gesture-based devices). While it is in principle possible
to add this capability to your editor, we wait with doing so until
A Graphical Editor, with Mouse.
11 Design by Composition
By now you know that programs are complex products and that their
production requires the design of many collaborating functions. This
collaboration works well if the designer knows when to design several
functions and how to compose these functions into one program.
You have encountered this need to design interrelated functions several
times. Sometimes a problem statement implies several different tasks, and
each task is best realized with a function. At other times, a data
definition may refer to another one, and in that case, a function
processing the former kind of data relies on a function processing the
In this chapter, we present several scenarios that call for the design of
programs that compose many functions. To support this kind of design, the
chapter presents some informal guidelines on divvying up functions and
composing them. Since these examples demand complex forms of lists,
however, this chapter starts with a section on concise list notation.
11.1 The list Function
At this point, you should have tired of writing so many conses
just to create a list, especially for lists that contain a bunch of
values. Fortunately, we have an additional teaching language for you
thatYou have graduated from BSL. It is time to use the
“Language” menu and to select “Beginning Student with List
Abbreviations” for your studies. provides mechanisms for simplifying
this part of a programmer’s life. BSL+ does so, too.
The key innovation is list
, which consumes an
arbitrary number of values and creates a list. The simplest way to
is to think of it as an
abbreviation. Specifically, every expression of the shape
stands for a series of n cons
Keep in mind that '() is not an item of the list here, but the
rest of the list. Here is a table with three examples:
They introduce lists with one, two, and three items, respectively.
Of course, we can apply list
not only to values but also
|> (list (+ 0 1) (+ 1 1))|
(list 1 2)
|> (list (/ 1 0) (+ 1 1))|
/: division by zero
Before the list is constructed, the expressions must be evaluated.
If during the evaluation of an expression an error occurs, the list
is never formed. In short, list
behaves just like any other
primitive operation that consumes an arbitrary number of arguments; its
result just happens to be a list constructed with cons
The use of list
greatly simplifies the notation for lists
with many items and lists that contains lists or structures. Here
is an example:
(list 0 1 2 3 4 5 6 7 8 9)
This list contains 10 items and its formation with cons
would require 10 uses of cons
and one instance of
. Similarly, the list
requires 11 uses of list
, which sharply contrasts with 40
and 11 additional uses of '()
181. Use list
to construct the equivalent of these lists:
Also try your hands at this one:
Start by determining how many items each list and each nested list
contains. Use check-expect
to express your answers; this ensures
that your abbreviations are really the same as the long-hand.
182. Use cons
to form the equivalent of these lists:
183. On some occasions lists are formed with
Reformulate each of the following expressions using only cons
or only list
. Use check-expect
to check your answers.
184. Determine the values of the following expressions:
185. You know about first
from BSL, but BSL+
comes with even more selectors than that. Determine the values of the
Find out from the documentation whether third
11.2 Composing Functions
How to Design Programs
explains that programs are collections of definitions:
structure type definitions, data definitions, constant definitions, and function
definitions.And don’t forget tests.
To guide the division of
labor among functions, the section also suggests a rough guideline:
Design one function per task. Formulate auxiliary function definitions for
every dependency between quantities in the problem.
This part of the book introduces another guideline on auxiliary functions:
Design one template per data definition.
Formulate auxiliary function definitions when one data definition points to
a second data definition.
In this section, we take a look at one specific place in the design process
that may call for additional auxiliary functions: the definition step,
which creates a full-fledged definition from a template. Turning a template
into a complete function definition means combining the values of the
template’s sub-expressions into the final answer. As you do so, you might
encounter several situations that suggest the need for auxiliary functions:
If the composition of values requires knowledge of a particular domain of
application—for example, composing two (computer) images, accounting,
music, or science—design an auxiliary function.
If the composition of values requires a case analysis of the
available values—for example, is a number positive, zero, or negative—
use a cond expression. If the cond looks complex, design
an auxiliary function whose arguments are the template’s expressions and whose
body is the cond expression.
If the composition of values must process an element from a
self-referential data definition—a list, a natural number, or something
like those—design an auxiliary function.
If everything fails, you may need to design a more general
function and define the main function as a specific use of the general
function. This suggestion sounds counter-intuitive but it is called for in
a remarkably large number of cases.
The last two criteria are situations that we haven’t discussed in any
detail, though examples have come up before. The next two sections
illustrate these principles with additional examples.
Before we continue, though, remember that the key to managing the
design of programs is to maintain the often-mentioned
Maintain a list of function headers that must be designed to complete a
program. Writing down complete function headers ensures that you can test
those portions of the programs that you have finished, which is useful
even though many tests will fail. Of course, when the wish list is empty,
all tests should pass and all functions should be covered by tests.
Before you put a function on the wish list, you should check whether
something like the function already exists in your language’s library or
whether something similar is already on the wish list. BSL, BSL+, and
indeed all programming languages provide many built-in operations and many
library functions. You should explore your chosen language when you have
time and when you have a need, so that you know what it provides.
11.3 Auxiliary Functions that Recur
People need to sort things all the time, and so do programs. Investment
advisors sort portfolios by the profit each holding generates. Game
programs sort lists of players according to scores. And mail programs sort
messages according to date or sender or some other criteria.
In general, you can sort a bunch of items if you can compare and order each
pair of data items. Although not every kind of data comes with a
comparison primitive, we all know one that does: numbers. Hence, we use a
simplistic but highly representative sample problem in this section:
Sample Problem Design a function that sorts a list of reals.
The exercises below clarify how to adapt this function to other data.
Since the problem statement does not mention any other task and since
sorting does not seem to suggest other tasks, we just follow the design
recipe. Sorting means rearranging a bunch of numbers. This re-statement
implies a natural data definition for the inputs and outputs of the
function and thus its signature. Given that we have a definition for
, the first step is easy:
Returning alon ensures that the result is appropriate as far as
the function signature is concerned, but in general, the given
list isn’t sorted and this result is wrong.
When it comes to making up examples, it quickly becomes clear that the
problem statement is quite imprecise. As before, we use the data
definition of List-of-numbers
to organize the development of
examples. Since the data definition consists of two clauses, we need two
examples. Clearly, when sort>
is applied to '()
result must be '()
. The question is what the result for
should be. The list isn’t sorted, but there are two ways to sort it:
(cons 20 (cons 12 (cons -5 '()))), that is, a
list with the numbers arranged in descending order; and
(cons -5 (cons 12 (cons 20 '()))), that is, a
list with the numbers arranged in ascending order.
In a real-world situation, you would now have to ask the person who posed
the problem for clarification. Here we go for the descending alternative;
designing the ascending alternative doesn’t pose any different obstacles.
The decision calls for a revision of the header material:
The header material now includes the examples reformulated as unit tests
and using list
. If the latter makes you uncomfortable,
reformulate the test with cons
to exercise translating back and
forth. As for the additional two examples, they demand that sort works on
lists sorted in ascending and descending order.
Next we must translate the data definition into a function template. We
have dealt with lists of numbers before, so this step is easy:
Using this template, we can finally turn to the interesting part of the
program development. We consider each case of the cond
separately, starting with the simple case. If sort>
’s input is
, the answer is '()
, as specified by the example.
’s input is a cons
ed list, the template suggests
two expressions that might help:
(first alon) extracts the first number from the input;
(sort> (rest alon)) re-arranges (rest alon) in
descending order, according to the purpose statement of the function.
To clarify these abstract answers, let us use the second example to
explain these pieces in detail. When sort>
consumes (list 12 20 -5)
(first alon) is 12,
(rest alon) is (list 20 -5), and
(sort> (rest alon)) produces (list 20 -5),
because this list is already sorted.
To produce the desired answer, sort>
must insert 12
between the two numbers of the last list. More generally, we must find an
expression that inserts (first alon)
in its proper place into the
result of (sort> (rest alon))
. If we can do so, sorting
is an easily solved problem.
Inserting a number into a sorted list clearly isn’t a simple task. It
demands searching through the sorted list to find the proper place of the
item. Searching through any list demands an auxiliary function,
because lists are of arbitrary size and, by hint 3 of the preceding
section, processing values of arbitrary size, calls for the design of an
So here is the new wish list entry:
That is, insert consumes a number and a list sorted in descending
order and produces a sorted list by inserting the former into the latter.
With insert, it is easy to complete the definition of sort>:
In order to produce the final result, sort> extracts the first
item of a non-empty list, computes the sorted version of the rest, and
uses insert to produce the completely sorted list from the two
Stop! Test the program as is. Some test cases pass, and some fail. That’s
progress. The next step in its design is the creation of functional
examples. Since the first input of insert is any number, we use
5 and use the data definition for List-of-numbers to make
up examples for the second input.
First we consider what insert should produce when given a
number and '(). According to insert’s purpose
statement, the output must be a list, it must contain all numbers from the
second input, and it must contain the first argument. This suggests the
Second, we use a non-empty list of just one item:
The reasoning of why these are the expected results is just like
before. For one, the result must contain all numbers from the second list
and the extra number. For two, the result must be sorted.
Finally, let us create an example with a list that contains more than one
item. Indeed, we can derive such an example from the examples for
and especially from our analysis of the second cond
clause. From there, we know that sort>
works only if 12
inserted into (list 20 -5)
at its proper place:
That is, insert is given a second list and it is sorted in
Note what the development of examples teaches us. The insert
function has to find the first number that is smaller than the given
n. When there is no such number, the function eventually reaches
the end of the list and it must add n to the end. Now, before we
move on to the template, you should work out some additional examples. To
do so, you may wish to use the supplementary examples for sort>.
In contrast to sort>, the function insert consumes
two inputs. Since we know that the first one is a number and atomic, we
can focus on the second argument—the list of numbers—for the template
The only difference between this template and the one for sort> is
that this one needs to take into account the additional argument n.
To fill the gaps in the template of insert, we again proceed on a
case-by-case basis. The first case concerns the empty list. According to
the first example, (list n) is the expression needed in the first
cond clause, because it constructs a sorted list from n
The second case is more complicated than the first, and so we follow the
questions from figure 49
(first alon) is the first number on alon, and
(rest alon) is the rest of alon and, like
alon, it is sorted in descending order;
(insert n (rest alon)) produces a sorted list from
n and the numbers on (rest alon).
The problem is how to combine these pieces of data to get the final
Let us work through some examples to make all this concrete:
and larger than any of the numbers in the
second input. We know so by just looking at the first item of the list. It
but because the list is sorted, all other numbers on the
list are even smaller than 6
. Hence it suffices if we just
onto (list 6 5 4)
In contrast, when the application is something like
(insert 0 (list 6 2 1 -1))
must indeed be inserted into the rest of the list. More
concretely, (first alon)
; (rest alon)
(list 2 1 -1)
; and (insert n (rest alon))
(list 2 1 0 -1)
according to the purpose statement. By adding
back onto that last list, we get the desired answer for
(insert 0 (list 6 2 1 -1))
To get a complete function definition, we must generalize these
examples. The case analysis suggests a nested conditional that determines
is larger than (or equal to) (first alon)
If so, all the items in alon are smaller than n
because alon is already sorted. The answer in that case is
(cons n alon).
If, however, n
is smaller than (first alon)
the function has not yet found the proper place to insert n
. The first item of the result must be
and that n
must be inserted into
. The final result in this case is
because this list contains n and all items of alon in
sorted order—which is what we need.
The translation of this discussion into BSL+ calls for an
expression for such cases. The condition is (>= n (first alon))
and the expressions for the two branches have been
Figure 68 contains the complete sort program. Copy it into the
definition area of DrRacket, add the test cases back in, and test the
program. All tests should pass now and they should cover all expressions.
Terminology This particular program for sorting is known as
insertion sort in the programming literature. Later we will
study alternative ways to sort lists, using an entirely different design
Figure 68: Sorting lists of numbers
Exercise 186. Take a second look at Intermezzo: BSL, the intermezzo
that presents BSL and its ways of formulating tests. One of the latter
is check-satisfied, which determines whether an expression
satisfies a certain property. Use sorted>? from exercise 145 to
re-formulate the tests for sort> with check-satisfied.
Now consider this function definition:
Can you formulate a test case that shows sort>/bad
sorting function? Can you use check-satisfied
to formulate this
Notes (1) What may surprise you here is that we define a function to
create a test. In the real world, this step is common and, on occasion, you
really need to design functions for tests—with their own tests and all.
(2) Formulating tests with check-satisfied is occasionally easier
than using check-expect (or other forms), and it is also a bit
more general. When the predicate completely describes the relationship
between all possible inputs and outputs of a function, computer scientists
speak of a specification. Specifying with lambda explains how
to specify sort> completely.
187. Design a program that sorts lists of game players by score:
|(define-struct gp [name score])|
|; A GamePlayer is a structure: |
|; (make-gp String Number)|
|; interpretation (make-gp p s) represents player p who |
|; scored a maximum of s points |Hint
Formulate a function that compares two elements of GamePlayer
188. Design a program that sorts lists of emails by date:
|(define-struct email [from date message])|
|; A Email Message is a structure: |
|; (make-email String Number String)|
|; interpretation (make-email f d m) represents text m |
|; sent by f, d seconds after the beginning of time |
Also develop a program that sorts lists of email messages by name. To
compare two strings alphabetically, use the string<?
189. Here is the function search
It determines whether some number occurs in a list of numbers. The function
may have to traverse the entire list to find out that the number of
interest isn’t contained in the list.
Develop the function search-sorted, which determines whether a
number occurs in a sorted list of numbers. The function must take advantage
of the fact that the list is sorted.
Exercise 190. Design the prefixes function, which consumes
a list of 1Strings and produces the list of all prefixes. A list
p is a prefix of l if p and
l are the same up through all items in p. For example,
(list "a" "b" "c") is a prefix of itself and (list "a" "b" "c" "d").
Design the function suffixes, which consumes a list of
1Strings and produces all suffixes. A list s is a
suffix of l if p and l are the same from the
end, up through all items in s. For example, (list "b" "c" "d") is a
suffix of itself and (list "a" "b" "c" "d").
11.4 Auxiliary Functions that Generalize
On occasion an auxiliary function is not just a small helper function but a
solution to a more general problem. Such auxiliaries are needed when a
problem statement is too narrow. As programmers work through the steps of
the design recipe, they may discover that the “natural” solution is
wrong. An analysis of this broken solution may suggest a slightly
different, but more general problem statement, and a simple way of using
the solution to the general problem for the original one.
We illustrate this idea with a solution to the following problem:Paul C. Fisher suggested this problem.
Sample Problem Design a function that adds a polygon to a given scene.
Just in case you don’t recall your basic geometry (domain) knowledge, we
add a (simplistic) definition of polygon:The statement
should also say that the points are not on a line.
A polygon is a planar figure with at least three points
connected by three straight sides.
One natural data representation for a polygon is thus a list of
s. For example, the following two definitions
introduce a triangle and a square, just as the names say. Now you may
wonder how to interpret '()
or (list (make-posn 30 40))
as polygons, and the answer is that they do not
polygons. Because a polygon consists of at least three three points,
a good data representations of polygons is the collection of lists with at
least three Posn
Following the development of the data definition for non-empty lists of
, in Non-empty Lists
formulating a data representation for polygons is straightforward:
The first clause says that a list of three Posn
s is a
and the second clause says that cons
onto some existing Polygon
creates another one. Since
this data definition is the very first to use list
in one of its
clauses, we spell it out with cons
just to make sure you see this
conversion from an abbreviation to long hand in this context:
The point is that a naively chosen data representation—plain lists of
Posns—may not properly represent the intended
information. Revising the data definition during an initial exploration is
normal; indeed, on occasion such revisions become necessary during the rest
of the design process. As long as you stick to a systematic approach,
though, changes to the data definition can naturally be propagated through
the rest of the design.
The second step calls for the signature, purpose statement, and header of the
function. Since the problem statement mentions just one task and no other task is
implied, we start with one function:
The additional definition of MT is called for, because it
simplifies the formulation of examples.
For the first example, we use the above-mentioned triangle. A quick look in
the 2htdp/image library
is the function
needed to render the three lines for a triangle:
The innermostOf course, we experimented in DrRacket’s interaction area to
get this expression right. scene+line
renders the line from the first to the second
; the middle one uses the second and third Posn
the outermost scene+line
connects the third and the first
Given that the first and smallest polygon is a triangle, a rectangle or a
square suggests itself as the second example. We use square-p:
A square is just one more point than a triangle, and it is easy
render. You may also wish to draw these shapes on a piece of graph paper.
The construction of the template poses a challenge. Specifically,
the first and the second question of figure 48
the data definition differentiates distinct subsets and how to distinguish
among them. While the data definition clearly sets apart triangles from
all other polygons in the first clause, it is not immediately clear how to
differentiate the two. Both clauses describe lists of Posn
first describes lists of three Posn
s, the second one describes
lists of Posn
s that have at least four items. Thus one
alternative is to ask whether the given polygon is three items long:
Using the long-hand version of the first clause, that is,
suggests a second way to formulate the first condition, namely, checking whether
the given Polygon
is empty after using three rest
Since all Polygon
s consist of at least three Posn
three times is legal. Unlike length
is a primitive, easy-to-understand operation with a clear operational
meaning. It selects the second field in a cons
structure and that
is all it does.
It is truly better to formulate conditions in terms of built-in
predicates and selectors than your own (recursive) functions. See
Intermezzo: The Cost of Computation for an explanation.
The rest of the questions in figure 48
have direct answers,
and thus we get this template:
describes a triangle in the first clause,
it must consist of exactly three Posn
s, which are extracted via
, and third
. In the second clause,
consists of a Posn
, justifying (first p)
and (rest p)
The former extracts a Posn
, the latter a
. We therefore add a self-referential function call around
it; we must also keep in mind that dealing with (first p)
this clause and the three Posn
s in the first clause may
demand the design of an auxiliary function.
Now we are ready to focus on the function definition, dealing with one
clause at a time. The first clause concerns triangles, which
suggests a straightforward answer. Specifically, there are
s and render-poly
should connect the three in
an empty scene of 50 by 50 pixels. Given Posn
is a separate data
definition, we get an obvious wish list entry:
Using this function, the first cond
This expression obviously renders the given Polygon p
triangle by drawing a line from the first to the second, the second to the
third, and the third to the first Posn
The second cond
clause is about Polygon
s that have been
extended with one Posn
. In the template, we find two expressions
and, following figure 49
, we remind ourselves of what
these expressions compute:
(first p) extracts the first Posn;
(rest p) extracts the Polygon from p; and
(render-polygon img (rest p)) renders (rest p),
which is what the purpose statement of the function says.
The question is how to use these pieces to render the given
One idea that may come to mind is that (rest p)
consists of at
least three Posn
s. It is therefore possible to extract at least
from this embedded Polygon
and to connect
with this additional point. Here is what this idea looks
like with BSL+ code:
|(render-line (render-poly MT (|