If you follow the design recipe of the first four parts, you either turn domain knowledge into code or you exploit the structure of the data definition to organize your code. The latter functions typically decomposeSome functions merely compose such functions; we group those functions also with the “structural” group. their arguments into their immediate structural components and then process those components. If one of these immediate components belongs to the same class of data as the input, the function is structurally recursive. While structurally designed functions make up the vast majority of code in the real world, some problems cannot be solved with a structural approach to design.
To solve such complicated problems, programmers use generative recursion, a form of recursion that is strictly more powerful than structural recursion. The study of generative recursion is as old as mathematics and is often called the study of algorithms. The inputs of an algorithm represent a problem. An algorithm tends to re-arrange a problem into a set of several problems, solve those, and combine their solutions into one overall solution. Often some of these newly generated problems are the same kind of problem as the given one, in which case the algorithm can be re-used to solve them. In these cases, the algorithm is recursive but its recursion uses newly generated data not immediate parts of the input data.
From the very description of generative recursion, you can tell that designing a generative recursive function is more of an ad hoc activity than designing a structurally recursive function. Still, many elements of the general design recipe apply to the design of algorithms, too, and this part of the book illustrates how and how much the design recipe helps. The key to designing algorithms is the “generation” step, which often means dividing up the problem. And figuring out a novel way of dividing a problem requires insight.Greeks call it a “eureka.” Sometimes very little insight is required. For example, it might just require a bit of common-sense knowledge about breaking up sequences of letters. At other times, it may rely on deep mathematical theorems about numbers. In practice, programmers design simple algorithms on their own and rely on domain specialists for their complex brethren. For either kind, programmers must thoroughly understand the underlying ideas so that they can code up algorithms and have the program communicate with future readers. The best way to get acquainted with the idea is to study a wide range of examples and to develop a sense for the kinds of generative recursions that may show up in the real world.
At this point you have designed numerous functions that employ structural recursion. When you design a function, you know you need to look at the data definition for its major input. If this input is described by a self-referential data definition, you end up with a function that refers to itself basically where the data definition refers to itself.
This chapter presents two sample programs that use recursion differently. They are illustrative of the problems that require some eureka, ranging from the obvious idea to the sophisticated insight.
Imagine you have joined the DrRacket team. The team is working on a sharing service to support collaborations among programmers. Concretely, the next revision of DrRacket is going to enable ISL programmers to share the content of their DrRacket’s definition area across several computers. Each time one programmer modifies the buffer, the revised DrRacket broadcasts the content of the definitions area to the instances of DrRacket that participate in the sharing session.
Sample Problem: Your task is to design the function bundle, which prepares the content of the definitions area for broadcasting. DrRacket hands over the content as a list of 1Strings. The function’s task is to bundle up sub-sequences of individual “letters” into chunks and to thus produce a list of strings—
called chunks— of a given length, called chunk size.
(list "a" "b" "c" "d" "e" "f" "g" "h")
(list "ab" "cd" "ef" "gh")
(check-expect (bundle '() 3) '())
; N as the driving data definition, s considered atomic ; according to Processing Two Lists Simultaneously: Case 1 (define (bundle s n) (cond [(zero? n) (...)] [else (... s ... n ... (bundle s (sub1 n)))])) ; [List-of 1String] as the driving data definition, n considered atomic ; according to Processing Two Lists Simultaneously: Case 1 (define (bundle s n) ; s as the driving data definition (cond [(empty? s) (...)] [else (... s ... n ... (bundle (rest s) n))])) ; [List-of 1String] and N are on equal footing ; according to Processing Two Lists Simultaneously: Case 2 (define (bundle s n) ; lock step (cond [(and (empty? s) (zero? n)) (...)] [else (... s ... n ... (bundle (rest s) (sub1 n)))])) ; the cross-product of possibilities, ; according to Processing Two Lists Simultaneously: Case 3 (define (bundle s n) (cond [(and (empty? s) (zero? n)) (...)] [(and (cons? s) (zero? n)) (...)] [(and (empty? s) (positive? n)) (...)] [else (... s ... n ... (bundle (rest s) (sub1 n)) ... (bundle s (sub1 n)) ... (bundle (rest s) n) ...)]))
The template step reveals that a structural approach cannot
work. Figure 99 shows four possible
templates given that both arguments to bundle are complex arguments. The first
two consider one of the arguments atomic, but that clearly cannot be the case
because the function has to understand each argument. The third template
is based on the assumption that the two arguments are processed in lock
step, which is close—
; [List-of 1String] N -> [List-of String] ; bundles sub-sequences of s into strings of length n ; idea take and drop n items at a time (check-expect (bundle (explode "abcdefg") 3) (list "abc" "def" "g")) (check-expect (bundle (explode "ab") 3) (list "ab")) (check-expect (bundle '() 3) '()) (define (bundle s n) (cond [(empty? s) '()] [else (cons (implode (take s n)) (bundle (drop s n) n))])) ; [List-of X] N -> [List-of X] ; retrieves the first n items in l if possible or everything (define (take l n) (cond [(zero? n) '()] [(empty? l) '()] [else (cons (first l) (take (rest l) (sub1 n)))])) ; [List-of X] N -> [List-of X] ; remove the first n items from l if possible or everything (define (drop l n) (cond [(zero? n) l] [(empty? l) l] [else (drop (rest l) (sub1 n))]))
if the given list is '(), the result is '() as decided upon;
it then recurs with a list that is shortened by n items, which is accomplished with drop;
finally, cons combines the string from 2 with the list of strings from 3 to create the result for the complete list.
While the definition of bundle is unusual, the underlying ideas are intuitive and not too different from the functions seen so far. Indeed, if the chunk size n is 1, bundle specializes to a structurally recursive definition. Also, drop is guaranteed to produce an integral part of the given list, not some arbitrarily rearranged version. And this idea is precisely what the next section presents.
Exercise 421. Define the function list->chunks. It consumes a list l of arbitrary data and a natural number n. The function’s result is a list of list chunks of size n. Each chunk represents a sub-sequence of items in l.
Use list->chunks to define bundle via function composition.
For non-empty strings s and positive natural numbers n,is #true. But don’t use this equality as the definition for partition; use substring instead.
Hint Have partition produce its “natural” result for the empty string. For the case where n is 0, see exercise 420.
Note The partition function is somewhat closer to what a cooperative DrRacket environment would need than bundle.
Recall that the sort> function from section Design by Composition consumes a list of, say, numbers and re-arranges the same list of numbers in some order, typically ascending or descending. It proceeds by inserting the first number into the appropriate position of the sorted rest of the list. Put differently, it is a structurally recursive function that re-processes the result of the natural recursions.
one that contains all the numbers that are strictly smaller than the first item
and another one with all those items that are strictly larger than the first.
To develop an understanding of how the quick-sort algorithm works, let’s walk through an example, quick-sorting (list 11 8 14 7). Figure 101 illustrates the process in a graphical way. The figure consists of a top half, the divide phase, and the bottom half, the conquer phase.
The partition phase is represented with boxes and solid arrows. Three arrows emerge from each boxed list and go to a box with three pieces: the circled pivot element in the middle, to its left the boxed list of numbers smaller than the pivot, and to its right the boxed list of those numbers that are larger than the pivot. Each of these steps isolates at least one number as the pivot, meaning the two neighboring lists are shorter than the given list. Consequently, the overall process terminates too.
Consider the first step where the input is (list 11 8 14 7). The pivot item is 11. Partitioning the list into items larger and smaller than 11 produces (list 8 7) and (list 14). The remaining steps of the partitioning phase work in an analogous way. Partitioning ends when all numbers have been isolated as pivot elements. At this point, you can already read off the final result by reading the pivots from left to right.
The conquering phase is represented with dashed arrows and boxed lists. Three arrows enter each result box: the middle one from a pivot, the left one from the boxed result of sorting the smaller numbers, and the right one from the boxed result of sorting the larger ones. Each step adds at least one number to the result list, the pivot, meaning the lists grow toward the bottom of the diagram. The box at the bottom is a sorted variant of the given list at the top.
Take a look at the left-most, upper-most conquer step. It combines the pivot 7 with two empty lists, resulting in '(7). The next one down corresponds to the partitioning step that isolated 8 and thus yields '(7 8). Each level in the conquering phase mirrors a corresponding level from the partitioning phase. After all, the overall process is recursive.
Exercise 423. Draw a quick-sort diagram for (list 11 9 2 18 12 14 4 1).
Since the rest of the list is of unknown size, we leave the task of partitioning the list to two auxiliary functions: smaller-items and larger-items. They process the list and filter out those items that are smaller and larger, respectively, than the pivot. Hence each auxiliary function accepts two arguments, namely, a list of numbers and a number. Designing these two functions is an exercise in structural recursion. Try on your own or read the definitions shown in figure 102.
; [List-of Number] -> [List-of Number] ; creates a list of numbers with the same numbers as ; alon, sorted in ascending order ; assume the numbers are all distinct (define (quick-sort alon) (cond [(empty? alon) '()] [else (local ((define pivot (first alon))) (append (quick-sort (smaller-items alon pivot)) (list pivot) (quick-sort (larger-items alon pivot))))])) ; [List-of Number] Number -> [List-of Number] ; creates a list with all those numbers on alon ; that are larger than n (define (larger-items alon n) (cond [(empty? alon) '()] [else (if (> (first alon) n) (cons (first alon) (larger-items (rest alon) n)) (larger-items (rest alon) n))])) ; [List-of Number] Number -> [List-of Number] ; creates a list with all those numbers on alon ; that are smaller than n (define (smaller-items alon n) (cond [(empty? alon) '()] [else (if (< (first alon) n) (cons (first alon) (smaller-items (rest alon) n)) (smaller-items (rest alon) n))]))
(quick-sort (smaller-items alon pivot)), which sorts the list of items smaller than the pivot; and
(quick-sort (larger-items alon pivot)), which sorts the list of items larger than the pivot.
(quick-sort (list 11 8 14 7)) == (append (quick-sort (list 8 7)) (list 11) (quick-sort (list 14))) == (append (append (quick-sort (list 7)) (list 8) (quick-sort '())) (list 11) (quick-sort (list 14))) == (append (append (append (quick-sort '()) (list 7) (quick-sort '())) (list 8) (quick-sort '())) (list 11) (quick-sort (list 14))) == (append (append (append '() (list 7) '()) (list 8) '()) (list 11) (quick-sort (list 14))) == (append (append (list 7) (list 8) '()) (list 11) (quick-sort (list 14))) ...
Both figure 101 and the calculation also show how quick-sort completely ignores the structure of the given list. The first recursion works on two distant numbers from the originally given list and the second one on the list’s third item. These recursions aren’t random but they are certainly not relying on the structure of the data definition.
Contrast quick-sort’s organization with that of the sort> function from Design by Composition. The design of the latter follows the structural design recipe, yielding a program that processes a list item by item. By splitting the list, quick-sort can speed up the process of sorting the list, though at the cost of not using plain first and rest.
The hand evaluation of (quick-sort (list 11 8 14 7)) suggests an additional trivial case for quick-sort. Every time quick-sort consumes a list of one item, it produces the very same list. After all, the sorted version of a list of one item is the list itself.
Modify the definition of quick-sort to take advantage of this observation.
Hand evaluate the example again. How many steps does the extended algorithm save?
Exercise 425. While quick-sort quickly reduces the size of the problem in many cases, it is inappropriately slow for small problems. Hence people use quick-sort to reduce the size of the problem and switch to a different sort function when the list is small enough.
Develop a version of quick-sort that uses sort> from Recursive Auxiliary Functions if the length of the input is below some threshold.
Exercise 426. If the input to quick-sort contains the same number several times, the algorithm returns a list that is strictly shorter than the input. Why? Fix the problem so that the output is as long as the input.
Exercise 427. Use filter to define smaller-items and larger-items as one-liners.
Exercise 428. Develop a variant of quick-sort that uses only one comparison function, say, <. Its partitioning step divides the given list alon into a list that contains the items of alon smaller than (first alon) and another one with those that are not smaller.
Use local to package up the program as a single function: Abstract this function so that it consumes a list and a comparison function.
The overview for this part already explains that the design of generative recursion functions is more ad hoc than structural design. As the first chapter shows, two generative recursions can radically differ in how they process functions. Both bundle and quick-sort process lists, but while the former at least respects the sequencing in the given list, the latter re-arranges its given list at will. The question is whether a single design recipe can help with the creation of such widely differing functions.
The first section of this chapter shows how to adapt the process dimension of the design recipe to generative recursion. The second section hones in on another new phenomenon: an algorithm may fail to produce an answer for some of its inputs. Programmers must therefore analyze their programs and supplement the design information with a comment on termination. The remaining sections in this chapter compare and contrast structural and generative recursion.
As before, we must represent the problem information as data in our chosen programming language. The choice of a data representation for a problem affects our thinking about the computational process, so some planning ahead is necessary. Alternatively, be prepared to backtrack and to explore different data representations. Regardless, we must analyze the problem information and define data collections.
We also need a signature, a function header, and a purpose statement. Since the generative step has no connection to the structure of the data definition, the purpose statement must go beyond what the function is to compute and also explain how the function computes its result.
It is useful to explain the “how” with function examples, the way we explained bundle and quick-sort in the previous chapter. That is, while function examples in the structural world merely specify which output the function is to produce for which input, the purpose of examples in the world of generative recursion is to explain the underlying idea behind the computational process.
For bundle, the examples specify how the function acts in general and in certain boundary cases. For quick-sort, the example in figure 101 illustrates how the function partitions the given list with respect to the pivot item. By adding such worked examples to the purpose statement, we—
the designers— gain an improved understanding of the desired process, and we communicate this understanding to future readers of this code.
Our discussion suggests a general template for algorithms. Roughly speaking, the design of an algorithm distinguishes two kinds of problems: those that are trivially solvableFor this part of the book, “trivial” is a technical term. and those that are not. If a given problem is trivially solvable, an algorithm produces the matching solution. For example, the problems of sorting an empty list or a one-item list are trivially solvable. A list with many items is a non-trivial problem. For these non-trivial problems, algorithms commonly generate new problems of the same kind as the given one, solve those recursively, and combine the solutions into an overall solution.Based on this rough sketch, all algorithms have roughly this organization:The original problem is occasionally needed to combine the solutions for the newly generated problems, which is why it is handed over to combine-solutions.
- This template is only a suggestive blueprint, not a definitive shape. Each piece of the template is to remind us to think about the following four questions:
To define the algorithm as a function, we must express the answers to these four questions as functions and expressions in terms of the chosen data representation.For this step, the table-driven attempt from Designing with Self-Referential Data Definitions might help again. Reconsider the quick-sort example from Recursion that Ignores Structure. The central idea behind quick-sort is to divide a given list into a list of smaller items and larger items and to sort those separately. Figure 103 spells out how some simple numeric examples work out for the non-trivial cases. From these examples it is straightforward to guess that the answer to the fourth question is
What is a trivially solvable problem?
How are trivial solutions solved?
How does the algorithm generate new problems that are more easily solvable than the original one? Is there one new problem that we generate or are there several?
Is the solution of the given problem the same as the solution of (one of) the new problems? Or, do we need to combine the solutions to create a solution for the original problem? And, if so, do we need anything from the original problem data?
; append sorted-smaller, pivot, and sorted-larger into one listwhich can easily be translated into code.
Once the function is complete, it is time to test it. As before, the goal of testing is to discover and eliminate bugs. Remember that testing cannot validate that the function works correctly for all possible inputs.
'(2 3 1 4)
'(1 2 3 4)
'(2 0 1 4)
'(0 1 2 4)
'(3 0 1 4)
'(0 1 3 4)
Exercise 431. Exercise 219 defines the function food-create, which consumes a Posn and produces a randomly chosen, guaranteed distinct Posn. The exercise defers to this part of the book for an explanation of the design of the function. Justify the design of food-create. Re-formulate the two functions as a single definition; use local..
Contrast this insight with the designs presented in the first four parts. Every function designed according to the recipe either produces an answer or raises an error signal for every input. After all, the recipe dictates that each natural recursion consumes an immediate piece of the input, not the input itself. Because data is constructed in a hierarchical manner, input shrinks at every stage. Eventually the function is applied to an atomic piece of data, and the recursion stops.
This reminder also explains why generative recursive functions may diverge. According to the design recipe for the latter, an algorithm may generate new problems without any limitations. If the design recipe required the designer to guarantee that the new problem were “smaller” than the given one, it would terminate. But, imposing such a restrictionThe theory of computation actually shows that we must lift these restrictions eventually. would needlessly complicate the design of functions such as bundle.
In this book, we therefore keep the first six steps of the design recipe
intact and supplement them with a seventh step: the termination
argument. Figure 104 summarizes this decision in a table. It
shows the new design recipe in the conventional tabular form. The
unmodified steps come with —
data representation and definition
a purpose statement concerning the ``how'' of the function
supplement the explanation of what the function computes with a one-liner on how it computes the result
examples and tests
explain the ``how'' with several examples
formulate conditions for trivially solvable problems; formulate answers for these trivial cases; determine how to generate new problems for non-trivial problems, possibly using auxiliary functions; determine how to combine the solutions of the generated problems into a solution for the given problem
(1) a size argument for each recursive call or (2) examples of exceptions to termination
investigate whether the problem data for each recursive data is smaller than the given data; find examples that cause the function to loop
A termination argument comes in one of two forms. The first one argues why each recursive call works on a problem that is smaller than the given one. Often this argument is straightforward; on rare occasions, you will need to work with a mathematician to prove a new theorem for such arguments. The second kind of termination argument illustrates with an example that the function may not terminate. Ideally it should also describe the classYou cannot define a predicate for this class; otherwise you could modify the function and ensure that it always terminates. of data for which the function may loop. In some extremely rare cases, you may not be able to make either argument because computer science does not know enough yet.
Exercise 433. Consider the following definition of smaller-items, one of the two “problem generators” for quick-sort:
; [List-of Number] Number -> [List-of Number] ; creates a list with all those numbers on alon ; that are smaller than or equal to threshold (define (smaller-items alon threshold) (cond [(empty? alon) '()] [else (if (<= (first alon) threshold) (cons (first alon) (smaller-items (rest alon) threshold)) (smaller-items (rest alon) threshold))]))What can go wrong when this version is used with the quick-sort definition from Recursion that Ignores Structure?
Exercise 434. When you worked on exercise 428 or exercise 426, you may have produced looping solutions. Similarly, exercise 433 actually reveals how brittle the termination argument is for quick-sort. In all cases, the argument relies on the idea that smaller-items and larger-items produce lists that are maximally as long as the given list, and our understanding that neither includes the given pivot in the result.
Based on this explanation, modify the definition of quick-sort so that both functions already receive lists that are shorter than the given one.
Exercise 435. Formulate a termination argument for food-create from exercise 431.
The template for algorithms is so general that it includes structurally recursive functions. Consider the left side of figure 105. This template is specialized to deal with one trivial clause and one generative step. If we replace trivial? with empty? and generate with rest, we get a template for list-processing functions; see the right side of figure 105.
special computes the length of its input,
special negates each number on the given list of numbers, and
special uppercases the given list of strings.What do you conclude from these exercises?
Is there a real difference between structural recursive design and the one for generative recursion?
Our answer is “it depends.” Of course, we could say that all functions using structural recursion are just special cases of generative recursion. This “everything is equal” attitude, however, is of no help if we wish to understand the process of designing functions. It confuses two kinds of design that require different forms of knowledge and that have different consequences. One relies on a systematic data analysis and not much more; the other requires a deep, often mathematical, insight into the problem-solving process itself. One leads programmers to naturally terminating functions; the other requires a termination argument. Conflating these two approaches is simply unhelpful.
When you interact with a function f that consumes lists of numbers and produces sorted variants, it is impossible for you to know whether f is the sort> function or quick-sort. The two functions behave in an observably equivalent way.Observable equivalence plays a central role in the study of programming languages. This raises the question of which of the two a programming language should provide in its library. More generally, when we can design a function using structural recursion and generative recursion, we need to figure out which one to pick.
6 is evenly divisible by 1, 2, 3, and 6;
25 is evenly divisible by 1, 5, and 25.
18 is evenly divisible by 1, 2, 3, 6, 9, and 18;
24 is evenly divisible by 1, 2, 3, 4, 6, 8, 12, and 24.
; N[>= 1] N[>= 1] -> N ; finds the greatest common divisor of n and m (check-expect (gcd 6 25) 1) (check-expect (gcd 18 24) 6) (define (gcd n m) 1)
From here we design both a structural and a generative recursive solution. Since this part of the book is about generative recursion, we merely present a structural solution in figure 106 and leave the design ideas to exercises. Just note that (= (remainder n i) (remainder m i) 0) encodes the idea that both n and m are “evenly divisible” by i.
(gcd-structural 101135853 45014640)
(time (gcd-structural 101135853 45014640))in the interactions area.
for two natural numbers—
large and small— the greatest common divisor is equal to the greatest common divisor of small and the remainder of large divided by small.
The given numbers are 18 and 24.
According to the insight, they have the same greatest common divisor as 18 and 6.
And these two have the same greatest common divisor as 6 and 0.
when the smaller of the two given numbers is 0, we are faced with a trivial case;
the larger of the two numbers is the solution in the trivial case;
generating a new problem requires a single remainder operation; and
the above equation tells us that the answer to the newly generated problem is also the answer to the originally given problem.
Figure 107 presents the definition of the algorithm. The local definition introduces the workhorse of the function: clever-gcd. Its first cond line discovers the trivial case by comparing smaller to 0 and produces the matching solution. The generative step uses smaller as the new first argument and (remainder large small) as the new second argument to clever-gcd.
(gcd-generative 101135853 45014640)
... == (clever-gcd 101135853 45014640) == (clever-gcd 45014640 11106573) == (clever-gcd 11106573 588348) == (clever-gcd 588348 516309) == (clever-gcd 516309 72039) == (clever-gcd 72039 12036) == (clever-gcd 12036 11859) == (clever-gcd 11859 177) == (clever-gcd 177 0)
(time (gcd-generative 101135853 45014640))in the interactions area.
You may now think that generative recursion design has discovered a much faster solution to the gcd problem, and you may conclude that generative recursion is always the right way to go. This judgment is too rash for three reasons. First, even a well-designed algorithm isn’t always faster than an equivalent structurally recursive function. For example, quick-sort wins only for large lists; for small ones, the standard sort> function is faster. Worse, a badly designed algorithm can wreak havoc on the performance of a program. Second, it is typically easier to design a function using the recipe for structural recursion. Conversely, designing an algorithm requires an idea of how to generate new problems, a step that often requires some deep insight. Finally, people who read functions can easily understand structurally recursive functions, even without much documentation. To understand an algorithm, the generative step must be explained really well, but generating a really good explanation can be a lot of hard work.
Experience shows that most functions in a program employ structural design; only a few exploit generative recursion. When we encounter a situation where a design could use the recipe for either structural or generative recursion, the best approach is to start with a structural version. If it turns out to be too slow for the task at hand, explore the use of generative recursion.
(quick-sort (list 10 6 8 9 14 12 3 11 14 16 2))by hand. Show only those lines that introduce a new recursive call to quick-sort. How many recursive applications of quick-sort are required? How many recursive applications of the append function? Suggest a general rule for a list of length n.Evaluate
(quick-sort (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14))by hand. How many recursive applications of quick-sort are required? How many recursive applications of append? Does this contradict the first part of the exercise?
Exercise 441. Add sort> and quick-sort to the definitions area. Run tests on the functions to ensure that they work on basic examples. Also develop create-tests, a function that creates large test cases randomly. Then explore how fast each works on various lists.
Does the experiment confirm the claim that the plain sort> function wins in many comparisons over quick-sort for short lists and vice versa?
Determine the cross-over point. Use this information to build a clever-sort function that behaves like quick-sort for large lists and switches over to the plain sort> function for lists below this cross-over point. See exercise 425.
Exercise 442. Given the header material for gcd-structural, a naive use of the design recipe might use the following template or some variant:Why is it impossible to find a divisor with this strategy?
Exercise 443. Exercise 442 means that the design for gcd-structural calls for some planning and a design by composition approach.The very explanation of “greatest common denominator” suggests a two-stage approach. First design a function that can compute the listIdeally, you should use sets not lists. of divisors of a natural number. Second, design a function that picks the largest common number in the list of divisors of n and the list of divisors of m. The overall function would look like this:
(define (gcd-structural small large) (largest-common (divisors smaller smaller) (divisors small large))) ; N[>= 1] N[>= 1] -> [List-of N] ; computes the list of divisors of l smaller or equal to k (define (divisors k l) '()) ; [List-of N] [List-of N] -> N ; finds the largest number common to both k and l (define (largest-common k l) 1)Before you design largest-common explain why divisors consumes two numbers and why it consumes smaller in both cases.
The design of an algorithm starts with an informal description of a process of how to create a problem that is more easily solvable than the given one and whose solution contributes to the solution of the given problem. Coming up with this kind of idea requires inspiration, penetration of an area, and experience with many different kinds of examples.
This chapter presents several illustrative examples of algorithms. Some are directly drawn from mathematics, which is the source of many ideas; others come from computational settings. The first example is a graphical illustration of our principle: the Sierpinski triangle. The second one explains the divide-and-conquer principle with the simple mathematical example of finding the root of a function. It then shows how to turn this idea into a fast algorithm for searching sequences, a widely used application. The third section concerns “parsing” of sequences of 1Strings, also a common problem in real-world programming.
Fractals play an important role in computational geometry. Flake writes in The Computational Beauty of Nature (The MIT Press, 1998) that “geometry can be extended to account for objects with a fractional dimension. Such objects, known as fractals, come very close to capturing the richness and variety of forms found in nature. Fractals possess structural self-similarity on multiple ... scales, meaning that a piece of a fractal will often look like the whole.”
Figure 108 displays an example of a fractal shape, known as the Sierpinski triangle. The basic shape is an (equilateral) triangle, like the one in the center. When this triangle is composed sufficiently many times in a triangular fashion, we get the left-most shape.
The right-most image in figure 108 explains the generative step. When taken by itself, it says that given a triangle, find the midpoint of each side and connect them to each other. This step yields four triangles; repeat the process for each of the outer of these three triangles unless these triangles are too small.
When the given number is so small that drawing triangles inside of it is pointless, the problem is trivial.
In that case, it suffices to generate a triangle.
Otherwise, the algorithm must generate a Sierpinski triangle of size side / 2 because juxtaposing two such triangles in either direction yields one of size side.
- If half-sized is the Sierpinski triangle of size side / 2, thenis a Sierpinski triangle of size side.
(define SMALL 4) (define small-triangle (triangle SMALL 'outline 'red)) ; Number -> Image ; generative creates Sierpinski triangle of size side by generating ; one of size side/2 and placing one copy above two composed copies (check-expect (sierpinski SMALL) small-triangle) (check-expect (sierpinski (* 2 SMALL)) (above small-triangle (beside small-triangle small-triangle))) (define (sierpinski side) (cond [(<= side SMALL) (triangle side 'outline 'red)] [else (local ((define half-sized (sierpinski (/ side 2)))) (above half-sized (beside half-sized half-sized)))]))
Figure 109 spells out the details. The “triviality condition” translates to (<= side SMALL) for some constant SMALL. For the trivial answer, the function returns a (red) triangle of the given size. In the recursive case, a local expression introduces the name half-sized for the Sierpinski triangle that is half as big as the specified size. Once the recursive call has generated the small Sierpinski triangle, the above-beside composition of three copies yields the desired triangle.
; create Sierpinski triangle of size side by ...
; generating one of size side/2 and ; placing one copy above two composed copies
Since sierpinski is based on generative recursion, defining the function and testing it is not the last step. We must also consider why the algorithm terminates for any given legal input. The input of sierpinski is a single positive number. If the number is smaller than SMALL, the algorithm terminates. Otherwise, the recursive call uses a number that is half as large as the given one. Hence, the algorithm must terminate for all positive sides, assuming SMALL is positive, too.
One view of the Sierpinski process is that it divides its problem in half until it is immediately solvable. With a little imagination, you can see that the process can be used to search for numbers with certain properties. The next section explains this idea in detail.
f(r) = 0.
Sample Problem: A rocket is flying at the constant speed of v miles per hour on a straight line towards some target, d0 miles away. It then accelerates at the rate of a miles per hour squared for t hours. When will it hit its target?
d(t) = (v * t + 1/2 * a * t2)
d0 = (v * t0 + 1/2 * a * t02)
Generally such problems call for more complex equations than quadratic ones. In response, mathematicians have spent the last few centuries developing root-finding methods for different types of functions. In this section, we study a solution that is based on the Intermediate Value Theorem, an early result of mathematical analysis. The resulting algorithm is a primary example of generative recursion based on a mathematical theorem. It has been adapted to other uses and has become known as the binary search algorithm in computer science.
The Intermediate Value Theorem says that a continuous function f has a root in an interval [a,b] if f(a) and f(b) are on the opposite side of the x-axis. By continuous we mean a function that doesn’t “jump,” that doesn’t have gaps, and that proceeds on a “smooth” path.
We thank Dr. Neil Toronto for plot.
Figure 110 illustrates the Intermediate Value Theorem in a graphical manner. The function f is below the x-axis at a and above the x-axis at b. It is a continuous function, as suggested by the uninterrupted, smooth graph. And indeed, f intersects the x-axis somewhere between a and b, labeled “range 1” in the figure.
m = (a+b) / 2 .
; [Number -> Number] Number Number -> Number ; determine R such that f has a root in [R,(+ R TOLERANCE)] ; assume f is continuous ; assume (or (<= (f left) 0 (f right)) (<= (f right) 0 (f left))) ; generative divide interval in half, the root is in one of the two ; halves, pick according to assumption (define (find-root f left right) 0)
It defines a binomial for which we can determine its roots by hand:
> (poly 2)
> (poly 4)
0Use poly to formulate a check-satisfied test for find-root.Also use poly to illustrate the root-finding process. Start with the interval [3,6] and tabulate the information as follows:
?Assume TOLERANCE is 0.5 for the construction of this table.
- We need a condition that describes when the problem is solved and a matching answer. Given our discussion so far, this is straightforward:
The matching result in the trivial case is left.
- For the generative case, we need an expression that generates new problems for find-root. According to our informal description, this step first requires determining the midpoint and its function value:The midpoint is then used to pick the next interval. Following Intermediate Value Theorem, the interval [left,mid] is the next candidate ifwhile [mid,right] is used for the recursive call ifTranslated into code, the body of the local expression must be a conditional:In both clauses, we use find-root to continue the search.
The answer to the final question is obvious. Since the recursive call to find-root finds the root of f, there is no need to process its solution any further.
; [Number -> Number] Number Number -> Number ; determines R such that f has a root in [R,(+ R TOLERANCE)] ; assume f is continuous ; assume (or (<= (f left) 0 (f right)) (<= (f right) 0 (f left))) ; generative divide interval in half, the root is in one of the two ; halves, pick according to assumption (define (find-root f left right) (cond [(<= (- right left) TOLERANCE) left] [else (local ((define mid (/ (+ left right) 2)) (define f@mid (f mid))) (cond [(or (<= (f left) 0 f@mid) (<= f@mid 0 (f left))) (find-root f left mid)] [(or (<= f@mid 0 (f right)) (<= (f right) 0 f@mid)) (find-root f mid right)]))]))
Hint Suppose the arguments of find-root describe an interval of size S1. How large is the distance between left and right for the first, second, third recursive call to find-root? After how many steps is (- right left) smaller than or equal to TOLERANCE?
In addition, find-root recomputes the value of a boundary across recursive calls. For example, (find-root f left right) computes (f left) and, if [left,mid] is chosen as the next interval, find-root computes (f left) again. Introduce a helper function that is like find-root but consumes not only left and right but also (f left) and (f right) at each recursive stage.
How many re-computations of (f left) does this design maximally avoid?
Note The two additional arguments to this helper function change at each recursive stage but the change is related to the change in the numeric arguments. These arguments are so-called accumulators, which are the topic of Accumulators.
Exercise 449. A function f is monotonically increasing if (<= (f a) (f b)) holds whenever (< a b) yields #true. Simplify find-root assuming the given function is not only continuous but also monotonically increasing.
Exercise 450. A tableMany programming languages, including Racket, support arrays and vectors, which are similar to tables. is a structure of two fields: the natural number VL and a function array, which consumes natural numbers and, for those between 0 and VL (exclusive), produces answers:Since this data structure is somewhat unusual, it is critical to illustrate it with examples:Here table1’s array function is defined for more inputs than its length field allows; table2 is defined for just one input, namely 0. Finally, we also define a useful function for looking up values in tables:
The root of a table t is the number in (table-array t) that is closest to 0. A root index is a natural number i such that (table-ref t i) is a root of table t.
A table t is monotonically increasing if (table-ref t 0) is less then (table-ref t 1), (table-ref t 1) is less than (table-ref t 2), and so on.
Design find-linear. The function consumes a monotonically increasing table and finds the smallest index for a root of the table. Use the structural recipe for N, proceeding from 0 through 1, 2, and so on to the array-length of the given table. This kind of root-finding process is often called a linear search.
Design find-binary, which also finds the smallest index for root of a monotonically increasing table but uses generative recursion to do so. Like ordinary binary search, the algorithm narrows an interval down to the smallest possible size and then chooses the index. Don’t forget to formulate a termination argument.
Hint The key problem is that a table index is a natural number, not a plain number. Hence the interval boundary arguments for find must be natural numbers. Consider how this observation changes (1) the nature of trivially solvable problem instances, (2) the midpoint computation, (3) and the decision which interval to generate next?
Consider (make-table 1024 a) and assume (= (a 1023) 0). How many recursive calls to find are needed in find-linear and find-binary respectively?
(list "h" "o" "w" " " "a" "r" "e" " " "y" "o" "u" "\n" "d" "o" "i" "n" "g" "?" "\n" "a" "n" "y" " " "p" "r" "o" "g" "r" "e" "s" "s" "?")
(list "a" "b" "c" "\n" "d" "e" "\n" "f" "g" "h" "\n")
The problem of turning a sequence of 1Strings into a list of lines is called the parsing problem. Many programming languages provide functions that retrieve lines, words, numbers and other kinds of so-called tokens from files. But even if they do, it is common that programs need to parse these tokens even further. This section provides a glimpse at a parsing technique. Parsing is so complex and so central to the creation of full-fledged software applications, however, that most undergraduate curricula come with at least one course on parsing. So do not think you can tackle real parsing problems properly even after mastering this section.
The problem is trivially solvable if the file is '().
In that case, the file doesn’t contain a line.
Otherwise, the file contains at least "\n" or one 1String. These items—
up to and including the first "\n", if any— must be separated from the rest of the File. The remainder is a new problem of the same kind that file->list-of-lines can solve.
; File -> [List-of Line] ; converts a file into a list of lines (check-expect (file->list-of-lines '("\n" "\n")) '(() ())) (check-expect (file->list-of-lines (list "a" "b" "c" "\n" "d" "e" "\n" "f" "g" "h" "\n")) (list (list "a" "b" "c") (list "d" "e") (list "f" "g" "h"))) (define (file->list-of-lines afile) (cond [(empty? afile) '()] [else (cons (first-line afile) (file->list-of-lines (remove-first-line afile)))])) ; File -> Line ; retrieves the prefix of afile up to the first occurrence of NEWLINE (define (first-line afile) (cond [(empty? afile) '()] [(string=? (first afile) NEWLINE) '()] [else (cons (first afile) (first-line (rest afile)))])) ; File -> Line ; drops the suffix of afile behind the first occurrence of NEWLINE (define (remove-first-line afile) (cond [(empty? afile) '()] [(string=? (first afile) NEWLINE) (rest afile)] [else (remove-first-line (rest afile))])) (define NEWLINE "\n")
From here, it is easy to create the rest of the program. In file->list-of-lines, the answer in the first clause must be '() because an empty file does not contain any lines. The answer in the second clause must cons the value of (first-line afile) onto the value (file->list-of-lines (remove-first-line afile)), because the first expression computes the first line and the second one computes the rest of the lines. Finally, the auxiliary functions process their inputs in a structurally recursive manner; their development is a straightforward exercise. Figure 112 collects the three function definitions and the definition for NEWLINE.
(file->list-of-lines (list "a" "b" "c" "\n" "d" "e" "\n" "f" "g" "h" "\n")) == (cons (list "a" "b" "c") (file->list-of-lines (list "d" "e" "\n" "f" "g" "h" "\n"))) == (cons (list "a" "b" "c") (cons (list "d" "e") (file->list-of-lines (list "f" "g" "h" "\n")))) == (cons (list "a" "b" "c") (cons (list "d" "e") (cons (list "f" "g" "h") (file->list-of-lines '())))) == (cons (list "a" "b" "c") (cons (list "d" "e") (cons (list "f" "g" "h") '()))) == (list (list "a" "b" "c") (list "d" "e") (list "f" "g" "h"))
Exercise 451. Design the function tokenize. It turns a Line into a list of tokens. Here a token is either a 1String or a String that consists of lower-case letters and nothing else. That is, all white-space 1Strings are dropped; all other non-letters remain as is; and all consecutive letters are bundled into “words.” Hint Read up on the string-whitespace? function.
Exercise 452. Design create-matrix. The function consumes a number n and a list of n2 numbers. It produces a list of n lists of n numbers, for example:Make up a second example.
Many solutions to mathematical problems employ generative recursion. A future programmer must get to know such solutions for two reasons. On one hand, a fair number of programming tasks are essentially about turning these kinds of mathematical ideas into programs. On the other hand, practicing with such mathematical problems often proves inspirational for the design of algorithms. This chapter deals with three such problems.
Binary Search introduces one method for finding the root of a mathematical function. As the exercises in the same section sketch, the method naturally generalizes to computational problems, such as finding certain values in tables, vectors, and arrays. In mathematical applications, programmers tend to employ methods that originate from analytical mathematics. A prominent one is due to Newton. Like binary search, the so-called Newton method repeatedly improves an approximation to the root until it is “close enough.” Starting from a guess, say, r1, the essence of the process is to construct the tangent of f at r1 and to determine its root. While the tangent approximates the function, it is also straightforward to determine its root. By repeating this process sufficiently often,Newton proved this fact. an algorithm can find a root r for which (f r) is close enough to 0.
Exercise 453. Translate this mathematical formula into the ISL+ function slope, which maps function f and a number x to the slope of f at x. Assume that EPSILON is a global constant. For your examples, use functions whose exact slope you can figure out, say, horizontal lines, linear functions, and perhaps polynomials if you know some calculus.
Exercise 454. Translate this mathematical formula into the function root-of-tangent, which maps function f and a number x to the root of the tangent through (x,(f x)). You need to solve exercise 453 first and reuse the solution here.
- If (f r1) is close enough to 0, the problem is solved. Close to 0 could be mean (f r1) is a small positive number or a small negative number. Hence we translate this idea intoThat is, we determine whether the absolute value is small.
The solution is r1.
The generative step of the algorithm consists of finding the root of the tangent of f at r1, which generates the next guess. By applying newton to f and this new guess, we resume the process.
The answer of the recursion is also the answer of the original problem.
; [Number -> Number] Number -> Number ; finds a number r such that (<= (abs (f r)) EPSILON) (check-within (newton poly 1) 2 EPSILON) (check-within (newton poly 3.5) 4 EPSILON) (define (newton f r1) (cond [(<= (abs (f r1)) EPSILON) r1] [else (newton f (root-of-tangent f r1))])) ; see exercise 453 (define (slope f r) 1.0) ; see exercise 454 (define (root-of-tangent f r) 1.0)
Figure 113 displays the definition of newton. It includes two tests that are derived from the tests in Binary Search for the find-root function. After all, both functions search for the root of a function and poly has two known roots.
> (poly 3)
> (newton poly 3)
/: division by zero
> (slope poly 3)
> (root-of-tangent poly 3)
/: division by zero
> (newton poly 2.9999)
> (newton poly #i3.0)
> (slope poly #i3.0)
> (root-of-tangent poly #i3.0)
In short, newton exhibits the full range of complex termination
behavior. For some inputs, the function produces a correct result. For some
others, it signals errors. And for yet others, it goes into infinite loop or
appears to go into one. The header for newton—
Exercise 455. Design the function double-amount, which computes how many months it takes to double a given amount of money when a savings account pays interest at a fixed rate on a monthly basis.This exercise is due to Adrian German.
Hint The function must know the current amount and the initially given one; see ... Add Expressive Power.
Domain Knowledge With a minor algebraic manipulation, you can show that the given amount is irrelevant. Only the interest rate matters. Also domain experts know that doubling occurs after roughly 72/r as long as the interest rate r is not large.
Sample Problem: A car drives at constant speed of v meters per second. How far does it travel in 5, 10, 15 seconds?
A rocket lifts off at the constant rate of acceleration of . What height does it reach after 5, 10, 15 seconds?
Figure 115 illustrates the idea in a graphical manner. On the left, we see an overlay of two graphs: the solid flat line is the speed of the vehicle and the rising dashed line is the distance traveled. A quick check shows that the latter is indeed the area determine by the former and the x-axis at every point in time. Similarly, the graphs on the right show the relationship between a rocket moving at constantly increasing speed and the height it reaches. Determining this area under the graph of a function for some specific interval is called (function) integration.
While mathematicians know formulas for the two sample problems that give precise answers, the general problem calls for computational solutions. The problem is that curves often come with complex shapes, more like those in figure 116, which suggests that someone needs to know the area between the x-axis, the vertical lines labeled a and b, and the graph of f. Applied mathematicians determine such areas in an approximate manner, summing the areas of many small geometric shapes. It is therefore natural to develop algorithms that deal with these calculations.
(define (constant x) 20)
(check-expect (integrate constant 12 22) 200)
(check-expect (integrate linear 0 10) 100)
(define EPSILON 0.1) ; [Number -> Number] Number Number -> Number ; computes the area under the graph of f between a and b ; assume (< a b) holds (check-within (integrate (lambda (x) 20) 12 22) 200 EPSILON) (check-within (integrate (lambda (x) (* 2 x)) 0 10) 100 EPSILON) (check-within (integrate (lambda (x) (* 3 (sqr x))) 0 10) 1000 EPSILON) (define (integrate f a b) #i0.0)
Figure 117 collects the result of the first three steps of the design recipe. The figure adds a purpose statement and an obvious assumption concerning the two interval boundaries. Instead of check-expect it uses check-within, which anticipates the numerical inaccuracies that comes with computational approximations in such calculations. Analogously, the header of integrate specifies #i0.0 as the return result, signaling that the function is expected to return an inexact number.
The following two exercises show how to turn domain knowledge into integration functions. Both functions compute rather crude approximations. While the design of the first uses only mathematical formulas, the second also exploits a bit of structural design ideas. Solving these exercises creates the necessary appreciation for the core of this section, which presents a generative-recursive integration algorithm.
Exercise 456. Kepler suggested a simple integration method. To compute a rough estimate of the area under f between a and b, proceed as follows:
divide the interval into half at mid = (a + b) / 2;
- compute the areas of the two trapezoids, each determined by four points:
then add the two areas.The method is known as Kepler’s rule.
Design the function integrate-kepler. That is, turn the mathematical knowledge into a ISL+ function. Make sure to adapt the test cases from figure 117 to this use. Which of the three tests fails and by how much?Domain knowledge Let us take a close look at the kind of trapezoids whose area you need to compute. Here are the two basic shapes, without a coordinate system to reduce clutter:
The left assumes f(L) > f(R) while the right one shows the case where f(L) < f (R). Surprisingly, it is possible to calculate the area of these trapezoids with a single formula:Stop! Convince yourself that this formula adds the area of the triangle to the area of the lower rectangle for the left trapezoid above, while it subtracts the area of the triangle from the area of the large rectangle for the right one.Also show that the above formula is equal toThis is a mathematical way to convince yourself of the asymmetry of the formula.
Exercise 457. Another simple integration method divides the area into many small rectangles. Each rectangle has a fixed width and is as tall as the function graph in the middle of the rectangle. Adding up the areas of the rectangles produces an estimate of the area under the function’s graph.Let’s use
R = 10to stand for the number of rectangles to be considered. Hence the width of each rectangle isFor the height of one of these rectangles, we determine the value of f at its midpoint. The first midpoint is clearly at a plus half of the width of the rectangle,which means its area isTo compute the area of the second rectangle, we must add the width of one rectangle to the first midpoint:For the third one, we getIn general, we can use the following formula for the ith rectangle:The first rectangle has index 0, the last one R - 1.Using this sequence of rectangles, we can now determine the area under the graph as a series:
Design the function integrate-rectangles. That is, turn the description of the rectangle process an ISL+ function. Make sure to adapt the test cases from figure 117 to this use.
The more rectangles the algorithms uses, the closer its estimate is to the actual area. Make R a top-level constant and increase it by factors of 10 until the algorithm’s accuracy eliminates problems with EPSILON value of 0.1.
Decrease EPSILON to 0.01 and increase R enough to eliminate any failing test cases again. Compare the result to exercise 456.
The Kepler method of exercise 456 immediately suggests a divide-and-conquer strategy like binary search introduced in Binary Search. Roughly speaking, the algorithm would split the interval into two pieces, recursively compute the area of each piece, and add the two results.
Exercise 458. Develop the algorithm integrate-dc, which integrates a function f between the boundaries a and b using a divide-and-conquer strategy. Use Kepler’s method when the interval is sufficiently small.
The plain divide-and-conquer approach of exercise 458 is wasteful. Consider a function whose graph is level in one part and rapidly changes in another; see figure 118 for a concrete example. For the level part on the graph, it is pointless to keep splitting the interval. It is just as easy to compute the trapezoid for the complete interval as for the two halves. For the wavy part, however, the algorithm must continue dividing the interval until the irregularities of the graph are reasonably small.
Exercise 459. Design integrate-adaptive. That is, turn the recursive process description into an ISL+ algorithm. Make sure to adapt the test cases from figure 117 to this use.
Do not discuss the termination of integrate-adaptive
Does integrate-adaptive necessarily compute a better answer than either integrate-kepler or integrate-rectangles? Which aspect is integrate-adaptive guaranteed to improve?
Terminology The algorithm is called adaptive integration because it automatically allocates time to those parts of the graph that needs it and spends little time on the others. Specifically, for those parts of f that are level, it performs just a few calculations; for the other parts, it inspects small intervals to decrease the error margin. Computer science knows many adaptive algorithms, and integrate-adaptive is just one of them.
Sample Problem: In a bartering world, the values of coal (x), oil (y), and natural gas (z) are determined by the following exchange equations:
x = 1, y = 1, and z = 2.
10 = 10, 31 = 31, and 1 =1.
; An SOE is a non-empty Matrix ; constraint if its length is n (in N), each item has length (+ n 1) ; interpretation an SOE represents a system of linear equations ; An Equation is [List-of Number] ; constraint an Equation contains at least two numbers. ; interpretation if (list a1 ... an b) is an Equation, a1, ..., an are ; the left-hand side variable coefficients and b is the right-hand side ; A Solution is [List-of Number] ; examples: (define M (list (list 2 2 3 10) (list 2 5 12 31) (list 4 1 -2 1))) (define S '(1 1 2))
Figure 119 introduces a data representation for our problem domain. It also illustrates how to represent the sample system of equations and its solution with this data representation. This representation captures the essence of a system of equations, namely, the numeric coefficients of the variables on the left-hand side and the right-hand side values. The names of the variables don’t play any role because they are like parameters of functions; meaning, as long as they are consistently renamed the equations have the same solutions.
; Equation -> [List-of Number] ; extracts the left-hand side from a row in a matrix (check-expect (lhs (first M)) '(2 2 3)) (define (lhs e) (reverse (rest (reverse e)))) ; Equation -> Number ; extracts the right-hand side from a row in a matrix (check-expect (rhs (first M)) 10) (define (rhs e) (first (reverse e)))
Exercise 460. Design the function check-solution. It consumes an SOE and a Solution. Its result is #true if plugging in the numbers from the Solution for the variables in the Equations of the SOE produces equal left-hand side values and right-hand side values; otherwise the function produces #false. Use check-solution to formulate tests with check-satisfied.
Hint Design the function plug-in first. It consumes the left-hand side of an Equation and a Solution and calculates out the value of the left-hand side when the numbers from the solution are plugged in for the variables.
Gaussian elimination is a standard method for finding solutions to systems of linear equations. It consists of two steps. The first step is to transform the system of equations into a system of different shape but with the same solution. The second step is to find solutions to one equation at a time. Here we focus on the first step because it is another interesting instance of generative recursion.
has the same solution as the one labeled with (). Do so by hand and with check-solution from exercise 460.
has the same solution as the one labeled with (). Again do so by hand and with check-solution from exercise 460.
Exercise 463. Design subtract. The function consumes two Equations of equal length. It subtracts the second from the first, item by item, as many times as necessary to obtain an Equation with a 0 in the first position. Since the leading coefficient is known to be 0, subtract returns the rest of the list that results from the subtractions.
Design the triangulate algorithm:Turn the above example into a test and spell out explicit answers for the four questions based on our loose description.
Do not yet deal with the termination step of the design recipe.
Exercise 465. Revise the algorithm triangulate from exercise 464 so that it rotates the equations first to find one with a leading coefficient that is not 0 before it subtracts the first equation from the remaining ones.
Does this algorithm terminate for all possible system of equations?Hint The following expression rotates a non-empty list L:Explain why.
Hint Use structural recursion for the design. Start with the design of a function that solves a single linear equation in n+1 variables, given a solution for the last n variables. In general, this function plugs in the values for the rest of the left-hand side, subtracts the result from the right-hand side, and divides by the first coefficient. Experiment with this suggestion and the above examples.
Challenge Use an existing abstraction and lambda to define solve as a one-liner.
Problem solving isn’t always straightforward. Sometimes we make progress by pursuing one approach only to discover that we are stuck because we took a wrong turn. One obvious option is to backtrack to the place where we took the wrong turn and to take a different turn. Some algorithms work just like that. This chapter presents two instances. The first section deals with an algorithm for traversing graphs. The second one is an extended exercise that uses backtracking in the context of a chess puzzle.
Graphs are ubiquitous in our world and the world of computing. Imagine a group of people, say, the students in your school. Write down all the names and connect the names of those people that know each other. You have just created your first undirected graph.
Now take a look at figure 120, which displays a small directed
graph. It consists of seven nodes—
Sample Problem: Design an algorithm that proposes a way to introduce one person to another in a directed email graph for a large company. The program consumes a directed graph representing established email connections and two email addresses. It returns a sequence of email addresses that connect the first email with the second.
send email from E to F and then to D.
send it from E to C and then to D.
Looking at figure 120 you can easily figure out how to get from
one node to another without thinking much about how you did it. So imagine
for a moment that the graph in figure 120 is a large park. Also
imagine someone says you are located at E and you need to get to
G. You can clearly see two paths, one leading to C and
another one leading to F. Follow the first one and make sure to
remember that it is also possible to get from E to F. Now you
have a new problem, namely, how to get from C to G. The key
insight is that this new problem is just like the original problem; it asks
you to find a path from one node to another. Furthermore, if you can solve
the problem, you know how to get from E to G—
Exercise 469. Translate one of the above definitions into proper list form using list and proper symbols.The data representation for nodes is straightforward:
; A Node is a SymbolFormulate a data definition to describe the class of all Graph representations, allowing an arbitrary number of nodes and edges. Only one of the above representations has to belong to Graph.
(find-path 'C 'D sample-graph) (find-path 'E 'D sample-graph) (find-path 'C 'G sample-graph)
The result of the function consists of all nodes leading from the origination node to the destination node, including those two. In this case, an empty path could be used to express the lack of a path between two nodes.
Alternatively, since the call itself already lists two of the nodes, the output could mention only the “interior” nodes of the path. Now the answer for the first call would be '() because 'D is an immediate neighbor of 'C. Of course, this also means that '() no longer signals failure.
; A Path is [List-of Node] ; interpretation The list of nodes specifies a sequence of ; immediate neighbors that leads from the first Node on the ; list to the last one. ; Node Node Graph -> [Maybe Path] ; finds a path from origination to destination in G ; if there is no path, the function produces #false (check-expect (find-path 'C 'D sample-graph) '(C D)) (check-member-of (find-path 'E 'D sample-graph) '(E F D) '(E C D)) (check-expect (find-path 'C 'G sample-graph) #false) (define (find-path origination destination G) #false)
If the two given nodes are directly connected with an arrow in the given graph, the path consists of just these two nodes. But there is an even simpler case, namely, when the origination argument of find-path is equal to its destination.
In that second case, the problem is truly trivial and the matching answer is (list destination).
If the arguments are different, the algorithm must inspect all immediate neighbors of origination and determine whether there is a path from any one of those to destination. In other words, picking one of those neighbors generates a new instance of the “find a path” problem.
Finally, once the algorithm has a path from a neighbor of origination to destination, it is easy to construct a complete path from the former to the latter—
just add the origination node to the list.
; Node Node Graph -> [Maybe Path] ; finds a path from origination to destination in G ; if there is no path, the function produces #false (define (find-path origination destination G) (cond [(symbol=? origination destination) (list destination)] [else (local ((define next (neighbors origination G)) (define candidate (find-path/list next destination G))) (cond [(boolean? candidate) #false] [else (cons origination candidate)]))])) ; [List-of Node] Node Graph -> [Maybe Path] ; finds a path from some node on lo-Os to D ; if there is no path, the function produces #false (define (find-path/list lo-Os D G) (cond [(empty? lo-Os) #false] [else (local ((define candidate (find-path (first lo-Os) D G))) (cond [(boolean? candidate) (find-path/list (rest lo-Os) D G)] [else candidate]))]))
Figure 121 contains the complete definition of find-path. It also contains a definition of find-path/list, which processes its first argument via structural recursion. For each node in the list, find-path/list uses find-path to check for a path. If find-path indeed produces a path, that path is the answer. Otherwise, find-path produces #false and the function backtracks.
Note Trees discusses backtracking in the structural world. A particularly good example is the function that searches blue-eyed ancestors in a family tree. When the function encounters node, it first searches one branch of the family tree, say the father’s, and if this search produces #false, it searches the other half. Since graphs generalize trees, comparing this function with find-path is an instructive exercise. End
Lastly, we need to check whether find-path produces an answer for all possible inputs. It is relatively easy to check that when given the graph in figure 120 and any two nodes in this graph, find-path always produces some answer. Stop! Solve the next exercise before you read on.
Design the function test-on-all-nodes, which consumes a graph g and tries to find a path find-path between all pairs of nodes in g. If it succeeds, it produces #true. Test the function on sample-graph.
For other graphs, however, find-path may not terminate for certain pairs of nodes. Consider the graph in figure 122.
Stop! Define cyclic-graph to represent the graph in this figure.
(find-path 'B 'D cyclic-graph) == ... (find-path 'B 'D cyclic-graph) ... == ... (find-path/list (list 'E 'F) 'D cyclic-graph) ... == ... (find-path 'E 'D cyclic-graph) ... == ... (find-path/list (list 'C 'F) 'D cyclic-graph) ... == ... (find-path 'C 'D cyclic-graph) ... == ... (find-path/list (list 'B 'D) 'D cyclic-graph) ... == ... (find-path 'B 'D cyclic-graph) ... == ...
In summary, the termination argument goes like this. If some given graph is free of cycles, find-path produces some output for any given inputs. After all, every path can only contain a finite number of nodes and the number of paths is finite, too. The function therefore either exhaustively inspects all solutions starting from some given node or finds a path from the origination to the destination node. If, however, a graph contains a cycle, that is, a path from some node back to itself, find-path may not produce a result for some inputs.
The next part presents a program design technique that addresses just this kind of problem. In particular, it presents a variant of find-path that can deal with cycles in a graph.
Exercise 473. Re-design find-path/list so that it uses an existing list abstraction from figure 64 instead of explicit structural recursion.
Once you have this version tested, merge the two functions.
Note on Data Abstraction You may have noticed that the find-path function does not need to know how Graph is defined. As long as you provide a correct neighbors function for Graph, find-path works perfectly fine. In short, the find-path program uses data abstraction.
In a way, data abstraction works just like function abstraction, as
discussed in Abstraction. Here you could create a function
abstract-find-path, which would consume one more parameter than
find-path: neighbors. As long as you always handed
abstract-find-path a graph G from Graph and the
matching neighbors function, it would process the graph
properly. While the extra parameter suggests abstraction in the
conventional sense, the required relationship between two of the
When programs grow large, data abstraction becomes a critical tool for the construction of a program’s components. The next volume in the “How to Design” series addresses this idea in depth; the next section illustrates the idea with another example. End
Exercise 474. Finite State Machines poses a problem concerning finite state machines and strings but immediately defers to this chapter because the solution calls for generative recursion. You have now acquired the design knowledge needed to tackle the problem.
Design the function fsm-match. It consumes the data representation of a finite state machine and a string. It produces #true if the sequence of characters in the string causes the finite state machine to transition from an initial state to a final state.Since this problem is about the design of generative recursive functions, we provide the essential data definition and a data example:
(define-struct transition [current key next]) (define-struct fsm [initial transitions final]) ; A FSM is (make-fsm FSM-State [List-of 1Transition] FSM-State) ; A 1Transition is ; (make-transition FSM-State 1String FSM-State) ; A FSM-State is String ; data example: see exercise 111 (define fsm-a-bc*-d (make-fsm "AA" (list (make-transition "AA" "a" "BC") (make-transition "BC" "b" "BC") (make-transition "BC" "c" "BC") (make-transition "BC" "d" "DD")) "DD"))The data example corresponds to the regular expression a (b|c)* d. As mentioned in exercise 111, "acbd" is one example of an acceptable string; "ad" and "abcd" are two others. Of course, "da", "aa", or "d" do not match.In this context, you are designing the following function:Hint Design the necessary auxiliary function locally to fsm-match?. In this context, represent the problem as a pair of parameters: the current state of the finite state machine and the remaining list of 1Strings.
; [List-of X] -> [List-of [List-of X]] ; creates a list of all rearrangements of the items in w (define (arrangements w) (cond [(empty? w) '(())] [else (foldr (lambda (item others) (local ((define without-item (arrangements (remove item w))) (define add-item-to-front (map (lambda (a) (cons item a)) without-item))) (append add-item-to-front others))) '() w)])) ; test (define (all-words-from-rat? w) (and (member (explode "rat") w) (member (explode "art") w) (member (explode "tar") w))) (check-satisfied (arrangements '("r" "a" "t")) all-words-from-rat?)
Exercise 475. Inspect the function definition of arrangements in figure 123.We thank Mr. Mark Engelberg for suggesting this exercise. Also see figure 73. The figure displays a generative-recursive solution of the extended design problem covered by Word Games, the Heart of the Problem, namely
given a word, create all possible re-arrangements of the letters in a list.The extended exercise is a direct guide to the structurally recursive design of the main function and two auxiliaries, where the design of the latter requires the creation of two more helper functions. In contrast, figure 123 uses the power of generative recursion—
plus foldr and map— to define the same program as a single function definition.
Explain the design of the generative-recursive version of arrangements. Answer all questions that the design recipe for generative recursion poses, including the question of termination.
We thank Mr. Mark Engelberg for his suggestions on how to reformulate this section for young students.
The n queens puzzle is a famous problem from the world of chess that also illustrates the applicability of backtracking in a natural way. For our purposes, a chessboard is a grid of n by n squares. The queen is a game piece that can move in a horizontal, vertical, or diagonal direction arbitrarily far without “jumping” over another piece. We say that a queen threatens a square if it is on the square or can move to it. Figure 124 illustrates the notion in a graphical manner. The queen is in the second column and sixth row. The solid lines radiating out from the queen go through all those squares that are threatened by the queen.
The classical queen-placement problem is to place 8 queens on a 8 by 8 chessboard such that the queens on the board don’t threaten each other. Computer scientists generalize the problem and ask whether it is possible to place n queens on a k by k, for k >= n, chessboard such that the queens don’t pose a threat to each other. When k = n, a solution solves the complete puzzle.
For n = 2, the complete puzzle obviously has no solution. A queen placed on any of the four squares threatens all remaining squares.
It turns out that there is also no complete solution for n = 3. Figure 125 presents three different placements of two queens, that is, solutions for k = 3 and n = 2. In each case, the left-most queen occupies a square in the left-most column while a second queen is placed in one of two squares that the first one does not threaten. The placement of a second queen ensures that all seven unoccupied squares are threatened by some queen, meaning it is impossible to place a third queen. Together, the three placements explore all possibilities of placing the first queen in a square of the first column.
Exercise 476. It is also possible to place the first queen in all squares of the top-most row, the right-most column, and the bottom-most row. Explain why all of these solutions are just like the three scenarios depicted in figure 125?
This leaves the central square. Is it possible to place even a second queen after you place one on the central square of a 3 by 3 board?
Figure 126 displays two solutions for the n queens puzzle: the left one is for n = 4, while the right one solves the n = 5 version. The figure shows how, in each case, a solution places one queen in each row and column of the board, which makes sense because a queen threatens the entire row and column that radiate out from its square.
The problem is about placing one queen at a time. When we place a queen on a board, we can mark the corresponding rows, columns, and diagonals as unusable for other queens.
When we place another queen on a board, we consider only those spots that are not threatened.
Just in case this first choice of a spot leads to problems later, we remember what other squares are feasible for placing this queen.
If we are supposed to place a queen on a board but no safe squares are left, we backtrack to a previous point in the process where we chose one square over another and try one of the remaining squares.
; N -> [Maybe [List-of QP]] ; find a solution to the n queens problem (check-member-of (n-queens 4) (list (make-posn 0 1) (make-posn 1 3) (make-posn 2 0) (make-posn 3 2)) (list (make-posn 0 1) (make-posn 1 3) (make-posn 3 2) (make-posn 2 0)) (list (make-posn 0 1) (make-posn 2 0) (make-posn 1 3) (make-posn 3 2)) (list (make-posn 0 1) (make-posn 2 0) (make-posn 3 2) (make-posn 1 3)) (list (make-posn 0 1) (make-posn 3 2) (make-posn 1 3) (make-posn 2 0)) (list (make-posn 0 1) (make-posn 3 2) (make-posn 2 0) (make-posn 1 3)) (list (make-posn 0 2) (make-posn 1 0) (make-posn 2 3) (make-posn 3 1)) ... (list (make-posn 3 2) (make-posn 2 0) (make-posn 1 3) (make-posn 0 1))) (define (n-queens n) (place-queens (board0 n) n))
Exercise 477. Design the threatening? function. It consumes two QPs and determines whether queens placed on the two respective squares would threaten each other.
Domain Knowledge (1) Study figure 124. The queen in this figure threatens all squares on the horizontal, the vertical, and the diagonal lines. Conversely, a queen on any square on these lines threatens the queen.
(2) Translate your insights into mathematical conditions that relate the squares’ coordinates to each other. For example, all squares on a horizontal have the same y-coordinate. Similarly, all squares on one diagonal have coordinates whose sums are the same. Which diagonal is that? For the other diagonal, the differences between the two coordinates remains the same. Which diagonal does this idea describe?
Hint Once you have figured out the domain knowledge, formulate a test suite that covers horizontals, verticals, and both diagonals. Don’t forget to include a pair of arguments for which threatening? must produce #false.
Exercise 478. Design render-queens. The function consumes a natural number n, a list l of QPs, and an Image i representing a queen. It produces an image of an n by n chess board with images i placed according to l.
You may wish to look for an image for a chess queen on-line or create a simplistic one with the available image functions.
Exercise 479. The tests in figure 127 are awful. No real-world programmer ever spells out all these possible outcomes.One solution is to use property testing again. Design the n-queens-solution? function, which consumes a natural number n and produces a predicate on queen placements that determines whether a given placement is a solution to an n queens puzzle:Once you have tested this predicate, use it and check-satisfied to formulate the tests for n-queens.An alternative solution is to understand the lists of QPs as sets. If two lists contains the same QPs in different order, they are equivalent as the figure suggests. Hence you could formulate the test for n-queens asDesign the function set=?. It consumes two lists and determines whether they contain the same items—
regardless of order.
Exercise 480. As bullet 2 above says, you really want to design a function that places n queens on a k by k chessboard in a non-threatening manner:Figure 127 already refers to this function in the definition of n-queens.Design the place-queens algorithm. Assume you have the following functions to deal with Boards:The first function is used in figure 127 to create the initial board representation for place-queens. You will need the other two to describe the generative steps for the algorithm.
You cannot confirm yet that your solution to the preceding exercise works because it relies on an extensive wish list. Technically, it calls for a data representation of Boards that supports three specific functions. The remaining problem is then to formulate a definition for Board and to design the functions on the wish list.
Exercise 481. Develop a data definition for Board and design the three functions specified in exercise 480. Consider the following ideas:
a Board collects those positions where a queen can still be placed;
a Board contains the list of positions where a queen has been placed;
a Board is a grid of n by n squares, each possibly occupied by a queen. Hint For this representation, consider using a structure to represent a square with one field for the x index, another one for y, and a third one saying whether the square is threatened.Use one of the above ideas to solve this exercise.
Challenge Use all three ideas to come up with three different data representations of Board. Abstract your solution to exercise 480 and confirm that it works with any of your data representations of Board.
This fifth part of the book introduces the idea of eureka! into program design. Unlike the structural design of the first four parts, eureka! design starts from an idea of how the program should solve a problem or process data that represents a problem. Designing here means coming up with a clever way to call a recursive function on a new kind of problem that is like the given one but simpler.
Keep in mind that while we have dubbed it generative recursion, most computer scientists refer to these functions as algorithms.
The standard outline of the design recipe remains valid.
The major change concerns the coding step. It introduces four new questions on going from the completely generic template for generative recursion to a complete function. With two of these questions, you work out the “trivial” parts of the solution process; and with the other two you work out the generative solution step.
The minor change is about the termination behavior of generative recursive functions. Unlike structurally designed functions, algorithms may not terminate for some inputs. This problem might be due to inherent limitations in the idea or the translation of the idea into code. Regardless, the future reader of your program deserves a warning about potentially “bad” inputs.
You will encounter some simple or well-known algorithms in your real-world programming tasks. And you will be expected to cope. For truly clever algorithms, software companies employ highly paid computer scientists, domain experts, and mathematicians to work out the conceptual details before they ask programmers to turn the concepts into programs. You must also be prepared for this kind of task, and the best preparation is practice.