10  Some More Recursion

Programs consist of functions, and functions process data or trigger actions. If the definition for a new class of data refers to itself, the functions that process this kind of data are self-referential in an analogous manner. If a function needs to repeat an action, it recurs. So, in principle, recursion is natural and straightforward. Of course, it also often hidden in loops that come with certain classes of data.

On occasion, however, recursion doesn't fall into any of these categories. Two cases stand out. First, as a recursive function processes a piece of data recursively, it may use another argument to collect data. We call such arguments ACCUMULATORS and the first subsection explains with some examples how and why they work. Second, for rare cases, the best way to solve a problem is to generate another problem just like the given one, but different. Since the problem belongs to the same class of problems, we can use the function recursively to solve this newly generated problem. After we get a solution for this alternative problem, we can use it to get a solution for the originally given problem. This form of recursion is called GENERATIVE RECURSION and the second subsection presents this form of recursion.

10.1  Accumulators

Take a look at reverse again:

;; (Listof X)  -->  (Listof X)
;; produce a reversed version of l
(define (reverse l)
    [(null? l) '()]
    [else (append (reverse (cdr l)) (list (car l)))]))

It consumes a list such as '(a c b e) and produces '(e b c a).

The function is naturally recursive. If the list is empty, the reverse is empty. Otherwise, the function recurs on the rest of the list and produces the reverse of the rest. For example, when reverse is given '(a c b e), the cdr is '(c b e) and its reverse is '(e b c). To get the reverse of the originally given list, we need to add the car of the list, which is 'a, to the end of (reverse (cdr l)).

Although the function is natural at first glance, any decent programmer knows that the combination of reverse and append traverses the entire list twice. If the function could instead keep track of the things to be appended and append them when the end is reached, we could skip the entire append. To do so, the function must keep track of the elements it has seen so far:

(define (reverse-aux l seen-so-far)
    [(null? l) ???]
    [else ... (reverse-aux (cdr l) (cons (car l) seen-so-far)) ...]))

Let's work through our running example again:

given: seen-so-far
'(a c b e) '()
'(c b e) '(a)
'(b e) '(c a)
'(e) '(b c a)
'() '(e b c a)
As this table shows, seen-so-far is actually the reverse of all the elements from the beginning of the list to the current part of the list. Thus, when reverse-aux gets to the end of the list, seen-so-far is the reverse of the entire list.

Here is the actual definition of reverse:

;; (Listof X)  -->  (Listof X)
;; produce a reversed version of l
(define (reverse l0)
  (let aux ([l l0] [seen-so-far '()]) ;; accu: the reverse of l to l0 
      [(null? l) seen-so-far]
      [else (aux (cdr l) (cons (car l) seen-so-far))])))

The body of this function is a let-loop that consumes two arguments: the list and the ACCUMULATOR seen-so-far. When the loop has traversed the entire list, it returns seen-so-far, which is the reverse of the originally given list (l0).

Consider the problem of computing the product of a list of numbers:

;; (Listof Number)  -->  Number
(define (product l0)
  (let aux ([l l0][so-far 1])
      [(null? l) so-far]
      [else (aux (cdr l) (* (car l) so-far))])))

Even though we could have defined this function easily in our plain old recursive style, we used a let-loop with an accumulator. Here the accumulator keeps track of the product of all the numbers seen from the start of the list (l) to the current head of the list (l0). When the loop exhausts l, it just returns the value of the accumulator.

Using an accumulator-style loop often helps with unusual situations, too. Say product really consumes a list of numbers that are read from an input port of the compute in one instruction. (We'll see how that works soon.) Perhaps the list therefore contains symbols. Of course, the loop shouldn't do anything when it encounters such a symbol and should instead continue with the next iteration:

;; (Listof (union Number '))  -->  Number
(define (product l0)
  (let aux ([l l0][so-far 1])
      [(null? l) so-far]
      [(eq? ' (car l)) (aux (cdr l) so-far)]
      [else (aux (cdr l) (* (car l) so-far))])))

Accommodating this situation is easy now. We just add a condition to the loop's conditional that watches out for ' and, upon encountering one, just recurs on the cdr of l without performing any multiplication.7

If the input port is noisy, the list may also contain non-numbers. When this happens, we may wish to stop the loop but not the program that surrounds it. Specifically, the function should return this ``noise'' element (a character) so that the program can deal with this situation:

;; (Listof (union Number ' Char))  -->  (union Number Char)
(define (product l0)
  (let aux ([l l0][so-far 1])
      [(null? l) so-far]
      [(eq? ' (car l)) (aux (cdr l) so-far)]
      [(char? (car l)) (car l)]
      [else (aux (cdr l) (* (car l) so-far))])))

Again, exiting the loop is no problem for us. We add a condition that looks for characters in the input list and, upon finding one, it just returns that char.8

10.2  Generative Recursion

Studying the structure of a function's input is one way of figuring out how to write the function. Another one is to think of the input data as just an encoding of a problem. For example, a sort function consumes a list of numbers and produces a sorted version of this list. Following the natural recursive pattern, we write the function as follows:

;; (Listof Number)  -->  (Listof Number)
;; produce a sorted version of l
(define (sort l) 
  (local ((define (insert x l)
	      [(null? l) (list x)]
	      [else (if (< x (car l)) (cons x l) (cons (car  l) (insert x (cdr l))))])))
    (foldr insert empty l)))

The foldr loop applies insert to each item and a (sorted) suffix of the list. The insert function puts a number into its proper place in an already sorted list.

Instead of looking at the list as a list, we can also look at the problem of sorting a bunch of numbers. Following Hoare's logic, we can split the list into those items that are smaller, equal, and larger than, say, the first element and then sort those lists. If we append the results, we get a sorted list:

(define (qsort alon)
    [(null? alon) '()]
    [else (let ([first (car alon)])
	      ;; select those smaller than first and sort
	      (qsort (filter (lambda (x) (< x first)) alon))
              ;; select those equal to first and put them here
	      (filter (lambda (x) (= x first)) alon)
	      ;; select those larger than first and sort
	      (qsort (filter (lambda (x) (> x first)) alon))))]))

As a matter of fact, Hoare showed decades ago that this divide-and-conquer approach is often much better than what is now known as insertion sort.

More generally, the designer of a function starts with an ad hoc experience. What if I could solve a similar, but easier problem? Would that help? If I had its solution, how would it help me solve the original problem? How does it all stop? In qsort, we see all these elements. And with some puzzling, we can work it put all the pieces together so that a sorting algorithm works out.

The key insight of this whole process is that we can solve a similar, but easier (smaller, simpler) problem with the same function. This is, of course, a form of recursion. It is not, however, the kind of recursion we have seen with many list-processing functions. It is a ``generate a new problem and solve it'' recursion, and finding such recursions is far more difficult than finding ``structural'' recursions that traverse a piece of data in a natural manner.

While Hoare's quick-sort algorithm is a powerful demonstration of the power of recursion, using Scheme is not about this kind of recursion. Instead, using Scheme is about understanding the power of functions, composing functions, abstracting with functions, and have rich functional libraries for communicating with the rest of the world.

7 In a C-style language, this corresponds to a continue statement. Once again functional programmers don't need it; they can build this kind of thing on the cheap.

8 Yes, this is a C-style exit statement. Since a Scheme program doesn't convert (or misinterpret) chars to (as) numbers, it is also safe to return a character or a number from product. The context of product cannot accidentally use one as the other.