
Intermezzo: The Cost of Computation

If you wish to make sure, once and for all, that a program performs the desired computation, you need to study logic and its application to computing. Logic relates to computing like analysis to engineering. This book’s design process is deeply informed by logic, and a course on logic in computation is an ideal complement.

What do you know about function f once the following tests succeed:
(check-expect (f 0) 0)
(check-expect (f 1) 1)
(check-expect (f 2) 8)
If this question showed up on a standard test, you might guess at this definition:

(define (f x) (expt x 3))

But nothing speaks against the following:

(define (f x) (if (= x 2) 8 (* x x)))

Tests alone tell you little else than that a function works as expected on a select, finite number of inputs. Naturally, your confidence that the function works properly overall increases when you follow a systematic design process.
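
To make this concrete, here is a sketch that is not from the book; f.v1 and f.v2 are names introduced for illustration. Both definitions pass the three tests above, yet they disagree on almost every other input:
; two candidate definitions for f
(define (f.v1 x) (expt x 3))
(define (f.v2 x) (if (= x 2) 8 (* x x)))
; both pass the three tests from above
(check-expect (f.v1 0) 0)
(check-expect (f.v2 0) 0)
(check-expect (f.v1 1) 1)
(check-expect (f.v2 1) 1)
(check-expect (f.v1 2) 8)
(check-expect (f.v2 2) 8)
; and yet they are different functions
(check-expect (f.v1 3) 27)
(check-expect (f.v2 3) 9)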

In the same spirit, timing the evaluation of a function application on specific inputs tells you how long it takes to compute the answer for those inputs. Now imagine you have two functions—g and h—that compute the same answers when given the same input, and assume that your timings suggest g always computes the answer faster than h. Making Choices presents just such a choice with its comparison of gcd, a structurally recursive function, and gcd-generative, an equivalent but generative-recursive function. The timing comparison suggests that the latter is much faster than the former.

[Figure 113: A comparison of two running time expressions]

How confident are you that you wish to use g instead of h? Consider the graph in figure 113. The x-axis records the size of the input—say the number of items on a list—and the y-axis records the time it takes to compute the answer. Assume that the straight line represents the running time of g and the curved graph represents h. In the shaded region, g takes less time than h. But, at the edge of the shaded region, the two graphs cross, and to its right, the performance of h is better than that of g. If, for whatever reason, you had evaluated the performance of g and h only for input sizes in the shaded region and if your clients were to run your program mostly on inputs that fall in the non-shaded region, you would be delivering the wrong function.

This intermezzo introduces a phrase (and its meaning) for making general statements about the performance of a program. Any serious programmer must be thoroughly familiar with this notion. It is the most fundamental method for analyzing and comparing the performance attributes of programs; in addition, it is a general concept for describing the growth of functions in many other disciplines. This intermezzo provides a first glimpse at the idea; a text on algorithms provides a mechanism for deriving “order of” estimates for the performance of algorithms. The first subsection is an example-driven introduction to the idea. The second one provides a rigorous definition. The last subsection explains why we insist that the template for structural design uses predicates and selectors.

Concrete Time, Abstract Time

Making Choices compares the running time of gcd and gcd-generative. In addition, it shows how to count how many times the functions recur. The latter suggests why the running time of gcd-generative compares so favorably with that of gcd. Hence we use this idea as our starting point, with a second look at how-many from Designing With Self-Referential Data Definitions, a function that is easy to understand:
(define (how-many a-list)
  (cond
    [(empty? a-list) 0]
    [else (+ (how-many (rest a-list)) 1)]))
It consumes a list and computes how many items the list contains.
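
Here are two examples, formulated as tests in the style of the checks above; these particular checks are not from the book:
(check-expect (how-many '()) 0)
(check-expect (how-many (list 'a 'b 'c)) 3)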

Suppose we wanted to know how long it takes to compute the length of an arbitrarily long, non-empty list. Using the rules of computation from Intermezzo: BSL, we can simulate this computation as a series of algebra-like manipulations:
(how-many some-non-empty-list)
==
(cond
  [(empty? some-non-empty-list) 0]
  [else (+ (how-many (rest some-non-empty-list)) 1)])
==
(cond
  [#false 0]
  [else (+ (how-many (rest some-non-empty-list)) 1)])
==
(cond
  [else (+ (how-many (rest some-non-empty-list)) 1)])
==
(+ (how-many (rest some-non-empty-list)) 1)
The first step is to replace a-list in the definition of how-many with the actual argument, some-non-empty-list, which yields the first cond expression. Next we must evaluate

(empty? some-non-empty-list)

By assumption the result is #false. The question is how long it takes. While we don’t know and don’t care about the precise amount, it is safe to say that this step takes a fixed amount of time. Indeed, this also holds for the next step, when cond checks what the value of the first condition is. Since it is #false, the first cond line is dropped. Checking whether a cond line starts with else is equally fast, which means we are left with

(+ (how-many (rest some-non-empty-list)) 1)

If we knew the value of some-non-empty-list, rest would extract the remainder of the list in another fixed amount of time, but then we would be stuck. To compute how long how-many takes to determine the length of some list, we need to know how long how-many takes to compute the number of items in the rest of that list. In short, what really matters is the number of recursive steps, not how many small, fixed-time computations we perform.

So, here is a sample evaluation of how-many applied to '(a b c), showing only those intermediate steps where a new recursion comes about:
(how-many (list 'a 'b 'c))
== (+ (how-many (list 'b 'c)) 1)
== (+ (+ (how-many (list 'c)) 1) 1)
== (+ (+ (+ (how-many '()) 1) 1) 1)
== 3
As far as timing is concerned, the steps in between really are the same. If we apply how-many to a shorter list, we need fewer recursive steps:
(how-many (list 'e))
== (+ (how-many '()) 1)
== 1
If we apply how-many to a longer list, we need more steps. All in all, the time it takes to evaluate (how-many some-list) is the number of items on the list times the fixed amount of time for the small steps of checking emptiness and adding 1 to some natural number.

Generalizing from this example suggests that the running time depends on the size of the input. More importantly, though, it also implies that the number of recursive steps is a good estimate for the length of an evaluation sequence and thus the evaluation time. For this reason, computer scientists discuss the abstract running time of a program as a relationship between the size of the input and the number of recursive steps in an evaluation. It is called abstract because the measure ignores the details of how much time primitive computation steps take. In our first example, the size of the input is the number of items on the list. Thus, a list of one item requires one recursive step, a list of two needs two steps, and for a list of N items, it’s N steps.
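
To observe this relationship on an actual machine, it helps to time how-many on lists of different lengths. The following sketch is not from the book; time-how-many is a name introduced here, and it uses build-list, local, lambda, and time just like the experiment at the end of this intermezzo:
; N -> Natural
; a rough timing sketch: how long does how-many take for a list of
; n items? try 10000 and 20000; the second measurement should be
; roughly twice the first if the running time is indeed linear
(define (time-how-many n)
  (local ((define long-list (build-list n (lambda (x) x))))
    (time (how-many long-list))))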

Not all functions have such a simple measure of abstract running time. Take a look at the first recursive function in this book:
(define (contains-flatt? a-list-of-names)
  (cond
    [(empty? a-list-of-names) #false]
    [(cons? a-list-of-names)
     (or (string=? (first a-list-of-names) "Flatt")
         (contains-flatt? (rest a-list-of-names)))]))
For a list that starts with "Flatt"

(contains-flatt? (list "Flatt" "robot" "ball" "game-boy" "pokemon"))

the function requires no recursive steps at all. In contrast, if "Flatt" occurs at the end of the list,

(contains-flatt? (list "robot" "ball" "game-boy" "pokemon" "Flatt"))

the evaluation requires as many recursive steps as there are items in the list. In other words, in the best case, the function can find the answer immediately; in the worst case, the function must search the entire input list.
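
To make the best and worst shapes concrete, here is a small sketch that is not from the book; worst-case-input and best-case-input are names introduced for illustration. Each builds a list of names of a given size:
; N -> [List-of String]
; a worst-case input for contains-flatt?:
; "Flatt" shows up only at the very end of a list of n+1 names
(define (worst-case-input n)
  (append (build-list n (lambda (i) "robot")) (list "Flatt")))
; N -> [List-of String]
; a best-case input: "Flatt" comes first
(define (best-case-input n)
  (cons "Flatt" (build-list n (lambda (i) "robot"))))
(check-expect (contains-flatt? (worst-case-input 3)) #true)
(check-expect (contains-flatt? (best-case-input 3)) #true)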

Programmers cannot assume that inputs are always of the best possible shape; and they must hope that the inputs are not of the worst possible shape. Instead, they must analyze how much time their functions take on the average. For example, contains-flatt? may—on the average—find "Flatt" somewhere in the middle of the list. Thus, we could say that if the input contains N items, the abstract running time of contains-flatt? is N/2, that is, it recurs half as often as the number of items on the input. Because we already measure the running time in an abstract manner, we can actually ignore the division. Here is why. By assumption, each basic step takes K units of time but we know that every so often, software and computers get twice as fast, meaning we could use K/2 as the constant for basic work. Since

K/2 · N = K · N/2

we can ignore such constant factors.

Computer scientists use the phrase a function f takes “on the order of N steps” to indicate that they are talking about the abstract, rather than concrete, running time of f. To use the phrase correctly, it must come with an explanation of N, for example, “it counts the number of items on the given list” or “it is the number of digits in the given number.” Without such an explanation, the original phrase is actually meaningless.

Now consider our standard sorting function from figure 47. Here is a hand-evaluation for a small input, listing all recursive steps:
(sort (list 3 1 2))
== (insert 3 (sort (list 1 2)))
== (insert 3 (insert 1 (sort (list 2))))
== (insert 3 (insert 1 (insert 2 (sort '()))))
== (insert 3 (insert 1 (insert 2 '())))
== (insert 3 (insert 1 (list 2)))
== (insert 3 (cons 2 (insert 1 '())))
== (insert 3 (list 2 1))
== (list 3 2 1)
The evaluation shows how sort traverses the given list and how it sets up an application of insert for each number in the list. Put differently, sort is a two-phase program. During the first one, the recursive steps for sort set up as many applications of insert as there are items in the list. During the second phase, each application of insert traverses a sorted list that, on the average, is about half as long as the original one.

Inserting an item is similar to finding an item, so it is not surprising that insert and contains-flatt? are alike. More specifically, an application of insert to a list of N items triggers between 0 and N recursive steps. On the average, we assume it requires N/2, which—with our new terminology—means that insert takes on the order of N steps where N is the length of the given list.

Since sort uses insert once per item on the list, we also get an order of N insert steps per use of sort, where N is again the size of the list. By combining these two analysis steps, we now see that sort takes on the order of N steps itself plus N² recursive steps in insert for a list of N items. In sum, this yields

N² + N

steps, but exercise 427 asks you to argue that this is equivalent to saying that insertion sort requires on the order of N² steps. Note This analysis assumes that comparing two items on the list takes a fixed amount of time.
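
One way to double-check this analysis is to count insert's recursive steps directly. The following sketch is not from the book: isort and insert mirror the sorting function used in the hand-evaluation above (figure 47, descending order), with isort renamed to avoid any clash with a built-in sort, and insert-steps and sort-insert-steps are counting helpers introduced here:
; [List-of Number] -> [List-of Number]
; sort l in descending order (modeled after figure 47)
(define (isort l)
  (cond
    [(empty? l) '()]
    [else (insert (first l) (isort (rest l)))]))
; Number [List-of Number] -> [List-of Number]
; insert n into the sorted (descending) list l
(define (insert n l)
  (cond
    [(empty? l) (list n)]
    [else (if (>= n (first l))
              (cons n l)
              (cons (first l) (insert n (rest l))))]))
; Number [List-of Number] -> Natural
; how many recursive steps does (insert n l) trigger?
(define (insert-steps n l)
  (cond
    [(empty? l) 0]
    [else (if (>= n (first l))
              0
              (+ 1 (insert-steps n (rest l))))]))
; [List-of Number] -> Natural
; how many recursive steps does insert contribute while sorting l?
(define (sort-insert-steps l)
  (cond
    [(empty? l) 0]
    [else (+ (insert-steps (first l) (isort (rest l)))
             (sort-insert-steps (rest l)))]))
; the hand-evaluation above triggers one insert recursion;
; an ascending list of N items triggers N*(N-1)/2 of them
(check-expect (sort-insert-steps (list 3 1 2)) 1)
(check-expect (sort-insert-steps (list 1 2 3 4)) 6)

For a list of N items in the worst shape, these counts grow as 1 + 2 + ... + (N-1), which is on the order of N² and thus matches the analysis above.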

Our final example is the function inf from Local Function Definitions:
(define (inf l)
  (cond
    [(empty? (rest l)) (first l)]
    [else
     (if (< (first l) (inf (rest l)))
         (first l)
         (inf (rest l)))]))
(define (infL l)
  (cond
    [(empty? (rest l)) (first l)]
    [else
     (local ((define s (infL (rest l))))
       (if (< (first l) s)
           (first l)
           s))]))

Let’s start with a small input: (list 3 2 1 0). We know that the result is 0. Here is the first important step of a hand-evaluation:
(inf (list 3 2 1 0))
==
(if (< 3 (inf (list 2 1 0)))
    3
    (inf (list 2 1 0)))
From here, we must evaluate the first recursive call. Because the result is 0 and the condition is thus #false, we must evaluate the recursion in the else-branch as well.

Once we do so, we see how it triggers two evaluations of (inf (list 1 0)):
(inf (list 2 1 0))
==
(if (< 2 (inf (list 1 0)))
    2
    (inf (list 1 0)))
From here we can generalize the pattern and summarize it in a table:

original expression          requires two evaluations of
(inf (list 3 2 1 0))         (inf (list 2 1 0))
(inf (list 2 1 0))           (inf (list 1 0))
(inf (list 1 0))             (inf (list 0))
In sum, the hand-evaluation requires eight recursive steps for a list of four items. If we add 4 (or a larger number) to the front of the list, we double the number of recursive steps again. Speaking algebraically, inf needs on the order of 2^N recursive steps for a list of N numbers in the worst case, which is when the last number is the smallest.

Stop! If you pay close attention, you know that the above suggestion is sloppy. The inf function really just needs 2^(N-1) recursive steps for a list of N items. What is going on?

Remember that we don’t really measure the exact time when we say “on the order of.” Instead we skip over all basic functions—built-in predicates, selectors, constructors, arithmetic, and so on—and focus on recursive steps only. Now consider this calculation:

2^N = 2 · 2^(N-1)

It shows that 2^N and 2^(N-1) differ by a small factor: 2, meaning “on the order of 2^(N-1) steps” describes inf in a world where all basic functions provided by *SL run at half the speed when compared to an inf function that runs at “the order of 2^N steps.” In this sense, the two expressions really mean the same thing. The question is what they mean, and that is the subject of the next section.

While the scenario we considered is the worst possible case, the analysis of inf’s abstract running time explains why the rewrite of inf with local—see the right side above—is faster. Instead of recomputing the smallest item in the rest of the list, this version just refers to the variable that already stands for this smallest item, possibly twice. You may now wish to revisit exercise 230, which asks you to explore a similar problem.
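
To watch the doubling happen, here is a small counting sketch; it is not from the book, and inf-base-evals is a name introduced here. It assumes the definition of inf from above and counts how often an evaluation of (inf l) reaches the base case; for infL the corresponding count is always 1, because the local definition shares the result of its single recursive call:
; [List-of Number] -> Natural
; how many times does the evaluation of (inf l) reach the base case?
; in the worst case the count doubles with every additional item
(define (inf-base-evals l)
  (cond
    [(empty? (rest l)) 1]
    [else (if (< (first l) (inf (rest l)))
              (inf-base-evals (rest l))
              (* 2 (inf-base-evals (rest l))))]))
(check-expect (inf-base-evals (list 3 2 1 0)) 8) ; worst case
(check-expect (inf-base-evals (list 0 1 2 3)) 1) ; best case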

Exercise 425. Hand-evaluate (infL (list 3 2 1 0)) in a manner similar to our evaluation of (inf (list 3 2 1 0)). What is the abstract running time of infL?

Exercise 426. A number tree is either a number or a pair of number trees. Design the function sum-tree, which determines the sum of the numbers in a tree. What is its abstract running time? What is an acceptable measure of the size of such a tree? What is the worst possible shape of the tree? What’s the best possible shape?

The Definition of “on the Order of”

The preceding section alluded to all the key ingredients of the phrase “on the order of.” Now it is time to introduce a rigorous description of the phrase. Let’s start with the two ideas that the preceding section develops:
  1. The abstract measurement of performance is a relationship between two quantities: the size of the input and the number of recursive steps needed to determine the answer. The relationship is actually a (mathematical) function that maps a natural number—a measure of the input’s size—to a natural number—a count of the recursive steps.

  2. Hence, a comparison of the performance of two functions calls for the comparison of two such functions.

And this second idea brings us to the critical point: performance comparisons are comparisons of functions on natural numbers, which are infinitely large objects.

How do you decide whether one function is “better” than another? (Exercise 219 tackles the different question of whether we can formulate a function in a programming language that decides whether two given functions are equal.) In this intermezzo, we are not interested in writing a function in any programming language; we are willing to use plain mathematical arguments instead. Let’s return to the imagined functions g and h from the introduction. They compute the same results but their performance differs. The h function requires “on the order of N steps” while g uses “on the order of N² steps.” Let’s also assume that the cost of each recursive step in g is 1 and that of h is 1000.

While figure 113 displays the comparison graphically (and in general terms), it is also good to tabulate these abstract running times for a concrete comparison:

N     1       10      50      500      1000      2000
g     1       100     2500    250000   1000000   4000000
h     1000    10000   50000   500000   1000000   2000000

Like the graphs, the table at first seems to say that g’s performance is better than h’s, because for inputs of the same size N, h’s running time is always larger than g’s. But look at the last item in the table. Once the inputs get larger, g’s advantage decreases until it disappears at an input size of 1000. Thereafter g is always slower than h.

This last insight is the key to the precise definition of the phrase “order of.” If one function h on the natural numbers produces larger outputs than another function g for all possible inputs, then h is clearly larger than g. But what if this comparison fails for just a few inputs, say for 1000, and holds for all others? In that case, we would still like to say h is larger than g. And this brings us to the following definition.

Definition Given a function g on the natural numbers, O(g) (pronounced: “big-O of g”) is a class of functions on natural numbers. A function f is a member of O(g) if there exist numbers c and bigEnough such that

for all n ≥ bigEnough it is true that f(n) ≤ c · g(n)

Naturally, we would love to illustrate this definition with the example of g and h from above. Our first step is to define performance functions for g and h. We use capital letters to denote these functions, which map input sizes to performance counts:

G(N) = N²

and

H(N) = 1000 · N

The second step is to find these magic numbers c and bigEnough such that H belongs to O(G), which would validate that h’s performance is no worse than g’s. For now, we just tell you what these numbers are:

c = 1 and bigEnough = 1000

Using these numbers, we need to show that

1000 · n ≤ 1 · n²

for every single n larger than 1000. Here is how this kind of argument is settled:

Pick some specific n0 that satisfies the condition:

n0 ≥ 1000

We use the symbolic name n0 so that we don’t make any specific assumptions about it. Now recall from algebra that you can multiply both sides of the inequality with the same factor. We use n0:

1000 · n0 ≤ n0 · n0

At this point, it is time to observe that the left side of the inequality is just H(n0) and the right side is G(n0):

H(n0) ≤ G(n0)

Since n0 is a generic number of the right kind, we have shown exactly what we wanted to show.

Usually you find bigEnough and c by working your way backwards through such an argument. While this kind of mathematical reasoning is fascinating, it is best left to a course on algorithms.
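
While only a proof establishes such a claim for all natural numbers, you can at least spot-check the inequality with a small program. The following sketch is not from the book; holds-up-to? is a name introduced here, and because it checks only finitely many inputs it provides evidence, not a proof:
; [N -> Number] [N -> Number] Number N N -> Boolean
; does f(n) <= c * g(n) hold for every n from big-enough to (- limit 1)?
(define (holds-up-to? f g c big-enough limit)
  (andmap (lambda (n) (<= (f n) (* c (g n))))
          (build-list (- limit big-enough)
                      (lambda (i) (+ big-enough i)))))
(define (G n) (sqr n))     ; G(N) = N^2
(define (H n) (* 1000 n))  ; H(N) = 1000 * N
; with c = 1 and bigEnough = 1000, spot-check up to 5000:
(check-expect (holds-up-to? H G 1 1000 5000) #true)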

The definition of big-O also explains with mathematical rigor why we don’t have to pay attention to specific constants in our comparisons of abstract running times. Say we can make each basic step of g go twice as fast so that we have:

G(N) = N²/2

and

H(N) = 1000 · N

All we have to do to make the above argument go through is to double bigEnough to 2000.

Stop! Work through the argument again. Hint Start by dividing both sides of the initial inequality by 2.

Finally, most people use big-O together with a shorthand for stating functions. Thus they say how-many’s running time is O(N), because they tend to think of N as an abbreviation of the (mathematical) function id(N) = N. Similarly, this use yields the claim that sort’s worst-case running time is O(N²) and inf’s is O(2^N), again because N² is short-hand for the function sqr(N) = N² and 2^N is short for expt(N) = 2^N.

Exercise 427. In the first subsection, we stated that the function f(n) = n² + n belongs to the class O(n²). Determine the pair of numbers c and bigEnough that verify this claim.

Exercise 428. Consider the functions f(n) = 2^n and g(n) = 1000 · n. Show that g belongs to O(f), which means that f is, abstractly speaking, more expensive than (or at least as expensive as) g. If the input size is guaranteed to be between 3 and 12, which function is better?

Exercise 429. Compare image and image. Does f belong to O(g) or g to O(f)?

Why do Functions use Predicates and Selectors?

The “on the order of” concept explains why the design recipes produce both well-organized and “performant” functions. We illustrate this insight with a single example, the design of a function that searches for a number in a list of numbers. Here are the signature, the purpose statement, and examples formulated as tests:
; Number [List-of Number] -> Boolean
; is x in l
 
(check-expect (search 0 '(3 2 1 0)) #true)
(check-expect (search 4 '(3 2 1 0)) #false)

Here are two definitions that live up to these expectations:
(define (searchL x l)
  (cond
    [(empty? l) #false]
    [else
      (or (= (first l) x)
          (searchL x (rest l)))]))

  

(define (searchS x l)
  (cond
    [(= (length l) 0) #false]
    [else
      (or (= (first l) x)
          (searchS x (rest l)))]))
The design of the function on the left follows the design recipe. In particular, the development of the template calls for the use of structural predicates, one per clause in the data definition. Following this advice yields a conditional function whose first cond line deals with empty lists and the second one with all others. The question in the first cond line uses empty? and the second one uses cons? or else.

The design of the function on the right fails to live up to the structural design recipe. (Technically, searchS uses generative recursion.) Instead, it takes inspiration from the idea that lists are containers that have a size. Hence, a function can check this size for 0, which is equivalent to checking whether the given list is empty.

Although this idea is functionally correct, it makes the assumption that the cost of *SL-provided functions is a fixed constant. If, however, length is more like how-many, searchS is going to be much slower than searchL. Using our new terminology, searchL needs O(n) while searchS needs O(n²) recursive steps for a list of n items. In short, using arbitrary *SL functions to formulate conditions may shift performance from one class of functions to one that is much worse.

Let’s wrap up this intermezzo with an experiment that determines whether length is a constant-time function or whether it needs time proportional to the length of the given list. The easiest way is to define a function that creates a long list and determines how much time each version of the search function takes:
; N -> [List Boolean Boolean]
; how long do searchS and searchL take
; to look for n in (list 0 ... (- n 1))
(define (timing n)
  (local ((define long-list (build-list n (lambda (x) x))))
    (list
      (time (searchS n long-list))
      (time (searchL n long-list)))))
Now run this function on, say, 10000 and 20000. If length is like empty?, the times for the second run will be roughly twice those of the first one; otherwise, the time for searchS will increase dramatically.

Stop! Conduct the experiment.

See the end of Data Representations with Accumulators for how other languages compute the size of a container.

Assuming you have completed the experiment, you now know that length takes time proportional to the size of the given list and that the “S” in searchS stands for “squared” because its running time is O(n²). But don’t jump to the conclusion that this kind of reasoning holds for every programming language you will encounter. Many deal with containers differently than *SL. Understanding how they do this calls for the introduction of one more design concept—accumulators, which is what the final part of this book is about.