6.2.0.3
Intermezzo: The Cost of Computation
What do you know about function f once the following tests
succeed:If you wish to make sure, once and for all, that a
program performs the desired computation, you need to study logic and its
application to computing. Logic relates to computing like analysis to
engineering. This book’s design process is deeply informed by logic, and a
course on logic in computation is an ideal complement.
If this question showed up on a standard test, you might guess at this
definition:
But nothing speaks against the following:
Tests alone tell you little else than that a function works as expected on
a select, finite number of inputs. Naturally, your confidence that the
function works properly overall increases when you follow a systematic
design process.
In the same spirit, timing the evaluation of a function application on
specific inputs tells you how long it takes to compute the answer for
those inputs. Now imagine you have two functions—g and
h—that compute the same answers when given the same input and
assume that g always computes the answer faster than
h. Making Choices presents just such a choice with its
comparison of gcd, a structurally-recursive function, and
gcd-generative, an equivalent but generative-recursive function.
The timing comparison suggests that the latter is much faster than the
former.
Figure 113: A comparison of two running time expressions
How confident are you that you wish to use g instead of
h? Consider the graph in figure 113. The
x-axis records the size of the input—say the number of items on a
list—and the y-axis records the time it takes to compute the
answer. Assume that the straight line represents the running time of
g and the curved graph represents h. In the shaded
region, g takes less time than h. But, at the edge of
the shaded region, the two graphs cross, and to its right, the performance
of h is better than that of g. If, for whatever reasons,
you had evaluated the performance of g and h only for
input sizes in the blue region and if your clients were to run your
program mostly on inputs that fall in the non-shaded region, you would be
delivering the wrong function.
This intermezzo introduces a phrase (and its meaning) for making general
statements about the performance of a program. Any serious programmer must
be thoroughly familiar with this notion. It is the most fundamental method
for analyzing and comparing performance attributes of programs, but in
addition, it is also a generally concept for describing the growth of
functions in many other disciplines. This intermezzo provides a first
glimpse at the idea; a text on algorithms provides a mechanism for the
deriving “order or of” estimates for the performance of algorithms. The
first subsection is an example-driven introduction to the idea. The second
one provides a rigorous definition. The last subsection explains why we
insist that the template for structural design uses predicates and
selectors.
Concrete Time, Abstract Time
Making Choices compares the running time of
gcd and
gcd-generative. In addition, it shows how to count how many times
the functions recur. The latter suggests why the running time for
gcd-generative compares so favorably with
gcd. Hence we
use this idea as a starting point, starting with a second look at
how-many from
Designing With Self-Referential Data Definitions, a function that is easy
to understand:
It consumes a list and computes how many items the list contains.
Suppose we wanted to know how long it takes to compute the length of an
arbitrarily long, non-empty list. Using the rules of computation from
Intermezzo: BSL, we can simulate this computation as a series of algebra-like
manipulations:
(how-many some-non-empty-list) |
== |
(cond |
[(empty? some-non-empty-list) 0] |
[else (+ (how-many (rest some-non-empty-list)) 1)]) |
== |
(cond |
[#false 0] |
[else (+ (how-many (rest some-non-empty-list)) 1)]) |
== |
(cond |
[else (+ (how-many (rest some-non-empty-list)) 1)]) |
== |
(+ (how-many (rest some-non-empty-list)) 1) |
The first step is to replace
a-list in the definition of
how-many with the actual argument,
some-non-empty-list, which
yields the first
cond expression. Next we must evaluate
By assumption the result is
#false. The question is how long it
takes. While we don’t know and don’t care about the precise amount, it is
safe to say that this step takes a fixed amount of time. Indeed, this also holds
for the next step, when
cond checks what the value of first condition
is. Since it is
#false, the first
cond line is
dropped. Checking whether a
cond line starts with
else is
equally fast, which means we are left with
(+ (how-many (rest some-non-empty-list)) 1)
If we knew the value of
some-non-empty-list,
rest would extract
the remainder of the list in another fixed amount of time, but then we would be
stuck. To compute how long
how-many takes to determine the length of
some list, we need to know how long
how-many takes to compute the number
of items in the rest of that list. In short, what really matters is the number of
recursive steps, not how many small, fixed-time computations we perform.
So, here is a sample evaluation of how-many applied to '(a b c),
showing only the intermediate step where a new recursion comes about:
(how-many (list 'a 'b 'c)) |
== (+ (how-many (list 'b 'c)) 1) |
== (+ (+ (how-many (list 'c)) 1) 1) |
== (+ (+ (+ (how-many '()) 1) 1) 1) |
== 3 |
As far as timing is concerned, the steps in between really are the same. If we
apply how-many to a shorter list, we need fewer recursive steps:
(how-many (list 'e)) |
== (+ (how-many '()) 1) |
== 1 |
If we apply how-many to a longer list, we need more steps. All in all,
the time it takes to evaluate (how-many some-list) is the number of items
on the list times the fixed amount of time for the small steps of checking
emptiness and adding 1 to some natural number.
Generalizing from this example suggests that the running time depends on the size
of the input.Abstract because the measure ignores the
details of how much time primitive computation steps take. More importantly,
though, it also implies that the number of recursive steps is a good estimate for
the length of an evaluation sequence and thus the evaluation time. For this
reason, computer scientists discuss the abstract running time of a
program as a relationship between the size of the input and the number of
recursive steps in an evaluation. In our first example, the size of the input is
the number of items on the list. Thus, a list of one item requires one recursive
step, a list of two needs two steps, and for a list of N items, it’s
N steps.
Not all functions have such a simple measure of abstract running time. Take a
look at the first recursive function in this book:
For a list that starts with 'flatt
(contains-flatt? (list 'flatt 'robot 'ball 'game-boy 'pokemon))
the function requires no recursive steps at all. In contrast, if
'flatt occurs at the end of the list,
(contains-flatt? (list 'robot 'ball 'game-boy 'pokemon 'flatt))
the evaluation requires as many recursive steps as there are items in the list.
In other words, in the best case, the function can find the answer immediately;
in the worst case, the function must search the entire input list.
Programmers cannot assume that inputs are always of the best posisble shape; and
they must hope that the inputs are not of the worst possible shape. Instead, they
must analyze how much time their functions take {on the average}. For example,
contains-flatt? may—
on the average—find 'flatt somewhere in
the middle of the list. Thus, we could say that if the input contains N
items, the abstract running time of contains-flatt? is
, that is, it recurs half as often as the number of items on the
input. Because we already measure the running time in an abstract manner, we can
actually ignore the division. Here is why. By assumption, each basic step takes
K units of time but we know that every so often, software and computers
get twice as fast, meaning we could use as the constant for
basic work. Since
we can ignore such constant factors.
Computer scientists use the phrase a function f takes “on the order of
N steps” to indicate that they are talking about the abstract, rather
than concrete, running time of f. To use the phrase correctly, it must
come with an explanation of N, for example, “it counts the number of
items on the given list” or “it is the number of digits in the given number.”
Without such an explanation, the original phrase is actually meaningless.
Now consider our standard sorting function from
figure 47.
Here is a hand-evaluation for a small input, listing all recursive steps:
(sort (list 3 1 2)) |
== (insert 3 (sort (list 1 2))) |
== (insert 3 (insert 1 (sort (list 2)))) |
== (insert 3 (insert 1 (insert 2 (sort '())))) |
== (insert 3 (insert 1 (insert 2 '()))) |
== (insert 3 (insert 1 (list 2))) |
== (insert 3 (cons 2 (insert 1 '()))) |
== (insert 3 (list 2 1)) |
== (insert 3 (list 2 1)) |
== (list 3 2 1) |
The evaluation shows how
sort traverses the given list and how it sets up
an application of
insert for each number in the list. Put differently,
sort is a two-phase program. During the first one, the recursive steps
for
sort set up as many applications of
insert as there are
items in the list. During the second phase, each application of
insert
traverses a sorted list that, on the average, is about half as long as the
original one.
Inserting an item is similar to finding an item, so it is not surprising that
insert and contains-flatt? are alike. More specifically, the
applications of insert to a list of N items triggers between
0 and N recursive steps. On the average, we assume it requires
, which—with our new terminology—means that insert takes
on the order of N steps where N is the length of the given list.
Since
sort uses
insert once per item on the list, we also get an
order of
N insert steps per use of
sort, where
N
is again the size of the list. By combining these two analysis steps, we now see
that
sort takes on the order of
N steps itself plus
N2
recursive steps in
insert for a list of
N items. In sum, this
yields
steps, but
exercise 427 ask you to argue that this is equivalent to saying
that insertion sort requires on the order of
N2 steps.
Note This
analysis assumes that comparing two items on the list takes a fixed amount of time.
Let’s start with a small input:
(list 3 2 1 0). We know that the
result is
3. Here is the first important step of a hand-evaluation:
From here, we must evaluate the first recursive call. Because the result is
0 and the condition is thus #false, we must evaluate the
recursion in the else-branch as well.
Once we do so, we see how it triggers two evaluations of
(inf (list 1 0)):
From here we can generalize the pattern and summarize it in a table:
original expression | | requires two evaluations of |
(inf (list 3 2 1 0)) | | (inf (list 2 1 0)) |
(inf (list 2 1 0)) | | (inf (list 1 0)) |
(inf (list 1 0)) | | (inf (list 0)) |
In sum, the hand-evaluation requires eight recursive steps for a list of
four items. If we add 4 (or a larger number) to the front of the
list, we double the number of recursive steps again. Speaking
algebraically, inf needs on the order of 2N recursive
steps for a list of N numbers in the worst case, which is when the
last number is the maximum.
Stop! If you pay close attention, you know that the above suggestion is
sloppy. The inf function really just needs
recursive steps for a list of N items. What is going on?
Remember that we don’t really measure the exact time when we say “on the
order of.” Instead we skip over all basic functions—build-in predicates,
selectors, constructors, arithmetic, and so on—and focus on recursive
steps only. Now consider this calculation:
It shows that
and
differ by a small factor:
2
, meaning “on the order of
steps” describes
inf in a world where all basic functions provided by
*SL run
at half the speed when compared to an
inf function that runs at
“the order of
steps.” In this sense, the two expressions
really mean the same thing. The question is what they mean, and that is
the subject of the next section.
While the scenario we considered is the worst possible case, the analysis
of inf’s abstract running time explains why the rewrite of
inf with local—see the right side above—is faster.
Instead of recomputing the smallest item in the rest of the list, this
version just refers to the variable twice when the variable stands for the
smallest item already. You may now wish to revisit
exercise 230, which asks you to explore a similar problem.
Exercise 425. Hand-evaluate (infL (list 3 2 1 0)) in
a manner similar to our evaluation of (inf (list 3 2 1 0)). What
is the abstract running time of inv.v2?
Exercise 426. A number tree is either a number or a pair of
number trees. Design the function sum-tree, which determines the
sum of the numbers in a tree. What is its abstract running time? What is
an acceptable measure of the size of such a tree? What is the worst
possible shape of the tree? What’s the best possible shape?
The Definition of “on the Order of”
The preceding section alluded to all the key ingredients of the phrase “on
the order of.” Now it is time to introduce a rigorous description of the
phrase. Let’s start with the two ideas that the preceding section
develops:
The abstract measurement of performance is a relationship between
two quantities: the size of the input and the number of recursive steps
needed to determine the answer. The relationship is actually a
(mathematical) function that maps a natural number—a measure of the
input’s size—to a natural number—a count of the recursive steps.
Hence, a comparison of the performance of two functions calls for the
comparison of two such functions.
And this second idea brings us to the critical point: performance
comparisons are comparisons of functions on natural numbers, which are
infinitely large objects.
How do you decide whether one function is “better” than
another?Exercise 219 tackles the different question
of whether we can formulate a function in a programming language that
decides whether two given functions are equal. In this intermezzo, we are
not interested in writing a function in any programming language; we are
willing to use plain mathematical arguments instead. Let’s return to the
imagined functions g and h from the introduction. They
compute the same results but their performance differs. The h
function requires “on the order of N steps” while g uses
“on the order of N2 steps.” Let’s also assume that the cost of
each recursive step in g is 1 and that of h is
1000.
While figure 113 displays the comparison graphically (and
in general terms), it is also good to tabulate these abstract running
times for a concrete comparison:
N | | 1 | | 10 | | 50 | | 500 | | 1000 | | 2000 |
g | | 1 | | 100 | | 2500 | | 250000 | | 1000000 | | 4000000 |
h | | 1000 | | 10000 | | 50000 | | 500000 | | 1000000 | | 2000000 |
Like the graphs, the table at first seems to say that g’s
performance is better than h’s, because for inputs of the same
size N, h’s running time is always larger than
h’s. But look at the last item in the table. Once the inputs get
larger, g’s advantage decreases until it disappears at an input
size of 1000. Thereafter g is always slower
than h.
This last insight is the key to the precise definition of the phrase
“order of.” If an infinite function h produces larger outputs
than some other infinite function g for all natural
numbers, then h is clearly larger than g. But what if
this comparison fails for just a few inputs, say for 1000, and
holds for all others? In that case, we would still like to say h
is better than g. And this brings us to the following definition.
Definition Given a function g on the natural numbers,
O(g) (pronounced: “big-O of g”) is a class of functions on
natural numbers. A function f is a member of O(g) if
there exist numbers c and bigEnough such that
for all
it is true that
Naturally, we would love to illustrate this definition with the example of
g and h from above. Our first step is to define
performance functions for g and h. We use capital
letters to denote these functions, which map input sizes to performance
counts:
and
The second step is to find these magic numbers
c and
bigEnough such that
, which would validate that
h’s performance is no worse than
g’s. For now, we just
tell you what these numbers are:
Using these numbers, we need to show that
for every single n larger than 1000. Here is now this kind
of argument is settled:
Pick some specific n0 that satisfies the
condition:
We use the symbolic name n0 so that we don’t make any specific
assumptions about it. Now recall from algebra that you can multiply both
sides of the inequality with the same factor. We use n0:
At this point, it is time to observe that the left side of the inequality
is just H(n0) and the right side is G(n0):
Since n0 is a generic number of the right kind, we have shown
exactly what we wanted to show.
Usually you find bigEnough and c working your way backwards
through such an argument. While this kind of mathematical reasoning is
fascinating, it is best left to a course on algorithms.
The definition of big-O also explains with mathematical rigor why we don’t
have to pay attention to specific constants in our comparisons of abstract
running times. Say we can make each basic step of g go twice as
fast so that we have:
and
All we have to do to make the above argument go through is to double
bigEnough to 2000.
Stop! Work through the argument again. Hint Start by dividing both
sides of the initial inequality by 2.
Finally, most people use big-O together with a shorthand for stating
functions. Thus they say how-many’s running time is
O(N)—because they tend to think of N as an abbreviation of
the (mathematical) function id(N) = N. Similarly, this use yields
the claim that sort’s worst-case running time is O(N2)
and inc’s is O(2N)—again because N2 is
short-hand for the function sqr(N) = N2 and 2N is short
for expt(N) = 2N.
Exercise 427. In the first subsection, we stated that the function
f(n) = n2 + n belongs to the class O(n2). Determine the
pair of numbers c and bigEnough that verify this claim.
Exercise 428. Consider the functions f(n) = 2n and
. Show that g belongs to O(f),
which means that f is abstractly speaking more (or at least
equally) expensive than g. If the input size is guaranteed to be
between 3 and 12, which function is better?
Exercise 429. Compare and . Does f belong to O(g) or g to O(f)?
Why do Functions use Predicates and Selectors?
The “on the order of” concept explains why the design recipes produce
both well-organized and “performant” functions. We illustrate this
insight with a single example, the design of a function that searches for
a number in a list of numbers. Here are the signature, the purpose
statement, and examples formulated as tests:
Here are two definitions that live up to these expectations:
The design of the function on the left follows the design recipe. In
particular, the development of the template calls for the use of
structural predicates per clause in the data definition. Following this
advice yields a conditional function whose first
cond line deals
with empty lists and the second one with all others. The question in the
first
cond line uses
empty? and the second one uses
cons? of
else.
The design of the function on the right fails to live up to the structural
design recipe.Technically searchS uses generative
recursion. It instead takes inspiration from the idea that lists are
containers that have a size. Hence, a function can check this size for
0, which is equivalent to checking whether the given list is
empty.
Although this idea is functionally correct, it makes the assumption that
the cost of *SL-provided functions is a fixed constant. If, however,
length is more like how-many, searchS is going
to be much slower than searchL. Using our new terminology,
searchL is using O(n) while searchS needs
O(n2) recursive steps for a list of n items. In short,
using arbitrary *SL functions to formulate conditions may shift
performance from one class of functions to one that is much worse.
Let’s wrap up this intermezzo with an experiment that determines whether
length is a constant-time function or whether it consumes time
proportionally to the length of the given list. The easiest way is to
define a function that creates a long list and determines how much time
each version of the search function takes:
Now run this function on, say,
10000 and
20000.
If
length is like
empty?, the times for the second run
will be roughly twice of the first one; otherwise, the time for
searchS will increase drammatically.
Stop! Conduct the experiment.
See the end of Data Representations with Accumulators for how other
languages compute the size of a container.
Assuming you have completed the experiment, you now know that
length takes time proportionally to the size of the given list
and the “S” in searchS stands for “squared” because its
running time is O(n2). But don’t jump to the conclusion that this
kind of reasoning holds for every programming language you will encounter.
Many deal with containers differently than *SL. Understanding how they
do this, calls for the introduction of one more design
concept—accumulators, which is what the final part of this book is
about.