Twobit: description of benchmarks

Description of Benchmarks

fib30

Doubly recursive computation of the 30th fibonacci number (832040), using (< n 2) to terminate the recursion. Register windows should perform well on this benchmark, but gcc squanders this advantage by allocating a register window even for the base case.

tak

10 repetitions of the triply recursive Gabriel benchmark.

takl

1 repetition of the tak benchmark, but using lists as a unary notation for integers. Another Gabriel benchmark. Although this is an allocation-intensive benchmark, it is so small that the C versions make no attempt to reclaim storage.

cpstak

The tak benchmark in continuation-passing style, as a test of closure creation. The tested implementations of Common Lisp cannot run this benchmark because their compilers convert the tail recursion into an extremely deep recursion.

idiv2

This Gabriel benchmark divides 300 by 2 several times, using lists as a unary notation for integers. It tests null?, cons, car, cdr, and little else.

rdiv2

Same as idiv2 except it uses deep recursion instead of iteration. Sun Common Lisp and AKCL use the SPARC's register windows to implement recursion, which accounts for their poor showing on this benchmark.

boyer

The kernel of a simple-minded theorem prover for propositional logic. This is one of the largest Gabriel benchmarks.

perm8

Creates a list containing the 40320 permutations of a list of 8 integers, in grey code order, using Zaks's algorithm. This benchmark allocates about 1199296 bytes, all of which goes into the result. In other words, this is an allocation-intensive benchmark that creates no garbage whatsoever.

10perm8

Repeats the perm8 benchmark 10 times, starting with a heap that already contains the output of the perm8 benchmark. The graph of live storage versus time has ten peaks at 2.4 megabytes:

     *     *     *     *     *     *     *     *     *     *
   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***
 ***** ***** ***** ***** ***** ***** ***** ***** ***** *****
************************************************************
************************************************************
************************************************************

At the end of each repetition, the oldest half of the live storage becomes garbage.

10perm8 is the only serious test of garbage collection within this set of benchmarks. It is particularly difficult for generational garbage collectors, since it violates their assumption that young objects have a shorter future life expectancy than older objects. Larceny uses a one-megabyte ephemeral area, so it tends to promote objects that are about to become garbage.

The C versions of this benchmark cheat slightly to simplify the complex structure-sharing of the real benchmark. Otherwise explicit deallocation would not have been practical.

sumperms8

Calculates the sum of the lists allocated by perm8. This benchmark consists of a doubly nested loop. The body of the inner loop is executed 8 times on every iteration of the outer loop. Scheme and Standard ML express this loop structure by using a non-tail-recursive call to invoke the inner loop. None of the tested Scheme or Standard ML compilers are able to avoid creating a continuation for each invocation of the inner loop, an inefficiency that probably shows up on the selectionsort and puzzle benchmarks also. A newer version of SML/NJ is much better at this.

mergesort8

Sorts the list of 40320 permutations that was created by perm8 into lexicographic order, using a stable merge sort. This benchmark, like the next two, uses side effects that may stress the write barrier of a generational garbage collector.

quicksort

Creates an array of 30000 randomly chosen integers and sorts it using a generic array quicksort.

selectionsort

Creates an array of 2000 randomly chosen integers and sorts it into numerically increasing order. The sorting routine is not generic, which allows SML/NJ to bypass the write barrier. This benchmark is written in a C-like style. I don't know why Sun Common Lisp does so much better than Larceny and Chez Scheme on the quicksort and selectionsort benchmarks.

insertionsort

Creates a list of the first 1000 integers in reverse order and sorts it non-destructively. This benchmark was written by Olsson as an allocation-intensive benchmark to compare the speed of Standard ML with that of C++ using conservative garbage collection.

recursive-nfa

1000 repetitions of a backtracking recursive implementation of the obvious nondeterministic finite automaton for ((a | c)* b c d) | (a* b c) being applied to the input a^133 b c. This benchmark converts a string into a list at each repetition, which may bias the results. As a deep recursion, it penalizes implementations that use the SPARC's register windows.