Benchmarks are a crock

With modern superscalar architectures, 5-level memory hierarchies, and wide data paths, changing the alignment of instructions and data can easily change the performance of a program by 20% or more, and Hans Boehm has witnessed a spectacular 100% variation in user CPU time while holding the executable file constant. Since much of this alignment is determined by the linker, loader, and garbage collector, most individual compiler optimizations are in the noise. To evaluate a compiler properly, one must look at the code that it generates, not the timings.
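The scale of this noise is easy to observe directly: timing one fixed workload several times yields a run-to-run spread that any candidate optimization must beat before its effect shows up in the timings at all. A minimal sketch, with a made-up workload standing in for a real benchmark:

```python
# Sketch (hypothetical workload): time the same fixed computation
# repeatedly. The run-to-run spread is the noise floor; a compiler
# optimization whose benefit is smaller than this spread cannot be
# detected by timing alone.
import time

def workload():
    # A fixed, deterministic computation standing in for a benchmark.
    s = 0
    for i in range(1_000_000):
        s += i * i
    return s

samples = []
for _ in range(5):
    start = time.process_time()
    workload()
    samples.append(time.process_time() - start)

spread = (max(samples) - min(samples)) / min(samples)
print(f"min {min(samples):.4f}s  max {max(samples):.4f}s  spread {spread:.0%}")
```

On real benchmarks the spread also shifts whenever the linker or garbage collector happens to move code or data, which is why it cannot be averaged away by rerunning the same binary.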

These benchmarks are not even representative

I have selected small benchmarks that illustrate just a few aspects of performance that I feel are important. These aspects of performance may not be the ones that matter for the programs you compile.

For example, your programs may be written in an imperative style, with many assignments to local variables. Twobit does not generate good code for such programs, partly because such assignments in higher-order programs are inherently inefficient and difficult to optimize, but also because this is not a style that my programs use very often.
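The cost of such assignments comes from what Scheme compiler writers call assignment conversion: a variable that is both captured by a closure and assigned cannot live in a register or stack slot, so it is boxed on the heap and every read or write pays an extra indirection. This is not Twobit's code; the same effect is merely visible in Python, which stores captured variables in heap-allocated cell objects:

```python
# Illustration (Python, standing in for Scheme): a local variable
# that is both captured by a closure and assigned is forced onto
# the heap. Scheme compilers handle the analogous case by
# "assignment conversion", boxing the variable so that each read
# and write goes through an indirection.
def make_counter():
    n = 0                # captured AND assigned: cannot stay in a register
    def bump():
        nonlocal n
        n += 1
        return n
    return bump

counter = make_counter()
counter()
counter()
# The captured variable lives in a heap-allocated cell object,
# not in the stack frame that created it:
print(counter.__closure__[0].cell_contents)  # → 2
```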

Furthermore, many aspects of performance that I feel are important have been omitted from the benchmarks because I know Twobit would generate embarrassingly poor code for them. For example, none of the benchmarks use floating point, which I have not yet begun to optimize.

A note on C and C++

It is well known that C and C++ are faster than any higher-order or garbage-collected language. If some benchmark suggests otherwise, then this merely shows that the author of that benchmark does not know how to write efficient C code.

As an example of C code that is much faster than anything that could be written in Scheme, I recommend

Andrew W. Appel. Intensional equality ;-) for continuations. ACM SIGPLAN Notices 31(2), February 1996, pages 55-57.