Scheme is the intermediate form

The intermediate form used by the front end is a rigidly stylized subset of Scheme. Stylistic variations that would ordinarily have no semantic significance are used to express the results of static analysis. Other information needed for optimization and transformation is conveyed by quoted data in command positions, where the value would be ignored by a standard implementation of Scheme.

The output of the front end is genuine Scheme code that can be compiled by any Scheme compiler. This makes the front end easier to understand, and has been of great benefit for testing and debugging.

Although many Scheme compilers will in fact generate better code when Twobit's front end is used as a preprocessor, the full benefits of its optimizations are realized only by a code generator that exploits the additional information that is encoded by stylistic variation and quoted constants.

The subset of Scheme used by the front end is generated by the following grammar.

 
  L  -->  (lambda (I_1 ...)
            (begin D ...)
            (quote (R F <decls> <doc>)
            E)
       |  (lambda (I_1 ... . I_rest)
            (begin D ...)
            (quote (R F <decls> <doc>)
            E)
  D  -->  (define I L)
  E  -->  (quote K)                        ; constants
       |  (begin I)                        ; variable references
       |  L                                ; lambda expressions
       |  (E0 E1 ...)                      ; calls
       |  (set! I E)                       ; assignments
       |  (if E0 E1 E2)                    ; conditionals
       |  (begin E0 E1 E2 ...)             ; sequential expressions
  I  -->  
 
  R  -->  ((I <references> <assignments> <calls>) ...)
  F  -->  (I ...)

Note that the variable x is represented by (begin x), which gives to variable references a list structure that can be shared and side effected.

For every identifier that is bound by a lambda expression, the table R contains an entry listing all references, assignments, and calls to that identifier. These references, assignments, and calls share their structure with the actual references, assignments, and calls within the body of the lambda expression. By using side effects, the compiler can substitute expressions for identifiers without having to traverse the identifer's scope. The compiler can also use side effects to rearrange or to add arguments to every call to an identifier.

Except for these R tables and constants, the Scheme expression used as the intermediate form does not otherwise contain any shared structure, not does it share structure with the original definition or expression being compiled. The intermediate form is therefore acyclic, and its printed form is genuine Scheme code, but its printed form is often very large because of the shared structures, and is hard to read because of all the clutter. A make-readable procedure is therefore used to format the intermediate code for easier comprehension by humans. Unless otherwise noted, the intermediate Scheme code shown in these notes is the output of make-readable.

This intermediate form is created by pass 1.