Assignment #9: (Not for submission) Toy CompilerOut: Wednesday, April 23rd Due: Wednesday, April 23rd, 0:00midnight |
Same pairs, unless you let me know.
Update the course plugin for the language level for this assignment.
In this assignment you will build on the interpreter that was extended in the previous homework, so begin by retrieving the posted solution to be used as a basis for this one.
In this homework, you will turn the extended TOY evaluator into a compiler. Do this in small steps, following the same process that was demonstrated in class, starting with the posted solution:
;; compiler-enabled? : (Box Boolean)
;; a global flag that can disable the compiler
(define compiler-enabled? (box #f)) |
;; run : String -> Any
;; compiles and runs a TOY program contained in a string
(define (run str)
(set-box! compiler-enabled? #t)
(let ([compiled (compile (parse str))])
(set-box! compiler-enabled? #f)
(let ([result (compiled global-environment)])
(cases result
[(ScmV v) v]
[else (error 'run
"evaluation returned a bad value: ~s"
result)])))) |
(unless (unbox compiler-enabled?)
(error 'compile "compiler disabled")) |
One problem that you will need to deal with is the fact that the extended language has several places were multiple expressions are evaluated one by one, using eval-body. There are several possible ways to deal with compiling such multiple-expression bodies:
When you pull out all dependencies on the input syntax, pay attention to the case of applying an rfun. In this case, there are two things that can be done at compile time:
Note that the check happens at compile-time, so you don't know whether the function that is applied is going to be a FunV or an RFunV. (There is no way to know that at compile time, since some function calls can be used with a variable function value that can be either one.) You should therefore only perform the check, but raise the error only if you actually get an RFunV function (at run-time).
Also, the list of symbols that you grab is used only in case of an RFunV application. You can use the fact that Scheme is dynamically typed and combine both of these by computing this:
(and (andmap Id? arg-exprs)
(map Id-name arg-exprs)) |
This question is not required, do not submit a solution. Do it only if you think it will be fun. (You can talk to me about it if you did it and want to get some karma points.)
Going back in time, remember that we talked about de-Bruijn indices as an
alternative for variable names. Now that you have an actual compiler, you
combine the de-Bruijn preprocessor with your compiler. This is easy since
they both operate on expressions: you will need to add an argument to the
compiler that describes the names that are known in the current lexical
scope (a list of lists, since each environment frame has a list of names
(no need for the functional representation that was used in the BRANG*
homework)), and use this value to precompute the position of names when
they are looked up. As a result, you will not need to do any lookups
at run-time, since you will know where to find the required value (the
frame number and the index number in the frame). (Dealing with RFunVs
will be a little tricky here.)
If you do this properly, you will notice that you no longer need the names
in the environment, since you pull out the values directly. This makes it
possible to change the environment implementation (as in the BRANG*
homework) so it no longer contains names. Note that global names will
require some special treatment — you can either make up a description of
the global environment to start the compilation with, or you can
‘inline’ global values when the compiler sees them. In either of these,
global lookups are also eliminated in the run-time (inlining requires
forbidding mutation of global bindings). Also note that you can make
closures keep only their arity, instead of a list of names that are not
needed.
For an even better performance, you can use vectors for the implementation
of frames. They are a little better because they have constant-time
access to all elements, unlike lists. See make-vector,
vector-ref, etc.
Throughout all this, you can use some code, like
(time (run "{bindrec {{fib {fun {n}
{if {< n 2}
n
{+ {fib {- n 1}}
{fib {- n 2}}}}}}}
{fib 27}}")) |