2008-02-08  Substitution Caches, Evaluating with Caches

========================================================================
>>> Using Higher-Order & Anonymous Functions

Say that we have a function for estimating derivatives of a function at
a specific point:

  (define dx 0.01)

  ;; deriv : (Number -> Number) Number -> Number
  ;; compute the derivative of `f' at the given point `x'
  (define (deriv f x)
    (/ (- (f (+ x dx)) (f x)) dx))

  ;; integrate : (Number -> Number) Number -> Number
  ;; compute an integral of `f' at the given point `x'
  (define (integrate f x)
    (define (loop y acc)
      (if (> y x)
        (* acc dx)
        (loop (+ y dx) (+ acc (f y)))))
    (loop 0 0))

And say that we want to try out various functions given some `plot-fun'
procedure that knows how to draw graphs of numeric functions, for
example:

  (plot-fun sin)

To actually try this out, write this:

  (require (lib "plot.ss" "plot"))
  (define (plot-fun fun)
    (plot (line fun (color 'red))))

The problem is that `plot-fun' expects a single `(Number -> Number)'
function -- if we want to try it with a derivative, we can do this:

  ;; sin-deriv : Number -> Number
  ;; the derivative of sin
  (define (sin-deriv x) (deriv sin x))
  (plot-fun sin-deriv)

But this will get very tedious very fast -- it is much simpler to use an
anonymous function:

  (plot-fun (lambda (x) (deriv sin x)))

We can even verify that our derivative is correct by comparing the
estimated derivative of a function with its known derivative:

  (plot-fun (lambda (x) (- (deriv sin x) (cos x))))

But it's still not completely natural to do these things -- you need to
explicitly combine functions, which is not too convenient.  Instead of
doing this, we can write higher-order functions that work with
functional inputs and outputs.
For example, we can write a function to subtract functions:

  ;; fsub : (Number -> Number) (Number -> Number) -> (Number -> Number)
  ;; subtracts two numeric 1-argument functions
  (define (fsub f g)
    (lambda (x) (- (f x) (g x))))

and the same for the derivative:

  ;; fderiv : (Number -> Number) -> (Number -> Number)
  ;; compute the derivative function of `f'
  (define (fderiv f)
    (lambda (x) (deriv f x)))

Now we can try the same in a much easier way:

  (plot-fun (fsub (fderiv sin) cos))

More than that -- our `fderiv' could be created from `deriv'
automatically:

  ;; currify : (A B -> C) -> (A -> B -> C)
  (define (currify f)
    (lambda (x) (lambda (y) (f x y))))

  ;; fderiv : (Number -> Number) -> (Number -> Number)
  (define fderiv (currify deriv))

Same principle with `fsub':

  ;; binop->fbinop : (Number Number -> Number)
  ;;                 -> (Number -> Number) (Number -> Number)
  ;;                 -> (Number -> Number)
  ;; turns an arithmetic binary operator to a function operator
  (define (binop->fbinop op)
    (lambda (f g)
      (lambda (x) (op (f x) (g x)))))

  ;; fsub : (Number -> Number) (Number -> Number) -> (Number -> Number)
  (define fsub (binop->fbinop -))

We can do this with anything -- developing a rich library of functions
and functionals is extremely easy...
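
To see what these generated functions compute on concrete numbers, here
is a minimal sketch in plain Racket.  It repeats `deriv', `currify' and
`binop->fbinop' from above so that it runs on its own; `square',
`double', `diff-at-5' and `slope-at-3' are hypothetical names used only
for this illustration.

```racket
#lang racket
;; definitions repeated from the text so the sketch is self-contained
(define dx 0.01)
(define (deriv f x) (/ (- (f (+ x dx)) (f x)) dx))
(define (currify f) (lambda (x) (lambda (y) (f x y))))
(define (binop->fbinop op)
  (lambda (f g) (lambda (x) (op (f x) (g x)))))

(define fderiv (currify deriv))
(define fsub   (binop->fbinop -))

;; hypothetical helpers, not part of the library in the text
(define (square x) (* x x))
(define (double x) (* 2 x))

;; (fsub square double) is the function x -> x^2 - 2x,
;; so at 5 it computes 25 - 10
(define diff-at-5 ((fsub square double) 5))

;; (fderiv square) estimates the slope of x^2 with a forward
;; difference, so at 3 the result is close to 6 (off by about dx)
(define slope-at-3 ((fderiv square) 3))
```

Note that the estimate for the slope is not exact: with a forward
difference the error for `square' is about `dx', which is why the plots
of such differences are not perfectly flat.
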
Here's a pretty extensive yet very short library of functions:

----------------------------------------------------------------------
(define (currify f)
  (lambda (x) (lambda (y) (f x y))))
(define (binop->fbinop op)
  (lambda (f g)
    (lambda (x) (op (f x) (g x)))))
(define (compose f g)
  (lambda (x) (f (g x))))

(define dx 0.01)
(define (deriv f x)
  (/ (- (f (+ x dx)) (f x)) dx))
(define (integrate f x)
  (define over (if (< x 0) < >))
  (define step (if (< x 0) - +))
  (define add  (if (< x 0) - +))
  (define (loop y acc)
    (if (over y x)
      (* acc dx)
      (loop (step y dx) (add acc (f y)))))
  (loop 0 0))

(define fadd (binop->fbinop +))
(define fsub (binop->fbinop -))
(define fmul (binop->fbinop *))
(define fdiv (binop->fbinop /))
(define fderiv     (currify deriv))
(define fintegrate (currify integrate))
;; ...
----------------------------------------------------------------------

Examples:

  ;; want to verify that `integrate' is the opposite of `deriv':
  ;;   take a function and subtract its derivative's integral from it
  (plot-fun (fsub sin (fintegrate (fderiv sin))))

  ;; want to magnify the errors? -- here's how you magnify:
  (plot-fun (compose ((currify *) 5) sin))

  ;; so:
  (plot-fun (compose ((currify *) 20)
                     (fsub sin (fintegrate (fderiv sin)))))

All of this is similar to run-time code generation, but not really.  The
only thing that `fderiv' does is take a function and store it somewhere
in the returned function, then when that function receives a number, it
uses the stored function and sends it to `deriv' with the number.  We
could simply write `deriv' as what `fderiv' is -- which is the *real*
derivative function:

  (define (deriv f)
    (lambda (x)
      (/ (- (f (+ x dx)) (f x)) dx)))

but again, this is not faster or slower than the plain `deriv'.
However, there are some situations where we can do some of the
computation on the first-stage argument, saving work from the second
stage.
Here is a cooked-to-exaggeration example -- we want a function that
receives two inputs x,y and returns fib(x)*y, but we must use a stupid
`fib':

  (define (fib n)
    (if (<= n 1)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

The function we want is:

  (define (bogus x y)
    (* (fib x) y))

If we currify it as usual (or just use `currify'), we get:

  (define (bogus x)
    (lambda (y)
      (* (fib x) y)))

And try this several times:

  (define bogus24 (bogus 24))
  (map bogus24 '(1 2 3 4 5))

But in the definition of `bogus', notice that `(fib x)' does not depend
on `y' -- so we can rewrite it a little differently:

  (define (bogus x)
    (let ([fibx (fib x)])
      (lambda (y)
        (* fibx y))))

and trying the above again is much faster now:

  (define bogus24 (bogus 24))
  (map bogus24 '(1 2 3 4 5))

========================================================================
>>> Substitution Caches

Evaluating using substitutions is very inefficient -- at each scope, we
copy a piece of the program AST.  This includes all function calls which
implies an impractical cost (function calls should be *cheap*!).

To get over this, we want to use a cache of substitutions.

Basic idea: we begin evaluating with no cached substitutions, then
collect them as we encounter bindings.

Implies another change for our evaluator: we don't really substitute
identifiers until we get there, so when we reach an identifier it is no
longer an error -- we must consult the substitution cache at that point.

========================================================================
>>> Initial Implementation of Cache Functionality

First, we need a type for a substitution cache.
For this we will use a list of lists of two elements each -- a name and
its value FLANG:

  ;; a type for substitution caches:
  (define-type SubstCache = (Listof (List Symbol FLANG)))

We need to have an empty substitution cache, a way to extend it, and a
way to look things up:

  (: empty-subst : SubstCache)
  (define empty-subst null)

  (: extend : (Symbol FLANG SubstCache -> SubstCache))
  (define (extend name val sc)
    (cons (list name val) sc))

  (: lookup : (Symbol SubstCache -> FLANG))
  (define (lookup name sc)
    (cond [(null? sc) (error 'lookup "no binding for ~s" name)]
          [(eq? name (first (first sc))) (second (first sc))]
          [else (lookup name (rest sc))]))

Actually, the reason to use such list of lists is that Scheme has a
built-in procedure called `assq' that will do this kind of search
(`assq' is a search in an association list using `eq?' for the key
comparison).  This is a version of `lookup' that uses `assq':

  (: lookup : (Symbol SubstCache -> FLANG))
  (define (lookup name sc)
    (let ([cell (assq name sc)])
      (if cell
        (second cell)
        (error 'lookup "no binding for ~s" name))))

========================================================================
>>> Formal Rules for Cached Substitutions

The formal evaluation rules are now different.  Evaluation carries along
a "substitution cache" that begins its life as empty: so `eval' needs an
extra argument.  We begin by writing the rules that deal with the cache,
and use the above function names for simplicity -- the behavior of the
three definitions can be summed up in a single rule for `lookup':

  lookup(x,empty-subst)     = error!
  lookup(x,extend(x,E,sc))  = E
  lookup(x,extend(y,E,sc))  = lookup(x,sc)  if `x' is not `y'

And now we can write the new rules for `eval':

  eval(N,sc)                = N
  eval({+ E1 E2},sc)        = eval(E1,sc) + eval(E2,sc)
  eval({- E1 E2},sc)        = eval(E1,sc) - eval(E2,sc)
  eval({* E1 E2},sc)        = eval(E1,sc) * eval(E2,sc)
  eval({/ E1 E2},sc)        = eval(E1,sc) / eval(E2,sc)
  eval(x,sc)                = lookup(x,sc)
  eval({with {x E1} E2},sc) = eval(E2,extend(x,eval(E1,sc),sc))
  eval({fun {x} E},sc)      = {fun {x} E}
  eval({call E1 E2},sc)
           = eval(Ef,extend(x,eval(E2,sc),sc))
                             if eval(E1,sc)={fun {x} Ef}
           = error!          otherwise

Note that there is no mention of `subst' -- the whole point is that we
don't really do substitution, but use the cache instead.  The `lookup'
rules, and the places where `extend' is used, replace `subst', and
therefore specify our scoping rules.

Also note that the rule for `call' is still very similar to the rule for
`with', but it looks like we have lost something -- the interesting bit
with substituting into `fun' expressions.

========================================================================
>>> Evaluating with Substitution Caches

Implementing the new `eval' is easy now -- it is extended in the same
way that the formal `eval' rule is extended:

  (: eval : (FLANG SubstCache -> FLANG))
  ;; evaluates FLANG expressions by reducing them to expressions
  (define (eval expr sc)
    (cases expr
      [(Num n) expr]
      [(Add l r) (arith-op + (eval l sc) (eval r sc))]
      [(Sub l r) (arith-op - (eval l sc) (eval r sc))]
      [(Mul l r) (arith-op * (eval l sc) (eval r sc))]
      [(Div l r) (arith-op / (eval l sc) (eval r sc))]
      [(With bound-id named-expr bound-body)
       (eval bound-body
             (extend bound-id (eval named-expr sc) sc))]
      [(Id name) (lookup name sc)]
      [(Fun bound-id bound-body) expr]
      [(Call fun-expr arg-expr)
       (let ([fval (eval fun-expr sc)])
         (cases fval
           [(Fun bound-id bound-body)
            (eval bound-body
                  (extend bound-id (eval arg-expr sc) sc))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)]))]))

Again, note that we don't need `subst'
anymore, but the rest of the code (the data type definition, parsing,
and `arith-op') is exactly the same.

Finally, we need to make sure that `eval' is initially called with an
empty cache.  This is easy to change in our main `run' entry point:

  (: run : (String -> Number))
  ;; evaluate a FLANG program contained in a string
  (define (run str)
    (let ([result (eval (parse str) empty-subst)])
      (cases result
        [(Num n) n]
        [else (error 'run
                     "evaluation returned a non-number: ~s" result)])))

The full code (including the same tests) follows.  Note that one test
does not pass.

----------------------------------------------------------------------
(define-type FLANG
  [Num  (n Number)]
  [Add  (lhs FLANG) (rhs FLANG)]
  [Sub  (lhs FLANG) (rhs FLANG)]
  [Mul  (lhs FLANG) (rhs FLANG)]
  [Div  (lhs FLANG) (rhs FLANG)]
  [Id   (name Symbol)]
  [With (name Symbol) (named FLANG) (body FLANG)]
  [Fun  (name Symbol) (body FLANG)]
  [Call (fun-expr FLANG) (arg-expr FLANG)])

(: parse-sexpr : (Sexpr -> FLANG))
;; to convert s-expressions into FLANGs
(define (parse-sexpr sexpr)
  (match sexpr
    [(number: n)    (Num n)]
    [(symbol: name) (Id name)]
    [(cons 'with more)
     (match sexpr
       [(list 'with (list (symbol: name) named) body)
        (With name (parse-sexpr named) (parse-sexpr body))]
       [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
    [(cons 'fun more)
     (match sexpr
       [(list 'fun (list (symbol: name)) body)
        (Fun name (parse-sexpr body))]
       [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
    [(list op left right)
     (let ([make-node
            (match op
              ['+ Add]
              ['- Sub]
              ['* Mul]
              ['/ Div]
              ['call Call]
              [else (error 'parse-sexpr "don't know about ~s" op)])])
       (make-node (parse-sexpr left) (parse-sexpr right)))]
    [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

(: parse : (String -> FLANG))
;; parses a string containing an FLANG expression to a FLANG AST
(define (parse str)
  (parse-sexpr (string->sexpr str)))

;; a type for substitution caches:
(define-type SubstCache = (Listof (List Symbol FLANG)))

(: empty-subst : SubstCache)
(define
  empty-subst null)

(: extend : (Symbol FLANG SubstCache -> SubstCache))
(define (extend name val sc)
  (cons (list name val) sc))

(: lookup : (Symbol SubstCache -> FLANG))
(define (lookup name sc)
  (let ([cell (assq name sc)])
    (if cell
      (second cell)
      (error 'lookup "no binding for ~s" name))))

(: arith-op : ((Number Number -> Number) FLANG FLANG -> FLANG))
;; gets a Scheme numeric binary operator, and uses it within an FLANG
;; `Num' wrapper
(define (arith-op op expr1 expr2)
  (define: (Num->number [e : FLANG]) : Number
    (cases e
      [(Num n) n]
      [else (error 'arith-op "expects a number, got: ~s" e)]))
  (Num (op (Num->number expr1) (Num->number expr2))))

(: eval : (FLANG SubstCache -> FLANG))
;; evaluates FLANG expressions by reducing them to expressions
(define (eval expr sc)
  (cases expr
    [(Num n) expr]
    [(Add l r) (arith-op + (eval l sc) (eval r sc))]
    [(Sub l r) (arith-op - (eval l sc) (eval r sc))]
    [(Mul l r) (arith-op * (eval l sc) (eval r sc))]
    [(Div l r) (arith-op / (eval l sc) (eval r sc))]
    [(With bound-id named-expr bound-body)
     (eval bound-body
           (extend bound-id (eval named-expr sc) sc))]
    [(Id name) (lookup name sc)]
    [(Fun bound-id bound-body) expr]
    [(Call fun-expr arg-expr)
     (let ([fval (eval fun-expr sc)])
       (cases fval
         [(Fun bound-id bound-body)
          (eval bound-body
                (extend bound-id (eval arg-expr sc) sc))]
         [else (error 'eval "`call' expects a function, got: ~s"
                      fval)]))]))

(: run : (String -> Number))
;; evaluate a FLANG program contained in a string
(define (run str)
  (let ([result (eval (parse str) empty-subst)])
    (cases result
      [(Num n) n]
      [else (error 'run
                   "evaluation returned a non-number: ~s" result)])))

;; tests
(test (run "{call {fun {x} {+ x 1}} 4}")
      => 5)
(test (run "{with {add3 {fun {x} {+ x 3}}}
              {call add3 1}}")
      => 4)
(test (run "{with {add3 {fun {x} {+ x 3}}}
              {with {add1 {fun {x} {+ x 1}}}
                {with {x 3}
                  {call add1 {call add3 x}}}}}")
      => 7)
(test (run "{with {identity {fun {x} x}}
              {with {foo {fun {x} {+ x 1}}}
                {call {call identity foo} 123}}}")
      => 124)
(test (run "{call
              {call {fun {x} {call x 1}}
                    {fun {x} {fun {y} {+ x y}}}}
              123}")
      => 124)
(test (run "{with {x 3}
              {with {f {fun {y} {+ x y}}}
                {with {x 5}
                  {call f 4}}}}")
      => ???)
----------------------------------------------------------------------

========================================================================
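
To make the behavior of the cache in the last test more concrete, the
following is a minimal sketch in plain Racket of what the cache holds
when the body of `f' is evaluated.  It is only an illustration under
some assumptions: values are stored as plain numbers and quoted
s-expressions rather than FLANG values, and `sc-at-call', `x-seen' and
`y-seen' are hypothetical names that do not appear in the code above.

```racket
#lang racket
;; untyped copies of the cache functions from the text
(define empty-subst null)
(define (extend name val sc)
  (cons (list name val) sc))
(define (lookup name sc)
  (let ([cell (assq name sc)])
    (if cell
      (second cell)
      (error 'lookup "no binding for ~s" name))))

;; The cache built by following the `eval' rules on the last test:
;; each `with' extends the cache, and so does the call to `f'.
;; (Assumption: raw numbers and a quoted expression stand in for
;; the FLANG values that the real evaluator would store.)
(define sc-at-call
  (extend 'y 4                          ; from {call f 4}
    (extend 'x 5                        ; from the inner {with {x 5} ...}
      (extend 'f '(fun (y) (+ x y))     ; from {with {f ...} ...}
        (extend 'x 3 empty-subst)))))   ; from the outer {with {x 3} ...}

;; `assq' returns the first matching cell, so the most recent
;; binding of `x' is the one that the body of `f' sees
(define x-seen (lookup 'x sc-at-call))
(define y-seen (lookup 'y sc-at-call))
```

In this sketch `x-seen' is 5 rather than 3: the cache that is in effect
at the call is the one that gets extended, so the later binding of `x'
is the one that is found while evaluating the body of `f'.
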