2008-10-24  Recursion and the Y Combinator

========================================================================
>>> The Nature of Recursion

There is one major feature that is still missing from our language: we
have no way to perform recursion (and therefore no kind of loops).  So
far, we could only use recursion when we had *names*.  In FLANG, the
only way we can have names is through `with', which is not good enough
for recursion.

To discuss the issue of recursion, we switch to a "broken" version of
(untyped) Scheme -- one where `define' has different scoping rules: the
scope of the defined name does *not* cover the defined expression.
Specifically, in this version, this doesn't work:

  (define (fact n)
    (if (zero? n) 1 (* n (fact (- n 1)))))
  (fact 120)

The `define' form is more similar to a mathematical definition now.  For
example, when we write:

  (define (F x) x)
  (define (G y) (F y))
  (G F)

which is actually shorthand for

  (define F (lambda (x) x))
  (define G (lambda (y) (F y)))
  (G F)

we really mean that this is again shorthand notation for the real thing
we want to write, which is:

  ((lambda (y) ((lambda (x) x) y)) (lambda (x) x))

This means that the above `fact' definition is similar to writing:

  fact := (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))
  (fact 120)

which is not a well-formed definition -- it is *meaningless* (this is a
formal use of the word "meaningless").  What we'd really want is to take
the *equation* (using `=' instead of `:=')

  fact = (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))

and find a solution which will be a value for `fact' that makes this
true.

If you look at the Scheme evaluation rules handout on the web page, you
will see that this problem is related to the way that we introduced the
Scheme `define': there is a hand-wavy explanation that talks about
*knowing* things.

The big question is: can we define recursive functions without Scheme's
magical `define' form?
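The substitution view of non-recursive definitions can be tried out in a
standard Scheme.  The sketch below is only an illustration: it assumes
that `let' is a fair stand-in for the "broken" `define' (a `let'-bound
name is also not in scope in its own defining expression), and the names
`via-let' and `via-lambda' are made up for this example.

```scheme
;; `let' plays the role of the "broken" `define' here (an assumption
;; of this sketch): F is not visible inside its own definition.
(define via-let
  (let ((F (lambda (x) x)))
    (let ((G (lambda (y) (F y))))
      (G F))))

;; The same program with every name replaced by its definition.
(define via-lambda
  ((lambda (y) ((lambda (x) x) y)) (lambda (x) x)))
```

Both values are the identity function.  Trying the same substitution on
`fact' never finishes, because every copy of the body that gets
substituted in mentions `fact' again.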
========================================================================
>>> Recursion: Implementing It

Let's start now with an example: a simple recursive function
definition.  We continue using the broken+untyped language, since the
types are not relevant for what we will be doing.

  (define (fact n)
    (if (zero? n) 1 (* n (fact (- n 1)))))

or

  (define fact
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))

When we look at the *value* of `fact', we see that by itself, it doesn't
make any sense because `fact' is a free variable if you look at the body
in isolation:

  (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))

We will now make our way to a recursive function, beginning with the
broken definition:

  (define fact
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))

and try to fix it in steps.

First of all, the above does not have a recursive call -- we don't have
a value for `fact', so we can just as well write anything we want
instead of `fact' to make it a valid definition:

  (define fact
    (lambda (n) (if (zero? n) 1 (* n (666 (- n 1))))))  ;***

This function will not work in the general case, but there is one case
where it *will* work: when the input value is 0 (since then we do not
reach that bogus application).  Note this by naming this function
`fact0':

  (define fact0                                         ;***
    (lambda (n) (if (zero? n) 1 (* n (666 (- n 1))))))

Now that we have that, we can use it to write `fact1' which is the same
as the real factorial function for arguments of 0 or 1:

  (define fact0
    (lambda (n) (if (zero? n) 1 (* n (666 (- n 1))))))
  (define fact1
    (lambda (n) (if (zero? n) 1 (* n (fact0 (- n 1))))))

But remember that this is actually just shorthand for:

  (define fact1
    (lambda (n)
      (if (zero? n)
        1
        (* n ((lambda (n)
                (if (zero? n)
                  1
                  (* n (666 (- n 1)))))
              (- n 1))))))

We can continue in this way and write `fact2' that will work for n<=2:

  (define fact2
    (lambda (n) (if (zero? n) 1 (* n (fact1 (- n 1))))))

or, in full form:

  (define fact2
    (lambda (n)
      (if (zero? n)
        1
        (* n ((lambda (n)
                (if (zero? n)
                  1
                  (* n ((lambda (n)
                          (if (zero? n)
                            1
                            (* n (666 (- n 1)))))
                        (- n 1)))))
              (- n 1))))))

If we continue this way, we *will* get the true factorial function, but
the problem is that to handle *any* possible integer argument, it will
have to be an infinite definition!  Here is what it is supposed to look
like:

  (define fact0 (lambda (n) (if (zero? n) 1 (* n (666 (- n 1))))))
  (define fact1 (lambda (n) (if (zero? n) 1 (* n (fact0 (- n 1))))))
  (define fact2 (lambda (n) (if (zero? n) 1 (* n (fact1 (- n 1))))))
  (define fact3 (lambda (n) (if (zero? n) 1 (* n (fact2 (- n 1))))))
  ...

And our `fact' is actually `fact-infinity', with an infinite size.  So,
we're back at the original problem...

There is hope though -- the bigger and bigger definitions all use
instances of the same original `fact' code, so we can try to abstract it
away -- pull the value that is being used as the internal call as an
argument to a function.

  Rule#1: (... y ...) <==> (let ([x y]) (... x ...))
                      <==> ((lambda (x) (... x ...)) y)

Using this, `fact1' becomes:

  (define fact1
    (let ([fact fact0])
      (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))

which is

  (define fact1
    ((lambda (fact)
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
     fact0))

which, in turn, is actually:

  (define fact1
    ((lambda (fact)
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
     ((lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
      666)))

This way we get something that looks better, but we still repeat
ourselves.  To solve this problem, we'll use a function that will take
the `(lambda (n) ...)' expression and will apply that on `666', `fact0',
or whatever.

  Rule#2: (f x) <==> ((lambda (g) (g x)) f)

(This is actually an instance of Rule#1.)  Use this to create a function
that gets `make-fact':

  (define fact0
    ((lambda (make-fact) (make-fact 666))               ;***
     (lambda (fact)
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))

Now `fact1' can be written easily:

  (define fact1
    ((lambda (make-fact) (make-fact (make-fact 666)))
     (lambda (fact)
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))

And the principle is clear.  We can now continue by working on the real
factorial, which is still infinite at this stage:

  (define fact
    ((lambda (make-fact)
       (make-fact (make-fact (... (make-fact 666) ...))))
     (lambda (fact)
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))

But the infiniteness problem is now localized: we look for a *finite*
expression that will evaluate to

  (f (f (f (... (f x) ...))))

Now, examine the `(lambda (make-fact) ...)' -- all it does is get
something and apply it to the result of applying it to the result of
applying it ... to 666.  So, if we succeed, `make-fact' does what we
want:

  (make-fact f) --> (f (f (f (... (f x) ...))))

We can now use this idea: pass `make-fact' itself to `make-fact', and
have it do the extra calls itself when needed.  Now the second closure
gets `make-fact' itself, so the internal `fact' variable is renamed to
make this clearer, and we initially apply it on 666:

  (define fact
    ((lambda (make-fact) (make-fact make-fact))         ;***
     (lambda (make-fact)                                ;***
       (lambda (n)
         (if (zero? n)
           1
           (* n ((make-fact 666)                        ;***
                 (- n 1))))))))

That will make this function do the same as `fact1' -- if we try to
continue, we'll bump into 666; a reduction will demonstrate this (try
it).  Instead -- use `make-fact' instead of 666 and we'll be able to do
as many calls as needed:

  (define fact
    ((lambda (make-fact) (make-fact make-fact))
     (lambda (make-fact)
       (lambda (n)
         (if (zero? n)
           1
           (* n ((make-fact make-fact)                  ;***
                 (- n 1))))))))

Now, we have something which *is* the true factorial function (convince
yourself by fixing the reductions made with 666).  It is therefore
possible to write recursive functions using finite expressions! -- But
we still have some problems to overcome...
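The reduction suggested above can be written out for a small input.  In
this sketch the second closure is given the name `M' (a hypothetical
name, not used elsewhere in these notes), so that `fact' is simply
`(M M)'.  It runs as written in an eager Scheme because the
`(make-fact make-fact)' application sits inside the `(lambda (n) ...)'
body, and is therefore evaluated only when the function is called.

```scheme
;; M is the second closure of the definition above (hypothetical name).
(define M
  (lambda (make-fact)
    (lambda (n)
      (if (zero? n)
          1
          (* n ((make-fact make-fact) (- n 1)))))))

;; `fact' is (M M).  A hand reduction of (fact 2):
;;   ((M M) 2)
;;   (* 2 ((M M) 1))
;;   (* 2 (* 1 ((M M) 0)))
;;   (* 2 (* 1 1))
;;   2
(define fact (M M))
```

Every step rebuilds `(M M)' on demand, which is why no bogus 666 value
is ever reached.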
First, there is still the problem of having a solution which is quite
different from the original factorial function.  To make things more
clear, we use more abstractions.  First, abstract the second lambda
expression, putting the `(make-fact make-fact)' call outside the
expression (Rule#1) so we get the `fact's original body in one piece:

  (define fact
    ((lambda (make-fact) (make-fact make-fact))
     (lambda (make-fact)
       ((lambda (fact)                                  ;***
          (lambda (n)
            (if (zero? n) 1 (* n (fact (- n 1))))))     ;***
        (make-fact make-fact)))))                       ;***

This was easy to solve, but there is a more important problem -- it will
get stuck in an infinite loop...

========================================================================
>>> Recursion: Infinity (and beyond?)

This is because we use an eager language -- you evaluate the function
and its arguments first, and then do the actual application step.  But
when you try to evaluate the argument, you end up evaluating an infinite
loop.  What we just did is write an expression that looks like:

  ((lambda (x) (x x)) (lambda (x) (f (x x))))

which is a variation on the well-known and much loved expression:

  ((lambda (x) (x x)) (lambda (x) (x x)))

...which evaluates to itself ... forever...  This expression is the key
for creating a loop -- we use it to create the recursion.  The original
expression evaluates as follows:

  ((lambda (x) (x x)) (lambda (x) (f (x x))))
  ((lambda (x) (f (x x))) (lambda (x) (f (x x))))
  (f ((lambda (x) (f (x x))) (lambda (x) (f (x x)))))
  (f (f ((lambda (x) (f (x x))) (lambda (x) (f (x x))))))
  (f (f (f ((lambda (x) (f (x x))) (lambda (x) (f (x x)))))))
  ...

So the problem is that we have an infinite sequence of `f' applications:

  (f (f (f ...forever...)))

but this doesn't work, since evaluating the argument to the first `f' is
stuck in an infinite loop.  What we need is some way to hand `f' a value
that behaves like the real argument, except that it doesn't get stuck in
an infinite loop.

  (f PROTECT(f (f ...forever...)))

In this case, we're lucky to have a function value, which means that we
can use a wrapper function that will actually evaluate the problematic
expression only when the function is actually used.  In other words, we
make `PROTECT' be the following rewrite:

  PROTECT(E)  ==>  (lambda (z) (E z))

and if E is an expression that evaluates to a one-argument function,
this will be an equivalent function:

  Rule#3: f <==> (lambda (z) (f z))
          -- as long as f is a one-argument function.

Applying this rule on the problematic expression gives us:

  (f (lambda (z) ((f (f ...forever...)) z)))

The problem now is that we can indeed apply the first `f', but as soon
as it tries to use the protected argument, it will get stuck in an
infinite loop again.  The solution is simple -- protect *all* `f's, so
that we never get into an infinite loop (unless, of course, `f' will
call itself infinitely -- there's nothing we can do to avoid that):

  (f PROTECT(f PROTECT(f PROTECT(...forever...))))

And this is easy to achieve in our case, since we have a function that
generates the infinite sequence of `f's:

  ((lambda (x) (x x)) (lambda (x) (f (x x))))

we can use the protection right in there:

  ((lambda (x) (x x)) (lambda (x) (f PROTECT(x x))))

which means that we use the above wrapper on the `(x x)' (which does
evaluate to a one-argument function, since it's the argument to `f'):

  ((lambda (x) (x x)) (lambda (x) (f (lambda (z) ((x x) z)))))
  ...

Now the loop is deferred inside the lambda expression -- and only when
we need to use it, we apply it and evaluate its body.  The overall
result will still be the same.  Note that this wouldn't be a problem if
we used lazy evaluation (which is the common case when dealing with the
Lambda Calculus).
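
The effect of Rule#3 can be seen in isolation with a deliberately
diverging expression.  This is a minimal sketch in standard (eager)
Scheme; `loop', `protected' and `add1-protected' are hypothetical names
introduced only for the demonstration.

```scheme
;; `loop' never returns (hypothetical helper for this sketch).
(define (loop) (loop))

;; Unprotected, this definition would hang in an eager language:
;;   (define stuck (loop))

;; Protected with Rule#3: (loop) is evaluated only if `protected' is
;; ever applied, so the definition itself returns immediately.
(define protected (lambda (z) ((loop) z)))

;; The same wrapper around a well-behaved one-argument function
;; gives an equivalent function.
(define add1-protected (lambda (z) ((lambda (n) (+ n 1)) z)))
```

Defining `protected' is harmless; only applying it would start the loop.
This is the same deferral that the `(lambda (z) ((x x) z))' wrapper
provides for `(x x)'.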

We can also note that nothing changes if we modify the first part of:

  ((lambda (x) (x x)) (lambda (x) (f (lambda (z) ((x x) z)))))

to be identical with the second part for the sake of symmetry -- the
evaluation still looks the same except for skipping the first step now:

  ((lambda (x) (f (lambda (z) ((x x) z))))
   (lambda (x) (f (lambda (z) ((x x) z)))))

========================================================================
>>> Recursion: Solution

Back to the factorial problem: applying the delay fix to the above
factorial definition gives:

  (define fact
    ((lambda (make-fact) (make-fact make-fact))
     (lambda (make-fact)
       ((lambda (fact)
          (lambda (n)
            (if (zero? n) 1 (* n (fact (- n 1))))))
        (lambda (z)                                     ;***
          ((make-fact make-fact) z))))))                ;***

And the symmetric version is:

  (define fact
    ((lambda (make-fact)
       ((lambda (fact)                                  ;***
          (lambda (n)                                   ;***
            (if (zero? n)                               ;***
              1                                         ;***
              (* n (fact (- n 1))))))                   ;***
        (lambda (z) ((make-fact make-fact) z))))        ;***
     (lambda (make-fact)
       ((lambda (fact)
          (lambda (n)
            (if (zero? n) 1 (* n (fact (- n 1))))))
        (lambda (z) ((make-fact make-fact) z))))))

This does look more complex -- but it is actually simpler since the two
parts are identical.  Now take the `(lambda (fact) ...)' thing, and call
it `fact-maker' -- we get:

  (define fact-maker
    (lambda (fact)
      (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
  (define fact
    ((lambda (make-fact)
       (fact-maker                                      ;***
        (lambda (z) ((make-fact make-fact) z))))
     (lambda (make-fact)
       (fact-maker                                      ;***
        (lambda (z) ((make-fact make-fact) z))))))

Using Rule#1, make `fact-maker' an argument for a `make-real-fact'
function:

  (define fact-maker
    (lambda (fact)
      (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
  (define (make-real-fact maker)                        ;***
    ((lambda (make-fact)                                ;***
       (maker (lambda (z) ((make-fact make-fact) z))))
     (lambda (make-fact)
       (maker                                           ;***
        (lambda (z) ((make-fact make-fact) z))))))

And now the *real* `fact' can be defined with:

  (define fact (make-real-fact fact-maker))

This works -- and the whole thing is not defined recursively!  (It
generates a recursive computation, but the definition uses a well-formed
finite expression.)

The last thing to note is that `make-real-fact' is totally independent
of `fact', so we can redefine it as a general function:

  (define (make-recursive f)
    ((lambda (x) (f (lambda (z) ((x x) z))))
     (lambda (x) (f (lambda (z) ((x x) z))))))
  (define fact (make-recursive fact-maker))

And we now have general recursion! -- This also works for other
recursive functions:

  (define make-fib
    (lambda (fib)
      (lambda (n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2)))))))
  (define fib (make-recursive make-fib))

And another example, without an explicit `make-length' definition:

  (define length
    (make-recursive
      (lambda (length)
        (lambda (l) (if (null? l) 0 (+ (length (cdr l)) 1))))))

If we have a way to add rewrite rules to the language, we could even
specify a rewrite rule that will create recursive definitions for us:

  (rewrite (define-rec f E)
           => (define f (make-recursive (lambda (f) E))))
  (define-rec fact
    (lambda (n) (if (<= n 1) 1 (* n (fact (- n 1))))))

Finally, note that `make-recursive' is limited to 1-argument functions
only because of the protection from eager evaluation.  In any case, it
can be used in any way you want, for example,

  (make-recursive (lambda (f) (lambda (x) f)))

is a function that *returns* itself rather than calling itself.  Using
the rewrite rule,

  (define-rec f (lambda (x) f))

is the same as:

  (define (f x) f)

in plain Scheme.
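
One possible way around the one-argument limitation is to make the
protecting wrapper variadic, using `(lambda args (apply (x x) args))'
in place of `(lambda (z) ((x x) z))'.  This is a sketch under that
assumption and is not part of the derivation above; the names
`make-recursive*' and `my-gcd' are hypothetical.

```scheme
;; Variant of `make-recursive' whose protection accepts any number
;; of arguments (hypothetical name; the rest is as in the notes).
(define (make-recursive* f)
  ((lambda (x) (f (lambda args (apply (x x) args))))
   (lambda (x) (f (lambda args (apply (x x) args))))))

;; A two-argument recursive function defined without a recursive
;; `define': Euclid's algorithm.
(define my-gcd
  (make-recursive*
    (lambda (my-gcd)
      (lambda (a b)
        (if (zero? b)
            a
            (my-gcd b (remainder a b)))))))
```

The wrapper still delays `(x x)' until the function is applied, so the
eager-evaluation fix is unchanged; only the argument passing differs.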

========================================================================
>>> Recursion: the Y Combinator

Our `make-recursive' function is usually called the "fixpoint operator"
or the "Y combinator".

It looks really simple when using the lazy version (remember: our
version is the eager one):

  (define Y
    (lambda (f)
      ((lambda (x) (f (x x)))
       (lambda (x) (f (x x))))))

And this all comes from the loop generated by:

  ((lambda (x) (x x)) (lambda (x) (x x)))

This expression, which is also called `omega', is also the idea behind
many deep mathematical facts.  As an example for what it does, follow
the next rule:

  I will say the next sentence twice:
    "I will say the next sentence twice".

(Note the usage of colon for the first and quotes for the second -- what
is the equivalent of that in the lambda expression?)

========================================================================
>>> Recursion: Alternative Explanation

If we go back to where we started,

  (define (fact n)
    (if (zero? n) 1 (* n (fact (- n 1)))))
  (fact 120)

We can note that by the time we get to the body of the function, we *do*
have some binding for `fact' -- the one that we have called.  So we can
send that value back to `fact' to make it able to call itself:

  (define (fact self n)                                 ;***
    (if (zero? n) 1 (* n (self (- n 1)))))
  (fact fact 120)                                       ;***

except that now the recursive call should still send itself along:

  (define (fact self n)
    (if (zero? n) 1 (* n (self self (- n 1)))))         ;***
  (fact fact 120)

The problem is that this required rewriting `fact', something we wish to
avoid when we write programs.  We'll try to make it better again -- we
need something with just a `(lambda (n) ...)' for the actual function.
Begin with currying:

  (define (fact self)                                   ;***
    (lambda (n)                                         ;***
      (if (zero? n) 1 (* n ((self self) (- n 1))))))    ;***
  ((fact fact) 120)                                     ;***

Now we want to use a real recursive call instead of that "(self self)"
thing:

  (define (fact self)
    (let ([fact (self self)])                           ;***
      (lambda (n)
        (if (zero? n) 1 (* n (fact (- n 1)))))))        ;***
  ((fact fact) 120)

But the problem is that we get into an infinite loop because we're
trying to evaluate "(self self)" too early -- ignoring the body of the
`let' and other details, we basically do this:

  (define (fact self) (self self))
  (fact fact)
  --replace-let-with-lambda-->
  (define fact (lambda (self) (self self)))
  (fact fact)
  --replace-definition-->
  ((lambda (self) (self self)) (lambda (self) (self self)))
  --rename-->
  ((lambda (x) (x x)) (lambda (x) (x x)))

Going back to the same problem with infinity -- what's the solution?  We
know that "(self self)" is ultimately used as a one-argument function,
so we can protect its evaluation until needed, using `lambda' and the
fact that if `f' is a one-argument function, then (lambda (x) (f x)) is
the same function:

  (define (fact self)
    (let ([fact (lambda (x) ((self self) x))])          ;***
      (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
  ((fact fact) 120)

Now, since `let' is just syntactic sugar for a lambda application, turn
that inner `let' to a `lambda':

  (define (fact self)
    ((lambda (fact)                                     ;***
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
     (lambda (x) ((self self) x))))                     ;***
  ((fact fact) 120)

The resulting `(lambda (fact) ...)' has a binding for the recursive call
(the `fact' argument), and it has the `(lambda (n) ...)' function that
is the body of the factorial function.  So it's actually a kind of a
`fact-maker'.  Call it that:

  (define (fact-maker fact)                             ;***
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
  (define (fact self)
    (fact-maker (lambda (x) ((self self) x))))          ;***
  ((fact fact) 120)

Now, the outer-most `fact' is not really the factorial function, it's
something that *will* be the factorial function when applied to itself,
so why not just apply it to itself, and bind the result of that to
`fact'?

  (define (fact-maker fact)
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
  (define fact
    (let ([x (lambda (self)                             ;***
               (fact-maker (lambda (x) ((self self) x))))])
      (x x)))                                           ;***
  (fact 120)

Again, convert that `let' to a `lambda':

  (define (fact-maker fact)
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
  (define fact
    ((lambda (x) (x x))                                 ;***
     (lambda (self)
       (fact-maker (lambda (x) ((self self) x))))))
  (fact 120)

Rename `self' -> `x':

  (define (fact-maker fact)
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
  (define fact
    ((lambda (x) (x x))
     (lambda (x)                                        ;***
       (fact-maker (lambda (z) ((x x) z))))))           ;***
  (fact 120)

and now abstract the recursion-making functionality into a separate
function:

  (define (fact-maker fact)
    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
  (define (make-recursive maker)                        ;***
    ((lambda (x) (x x))
     (lambda (x) (maker (lambda (z) ((x x) z))))))
  (define fact (make-recursive fact-maker))             ;***
  (fact 120)

And now the resulting `make-recursive' is the same as the version that
was derived above.

========================================================================
>>> Recursion: The Main Property of Y

`fact-maker' is a function that, given any limited factorial, will
generate a factorial that is good for one more integer input.  Begin
with `666', which is a factorial that is good for nothing (because it's
not a function), and you can get `fact0' as

  fact0 == (fact-maker 666)

and that's a good factorial function only for an input of 0.  Use that
with `fact-maker' again, and you get

  fact1 == (fact-maker fact0) == (fact-maker (fact-maker 666))

which is the factorial function when you only look at input values of 0
or 1.  In a similar way

  fact2 == (fact-maker fact1)

is good for 0..2 -- and we can continue as much as we want, except that
we need to have an infinite number of applications -- in the general
case, we have:

  fact-n == (fact-maker (fact-maker (fact-maker ... 666)))

which is good for 0..n.
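
These finite approximations can be built directly.  The following is a
small sketch in standard Scheme that reuses `fact-maker' as defined
above; the sample calls in the comments are only illustrations.

```scheme
(define (fact-maker fact)
  (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))

;; 666 is not a function, so each approximation fails as soon as it
;; runs out of `fact-maker' layers.
(define fact0 (fact-maker 666))
(define fact1 (fact-maker fact0))
(define fact2 (fact-maker fact1))

;; (fact0 0) and (fact2 2) are fine: the bogus 666 is never applied.
;; (fact2 3) would end up evaluating (666 0), which is an error.
```

Each extra `fact-maker' layer extends the range of good inputs by
exactly one.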

The *real* factorial would be the result of running `fact-maker' on
itself infinitely; it *is* `fact-infinity'.  In other words (here `fact'
is the *real* factorial):

  fact = fact-infinity == (fact-maker (fact-maker ...infinitely...))

but note that since this is really infinity, then

  fact = (fact-maker (fact-maker ...infinitely...))
       = (fact-maker fact)

so we get an equation:

  fact = (fact-maker fact)

and a solution for this is going to be the real factorial.  The solution
is the fixed-point of the `fact-maker' function, in the same sense that
0 is the fixed point of the `sin' function because

  0 = (sin 0)

And the Y combinator does just that -- it has this property:

  (make-recursive f) = (f (make-recursive f))

or, using the more common name:

  (Y f) = (f (Y f))

This property encapsulates the real magical power of Y.  You can see how
it works -- because:

  (Y f) = (f (Y f))

you can also say that:

  (f (Y f)) = (f (f (Y f)))

so we get:

  (Y f) = (f (Y f)) = (f (f (Y f))) = (f (f (f (Y f)))) = ...
        = (f (f (f ...)))

and we can conclude that

  (Y fact-maker) = (fact-maker (fact-maker ...infinitely...)) = fact

========================================================================
>>> Typing the Y Combinator

Typing the Y combinator is always a tricky issue.  For example, in
Standard ML you must write a new type definition to do this:

  datatype 'a t = T of 'a t -> 'a
  val y = fn f => (fn (T x) => (f (fn a => x (T x) a)))
                    (T (fn (T x) => (f (fn a => x (T x) a))))

In OCaml, you can turn on recursive types, and it will infer the correct
type:

  # let y f = (fun x -> x x) (fun x -> fun z -> f (x x) z);;
  val y : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>
  # let fact = y (fun fact n -> if n<1 then 1 else n* fact(n-1));;
  val fact : int -> int = <fun>
  # fact 5;;
  - : int = 120

It is also possible to write this expression in typed-scheme, but we
will need to write a type definition as well.

First of all, the type of Y should be straightforward: it is a fixpoint
operation, so it takes a `T -> T' function and produces its fixpoint.
The fixpoint itself is some `T' (such that applying the function on it
results in itself).  So this gives us:

  (: make-recursive : ((T -> T) -> T))

However, in our case `make-recursive' computes a *functional* fixpoint,
for unary `S -> T' functions, so we should narrow down the type

  (: make-recursive : (((S -> T) -> (S -> T)) -> (S -> T)))

Now, in the body of `make-recursive' we need to add a type for the `x'
arguments, which behave in a weird way: each one is both a function and
its own argument.  (Remember -- I will say the next sentence twice: "I
will say the next sentence twice".)  We need a recursive type definition
for that:

  (define-type (Tau S T) = (Rec this (this -> (S -> T))))

This type is tailored for our use of `x': given types `S' and `T', `x'
is a function that will consume *itself* (hence the `Rec') and spit out
the value that the `f' argument consumes -- an `S -> T' function.

The resulting full version of the code:

  (: make-recursive : (All (S T) (((S -> T) -> (S -> T)) -> (S -> T))))
  (define (make-recursive f)
    (define-type (Tau S T) = (Rec this (this -> (S -> T))))
    ((lambda: ([x : (Tau S T)]) (f (lambda: ([z : S]) ((x x) z))))
     (lambda: ([x : (Tau S T)]) (f (lambda: ([z : S]) ((x x) z))))))

  (: fact : (Number -> Number))
  (define fact
    (make-recursive
      (lambda: ([fact : (Number -> Number)])
        (lambda: ([n : Number])
          (if (zero? n) 1 (* n (fact (- n 1))))))))
  (fact 120)

========================================================================