Expressions

The first programming languages we will study are expression languages. We will use SLLgen grammars to specify the syntax of these languages and the representations of their abstract syntax trees. We will then specify the semantics of these languages by writing interpreters for the abstract syntax trees. Each interpreter takes an environment as its second argument; the environment records the value of every variable that may occur free in the expression.

Specification and Implementation Strategy

(value-of exp ρ) = val

means the value of expression exp in environment ρ should be val.

The source language is the language we are defining, specifying, or implementing. The implementation language (usually Scheme with EoPL extensions) is the language in which we write our interpreters.

The front end of an interpreter or compiler translates the source language into abstract syntax trees. A compiler translates abstract syntax trees into some target language, such as Intel x86-32 machine code or JVM byte code. The abstract syntax trees, or the program in the target language, can then be executed by some interpreter. For example, an Intel Core 2 Duo contains an extremely efficient interpreter for Intel x86-32 machine code:

> (define add1
    (lambda (n) (+ n 1)))

> add1
#<PROCEDURE add1>

> (nasm-disassemble add1)
00000000  83FB04            cmp ebx,byte +0x4
00000003  7411              jz 0x16
00000005  C7452C04000000    mov dword [ebp+0x2c],0x4
0000000C  FF9500020000      call near [ebp+0x200]
00000012  90                nop
00000013  90                nop
00000014  EBEA              jmp short 0x0
00000016  F6C103            test cl,0x3
00000019  750A              jnz 0x25
0000001B  89CB              mov ebx,ecx
0000001D  83C304            add ebx,byte +0x4
00000020  710E              jno 0x30
00000022  83EB04            sub ebx,byte +0x4
00000025  89CB              mov ebx,ecx
00000027  B804000000        mov eax,0x4
0000002C  FF551C            call near [ebp+0x1c]
0000002F  90                nop
00000030  C3                ret
0

> (add1 (expt 10 70))
10000000000000000000000000000000000000000000000000000000000000000000001

Our interpreters will not be as efficient as the Intel Core 2 Duo, but they will be much simpler, much easier to build, and much easier to understand.

Scanning divides the plain text of a source program into meaningful substrings called tokens. The tokens are described by a lexical specification.
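
As a rough illustration, here is a minimal sketch of a hand-written scanner for a small expression language such as LET (defined below). It is not the scanner from mp3-data-structures.scm, nor one produced by SLLgen. The names scan, split-while, and identifier-char? are hypothetical, and the lexical rules are assumptions made for the example: a number is a string of digits, an identifier starts with a letter and may contain letters, digits, and ?, and every other non-blank character is a one-character token.

```scheme
;; Split a list of characters into the longest prefix whose characters
;; all satisfy pred? and the remaining characters.
;; Returns a pair: (prefix-as-string . remaining-characters).
(define (split-while pred? chars)
  (let loop ((chars chars) (acc '()))
    (if (and (pair? chars) (pred? (car chars)))
        (loop (cdr chars) (cons (car chars) acc))
        (cons (list->string (reverse acc)) chars))))

;; Assumed rule: identifiers may contain letters, digits, and ?.
(define (identifier-char? c)
  (or (char-alphabetic? c) (char-numeric? c) (char=? c #\?)))

;; scan : String -> List of tokens (each token is a string)
(define (scan str)
  (let loop ((chars (string->list str)) (tokens '()))
    (cond
      ((null? chars) (reverse tokens))
      ;; Whitespace separates tokens but is not itself a token.
      ((char-whitespace? (car chars)) (loop (cdr chars) tokens))
      ;; A number is the longest run of digits.
      ((char-numeric? (car chars))
       (let ((split (split-while char-numeric? chars)))
         (loop (cdr split) (cons (car split) tokens))))
      ;; An identifier or keyword starts with a letter.
      ((char-alphabetic? (car chars))
       (let ((split (split-while identifier-char? chars)))
         (loop (cdr split) (cons (car split) tokens))))
      ;; Anything else, such as - ( ) , or =, is a one-character token.
      (else
       (loop (cdr chars) (cons (string (car chars)) tokens))))))

;; (scan "-(x,1)")  =>  ("-" "(" "x" "," "1" ")")
```

This sketch does not distinguish keywords from identifiers; under these assumptions that decision is left to the parser.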

Parsing translates the sequence of tokens into an abstract syntax tree. The syntactically legal sequences of tokens are described by the source language's grammar.

A parser generator is a program whose inputs include a lexical specification, a grammar, and a description of the abstract syntax trees to be constructed for each production of the grammar. The main outputs of the parser generator are a scanner and parser.

We will use the SLLgen parser generator for most of this course. For MP3, however, the file mp3-data-structures.scm will contain a hand-written scanner and a complete parser that was generated by a different parser generator. This is just to show you what a scanner and parser look like. In future assignments, where the scanners and parsers will be more complicated, you will see the lexical specifications and the grammars but will not see the scanners and parsers built from them. The main thing to remember is that scan&parse takes a string containing the plain text of a program and returns the abstract syntax tree for that program.

LET: A Simple Language

Specifying the Syntax

Syntax for the LET language

`Program`	`::=`	`Expression`	`a-program (exp1)`
`Expression`	`::=`	`Number`	`const-exp (num)`
	`::=`	`-(Expression,Expression)`	`diff-exp (exp1 exp2)`
	`::=`	`zero? (Expression)`	`zero?-exp (exp1)`
	`::=`	`if Expression then Expression else Expression`	`if-exp (exp1 exp2 exp3)`
	`::=`	`Identifier`	`var-exp (var)`
	`::=`	`let Identifier = Expression in Expression`	`let-exp (var exp1 body)`

For example,

(scan&parse "let x = 4 in -(x,-(1,x))")

evaluates to the abstract syntax tree that is the result of

(a-program
  (let-exp 'x
           (const-exp 4)
           (diff-exp (var-exp 'x)
                     (diff-exp (const-exp 1)
                               (var-exp 'x)))))
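
To make the connection between the grammar and the tree concrete, here is a minimal sketch of a recursive-descent parser for LET, written by hand rather than generated by SLLgen. It rests on several assumptions that are not part of these notes: tokens arrive as a list of strings, abstract syntax trees are represented as tagged lists such as (const-exp 4) instead of define-datatype records, and the names parse-program, parse-exp, and expect are hypothetical.

```scheme
;; Check that the next token is tok and return the tokens after it.
(define (expect tok tokens)
  (if (and (pair? tokens) (string=? (car tokens) tok))
      (cdr tokens)
      (error "parse error: expected" tok)))

;; parse-exp : List of tokens -> (ast . remaining-tokens)
;; There is one cond clause per production of Expression.
(define (parse-exp tokens)
  (if (null? tokens)
      (error "parse error: unexpected end of input")
      (let ((tok (car tokens)) (rest (cdr tokens)))
        (cond
          ;; Expression ::= Number
          ((string->number tok)
           (cons (list 'const-exp (string->number tok)) rest))
          ;; Expression ::= -(Expression , Expression)
          ((string=? tok "-")
           (let* ((r1 (parse-exp (expect "(" rest)))
                  (r2 (parse-exp (expect "," (cdr r1)))))
             (cons (list 'diff-exp (car r1) (car r2))
                   (expect ")" (cdr r2)))))
          ;; Expression ::= zero? (Expression)
          ((string=? tok "zero?")
           (let ((r1 (parse-exp (expect "(" rest))))
             (cons (list 'zero?-exp (car r1))
                   (expect ")" (cdr r1)))))
          ;; Expression ::= if Expression then Expression else Expression
          ((string=? tok "if")
           (let* ((r1 (parse-exp rest))
                  (r2 (parse-exp (expect "then" (cdr r1))))
                  (r3 (parse-exp (expect "else" (cdr r2)))))
             (cons (list 'if-exp (car r1) (car r2) (car r3)) (cdr r3))))
          ;; Expression ::= let Identifier = Expression in Expression
          ((string=? tok "let")
           (let* ((var (string->symbol (car rest)))
                  (r1 (parse-exp (expect "=" (cdr rest))))
                  (r2 (parse-exp (expect "in" (cdr r1)))))
             (cons (list 'let-exp var (car r1) (car r2)) (cdr r2))))
          ;; Expression ::= Identifier
          ;; (this sketch does not check that tok is a legal identifier)
          (else
           (cons (list 'var-exp (string->symbol tok)) rest))))))

;; Program ::= Expression, with no tokens left over.
(define (parse-program tokens)
  (let ((r (parse-exp tokens)))
    (if (null? (cdr r))
        (list 'a-program (car r))
        (error "parse error: extra tokens after program"))))
```

Applied to the sixteen tokens of "let x = 4 in -(x,-(1,x))", parse-program builds a tagged list with the same shape as the tree shown above.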

Specification of Values

For any programming language, the expressed values are the possible values of an expression, and the denoted values are the values to which a variable can be bound in some environment.

For LET, the expressed and denoted values happen to be the same:

ExpVal = Int + Bool
DenVal = Int + Bool

The expressed and denoted values will be abstract data types with this algebraic specification:

num-val : Int → ExpVal
bool-val : Bool → ExpVal
expval->num : ExpVal → Int
expval->bool : ExpVal → Bool

(expval->num (num-val n)) = n
(expval->bool (bool-val b)) = b
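
One possible implementation of this specification is sketched below. The tagged-list representation is an assumption made for illustration (an EoPL-style implementation would more likely use define-datatype); any representation that satisfies the two equations above would do.

```scheme
;; Assumed representation: an expressed value is a two-element list
;; whose first element is a tag naming the variant.

;; num-val : Int -> ExpVal
(define (num-val n) (list 'num-val n))

;; bool-val : Bool -> ExpVal
(define (bool-val b) (list 'bool-val b))

;; expval->num : ExpVal -> Int
;; Signals an error when given a value that is not a number.
(define (expval->num val)
  (if (and (pair? val) (eq? (car val) 'num-val))
      (cadr val)
      (error "expval->num: not a number:" val)))

;; expval->bool : ExpVal -> Bool
;; Signals an error when given a value that is not a boolean.
(define (expval->bool val)
  (if (and (pair? val) (eq? (car val) 'bool-val))
      (cadr val)
      (error "expval->bool: not a boolean:" val)))
```

The specification says nothing about (expval->num (bool-val b)) or (expval->bool (num-val n)); signaling an error in those cases is a design choice of this sketch.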

Environments

We use the following abbreviations:

ρ ranges over environments
[] denotes the empty environment
[var = val]ρ denotes (extend-env var val ρ)
[var = val] denotes [var = val][]
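
For readers who want something executable behind these abbreviations, here is a minimal sketch of the environment operations. The association-list representation and the error behavior for unbound variables are assumptions made for this example; the notes only fix the names empty-env, extend-env, and apply-env.

```scheme
;; Assumed representation: an environment is an association list of
;; (variable . value) pairs, with the most recent binding first.

;; [] : the empty environment
(define (empty-env) '())

;; [var = val]rho : rho extended with a binding of var to val
(define (extend-env var val env)
  (cons (cons var val) env))

;; (apply-env rho var) : the value bound to var in rho
(define (apply-env env var)
  (let ((binding (assq var env)))
    (if binding
        (cdr binding)
        (error "apply-env: unbound variable:" var))))

;; [x = 1][y = 2][x = 3][] : the outer binding of x shadows the inner one.
(define example-env
  (extend-env 'x 1 (extend-env 'y 2 (extend-env 'x 3 (empty-env)))))

;; (apply-env example-env 'x)  =>  1
;; (apply-env example-env 'y)  =>  2
```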

Specifying the Behavior of Expressions

Interface for expressions of LET

const-exp : Int → Exp
zero?-exp : Exp → Exp
if-exp : Exp × Exp × Exp → Exp
diff-exp : Exp × Exp → Exp
var-exp : Symbol → Exp
let-exp : Symbol × Exp × Exp → Exp

value-of : Exp × Env → ExpVal

Specification for three kinds of expressions

(value-of (const-exp n) ρ) = (num-val n)

(value-of (var-exp var) ρ) = (apply-env ρ var)

(value-of (diff-exp exp₁ exp₂) ρ)
= (num-val (- (expval->num (value-of exp₁ ρ)) (expval->num (value-of exp₂ ρ))))
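
These three equations translate almost directly into code. The sketch below is one way to do it, assuming tagged-list representations for expressions and expressed values and an association-list representation for environments (none of which are fixed by these notes). The result of the subtraction is wrapped with num-val so that value-of returns an ExpVal, as its signature requires.

```scheme
;; Expressed values: assumed tagged-list representation.
(define (num-val n) (list 'num-val n))
(define (expval->num val)
  (if (and (pair? val) (eq? (car val) 'num-val))
      (cadr val)
      (error "expval->num: not a number:" val)))

;; Environments: assumed association-list representation.
(define (empty-env) '())
(define (extend-env var val env) (cons (cons var val) env))
(define (apply-env env var)
  (let ((binding (assq var env)))
    (if binding
        (cdr binding)
        (error "apply-env: unbound variable:" var))))

;; value-of : Exp x Env -> ExpVal
;; Restricted to the three kinds of expressions specified so far.
(define (value-of exp env)
  (case (car exp)
    ;; (value-of (const-exp n) rho) = (num-val n)
    ((const-exp) (num-val (cadr exp)))
    ;; (value-of (var-exp var) rho) = (apply-env rho var)
    ((var-exp) (apply-env env (cadr exp)))
    ;; Evaluate both operands, unwrap them, subtract, and rewrap.
    ((diff-exp)
     (num-val (- (expval->num (value-of (cadr exp) env))
                 (expval->num (value-of (caddr exp) env)))))
    (else (error "value-of: unsupported expression:" exp))))

;; -(x,3) evaluated in the environment [x = 10]
(define example
  (value-of '(diff-exp (var-exp x) (const-exp 3))
            (extend-env 'x (num-val 10) (empty-env))))
;; example is (num-val 7)
```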

Specifying the Behavior of Programs

For LET, specifying the behavior of programs amounts to specifying the initial environment. For most programming languages, the initial environment consists of a standard set of predefined libraries that every implementation of the language is supposed to provide. For LET, we'll mimic that by providing three predefined identifiers.

(value-of-program exp) = (value-of exp ρ₀)

where

ρ₀ = [i=1,v=5,x=10]

Specifying Conditionals

(value-of exp₁ ρ) = val₁
(expval->num val₁) = 0
----------------------------------------------------
(value-of (zero?-exp exp₁) ρ) = (bool-val #t)

(value-of exp₁ ρ) = val₁
(expval->num val₁) = n
n ≠ 0
----------------------------------------------------
(value-of (zero?-exp exp₁) ρ) = (bool-val #f)

(value-of exp₁ ρ) = val₁
(expval->bool val₁) = #t
----------------------------------------------------
(value-of (if-exp exp₁ exp₂ exp₃) ρ) = (value-of exp₂ ρ)

(value-of exp₁ ρ) = val₁
(expval->bool val₁) = #f
----------------------------------------------------
(value-of (if-exp exp₁ exp₂ exp₃) ρ) = (value-of exp₃ ρ)

Specifying `let`

(value-of exp₁ ρ) = val₁
------------------------------------
(value-of (let-exp var exp₁ body) ρ)
= (value-of body [var=val₁]ρ)

Implementing the Specification of `let`
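
The specifications above can be collected into a single interpreter. What follows is a minimal sketch in plain Scheme, not the course's official implementation: it assumes tagged lists for abstract syntax trees and expressed values and association lists for environments, where an EoPL-style implementation would normally use define-datatype and cases. The name init-env for ρ₀ is also an assumption.

```scheme
;; Expressed values (assumed tagged-list representation).
(define (num-val n) (list 'num-val n))
(define (bool-val b) (list 'bool-val b))
(define (expval->num val)
  (if (eq? (car val) 'num-val) (cadr val) (error "not a number:" val)))
(define (expval->bool val)
  (if (eq? (car val) 'bool-val) (cadr val) (error "not a boolean:" val)))

;; Environments (assumed association-list representation).
(define (empty-env) '())
(define (extend-env var val env) (cons (cons var val) env))
(define (apply-env env var)
  (let ((binding (assq var env)))
    (if binding (cdr binding) (error "unbound variable:" var))))

;; The initial environment [i=1,v=5,x=10].
(define (init-env)
  (extend-env 'i (num-val 1)
    (extend-env 'v (num-val 5)
      (extend-env 'x (num-val 10) (empty-env)))))

;; value-of : Exp x Env -> ExpVal
(define (value-of exp env)
  (case (car exp)
    ((const-exp) (num-val (cadr exp)))
    ((var-exp) (apply-env env (cadr exp)))
    ((diff-exp)
     (num-val (- (expval->num (value-of (cadr exp) env))
                 (expval->num (value-of (caddr exp) env)))))
    ;; zero? yields a boolean expressed value.
    ((zero?-exp)
     (bool-val (zero? (expval->num (value-of (cadr exp) env)))))
    ;; Only the selected branch is evaluated.
    ((if-exp)
     (if (expval->bool (value-of (cadr exp) env))
         (value-of (caddr exp) env)
         (value-of (cadddr exp) env)))
    ;; Evaluate exp1 in the current environment, then evaluate the body
    ;; in the environment extended with the new binding.
    ((let-exp)
     (let ((var (cadr exp))
           (val1 (value-of (caddr exp) env)))
       (value-of (cadddr exp) (extend-env var val1 env))))
    (else (error "value-of: unknown expression:" exp))))

;; value-of-program : Program -> ExpVal
(define (value-of-program pgm)
  (if (eq? (car pgm) 'a-program)
      (value-of (cadr pgm) (init-env))
      (error "value-of-program: not a program:" pgm)))
```

For example, applying value-of-program to the tree for "let x = 4 in -(x,-(1,x))" binds x to 4, computes 1 - 4 = -3 for the inner difference and 4 - (-3) = 7 for the outer one, and so returns (num-val 7).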

PROC: A Language with Procedures

An Example

Representing Procedures

LETREC: A Language with Recursive Procedures

Scope and Binding of Variables

Eliminating Variable Names

Implementing Lexical Addressing

The Translator

The Nameless Interpreter

Last updated 28 January 2008.