Hand out Quiz 3.

Interpreters and Compilers
==========================

-- Abstract syntax tree (or parse tree)

An abstract syntax is a representation that identifies the syntactic
rules used in obtaining the given expression.  The representation is
in the form of a tree, referred to as a parse tree or abstract syntax
tree, and provides easy access to subcomponents of the expression.

How does one derive an abstract syntax tree for a given expression in
a given language?  Through scanning and parsing.  Scanning is the
process of grouping a sequence of characters into larger units,
called tokens.  Typical tokens are variables, keywords, numbers,
punctuation, whitespace, and comments.  The output is a sequence of
tokens, which is then sent to the parser.  The most suitable model for
defining scanners is a finite state automaton.  (Refer to handout from
the book "Essentials of Programming Languages" by Friedman, Wand, and
Haynes.)
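
As a rough illustration of a scanner organized as a finite state
automaton, the Python sketch below keeps an explicit state (start,
number, identifier) and moves between states one character at a time.
The token kinds, the state names, and the choice to discard whitespace
instead of emitting it as a token are assumptions made for this
sketch; they are not taken from the handout.

```python
def scan(text):
    """Split text into (kind, lexeme) tokens using an explicit state machine."""
    tokens = []
    state = "start"                  # states: start, number, identifier
    lexeme = ""
    for ch in text + " ":            # trailing blank flushes the last token
        if state == "number":
            if ch.isdigit():
                lexeme += ch         # stay in the number state
                continue
            tokens.append(("number", lexeme))
            state, lexeme = "start", ""
        elif state == "identifier":
            if ch.isalnum() or ch == "_":
                lexeme += ch         # stay in the identifier state
                continue
            tokens.append(("identifier", lexeme))
            state, lexeme = "start", ""
        # We are in the start state: decide what this character begins.
        if ch.isdigit():
            state, lexeme = "number", ch
        elif ch.isalpha() or ch == "_":
            state, lexeme = "identifier", ch
        elif ch in "+-*/()":
            tokens.append(("punctuation", ch))
        elif not ch.isspace():       # whitespace is skipped (assumption)
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


print(scan("* 5 + x1 42"))
```

The resulting list of tokens is what would be handed to the parser.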

The parser organizes a sequence of tokens into a parse tree.  This is
done on the basis of the grammar defined for the language.  The parser
identifies the production rule that is associated with each
subcomponent of the expression.

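Example: parse tree for the prefix expression * 5 + - * 6 7 5 + 8 9
(each operator takes two operands; children are indented under their
parent):

    *
      5
      +
        -
          *
            6
            7
          5
        +
          8
          9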

Note that the parse tree immediately provides a guideline for
evaluating the expression.

For the above example, we assume that the tokens are already given.
That is, the expression is already scanned.
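
A minimal sketch of such a parser, assuming the prefix grammar
expr ::= number | op expr expr with the binary operators +, -, and *.
The grammar, the function name parse, and the representation of tree
nodes as Python tuples are assumptions for illustration.

```python
def parse(tokens):
    """Build a parse tree from a list of prefix-notation tokens."""
    pos = 0

    def expr():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok in ("+", "-", "*"):
            # Production: expr ::= op expr expr
            left = expr()
            right = expr()
            return (tok, left, right)
        # Production: expr ::= number
        return int(tok)

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("extra tokens after expression")
    return tree


# The tokens are assumed to be already scanned, as in the example above.
tree = parse("* 5 + - * 6 7 5 + 8 9".split())
print(tree)
```

Each tuple records which production was used, and its second and third
fields give direct access to the subexpressions.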

-- Environment

An environment is a mapping from variables to values (constants).  For
example, the environment {x -> 3, y -> 5} binds x to 3 and y to 5.

-- Interpreter

An interpreter consists of a scanner, a parser, and an evaluator.  The
evaluator takes an abstract syntax tree (or parse tree) and an
environment, and executes the parse tree in the environment.  This
results in some output, and possibly, some changes to the environment.
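
The evaluator can be sketched as a recursive walk over the parse tree.
In the sketch below the environment is a Python dictionary, tree nodes
are tuples of the form (op, left, right), and a hypothetical "set"
node stands in for an operation that changes the environment.  All of
these representation choices are assumptions.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a parse tree in the environment env (a dict)."""
    if isinstance(tree, int):        # a constant evaluates to itself
        return tree
    if isinstance(tree, str):        # a variable is looked up in env
        return env[tree]
    op, left, right = tree
    if op == "set":                  # ("set", name, expr) updates env
        env[left] = evaluate(right, env)
        return env[left]
    return OPS[op](evaluate(left, env), evaluate(right, env))


env = {"x": 3}
print(evaluate(("set", "y", ("+", "x", 4)), env))   # evaluates x + 4, binds y
print(env)
```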

-- Compiler

A compiler takes a parse tree or a sequence of parse trees and
produces machine code (rather than evaluating the parse trees).  The
machine code can then be executed by the machine for which the code is
generated.

One can thus transform an interpreter into a compiler by making
changes to the evaluator: when evaluating the parse tree, replace the
"evaluations" by "outputting appropriate machine code".

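To make this concrete, the sketch below walks a parse tree the same
way an evaluator would, but appends an instruction at each point
where the evaluator would compute a value.  The target is a
hypothetical stack machine with instructions PUSH, LOAD, ADD, SUB, and
MUL; the instruction names and the tuple form of the tree are
assumptions.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL"}


def compile_tree(tree, code=None):
    """Return a list of stack-machine instructions for a parse tree."""
    if code is None:
        code = []
    if isinstance(tree, int):
        code.append(f"PUSH {tree}")      # evaluator: return the constant
    elif isinstance(tree, str):
        code.append(f"LOAD {tree}")      # evaluator: look up the variable
    else:
        op, left, right = tree
        compile_tree(left, code)         # code that computes the left operand
        compile_tree(right, code)        # code that computes the right operand
        code.append(MNEMONIC[op])        # evaluator: apply the operator
    return code


for instruction in compile_tree(("+", ("*", 6, 7), "x")):
    print(instruction)
```
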
-- Instructions and machine code

The machine code typically consists of primitive arithmetic and
store/load instructions.  Primitive arithmetic involves simple binary
and unary operations on values stored in specified registers.
Store/load instructions involve loading from and storing into memory
locations.
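
A small simulation may help fix these ideas.  The instruction set
below (load, store, add, mul, neg) and its tuple encoding are invented
for this sketch and do not correspond to any particular machine.

```python
def run(program, memory, num_registers=4):
    """Execute a list of instructions against a register file and memory."""
    regs = [0] * num_registers
    for op, *args in program:
        if op == "load":             # load rd, addr : rd <- memory[addr]
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":          # store rs, addr : memory[addr] <- rs
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "add":            # binary operation on two registers
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "mul":            # binary operation on two registers
            rd, ra, rb = args
            regs[rd] = regs[ra] * regs[rb]
        elif op == "neg":            # unary operation on one register
            rd, ra = args
            regs[rd] = -regs[ra]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory


# c = -(a * b)
memory = {"a": 6, "b": 7, "c": 0}
program = [
    ("load", 0, "a"),
    ("load", 1, "b"),
    ("mul", 2, 0, 1),
    ("neg", 2, 2),
    ("store", 2, "c"),
]
print(run(program, memory))
```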

-- Register allocation

Since the number of registers in the CPU is limited (tens, say), the
compiler must decide which values occupy which registers.  A value
that resides in a register and is still needed later (live) must be
stored into an appropriate memory location before the register may be
reused for another computation.  Deciding this involves liveness
analysis and register allocation.
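
The sketch below shows one simple way these two steps can fit
together for straight-line code; it is not the method of any
particular compiler.  Each instruction is a hypothetical triple
(dest, op, args) in which every temporary is assigned once.  Liveness
is approximated by the index of the last instruction that reads a
temporary, and a register is recycled as soon as the value in it is
dead.  When no register is free the value is only marked "spill"; the
loads and stores a real compiler would emit are omitted.

```python
def allocate(code, num_registers):
    """Assign each temporary a register number, or "spill" if none is free."""
    # Liveness: index of the last instruction that reads each temporary.
    last_use = {}
    for i, (dest, op, args) in enumerate(code):
        for name in args:
            last_use[name] = i

    free = list(range(num_registers))
    location = {}
    for i, (dest, op, args) in enumerate(code):
        # A source read here for the last time is dead afterwards, so its
        # register may be reused, even for this instruction's result.
        for name in dict.fromkeys(args):          # de-duplicate, keep order
            if last_use[name] == i and isinstance(location.get(name), int):
                free.append(location[name])
        if free:
            reg = min(free)                       # lowest-numbered free register
            free.remove(reg)
            location[dest] = reg
        else:
            location[dest] = "spill"              # value must live in memory
    return location


# t5 = (6 * 7) - 5, written as three-address code
code = [
    ("t1", "const", ()),
    ("t2", "const", ()),
    ("t3", "mul", ("t1", "t2")),
    ("t4", "const", ()),
    ("t5", "sub", ("t3", "t4")),
]
print(allocate(code, 2))
print(allocate(code, 1))
```

With two registers every temporary fits, because t1 and t2 are dead
once t3 has been computed.  With one register, t2 and t4 are marked
for spilling.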

Java Virtual Machine
====================

(Notes based on article by Bill Venners.)

The Java Virtual Machine (JVM) is an abstract machine.  The Java
compiler compiles a given Java program into JVM code (bytecodes)
rather than into code for a particular physical machine.  The JVM has
five components:

-- Bytecodes

-- Registers

The JVM has four registers: (i) program counter; (ii) optop; (iii)
frame; and (iv) vars.

Most of the bytecode operations operate on the stack rather than on
registers.

-- The method area and program counter

The method area contains the bytecodes for the program.  The program
counter points to the instruction that will be executed next.

-- Stack and registers

The Java stack is used to store parameters for and results of bytecode
instructions, and to pass parameters and return values between method
invocations.

A stack frame consists of three parts: the local variables, the
execution environment, and the operand stack (operands for bytecode
instructions).  The first is pointed to by vars, the second by frame,
and the top of the third by optop.
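
The interplay of the program counter, the local variables, and the
operand stack can be sketched with a toy stack machine.  The mnemonics
below are borrowed from real JVM instructions, but the tuple encoding
and the run function are assumptions, and much is left out: a single
frame only, no frame register, no method calls, no types.

```python
def run(code, num_locals):
    """Interpret a list of (mnemonic, operand...) tuples for one method."""
    local_vars = [0] * num_locals    # the area that vars points to
    operand_stack = []               # its top plays the role of optop
    pc = 0                           # index of the next instruction
    while True:
        op, *args = code[pc]
        pc += 1
        if op == "bipush":           # push a constant
            operand_stack.append(args[0])
        elif op == "iload":          # push a local variable
            operand_stack.append(local_vars[args[0]])
        elif op == "istore":         # pop into a local variable
            local_vars[args[0]] = operand_stack.pop()
        elif op == "iadd":           # pop two operands, push their sum
            b = operand_stack.pop()
            a = operand_stack.pop()
            operand_stack.append(a + b)
        elif op == "imul":           # pop two operands, push their product
            b = operand_stack.pop()
            a = operand_stack.pop()
            operand_stack.append(a * b)
        elif op == "ireturn":        # pop the result and stop
            return operand_stack.pop()
        else:
            raise ValueError(f"unknown instruction {op!r}")


# x = 6 * 7; return x + 5
program = [
    ("bipush", 6), ("bipush", 7), ("imul",), ("istore", 0),
    ("iload", 0), ("bipush", 5), ("iadd",), ("ireturn",),
]
print(run(program, num_locals=1))
```

Note that the arithmetic instructions name no operands: they find
their inputs on the operand stack and leave their result there.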

-- Heap

This is where objects reside.  When new objects are created, memory is
allocated on the heap.  When objects die, their locations can be
reclaimed.  This is done by a process called garbage collection, which
is part of the JVM.  Memory management is not handled by the
programmer; instead, it is handled by the JVM.
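
One classic way to decide which objects have died is mark and sweep:
mark everything reachable from a set of roots, then reclaim the rest.
The sketch below illustrates that idea only; the notes do not say
which algorithm a given JVM uses.  The heap is modeled, as an
assumption, by a dictionary from object names to the names they
reference.

```python
def collect(heap, roots):
    """Delete from heap every object not reachable from roots."""
    # Mark phase: follow references starting from the roots.
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(heap[obj])
    # Sweep phase: reclaim every object that was not marked.
    for obj in list(heap):
        if obj not in marked:
            del heap[obj]
    return heap


# "c" and "d" refer to each other but are unreachable from the root "a".
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
print(collect(heap, roots=["a"]))
```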