The Run-Time System
===================

After a program is compiled, either to JVM bytecode or to machine
code, it needs to be executed.  We saw in the last class how a program
compiled into JVM bytecode runs.  The essential components of any
run-time system are the stack (including registers, local variables,
arguments, and return value), the method area (where the code
resides), and the heap (where objects are dynamically allocated).

Since the sizes of both the stack and the heap are dynamic, we do not
allocate a fixed amount of memory to either.  Instead, the stack and
the heap occupy the two ends of a virtual memory "space", with free
memory in between.  Note that this memory is allocated by the
operating system.
(Talk about the notion of virtual memory.)
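
The two-ended layout can be sketched with a toy model in Python.  This
is only an illustration under stated assumptions: the class name
`AddressSpace` and its methods are hypothetical, the address space has
a fixed size, and the heap only grows.  The heap grows upward from the
low end, the stack grows downward from the high end, and a request
fails when the two regions would meet.

```python
class AddressSpace:
    """Toy model: heap grows up from address 0, stack grows down from `size`."""

    def __init__(self, size):
        self.size = size
        self.heap_top = 0        # first free address above the heap
        self.stack_top = size    # lowest address currently used by the stack

    def free_bytes(self):
        # The free memory is the gap between the two regions.
        return self.stack_top - self.heap_top

    def heap_alloc(self, nbytes):
        if nbytes > self.free_bytes():
            raise MemoryError("heap would collide with the stack")
        addr = self.heap_top
        self.heap_top += nbytes
        return addr

    def push_frame(self, nbytes):
        # Called on procedure entry: reserve room for locals and arguments.
        if nbytes > self.free_bytes():
            raise MemoryError("stack would collide with the heap")
        self.stack_top -= nbytes
        return self.stack_top

    def pop_frame(self, nbytes):
        # Called on procedure return: the frame's space becomes free again.
        self.stack_top += nbytes
```

Neither region has a fixed budget in this sketch; each may use
whatever the other has left free.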

Why are objects allocated on the heap?  Because we do not know at
compile time the size of the objects we would be allocating.  How do
you think the allocation of the "class vector" works?

Memory corruption errors are among the most common software bugs.
There are different kinds of memory errors, the two most common being
references to non-allocated memory locations and memory leaks.  These
errors occur primarily when programmers do their own memory
management.

A buffer overrun is one such error.  An overrun can happen either on
the stack or in the heap.  Stack overruns may overwrite saved
registers (such as the program counter), local variables, arguments,
return values, etc., and may cause a variety of weird behaviors.  Heap
overruns overwrite neighboring objects that may be referenced later,
which can cause major problems.  The recent bug discovered in
Microsoft MDAC is an example of a heap-overrun bug.
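
To make the heap case concrete, here is a small simulation in Python.
Python itself checks array bounds, so the "heap" below is just a
bytearray and the overrun is imitated by a copy loop that never looks
at the buffer size.  The layout (an 8-byte buffer followed by an
8-byte record) and all names are assumptions made for illustration.

```python
heap = bytearray(16)    # toy heap: 16 bytes

# Two adjacent "objects": an 8-byte buffer followed by an 8-byte record.
BUF_START, BUF_SIZE = 0, 8
REC_START, REC_SIZE = 8, 8

heap[REC_START:REC_START + REC_SIZE] = b"balance!"


def unchecked_write(start, data):
    """Copy data into the heap without checking the buffer's size (the bug)."""
    for i, byte in enumerate(data):
        heap[start + i] = byte


def checked_write(start, size, data):
    """Refuse any write that would run past the end of the buffer."""
    if len(data) > size:
        raise ValueError("write exceeds buffer size")
    heap[start:start + len(data)] = data


unchecked_write(BUF_START, b"A" * 12)    # 12 bytes into an 8-byte buffer
print(bytes(heap[REC_START:REC_START + REC_SIZE]))    # b'AAAAnce!'
```

The last four bytes of the write land in the neighboring record, which
is silently damaged; the problem only shows up later, when the record
is used.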

Automatic Memory Management
===========================

Scheme and Java are two programming languages that do not allow the
programmer to manage heap memory directly.  The heap is managed by
allocation and deallocation subroutines that are built into the
run-time system and called automatically.  For instance, when you
create a new object that is local to a procedure, memory for the
object is allocated in the heap at the instant the object is created,
and it becomes eligible for deallocation once the procedure returns
and no reference to the object remains.

Memory management is a complex task, since memory is not always
allocated and deallocated in convenient chunks.  At any time the
virtual memory space may therefore contain a mix of used space and
unused space, and the unused space can be reused for later
allocations.  The process of reclaiming space that is no longer in
use is referred to as *garbage collection*.
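
The "inconvenient chunks" problem can be seen in a toy allocator.  The
sketch below, in Python, keeps the unused space as a list of holes and
uses a first-fit policy; the class `FreeListHeap` and the first-fit
choice are assumptions made for illustration, not a description of how
any particular run-time system does it.

```python
class FreeListHeap:
    """Toy heap that tracks unused space as a list of (start, size) holes."""

    def __init__(self, size):
        self.holes = [(0, size)]

    def alloc(self, nbytes):
        # First fit: take the first hole that is large enough.
        for i, (start, size) in enumerate(self.holes):
            if size >= nbytes:
                if size == nbytes:
                    del self.holes[i]
                else:
                    self.holes[i] = (start + nbytes, size - nbytes)
                return start
        return None    # no single hole is big enough

    def free(self, start, nbytes):
        # Return the block to the free list, kept sorted by address.
        self.holes.append((start, nbytes))
        self.holes.sort()
        # Merge holes that touch each other into one larger hole.
        merged = [self.holes[0]]
        for s, n in self.holes[1:]:
            last_s, last_n = merged[-1]
            if last_s + last_n == s:
                merged[-1] = (last_s, last_n + n)
            else:
                merged.append((s, n))
        self.holes = merged

    def total_free(self):
        return sum(size for _, size in self.holes)
```

If three 10-byte blocks fill a 30-byte heap and the first and last are
freed, 20 bytes are unused but a 20-byte request still fails, because
the unused space is split into two separate holes.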

Garbage collection is an important component of the run-time system
that runs in the background (or periodically) to reclaim unused space.
This procedure needs to identify "dead space", that is, space occupied
by objects that the program can no longer reference.

There are two kinds of garbage collectors: (a) reference counting and
(b) tracing.  A reference-counting collector keeps a count of the
number of references made to each object.  Whenever a reference to an
object is added, the count for the object is incremented.  Whenever a
reference to an object goes out of scope or is reassigned, the count
is decremented.  When the count drops to zero, the object can be
garbage collected.  When an object is garbage collected, the counts of
all objects it references are decremented in turn.
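
The bookkeeping described above can be sketched as a small Python
simulation.  The class `RefCountedHeap`, its method names, and the use
of strings as object names are all hypothetical; a real collector
stores the count inside each object rather than in a dictionary.

```python
class RefCountedHeap:
    """Toy heap: each object has a count and a list of objects it references."""

    def __init__(self):
        self.count = {}     # object name -> number of references to it
        self.fields = {}    # object name -> names of the objects it references

    def new(self, name):
        self.count[name] = 0
        self.fields[name] = []

    def add_ref(self, target, holder=None):
        # `holder` is the object that stores the reference; None means the
        # reference is held by a variable (stack or global).
        self.count[target] += 1
        if holder is not None:
            self.fields[holder].append(target)

    def drop_ref(self, target, holder=None):
        # A reference went out of scope or was reassigned.
        if holder is not None:
            self.fields[holder].remove(target)
        self.count[target] -= 1
        if self.count[target] == 0:
            self._reclaim(target)

    def _reclaim(self, name):
        # Remove the dead object, then decrement everything it referenced.
        children = self.fields.pop(name)
        del self.count[name]
        for child in children:
            self.count[child] -= 1
            if self.count[child] == 0:
                self._reclaim(child)

    def live(self):
        return set(self.count)
```

For example, if a variable refers to object "a" and "a" refers to "b",
dropping the variable's reference reclaims "a" and then, in turn, "b".
Note that in this sketch two objects that reference each other never
reach a count of zero, a well-known limitation of plain reference
counting.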

Tracing collectors traverse the "pointers" starting from the root set
(global variables, stack, etc.).  All objects reachable through these
pointers form the live space.  Unreachable space can be reclaimed.
This is another example of a graph traversal algorithm (like the
depth-first search and breadth-first search that we saw earlier in
the class in connection with traversing the web graph).  Tracing
collectors use a mark-and-sweep strategy, and may then compact/copy
the live data so that it is stored at one end of the heap.
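
A minimal sketch of mark, sweep, and compaction, in Python.  The heap
is modeled as a list of object names and the pointers as a dictionary
of edges; these representations and the function names are assumptions
chosen to keep the example short.  The mark phase here is a
depth-first traversal with an explicit stack.

```python
def mark(roots, edges):
    """Depth-first traversal from the root set; returns the reachable objects."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(edges.get(obj, []))
    return marked


def sweep(heap, marked):
    """Split the heap into live objects (kept) and dead objects (reclaimed)."""
    live = [obj for obj in heap if obj in marked]
    dead = [obj for obj in heap if obj not in marked]
    return live, dead


def collect(heap, roots, edges):
    marked = mark(roots, edges)
    live, dead = sweep(heap, marked)
    # Compaction: live objects move to consecutive slots at one end.
    new_address = {obj: slot for slot, obj in enumerate(live)}
    return live, dead, new_address


heap = ["a", "b", "c", "d", "e"]
edges = {"a": ["c"], "c": ["a", "e"], "b": ["d"], "d": ["b"]}
live, dead, new_address = collect(heap, roots=["a"], edges=edges)
print(live)    # ['a', 'c', 'e']
print(dead)    # ['b', 'd']
```

Objects "b" and "d" point at each other but cannot be reached from the
root set, so the traversal never marks them and both are reclaimed.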