How to Design Class Hierarchies: Lecture Notes

Viera K. Proulx and Matthias Felleisen

January 12, 2003

0.1 Week 1 - Lecture 1: Data and Computation

0.1.1 The nature of computation

Computers can represent a variety of information: text, sound, color, speed of an airplane, the direction of the wind, the prices of shares on the stock market, the results of a pre-election poll, the location of a GPS receiver. The information is represented in a computer as some data. Data typically does not represent all available information about some object or phenomenon. It represents only that part of information that is relevant to the problem which the computer program is designed to solve.

The designer of a computer program needs to analyze the problem, decide which information is relevant to it, and decide how to represent this information as data. Once this is done, the programmer designs the actual computation that transforms the given data and produces a result. The result data represents the desired outcome information. A programmer must follow a strict discipline to ensure that the program works correctly and performs the desired tasks.

0.1.2 Information vs. data

What is the difference between information and data? When counting the pieces of fruit in a fruit basket, one only needs to know the total count - a number. So, the number 7 may represent three oranges, two bananas, and two apples. However, if someone asks whether there is a banana in the fruit basket, the number 7 cannot represent the information we seek. If we expect this type of question, we may encode the contents of the fruit basket as a sequence of pairs: (3 orange) (2 banana) (2 apple). This representation of information allows us to ask more questions about the fruit basket. Still, we do not know what kind of apples are in the basket (McIntosh, Cortland, Golden Delicious, or other), nor do we know how much they weigh or what their price is.
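The two encodings can be contrasted in a short Java sketch. The class name FruitBasketSketch, the method names, and the choice of a Map from fruit name to count are assumptions made for illustration; the notes themselves only give the number 7 and the pairs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/* Hypothetical sketch: two encodings of the same fruit basket. */
class FruitBasketSketch {

    /* Encoding 1: only the total count survives. */
    static int totalOnly() {
        return 7;
    }

    /* Encoding 2: the pairs (3 orange) (2 banana) (2 apple),
       stored here as fruit name -> count. */
    static Map<String, Integer> asPairs() {
        Map<String, Integer> basket = new LinkedHashMap<>();
        basket.put("orange", 3);
        basket.put("banana", 2);
        basket.put("apple", 2);
        return basket;
    }

    /* The pairs still answer the counting question ... */
    static int totalCount(Map<String, Integer> basket) {
        int total = 0;
        for (int count : basket.values()) {
            total = total + count;
        }
        return total;
    }

    /* ... and also the question the single number could not answer. */
    static boolean hasFruit(Map<String, Integer> basket, String kind) {
        return basket.getOrDefault(kind, 0) > 0;
    }

    public static void main(String[] args) {
        Map<String, Integer> basket = asPairs();
        System.out.println(totalCount(basket));         // 7
        System.out.println(hasFruit(basket, "banana")); // true
        System.out.println(hasFruit(basket, "pear"));   // false
    }
}
```

Neither encoding records the variety, weight, or price of the apples; answering those questions would require yet another representation.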

0.1.3 Classes of data

The simplest kind of information can be represented by one numeric quantity, one sequence of letters, or some symbol. The corresponding data similarly is encoded as one number or one String (sequence of characters). But as the earlier example shows, in most cases the information about some phenomenon must be represented as several related data items. A class of data represents information about some phenomenon. It consists of a collection of data items, each representing one aspect of the phenomenon. So, for example, the daily weather report for Mt. Washington specifies the temperature (-5°F), the wind direction and velocity (NW, 67mph), the cloud cover and visibility (fog, 0), the precipitation for the day (1.0in). It is clear that each piece of data has a meaning of its own, but also that the whole collection of data items represents one complex data item - a class of data.
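A class of data such as the weather report might be written in Java roughly as follows. This is only a sketch: the class name, the field names, and the field types are assumptions, the temperature is taken to be -5 degrees Fahrenheit, and the unit of the visibility value is not stated in the notes.

```java
/* Hypothetical sketch of a class of data: one daily weather report. */
class WeatherReport {
    int temperature;       // degrees Fahrenheit
    String windDirection;  // for example "NW"
    int windSpeed;         // miles per hour
    String cloudCover;     // for example "fog"
    int visibility;        // as reported; unit not stated in the notes
    double precipitation;  // inches for the day

    WeatherReport(int temperature, String windDirection, int windSpeed,
                  String cloudCover, int visibility, double precipitation) {
        this.temperature = temperature;
        this.windDirection = windDirection;
        this.windSpeed = windSpeed;
        this.cloudCover = cloudCover;
        this.visibility = visibility;
        this.precipitation = precipitation;
    }

    public static void main(String[] args) {
        // One complex data item built from six simpler ones.
        WeatherReport mtWashington =
            new WeatherReport(-5, "NW", 67, "fog", 0, 1.0);
        System.out.println(mtWashington.windDirection + ", "
                           + mtWashington.windSpeed + "mph");
    }
}
```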

The programmer needs to identify the relevant classes of data and describe the relationships between them. For example, we may have a class of Person and a related class of Employee. Every employee is a person, but the Employee data contains more information than the Person data. A company may keep a List of Employees, and an Organizational Chart of Employees - with the CEO at the top. The figure xyz represents the relationships between these classes of data.
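The relationship between Person and Employee could be expressed in Java along the following lines. The field names and the sample values (including the name "Pat") are invented for illustration, and nesting both classes inside one sketch class is only a convenience for keeping the example in a single file.

```java
/* Hypothetical sketch: every Employee is a Person with more data. */
class PersonEmployeeSketch {

    static class Person {
        String name;
        int yearOfBirth;

        Person(String name, int yearOfBirth) {
            this.name = name;
            this.yearOfBirth = yearOfBirth;
        }
    }

    /* An Employee has everything a Person has, plus a title and a salary. */
    static class Employee extends Person {
        String title;
        double salary;

        Employee(String name, int yearOfBirth, String title, double salary) {
            super(name, yearOfBirth);
            this.title = title;
            this.salary = salary;
        }
    }

    public static void main(String[] args) {
        Employee ceo = new Employee("Pat", 1960, "CEO", 150000.0);
        // An Employee can be used wherever a Person is expected.
        Person person = ceo;
        System.out.println(person.name + " " + ceo.title);
    }
}
```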

0.1.4 Designing programs

Computer programs perform computations. A computation consumes one kind of data and produces the result as some new data, which in turn represents a new kind of information. For example, if we count all the pieces of fruit in the fruit basket, the result is a number that represents the quantity of fruit. The program that determines whether there is a banana in the fruit basket consumes the data representing the fruit basket and produces a symbol 'true' or 'false' that represents the answer.

The program design consists of a series of steps described in a design recipe. Following the design recipe helps the programmer to understand the nature of the problem, understand the structure of the data and the corresponding structure of the program, and to identify errors in the design or implementation of the program. The smallest programs represent simple computations, such as computing the speed, given the distance and time, or computing the average of two numbers. We start with the design recipes for such simple computations - implemented as functions. The design recipe consists of five steps:

Data Analysis and Data Definition
Purpose and Header
Examples
Body
Testing
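As a compact illustration, the five steps can be traced for the average of two numbers mentioned above. This is only a sketch: the class name, the function name average, and the sample values are assumptions, and plain printing stands in for whatever test support the course provides.

```java
class AverageSketch {

    /* 1. Data analysis and data definition:
          the function consumes two numbers (double)
          and produces a number (double). */

    /* 2. Purpose and header:
          compute the average of two numbers */
    static double average(double a, double b) {
        /* 4. Body */
        return (a + b) / 2.0;
    }

    /* 3. Examples:
          average(2.0, 4.0) should be 3.0
          average(1.0, 2.0) should be 1.5 */

    /* 5. Testing: run the examples and compare with the expected values. */
    public static void main(String[] args) {
        System.out.println("Expected: 3.0  Actual: " + average(2.0, 4.0));
        System.out.println("Expected: 1.5  Actual: " + average(1.0, 2.0));
    }
}
```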

0.2 Week 1 - Lecture 2: Ask Me a Question

0.2.1 Example of the use of design recipe

We are planning a trip from Boston to New York, driving at an average speed of about 50 miles per hour. We would like to know how long the trip will take. The map shows that the distance is 200 miles. It looks silly to design a program for just this one problem, but we hope to reuse this program to compute the duration of many other trips in the future.

Data Analysis and Data Definition
The information given to us is the distance (a number) and the average speed (a number). The function produces as a result another number which represents the expected duration of the trip.
In Java, numbers that represent a quantity that may not always be a whole number have the type double. So, the data this program consumes is:
double distance
double speed
It also produces a double.
Purpose and Header
The purpose statement states briefly and clearly what the goal of this program is. The header of the function is a Java language description of the data arguments the function will consume and the type of result it will produce. The header also gives the function a name. The following is the purpose and header for our function:
```
    /* Purpose: compute the travel time
    given the distance and average speed */
    double travelTime(double distance, double speed)
```
The function name is travelTime. It consumes two arguments, distance and speed, each of the type double, and it produces a result of the type double.
Examples
Our first example is the trip to New York City. The distance is 200 miles, the average speed is 50 mph. We expect the trip to take 200/50 hours, or 4 hours. The trip to Washington, DC, which is 420 miles away, will take 7 hours if we average 60 mph. We formulate the examples as Java test statements as follows:
```
    expected(4.0);
    actual( travelTime(200.0, 50.0) );

    expected(7.0);
    actual(  travelTime(420.0, 60.0) );
```
The first statement will display the expected value in the console window, labelled as 'Expected:'. The second statement will invoke the function, where the specified values will be 'plugged in' for the corresponding arguments, and display the result labelled as 'Actual:'.
The examples help us determine the nature of computation that is to be done. In this case it is a simple division of the distance value by the value of average speed.
Body
The body of the function is simple:
```
    return(distance/speed);
```
If the body of the function begins to look complicated, we need to think about subdividing the problem into simpler subproblems and delegate the computation of the subproblem results to other functions. The rule is one task - one function. For example, if we planned a trip to Chicago, the total distance would be the sum of several segments of the trip. The computation of the total distance would then be a separate task delegated to another function.

The complete function looks as follows:
```
    /* Purpose: compute the travel time
    given the distance and average speed */
    double travelTime(double distance, double speed){
        return(distance/speed);
    }
```
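The one task - one function rule can be sketched for the Chicago trip mentioned above. The function name totalDistance, the choice of exactly three segments, and the segment lengths are all assumptions made for illustration; the lengths are made-up values, not taken from a map.

```java
class ChicagoTripSketch {

    /* compute the total distance of a trip made of three segments */
    static double totalDistance(double first, double second, double third) {
        return first + second + third;
    }

    /* compute the travel time given the distance and average speed */
    static double travelTime(double distance, double speed) {
        return distance / speed;
    }

    public static void main(String[] args) {
        // Each function does one task; travelTime never adds up segments.
        double distance = totalDistance(200.0, 450.0, 350.0);
        System.out.println(travelTime(distance, 50.0)); // 20.0
    }
}
```

Keeping the summation in its own function means travelTime stays unchanged no matter how many segments a future trip has.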

Testing
The statements that specify the examples and tests are placed in a special Java test file. The complete test file FunctionTest.java would look like this:

```
/* FunctionTest.java 1.0  1 January 2003 */

public class FunctionTest extends JPFalt {
    public static void main(String[] args) { new FunctionTest(); }

    ////////////////////////////////////////////////////////////////
    // Place your actual methods here.                            //
    ////////////////////////////////////////////////////////////////

    /* compute the travel time given the distance and average speed
     -------------------------------------------------------------*/
    double travelTime(double distance, double speed){
        return(distance/speed);
    }

    /* Examples/Tests:
     --------------------------------------------------------------*/
    void travelTimeTest(){

        testHeader("travelTime(double distance, double speed)");

        expected(4.0);
        actual( travelTime(200.0, 50.0) );

        expected(7.0);
        actual(  travelTime(420.0, 60.0) );
    }
}
```
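The class JPFalt that supplies testHeader, expected, and actual is not shown in these notes. To try the example without it, a minimal stand-in might look like the following. Everything about the stand-in is an assumption: the class name SimpleTestSupport, the double-only overloads, and the output labels are guesses based on how the methods are used above, not the real library.

```java
/* Hypothetical demo: the travelTime test run against a stand-in
   for the test support. SimpleTestSupport is NOT the real JPFalt. */
class TravelTimeDemo extends SimpleTestSupport {

    /* compute the travel time given the distance and average speed */
    double travelTime(double distance, double speed) {
        return distance / speed;
    }

    void travelTimeTest() {
        testHeader("travelTime(double distance, double speed)");

        expected(4.0);
        actual(travelTime(200.0, 50.0));

        expected(7.0);
        actual(travelTime(420.0, 60.0));
    }

    public static void main(String[] args) {
        new TravelTimeDemo().travelTimeTest();
    }
}

/* Assumed behavior: each method just prints a labelled line. */
class SimpleTestSupport {

    void testHeader(String header) {
        System.out.println("Testing: " + header);
    }

    void expected(double value) {
        System.out.println("Expected: " + value);
    }

    void actual(double value) {
        System.out.println("Actual:   " + value);
    }
}
```

With this stand-in the reader compares each Expected line with the Actual line that follows it by eye; nothing is checked automatically.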

0.2.2 Other kinds of data

The answer to the question whether there is an apple in a fruit basket is either 'yes' or 'no', or alternatively true or false. Java uses the boolean data type to represent true or false values. The data type that represents the number of pieces of fruit in the basket is int (standing for integer, or whole number). The name of a person or a city is represented as a String, for example "New York City".

A function that determines whether a person is old enough to vote illustrates the use of boolean values and introduces relational operators. The Java expression age > 18 produces a boolean value true or false. Additional relational operators are <, <=, >=, ==, !=.
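A small sketch shows each relational operator producing a boolean value. The class name, the helper olderThan18, and the sample age of 18 are assumptions chosen so that the boundary case is visible.

```java
class RelationalSketch {

    /* is the given age greater than 18? */
    static boolean olderThan18(int age) {
        return age > 18;
    }

    public static void main(String[] args) {
        int age = 18;
        System.out.println(age > 18);   // false
        System.out.println(age >= 18);  // true
        System.out.println(age < 18);   // false
        System.out.println(age <= 18);  // true
        System.out.println(age == 18);  // true
        System.out.println(age != 18);  // false
    }
}
```

Note that age > 18 is false for someone who is exactly 18, so the choice between > and >= decides the boundary case.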

The following is an example of a function that returns a boolean value:

    /* determine whether a person can vote, given the year of birth */
    boolean canVote(int dob){
        return (2003 - dob) > 18;
    }

    /* Examples/Tests:
     --------------------------------------------------------------*/
    void canVoteTest(){
        testHeader("canVote(int dob)");

        expected(false);
        actual( canVote(1985) );

        expected(true);
        actual(  canVote(1982) );

        expected(false);
        actual( canVote(1995) );
    }

0.2.3 Composition of functions

Sometimes the task is complex and hard to understand. Careful analysis typically reveals that the problem can be subdivided into smaller, simpler tasks.

You want to estimate the cost of gas needed for a trip to New York City. The distance is 210 miles, the cost of gas is $1.50 per gallon and the car averages 30 miles per gallon. The data analysis is simple. The function consumes the distance (double distance), the gas consumption (double mpg), and the price per gallon (double price). The purpose and the header follow:

    /* compute the price of gas needed for a trip,
       given the distance, miles per gallon, and the cost of gas
      -----------------------------------------------------------*/
    double tripCost(double distance, double mpg, double price)

Here are some examples. On the trip to New York City the car will need 210/30 = 7 gallons of gas, and so the cost of the trip will be 7 * $1.50 = $10.50. Another trip is to Washington, DC, which is 400 miles away. The car averages 20 mpg and the gas price is $1.60. The car will need 400/20 = 20 gallons of gas at a cost of 20 * $1.60 = $32.00. In Java, the examples are:

    expected( 10.50);
    actual  ( tripCost(210, 30, 1.50) );

    expected( 32.00);
    actual  ( tripCost(400, 20, 1.60) );

The body of the function follows:

        return (gallonsNeeded(distance, mpg) * price);

The call of the function gallonsNeeded recognizes the fact that computing the number of gallons is a separate task. The function has not yet been defined, but our expected use guides its definition. The development of the tripCost function is nearly complete, but we cannot run the tests until the helper function gallonsNeeded is available.

Only examples and tests are needed to develop the helper function gallonsNeeded - the rest follows from the work done on the tripCost function and the use of the gallonsNeeded function in its body. Examples and tests are derived directly from the examples and tests for the tripCost function. The final version is:

    /* compute the gallons of gas needed for a trip,
       given the distance, and miles per gallon
    -------------------------------------------------------------*/
    double gallonsNeeded(double distance, double mpg){
        return distance / mpg;
    }

    /* Examples/Tests:
     --------------------------------------------------------------*/
    void gallonsNeededTests(){
        testHeader("gallonsNeeded(double distance, double mpg)");

        expected( 7.0);
        actual  ( gallonsNeeded(210, 30) );

        expected( 20.00);
        actual  ( gallonsNeeded(400, 20) );
    }
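Put together, the two functions can be run as one small program. The sketch below assumes a wrapper class named TripCostSketch and uses plain printing in place of the expected and actual test statements; the function bodies are the ones developed above.

```java
class TripCostSketch {

    /* compute the gallons of gas needed for a trip,
       given the distance and miles per gallon */
    static double gallonsNeeded(double distance, double mpg) {
        return distance / mpg;
    }

    /* compute the price of gas needed for a trip,
       given the distance, miles per gallon, and the cost of gas */
    static double tripCost(double distance, double mpg, double price) {
        // The helper handles the separate task of computing the gallons.
        return gallonsNeeded(distance, mpg) * price;
    }

    public static void main(String[] args) {
        System.out.println("Expected: 10.5  Actual: "
                           + tripCost(210, 30, 1.50));
        System.out.println("Expected: 32.0  Actual: "
                           + tripCost(400, 20, 1.60));
    }
}
```

The argument order matters: swapping mpg and price in a call would compile without complaint, since both are double, but would produce a wrong result.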

0.3 Week 1 - Lecture 3: As You Like It

In some programs the computational formula is different for each of several sets of input values. For example, the price of admission to a museum is different for children, youth, adults, and seniors. Children get in for free, youth and seniors pay half price, adults pay $10.00. Here children are under 6, youth are under 16, and seniors are 65 or older. The design recipe for this type of function introduces a new step. The data the function ticketPrice consumes is the age of the patron (an int). The result is the price in dollars (a double). Before developing the body of the function, we create a template for the function body. For each condition, there is one entry:

    /* Template */

        if (age < 6)
            return ...
        if (age < 16)
            return ...
        if (age < 65)
            return...
        else
            return...

It is important to return to examples and tests to make sure there is at least one example/test for each condition, and preferably one test for each boundary between conditions as well. For this function at least four examples are needed (resulting in child, youth, adult, and senior prices). Preferably three more tests for ages 6, 16, and 65 are also included. The development of the body easily follows. The completed function is:

    /* compute the price of museum admission, given the age of patron
       ------------------------------------------------------------*/
    double ticketPrice(int age){
    /* Template
        if (age < 6)
            return ...
        if (age < 16)
            return ...
        if (age < 65)
            return ...
        else
            return...
     */

        if (age < 6)
            return 0.0;
        if (age < 16)
            return 5.00;
        if (age < 65)
            return 10.00;
        else
            return 5.00;
    }

    /* Examples/Tests:
     --------------------------------------------------------------*/
    void ticketPriceTests(){
        testHeader("ticketPrice(int age)");

        expected( 0.0);
        actual  ( ticketPrice(3) );

        expected( 5.0);
        actual  ( ticketPrice(12) );

        expected( 10.0);
        actual  ( ticketPrice(34) );

        expected( 5.0);
        actual  ( ticketPrice(80) );

        expected( 5.0);
        actual  ( ticketPrice(6) );

        expected( 10.0);
        actual  ( ticketPrice(16) );

        expected( 5.0);
        actual  ( ticketPrice(65) );
    }

The price of admission may be reduced for museum members:

age            member    non-member
under 6        free      free
under 16       $3.00     $5.00
under 65       $5.00     $10.00
65 and over    $3.00     $5.00

The template needs to reflect that:

    /* Template
        if (member){
            if (age < 6)
                return ...
            if (age < 16)
                return ...
            if (age < 65)
                return ...
            else
                return...
        }
        else{
            if (age < 6)
                return ...
            if (age < 16)
                return ...
            if (age < 65)
                return ...
            else
                return...
        }
     */

The { and } braces enclose the block of code to be performed for the outer if statement and the block to be performed for the outer else statement.
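Filling in the template with the prices from the table gives a function along the following lines. This is a sketch rather than the version from the lecture: the class name MemberTicketSketch and the parameter name member are assumptions, and the prices are copied from the table above.

```java
class MemberTicketSketch {

    /* compute the price of museum admission,
       given the age of the patron and whether the patron is a member */
    static double ticketPrice(int age, boolean member) {
        if (member) {
            if (age < 6)
                return 0.0;
            if (age < 16)
                return 3.00;
            if (age < 65)
                return 5.00;
            else
                return 3.00;
        }
        else {
            if (age < 6)
                return 0.0;
            if (age < 16)
                return 5.00;
            if (age < 65)
                return 10.00;
            else
                return 5.00;
        }
    }

    public static void main(String[] args) {
        // One example per row and column of the table would be needed;
        // only a few are shown here.
        System.out.println(ticketPrice(12, true));   // 3.0
        System.out.println(ticketPrice(12, false));  // 5.0
        System.out.println(ticketPrice(34, true));   // 5.0
        System.out.println(ticketPrice(65, false));  // 5.0
    }
}
```

With two conditions there are eight cases, so at least eight examples are needed, plus tests at the boundary ages 6, 16, and 65 for both members and non-members.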

Last modified: Sun, Jan 12, 2003, 5:38 pm