How to Design Class Hierarchies

Viera K. Proulx and Matthias Felleisen

January 6, 2003

0.1 Numbers and Expressions

Computers can deal with a variety of data, such as text, numerical values, images, sounds, mouse movement, etc. Numerical values are the most natural to understand both for people and for computers. We start by developing simple programs that deal with numeric computation.

Java provides a way to represent several different kinds of numerical values as PRIMITIVE TYPES. We will use two of them now: type int that represents integer values and double that represents real numbers (with fractional part). Numbers 5, -3, 122456, -3241 are int values, numbers 3.1415, -0.00234, 3456.2347, 12.5 are double values. Java provides the standard four arithmetic operators: +, -, *, and /. In addition, there is an operator % that is used to compute the remainder after the division by a given number. So, for example, the result of computing 20 % 3 is 2.

Parentheses can be used to group subexpressions to specify the order of the expression evaluation. The standard rules for expression evaluation typically work as expected, but it is better to write programs where the reader does not have to look up such details.
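As a small illustration of the operators and of grouping with parentheses, the following sketch prints the results of a few expressions. The class name ArithmeticDemo and the variable names are assumptions made for this example only. It also shows one detail that the text above does not spell out: when both operands of / are int values, Java drops the fractional part of the quotient.

```java
// A minimal sketch; the class and variable names are illustrative only.
class ArithmeticDemo {
    public static void main(String[] args) {
        int remainder = 20 % 3;         // remainder after dividing 20 by 3
        int intQuotient = 20 / 3;       // both operands are int: the fractional part is dropped
        double realQuotient = 20.0 / 3; // one double operand makes this a double division
        int grouped = (2 + 3) * 4;      // parentheses force the addition to happen first
        int ungrouped = 2 + 3 * 4;      // without parentheses, * is applied before +

        System.out.println(remainder);    // prints 2
        System.out.println(intQuotient);  // prints 6
        System.out.println(grouped);      // prints 20
        System.out.println(ungrouped);    // prints 14
        System.out.println(realQuotient); // a double value close to 6.6667
    }
}
```

Writing (2 + 3) * 4 rather than relying on precedence rules spares the reader from looking up such details.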

0.2 Variables and Functions

In algebra we learn to formulate dependencies between quantities using VARIABLE EXPRESSIONS. A variable is a placeholder that stands for an unknown quantity. For example, a disk of radius r has the approximate area
3.14 * r * r
In this expression r stands for any positive number. If we now come across a disk with radius 5, we can determine its area by substituting 5 for r in the above formula and reducing the resulting expression to a number:
3.14 * 5 * 5 = 3.14 * 25 = 78.5 .
More generally, expressions that contain variables are rules that describe how to compute a number when we are given values for the variables.

A function in Java is such a rule. All descriptions of actions we want our program to perform are written as functions. The function definition describes the desired computation and the type of result it generates. The following function computes the area of a disk:

    double areaOfDisk(double radius){
        return ( radius * radius * 3.1415 );
    }

The first line is the function header or function signature. The programmer selects a name that describes the intended purpose of this function. The double variable radius that appears in parentheses to the right of the function name is called a function parameter. It describes the input to this function, the value that will be used in the subsequent computation. The keyword double signifies that the value will be some real number - i.e. a number with a fractional part. Even if the radius is an integer value, it is treated as a number with a fractional part, because the result is, in general, not an integer. The first double in the function signature signifies that the function will compute (or return) a value of the type double.

The signature is followed by the definition of the actual computation - enclosed in a pair of braces { }. The computation specifies that the function should return the result of performing the specified operation, after substituting the input value for the parameter radius.

Once we have defined a function, we can use it. So, for example, we may write expressions whose operation is areaOfDisk replacing the variable radius by the desired input value and omitting the type specification for the input:

    areaOfDisk(5.0)

We also say that we APPLY areaOfDisk to 5.0.

The application of the function is evaluated by copying the expression that follows the return and by replacing the variable radius by the number we supplied (5.0):

    areaOfDisk(5.0)
    = ( 5.0 * 5.0 * 3.1415 )
    = ( 25.0 * 3.1415 )
    = 78.5375

Functions may consume more than one input. Consider a ring, that is, a disk with a circular hole in its center. The area of a ring is that of the outer disk minus the area of the inner disk, which means that the function requires two unknown quantities: the outer and the inner radii. Let us call these unknown numbers outer and inner. Then the function that computes the area of a ring is defined as follows:

    double areaOfRing(double outer, double inner){
        return areaOfDisk(outer) - areaOfDisk(inner);
    }

The three lines express that areaOfRing is a function, that the function accepts two input parameters called outer and inner, that the result will be a double type value, and that the result is going to be the difference between areaOfDisk(outer) and areaOfDisk(inner). In other words, we have used both basic Java arithmetic operations and earlier defined Java functions in the definition of areaOfRing.

When we wish to use areaOfRing, we must supply two inputs:

    areaOfRing(5.0, 3.0)

The function is evaluated in the same manner as areaOfDisk(5.0). We copy the expression from the definition of the function and replace the variables with the numbers we supplied:

    areaOfRing(5.0, 3.0)

    = areaOfDisk(5.0) - areaOfDisk(3.0)

    = ( 5.0 * 5.0 * 3.1415 ) - ( 3.0 * 3.0 * 3.1415 )

The rest is plain arithmetic.
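Carried out by hand, the remaining arithmetic is 78.5375 - 28.2735 = 50.264. To try the two functions on a computer, they have to be placed inside a class. The sketch below is one way to do so; the class name RingArea, the keyword static, and the main function are assumptions added here so that the code can run on its own, and they are not part of the definitions above.

```java
// A minimal runnable sketch; RingArea, static, and main are illustrative additions.
class RingArea {
    // computes the approximate area of a disk with the given radius
    static double areaOfDisk(double radius) {
        return radius * radius * 3.1415;
    }

    // computes the area of a ring as the outer disk minus the inner disk
    static double areaOfRing(double outer, double inner) {
        return areaOfDisk(outer) - areaOfDisk(inner);
    }

    public static void main(String[] args) {
        System.out.println(areaOfDisk(5.0));      // approximately 78.5375
        System.out.println(areaOfRing(5.0, 3.0)); // approximately 50.264
    }
}
```

Because double values are stored with limited precision, the printed numbers may differ from the hand-computed ones in the last digits.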

0.3 Word Problems

Programmers are rarely handed mathematical expressions to turn into functions. Instead they typically receive informal problem descriptions that often contain irrelevant and sometimes ambiguous information. The programmer's first task is to extract the relevant information and then to formulate appropriate expressions.

Here is a typical example:

Company XYZ & Co. pays its employees $12 per hour. A typical employee works between 20 and 65 hours per week. Develop a function that determines the wage of an employee from the number of hours of work.

The last sentence is the first to mention the actual task: to write a function that determines one quantity based on some other quantity. More specifically, the program consumes one quantity, the number of hours of work, and produces another one, the wage in dollars. The first sentence implies how to compute the result, but does not state it explicitly. In this particular example, though, this poses no problem. If an employee works h hours, the wage is

12 * h

Now that we have a rule, we can formulate a Java function:

    int wage(int h){
        return 12 * h;
    }

The function is called wage; its parameter h stands for the hours an employee works; the result is 12 * h, the corresponding wage.
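To check the rule against the numbers in the problem statement, we can apply wage to the smallest and largest typical number of hours. The following sketch assumes a surrounding class named WageDemo and a main function, both of which are additions for illustration.

```java
// A minimal runnable sketch; WageDemo and main are illustrative additions.
class WageDemo {
    // computes the wage in dollars for h hours of work at $12 per hour
    static int wage(int h) {
        return 12 * h;
    }

    public static void main(String[] args) {
        System.out.println(wage(20)); // prints 240
        System.out.println(wage(65)); // prints 780
    }
}
```

Note that the range of 20 to 65 hours plays no role in the computation; it is one of the pieces of information in the problem statement that the function does not need.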

0.4 Errors

When we write Java programs we must follow a number of carefully designed rules, which are a compromise between a computer's abilities and human behavior. For example, certain statements must end in a semicolon - that way the computer interpreting your program knows that you have finished that statement. The rules for writing functions are the following:

    return-type function-name ( par-type1 par-name1, ... ){
        return ...the computational expression... ;}

Syntax Errors: When the programmer does not follow the rules of the programming language syntax (its grammar), the computer does not understand the program structure and signals a syntax error. Typical errors are the omission of a closing brace or a semicolon, the misspelling of a word, or the placing of some parts of a statement out of order. To get familiar with how the compiler reacts to different types of errors, make intentional errors in your working program and observe the messages you get. Omit the closing brace, omit the semicolon, misspell a word, omit a word, use the wrong type of argument in a function call, etc.

Runtime Errors: Once the program is successfully compiled it can be executed. That means that the computer now performs the actions the programmer specified. If the program expects a numeric value to perform a computation, but receives a word of text instead, it cannot continue and signals an error. A common error is an attempt to divide by zero.

Logical Errors: The computer performs the instructions given to it in a program exactly as stated. It does not know that the programmer made an arithmetic error such as

    double areaOfDisk(double radius){
        return ( radius + radius * 3.1415 );
    }

and computes the result as specified by the programmer. A programmer can catch such mistakes only by designing programs carefully and systematically.
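The difference between a runtime error and a logical error can be observed in a small experiment. The sketch below is only an illustration: the class ErrorKinds and the functions areaOfDiskWrong and perItem are hypothetical names introduced here, and the try/catch statement is used solely so that the program can report the runtime error instead of stopping.

```java
// A minimal sketch; all names other than areaOfDisk are hypothetical.
class ErrorKinds {
    // correct version
    static double areaOfDisk(double radius) {
        return radius * radius * 3.1415;
    }

    // logical error: compiles and runs, but uses + where * was intended
    static double areaOfDiskWrong(double radius) {
        return radius + radius * 3.1415;
    }

    // compiles, but signals a runtime error when count is 0
    static int perItem(int total, int count) {
        return total / count;
    }

    public static void main(String[] args) {
        System.out.println(areaOfDisk(2.0));      // approximately 12.566
        System.out.println(areaOfDiskWrong(2.0)); // approximately 8.283, with no warning at all

        try {
            System.out.println(perItem(10, 0));
        } catch (ArithmeticException e) {
            // integer division by zero is reported by Java as an ArithmeticException
            System.out.println("Runtime error: " + e.getMessage());
        }
    }
}
```

The faulty function produces a number just as readily as the correct one; only a comparison with a result computed by hand reveals the mistake.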

0.5 Designing Functions

The preceding sections show that the development of a function requires many steps. We need to determine what's relevant in the problem statement and what we can ignore. We need to understand what the function consumes, what it produces, and how it relates inputs to outputs. We must know, or find out, whether Java provides certain basic operations for the data that our function is to process. If not, we might have to develop auxiliary functions that implement these operations. Finally, once we have a function, we must check whether it actually performs the intended computations. This might reveal syntax errors, run-time problems, or even logical errors.

To bring some order to this apparent chaos, it is best to set up and to follow a DESIGN RECIPE, that is, a step-by-step prescription of what we should do and the order in which we should do things. Based on what we have experienced thus far, the development of a program requires at least the following four activities:

Understanding the Function's Purpose: The goal of defining a function is to create a mechanism that consumes and produces data. We therefore start every function by giving the function a meaningful name and by stating what kind of information it consumes and produces. We call this a SIGNATURE. Here is how we write down a signature for areaOfRing, one of our first functions:
```
double areaOfRing (double outer, double inner)
```
The left side of the signature specifies the type of the result the function produces. Next is the name of the function. To the right of the function name is a list, which specifies the type of each input item the function needs and gives it a name. Another name for the inputs is parameters.

The signature for the function areaOfRing says that we refer to the first input as outer and the second one as inner, and that the function will produce a value of the type double.

Finally, using the signature and the parameters we should formulate a short PURPOSE STATEMENT for the program, that is, a brief comment of what the function is to compute. For most of our functions, one or two lines will suffice; as we develop larger and larger functions and programs, we may need to add more information to explain the function's purpose.

Here is the complete starting point for our running example; the header has been modified to provide a TEMPLATE for the function definition.
```
    /* Purpose: to compute the area of a ring, whose radius is
                outer and whose hole has radius of inner
     ------------------------------------------------------------*/
    double areaOfRing(double outer, double inner){
        return ... ;
    }
```
Hints: If the problem statement provides a mathematical formula, the number of distinct variables in the formula suggests how many inputs the program consumes.

For other problems, we must inspect the problem to separate the given facts from what is to be computed. If a given is a fixed number, it shows up in the function. If it is an unknown number that is to be fixed by someone else later, it is an input. The question (or the imperative) in the problem statement suggests a name for the function. To determine the type of input parameters and the type of the return value, we must consider what the values for the inputs represent and how they will be used in the function.

Function Examples: To gain a better understanding of what the function should compute, we make up examples of inputs and determine what the output should be. For example, areaOfRing should produce 109.9525 for the inputs 6.0 and 1.0, because it is the difference between the area of the outer disk (113.094) and the area of the inner disk (3.1415).

We add examples to the purpose statement.
```
    /* Purpose: to compute the area of a ring, whose radius is
                outer and whose hole has radius of inner
     ------------------------------------------------------------*/
    /* Example: areaOfRing(6.0, 1.0) should produce 109.9525
     ------------------------------------------------------------*/
    double areaOfRing(double outer, double inner){
        return ... ;
    }
```
Making up examples - before we write down the function's body - helps in many ways. First, it is the only sure way to discover logical errors with testing. If we use the finished function to make up examples, we are tempted to trust the function because it is so much easier to run the function than to predict what it does. Second, examples force us to think through the computational process, which, for the complicated cases we will encounter later, is critical to the development of the function body. Finally, examples illustrate the informal prose of a purpose statement. Future readers of the function, such as teachers, colleagues, or buyers, greatly appreciate illustrations of abstract concepts.

The Body: Finally, we must formulate the function's body. That is, we must replace the "..." in our definition with an expression. The expression computes the answer from the parameters, using Java's basic operations and Java functions that we already defined or intend to define.

We can only formulate the function's body if we understand how the function computes the output from the given inputs. If the input-output relationship is given as a mathematical formula, we just translate the mathematics into a Java expression. If, instead, we are given a word problem, we must craft the expression carefully. To this end, it is helpful to revisit the examples from the second step and to understand how we computed the outputs for the specific inputs.

In our running example, the computational task was given via an informally stated formula that re-used areaOfDisk, a previously defined function. Here is the translation into Java:
```
    /* Purpose: to compute the area of a ring, whose radius is
                outer and whose hole has radius of inner
     ------------------------------------------------------------*/
    /* Example: areaOfRing(6.0, 1.0) should produce 109.9525
     ------------------------------------------------------------*/
    double areaOfRing(double outer, double inner){
        return areaOfDisk(outer) - areaOfDisk(inner);
    }
```

Testing: After we have completed the function definition, we must still test the function. That means we have to APPLY or INVOKE the function with several different values supplied as parameters, and make sure the function result value matches our expectation. At a minimum we should test the function on the examples we used to design the function. To facilitate the testing we provide three functions in the Exercise Set that help in performing such tests. The expression expected(someValue) prints in the console a message Expected: someValue. The expression actual(someFunction(value1, value2, ...)) prints in the console a message Actual: computed-return-value. To verify that a given function application returns the desired value, we therefore write
```
    expected(expectedResult);
    actual  (someFunction(value1, value2, ... ));
    
```
In addition, the function testHeader(someFunctionName) prints a header for this test suite: Testing function someFunctionName. The code below shows our tests for the areaOfRing function.
```
    /*-------------------------------------------------------------
     Tests:
     ------------------------------------------------------------*/
    void areaOfRingTests(){

        testHeader ( "areaOfRing" );

        expected(3.1415);
        actual  ( areaOfRing(1.0, 0.0) );

        expected(9.4245);
        actual  ( areaOfRing(2.0, 1.0) );

        expected(109.9525);
        actual  ( areaOfRing(6.0, 1.0) );
    }
```
Testing cannot show that a program produces the correct outputs for all possible inputs -- because there are typically an infinite number of possible inputs. But testing can reveal syntax errors, run-time problems, and logical mistakes.

For faulty outputs, we must pay special attention to our program examples. It is possible that the examples are wrong; that the program contains a logical mistake; or that both the examples and the program are wrong. In any of these cases, we may have to step through the entire program development again.
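The three helper functions come from the Exercise Set and their definitions are not shown in this section. The sketch below is therefore only a guess at how they might look, based on the messages described above; the class name AreaOfRingTestSketch and the function closeEnough are hypothetical additions. The latter compares two double values up to a small tolerance, which is useful because a computed double may differ from the expected value in its last digits.

```java
// A sketch under stated assumptions; not the Exercise Set's actual definitions.
class AreaOfRingTestSketch {
    // assumed behavior: print a header for a group of tests
    static void testHeader(String functionName) {
        System.out.println("Testing function " + functionName);
    }

    // assumed behavior: print the value we predicted by hand
    static void expected(double value) {
        System.out.println("Expected: " + value);
    }

    // assumed behavior: print the value the function computed
    static void actual(double value) {
        System.out.println("Actual: " + value);
    }

    // hypothetical helper: true if the two values differ by less than 0.0001
    static boolean closeEnough(double expectedValue, double actualValue) {
        return Math.abs(expectedValue - actualValue) < 0.0001;
    }

    static double areaOfDisk(double radius) {
        return radius * radius * 3.1415;
    }

    static double areaOfRing(double outer, double inner) {
        return areaOfDisk(outer) - areaOfDisk(inner);
    }

    public static void main(String[] args) {
        testHeader("areaOfRing");
        expected(109.9525);
        actual(areaOfRing(6.0, 1.0));
        System.out.println("Close enough: "
            + closeEnough(109.9525, areaOfRing(6.0, 1.0)));
    }
}
```

With helpers of this kind, a mismatch between the Expected and Actual lines points either to a wrong example or to a mistake in the function body.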

Figure fig:recipe1-example shows what we get after we have developed the program according to our recipe. Figure fig:design1 summarizes the recipe in tabular form. It should be consulted whenever we design a program.

None of these skills are computer-specific; all of them are needed to solve all kinds of problems. Even lawyers and doctors, artisans and secretaries can benefit from laying out their work along these lines.

