CS 3500 Fall 2013
Lecture 6: Data Abstractions - Specification and Implementation
---------------------------------------------------------------

Topics:
-------
Design test cases for a data type given as an algebraic specification.
Data abstraction: understanding specification and implementation
Designing algebraic specification

----------------------------

Design test cases for a data type given as an algebraic specification:
----------------------------------------------------------------------
The methods that create and manipulate a data type representing a collection of data elements are interrelated, so a test that claims to test one method may rely on the correct implementation of some other method.

We always start by creating several instances of the data type. If there are two ways to create instances that are considered equal, we use both. But be careful: testing whether the two new instances are equal is a test of both creators and of the 'equals' method, and the reason a test fails may lie in any of these methods. If we were the implementors, we could write tests that inspect the internal values of the representation, but we cannot do so when designing black-box tests.

Next, we should test any methods that modify the structure of the data type, that is, methods that add or remove elements. Again, these tests rely on the correct implementation of the 'equals' method.

The data we create for these tests provide the examples to use in tests of the methods that report on the size or the 'emptiness' of the data set.

The easiest tests are those for the 'toString' method, as there we can compare the actual value with a specific expected string.

Data abstraction: understanding specification and implementation:
-----------------------------------------------------------------

Specification:
~~~~~~~~~~~~~~
The specification describes the expected behavior that the client can use and rely on.
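Because the client sees only this specified behavior, the black-box tests described in the first section can be written against it alone. The following is a minimal sketch of that testing order. It uses the standard 'java.util.Set' interface purely as a stand-in for a collection data type; the class 'SetBlackBoxChecks' and its helpers 'check' and 'runAll' are hypothetical names introduced for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical black-box checks; java.util.Set stands in for the data type under test.
class SetBlackBoxChecks {

    // Fails loudly with the name of the check that did not hold.
    static void check(boolean condition, String name) {
        if (!condition) {
            throw new AssertionError("failed: " + name);
        }
    }

    // Runs every check in order and returns how many passed (throws on the first failure).
    static int runAll() {
        int passed = 0;

        // 1. Create equal instances in two different ways.
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new TreeSet<>();
        b.add(3);
        b.add(2);
        b.add(1);
        // A failure here may lie in either creator or in 'equals'.
        check(a.equals(b), "two creators give equal sets");
        passed++;

        // 2. Methods that modify the structure; these checks still rely on 'equals'.
        a.add(4);
        check(!a.equals(b), "add changes the set");
        passed++;
        a.remove(4);
        check(a.equals(b), "remove undoes add");
        passed++;

        // 3. Reuse the same data for size and emptiness.
        check(a.size() == 3, "size of {1, 2, 3}");
        passed++;
        check(new HashSet<Integer>().isEmpty(), "new set is empty");
        passed++;

        // 4. toString is compared against one specific expected string.
        Set<Integer> single = new HashSet<>(Arrays.asList(7));
        check(single.toString().equals("[7]"), "toString of a singleton");
        passed++;

        return passed;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " checks passed");
    }
}
```

None of these checks looks inside the representation, so the same checks apply unchanged whether the set is backed by a hash table or by a tree.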
The underlying implementation can take several forms, and the client should not be given any information about the nature of the implementation. More importantly, the client should not be able to change the internals of the implementation, and should not even be able to observe the implementation details.

Algebraic specification is one way to provide a specification for a data type. Another, somewhat less formal but equally detailed and accurate, is a description of the data type's behavior given as methods (possibly defined in an 'interface') with detailed purpose statements that always include three parts:

- REQUIRES specifies any requirements on the method arguments (e.g. an integer must be positive).
- MODIFIES specifies which members (fields) of 'this' instance are modified within the body of the method. The method may also modify some of the arguments passed to it (for example, a method that sorts a given 'ArrayList' may modify the 'ArrayList' that was provided as its argument).
- EFFECT specifies the value that the method returns upon completion. (In Fundies 2 we used to say that a method 'produces' a result, and used the word 'EFFECT' for side effects, i.e. the modifications to the instance that invoked the method or to any other data passed to the method.)

If one of the methods is a constructor, the specification may be given in the form of an abstract class or, if targeted at a specific application, even a concrete class, but with only the method signatures and the specification of each method's behavior.

The textbook introduces the 'Poly' class, which represents and manipulates polynomials. The specification on page 84 does not show how the methods would be implemented; that depends on how we decide to represent the data that describes a particular polynomial.

Implementation:
~~~~~~~~~~~~~~~
A given specification can be implemented in several different ways.
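As a small illustration of this point, the sketch below gives one specification, written with REQUIRES / MODIFIES / EFFECT purpose statements, and two implementations of it. The 'Tally' interface and both classes are hypothetical and are not taken from the textbook. Checking the REQUIRES clause and throwing an exception is one possible design choice, not something the specification style demands.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical specification: the client sees only these purpose statements.
interface Tally {
    // REQUIRES: amount > 0
    // MODIFIES: this (the recorded amounts)
    // EFFECT:   returns the total after the given amount has been recorded
    int add(int amount);

    // REQUIRES: nothing
    // MODIFIES: nothing
    // EFFECT:   returns the sum of all amounts recorded so far
    int total();
}

// Implementation 1: keeps only a running sum.
class RunningTally implements Tally {
    private int sum = 0;

    public int add(int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        sum += amount;
        return sum;
    }

    public int total() {
        return sum;
    }
}

// Implementation 2: remembers every amount and computes the sum on demand.
class ListTally implements Tally {
    private final List<Integer> amounts = new ArrayList<>();

    public int add(int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        amounts.add(amount);
        return total();
    }

    public int total() {
        int sum = 0;
        for (int a : amounts) {
            sum += a;
        }
        return sum;
    }
}
```

A client holding a 'Tally' cannot tell which of the two classes it was given: after 'add(2)' and 'add(5)', both return 7 from 'total()'.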
The programmer working on the implementation is responsible to the client for delivering an implementation that fulfills the specification. The client can verify compliance by running the black-box tests.

The programmer is also responsible for making the code readable and understandable to a future developer who may be asked to change or improve the existing implementation. Besides following the Java coding and style conventions, the programmer should provide 'documentation' that not only describes each method's behavior, but also reveals the design decisions the programmer made in defining the data representation or in implementing a method.

Regardless of whether the specification asks for it, the programmer should always implement the methods 'toString', 'equals', 'hashCode', and 'clone'.

The 'toString' method should generate a reasonable representation of the information that the instance of the data type represents, though at times this may not be doable. For example, there is no fully satisfactory way to represent a set of data as a String: the order of the elements is irrelevant to the set, yet it is a prominent feature of any generated String.

The 'equals' method is extremely important. If it is implemented incorrectly, many of the black-box tests will fail. It is very important to understand deeply the different ways that one can compare two objects for equality in Java. The most important distinction is between 'a.equals(b)' and 'a == b'. The first invokes the 'equals' method defined for the class of the object 'a' (the one inherited from 'Object' if that class does not override it). The second, applied to object references, always checks for object identity, i.e. whether 'a' and 'b' refer to the same object, saved at the same memory location.

The 'hashCode' method must be overridden to match the 'equals' method. Any two objects that are 'equals' must generate the same 'hashCode'.

The 'clone' method should be overridden for every data type as well. While we strive to design the 'clone' method to make a deep copy of the data, this is not always possible or reasonable.
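The points above about 'equals', 'hashCode', and 'clone' can be sketched on a small example. The 'Samples' class below is hypothetical and exists only for this illustration; it assumes that two instances should be equal when they hold the same values in the same order.

```java
import java.util.Arrays;

// Hypothetical data type used only to illustrate equals, hashCode, and clone.
final class Samples implements Cloneable {
    private int[] values;   // the representation is hidden from the client

    Samples(int... values) {
        this.values = values.clone();   // defensive copy of the caller's array
    }

    // Two Samples are equal when they hold the same values in the same order.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                    // same object
        }
        if (!(other instanceof Samples)) {
            return false;                   // also rejects null
        }
        return Arrays.equals(values, ((Samples) other).values);
    }

    // Matches equals: equal arrays always produce the same hash code.
    @Override
    public int hashCode() {
        return Arrays.hashCode(values);
    }

    // Level of cloning: the array itself is copied, so the clone shares no
    // mutable state with the original (a deep copy, since the elements are ints).
    @Override
    public Samples clone() {
        try {
            Samples copy = (Samples) super.clone();
            copy.values = values.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);    // cannot happen: Samples is Cloneable
        }
    }

    @Override
    public String toString() {
        return "Samples" + Arrays.toString(values);
    }
}
```

With 'a = new Samples(1, 2, 3)' and 'b = new Samples(1, 2, 3)', 'a.equals(b)' is true while 'a == b' is false. For a data type whose elements are themselves mutable objects, copying only the container would give a shallow copy in which the original and the clone share their elements.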
But it is important to document the exact level of cloning that the method will perform.

Information hiding:
~~~~~~~~~~~~~~~~~~~
The implementor must hide all details of the implementation from the client. Typically all fields in the implementor's classes should be private. If the implementation consists of several interacting classes that need to access each other's data, some fields may be declared 'protected', still hiding the implementation from the user of the 'package'. (In Java, 'protected' members are visible within the package and to subclasses; omitting the modifier restricts access to the package alone.) Any helper methods not described in the specification should also be 'private', or at least 'protected'.

Designing algebraic specification:
---------------------------------
To better understand the algebraic specification of data types, we will go through an exercise of defining the algebraic specification for an immutable Queue data type. We started this exercise in this lecture and will continue it in the next one.
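To give a feeling for what such a specification might look like, here is one possible sketch. It is an assumption made for illustration, not the version developed in the course: the operation names ('empty', 'enqueue', 'isEmpty', 'head', 'tail'), the equations in the comments, and the classes are all hypothetical. The implementation mirrors the equations directly and makes no attempt to be efficient.

```java
import java.util.NoSuchElementException;

// Hypothetical immutable queue. The equations it is meant to satisfy:
//   isEmpty(empty())       = true
//   isEmpty(enqueue(q, x)) = false
//   head(enqueue(q, x))    = x                      if isEmpty(q)
//                          = head(q)                otherwise
//   tail(enqueue(q, x))    = empty()                if isEmpty(q)
//                          = enqueue(tail(q), x)    otherwise
abstract class ImQueue<T> {
    // Creator: the empty queue.
    static <T> ImQueue<T> empty() {
        return new EmptyQueue<>();
    }

    // Creator: a new queue with x added at the back; 'this' is not modified.
    ImQueue<T> enqueue(T x) {
        return new EnqQueue<>(this, x);
    }

    abstract boolean isEmpty();

    // REQUIRES: !isEmpty()
    abstract T head();

    // REQUIRES: !isEmpty()
    abstract ImQueue<T> tail();
}

class EmptyQueue<T> extends ImQueue<T> {
    boolean isEmpty() {
        return true;
    }

    T head() {
        throw new NoSuchElementException("head of an empty queue");
    }

    ImQueue<T> tail() {
        throw new NoSuchElementException("tail of an empty queue");
    }
}

class EnqQueue<T> extends ImQueue<T> {
    private final ImQueue<T> rest;   // the queue before the last enqueue
    private final T last;            // the most recently enqueued element

    EnqQueue(ImQueue<T> rest, T last) {
        this.rest = rest;
        this.last = last;
    }

    boolean isEmpty() {
        return false;
    }

    T head() {
        return rest.isEmpty() ? last : rest.head();
    }

    ImQueue<T> tail() {
        return rest.isEmpty() ? ImQueue.<T>empty() : rest.tail().enqueue(last);
    }
}
```

Each equation doubles as a black-box test case: for instance, enqueueing "a", "b", and "c" onto the empty queue should give a queue whose 'head' is "a" and whose 'tail' has 'head' "b", while the original queue values remain unchanged.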