%\documentstyle[11pt]{article} %\documentstyle[proc]{article} %\begin{document} %\title{Demeter: A CASE Study of Software Growth\\ %Through Parameterized Classes} %\author{Karl J. Lieberherr, Arthur J. Riel\\ %Northeastern University, College of Computer Science\\ %360 Huntington Avenue, Boston MA 02115} % %\maketitle %\input /fiona/csfaculty/lieber/papers/new-obj/common.tex %%\input /fiona/csfaculty/wand/tex/time.tex %\centerline{Texed at \thetime\ \today} \section{Inheritance (Grafting)} %\noindent {\bf The Use of Inheritance in Demeter (Grafting)} There are three ways of acquiring plants. The easiest and most expensive way is to buy mature plants. The major drawback (besides cost) of this method is the unavailability of many varieties. Many plant fanciers decide to grow their favorite varieties from seed. While the cost may be lower, the time involved can be quite substantial. In addition, there are many things which can go wrong as our seed takes the long trip from sprout to mature plant. In the world of software development we find we have the same two choices. Either we buy the necessary software from someone who has already developed it or we develop it ourselves. Again we have the same advantages and drawbacks. There is a fair chance that the available software will not be the specific variety we want or it may be of dubious quality. If we decide to develop it ourselves then we must be willing to wait a long time and there is always the chance of something going wrong during that long development process. Many horticulturists look to a third alternative called grafting. The process of grafting involves splicing together part of a desired plant with a host plant of a related type. This method has the advantage in that the cost is low and much of the risk of nurturing a seed to maturity is removed. Software developers are usually at a loss when they attempt this third alternative. While plants have a common interface (typically a woody tissue) software systems do not. The chances of two systems using the same underlying data structure, the same language, and being modular enough to take functions for reuse in a new system are very low. The Demeter system avoids these problems by providing a common language for the abstract data structure of each system, a means of easy language translation, and very modular coding. This latter quality is enforced by the underlying object-oriented technology and the Law of Demeter\footnote {The Law of Demeter states that messages may only be sent to self and objects which are an instance of a type which is associated with the instance variable types or argument types of the given method. This law is discussed in \cite{LHLR:law-paper}}. The Demeter system grafts one software system onto another through the use of the keywords $*$inherit$*$ and $*$override$*$ in the class descriptions. If an $*$inherit$*$ keyword prefixes a symbol it specifies that the class is to be ``copied'' into the host system. For the purpose of language definition the copying is explicit. For example, \bv A = Ident Number. B = String *inherit* A "end". \end{verbatim} implies that the language {\tt B} is defined by \bv B = String Ident Number "end". \end{verbatim} For the purpose of class definition we use the inheritance semantics of the underlying object-oriented system. In the above example we simply specify that class B inherits from class A. We make the assumption that the underlying \oos\ uses dynamic binding of the first argument, often called {\tt self}. This implies that the methods attached to superclasses of a class are available at that class. A more detailed discussion of inheritance is given later in connection with the definition of the signature of a class. The $*$override$*$ keyword is only used in conjunction with the $*$inherit$*$ keyword. It specifies that the class is to be inherited with some changes which will always be declared after the $*$override$*$ keyword. A similar approach is taken in Trellis/Owl \cite{schaffert:trellis-86}. Our motivation for using override is as follows: A subclass is made from a number of superclasses by specifying additional properties (instance variables and methods) that elements of the subclass must have compared to elements of the superclasses. It is also necessary to make a subclass in which some of the properties inherited from superclasses are modified. Inherited methods and instance variable types can be modified in a subclass, subject to some restrictions that insure that the inheritance is conceptually a specialization hierarchy. The ability to modify the type of instance variables in subclasses is restricted in such a way that static type checking of the class definitions is still possible. If this property is violated, a warning will be given. We want the Demeter system to be able to guarantee that if B inherits from A then all objects in class B may safely be treated as A-objects. If B is defined by inheriting from A and adding additional variables, then the B-objects may be treated safely as A-objects. If B is defined by inheriting from A and modifying some of the types of the inherited instance variables (i.e. using an $*$override$*$ clause) then the rule that the type must be specialized will insure that B-objects may be treated safely as A-objects. The B-objects will know about all the messages the A-objects know of. There is an alternate way of specifying inheritance in the Demeter system: we can use an alternation production with implied inheritance. Syntactically this is expressed by using the the *common* token after the alternatives. For example, \bv Object : Pen | Brick *common* Number. Pen = ... Brick = ... \end{verbatim} defines an abstract class (one which will never be instantiated) Object which is inherited into both Pen and Brick. Pen and Brick have a common instance variable called weight. It is helpful to think of these classes in terms of like relations. We say that a Pen is like a Brick in that they share a common data definition, namely, a weight. Alternation productions are the preferred way of introducing inheritance since it expresses the generalization principle in a natural way. Of course, for the purposes of grafting from previously developed projects, the explicit $*$inherit$*$ is necessary. Our class dictionary notation requires that we give our typing information precisely for instance variables. If we say that an instance variable x is of type A, where A is defined by a construction production then x cannot contain an instance of a subclass of A; it has to be an instance of A. If we want to store a subtype of A in x we cannot define A by a construction production; we have to use an alternation production which specifies the legal subtypes of A. For the definition of legality of objects and for subtyping, we need to define the set of classes associated\index{associated classes} with a given class. As an abbreviation of associated(S) we use S'. \begin{definition} For a construction, repetition or terminal class $S$, {\rm S'} is $S$. For an alternation class $S$, {\rm S'} consists of the union of the classes associated with the alternatives on the \rhs\ of the alternation production. \end{definition} For the following \dd, \bv Expression : Simple | Compound. Simple : Ident | Number. Compound = ... \end{verbatim} {\tt Expression'} consists of {\tt Ident}, {\tt Number} and {\tt Compound}. \begin{definition} A construction object is {\rm legal} with respect to a construction production if every instance variable contains an object which is an instance of a class associated with the instance variable class given in the production. There is an exception for optional instance variables: their value may be nil. A repetition object is {\rm legal} with respect to a repetition production if every element of the object is an instance of a class associated with the class on the \rhs\ of the repetition production. If the repetition production does not allow an empty list, the repetition object must contain at least one element. A terminal object is {\rm legal} if it contains a value which satisfies the rules defined in the scanner generator specification (i.e. its lexical definition). %(to be discussed later). \end{definition} For defining the subtype concept we need the following definitions: \begin{definition} A class T directly inherits from class S if the definition of T is of one of the following forms: \bv T = ... *inherit* S ... S : T | ... *common* ... \end{verbatim} \end{definition} \begin{definition} A class T inherits from class S if there is a sequence of intermediate classes which form direct inheritance links between class T and S. \end{definition} For example, the following production implies that Orange inherits from class EatableObject. Fruit is an intermediate class. \bv Fruit : Orange | Apple *common* *inherit* EatableObject. \end{verbatim} \begin{definition} The class T is compatible with class S if the classes associated with the class T are a subset of the classes associated with class S. \end{definition} \begin{definition} A class T is a subtype (or subclass) of class S if %T and S %are related by the transitive closure of the following relations: \begin{itemize} \item T inherits from S or \item T is compatible with S. \end{itemize} Every class is a subtype (or subclass) of itself. \end{definition} We can summarize the subtype definition by: \begin{quotation} \fbox{Subtype} = \fbox{Inheritance} or \fbox{Compatibility} \end{quotation} We have to motivate this definition. Consider the following example: \bv AutoDealer = List(Car). Car : GasolineCar *override* Gasoline | CombustionCar *override* GasolineOrCoal *common* "speed" Number *inherit* Machine. List(S) ~ {S}. GasolineCar = "gasolinecar". CombustionCar = "combustioncar". Machine = "age" Number "fuel" FuelType. FuelType : Coal | Gasoline | Electricity. GasolineOrCoal : Gasoline | Coal. Gasoline = "gas". Coal = "coal". Electricity = "electricity". \end{verbatim} We define a class {\tt Car} which inherits from class {\tt Machine}. The {\tt Car} class has two alternatives: {\tt GasolineCar} and {\tt CombustionCar}. Both override the inherited instance variable type for {\tt fuel}. A machine can have any type of ``fuel'', be it gas, coal or electricity. A combustion car can use only gasoline or coal. Since we want to consider a combustion car a special kind of car, we have introduced the compatibility into our definition of subtype. There are other reasons to have both compatibility and inheritance in the subtype definition: reduction of inheritance links, avoidance of multiple inheritance, efficiency. The main idea behind the subtype definition is that if a type knows about a message then all objects belonging to subtypes of that type also know about that message. We want to prove that any combination of compatibility and inheritance will support this condition. The combination of compatibility and inheritance tends to constrain the form of class dictionaries. This is partially due to the fact that we restrict inherited classes to those defined by construction productions. We follow several definitions by a case analysis on the legal structure of class dictionaries for each of the four compatibility/inheritance combinations. We will define several semantic rules to deal with class dictionary structures which violate our definition of subtype. To support our subtype definition we need need to introduce the concept of a signature. Informally, signature(S) for a class S consists of all messages m to which all classes in S' can react to by having a method attached to them or by inheritance. To define the concept of signature more formally, we use the following notation: \begin{definition} A class $S$ is {\em responsive} to message m if m is attached to S or there is a class T such that m is attached to T and there are direct inheritance links from S to T. A class $S$ knows message m if each class in S' is responsive to m. $Signature(S) = \{m | S$ knows m $\}$. \end{definition} %$signature(C) = \{m | m$ is a message to which any instance of $C$ will respond $\}$ % $attached(C)$ is the set of methods attached to $C$. We give now the rules for computing the signature of a class: We assume that the subtype graph does not contain cycles and we apply the following rules to a topologically sorted order of the classes. This computation does not take the user-defined methods into account. They are easy to add to the algorithm. We assume that the signature of the abstract classes universal, construction, repetition, and terminal are given. \begin{itemize} \item CONSTRUCTION: \bv S = A ... signature(S) = {name, set-name, ...} union signature(construction) S = *inherit* A. signature(S) = signature(A) union signature(construction). S = A ... *inherit* B *override* C ... signature(S) = {name1, set-name1, ...} union signature(B) union signature(construction). \end{verbatim} \item REPETITION: \bv S ~ {R}. signature(S) = signature(repetition) \end{verbatim} \item ALTERNATION: \bv S : A | B. \end{verbatim} \bv signature(S) = signature(A) intersection signature(B) \end{verbatim} \item ALTERNATION with implied inheritance: \bv S : A | B *common* X Y. A = Ident. B = Ident. signature(S) = (signature(A) intersection signature(B)) union {X, Y, set-X, set-Y} signature(A) = {b, set-b} union signature(S) union signature(construction). signature(B) = {c, set-c} union signature(S) union signature(construction). \end{verbatim} \end{itemize} %This completes the definition of signature. % \begin{definition} % A class S is said to ``know'' about a message m if m is a member of % Signature(S). % \end{definition} \begin{theorem} If {\rm T} is a subtype of {\rm S} then {\rm signature(T)} is a superset of {\rm signature(S)}. \end{theorem} To prove this theorem we need to define semantic rules which are easily checked at compile-time. We state those semantic rules in the proof at the place where they are needed to highlight the exact dependency of of the theorem on the semantic rules. The proof is divided into two parts. In the first part we prove that knowledge of messages and responsiveness to messages are transmitted properly along inheritance and compatibility links. In the second part we analyse the four cases of combining inheritance and compatibility and prove the theorem for each case. \begin{claim} If B inherits from A and A is responsive to m, then B is responsive to m. \end{claim} This follows immediately from the definition of responsiveness and the transitivity of inheritance. \begin{claim} If B inherits from A and A knows m then B knows m. \end{claim} We have only to discuss the case where A knows m, but is not responsive to m. This implies that A is defined by an alternation production and every class in A' is responsive to m. If B is in A' then B is responsive to m and therefore knows about m. If B is not in A', we have two cases: In the first case, B inherits from A using a *inherit* link. If we request that A be a construction class, we know that A must be responsive to m (which implies that B is responsive to m and therefore B knows m.) %If B is not in A', we must have that A=A'. %This holds if A is a construction, repetition or terminal class. %For reasons not mentioned here, we choose that A must be a construction class %and we get therefore the following semantic rule. Therefore we use the following semantic rule. \begin{semantic-rule} Inheritance with *inherit* is only allowed from construction classes. \end{semantic-rule} In the second case, B is a subclass of A through alternation productions with implied inheritance and is an alternation class itself. The claim follows from the definition of knowledge of a message. \begin{claim} If B is compatible with A and A is responsive to m, then B is responsive to m. \end{claim} A must be defined by an alternation production. If A is responsive to m then A has either m attached or inherits it. If A has m attached, we require that the alternation production has implied inheritance with the following rule: \begin{semantic-rule} If an alternation class has a method attached, it must be defined with implied inheritance (i.e. with *common*). \end{semantic-rule} \begin{claim} If B is compatible with A and A knows m then B knows m. \end{claim} A must be defined by an alternation production. We have to discuss the case where A knows m but is not responsive to m. Therefore all classes in A' are responsive to m. If B is in A' then B is responsive to m and therefore knows about m. If B' is a subset of A', the claim holds by the definition of knowledge of a message. %B must be in A', since otherwise %A = A' which in turn implies that knowledge of m and responsiveness %to m coincide. This completes the first part of the proof. In the second part we proceed by case analysis. We first discuss the relevant cases which we also use in a later theorem.. The first case occurs when we have a class C which inherits from B and the class B inherits from A. There are four possible class dictionary configurations which can produce this type of hierarchy. Each is presented in the following examples. \bv inheritance inheritance C <---------------------- B <---------------------- A case 1a: A = ..... B = ..... *inherit* A. C = ..... *inherit* B. case 1b: A = ..... B : C | .... *common* *inherit* A. case 1c: A : B | ..... *common*. B = ..... C = *inherit* B. case 1d: A : B | ..... *common*. B : C | ..... *common*. \end{verbatim} In the second case there exists a class C which is compatible with a class B and the class B is compatible with a class A. There are only two possible class dictionary templates which achieve this type of hierarchy. They are presented in the following examples. \bv compatibility compatibility C <---------------------- B <---------------------- A case 2a: A : W | X | Y | Z. B : X | Y | Z. C : X | Z. case 2b: A : C | ...... B : C. C = ..... \end{verbatim} The third case combines inheritance with compatibility such that there exists a class C which inherits from a class B and the class B is compatible with a class A. There are three possibilities here. In the second we use set notation to restrict the classes which appear on the right hand side of B to be a subset of the classes appearing on the right hand side of A. This is to ensure the compatibility relation. \bv inheritance compatibility C <---------------------- B <---------------------- A case 3a: A : B | ..... *common*. B : C | ..... *common*. case 3b: A : C | {X1 ... XN}. B : C | { y | y is in subset of {X1 .. XN}} *common*. C = ...... case 3c (disallowed by semantic rule 5.4): for guaranteeing transitivity, not needed for theorem 5.1. A : B | ...... B = ..... C = *inherit* B. \end{verbatim} The fourth and final case is a class hierarchy where a class C is compatible to a class B and the class B inherits from a class A. There are four cases where this can occur. \bv compatibility inheritance C <---------------------- B <---------------------- A case 4a: A = ...... B : X | Y | Z *common* *inherit* A. C : X | Z. case 4b (disallowed by semantic rule 5.3): A : B | ..... *common*. B : X | Y | Z. C : X | Y. case 4c: A = ...... B = ...... *inherit* A. C : B. case 4d: A : B | ...... *common*. B = ...... C : B. \end{verbatim} We now complete the proof of theorem 5.1. \medskip \begin{itemize} \item Case 1: A class C inherits from a class B and the class B inherits from a class A. \smallskip \bv inheritance inheritance C <---------------------- B <---------------------- A \end{verbatim} \begin{enumerate} \item True by definition of signature. \end{enumerate} \smallskip \item Case 2: A class C is compatible with a class B and the class B is compatible with a class A. \smallskip \bv compatibility compatibility C <---------------------- B <---------------------- A \end{verbatim} \begin{enumerate} \item C' $\subseteq$ B' \quad by def. of compatibility \goodbreak B' $\subseteq$ A' \quad by def. of compatibility \goodbreak C' $\subseteq$ A' \quad by transitivity of subset \smallskip \item A method :m which is known by A implies that all classes in A' have the method :m known to them. Since both B' and C' are subsets of A' then all classes in B' and C' must have :m attached to them. By the definition of ``knowing'', both B and C both know about the method :m. \end{enumerate} \smallskip \item Case 3: A class C inherits from a class B and the class B is compatible with a class A. \smallskip \bv inheritance compatibility C <---------------------- B <---------------------- A \end{verbatim} \begin{enumerate} \smallskip \item In case 3a B inherits from A and C inherits from B via the *common* alternation productions. Therefore, any method :m attached to A will be known by C since C inherits from A (by definition of inheritance). \item In the case 3b a method :m known to the class A must also be known to the class C by definition of ``know''. Similarly, by the definition of ``know'' the class B knows about :m since every member of B' knows about :m. \item Examining the case 3c shows that any method :m known to the class A must also be known to the class B by the definition of ``know''. Therefore, the class C will know about :m since it inherits from the class B. We will disallow this case via a semantic rule in the proof of transitivity of the subtype relation. \end{enumerate} \smallskip \item Case 4: A class C is compatible with a class B and the class B inherits from a class A. \smallskip \bv compatibility inheritance C <---------------------- B <---------------------- A \end{verbatim} For class structures of this type we must prove each of the five subcases individually (i.e. cases 4a-4e from the previous section). \smallskip \begin{enumerate} \smallskip \item In the first case (4a) any method :m known to the class A is also known to the class B by definition of inheritance. The classes X, Y, and Z (or more generally B') also know about the method :m since they inherit from the class B (via *common*). Since C is compatible to B the classes in C' must also appear in B', therefore all the classes in C' know about the method :m. We can now say that the class C knows about the method :m by the definition of ``know''. \item We disallow the second case (4b) and adopt a semantic rule to restrict our class dictionaries from having this type of structure. Such a semantic rule is clearly desirable. The result of this structure will be an abstract class (i.e. the uninstantiable class B) from which no other class inherits. Any method m attached to B can never be activated. Our semantic restriction does not decrease the expressive power of class dictionaries. In the worst case it will add come superfluous inheritance links. \begin{semantic-rule} In an alternation class with implied inheritance (i.e. using {\tt *common*}) the alternatives have to be defined by construction productions or by alternation productions which are eventually defined by construction productions. All alternation productions which define alternatives must also use implied inheritance. \end{semantic-rule} \item The third and fourth cases can be proven by observing that any method :m known to the class A is also known to the class B by the definition of inheritance (via *inherit* and *common* respectively). Since B knows about the method :m and C' contains only the class B we can say that C knows about the method :m by definition of ``know''. %\item %In the fifth case (4e) the class C directly inherits from A, therefore any %method :m known by the class A will also be known by the class C. \end{enumerate} \end{itemize} The subtype concept is used both for checking whether the type of an actual parameter of a method is a subtype of the formal parameter type and for checking the specialization hierarchy property for $*$override$*$. The above proof guarantees the desirable property of name compatibility for Demeter subtypes. We have proven stronger forms of compatibility, but a discussion of this topic is beyond the scope of this paper. We now prove the transitivity of the subtype relation by examining each of our four cases. This proof is needed since our subtype definition contains a disjunction (or). The subtype transitivity theorem is not essential for object-oriented programming. It is theorem 5.1 that is important. \begin{theorem} The subtype relation is transitive. \end{theorem} Proof by case analysis. \begin{itemize} \item If a class C inherits from a class B and the class B inherits from A then the class C inherits from the class A. \begin{enumerate} \item True by definition of inheritance. \end{enumerate} \item If a class C is compatible with a class B and the class B is compatible with a class A then the class C is compatible to the class A. \begin{enumerate} \item C' $\subseteq$ B' \quad by def. of compatibility \goodbreak B' $\subseteq$ A' \quad by def. of compatibility \goodbreak C' $\subseteq$ A' \quad by transitivity of subset \goodbreak C is compatible to A \quad by definition of compatibility \goodbreak \smallskip \end{enumerate} \item If a class C inherits from a class B and a class B is compatible to a class A then: \begin{enumerate} \item In case 3a we can prove the subtype relation in two ways. C is both compatible to A via the definition of compatibility and C inherits from A via the definition of inheritance. This occurs because B is both compatible and inherits from A and C is both compatible and inherits from B. \item In case 3b the class C is compatible to A by definition of compatibility. C' is a subset of A' by inspection. \item We disallow the case 3c with the following semantic rule. \begin{semantic-rule} A class which is used for inheritance with *inherit* cannot be used as an alternative in an alternation production without implied inheritance (i.e. without *common*). \end{semantic-rule} We need this restriction since we cannot reduce the combined subtype links in class dictionaries of this form to either compatibility or inheritance exclusively. Our restriction does not reduce the expressiveness of our notation. We can simply add implied inheritance to the alternation production in case 3c (i.e. we use *common*). \end{enumerate} The fourth and final case is a class hierarchy where a class C is compatible to a class B and the class B inherits from a class A. There are five cases where this can occur. We will analyze these separately. \begin{enumerate} \item In case 4a C inherits from A since all elements in C' inherit from A. \item The case 4b has been defined as illegal by a semantic rule. \item In cases 4c and 4d the class C inherits from class A since all elements in C' inherit from A. %\item %In the case 4e the class C directly inherits from the class A. \end{enumerate} \end{itemize}