%\documentstyle[11pt]{article} %\documentstyle[proc]{article} %\begin{document} %\title{Demeter: A CASE Study of Software Growth\\ %Through Parameterized Classes} %\author{Karl J. Lieberherr, Arthur J. Riel\\ %Northeastern University, College of Computer Science\\ %360 Huntington Avenue, Boston MA 02115} % %\maketitle %\input /fiona/csfaculty/lieber/papers/new-obj/common.tex %%\input /fiona/csfaculty/wand/tex/time.tex %\centerline{Texed at \thetime\ \today} \section{Inheritance (Grafting)} %\noindent {\bf The Use of Inheritance in Demeter (Grafting)} There are three ways of acquiring plants. The easiest and most expensive way is to buy mature plants. The major drawback (besides cost) of this method is the unavailability of many varieties. Many plant fanciers decide to grow their favorite varieties from seed. While the cost may be lower, the time involved can be quite substantial. In addition, there are many things which can go wrong as our seed takes the long trip from sprout to mature plant. In the world of software development we find we have the same two choices. Either we buy the necessary software from someone who has already developed it or we develop it ourselves. Again we have the same advantages and drawbacks. There is a fair chance that the available software will not be the specific variety we want or it may be of dubious quality. If we decide to develop it ourselves then we must be willing to wait a long time and there is always the chance of something going wrong during that long development process. Many horticulturists look to a third alternative called grafting. The process of grafting involves splicing together part of a desired plant with a host plant of a related type. This method has the advantage in that the cost is low and much of the risk of nurturing a seed to maturity is removed. Software developers are usually at a loss when they attempt this third alternative. While plants have a common interface (typically a woody tissue) software systems do not. The chances of two systems using the same underlying data structure, the same language, and being modular enough to take functions for reuse in a new system are very low. The Demeter system avoids these problems by providing a common language for the abstract data structure of each system, a means of easy language translation, and very modular coding. This latter quality is enforced by the underlying object-oriented technology and the Law of Demeter\footnote {The Law of Demeter states that messages may only be sent to objects which are of instance variable types or argument types of a given method. This law is discussed in \cite{LHLR:law-paper}}. The Demeter system grafts one software system onto another through the use of the keywords $*$inherit$*$ and $*$override$*$ in the class descriptions. If an $*$inherit$*$ keyword prefixes a symbol it specifies that the class is to be ``copied'' into the host system. For the purpose of language definition the copying is explicit. For example, \bv A = Ident Number. B = String *inherit* A "end". \end{verbatim} implies that the language {\tt B} is defined by \bv B = String Ident Number "end". \end{verbatim} For the purpose of class definition we use the inheritance semantics of the underlying object-oriented system. In the above example we simply specify that class B inherits from class A. We make the assumption that the underlying \oos\ uses dynamic binding of the first argument, often called {\tt self}. This implies that the methods attached to superclasses of a class are available at that class. A more detailed discussion of inheritance is given later in connection with signature definition. The $*$override$*$ keyword is often used in conjunction with the $*$inherit$*$ keyword. It specifies that the class is to be inherited with some changes which will always be declared after the $*$override$*$ keyword. Our motivation for using override is as follows: A subclass is made from a number of superclasses by specifying additional properties (instance variables and methods) that elements of the subclass must have compared to elements of the superclasses. It is also necessary to make a subclass in which some of the properties inherited from superclasses are modified. Inherited methods and instance variable types can be modified in a subclass, subject to some restrictions that insure that the inheritance is conceptually a specialization hierarchy. The ability to modify the type of instance variables in subclasses is restricted in such a way that static type checking of the class definitions is still possible. If this property is violated, a warning will be given. We want the Demeter system to be able to guarantee that if B inherits from A then all objects in class B may safely be treated as A-objects. If B is defined by inheriting from A and adding additional variables, then the B-objects may be treated safely as A-objects. If B is defined by inheriting from A and modifying some of the types of the inherited instance variables (i.e. using an $*$override$*$ clause) then the rule that the type must be specialized will insure that B-objects may be treated safely as A-objects. The B-objects will know about all the messages the A-objects know of. There is an alternate way of specifying inheritance in the Demeter system: we can use an alternation production with implied inheritance. Syntactically this is expressed by using the the *common* token after the alternatives. For example, \bv Object : Pen | Brick *common* Number. Pen = ... Brick = ... \end{verbatim} defines an abstract class (one which will never be instantiated) Object which is inherited into both Pen and Brick. Pen and Brick have a common instance variable called weight. It is helpful to think of these classes in terms of like relations. We say that a Pen is like a Brick in that they share a common data definition, namely, a weight. Alternation productions are the preferred way of introducing inheritance since it expresses the generalization principle in a natural way. Of course, for the purposes of grafting from previously developed projects, the explicit $*$inherit$*$ is necessary. Our class dictionary notation requires that we give our typing information precisely for instance variables. If we say that an instance variable x is of type A, where A is defined by a construction production then x cannot contain an instance of a subclass of A; it has to be an instance of A. If we want to store a subtype of A in x we cannot define A by a construction production; we have to use an alternation production which specifies the legal subtypes of A. For the definition of legality of objects we need to define the set of classes associated\index{associated classes} with a given class. \begin{definition} For a construction, repetition or terminal class $S$, {\rm associated(S)} is $S$. For an alternation class $S$, {\rm associated(S)} consists of the union of the classes associated with the alternatives on the \rhs\ of the alternation production. \end{definition} For the following \dd, \bv Expression : Simple | Compound. Simple : Ident | Number. Compound = ... \end{verbatim} {\tt associated(Expression)} consists of {\tt Ident}, {\tt Number} and {\tt Compound}. \begin{definition} A construction object is {\rm legal} with respect to a construction production if every instance variable contains an object which is an instance of a class associated with the instance variable class given in the production. There is an exception for optional instance variables: their value may be nil. A repetition object is {\rm legal} with respect to a repetition production if every element of the object is an instance of a class associated with the class on the \rhs\ of the repetition production. If the repetition production does not allow an empty list, the repetition object must contain at least one element. A terminal object is {\rm legal} if it contains a value which satisfies the rules defined in the scanner generator specification (i.e. its lexical definition). %(to be discussed later). \end{definition} For defining the subtype concept we need the following definitions: \begin{definition} A class T directly inherits from class S if the definition of T is of one of the following forms: \bv T = ... *inherit* S ... S : T | ... *common* ... \end{verbatim} \end{definition} \begin{definition} A class T inherits from class S if there are sequences of intermediate classes which form direct inheritance links between every class in Associated(T) and S. \end{definition} \begin{definition} The class T is compatible with class S if the classes associated with the class T are a subset of the classes associated with class S. \end{definition} \begin{definition} A class T is a subtype (or subclass) of class S if %T and S %are related by the transitive closure of the following relations: \begin{itemize} \item T inherits from S or \item T is compatible with S. \end{itemize} Every class is a subtype (or subclass) of itself. \end{definition} We can summarize the subtype definition by: \begin{quotation} \fbox{Subtype} = \fbox{Inheritance} or \fbox{Compatibility} \end{quotation} We have to motivate this definition. Consider the following example: \bv AutoDealer = List(Car). Car : GasolineCar *override* Gasoline | CombustionCar *override* GasolineOrCoal *common* "speed" Number *inherit* Machine. List(S) ~ {S}. GasolineCar = "gasolinecar". CombustionCar = "combustioncar". Machine = "age" Number "fuel" FuelType. FuelType : Coal | Gasoline | Electricity. GasolineOrCoal : Gasoline | Coal. Gasoline = "gas". Coal = "coal". Electricity = "electricity". \end{verbatim} We define a class {\tt Car} which inherits from class {\tt Machine}. The {\tt Car} class has two alternatives: {\tt GasolineCar} and {\tt CombustionCar}. Both override the inherited instance variable type for {\tt fuel}. A machine can have any type of ``fuel'', be it gas, coal or electricity. A combustion car can use only gasoline or coal. Since we want to consider a combustion car a special kind of car, we have introduced the compatibility into our definition of subtype. There are other reasons to have both compatibility and inheritance in the subtype definition: reduction of inheritance links, avoidance of multiple inheritance, efficiency. The main idea behind the subtype definition is that if a type knows about a message then all objects belonging to subtypes of that type also know about that message. We want to prove that any combination of compatibility and inheritance will support this condition. The combination of compatibility and inheritance tends to constrain the form of class dictionaries. This is partially due to the fact that we restrict inherited classes to those defined by construction productions. We follow several definitions by a case analysis on the legal structure of class dictionaries for each of the four compatibility/inheritance combinations. We will define several semantic rules to deal with class dictionary structures which violate our definition of subtype. To support our subtype definition we need need to introduce the concept of a signature. Informally, Informally, signature(S) for a class S consists of all messages m to which all classes in associated(S) can react to by having a method attached to them or by inheritance. To define the concept of signature more formally, we use the following notation: A class $S$ is {\em responsive} to message m if m is attached to S or there is a class T such that m is attached to T and there are direct inheritance links from S to T. A class $S$ knows about message m if each class in associated(S) is responsive to m. $Signature(S) = \{m | m$ knows m $\}. %$signature(C) = \{m | m$ is a message to which any instance of $C$ will respond $\}$ $attached(C)$ is the set of methods attached to $C$. We give now the rules for computing the signature of a class: We assume that the inheritance graph does not contain cycles and we apply the following rules to a topologically sorted order of the classes. \begin{itemize} \item CONSTRUCTION: \bv C = A ... signature(C) = {name, set-name, ...} C = *inherit* A. signature(C) = attached(C) union signature(A). X = A ... *inherit* B *override* C ... signature(X) = {name1, set-name1, ...} union signature(B). \end{verbatim} \item REPETITION: \bv S ~ {R}. signature(S) = {first, rest, append, etc.} \end{verbatim} \item ALTERNATION: \bv C : A | B. \end{verbatim} \bv signature(C) = signature(A) intersection signature(B) \end{verbatim} \item ALTERNATION with implied inheritance: \bv C : A | B *common* X Y. A = Ident. B = Ident. signature(C) = (signature(A) intersection signature(B)) union {X, Y, set-X, set-Y} signature(A) = {b, set-b} union signature(C). signature(B) = {c, set-c} union signature(C). \end{verbatim} \end{itemize} %This completes the definition of signature. \begin{definition} A class S is said to ``know'' about a message m if m is a member of Signature(S). \end{definition} \begin{theorem} If {\rm T} is a subtype of {\rm S} then {\rm signature(T)} is a superset of {\rm signature(S)}. \end{theorem} Proof by case analysis. We first discuss the relevant cases. The first case occurs when we have a class C which inherits from B and the class B inherits from A. There are five possible class dictionary configurations which can produce this type of hierarchy. Each is presented in the following examples. \bv inheritance inheritance C <---------------------- B <---------------------- A case 1a: A = ..... B = ..... *inherit* A. C = ..... *inherit* B. case 1b: A = ..... B : C | .... *common* *inherit* A. case 1c: A : B | ..... *common*. B = ..... C = *inherit* B. case 1d: A : B | ..... *common*. B : C | ..... *common*. case 1e (disallowed): A = ...... B : X | Y. X = *inherit* A. Y = *inherit* A. C = *inherit* B. \end{verbatim} In the second case there exists a class C which is compatible with a class B and the class B is compatible with a class A. There are only two possible class dictionary templates which achieve this type of hierarchy. They are presented in the following examples. \bv compatibility compatibility C <---------------------- B <---------------------- A case 2a: A : W | X | Y | Z. B : X | Y | Z. C : X | Z. case 2b: A : C | ...... B : C. C = ..... \end{verbatim} The third case combines inheritance with compatibility such that there exists a class C which inherits from a class B and the class B is compatible with a class A. There are three possibilities here. In the third we use set notation to restrict the classes which appear on the right hand side of B to be a subset of the classes appearing on the right hand side of A. This is to ensure the compatibility relation. \bv inheritance compatibility C <---------------------- B <---------------------- A case 3a: A : B | ..... *common*. B : C | ..... *common*. case 3b: A : C | {X1 ... XN}. B : C | { y | y is in subset of {X1 .. XN}} *common*. C = ...... case 3c (disallowed): A : B | ...... B = ..... C = *inherit* B. \end{verbatim} The fourth and final case is a class hierarchy where a class C is compatible to a class B and the class B inherits from a class A. There are five cases where this can occur. \bv compatibility inheritance C <---------------------- B <---------------------- A case 4a: A = ...... B : X | Y | Z *common* *inherit* A. C : X | Z. case 4b (disallowed): A : B | ..... *common*. B : X | Y | Z. C : X | Y. case 4c: A = ...... B = ...... *inherit* A. C : B. case 4d: A : B | ...... *common*. B = ...... C : B. case 4e: A = ..... B : X | Y | C. X = *inherit* A. Y = *inherit* A. C = *inherit* A. \end{verbatim} We now prove that our definition of subtype guarantees that if a class B is a subtype of a class A then the class B knows about at least as many methods as the class A. \medskip \begin{itemize} \item Case 1: A class C inherits from a class B and the class B inherits from a class A. \smallskip \bv inheritance inheritance C <---------------------- B <---------------------- A \end{verbatim} \begin{enumerate} \item True by definition of signature and the following semantic rule which disallows the case 1e. \begin{semantic-rule} An alternation class B may not be used for inheritance with *inherit*. \end{semantic-rule} This rule is implied by the more general rule: \begin{semantic-rule} Inheritance with *inherit* is only allowed from construction classes. \end{semantic-rule} Another motivation for this rule is to avoid message name conflicts. \end{enumerate} \smallskip \item Case 2: A class C is compatible with a class B and the class B is compatible with a class A. \smallskip \bv compatibility compatibility C <---------------------- B <---------------------- A \end{verbatim} \begin{enumerate} \item Associated(C) $\subseteq$ Associated(B) \quad by def. of compatibility \goodbreak Associated(B) $\subseteq$ Associated(A) \quad by def. of compatibility \goodbreak Associated(C) $\subseteq$ Associated(A) \quad by transitivity of subset \smallskip \item A method :m which is known by A implies that all classes in Associated(A) have the method :m known to them. Since both Associated(B) and Associated(C) are subsets of Associated(A) then all classes in Associated(B) and Associated(C) must have :m attached to them. By the definition of ``knowing'', both B and C both know about the method :m. \end{enumerate} \smallskip \item Case 3: A class C inherits from a class B and the class B is compatible with a class A. \smallskip \bv inheritance compatibility C <---------------------- B <---------------------- A \end{verbatim} \begin{enumerate} \smallskip \item In case 3a B inherits from A and C inherits from B via the *common* alternation productions. Therefore, any method :m attached to A will be known by C since C inherits from A (by definition of inheritance). \item In the case 3b a method :m known to the class A must also be known to the class C by definition of ``know''. Similarly, by the definition of ``know'' the class B knows about :m since every member of Associated(B) knows about :m. \item Examining the case 3c shows that any method :m known to the class A must also be known to the class B by the definition of ``know''. Therefore, the class C will know about :m since it inherits from the class B. We will disallow the class dictionary structure via a semantic rule in the following proof. \end{enumerate} \smallskip \item Case 4: A class C is compatible with a class B and the class B inherits from a class A. \smallskip \bv compatibility inheritance C <---------------------- B <---------------------- A \end{verbatim} For class structures of this type we must prove each of the five subcases individually (i.e. cases 4a-4e from the previous section). \smallskip \begin{enumerate} \smallskip \item In the first case (4a) any method :m known to the class A is also known to the class B by definition of inheritance. The classes X, Y, and Z (or more generally Associated(B)) also know about the method :m since they inherit from the class B (via *common*). Since C is compatible to B the classes in Associated(C) must also appear in Associated(B), therefore all the classes in Associated(C) know about the method :m. We can now say that the class C knows about the method :m by the definition of ``know''. \item We disallow the second case (4b) and adopt a semantic rule to restrict our class dictionaries from having this type of structure. Such a semantic rule is clearly desirable. The result of this structure will be an abstract class (i.e. the uninstantiable class B) from which no other class inherits. Any method m attached to B can never be activated. Our semantic restriction does not decrease the expressive power of class dictionaries. In the worst case it will add come superfluous inheritance links. \begin{semantic-rule} In an alternation class with implied inheritance (i.e. using {\tt *common*}) the alternatives have to be defined by construction productions or by alternation productions which are eventually defined by construction productions. All alternation productions which define alternatives must also use implied inheritance. \end{semantic-rule} \item The third and fourth cases can be proven by observing that any method :m known to the class A is also known to the class B by the definition of inheritance (via *inherit* and *common* respectively). Since B knows about the method :m and Associated(C) contains only the class B we can say that C knows about the method :m by definition of ``know''. \item In the fifth case (4e) the class C directly inherits from A, therefore any method :m known by the class A will also be known by the class C. \end{enumerate} \end{itemize} The subtype concept is used both for checking whether the type of an actual parameter of a method is a subtype of the formal parameter type and for checking the specialization hierarchy property for $*$override$*$. The above proof guarantees the desirable property of name compatibility for Demeter subtypes. We have proven stronger forms of compatibility, but a discussion of this topic is beyond the scope of this paper. We now prove the transitivity of the subtype relation by examining each of our four cases. This proof is needed since our subtype definition contains a disjunction (or). \begin{theorem} The subtype relation is transitive. \end{theorem} Proof by case analysis. \begin{itemize} \item If a class C inherits from a class B and the class B inherits from A then the class C inherits from the class A. \begin{enumerate} \item True by definition of inheritance and since the case 1e is disallowed by a semantic rule. \end{enumerate} \item If a class C is compatible with a class B and the class B is compatible with a class A then the class C is compatible to the class A. \begin{enumerate} \item Associated(C) $\subseteq$ Associated(B) \quad by def. of compatibility \goodbreak Associated(B) $\subseteq$ Associated(A) \quad by def. of compatibility \goodbreak Associated(C) $\subseteq$ Associated(A) \quad by transitivity of subset \goodbreak C is compatible to A \quad by definition of compatibility \goodbreak \smallskip \end{enumerate} \item If a class C inherits from a class B and a class B is compatible to a class A then: \begin{enumerate} \item In case 3a we can prove the subtype relation in two ways. C is both compatible to A via the definition of compatibility and C inherits from A via the definition of inheritance. This occurs because B is both compatible and inherits from A and C is both compatible and inherits from B. \item In case 3b the class C is compatible to A by definition of compatibility. Associated(C) is a subset of Associated(A) by inspection. \item We disallow the case 3c with the following semantic rule. \begin{semantic-rule} A class which is used for inheritance with *inherit* cannot be used as an alternative in an alternation production without implied inheritance (i.e. without *common*). \end{semantic-rule} We need this restriction since we cannot reduce the combined subtype links in class dictionaries of this form to either compatibility or inheritance exclusively. Our restriction does not reduce the expressiveness of our notation. We can simply add implied inheritance to the alternation production in case 3c (i.e. we use *common*). \end{enumerate} The fourth and final case is a class hierarchy where a class C is compatible to a class B and the class B inherits from a class A. There are five cases where this can occur. We will analyze these separately. \begin{enumerate} \item In case 4a C inherits from A since all elements in Associated(C) inherit from A. \item The case 4b has been defined as illegal by a semantic rule. \item In cases 4c and 4d the class C inherits from class A since all elements in Associated(C) inherit from A. \item In the case 4e the class C directly inherits from the class A. \end{enumerate} \end{itemize}