\documentstyle[12pt]{article}
%\documentstyle{article}
%\documentstyle[proc]{article}
%\documentstyle[proc]{article}
%\pagestyle{headings}

\begin{document}
\bibliographystyle{alpha}
\raggedbottom

%\title{Formulations and Benefits\\ of the Law of Demeter}
\title{Preventive Maintenance of\\ 
Object-Oriented Software}
\author{Karl J. Lieberherr, Ian Holland\\
Northeastern University, College of Computer Science\\
Cullinane Hall, 360 Huntington Ave., Boston MA 02115\\
lieber or holland@corwin.CCS.northeastern.EDU}

\maketitle
\input time.tex 
%\centerline{Texed at \thetime\ \today}
%\centerline{Copyright \copyright 1988 Karl Lieberherr}
\input common.tex

\begin{abstract}
Our initial ideas on preventive maintenance of object-oriented
systems were presented in \cite{LHLR:law-paper} which describes
a set of guidelines called the ``Law of Demeter''.\footnote{\demsign\/
is the name of the object-oriented CASE system we are working on \cite{lieber-riel:oop}, \cite{karl1:class}, \cite{karl:demeter}.}
As a result of this publication we have received much feedback, as
well as a detailed critique from \cite{sakkinen:law-88}.
In this paper we expand on the initial ideas and present a new
perspective with which to view the Law. This perspective is based
on client/supplier relationships between methods and classes
and allows a cleaner description of the benefits of the Law.
This paper is also a reply
to \cite{sakkinen:law-88} which was pointing out that the formulation
of the Law for C++ needs additional work.
\end{abstract}

\begin{quote}
At first sight, the idea of any rules or principles being 
superimposed on the creative mind seems more likely to hinder  than to
help, but this is really quite untrue in practice.
Disciplined thinking focusses inspiration rather than blinkers it.

{\em "G.L. Glegg: "The Design of Design"}
\end{quote}

\noindent 
{\bf Keywords:} Object-oriented programming style, preventive program maintenance, information
hiding and restriction, C++, CLOS, Eiffel, Flavors,
Smalltalk-80.
\section{Introduction}

In our work with \oo\ techniques we have become
interested in the issues of \oo\ programming style and preventive
maintenance of \oo\ systems.  Our initial ideas were presented in 
\cite{LHLR:law-paper} which describes a set of guidelines 
called the ``Law of Demeter''. 
This ``Law'' promotes a restriction on message
sendings to objects.  As a result of this publication we have received
much feedback as well as a detailed critique from \cite{sakkinen:law-88}.
In this paper we expand on the initial ideas and present a new perspective
with which to view the Law.  The initial papers dealt primarily with
a pure \oo\ system and here we discuss issues concerning mixed paradigm
languages (e.g., C++ \cite{stroustrup:c++}) and other variations on the \oo\ theme (e.g., CLOS \cite{clos:bobrow-88}).

   As the initial formulation of the Law was stated in the terminology
of the Demeter system, we present formulations in notation specific to
several \oo\ programming languages.  We hope that these formulations
will aid the 
understanding of the ideas and motivations behind the Law.  However,
we do not pretend to be experts in all the \oo\ programming languages
in use today and so we do not consider these formulations as cast in
concrete.  We welcome responses like \cite{sakkinen:law-88} to iron out the programming
language specific wrinkles.

   The Law of Demeter as presented in \cite{LHLR:law-paper} was proposed as one of a
number of \oop\ guidelines which we want to follow and teach.  Since our
original paper was submitted other works, e.g., \cite{johnson-foote:rules},
\cite{smith:principles},
have appeared
which deal with the issues of programming style, data 
encapsulation and preventative maintenance.  
%The \oo\ paradigm is a powerful
%one and it is clear that we are not alone in investigating how it can be
%used to its full potential.

\section{A question of notation}

It is our intention that the ideas presented in \cite{LHLR:law-paper}
and in this
paper be as generic as possible and not be bound to any particular
\oop\ language.  The Demeter EBNF\footnote{Extended Backus-Naur Form,
\cite{wirth:ebnf}.} \cite{lieber-riel:oop}
used
to describe the class hierarchies in the examples is a compact, 
programming language independent notation.  This approach of using a language definition 
language to describe class hierarchies has been lately taken up by
others (e.g., \cite{wegner:concept} and \cite{madsen-norgard:hicss-88}) 
though in a slightly 
different form.  
We use the Demeter EBNF as a high-level \oo\ data structure
description language rather than using pages of bubble diagrams or
some specific \oo\ language.  In the
Demeter system these descriptions are used directly to generate
Lisp/Flavors and C++ language specific class definition code.

%   As a further clarification of the terminology used, we will sometimes
%refer to the ``type'' of an object.
%By this we mean the class of which
%the object is an instance.  
%So a type is any concept which would translate
%into a class definition.  We distinguish this concept from the typing
%system 
%belonging to a particular object orinted programming language (e.g. int, char
%or lisp list).  We call the latter `base types'.(C++ conflict)
%This allows us to deal with
%issues concerning strictly base typed languages such as C++ and base type free
%languages such as Smalltalk.  In this terminology we can now discuss the types
%of objects in Smalltalk and Lisp/Flavors in a meaningful way.
%
%The Law in terms of Clients etc and the dependancy links between classes
%
%discussion about object creation constructor functions etc
%
%discussion about return objects from method calls and documentation
% requirements
%
%discussion of meta classes 
%
%discussion about the type version and object version
%
%discussion about the law for multiparadigm
%
%discussion about the law for multi arguement method selection.
%
%problems with the law
%
%conclusion.
\section{Clients, suppliers and dependencies}

In this section we explain the Law of Demeter from
a new point of view by adapting 
the client/supplier terminology from \cite{meyer:book-88}.
This terminology is useful for explaining the consequences of the Law
and also for formulating it. Before we get into the technical details,
we give intuitive formulations of the Law and we
outline the space of possibilities for formulating the Law.

The goal of the Law of Demeter is to organize and reduce dependencies
between classes.
We can informally summarize the Law with the following three formulations:

\begin{itemize}
\item
Each method can only send messages to a limited set
of objects, namely to the {\em argument objects} and to the {\em immediate
subparts} of the class to which the method is attached.
 
\item
 Each method is ``dependent'' on a limited set of objects 
(organize dependencies).
  
\item
  Each method ``collaborates'' with a limited set of objects
  (organize collaborations).
\end{itemize}


To formulate the Law we can choose from the following
independent possibilities. We refer the reader to \cite{LHLR:law-paper}
for more details.
\begin{itemize}
\item 
object/class: We can formulate the Law in terms of objects  or in terms 
of classes. The class formulation is intended to be compile-time
checkable. 
%The class formulation is subdivided into the ``pure classes''
%and the ``associated classes'' formulation. The ``pure classes'' formulation
%is easier to understand and does not require any Demeter terminology.
The ``object'' formulation is intended to be followed conceptually
and not to be enforced statically by a tool.

\item
messages/generic functions/methods: We can formulate the Law in terms
of message sending, generic function calls or methods.
The generic function call formulation distinguishes between two alternatives:
one method selection arguments and several method selection arguments.

\item
weak/strong: Depending on how we interpret the term ``instance variable''
we get two different formulations of the Law. If we interpret it as all instance variables, including the inherited ones, we get the weak form of the Law.
If we exclude the inherited instance variables, we get the strong form 
of the Law.
\end{itemize}

In this section we use the ``pure classes'' and ``message sending''
terminology for explaining both the weak and strong
Law of Demeter and its consequences.
We then explain the translation to C++, using some of the other
formulations.

The discussion in this section concentrates on a formulation of the
Law which can be enforced statically or
at ``compile-time''. This is useful for
statically typed languages with user-defined type declarations,
such as C++.
It is also useful for dynamically typed languages, such as Lisp/Flavors and 
Smalltalk, 
with the Demeter type system imposed on them.
In this case, the classes are defined by a class dictionary and
the signatures of methods are
   described by Demeter type notation.
   The type of every statement or expression can be inferred at compile-time
   through type inferencing. 

We adopt the following terminology from \cite{sakkinen:law-88}:
Given a style rule and a compile-time checking algorithm for checking
the rule, and
a suitable constant \(c\), 
we say that a program of size \(s\) is 
\begin{itemize}
\item
in good style, 
if the checking algorithm proves in time \(c s^2\)
that the program obeys the rule;
\item
in bad style, 
if the checking algorithm proves in time \(c s^2\)
that the program disobeys the rule;
\item
in questionable style otherwise.
\end{itemize}

The choice of a quadratic upper bound on the running time
of the style checker is arbitrary, but indicates that we
look for a formulation of the Law of Demeter which can be enforced 
efficiently.
The class version of the Law of Demeter can be enforced in
linear time for languages such as C++
and therefore it does not leave C++ programs in questionable
style. However the object version of the Law is difficult
to enforce statically due to aliasing complications and therefore
it leaves some programs in questionable style. Nevertheless,
the object version is useful as a conceptual formulation of the Law:
It is a goal to be approximated in programming.

To formally introduce the Law and its consequences, we need a few definitions.
Those definitions are for a pure object-oriented system, such as 
Smalltalk-80. Later when we translate the Law to other languages, such as C++,
we deal with the mixed paradigm wrinkles.

\subsection{Definitions}

\begin{definition}
A B-object is an object belonging to class B.
\end{definition}

\begin{definition}
The protocol of a class C is the set of messages that can be sent
to C and its instances.
\end{definition}

\begin{definition}
A method {\em uses the protocol of a class C} if inside the method a message
is sent to a C-object, to C, or if an instance variable of C is accessed
or modified.
\end{definition}

\begin{example}
The protocol of class A consists of messages m1 and m2.
\bv
B = <x> Ident.
METHOD m1 (<self> B) Ident.
A = *inherit* B.
METHOD m2 (<self> A) Number.
\end{verbatim}
\end{example}

We use the following terminology: A (construction) class is defined
by a production of the form
\bv
A = B <x> C [D] E.
\end{verbatim}
This means that an object of class A has four parts:
an object of class B, called B, an object of class C, called x,
optionally an object of class D, called D and an object of class E.
For further details, see \cite{lieber-riel:oop}.

For method headings we also use the construction class notation. In the
above example, method m1 has one argument, called self, which
is of type B. The return type of m1 is class Ident.
To send a message, we use generic function call notation: e.g.,
f(s) is the same as ``send s the message f''.

We adapt the client/supplier terminology from \cite{meyer:book-88} to
our needs.

\begin{definition}
Method M is a client of class B and B is a supplier of method M,
if a B-object or B itself are sent a message in method M or if an
instance variable of B is accessed or modified.
\end{definition}

According to this definition, a method M which gets passed an object
of class B and which passes the object further without directly sending
a message to the object or to B, is {\em not}
viewed as a client of class B.

We use in the last definition the Smalltalk-80
approach which allows us to send a message to a class.
Sending a message to a class B activates a class method
of B (a method of B's ``metaclass'') which might, e.g., create instances of B or initialize a class
variable of B.

\begin{example}
\emptyspace
\bv
B = <x> Ident.
METHOD m1 (<self> B) Ident.
\end{verbatim}
If method m1 sends a message to x then m1 is a client of Ident.
\end{example}

Next we define potential preferred supplier classes of method M.
These classes are potentially suppliers of method M. A potential preferred
supplier class B of method M becomes a supplier and a preferred supplier
of M, if M chooses to 
send a message to B or one of its instances.

\begin{definition}
Class B is called a potential
preferred supplier of method M (attached to class C) if
one of the following conditions holds:
\begin{itemize}
\item
  B is an instance variable class of C or
\item
    B is an argument class of M, including C or
\item
        B is a class of objects created in M (directly or indirectly by
        calling a method which creates and returns a new object) or
\item
          B is a class of a global variable used in M.
\end{itemize}
\end{definition}

%One might be tempted to also view a class from which C 
%inherits as a preferred supplier of method M. Our definition, however, excludes
%those ``inheritors'' from the potential preferred suppliers. 

Next we define potential preferred client methods of class B.
These methods are potentially clients of B. A potential preferred
client method M of class B becomes a client and a preferred client of B,
if M sends a message to B or one of B's instances.

\begin{definition}
Method M is called a potential preferred client of class B if 
class B is a potential preferred supplier of method M.
\end{definition}

\begin{example}
In the following 5 examples,
class B is a potential
preferred supplier of method M and M is a potential
preferred client of B.

\bv
instance variable class:
// C = <s> B ...
METHOD M (<self> C ...)
  { ... f(s) ... }

argument class:
METHOD M (<self> C <s> B ...)
  { ... f(s) ... }

argument class:
METHOD M (<self> B ...)
  { ... f(self) ... }

newly created object class:
METHOD M (...)
  { ... f(new(B)) ... }  

// s is of type B, global to M
METHOD M (<self> A ...)
  { ... f(s) ... }
\end{verbatim}
\end{example}

For describing the benefits of the Law we also need the following 
definition
which describes a special kind of clients 
and and a special kind of suppliers.

\begin{definition}
Class B is a preferred supplier of method M and M is a preferred client
of B if B is both a supplier of M
and a potential preferred supplier of M.
\end{definition}

In the last example (all 5 subcases),
class B is not only a potential preferred supplier of
method M but also preferred supplier.

%We can prove the following facts:
We notice that
class B is a potential
preferred supplier of method M attached to B. 
%If M sends a message to
%a B-object then B becomes a preferred supplier of M.
Also, method M attached to class B is a potential preferred client of class B.

\subsection{The Law}

\begin{definition}
Law of Demeter (class version):
In all methods M use only the protocol of the potential preferred
supplier classes of M.
%In all classes C and in all methods M attached to C, use only the protocol of
%the potential preferred suppliers of M.
\end{definition}


Since the protocol of a class includes both the messages
to objects and to the class, the Law restricts the use of the
protocol of classes and metaclasses.
Restricting the protocol of metaclasses might be misleading since
we need to be able to make instances of classes wherever we like. 
Indeed, a close inspection of the Law reveals that it does not
impose any restriction on where we are allowed to make new instances.

%\begin{fact}
%Reformulation 1 of Law of Demeter (selective information hiding):
%For all classes B,
%only a potential preferred client method of class B is allowed 
%to use the protocol of class B.
%\end{fact}

The following reformulation of the Law of Demeter shows the protocol
restriction aspects.

\begin{fact}
The following statement is equivalent to the Law of Demeter (class version):
For all classes B,
the protocol of class B can only be used in potential 
preferred client methods of B.
\end{fact}

\noindent
Note that B's protocol includes the protocols of B's superclasses,
although the methods of B's superclasses are not necessarily
potential preferred client methods of B.

To underline the dependency control aspects of the Law, we
need the following definition:

\begin{definition}
A method M is dependent on class C, if M sends a message
to a C-object or to C.
\end{definition}

\begin{fact}
Method M is dependent on B if and only if M is a client of B.
Method M is dependent on class B if and only if
M uses B's protocol.
\end{fact}

The following reformulation of the Law of Demeter shows the dependency
control aspects.

\begin{fact}
The following statement is equivalent to the Law of Demeter (class version):
Only a potential preferred client method M of class B is allowed 
to be dependent on class B.
\end{fact}

%If A is a client of B then A is not necessarily dependent on B since
%no method of A might use A's protocol. (An object of type A might
%be passed as second argument and never sent a message.)

%The Law reduces the dependency between classes since only a preferred
%client class of a class B is allowed to use the protocol of B.
%Using the dependency concept, we can give the following equivalent
%formulation of the Law:

%\begin{fact}
%Only a preferred client class of B is allowed to be dependent
%on B.
%\end{fact}

%Yet another definition:
%\begin{fact}
%In any method M attached to class C use only the protocol
%of classes of which C is a preferred client.
%\end{fact}

%The following formulation of the Law also shows the protocol restriction
%aspects.
%
%\begin{fact}
%The following statement is equivalent to the Law of Demeter:
%In any method M use only the protocol
%of potential preferred suppliers of M.
%\end{fact}

%The current definition of preferred client and supplier does not
%deal with associated classes.
%Refined definition:
%
%\begin{definition}
%A is a preferred client of class B if B sends a message to an instance 
%The preferred clients A of class B are the classes which
%send a message
%to an instance of the classes associated with the following classes:
%  instance variable classes of B or
%  argument classes of methods of B or
%  classes from which B inherits or
%  classes which create
%
%A inherits from B or
%       B creates an object of class A or
%         B has a method which uses a global variable which is of class A.
%           Every class is a preferred client and a 
%           preferred supplier of itself.
%\end{definition}

To formulate a stronger version of the Law
of Demeter, we give here the Law in expanded form:

\begin{fact}
Law of Demeter (class version):
In all methods M attached to class C use only the protocol of the 
following classes:

\begin{enumerate}
\item
  instance variable classes of C or
\item
    argument classes of M, including C or
\item
classes of objects created in M (directly or indirectly) or
\item
classes of global variables used in M.
\end{enumerate}
\end{fact}

\begin{definition}
Strong Law of Demeter (class version):
If we interpret in the Law of Demeter
``instance variable classes of C'' as meaning the
classes of instance variables defined for C explicitly and not
the inherited ones, we get the strong Law of Demeter.
\end{definition}

\subsubsection{Conceptual guideline}
So far we discussed the class version of the Law of Demeter which can be
enforced at compile-time. The price we pay for compile-time enforceability
is that some programs which violate the ``spirit'' of the Law will pass
the test or that some programs which follow 
the ``spirit'' of the Law will be rejected.
Therefore we present now the object version of the Law which expresses
the spirit of the Law and which serves as a conceptual guideline to be
approximated in programming. We first need to 
define potential preferred supplier objects.

\begin{definition}
The potential preferred supplier objects of method M which is
attached to class C, are:
\begin{itemize}
\item
the immediate parts of C or
\item
the argument objects of M or
\item
the objects created in M (directly or indirectly) or
\item 
objects in global variables.
\end{itemize}
\end{definition}
Notice that we use the phrase ``immediate subparts of C'' whose
sensible interpretation is left to the user. E.g., the immediate parts of a list class
are the elements of the list. The immediate parts of a ``regular'' class
are the objects stored in instance variables.

\begin{definition}
Law of Demeter (object version): In all methods M send only messages to
potential preferred supplier objects of M.
\end{definition}

The object version of the Law expresses {\em what we really want}, but
unfortunately it is hard to enforce at compile-time \cite{LHLR:law-paper}.
The object version serves as an additional guide whenever the class version of the Law accepts a program which appears to be in bad style or when the class
version of the Law rejects a program which appears to be in good style.

\subsection{Benefits of the Law}
Now we are ready to state important benefits of the Law. We first give simple
benefits; the most significant one is the last one.

\begin{fact}
If the weak or strong Law of Demeter is followed and if
class A's protocol is renamed then at most the preferred client  methods of 
A and A's subclasses require modification.
\end{fact}

By the ``renaming'' of a protocol we mean the changing of some message
names.

A class consists of objects, the protocol and the composition.
By the composition of a class we mean the detailed structure of
its parts, subparts etc.

%By a constructor of a class A we mean a function or method which
%returns a new object of class A taking the some or all subparts as
%arguments.

\begin{fact}
If the Law of Demeter is followed and
if the composition of class A changes (not modifying A's protocol),
then at most the
methods of A and A's subclasses need modification.
\end{fact}

By ``not modifying A's protocol'', we mean that the public interface of A
remains invariant.

%following code needs modification:
%\begin{itemize}
%\item
%constructors of A and its subclasses and
%\item 
%methods of A and
%its subclasses.
%\end{itemize}


\begin{fact}
If the strong Law of Demeter is followed and
if the composition of class A changes (not modifying A's protocol),
then at most the preferred client methods of A need 
modification.
\end{fact}

%keeping the protocol
%invariant, then at most the the constructors and methods
%of A need modification.

It is interesting to notice that traditional information hiding can 
achieve the same as the strong Law in the special case when the composition
of a class is completely 
hidden. The composition of a class B is completely hidden if
all instance variables of B are private (in the C++ sense)
and any change to the instance variables does not affect the public
protocol of B.
In this special case we can state: {\em If the composition of a class is
completely hidden and if the composition of class A changes then at most
the methods attached to A need modification}.

Now we state an important 
benefit of the Law in the case the protocol of a class changes:

\begin{fact}
If the weak or strong Law of Demeter is followed and if the protocol of class A
changes,
% (keeping the composition invariant), 
then only preferred client methods of A and its subclasses
need to be modified and
only methods in the set of potential preferred clients of
A and its subclasses need to be added.
\end{fact}

We can also informally state the following benefit which shows how the
Law controls the complexity of programming:
{\em
When reading a method M, the programmer has only to be aware
of the preferred supplier classes of M.
}
The preferred suppliers of a method M are usually a small subset of all
classes of the application and furthermore, they are ``closely related''
to the class to which M is attached. This relationship makes it
easier for the programmer to remember those classes and their functionality.

\subsection{Justification}

Having seen the benefits of the Law, we have to justify now the
4 clauses of the Law. Clearly, we still get the benefits described above,
if we delete all four clauses. But then we could no longer write
useful programs. We justify now each clause of the potential
preferred supplier definition. The goal is to convince the
reader that we need all 4 clauses for writing
useful programs:
\begin{itemize}
\item
instance variable classes: If we cannot use the protocol of instance variable
classes, we cannot delegate to subparts.
\item
argument classes: If we cannot use the protocol of argument classes,
the classes could not collaborate through arguments: arguments
could only be passed without ever be sent a message.  
\item
classes of objects created: Why do we need to send messages to newly
created objects? \cite{sakkinen:law-88} writes: ``including them
seems to run counter to the purposes of the Law, viz. to simplify
modifications and to simplify the complexity of programming''.
Consider a purely object-oriented language in which we can write
\((*(+ 2 ~~ 3) 4 )\). Here the addition of 2 and 3 returns a new
object, namely 5 to which we send the multiply message. Clearly,
we want not to exclude such expressions.

Another reason is that the transformation of an instance variable
access method
into a method which computes the value from other instance variables
should not affect the adherence to the Law.

\item
classes of global variables:
Some object-oriented languages, such as Eiffel\cite{meyer:book-88}, don't even have global
variables and for those this clause is not needed. In languages
which have global variables we allow messages being sent to
the objects for programming convenience. Certainly, the old
rule applies: global variables are considered harmful.
\end{itemize}

\subsection{Translation to C++}

C++ does not have messages, therefore we formulate the
Law in terms of methods which are called member functions in C++.
We also refine the Law to restrict access to data members.
C++ does not have metaclasses, but it has constructors and static
data members instead. Therefore we drop the use of metaclasses from the
C++ version of the Law.

%We make the assumption that all data members are private and that their
%values are accessible and settable through member functions. This
%is a good approach for large applications and simplifies the
%formulation of the Law for C++.
%Since in C++ not only member functions but also friend functions

\begin{definition}
In all member functions M of class C use only members (function and data)
of the following classes and their base classes:
\begin{itemize}
\item
C
\item
data member classes of C 
\item
argument classes of M
\item
classes of objects created in M 
(directly,
      by calling a constructor function or indirectly,
            by calling a function which will call
                  a constructor function and return a new object)
\item
classes of global variables used in M.
\end{itemize}
\end{definition}

Using the potential preferred client classes terminology, we can abbreviate the
Law for C++:
\begin{definition}
In all member functions M use only members (function and data)
of the potential preferred supplier classes
and their base classes.
\end{definition}

We have to include the phrase
``their base classes'' since the formulation in terms
of the protocol implicitly involves inheritance.
Another reason for using ``their base classes'' is that C++ has the
:: scope resolution operator to call member functions which are hidden.
This allows to call a member function of a base class directly in the
code of a derived class, although the 
member function might have been redefined (overriden).

% In C++: use virtual functions if we want dynamic member function selection.
% Need virtual for alternations.
% A = *inherit* B. here we don't need virtual functions; still
% get inheritance.

C++ supports the friend function concept. A friend function of a class
has privileged access to the private data members in the same
way as a member function of the class. A function can be a friend of several
classes. Friend functions are provided for efficiency reasons.
We have decided not to include friend functions in the formulation of the
Law of C++. When friend functions are used and the composition
of a class B changes then  potentially all friend functions of B
need to be modified. Users of friend functions have to be aware
of the dependencies which they create.

C++ allows non-class types for
data members. We need to explain
how the Law applies to such classes. To represent objects of
variable size, it is necessary to use pointers. When a data member d is
of type B* d, where B is some class, we view B as a data member class
for the purpose of the Law. 
When a data member is a vector of pointers to D-objects, then D
is viewed as a data member class.
Similarly, when a data member is a vector of D-objects, then D
is viewed as a data member class.

C++ allows overloading of function names
with respect to all arguments.
But run-time dynamic member function
selection is done only with the object whose member function
is called (for virtual member functions).
Therefore we don't allow in the formulation of the Law
for C++ to call member functions of the
argument classes of a member function.

%In the Demeter system, we use non-class data members in a very restricted
%way. Either they are are of a built-in data type of C++ (e.g., int, char


Next we discuss the translation of benefits:
We have to assume that {\em data member names and calls to member functions
are not used outside of member functions, except in the main program (the function main())}. This assumption
is necessary for two reasons:
\begin{itemize}
\item
C++ is a mixed paradigm language. Therefore member functions
can be used in regular functions.

\item
C++ has friend functions which have privileged access to data members.
\end{itemize}

Under this assumption we can prove:

\begin{fact}
If the weak or 
strong Law of Demeter is followed and if the member functions
of class B change,
then at most the main program and the preferred client member functions of B
and its subclasses 
need to be modified and only member functions in the set of potential preferred
clients of B and B's subclasses need to be added.
\end{fact}

The preferred client member functions of class B are
member functions which use a member (function or data) of a B-object or
a constructor or destructor of B and which
\begin{itemize}
\item
are member functions of classes who have B or a pointer type to B (or an array of those types)
as a data member type or
\item
which are member functions which have B or a pointer type to B (or an array of those types)
as an argument type or
\item
which call a constructor of class B (directly or indirectly) or
\item
which use a global variable of type B or a pointer type to B (or an array of those types).
\end{itemize}

\subsubsection{Violations of the Law}

%In this subsection we don't assume that all data members are private.
Consider the following fragment of
a C++ program:
%see file viol.c
\bv
class MicroFicheFiles {
public:
  boolean search(Book* book) {}
};

class Documents {
public:
  boolean search(Book* book) {}
};

class Archive {
public:
  MicroFicheFiles* m;
  Documents* d;     
  boolean search_good_style(Book* book) {
    return
      (m->search(book)) ? true :
        d->search(book);
  }
};

class ReferenceSec {
public:
  Archive* a;
  boolean search_bad_style(Book* book) {
    return
      (a->m->search(book)) ? true :
        a->d->search(book);
  }
  boolean search_good_style(Book* book) {
    return a->search_good_style(book);
  }
};

class Book {
};
\end{verbatim}  
Here member function {\sf search\_bad\_style(Book*)} violates the Law of Demeter
since it calls a member function of classes {\sf MicroFicheFiles} and 
{\sf Documents}
which are both not preferred supplier classes of member function
{\sf search\_bad\_style}.
The preferred supplier classes of 
{\sf search\_bad\_style} are: 
{\sf ReferenceSec} (the class for which the member function is defined),
{\sf Book} (from member function 
argument) and {\sf Archive} (data member).

Member functions {\sf search\_good\_style} for {\sf Archive} and
{\sf ReferenceSec} 
satisfy the Law of Demeter.

It is interesting to notice that the member hiding mechanism of C++
is capable of detecting this type of violation of the Law, if we make
the members {\sf m} and {\sf d} private. This, however does
not mean that the Law is superfluous: The Law promotes restricted use
of members which have to be public, while information hiding completely
disallows the use of private member names outside the member (and 
friend) functions
of a class.
\section{Message passing versus generic function calls}

In the above sections the ideas and examples are presented in the context
of the message passing paradigm.  However, this is not appropriate 
for languages like CLOS.
Instead, CLOS uses the concept of a generic function.  In this case a 
pure functional notation is used and one or more objects may participate
in the determination of the method to be executed.   Essentially, this is a
generalization from the single argument dynamic method selection mechanism to a 
multi-argument dynamic method selection mechanism.

In the message passing framework we think of a method as attached to one
class. In the generic function generalization it is natural to think
of methods as being attached to several classes.  
Similarly, instead of individual objects receiving messages,  groups
of objects receive messages.

The essential ideas and motivations for the Law still apply in this context.
But some further motivation is needed for the generic function formulation.

The key feature of \oop\ is that we can attach functionality to data structures
to form classes.  This attached functionality (method)
is an integral
part of the class and (in most \oo\ languages) the method
can access the parts of
an object of that class 
in an easier way than other methods.  We view methods attached to a class
as having the right to access and make assumptions about the immediate 
structure of the class.  In other words, the class reveals its implementation
details to the methods attached to it.   These details should not be revealed
elsewhere.
Our Law re-affirms and extends this basic encapsulation principle and 
restricts the extent of the
revealed details.

In message passing languages where a method is
attached only to one class, we restrict
the message sending essentially to instance variables types of that class.
The arguments to the method act in support of the method.  Since
the method is not attached in any formal way to the classes of the arguments
the method has no right to access the implementation details of the arguments.
This does not prevent the method from sending messages to the arguments to
perform services in support of the method.
We make a big distinction between the first argument to which
a message is sent 
and the remaining arguments which act as support information.

A different situation arises with generic function call languages 
which allow several arguments
to select a method (e.g., CLOS).
There is potentially a case analysis of all
argument types to decide which method is to be invoked, i.e., the method
is attached to many classes.  If a method is attached to many classes
then it is a part of the semantics of each {\em and} should be allowed 
to send messages
to instance variable types of all the classes to which it is attached.   
Any remaining
arguments are treated in the same way as the second and following
arguments in message passing languages.
We must still distinguish between the two roles played by the arguments:
the method selection role and the supporting role.

Therefore the Law for generic function call languages with several
method selection arguments, allows the programmer of
a method to look directly at the instance variables of several classes.
We have to compensate for this by restricting more than the first
argument in a generic function call:
We require that
all function calls inside a method M must have a restricted set
of objects in all their method selection arguments.

An important idea behind the Law
is that the user has only to think about the classes
of the method arguments
and the classes of the
immediate subparts of the method selection argument classes when reading a method.
Consider the following example:
\bv
METHOD f (<a> A <b> B) C.

... (g a (h b)) ...
\end{verbatim}
Let's assume that g has two method selection arguments. The Law requires
that the return type of h be restricted. The reason is that we only
want to think about a certain set of classes when figuring out which method
g will activate.  The return type of ` (h b) ' has to be a preferred 
supplier of ` f ' for the above to be in good style.

Furthermore, we want to restrict the effect of class changes.
Assume that the return type of ` h ' is Q which is a subpart type of B.
If the structure of B changes, we might have to modify ` f ' since a 
potentially completely
different method ` g ' might be selected.
If a class changes, we ideally 
only want to change methods attached to that class.
The Law strives to achieve this ideal, but does not succeed in all
casses (see the formulation of the benefits).

{\em Therefore, we view the number of method selection arguments of a function
as part of the interface of that function along with the typing information
and a formal or informal specification.}

Here we clash with the view of some of the CLOS inventors\footnote{We had
a lengthy correspondence with the CLOS community via the CLOS mailing
list.  We
gratefully acknowledge our correspondents for providing us with some
insight into the CLOS world.}.  They do not
consider the number of method selection arguments as being important
documentation nor part of the published interface of a function. 
They view the syntactic
non-distinction as an important feature of CLOS. 
%while we view the fact that
%we can have several method selection arguments as one (among many) contributions
%of CLOS. 

%We choose ``type(associated classes)'' for
%the remaining choice.

\noindent
LAW OF DEMETER (CLOS, class version)

In all methods M and all function calls inside method M, all method selection arguments
must belong to one of 
the following classes
\begin{itemize}
\item argument classes of M, 
\item slot classes of method selection argument classes of M, 
\item classes of objects created in M (directly or indirectly) or
\item classes of global variables used in M.
\end{itemize}


\section{Overloading and dynamic method selection}

The Law of Demeter attempts to limit certain dependencies on types
of objects occurring within methods.  In an
\oo\ program there may be many methods defined having the same name.
Determining which of these is to be executed when the name is evoked
depends on the types of one or more of the arguments.  
The language C++ allows overloading using the argument types.
At first glance, this looks like the same kind of multi-argument method 
selection described above.  However, this is not the case.  In C++ only
the object whose member function is called
is used for dynamic method selection and member functions 
are only ``attached'' to the class they are defined for.
Therefore only data member classes
of the attached class are potential preferred suppliers and not 
data member classes
of the classes of the arguments of the member function.  
Overloading is a syntactic device
dealt with at compile-time, distinct from the semantic device of dynamic
method selection which is done at run-time.

For C++ the message passing formulation of the Law is appropriate.
We view the compile-time method selection of C++ due to overloading
as syntactic sugar. We could always rename the methods by prefixing
their names with argument types to avoid the need of overloading.
But CLOS is inherently more powerful: it does true dynamic method
selection 
on all the method selection arguments.
%Therefore we need for CLOS the generalization of the Law as proposed
%in the section on ``Formulation for existing languages''..


\section{The Law in mixed paradigm languages}

In languages which support both the action-oriented and the object-oriented
paradigm we have the coexistence of function or procedure calls with
message sendings or generic function calls. We can view a regular
function or procedure as a degenerate generic function with zero method
selection arguments. Therefore the Law of Demeter does not restrict
the regular function or procedure calls inside a method.


However, we do want to restrict the use of generic functions with
one or more method selection arguments within regular functions.
Outside methods we tend to disallow generic function calls, i.e., a regular
function or procedure should not contain generic function calls.  
In languages such as C++, the main program is an exception: it is allowed to
contain calls of member functions.

%This means
%that when a shift of paradigm from \oo\ to action oriented is neccessary
%it should be consistent.

For mixed paradigm languages it makes sense to distinguish between 
types and classes. The types include:
\begin{itemize}
\item
the types of the action-oriented base language
such as character, integer, real and other types,
\item the classes of the object-oriented extension.
\end{itemize}

In CLOS many but not all Common Lisp types are classes and all
classes are types.

\section{Nesting of generic function calls}

The Law of Demeter has implications for the nesting of function
calls. To discuss those implications, we classify  functions as follows:

\begin{definition}
\begin{itemize}
\item
An accessor function returns an object which existed before the
function was called.
\item
A constructor function returns an object which did not exist before
the function was called.
\end{itemize}
\end{definition}

The Law allows the nesting of constructor functions, as well as
the nesting of non-generic functions (if a multi-paradigm language is 
involved). 
A generic function does not necessarily have to contain
an explicit 'make-instance' call (or its equivalent) to cause the
creation of a new object.  If the generic function has within it
a call to a function which returns {\em a newly created object} then this
too is considered a creation of an object within the generic function.

However, the Law restricts functional composition of accessor functions.
As an example, consider the following list processing example which the
Law would consider to be in bad style: 
(We use the CLOS notation in this section.)

\bv
;;; Number-List ~ {Number}.
(defmethod third ((x Number-List))
  (car (cdr (cdr x))))
\end{verbatim}
We view this in bad style for two reasons. First, the writer of the
method assumes that there are at least three elements in the list.
If this is indeed the case, it might have been better to define the
list as follows.
\bv
;;; New = <salary> Number <age> Number 
;;;   <third> Number <rest> Number-List.
\end{verbatim}
Then selecting the third element is much easier (more intuitive and
less error prone).

The second reason is that it is not good to ``dig'' into structures.
We choose another example to demonstrate this point:

\bv
;;; P0 = <p1> P1.
;;; P1 = <p2> P2.
;;; P2 = <p3> P3.
(defmethod get-p3 ((x P0))
  (p3 (p2 (p1 x))))
\end{verbatim}
A better solution is:
\bv
(defmethod get-p3 ((x P0))
  (get-p3 (p1 x))
(defmethod get-p3 ((x P1))
  (p3 (p2 x)))
\end{verbatim}
If we update class {\sf P1} (say instead of p2 we have two other instance
variables), in the first case we have to modify
the triply nested method {\sf get-p3}
attached to class {\sf P0} (although the definition of 
{\sf P0} did not change). 
In the second case, we only have to update the method attached to {\sf P1}.

In the Number-List example we could not make the same point
since cdr returns another object of the Number-List class. 
The compile-time checkable version of the Law of Demeter
would allow
the  nesting of accessor functions, if their return type is not
a ``new'' type, that is, if it is an instance variable or an argument
type.  Thus, the class formulation of the Law is less restrictive in this
case than the object formulation. 


%The following illustrates the valid use of nested constructor functions.
%The following nesting is in good style:
%\bv
%        (defmethod good-style ((x float) (y float)) 
%         (+ (sqrt (- x y)) y))
%\end{verbatim}
%
%since the functions +, -, and sqrt are all constructor functions.
%The result of applying the built-in generic function - is an argument
%to the sqrt generic function, as is the result of sqrt to +.
%The result of applying - and sqrt
%is generated within the function body itself.
%
%In other words,
%the application of the '-' to the x and y causes the creation of a new
%instance of the float class which is initialized to a value representing
%'x - y'. So calling '-' is just the same as creating a new object  and
%giving the instance variables of that object some initial values depending
%on the arguments and some procedural processing.  
%%Thus the argument to the
%%sqrt function is again valid.
%
%
%%      As a less trivial example, consider the following. In Unix, the
%%pipe construct has the semantics of functional composition. Thus, to
%%format a troff document and send it to the line printer the following
%command line is used:
%
%\bv
%troff <switches> <filename> | lpr -P<printer>
%\end{verbatim}
%Using functional composition in Lisp/CLOS, we have:
%\bv
%(defmethod output-document 
%    ((switches string) 
%     (filename pathname) 
%     (printer string))
%  (lpr printer (troff switches filename)))
%(defmethod lpr 
%    ((printer string) 
%     (document stream))
%  <<code to send document to the printer>> )
%
%(defmethod troff 
%  ((switches string) 
%   (filename pathname))
%  <<code to do troff, 
%    might be in Lisp or a call to Unix>>)
%\end{verbatim}
%
%The method output-document is in good style since troff is a constructor
%function which returns a new stream object.
%
%%      The functional programming style is a useful and powerful one,
%%though, as with all programming styles, it has its limitations. One of
%%the advantages of CLOS is that it makes a combination of functional
%%and object-oriented styles possible, allowing side effect producing
%%operations to be limited in extent, which, in turn, makes proofs of
%%correctness easier. By eliminating functional composition for generic
%%functions, the Law of Demeter puts unnecessary restrictions on the
%%programmer.
%It is important to note that we haven't ``thrown out the baby with the
%bathwater'' with this interpretation of ``object creation''.  
%A generic function 
%is allowed to be dependant on 
%a limited set of objects and their
%types, but is not allowed to be dependent on the sub-parts of those objects.
%The deep structures of the objects are hidden from the generic function.
%This is a real and
%important restriction on the programmer.
\section{Anarchy: valid violations of the Law}
The Law is intended to act as a guideline and not as
an absolute restriction.  The programmer has the choice
whether to follow it or not.  In this section, we outline
a few situations when the cost of obeying the Law may be greater
than the benefits.  However, when the Law is willingly violated, the
programmer takes on the responsibility to provide future
maintainers support, with documentation, which
otherwise would have been provided by the Law.  Consider
the following prototypical method (in Lisp/Flavors) which
is in bad style:
\begin{verbatim}
(defmethod (C :M) (p )
   ( .... (send (send p :F1)  :F2) ....))
\end{verbatim}
where p is an instance of class A and F1 returns
a subpart of p.  If the immediate composition of A changes the
method M may have to change also because of F1.  Here are two
situations when it is reasonable to leave the above as it is:
\begin{itemize}
\item
F1 is intended to serve as a ``black box"  and the programmer
knows only about the types of its arguments and the return type.
In this case, the maintainer of F1 has the responsibility to ensure that
any updates to F1 are upward compatible so as not to punish any
users of the function.
\item
If run-time efficiency is important to the application, the use
of mechanisms like friend functions of C++ may be necessary. 
Friend functions should be used carefully, since whenever the private members
of a class change, the friend functions also might change.
\end{itemize}

In an application which solves differential 
equations the following definitions may occur:
\begin{verbatim}
Complex_Number = 
  <real_part> Real <imaginary_part> Real.

(defmethod (Vector :R) (c :Complex_Number)
   ( .....  
     (send (send c :real_part) 
       :project self) ....))
\end{verbatim}
The method R is in the same form as M above and is in
bad style for the same reason.  The question is whether it is important
to hide the structure of complex numbers and to rewrite the method?
In this application where the concept of a complex number is well
defined and well understood it is unnecessary to rewrite the method
so that the Law is obeyed.
   
In general, if the application concepts are well defined and the
classes which implement those concepts are stable, in the sense that
they are very unlikely to change, then such violations as the above
are acceptable.

Following \cite{hewitt:law} and \cite{sakkinen:law-88}, we introduce the concept an 
{\em acquaintance class} of a method. The motivation is to have a systematic,
compile-time enforceable control of violations of the Law. The
acquaintance classes of a method are declared in the ``heading'' of a method
and they are added to the potential preferred suppliers of the method.
When a method has acquaintances declared, the programmer is warned that there
are ``unusual'' dependencies between the classes and the compile-time
Law enforcement program has the necessary information to suppress
Law violation messages. With the presence of acquaintances we formulate
the Law (class version) as follows: In all methods M attached to class C, use only the protocol
of the following classes:
\begin{itemize}
\item
instance variable classes of C or
\item
argument classes of M, including C or
\item
classes of objects created in M (directly or indirectly) or
\item
acquaintance classes of M.
\end{itemize}

The goal is to write methods with a minimal number of acquaintances.
We can prove that any \oo\ program can be written without acquaintances
\cite{LHLR:law-paper}.

To easily check the Law at compile-time, the user has to provide the
following information for each method:

\begin{itemize}
\item
the types of each of the arguments and the result,
\item
the answer to the question: is the method a constructor, 
\item
the acquaintances.
\end{itemize}

The Law enforcement program has to keep 
track additionally of the following information
about each method:
\begin{itemize}
\item
the message sendings inside the method and
\item
the classes of the objects created by the method.
\end{itemize}

\section{Formulations for existing languages}

We give the formulation of some of the Law of Demeter 
versions in a few object-oriented languages.
Each formulation adapts the Law to the terminology of the particular language.
\begin{itemize}
\item
LAW OF DEMETER (Smalltalk-80, object version) 

In all message expressions inside a method M 
the receiver must be one of the following objects:
\begin{enumerate}
\item
an argument object of M including objects in 
      pseudo variables "self" and "super" or
\item
an instance variable object of the class to which M
      is attached or
\item
an object created by M or by methods which it calls or
\item
  an object in a global variable.
  \end{enumerate}

%\item
%C++:
%type, 
%message passing.
%
%C++ allows overloading of function names
%with respect to several arguments.
%But run-time implicit case analysis is done only with the first argument.
%\bv
%----- LAW OF DEMETER (C++, type version) -------------
%Every member or friend function M of a 
%class C is only allowed to call member
%or friend functions of the following classes:
%  - The argument classes of M (including C),
%  - The data member classes of C,
%  - The classes of objects created by M (directly, 
%      by calling a constructor function or indirectly, 
%      by calling a function which will call 
%      eventually a constructor function)
%  - The classes of global objects.
%----------------------------------------------------------
%\end{verbatim}

\item
CLOS: We assume that the user can determine 
for each generic function the number of method selection
arguments (not necessarily all required ones) and that this number is
part of the interface of the generic function.
A method selection argument is an argument which is used
for identifying the applicable methods.

\noindent
LAW OF DEMETER (CLOS, object version) 

All function calls inside a method M must use only the 
following objects as method selection arguments:
\begin{enumerate}
\item
M's argument objects or
\item
slot values of method selection argument classes of M or
\item
objects created by method M, or by functions which
    M calls or
\item
objects in global variables.
\end{enumerate}

Note that this formulation implies that the only legal use of
function slot-value is on method selection arguments.

%\bv
%----- LAW OF DEMETER (CLOS, pure type version) -----------
%In all function calls inside a method M all method 
%selection argument objects must belong to one of the 
%following classes:
%  - argument classes of M
%  - slot classes of method selection argument classes of M.
%----------------------------------------------------------
%\end{verbatim}

\item
LAW OF DEMETER (Old Flavors, object version) 

In any method M attached to class C send only messages 
to the following objects:
\begin{enumerate}
\item
M's argument objects or
\item
the instance variable objects of C or
\item
objects created by method M, or by methods or 
      functions which M calls or
\item
objects in global variables.
\end{enumerate}

%\noindent
%LAW OF DEMETER (Old Flavors, associated classes)
%
%In any method M attached to class C send only messages
%to instances of classes associated with the following classes
%\begin{enumerate}
%\item
%argument classes of M or
%\item
%instance variable classes of C or
%\item
%classes of objects created by method M, 
%      or by methods or functions which M calls or
%\item
%classes of objects in global variables.
%\end{enumerate}


%\item
%New Flavors:
%object/ type,
%generic function.

\item
LAW OF DEMETER (Eiffel, object version) 

In all calls of routines inside a routine M 
the entity object must be one of the following objects:
\begin{enumerate}
\item
  an argument object of M or
\item
  an attribute object of the class in which M is defined or
\item
  an object created by M
      or by routines which M calls.
\end{enumerate}

\end{itemize}

\section{Conclusions}

We have explained and motivated the Law of Demeter and its benefits
for several languages which support \oop. We succeeded in giving 
a qualitative assessment of the benefits of the Law in terms 
of programming complexity and program maintenance.

The Law has many other benefits which cannot be easily quantified.
In our teaching of \oop\ since spring 1986 
to over 300 students and the teaching of the Law since 
fall 1987
we
made the following observations:

\begin{itemize}
\item
The Law serves as an excellent tool to spot many bad designs.
After a sensible
transformation to good style (satisfying the Law), the programs become usually cleaner.

\item
Some students write Lisp Flavors programs which are 400 lines long, without
knowing the Law, but without violating it.
One reason for this is that we automatically generate an application
skeleton program from the class dictionary.
This skeleton program satisfies the Law and serves as the initial sketch
of the application software and is modified by the programmer for the needs
of the application \cite{lieber-riel:oop}.
The application skeleton, which is an object-traverser, serves as a useful
example of a program which satisfies the Law.

\item
Transformation to good style sometimes reduces the size of the program
by factoring the code in a better way.

\item
Clever transformation to good style sometimes reduces the coupling
(with respect to call/return links) between classes.

\item
The Law is easy to teach and the students easily learn to transform programs 
to good style. 
The ramifications of the Law, however, are subtle and you only realize
the Law's benefits when you apply it to your own programs.  
\end{itemize}
\begin{center}
{\bf{Acknowledgements}
}
\end{center}

Members of the CLOS community (Daniel Bobrow, Richard Gabriel,
Jim Kempf, Alan Snyder, Gregor Kiczales, Lanning etc.) have participated
in the debate and formulation of
the CLOS version of the Law and prompted
us to write this paper.

Carl Woolf was initially proposing to view the object version
of the Law as the one to follow
conceptually and he did help us in wording the formulations.

Ken Baclawski's feedback on the Law for C++ has helped us in finding
a better reformulation. Mitch Wand has given us detailed feedback
on the final paper.
Arthur Riel provided moral support and helped us with the formulations
and implications of the Law.

\bibliography{biblio}
%\tableofcontents

\end{document}

