next up previous
Next: Behavior Specification Up: Demeter/Java User Manual Version Previous: Introducing Demeter

Subsections


Class Dictionaries

Class dictionaries describe the structural relations between classes. Optionally, they may also define an application-specific grammar in order to describe objects by english sentences. This chapter explains everything you need to know about the structural relations between classes defined in class dictionaries.

Throughout most of this chapter we will use the following class dictionary,
conglomerate.cd, as reference:

// Class dictionary that describes conglomerates of companies.
// You can use Java-like comments in Class dictionaries.

Conglomerate = <name> Name <head> Company .
Name = <n> Ident .
Company = Name
          <payroll> List(Employee)
          <subsidiaries> List(Company) .
Employee : Manager | Staff common Name
                                    Salary.
Salary = <amount> Integer .
Manager = .
Staff = .
List(S) ~ { S }.

EmpSearchVisitor = <emplname> Name.

All class dictionary files have, by convention, the extension .cd. Lines of comments start with the Java comment token //. Whatever is not commented defines the structure of the classes.

Defining Classes

A class dictionary file consists of several entries with the generic form:

$<$class$>$ $<$def-sign$>$ $<$definition$>$ . (Notice the ``.'' at the end!)

Each of these entries defines a class of the application. The $<$def-sign$>$ determines the nature of the class, i.e., if it is concrete (``='' sign), abstract (``:'' sign) or collection (``~'' sign) class. In the conglomerate example, class Conglomerate is concrete, class Employee is abstract and List(S) is a collection class.

Below is a more formal description of class definitions.

Concrete Classes

In Demeter terminology, a concrete class is called a construction class and is defined to be any leaf class in the inheritance hierarchy. In the Demeter system superclasses cannot be concrete, or in other words, concrete classes may not have any subclasses.

Formally, construction classes have the following syntactic structure:

$\langle$ConcreteClassDef$\rangle$ ::= $\langle$ClassName$\rangle$ "=" { $\langle$Reference$\rangle$ $\}^*$ "."
$\langle$Reference$\rangle$ ::= ["$<$" $\langle$ReferenceName$\rangle$ "$>$"] $\langle$ClassName$\rangle$

The class Conglomerate is defined as being a concrete class (indicated by the ``='' sign), having as data members (parts) an object of class Name called name and an object of class Company called head.

Note that the parts may or may not have explicit labels (i.e. the ReferenceName). For example, the parts of the class Conglomerate have explicit labels, namely <name> and <head>. However, the part Name in class Company does not have a label; in such cases, the label is, by default, the name of the correspondent part class in lower-case letters. Therefore, the label name of the part Name in class Company is also name.

Ident is an example of the Demeter built in primitive classes. The other built in primitive classes are Text, String, Word, Line. Together, they are called Demeter Terminal classes, or Terminals for short. Because they are primitive, they don't have to be defined in the class dictionaries. All of the Java types and primitive classes can appear in Demeter class dictionaries.

Abstract Classes

In Demeter terminology, abstract classes are called alternation classes. An abstract class defines a class that cannot be instantiated. Any abstract class must have at least one subclass.

Formally, alternation classes have the following syntactic structure:


$\langle$AbstractClassDef$\rangle$ ::= $\langle$ClassName$\rangle$ ":" $\langle$Subclasses$\rangle$ 

common { $\langle$Reference$\rangle$ }$^*$ "."
$\langle$Subclasses$\rangle$ ::= $\langle$ClassName$\rangle$ { "$\mid$" $\langle$ClassName$\rangle$ $\}^*$

In the conglomerate example, class Employee is an alternation class with alternatives Manager and Staff. Abstract classes can also define common data members which are inherited by its subclasses. In this case, Name and Salary are common to Employee and therefore these data members are inherited by Manager and Staff.

Collection Classes

In Demeter terminology, collection classes are called repetition classes. A collection class defines a class whose instances are collections of instances of a given class.

Formally, repetition classes have the following syntactic structure:


$\langle$CollectionClassDef$\rangle$ ::= $\langle$ClassName$\rangle$ "$\sim$" $\langle$CollectedClass$\rangle$"." 

$\langle$CollectedClass$\rangle$ ::= [$\langle$ClassName$\rangle$] "{" $\langle$ClassName$\rangle$ "}"

Syntactically, a collection class is introduced with a tilde sign ("$\sim$"). The repeated class is given within curly braces ("{"$\ldots$ "}"). A non-empty collection class is defined by specifying the repeated class both before and within the curly braces. For a non-empty collection class, the optional and the repeated part should be the same. For example,

C_List ~ C { C } .

defines a C_List class to be a non-empty collection of C-objects (i.e., it must contain at least one C-object).

In the conglomerate example, class List is a collection class that can be empty.

Parameterized Classes

In addition to being a collection class, List is a parameterized class. This is analogous to a template in C++. Class List can thus be instantiated with class arguments.

In our example we have two parameterizations, List(Employee) and List(Company), which are the payroll and subsidiaries data members of Company. Internally, these parameterizations result in Java classes named Employee_List and
Company_List. Instances of these two classes contain collections of Employee and Company objects, respectively.

Note that all three types of Demeter classes (construction, alternation and repetition) can be parameterized. Consider, for example, the following parameterized construction class:

Xpto(X) = <name> String <x> X.

The parameterization Xpto(Conglomerate) defines a concrete class named
Conglomerate_Xpto containing two data members: a String called name and a Conglomerate called x.

Other Keywords

Following is a list of some of the common keywords encountered while writing Demeter programs.

Class Dictionary keywords

parse, noparse, visitors, endvisitors,init, common, extends, implements, lookahead, derived, extends, public, private, final, interface, static, read-only.

Behaviour file keywords

init, start, finish, before, after, around, traversal, to, to-stop, bypassing, only-through, throws.

Creating Legal Class Dictionaries

In order for Demeter to be able to produce useful Java code from a class dictionary, the class dictionary must follow some rules.


Cycle-free Alternation Axiom

This rule makes sure that a class does not inherit from itself, either directly or indirectly. The following class dictionary violates this rule:

Example = <ex> Edible.
Edible : Fruit.
Fruit : Apple.
Apple : Edible.


Unique Label Axiom

This rule makes sure that a class does not have two data members, either immediate or inherited, that have the same name. The following class dictionary violates this rule:

Brick = <weight> DemNumber.
WeightRelated : Coin | Brick common <weight> DemNumber.
Coin = .


Inductiveness Axiom

This rule makes sure that a class dictionary does not define any objects that are inherently circular. The following class dictionary violates this rule since any A-object has an inherent circular structure.

A = B .
B : C .
C = A .

Such a violation is not strictly an error. Non-inductive class dictionaries per se are fine and often useful. The only problem with them is that a textual representation in sentence form cannot be provided for their inherently circular objects. In addition, the generic software cannot be used for such objects.

Below, we give a similar class dictionary as above which is inductive.

A = B .
B : C | D .
C = A . D = .


Alternatives

Alternatives of alternation classes can only be class names and not strings. This rule makes sure that inheritance is properly used and generated. The following class dictionary violates this rule:

Example = <ex> Op.
Op : "+" | "*".


First Class

The first class defined must be a construction class. This rule serves the practical purpose of allowing Demeter to generate sample code with respect to an instantiable class. The following class dictionary violates this rule:

Exp : Simple | Compound.
Simple = .
Compound = .


Class Names

Class names should start with a capital letter, or should at least contain a capital letter, if they are used without a label as part of a class.

This rule makes sure that class names and variable names are distinct. Otherwise the compiler may produce an error. The following class dictionary violates this rule:

a = b.
b = .


Use of Terminals

Terminal classes such as Integer and Ident may not be alternatives; they can only be used as part classes.

This rule makes sure that the terminal classes Ident, Number, Real, String, Text and the rest do not inherit from user's classes, since they belong to Demeter and Java's run-time class libraries. The following class dictionary violates this rule:

Example = <ex> NumberOrIdent.
NumberOrIdent : DemNumber | DemIdent.

Code Generation

One of the major functions of class dictionaries is to give Demeter instructions for generating Java class definitions. For example, class Conglomerate would be turned into Java code that looks something like:

class Conglomerate implements Cloneable {
  protected Name name;
  public Name get_name() { return name; }
  public void set_name(Name new_name)
    { name = new_name; }
  protected Company head;
  public Company get_head() { return head; }
  public void set_head(Company new_head)
    { head = new_head; }
  Conglomerate() { super(); }
  public Conglomerate(Name name, Company head) {
    super();
    set_name(name);
    set_head(head);
  }
  ...
};

Notice also in the class definition of Conglomerate that its parts name and head had been defined as protected data members, each one with two public accessors: a reader method get_xxx, and a writer method set_xxx.

External Types

It's not necessary that the parts of a class are classes defined in the same class dictionary, and they can even be of user-defined types. The following features are supported by the Demeter system:

  1. Class in some external package (java.io, etc).
    In this case the declaration is:

    A = <p> SomePackage.SomeClass ...

  2. User-defined type.
    In this case the declaration is:

    A = <p> MyType ...

    For example,

    A = <count> MyType.

    If the type is user-defined, say MyType, then MyType must be defined in some .java file.

Application-Specific Grammar

One of the features of the Demeter System is the ability for describing objects by english sentences. By default, Demeter will generate generic methods for parsing/printing any object in a class dictionary from/to the form of an english sentence. This is very convenient for the instantiation of the application objects and for debuging purposes.

How are Grammars Useful?

As a motivation example, suppose we want to instantiate an object of class Conglomerate (see the class dictionary defined in the beginning of this chapter). Using Java constructor calls, we would have the following code:

Conglomerate iConglomerate = new Conglomerate();
iConglomerate.set_name(new Name (new Ident(``DEMETER_SA'')));

Company iCompany = new Company();
Name iName = new Name();
iName.set_n(new Ident(``DEMETER_INC''));
iCompany.set_name(iName);
iConglomerate.set_head(iCompany);

Manager iManager = new Manager();
iName = new Name();
iName.set_n(new Ident(``KL''));
iManager.set_name(iName);
Salary iSalary = new Salary();
iSalary.set_amount(new Integer(100));
iManager.set_salary(iSalary);

Staff iStaff = new Staff();
iName = new Name();
iName.set_n(new Ident(``CVL''));
iStaff.set_name(iName);
iSalary = new Salary();
iSalary.set_amount(new Integer(0));
iStaff.set_salary(iSalary);

Employee_List el= new Employee_List();
el.append(iManager);
el.append(iStaff);

iCompany.set_payroll(el);

Company_List cl = new Company_List();
iCompany.set_subsidiaries(cl);

This kind of code is very tedious to write, especially if the application requires hundreds or thousands of objects. Furthermore, it is error prone and very difficult to reason about. Instead, it would be nice to describe this object by a sentence such as:

conglomerate DEMETER_SA with head
  company DEMETER_INC
    employees
      manager KL with salary 100
      staff CVL with salary 0
    subsidiaries { }

The Demeter generated parse method and PrintVisitor provide this facility. In order for these methods to execute successfully the programmer must include syntax elements in the class dictionary file.


Defining Grammar

Defining application-specific grammars in class dictionaries is very straightforward. The programmer simply needs to include pieces of syntax at some points of the class definitions.

For the conglomerate example, and according to the sentence we used above, the class dictionary becomes:

// Class dictionary that describes conglomerates of companies.
// You can use Java-like comments.

Conglomerate = "conglomerate" <name> Name "with head" <head> Company .
Name = <n> Ident .
Company = "company" Name
          "employees" <payroll> List(Employee)
          "subsidiaries" "{" <subsidiaries> List(Company) "}" .
Employee : Manager | Staff common <name> Name
                                    "with salary" Salary.
Salary = <amount> Integer .
Manager = "manager" .
Staff = "staff" .
List(S) ~ { S }.

Below is a list of restrictions for positioning syntax elements in class dictionaries.

  1. Right-hand only.
    Syntax elements can only be inserted on the right-hand of the productions, i.e. to the right of the signs `=', `:' and `~'. The following example violates this rule:

    Manager "manager" = .
    

  2. Before the dot.
    Syntax elements cannot be inserted after the ``.''. The following example violates this rule:

    Manager = . "manager"
    

  3. Before the labels.
    If the parts have explicit labels, syntax elements cannot be inserted between the labels and the part names. The following example violates this rule:

    Conglomerate =
      <name> "conglomerate" Name <head> "with head" Company .
    

  4. After alternatives.
    In the definition of alternation classes, the syntax elements can only occur after all the alternatives have been declared. The following class dictionaries violate this rule:

    Employee :  "manager" Manager | Staff common Name
                                          "with salary" Salary.
    

    Employee :  Manager "manager" | Staff common Name
                                          "with salary" Salary.
    

Grammar-Related Files

Demeter will generate a file, grammar.jj which will contain a grammar in a form suitable for input to Sun's JavaCC parser generator tool. The grammer specifies a parser that is capable of producing objects of the classes that appear in a class dictionary from strings in english form.


How to Make a Class Dictionary LL(1)

The grammars used by Demeter's parser and printer must satisfy the LL(1) conditions of grammar technology. If the class dictionary fails to be LL(1) you will need to modify it to make it be LL(1) by adding concrete syntax.

Remember that the violation of the LL(1) conditions is only important for the generic parser and printer: if the application uses these Demeter facilities, then LL(1) must be used; if not, then there is no problem in violating it.

When defining class dictionaries or when correcting a non-LL(1) class dictionary, the following rules should be followed.

  1. The repetition delimiter rule.

  2. The alternation delimiter rule.

  3. The left recursion rule.

  4. The optional delimiter rule.

  5. The recursion delimiter rule.

Let us consider each rule in turn.

The Repetition Delimiter Rule

The repetition delimiter rule states that a repetition class should either start with a syntax terminal and terminate with a syntax terminal, unless it is always used between delimiting syntax terminals.

Consider the following class dictionary

TransportationSystem =
  <planes> List(Plane)
  <passengers> List(Passenger).
Plane = <name> String.
Passenger = <birthplace> String.
List(S) ~ {S}.

Notice that the LL(1) violation is not a class structure problem but only a grammatical problem and users may not be concerned with them at some points.

The problem with this example is that since both Plane- and Passenger-objects are represented by Strings, the generated parser has no way to find out when, in a given input sentence, the list of planes ends and when the list of passengers starts.

A way of disambiguating the grammar is to make clear when lists end (and begin) in general. This can be done by putting simple parentheses around a list, such as in the parameterized class List(S) below.

TransportationSystem =
  <planes> List(Plane)
  <passengers> List(Passenger).
Plane = <name> String.
Passenger = <birthplace> String.
List(S) ~ "(" {S} ")".

The following class dictionary further illustrates the repetition delimiter rule with a programming language example.

Example = <ex> List(Statement).
Statement : Conditional | Assignment.
Conditional :
  Repeat | While common
    <condition> Exp
    <statements> List(Statement).
Repeat = "repeat".
While = "while".
Assignment = <lhs> Ident <rhs> Ident.
Exp = <var> Ident.
List(S) ~ {S}.

We can improve this class dictionary in two ways. We can either use parentheses at the parameterized class List(S), or we can use end as a terminator for Conditional.

Using parentheses yields the class dictionary

Example = <ex> List(Statement).
Statement : Conditional | Assignment.
Conditional :
  Repeat | While common
    <condition> Exp
    <statements> List(Statement).
Repeat = "repeat".
While = "while".
Assignment = <lhs> Ident <rhs> Ident.
Exp = <var> Ident.
List(S) ~ "(" {S} ")".

Using end yields

Example = <ex> List(Statement).
Statement : Conditional | Assignment.
Conditional :
  Repeat | While common
    <condition> Exp
    <statements> List(Statement) "end".
Repeat = "repeat".
While = "while".
Assignment = <lhs> Ident <rhs> Ident.
Exp = <var> Ident.
List(S) ~ {S}.

The Alternation Delimiter Rule

The alternation delimiter rule states that the alternatives of an alternation class should normally start with distinct syntax terminals. If you allow one (exactly one) alternative to be empty, then make sure that the terminals which follow the empty alternative are distinct from the terminals which start the other alternatives. This will improve readability of your notation and it avoids LL(1) violations. In general, keep the empty alternate as the last alternate.

Consider the following class dictionary

Basket = "basket" <contents> Contents <final> Fruit.
Contents : Fruit | Basket | Nothing.
Fruit : Apple | Orange.
Apple = "apple".
Orange = "orange" .
Nothing = .

You may change the class dictionary as follows so that it no longer violates the the alternation delimiter rule.

Basket = "basket" <contents> Contents "final" <final> Fruit.
Contents : Fruit | Basket | Nothing.
Fruit : Apple | Orange.
Apple = "apple".
Orange = "orange" .
Nothing = .

The Left Recursion Rule

The left recursion rule makes sure that on every path in a class dictionary at least some input is consumed. This rule improves the unique readability of the input language.

The following class dictionary violates this rule.

Basket = <contents> Contents.
Contents : Fruit | Basket.
Fruit : Apple | Orange.
Apple = "apple".
Orange = .

Remember that non-left recursion is a prerequisite to LL(1)-ness. We can correct this violation by adding some concrete syntax at the beginning of Basket. This allows determining whether we have a Fruit or a Basket without having to recur into Basket as follows.

Basket = "basket" <contents> Contents.
Contents : Fruit | Basket.
Fruit : Apple | Orange.
Apple = "apple".
Orange = .

The Optional Delimiter Rule

The optional delimiter rule states that an optional part should start with a unique terminal and the same terminal cannot appear after the optional part. This rule improves the readability of the notation and avoids LL(1) problems.

The following class dictionary violates this rule.

Tree = <contents> C [<left> Tree] [<right> Tree].
C = "c".

This problem can be corrected as follows

Tree  = <contents >C 
        [<left >Tree ] "left" .
        [<right >Tree ] "right" .
C  = "c" .

The Recursion Delimiter Rule

The recursion delimiter rule asks that a recursive structure start with a delimiter and end with a delimiter. In the previous example, a first, but incorrect attempt to make the class dictionary LL(1) is the following.

Tree = <contents> C ["left" <left> Tree] ["right" <right> Tree].
C = "c".

The problem with the above class dictionary is that it is not clear for the parser where to attach subtrees. For example, the object sentence c left c right c could specify both a tree with a left and right subtree, or, alternatively, a tree with a left subtree which in turn has a right subtree.

When we apply the recursion delimiter rule, the class dictionary becomes LL(1). The corrected class dictionary looks as follows.

Tree = "(" <contents> C ["left" <left> Tree] ["right" <right> Tree] ")".
C = "c".

The recursion delimiter rule is also useful to avoid left-recursion . Consider the class dictionary

Example = <ex> Postfix.
Postfix : Simple | Compound.
Simple = <v> Integer .
-Compound = <args> Postfix_List <op> Op.
Op = "+".
Postfix_List ~ {Postfix}.

Which contains left recursion. After applying the recursion delimiter rule to Compound we get

Example = <ex> Postfix.
Postfix : Simple | Compound.
Simple = <v> Integer .
Compound = "(" <args> Postfix_List <op> Op ")".
Op = "+".
Postfix_List ~ {Postfix}.

Lookahead feature for LL(k) grammars

In a few cases, it might be very difficult to keep the grammar LL(1). In such cases, its possible to exploit the LL(k) feature of JavaCC.

The lookahead keyword

Parser lookahead directives can be specified at choice points in the class dictionary grammar with lookahead (@ <directive>@). A choice point is one of:

A lookahead directive is one of:

Parsing and Printing Objects

Any object which is an instance of some construction class can be parsed and printed from/to a sentence. This is done by invoking the parse(String theString) method and PrintVisitor on that object.


The parse method

The use of the parse method involves the existence of a input source that contains the sentence describing some object. Demeter currently generates parse methods that accept either a java.io.InputStream or a java.lang.String as input. These parse methods exist as static methods on every class that Demeter generates. A typical use of the parser might look like:

public static void main(String args[]) {
   Conglomerate c = Conglomerate.parse(System.in);
...
}

When run, this code expects the user to enter a string to the program's standard input. That string will then be parsed into a Conglomerate object and execution will continue.


Selective parser code generation

There are now several ways to turn off generation of parser code for classes. To turn it off for an individual class, use:

    Foo noparse = ...
To turn it off for a section of the class dictionary:
    noparse
    Foo = ...
    Bar = ...
    parse
To turn it on for an individual class inside a section of classes in which it's turned off:
    noparse
    Foo = ...
    Bar parse = ...
    Baz = ...
    parse


Printing Objects

Printing can be done by using the PrintVisitor. This is explained in chapter [*].


next up previous
Next: Behavior Specification Up: Demeter/Java User Manual Version Previous: Introducing Demeter
Doug Orleans 2000-09-21