Parsing details for a simple grammar

We will examine how parsing proceeds for a grammar that only describes the y-axis line and its tick marks and does so in a very simple way. When a grammar is entered into the system, a Lisp defgrammar macro expands the form into one that includes an additional defrule macro, applied to each rule structure. When the defrule forms expand, they define the CLOS classes for each non-terminal as well as the goal predicates for the search and the generator functions needed. These macros act as the parser generators. Once they have been applied, the parsing process then acts as a top-down tree-search.

A simple grammar with three rules:

;;; ****************** < X-Ticks > ***************     

( Y-Ticks -> Ticks Y-Line 
       (Ticks (touch Y-Line '?) :constraints (> (size Ticks) 2)))      
;;; ****************** < X-Line > ***************     

   ( Y-Line -> Line           
        (:constraints (vertp Line)                 
        (long Line)))      

;;; ****************** < Ticks > ***************
   ( Ticks -> Set(Line)       
        (:element-constraints (horizp Line)             
                              (short Line)))


The parse is assumed to be started at the top node, Y-Ticks. Parsing proceeds by a depth-first search to discover and build elements that satisfy the various geometric constraints. The order of the search is dictated by the order of the constituents in the body of the rule (not in the order of the constituents in the "->" production).

  1. In the Y-Ticks rule, the Y-Line constituent is searched for first. There are no specific restrictions on the Y-Line search included in this Y-Ticks rule.

  2. In the Y-Line rule the Line primitive is the only constituent, and it must obey the constraints that it be vertical and long.

  3. Returning to the Y-Ticks rule, Ticks are searched for next. The expression "(touch Y-Line '?)" is a context generator. The set of all elements in the diagram that obey this constraint is created and the Ticks constituent rule is then entered with this context restriction -- Ticks members can only be chosen from the set of elements in the diagram that touch the long vertical line found by the Y-Line rule. The context is propagated to a constituent rule as an inherited attribute.

  4. The Ticks rule is then entered. Ticks is a set of Line primitives. The elements of the set are chosen from the context inherited from the Y-Ticks rule, as explained in step 3. The elements of this set are further filtered by constraining every one of them to be both short and horizontal. Any subset of the y tick marks would satisfy this rule. But our algorithm attempts to choose the maximal set of elements that satisfy the rule. In this way, a single rule can return twenty tick marks or a hundred data points in one step. In the Y-Ticks rule, an acceptable set of ticks must have greater than 2 members, the constraint ":constraints (> (size Ticks) 2))".

Our grammars, because of the way they are designed and operate, are called Context-based Constraint Grammars. It is essentially an Attributed Multiset Grammar, but its use of context is new. A rule inherits a context specified in its parent that restricts the set of items within which the rule is allowed to search for a solution. This successive narrowing of the search set helps to make the parsing process more efficient. Parsing generates a set of all distinct solutions that satisfy the constraints. In the example above, the set is a singleton. In addition, the same element can be reused or shared by various parts of a single solution, see the discussion of sharing.

When the parse is complete we can highlight the elements in the Diagram Viewer that make up the Y-Ticks structure, as shown on the right. Note that this simple grammar also identifies the top and bottom sides of a data point as tick marks. This is avoided in our more complex grammars by requiring that a tick mark not be a line that participates in a polygon.



Next: Parsing runs with timing

Back to Index page