From dougo@ccs.neu.edu Wed Oct 8 04:51:37 1997
Received: from rigel.ccs.neu.edu (rigel.ccs.neu.edu [129.10.113.96])
    by amber.ccs.neu.edu (8.8.6/8.7.3) with ESMTP id EAA01944;
    Wed, 8 Oct 1997 04:49:32 -0400 (EDT)
Received: (dougo@localhost) by rigel.ccs.neu.edu (8.8.6/8.6.4) id EAA01353;
    Wed, 8 Oct 1997 04:04:59 -0400 (EDT)
Date: Wed, 8 Oct 1997 04:04:59 -0400 (EDT)
Message-Id: <199710080804.EAA01353@rigel.ccs.neu.edu>
From: Doug Orleans
To: "lblando@gte.com"
Cc: "'lieber@ccs.neu.edu'", dem@ccs.neu.edu
Subject: Re: primitive types in demjava
In-Reply-To: <01BCD329.4EFF3800.lblando@gte.com>
References: <01BCD329.4EFF3800.lblando@gte.com>
X-Mailer: VM 6.22 under 19.15p7 XEmacs Lucid

> i am trying to learn demjava and in the process do something useful for my
> work with it. i am writing (attempting to is a better way to put it) a
> compiler from a 'proprietary' language to c++.
>
> i am trying to parse this language and, from it, generate the object graph.
> i am faced with a problem. the language allows things like:
>     if (lsr.act == "2") then ....
> where lsr.act is just an IDENTIFIER (ie: it is _not_ a structure or object,
> just a 'name').
>
> i first tried:
>     Field = Ident .
> but i realized that Ident does not allow '.' in the name (which makes sense
> for Java/C++). then i tried:
>     Field = String .
> but that would force me to write the above line as: if ( "lsr.act" == "2" )...
> which is not good.
> after talking with karl, i realized i can model the field name more or less
> like:
>
>     Field ~ SubFieldName { "." SubFieldName } .
>     SubFieldName = Ident .
>
> but this will, for the most part, create a bunch of classes and overhead
> that i really do not need. all i need is just ONE object (ie. Field) that
> has a data member with the whole string in it.
>
> so the question is: is there an easy way to add another 'primitive' type to
> demjava? something like "IdentLuis" :-) that would take '.' as part of the
> name?

Currently there isn't, but there should be a way to give a lexical specification for a new Terminal class.

In this particular example, however, are you sure you need "lsr.act" to be scanned as a single token? Is "lsr . act" allowed as well? (It is in Java or C.) If so, then you have to make it three separate tokens, as you did above. This is what I do in the demjava class dictionary: a package-qualified name is not a single lexical token, but a series of identifiers separated by periods. The overhead is not a lot, and if you use the Terminal Buffer Rule, it shouldn't affect the rest of your program. In demjava, I have a ClassName class, which I used rather than going directly to the Ident within, so making the change was very easy. About the only thing I had to change was to use .parse() to create a ClassName object, rather than using constructor calls.

As a side note, it can be tough to use Demeter (in its current state) to parse a predefined syntax. You may want to make some sort of preprocessor in awk or perl to translate the input file into an intermediate form that Demeter can more easily parse, e.g. by putting strings in double quotes or text in (@ @).

--Doug
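
A minimal sketch of the preprocessing idea, written in Python rather than awk or perl purely for illustration. It assumes the grammar uses `Field = String .` and that a dotted name is a run of identifiers joined by periods with no spaces; the function name `quote_dotted_names` and the regular expression are hypothetical and not part of Demeter. Existing double-quoted strings are left untouched.

```python
import re

# Matches either an existing double-quoted string (kept as-is) or a
# dotted name such as lsr.act (to be wrapped in double quotes).
# Assumption: dotted names contain no whitespace around the periods.
_TOKEN = re.compile(r'"[^"\n]*"|\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+\b')


def quote_dotted_names(line: str) -> str:
    """Wrap each unquoted dotted name in double quotes."""
    def repl(match):
        text = match.group(0)
        if text.startswith('"'):
            return text            # already a string literal
        return '"' + text + '"'    # turn lsr.act into "lsr.act"
    return _TOKEN.sub(repl, line)


if __name__ == "__main__":
    print(quote_dotted_names('if (lsr.act == "2") then ....'))
    # prints: if ("lsr.act" == "2") then ....
```

With a filter like this run ahead of the parser, the source files can keep the bare `lsr.act` form while the parser only ever sees a String token.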