[stringtemplate-interest] Two-Character-Bracket.templateLexer

Dreyer Ulf (CR/APA3) Ulf.Dreyer at de.bosch.com
Tue Mar 13 08:22:29 PDT 2007


 Hi Terence,

   thanks for looking into this. 
   I've made some progress changing the AngleBracketLexer
   and today I'm going to write some tests.

   Contents:
	-) some comments on Terence's answer
	-) [What I did] ... to create said lexer
	-) [Testresults] / encountered problems


> -----Original Message-----
> Howdy.  Note I just submitted the ANTLR book to publisher...I'm now  
> going to try to catch up with ANTLR and ST bug reports etc...

  I think I'll get myself a copy, even though I'm currently only using ST.

> >  I'd like to try for <$ ...$>  as it is not used in my target  
> > language AND it is possible to
> >  differentiate begin- and end-delimiters.
> Hmm...two char, eh?  well, a syntactic predicate ought to work..hang  
> on...hmm...grammar doesn't look too hard to change.  Try using  
> literals not a rule ref.

[What I did]:  (sorry Terence but I find this easier on the eyes ;)

  change all '<' occurrences to ACTIONBEGIN
  change all '>' occurrences to ACTIONEND

and define 
ACTIONBEGIN
	: "<$"
	;
ACTIONEND
	: "$>"
	;
This is mostly fine for the ACTIONBEGIN part.
In the grammar, however, ACTIONEND (formerly '>') is often inverted ( ~('>') ),
and inversion does NOT work with rule references or even two-character combinations.

So I wrote two predicates  upcomingACTIONBEGIN(int i <lookahead>) 
and upcomingACTIONEND(int i <lookahead>) (checking LA(i) and LA(i+1)).

Now most occurrences of ~(ACTIONEND, '<somechar>', '<someOtherChar>') can be
changed to
{!upcomingACTIONEND(1)}? ~('<somechar>', '<someOtherChar>')
which ought to be equivalent.
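
To make the idea concrete, here is a minimal Python sketch of the same
two-character lookahead. It is not the ANTLR code: the names `upcoming` and
`scan_until_action_end` are made up for illustration, and `text.startswith`
stands in for checking LA(i) and LA(i+1).

```python
ACTION_BEGIN = "<$"
ACTION_END = "$>"


def upcoming(text, pos, delim):
    """Return True if `delim` starts at `pos` (the LA(i)/LA(i+1) check)."""
    return text.startswith(delim, pos)


def scan_until_action_end(text, pos, excluded=""):
    """Consume characters while ACTIONEND is not upcoming and the current
    character is not one of `excluded`. Returns (consumed_text, new_pos)."""
    start = pos
    while (
        pos < len(text)
        and not upcoming(text, pos, ACTION_END)
        and text[pos] not in excluded
    ):
        pos += 1
    return text[start:pos], pos
```

A lone '$' or '>' does not stop the scan; only the pair "$>" does. For example,
`scan_until_action_end("a$b>c$>", 0)` returns `("a$b>c", 5)`.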

The only part I am really unsure about is the escape-character handling.
The easy way would be to disallow escaping of "<$" and "$>" (they were chosen
precisely BECAUSE they don't occur in the target language), but I feel this is
somewhat sloppy.

If my tests are successful I may post the entire new grammar (or a diff)
if anyone is interested.


[Testresults]
	These modifications failed at an early test:

group test;
top(foo,bar) ::= <<<$foo$>___<$bar$>>>
                                 ^
	This input yields a TokenStreamRecognitionException
	  Message="expecting '$', found 'r'"		 

 	It took me quite a while to figure out that this error comes
	not from the template lexer but from the group lexer.
	The BIGSTRING rule takes the first ">>" as the closing delimiter and discards
	the third ">", giving only "<<<$foo$>___<$bar$" to the template lexer.
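
	The truncation can be reproduced outside ANTLR. A minimal Python sketch,
	assuming BIGSTRING simply ends at the first ">>" after the opening "<<"
	(the helper name `bigstring_body` is hypothetical, not part of ST):

```python
def bigstring_body(text):
    """Split `text` at the first ">>" following the opening "<<".
    Returns (body, leftover)."""
    start = text.index("<<") + 2
    end = text.index(">>", start)  # first ">>" wins, whatever it belongs to
    return text[start:end], text[end + 2:]


body, leftover = bigstring_body("<<<$foo$>___<$bar$>>>")
print(body)      # <$foo$>___<$bar$
print(leftover)  # >
```

	The closing "$>" of the second action loses its ">" to the string terminator,
	so the body ends in an unterminated "<$bar$".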

   @Terence: 1) I don't think this is easily fixed (especially for arbitrary delimiters), or is it?
             We would need a predicate that tests for the current template delimiter and only matches "<<"
             if it's not part of that.
             2) Template lexers are very easily plugged in, but is there a mechanism to change
                the group lexer?
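
   To picture what such a predicate might do, here is a rough Python sketch. It is
   only an assumption about one possible approach, not how the group lexer works:
   the function `bigstring_body_aware` and its `action_end` parameter are
   hypothetical. It consumes the template's end delimiter as one unit before
   testing for ">>".

```python
def bigstring_body_aware(text, action_end="$>"):
    """Return the body between "<<" and the closing ">>", treating
    `action_end` as a unit so its ">" cannot start the terminator."""
    pos = text.index("<<") + 2
    start = pos
    while pos < len(text):
        if text.startswith(action_end, pos):
            pos += len(action_end)  # consume the whole end delimiter first
        elif text.startswith(">>", pos):
            return text[start:pos]
        else:
            pos += 1
    raise ValueError('unterminated "<<" string')


print(bigstring_body_aware("<<<$foo$>___<$bar$>>>"))  # <$foo$>___<$bar$>
```

   This still guesses wrong for a body that ends in a literal "$" directly before
   ">>", so it does not settle the question for arbitrary delimiters.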

  [Testresults] continued:
	Some quick tests (VERY simple cases) for escaping (LITERAL) and if-else-endif
      seem to work as expected.

That's all for today - ... to be continued  ;)

-Ulf

--
Dipl. Inf. Ulf Dreyer
Robert Bosch GmbH
Zentralbereich Forschung und Vorausentwicklung
Software und Systemengineering in der Fertigungsautomatisierung CR/APA3
Postfach 30 02 40 D-70442 Stuttgart
Tel.: 0711/811- 34365
Fax: 0711/811-518 34365
eMail: ulf . dreyer at de . bosch . com 




