A big thanks to everyone who came to this year's incarnation of the ANTLR workshop. What a great success! Thirty people from all over came to present and discuss ANTLR experience and to map out future directions for ANTLR 3.0. Ric Klaren was able to build a C code generator for the ANTLR 3.0 prototype in 2 days without ever having seen StringTemplate before. It validates my theory that StringTemplate's strict separation of model/view yields a guarantee of retargetability for code generators! This document summarizes the workshop and the important ANTLR 3.0 ideas; plus, it has links to photos. Look for ANTLR2005's announcement in the Spring of 2005. :)

Here is a page of photos from ANTLR2004.

Organizers
Presentations, Proceedings
Thursday, Oct. 7
Friday, Oct. 8
ANTLR 3.0 Future Directions Summary

(I'll flesh out this list shortly...wanted to write it all down before I forget...TJP)

These ideas (in no particular order) were discussed in the workshop and at a mini-cabal brunch on October 9th (Loring Craymer, Ric Klaren, John Mitchell, Terence Parr).

Parser grammar + auto parse-tree construction + templates (whose names map to grammar rules) for source tweaking or instrumentation.

Attribute scopes in grammar can be passed in and referenced as scope.foo in ST templates?

Suffix predicates where you can say "match this, but only if followed by this expression or NOT followed by this expression" (cite Bryan Ford for the NOT pred).

Error codes instead of embedded strings in ANTLR itself to make internationalization easier. Use templates so people can fill in the blanks for errors. :)

Make FIRST/FOLLOW info available generally.

Restricted form of StringTemplate "composition" to allow variations of a single template. For example,
We like single spec (probably still two objects though) for parser/lexer. What about tree parser?

Context-sensitive lexing. As I build overall DFA predictor for lexer grammar, I can learn which rules are subsets of the others in order to make FOR really FOR|ID when doing context-sensitive token fetch.

For building DFAs, can I generate bytecodes directly to use the "goto" bytecode? Yes, but I don't want to have to include a library like jasmine. Perhaps build a Java shell and run javac on it, load into binary template array and then just fill in the blanks. :) Still, I'd be happier if we can find a Java source method that isn't huge!

Ric's C code generator; done in 2 days; validates CodeGenerator + templates approach.

Tracking original tokens, whitespace. ASTs point into stream of token objects so always can reproduce original sequence.

Use DFA for keywords as it lets us use a static keyword token. Much faster; have to use bytecode gen for DFAs in Java though.

Attribute scopes
Or define scopes in decl section above and then ref in 'rule'. Also allow static versus stack-based attribute like static int x of Sorcerer fame. Hmm...this syntax may be better:
John Mitchell adds: "Adding intelligence to dynamic scope attribute specifications so that we can use things like ?, *, +, etc. so that they are allowed zero or one, zero or more, 1 or more, etc. times. In addition to those classics, we should also have some syntax to be able to say things like: [0..n], [1..n], [n..m], {0 or [n..m]}, {0 or [n...]}. Also, for the attributes that may be multi-valued, it seems like there should be support for both FIFO and LIFO ordering."

GUI. Most important thing to do first is provide support for analyzing nondeterminisms. Will be standalone and written in Java at first. Grammar diff tool. Visualization of NFA, DFA. Interpreted version, can show ASTs built and parse tree and derivations. "What can be matched here?" (generational grammar).

Automatic tree grammar construction is cool. Must keep tree construction actions out of target-specific actions. Try to avoid having to parse target actions in any way. Perhaps need $, #, and @ stuff though.

Grammar reuse. No way around it. Actions make any formal method useless. Formalize what programmers do now (and naturally): cut-n-paste becomes some kind of live RCS thing where you can feed forward changes later.

UNICODE. Forget 32 bits until Java does it natively. Stick with 16 bit; analyzer is cool with 21 bits though; try to make it 32 bit clean.

Shiva's combined multi-grammar thing was cool. Lookahead across grammars is hard though.

Error alternatives. Paul Lucas asked about yacc-style error alts where you can list alternatives that match common mistakes. We could insert an imaginary token into the token stream so that users could say:
Upon error, the error alts would be attempted. They would not normally be a part of the decision for a rule; here there is no decision for assign until there is an error. It would rewind the input and try the error alts. This is like a fancy exception handler that lets you use a grammar to look for common bad patterns.
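To make the rewind-and-retry idea concrete, here is a rough hand-written sketch in Java. It is not ANTLR output and not a proposed ANTLR API: the class ErrorAltDemo, the toy rule assign : ID '=' INT ';', the two error alts, and the messages are all made up for illustration. It also skips the imaginary-token mechanism and simply remembers the input position, tries the normal rule with no decision involved, and only on failure rewinds and tries each error alt in turn.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of error alternatives; none of these names come from ANTLR.
class ErrorAltDemo {
    private final List<String> tokens; // input is assumed to be pre-split into tokens
    private int pos = 0;               // index of the next token to consume

    ErrorAltDemo(List<String> tokens) { this.tokens = tokens; }

    // Lookahead: the next token, or a made-up "<EOF>" marker at end of input.
    private String la() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    private boolean match(String expected) {
        if (la().equals(expected)) { pos++; return true; }
        return false;
    }
    private boolean matchId()  { if (la().matches("[a-zA-Z_]\\w*")) { pos++; return true; } return false; }
    private boolean matchInt() { if (la().matches("\\d+")) { pos++; return true; } return false; }

    // Normal rule: assign : ID '=' INT ';'  (a single alternative, so no decision).
    private boolean assign() { return matchId() && match("=") && matchInt() && match(";"); }

    // Error alt 1 (assumed common mistake): Pascal-style ':=' instead of '='.
    private boolean colonEquals() { return matchId() && match(":=") && matchInt() && match(";"); }

    // Error alt 2 (assumed common mistake): the trailing ';' is missing.
    private boolean missingSemi() { return matchId() && match("=") && matchInt() && la().equals("<EOF>"); }

    String parseAssign() {
        int start = pos;                  // mark the input position
        if (assign()) return "ok";
        pos = start;                      // rewind, then try the error alts in order
        if (colonEquals()) return "error: use '=' rather than ':=' for assignment";
        pos = start;
        if (missingSemi()) return "error: missing ';' after assignment";
        pos = start;
        return "error: syntax error in assignment"; // no error alt matched
    }

    public static void main(String[] args) {
        System.out.println(new ErrorAltDemo(Arrays.asList("x", "=", "1", ";")).parseAssign());
        System.out.println(new ErrorAltDemo(Arrays.asList("x", ":=", "1", ";")).parseAssign());
        System.out.println(new ErrorAltDemo(Arrays.asList("x", "=", "1")).parseAssign());
    }
}
```

The error alts never run on valid input, so they add no cost or ambiguity to the normal parse; they only buy a more specific message once the ordinary rule has already failed.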