New blog entries
For all the entries, see Terence's blog RSS
The ANTLR project is moving to GitHub within a few days. Thanks to user anatol for setting up the ANTLR organization and pulling in the Perforce (p4) repositories that we've been using. Everything is now set up for us to start using git/GitHub seamlessly. The purpose of this blog post is to announce the move and to outline how I think the workflow should go.
Repositories
Branches
I enjoyed reading A successful Git branching model and I think a lot of it makes sense for my small ANTLR team.…
Over the Christmas holidays, I've been busy building example grammars for ANTLR v4. The thing I noticed immediately is that grammars just work. There are no error messages from ANTLR when generating code, and all we can get are true ambiguity errors at runtime, e.g., if you can recognize T(i) as both a function call and a constructor call. The lexers work extremely well and I'm a happy guy. Nothing scares ANTLR v4; "it's pretty bad-ass," like The Crazy Nastyass Honey Badger.
And then,…
ramblings about design as stream of consciousness
I built a quick mockup with NetBeans just so that I have all of the windows in front of me with a couple of faked images. The basic design is easy because NetBeans allows us to move windows around as we want. I happen to lay the windows out like this:
(Sam already has the navigator and the editor windows filled in, but I was too lazy to incorporate them.)
Every window has content, publishes events, and listens for events.…
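The "content, publishes events, listens for events" idea can be sketched as a tiny publish/subscribe bus. This is only a plain-Java illustration under stated assumptions: EventBus, ToolWindow, and the "ruleSelected" topic are hypothetical names invented here, not NetBeans platform APIs and not the actual design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-process event bus; NOT part of the NetBeans APIs.
class EventBus {
    private final Map<String, List<Consumer<String>>> listeners = new HashMap<>();

    // Register a listener for one event topic.
    void subscribe(String topic, Consumer<String> listener) {
        listeners.computeIfAbsent(topic, k -> new ArrayList<>()).add(listener);
    }

    // Deliver a payload to every listener registered for the topic.
    void publish(String topic, String payload) {
        List<Consumer<String>> registered =
            listeners.getOrDefault(topic, Collections.emptyList());
        for (Consumer<String> l : registered) {
            l.accept(payload);
        }
    }
}

// Hypothetical window: holds content, publishes events, listens for events.
class ToolWindow {
    final String name;
    String content = "";
    private final EventBus bus;

    ToolWindow(String name, EventBus bus) {
        this.name = name;
        this.bus = bus;
    }

    // React to events on a topic by updating this window's own content.
    void listenFor(String topic) {
        bus.subscribe(topic, payload -> content = name + " shows " + payload);
    }

    // Announce something that happened in this window.
    void publish(String topic, String payload) {
        bus.publish(topic, payload);
    }
}
```

With this shape, an editor window could publish "ruleSelected" and a navigator window that subscribed to that topic would refresh itself, without either window holding a reference to the other.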
I just finished attending a three-day workshop on developing standalone GUI applications with this awesome Java applications framework that you've never heard of. Actually, that's not true. You've heard of it but thought it was an IDE--NetBeans. Unfortunately, the amazing applications framework has been hitched to the NetBeans IDE wagon which, for better or worse, has much less market share than Eclipse. (I know how NetBeans users feel because I use IntelliJ,…
I have a prototype working for the automatic parse tree construction and automatic visitor generation. Imagine we have the following simple grammar:
grammar T;
s : i=ifstat ;
ifstat : 'if' '(' INT ')' ID '=' ID ';' ;
The usual startup code looks like:
TLexer t = new TLexer(new ANTLRFileStream(args[0]));
CommonTokenStream tokens = new CommonTokenStream(t);
TParser p = new TParser(tokens);
p.s(); // invoke the start rule, s
To make it create a parse tree,…
Ok, been doing some thinking and playing around and also talking to Sam Harwell / Oliver Zeigermann.
The first modification I've made is to turn parse tree construction on or off with a simple Boolean, rather than having to regenerate the parser with -debug. Also, the parsers fire methods enterMethod/exitMethod with the rule index all the time now since it is so convenient to have these. No more needing -trace and regenerating to get debug output.…
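A rough sketch of how a runtime Boolean and always-on enter/exit hooks might fit together. Everything here is an assumption made for illustration: SketchParser, Node, the buildParseTree field, and the hand-written rule methods are hypothetical stand-ins, not the generated code or the v4 runtime API. Only the enterMethod/exitMethod names come from the post.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical parse tree node; the real tree classes are not shown in the post.
class Node {
    final String label;
    final List<Node> children = new ArrayList<>();

    Node(String label) {
        this.label = label;
    }
}

// Hypothetical parser base illustrating the two ideas from the post.
class SketchParser {
    static final String[] RULE_NAMES = { "s", "ifstat" };

    boolean buildParseTree = false;          // runtime switch, no regeneration
    final List<String> trace = new ArrayList<>();
    Node root;                               // stays null when the switch is off
    private final Deque<Node> stack = new ArrayDeque<>();

    // Fired on entry to every rule, whether or not a tree is being built.
    void enterMethod(int ruleIndex) {
        trace.add("enter " + RULE_NAMES[ruleIndex]);
        if (!buildParseTree) return;
        Node n = new Node(RULE_NAMES[ruleIndex]);
        if (stack.isEmpty()) root = n;
        else stack.peek().children.add(n);
        stack.push(n);
    }

    // Fired on exit from every rule.
    void exitMethod(int ruleIndex) {
        trace.add("exit " + RULE_NAMES[ruleIndex]);
        if (buildParseTree) stack.pop();
    }

    // Hand-written stand-ins for generated rule methods (token matching omitted).
    void s() {
        enterMethod(0);
        ifstat();
        exitMethod(0);
    }

    void ifstat() {
        enterMethod(1);
        exitMethod(1);
    }
}
```

Setting buildParseTree to true before calling s() yields a tree rooted at "s" with an "ifstat" child. Leaving it false still records the enter/exit trace, which is the kind of debug output that previously needed -trace.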
Summarizing discussion from people on the interest list.
Editor
Likes
the editor works pretty well to help with auto indenting etc to make things look pretty and provide easy to read formatting.
Dislikes
editor is quirky
forward and backward arrows don't always work
undo is character by character
a number of people pointed out the inefficient and sluggish error checking and syntax highlighting. there is little user benefit in keystroke-by-keystroke checking while the user is typing,…
After a few weeks away from ANTLR v4 coding, I'm back to thinking about tree grammars and the automated generation of tree visitors. I recently replaced a number of tree grammars in ANTLR v4 itself with much simpler visitor implementations. Doesn't require a separate specification and is much easier to debug. I made an ubervisitor that actually matches patterns in the tree rather than nodes (using a single prototype tree grammar) and then calls listener functions.…
Introduction
I'm abandoning this post mid-stream...seems that regular alternatives can match erroneous input just as easily as so-called error alternatives. Because of adaptive LL(*), it shouldn't affect production speed at all once it gets warmed up.
ANTLR has a built-in mechanism to detect, report, and recover from syntax errors. It seems to do a pretty good job. Certainly it's better than PEG, which can't detect errors until EOF.…
At long last, I'm back on the ANTLR v4 rebuild after a 9-month hiatus to write an academic LL(*) paper with Kathleen Fisher and release StringTemplate v4. Woot!
Ok, so what does all that title nonsense have to do with ANTLR v4? Well, v4 will use all those things at some point, either in analysis or in the generated code. I'm proposing something a little different for v4: Along with a recursive-descent parser,…
After reading more about whitespace handling in scannerless parser generators (e.g., GLR, PEG), it looks like you have to manually insert references to whitespace rules after every "token rule" and one at the beginning of the parse. So apparently, ANTLR is a scannerless parser generator if you simply use characters as tokens. This page shows not only how to build a real scannerless parser in ANTLR but also how to build abstract syntax trees (i.e., not parse trees)!…
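To make the whitespace convention concrete, here is a hand-written scannerless recognizer in plain Java. It is not an ANTLR grammar and not generated code; the Scannerless class and the toy rule stat : ID '=' INT ';' are assumptions chosen only to show where the whitespace references go: once at the start of the parse and once at the end of every "token rule".

```java
// Hypothetical scannerless recognizer for the toy rule: stat : ID '=' INT ';'
// Characters are the tokens; there is no separate lexer.
class Scannerless {
    private final String in;
    private int p = 0;   // index of the next unconsumed character

    Scannerless(String in) {
        this.in = in;
    }

    // Start rule: skip whitespace once up front, then rely on each
    // token rule to consume the whitespace that follows it.
    boolean stat() {
        ws();
        return id() && lit('=') && integer() && lit(';') && p == in.length();
    }

    // "Token rule" ID: one or more letters, then trailing whitespace.
    private boolean id() {
        int start = p;
        while (p < in.length() && Character.isLetter(in.charAt(p))) p++;
        if (p == start) return false;
        ws();
        return true;
    }

    // "Token rule" INT: one or more digits, then trailing whitespace.
    private boolean integer() {
        int start = p;
        while (p < in.length() && Character.isDigit(in.charAt(p))) p++;
        if (p == start) return false;
        ws();
        return true;
    }

    // Single-character literal, then trailing whitespace.
    private boolean lit(char c) {
        if (p < in.length() && in.charAt(p) == c) {
            p++;
            ws();
            return true;
        }
        return false;
    }

    // Whitespace rule: consume zero or more whitespace characters.
    private void ws() {
        while (p < in.length() && Character.isWhitespace(in.charAt(p))) p++;
    }
}
```

Because whitespace is only skipped after a complete token rule, an input such as "x = 4 2;" is rejected: the digits 4 and 2 are not glued into one INT.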
Scannerless parser generators have an advantage over separate lexers and parsers: it's much easier to create island grammars, combine components of grammars, and deal with context-sensitive lexical constructs. I still think I prefer tokenizing the input, but thought I would run an experiment to see what a scannerless ANTLR grammar would look like.
I started out with the grammar that contained an LL(*) but non-LL(k) rule (stat). Because we're looking at characters as tokens,…
I just tested the new version of ANTLR that uses ST v4 instead of v3. In terms of code generation, it's just about twice as fast given plenty of memory (750M). To process Jim Idle's 15,296-line TSQL grammar, code generation takes 1760ms instead of 3624ms, though that doesn't alter the overall wall clock performance much. It still takes about 12 seconds to process the grammar. It generates a whopping 168,677 lines of Java code (not including the lexer). That gives us about 95,…
Raw notes for thinking about programming language productivity...will update as I get more thoughts and time to flesh out.
interoperability via REST / sockets
lambdas to reuse strategies over structures
streams from data structures
pure funcs not methods needed
packages to hide name space
stack variables shared by multiple funcs like antlr's scopes?
register comparators and such for diff types so sort, find, etc... work on new data structures?
built in templates, sets, arrays, lists, trees,…
I've moved this content to the v4 ANTLR pages.
Links to Terence Notes on Antlr3.
These are the old non-wiki-based entries.
*lookahead, analysis
*lexers, parser integration
*tree grammars, parsing
*code generation
*semantic predicate hoisting
*error reporting, recovery
*ASTs, parse trees, transformation
*Aspects, Actions, Rewriting, Attributes