Methodology: typed "antlr grammar" into sourceforge.net and google
code.  Downloaded source for any project that wasn't obviously
something not using antlr. (lots of parser generators matching those
keywords).

Initial count:
~/research/papers/LL-star/grammars-in-wild $ find . -name '*.g' |wc -l
     110

rm ANTLR temp files like Gaml__.g and generated grammars from
inheritance: expandedGnuCParser.g (v2?). Discount ANTLRv3.g
ANTLRv3Tree.g from distribution

$ find . -name '*__.g' -exec rm {} \;
$ find . -name '*expanded*.g' -exec rm {} \;
$ find . -name ANTLRv3Tree.g -exec rm {} \;
$ find . -name ANTLRv3.g -exec rm {} \;

Count:

$ find . -name '*.g' |wc -l
     103

Some are dups:

$ find . -name '*.g' | sed 's#.*/\(.*.g\)#\1#' | sort | uniq -c |sort -r -n
   6 TreePHP.g
   6 CompilerAst.g
   2 generic.g
   2 Jiffle.g
   2 ComponentTreeParser.g
   2 ComponentParser.g
...

rm dups by copying to single dir:

$ find . -name '*.g' -exec cp {} /tmp/flat \;

Count:

$ ls flat|wc -l
89

Compute stats (ugh):

$ more *.g

track lexers, combined, parser, tree parser and actions/no-actions
Count v2 also. 

Include sem preds.

Don't count ANTLR tree construction related or lexer actions like skip(). Don't count @members

$ ls > stats

Then add 1 or 0 column depending on whether or not the actions in the grammar.

~/research/papers/LL-star/grammars-in-wild $ grep 0 stats | wc -l
      22
~/research/papers/LL-star/grammars-in-wild $ grep 1 stats | wc -l
      68

$ grep '^tree grammar' *.g |wc -l
      28
$ grep 'lexer grammar' *.g |wc -l
       1
$ grep '^grammar ' *.g |wc -l
      19

$ grep -l '^class ' *.g|wc -l   # v2 grammars
      41
$ grep 'extends TreeParser' *.g | wc -l  # v2 tree grammars
      15

Totals up to 89. sanity check. 41 v2, 48 v3

43 tree grammars

Compute size of grammars in lines

~/research/papers/LL-star/grammars-in-wild/flat $ for f in `cat /tmp/g`; do wc -l $f ; done | awk '{n += $1} END {print n/89}'
660.775