Token attributes
| attribute | description |
| text | |
| type | |
| line | |
| index | |
| pos | |
| channel | |
| tree | |
| int | |
Rule attributes
Parsers
| attribute | description |
| text | |
| start | |
| stop | |
| tree | |
| st | |
Tree parsers
| attribute | description |
| text | |
| start | |
| tree | |
| st | |
Lexers
| attribute | description |
| text | |
| type | |
| line | |
| index | |
| pos | |
| channel | |
| start | |
| stop | |
| int | |
The Rule text Attribute in Tree Grammars
In a parser grammar, the relationship between the elements matched by a rule and the associated input text is very clear. A rule begins parsing at a particular token and stops parsing at a particular token. The text attribute for a rule, $text, is simply the concatenated text from all tokens in that range, including hidden channel tokens. What does $text mean in a tree grammar, though?
Tree grammar rules match nodes and trees, not tokens. Fortunately, each node has an associated token start and stop index (see TreeAdaptor). As the parser builds trees, each rule sets the token indexes for its return AST to the start and stop token of that rule. We can then define the text attribute for a tree grammar rule to be the text concatenated from the token range recorded in the root of the first tree matched by the rule. This definition may seem strange, but it is the most efficient implementation and works in almost all situations. Here are a few examples:
The following code embodies the text attribute definition. The token range from a rule's start node defines the range of text for the entire rule.
int start = input.getTreeAdaptor().getTokenStartIndex($start);
int stop = input.getTreeAdaptor().getTokenStopIndex($start);
String text = input.getTokenStream().toString(start, stop);
Be careful when referencing the text of a rule that happens to match the root of a tree. The text of a rule is the text of all tokens underneath the first root matched by the rule. For example, a rule op may match only a single operator node, but $op.text will also include the text associated with the two operands. The parser that builds the plus and multiply operator nodes sets each node's token range to include all tokens for that expression.
Note that the text for a node label is always just the string returned from getText() invoked on that node, whereas the text for a rule reference is always the text for the tree rooted at that labeled node.
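The difference between node text and rule text can be sketched with a toy model. This is not ANTLR's runtime: the classes `OpTextDemo` and `Node` and the methods `nodeText` and `ruleText` are hypothetical names, and the token stream is simplified to a list of strings in which the spaces stand in for hidden-channel tokens.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these are hypothetical classes, not ANTLR's runtime types.
final class OpTextDemo {
    /** A tree node with its own token text and the token range its subtree covers. */
    static final class Node {
        final String tokenText;
        final int startIndex;
        final int stopIndex;

        Node(String tokenText, int startIndex, int stopIndex) {
            this.tokenText = tokenText;
            this.startIndex = startIndex;
            this.stopIndex = stopIndex;
        }
    }

    /** Text of a node label: just the node's own token text, like getText(). */
    static String nodeText(Node node) {
        return node.tokenText;
    }

    /** Text of a rule reference: every token in the root's range, hidden ones included. */
    static String ruleText(List<String> tokens, Node root) {
        return String.join("", tokens.subList(root.startIndex, root.stopIndex + 1));
    }

    public static void main(String[] args) {
        // Tokens for "3 + 4"; the two spaces stand in for hidden-channel tokens.
        List<String> tokens = Arrays.asList("3", " ", "+", " ", "4");
        // The parser is assumed to have given the '+' root the whole expression's range.
        Node plus = new Node("+", 0, 4);
        System.out.println(nodeText(plus));         // prints "+"
        System.out.println(ruleText(tokens, plus)); // prints "3 + 4"
    }
}
```

The operator node itself carries only "+", but because its token range spans the whole expression, the rule-level text comes out as the full "3 + 4".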
Finally, here is the case where the definition of the text attribute does not do what you expect. The text attribute is derived from the first node matched by a rule, so a rule such as slist that matches multiple subtrees has an ill-defined text attribute: it only gives you the text for the first statement subtree.
In general, you just need to keep this in mind; the text attribute behaves naturally in most cases.
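The slist caveat can be illustrated the same way. The sketch below is a toy model with hypothetical names (`SlistTextDemo`, `Node`, `ruleText`, `spanText`), not ANTLR code. It assumes each statement node's range includes its terminating semicolon. The `spanText` helper is only one possible workaround and is not something the definition above prescribes.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical classes, not ANTLR's runtime types.
final class SlistTextDemo {
    /** A subtree root carrying the token range set by the parser. */
    static final class Node {
        final int startIndex;
        final int stopIndex;

        Node(int startIndex, int stopIndex) {
            this.startIndex = startIndex;
            this.stopIndex = stopIndex;
        }
    }

    /** Mirrors the definition: only the first matched node's range is used. */
    static String ruleText(List<String> tokens, List<Node> matched) {
        Node start = matched.get(0);
        return String.join("", tokens.subList(start.startIndex, start.stopIndex + 1));
    }

    /** Possible workaround (an assumption): span from the first start to the last stop. */
    static String spanText(List<String> tokens, List<Node> matched) {
        Node first = matched.get(0);
        Node last = matched.get(matched.size() - 1);
        return String.join("", tokens.subList(first.startIndex, last.stopIndex + 1));
    }

    public static void main(String[] args) {
        // Tokens for "a=1; b=2;"; the space at index 4 stands in for a hidden token.
        List<String> tokens = Arrays.asList("a", "=", "1", ";", " ", "b", "=", "2", ";");
        // Two statement subtrees matched by one slist-like rule.
        List<Node> stats = Arrays.asList(new Node(0, 3), new Node(5, 8));
        System.out.println(ruleText(tokens, stats)); // prints "a=1;"
        System.out.println(spanText(tokens, stats)); // prints "a=1; b=2;"
    }
}
```

If the text of every matched subtree is needed, collecting it inside the rule's own actions, per statement, may be simpler than relying on the rule-level text attribute.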
Rule scopes
Global shared scopes
http://www.antlr.org:8080/pipermail/antlr-interest/2007-November/024617.html
In a tree parser rule action, to use the ".text" attribute with a
matched element reference, you must provide the original TokenStream
from which the tree was made. If you do not, then input.getTokenStream()
will return null and you get a NullPointerException.
For example this line...
...after you create the NodeStream.
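The failure mode described above can be reproduced in a toy model. `ToyNodeStream` and `text` are hypothetical stand-ins, not ANTLR classes. In the ANTLR 3 Java runtime the corresponding setter is, to the best of my knowledge, `CommonTreeNodeStream.setTokenStream(TokenStream)`, called right after constructing the node stream.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical classes that mimic the shape of the problem.
final class TokenStreamDemo {
    /** Stand-in for a tree node stream that may or may not know its token stream. */
    static final class ToyNodeStream {
        private List<String> tokens; // stays null until setTokenStream is called

        void setTokenStream(List<String> tokens) {
            this.tokens = tokens;
        }

        List<String> getTokenStream() {
            return tokens;
        }
    }

    /** Computes rule text from a token range; fails if no token stream is attached. */
    static String text(ToyNodeStream input, int start, int stop) {
        return String.join("", input.getTokenStream().subList(start, stop + 1));
    }

    public static void main(String[] args) {
        ToyNodeStream nodes = new ToyNodeStream();
        try {
            text(nodes, 0, 2); // getTokenStream() returns null here
        } catch (NullPointerException e) {
            System.out.println("no token stream attached");
        }
        // Analogous to attaching the original TokenStream after creating the NodeStream.
        nodes.setTokenStream(Arrays.asList("x", "+", "y"));
        System.out.println(text(nodes, 0, 2)); // prints "x+y"
    }
}
```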