In Chapter 10, Attributes and Actions, we learned how to embed actions within grammars and looked at the most common token and rule attributes. This section summarizes the important syntax and semantics from that chapter and provides a complete list of all available attributes.
Actions are blocks of text written in the target language and enclosed in curly braces. The recognizer triggers them according to their locations within the grammar. For example, the following rule emits "found a decl" after the parser has seen a valid declaration:
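A minimal sketch of such a rule, assuming the Java target. The grammar name Decl and the type, ID, and WS rules are hypothetical and are there only to make the fragment complete:

```antlr
grammar Decl;   // hypothetical grammar name

// The action runs once the parser has matched type, ID, and ';'.
decl : type ID ';' {System.out.println("found a decl");} ;
type : 'int' | 'float' ;

ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;
```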
Most often, actions access the attributes of tokens and rule references:
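As a sketch of what that can look like (Java target assumed; the grammar, rule, and label names are illustrative), the first alternative below reads attributes through the token name ID and the rule name type, while the second uses the hypothetical labels t and id to tell two ID references apart:

```antlr
grammar DeclAttrs;   // hypothetical grammar name

decl : type ID ';'
       {System.out.println("var "+$ID.text+":"+$type.text+";");}
     | t=ID id=ID ';'
       {System.out.println("var "+$id.text+":"+$t.text+";");}
     ;
type : 'int' | 'float' ;

ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;
```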
Token Attributes
All tokens have a collection of predefined, read-only attributes. The attributes include useful token properties such as the token type and the text matched for a token. Actions can access these attributes via $label.attribute, where label labels a particular instance of a token reference (a and b in the example below are used in the action code as $a and $b). Often, a particular token is referenced only once in the rule, in which case the token name itself can be used unambiguously in the action code (token INT can be used as $INT in the action). The following example illustrates token attribute expression syntax:
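A sketch of such a rule, assuming the Java target. The grammar name, the lexer rules, and the printed messages are hypothetical. The labels a and b and the tokens INT, ID, and FLOAT follow the surrounding text:

```antlr
grammar TokenAttrs;   // hypothetical grammar name

// INT is referenced once, so the token name can be used after the dollar sign.
// FLOAT is referenced twice, so each reference carries a label.
r : INT {int x = $INT.line;}
    ( ID {if ($INT.line == $ID.line) System.out.println("ID on same line as INT");} )?
    a=FLOAT b=FLOAT
    {if ($a.line == $b.line) System.out.println("both FLOATs on one line");}
  ;

FLOAT : [0-9]+ '.' [0-9]+ ;
INT   : [0-9]+ ;
ID    : [a-zA-Z]+ ;
WS    : [ \t\r\n]+ -> skip ;
```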
The action within the (...)? subrule can see the INT token matched before it in the outer level.
Because there are two references to the FLOAT token, a reference to $FLOAT in an action is not unique; you must use labels to specify which token reference you’re interested in.
Token references within different alternatives are unique because only one of them can be matched for any invocation of the rule. For example, in the following rule, actions in both alternatives can reference $ID directly without using a label:
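One way this might look, with hypothetical 'print' and 'show' keywords standing in for whatever precedes ID in each alternative (Java target assumed):

```antlr
grammar AltRefs;   // hypothetical grammar name

// Each alternative contains exactly one ID reference, so no label is needed.
// The first action uses attribute syntax; the second calls a method on the Token object.
r : 'print' ID {System.out.println($ID.text);}
  | 'show'  ID {System.out.println($ID.getText());}
  ;

ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;
```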
To access the tokens matched for literals, you must use a label:
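A short sketch, assuming the Java target. The label r, the expr rule, and the grammar name are illustrative:

```antlr
grammar ReturnLine;   // hypothetical grammar name

// The literal has no token name to put after the dollar sign, so label r is used.
stat : r='return' expr ';' {System.out.println("line="+$r.line);} ;
expr : INT ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```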
Most of the time you access the attributes of the token, but sometimes it is useful to access the Token object itself because it aggregates all the attributes. Further, you can use it to test whether an optional subrule matched a token:
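The test could be sketched as follows, assuming the Java target. The label el, the second alternative, and the helper rules are hypothetical. The label is null when the optional subrule matched nothing:

```antlr
grammar IfElse;   // hypothetical grammar name

// The label el holds the Token for the literal, or null if the subrule was skipped.
stat : 'if' expr 'then' stat (el='else' stat)?
       {if ( $el!=null ) System.out.println("found an else");}
     | ID '=' expr ';'
     ;
expr : INT ;

ID  : [a-zA-Z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```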
$T and $l evaluate to Token objects for token name T and token label l. $ll evaluates to List<Token> for list label ll. $T.attr evaluates to the type and value specified in the following table for attribute attr:
Attribute (Type): Description
text (String): The text matched for the token; translates to a call to getText.
type (int): The token type (nonzero positive integer) of the token such as INT; translates to a call to getType.
line (int): The line number on which the token occurs, counting from 1; translates to a call to getLine.
pos (int): The character position within the line at which the token’s first character occurs, counting from zero; translates to a call to getCharPositionInLine.
index (int): The overall index of this token in the token stream, counting from zero; translates to a call to getTokenIndex.
channel (int): The token’s channel number. The parser tunes to only one channel, effectively ignoring off-channel tokens. The default channel is 0 (Token.DEFAULT_CHANNEL); translates to a call to getChannel.
int (int): The integer value of the text held by this token; it assumes that the text is a valid numeric string. Handy for building calculators and so on. Translates to Integer.valueOf applied to the token’s text.
Parser Rule Attributes
ANTLR predefines a number of read-only attributes associated with parser rule references that are available to actions. Actions can access rule attributes only for references that precede the action. The syntax is $r.attr for rule name r or a label assigned to a rule reference. For example, $expr.text returns the complete text matched by a preceding invocation of rule expr. Using a rule label looks like this:
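Both forms can be sketched side by side, assuming the Java target. The rule names returnStat and labeledReturnStat, the label e, and the trivial expr rule are hypothetical:

```antlr
grammar ReturnText;   // hypothetical grammar name

// By rule name: the action refers to the expr reference that precedes it.
returnStat        : 'return' expr   {System.out.println("matched "+$expr.text);} ;

// By rule label: the action reaches the same kind of reference through label e.
labeledReturnStat : 'return' e=expr {System.out.println("matched "+$e.text);} ;

expr : INT ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```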
You can also use $ followed by the name of the attribute to access the value associated with the currently executing rule. For example, $start is the starting token of the current rule.
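For instance, a rule might print its own first token like this (a sketch assuming the Java target; the rule and grammar names are illustrative):

```antlr
grammar ReturnStart;   // hypothetical grammar name

// No rule name or label follows the dollar sign, so start belongs to returnStat itself.
returnStat : 'return' expr {System.out.println("first token "+$start.getText());} ;
expr : INT ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```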
$r and $rl evaluate to ParserRuleContext objects of type RContext for rule name r and rule label rl. $rll evaluates to List<RContext> for rule list label rll. $r.attr evaluates to the type and value specified in the following table for attribute attr:
Attribute (Type): Description
text (String): The text matched for a rule or the text matched from the start of the rule up until the point of the $text expression evaluation.
start (Token): The first token to be potentially matched by the rule that is on the main token channel; in other words, this attribute is never a hidden token. For rules that end up matching no tokens, this attribute points at the first token that could have been matched by this rule. When referring to the current rule, this attribute is available to any action within the rule.
stop (Token): The last nonhidden channel token to be matched by the rule. When referring to the current rule, this attribute is available only to the after and finally actions.
ctx (ParserRuleContext): The rule context object associated with a rule invocation. All of the other attributes are available through this attribute. For example, $ctx.start accesses the start field within the current rule’s context object; it is the same as $start.
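The stop and ctx attributes of the current rule could be exercised as in the sketch below, which assumes the Java target. The grammar and rule names are hypothetical. The stop attribute is read in an after action because it is not available to ordinary actions inside the rule:

```antlr
grammar CtxAttrs;   // hypothetical grammar name

// The after action runs once the whole rule has matched, so stop is set by then.
// The inline action compares the start field reached through ctx with the start attribute.
returnStat
@after {
    System.out.println("last token "+$stop.getText());
}
    : 'return' expr ';'
      {
      System.out.println($ctx.start == $start);
      }
    ;
expr : INT ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```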
Dynamically Scoped Attributes
You can pass information to and from rules using parameters and return values, just like functions in a general-purpose programming language. Programming languages don’t allow functions to access the local variables or parameters of invoking functions, however. For example, in Java, a method called directly or indirectly by a method f cannot refer to a local variable x declared in f.
x is available only within the scope of f, which is the text lexically delimited by curly brackets. For this reason, Java is said to use lexical scoping. Lexical scoping is the norm for most programming languages. Languages that allow methods further down in the call chain to access local variables defined earlier are said to use dynamic scoping. The term dynamic refers to the fact that a compiler cannot statically determine the set of visible variables. This is because the set of variables visible to a method changes depending on who calls that method.
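The lexical scoping restriction applies equally to Java code embedded in a grammar. The sketch below places three hypothetical helper methods, f, g, and h, in a @members action; the offending reference is left commented out because the generated parser would otherwise fail to compile:

```antlr
grammar LexicalScope;   // hypothetical grammar name

@members {
    // Plain Java helper methods; f, g, and h are hypothetical names.
    void f() {
        int x = 0;   // local to f
        g();
    }
    void g() {
        h();
    }
    void h() {
        // int y = x;   would not compile: x is not in scope here
    }
}

prog : ID ;

ID : [a-z]+ ;
WS : [ \t\r\n]+ -> skip ;
```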
It turns out that, in the grammar realm, distant rules sometimes need to communicate with each other, mostly to provide context information to rules matched below in the rule invocation chain. (Naturally, this assumes that you are using actions directly in the grammar instead of the parse-tree listener event mechanism.) ANTLR allows dynamic scoping in that actions can access attributes from invoking rules using syntax $r::x, where r is a rule name and x is an attribute within that rule. It is up to the programmer to ensure that r is in fact an invoking rule of the current rule. A runtime exception occurs if r is not in the current call chain when you access $r::x.
To illustrate the use of dynamic scoping, consider the real problem of defining variables and ensuring that variables in expressions are defined. The following grammar defines the symbols attribute where it belongs in the block rule but adds variable names to it in the rule that matches declarations. Rule stat then consults the list to see whether variables have been defined.
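A sketch of such a grammar, assuming the Java target. The rules block and stat and the attribute symbols come from the text; the grammar name DynScope, the rule name decl, the lexer rules, and the printed messages are assumptions made for illustration:

```antlr
grammar DynScope;   // assumed grammar name

@header { import java.util.*; }

prog : block ;

// symbols is a local of block, so every block invocation gets its own list.
block
locals [ List<String> symbols = new ArrayList<String>() ]
    : '{' decl* stat+ '}'
      {System.out.println("symbols="+$symbols);}
    ;

// Add each declared name to the list owned by the enclosing block invocation.
decl : 'int' ID {$block::symbols.add($ID.text);} ';' ;

// Report an assignment whose left-hand side is missing from that list.
stat : ID '=' INT ';'
       {
       if ( !$block::symbols.contains($ID.text) ) {
           System.err.println("undefined variable: "+$ID.text);
       }
       }
     | block
     ;

ID  : [a-z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

Because decl and stat are only ever invoked from within block, the requirement that block be in the current call chain holds by construction.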
Here’s a simple build and test sequence:
There’s an important difference between a simple field declaration in a @members action and dynamic scoping. symbols is a local variable and so there is a copy for each invocation of rule block. That’s exactly what we want for nested blocks so that we can reuse the same input variable name in an inner block. For example, the following nested code block redefines i in the inner scope. This new definition must hide the definition in the outer scope.
Here’s the output generated for that input by
$block::symbols accesses the symbols field of the most recently invoked block’s rule context object. If you need access to a symbols instance from a rule invocation farther up the call chain, you can walk backwards starting at the current context, $ctx. Use getParent to walk up the chain.
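One way to walk the chain is sketched below, assuming the Java target. The grammar name, the rule name decl, and the lookup policy (search every enclosing block, innermost first) are hypothetical. BlockContext is the context class ANTLR generates for rule block, and symbols is a public field on it:

```antlr
grammar DynScopeChain;   // hypothetical grammar name

@header { import java.util.*; }

prog : block ;

block
locals [ List<String> symbols = new ArrayList<String>() ]
    : '{' decl* stat+ '}'
    ;

decl : 'int' ID {$block::symbols.add($ID.text);} ';' ;

// Search every enclosing block invocation, innermost first, by following getParent.
stat : ID '=' INT ';'
       {
       boolean defined = false;
       for (ParserRuleContext c = $ctx; c != null && !defined; c = c.getParent()) {
           if ( c instanceof BlockContext ) {
               defined = ((BlockContext)c).symbols.contains($ID.text);
           }
       }
       if ( !defined ) System.err.println("undefined variable: "+$ID.text);
       }
     | block
     ;

ID  : [a-z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

With this policy, a variable declared in an outer block counts as defined inside a nested block, which the single $block::symbols lookup does not provide.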