  Grammars
Added by Terence Parr, last edited by Terence Parr on May 16, 2008

Grammar syntax

There are four kinds of grammars: lexer, parser, tree, and combined (no modifier). All of them have the same overall form:

/** This is a grammar doc comment */
grammar-type grammar name;
options { name1 = value; name2 = value2; ... }
import delegateName1=grammar1, ..., delegateNameN=grammarN; // can omit delegateName
tokens { token-name1; token-name2 = value; ... }
scope global-scope-name-1 { «attribute-definitions» }
scope global-scope-name-2 { «attribute-definitions» }
...
@header {...}
@lexer::header {...}
@members {...}

«rules»

To set the superclass of the generated class, use the superClass option.
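
As a rough illustration of the template above, here is a minimal sketch of a combined grammar. Every name in it (Expr, MyBaseParser, com.example.expr, NEGATE, PLUS, errorCount) is hypothetical and chosen only for the example; MyBaseParser is assumed to be a class you supply that extends the ANTLR Parser class.

```antlr
/** Hypothetical combined grammar; all names are illustrative. */
grammar Expr;

options {
    language   = Java;          // target language for generated code
    superClass = MyBaseParser;  // assumed user-written base class
}

tokens {
    NEGATE;        // token name with no associated literal
    PLUS = '+';    // token name bound to a literal
}

@header        { package com.example.expr; }   // goes into the parser
@lexer::header { package com.example.expr; }   // goes into the lexer

@members {
    int errorCount = 0;   // becomes a field of the generated parser
}

prog : expr EOF ;
expr : INT (PLUS INT)* ;

INT : '0'..'9'+ ;
WS  : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```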

Rule syntax

/** rule comment */
access-modifier rule-name[«arguments»] returns [«return-values»] throws name1, name2, ...
options {...}
scope {...}
scope global-scope-name, ..., global-scope-nameN;
@init {...}
@after {...}
    : «alternative-1» -> «rewrite-rule-1»
    | «alternative-2» -> «rewrite-rule-2»
    ...
    | «alternative-n» -> «rewrite-rule-n»
    ;
    catch [«exception-arg-1»] {...}
    catch [«exception-arg-2»] {...}
    finally {...}
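
To make the rule template more concrete, the following sketch shows one rule that uses a return value, @init and @after actions, labels, and explicit catch and finally clauses. It assumes the Java target; the grammar name Sum and the rule and label names are made up for the example. The body of the catch clause simply calls the recognizer's reportError and recover methods.

```antlr
grammar Sum;

/** Adds up a list of integers such as 1 + 2 + 3 */
sum returns [int value]
@init  { $value = 0; }
@after { System.out.println("sum = " + $value); }
    :   a=INT { $value = Integer.parseInt($a.text); }
        ( '+' b=INT { $value += Integer.parseInt($b.text); } )*
        EOF
    ;
    catch [RecognitionException re] {
        reportError(re);       // report the error
        recover(input, re);    // then try to resynchronize
    }
    finally {
        /* cleanup code would go here */
    }

INT : '0'..'9'+ ;
WS  : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```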

Lexer rules

Rules in a lexical grammar are token names:

ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;

Here are some common lexical rules for programming languages:

WS  :  (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;}
    ;

COMMENT
    :   '/*' .* '*/' {$channel=HIDDEN;}
    ;
LINE_COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    ;

The $channel=HIDDEN; action places those tokens on a hidden channel. They remain in the token stream, but the parser skips over them when matching. Actions, however, can ask for the hidden channel tokens. If you want to throw tokens out entirely, so that they never reach the token stream, use the action skip(); instead.
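
The difference between the two approaches might look like this in a small lexer grammar. This is only a sketch, assuming the Java target; the grammar name Trivia is hypothetical.

```antlr
lexer grammar Trivia;

// Whitespace stays in the token stream on the hidden channel,
// so actions can still look at it later.
WS  : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;}
    ;

// Line comments are discarded: skip() tells the lexer not to
// emit a token for this match at all.
LINE_COMMENT
    : '//' ~('\n'|'\r')* {skip();}
    ;

ID  : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
```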

Sometimes you will need helper rules to make your lexer grammar more readable. Put the fragment modifier in front of such a rule:

HexLiteral : '0' ('x'|'X') HexDigit+ ;

fragment
HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;

In this case, HexDigit is not a token in its own right; the lexer never emits it, and it can only be invoked from other lexer rules such as HexLiteral.

Tree grammar rules

Rules in tree grammars are identical to rules in parser grammars, except that they can also specify a tree element to match. The syntax is:

^( root child1 child2 ... childn )

For example:

decl : ^(DECL type declarator) {System.out.println($type.text+" "+$declarator.text);}
     ;
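
For context, here is one way the two sides could fit together: a parser grammar that builds the DECL tree and a tree grammar that walks it. This is a sketch, not a definitive setup. The file names, the grammar names Decl and DeclWalker, and the type and declarator rules are all assumptions made for the example.

```antlr
// ---- File Decl.g: parser that builds ^(DECL type declarator) ----
grammar Decl;
options { output = AST; }
tokens  { DECL; }               // imaginary root node

decl       : type declarator ';' -> ^(DECL type declarator) ;
type       : 'int' | 'float' ;
declarator : ID ;

ID : ('a'..'z'|'A'..'Z'|'_')+ ;
WS : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;

// ---- File DeclWalker.g: tree grammar that matches that tree ----
tree grammar DeclWalker;
options {
    tokenVocab   = Decl;        // reuse token types from Decl.g
    ASTLabelType = CommonTree;  // node type of the incoming tree
}

decl : ^(DECL type declarator)
       {System.out.println($type.text+" "+$declarator.text);}
     ;
type       : 'int' | 'float' ;
declarator : ID ;
```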

Attribute scope syntax

An attribute scope is a set of attribute definitions of the form:

scope name {
    type1 attribute-name1;
    type2 attribute-name2;
}
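
As an illustration, the sketch below declares a global scope and uses it from two rules. It assumes the Java target; the names Scoped, Symbols, names, block, and decl are hypothetical. A rule that lists the scope gets a fresh set of attributes on entry, and any rule invoked beneath it can reach them with the $Scope::attribute notation.

```antlr
grammar Scoped;

@header {
import java.util.Set;
import java.util.HashSet;
}

// Global scope: usable by any rule that names it.
scope Symbols {
    Set names;
}

block
scope Symbols;      // a new Symbols instance is pushed on entry
@init { $Symbols::names = new HashSet(); }
    : '{' decl* '}'
    ;

decl
    : 'int' ID ';' { $Symbols::names.add($ID.text); }
    ;

ID : ('a'..'z'|'A'..'Z'|'_')+ ;
WS : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```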

Grammar action syntax

Actions are, in general, of the form:

@action-name { ... }
@scope-name::action-name { ... }

For example, @header {...} is the same as @parser::header {...}. The action scope names differ depending on the target, but targets should support @parser and @lexer. Another common action name is members, which inserts that action into the generated class definition.
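
A small sketch of scoped actions in a combined grammar follows, assuming the Java target. The grammar name Actions, the package name, and the idCount and newlines fields are invented for the example.

```antlr
grammar Actions;

@parser::header { package com.example; }   // same as @header {...}
@lexer::header  { package com.example; }

@parser::members {
    int idCount = 0;      // field of the generated parser class
}
@lexer::members {
    int newlines = 0;     // field of the generated lexer class
}

list : ( ID { idCount++; } )+ EOF ;

ID : ('a'..'z')+ ;
NL : '\r'? '\n' { newlines++; $channel=HIDDEN; } ;
WS : (' '|'\t')+ {$channel=HIDDEN;} ;
```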

Rule elements

Rules may reference:

Element                 Description
T                       Token reference. An uppercase identifier.
T<node=V> or T<V>       Token reference with the optional token option node, which sets the node type used for tree construction; may be followed by arguments on the right-hand side of a -> rewrite rule.
T[«args»]               Lexer rule (token rule) reference. Lexer grammars may use optional arguments for fragment token rules.
r[«args»]               Rule reference. A lowercase identifier with optional arguments.
'«one-or-more-char»'    String or char literal in single quotes. In a parser, a token reference; in a lexer, matches that string.
{«action»}              An action written in the target language, executed right after the previous element and right before the next one.
{«action»}?             Semantic predicate.
{«action»}?=>           Gated semantic predicate.
(«subrule»)=>           Syntactic predicate.
(«x»|«y»|«z»)           Subrule. Like a call to a rule with no name.
(«x»|«y»|«z»)?          Optional subrule.
(«x»|«y»|«z»)*          Zero-or-more subrule.
(«x»|«y»|«z»)+          One-or-more subrule.
«x»?                    Optional element.
«x»*                    Zero-or-more element.
«x»+                    One-or-more element.
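
Several of these elements can be seen together in the following sketch, which assumes the Java target. The grammar name Elements, the allowEnum flag, and all rule names are hypothetical, and the predicates are there to show the notation rather than because this small grammar strictly needs them.

```antlr
grammar Elements;

@members { boolean allowEnum = true; }   // hypothetical feature flag

stat
    : (ID '=')=> ID '=' expr ';'          // syntactic predicate
    | {allowEnum}?=> 'enum' ID ';'        // gated semantic predicate
    | expr ';'
    ;

expr
    : '-'? atom ( ('+'|'-') atom )*       // optional element, zero-or-more subrule
    ;

atom
    : INT
    | ID { System.out.println("saw " + $ID.text); }   // action after ID
    | '(' expr ')'
    ;

ID  : ('a'..'z'|'A'..'Z'|'_')+ ;
INT : '0'..'9'+ ;                         // one-or-more element
WS  : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;} ;
```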
