Input patterns are classified into four types:
- concrete exemplar
- grammar fragment
- tree grammar fragment
- predicated expression

Note: tree grammar fragments are not supported yet.
concrete exemplar
A concrete exemplar is essentially a string enclosed in double quotes. Inside the quotes you write the exact text to match. ANTLRMorph also supports <xxx> as delimiters inside a concrete exemplar, as in StringTemplate. Concrete exemplars are especially useful for matching clear, simple patterns, e.g. mathematical formulas.
Here are some examples of concrete exemplar input patterns.
| Syntax | Description | Example |
|---|---|---|
| <token-ref> | A lexical rule defined in the input grammar. | "<INT><ID>^<INT>" |
| <rule-ref> | A grammar rule defined in the input grammar. | "(<expr>)" |
| label | Predefined labels do not need angle brackets as delimiters. This example requires the label definitions tokens{a,n=INT}. | "a x^n" |
| <label=token-ref> | Defines a label for a token reference inside the exemplar. | "<a=INT><ID>^<n=INT>" |
| <label=rule-ref> | Defines a label for a rule reference inside the exemplar. | "(<e=expr>)" |
| ... | Special attribute for concrete exemplars, used for partial matching. It can only be placed at the start or the end of the exemplar pattern. | "... x+0" |
| <...> | Same functionality as the one above. | "<...>+0<...>" |
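To make the angle-bracket forms in the table concrete, here is a minimal Python sketch that pulls the `<ref>`, `<label=ref>` and `<...>` elements out of an exemplar string. It is only an illustration of the syntax, not Morph's actual parser; the names `ELEMENT` and `parse_elements` are hypothetical.

```python
import re

# Hypothetical helper, not part of ANTLRMorph: matches <ref>, <label=ref> and <...>.
ELEMENT = re.compile(r"<(?:(\w+)=)?(\w+|\.\.\.)>")

def parse_elements(exemplar):
    """Return (label, reference) pairs for each angle-bracket element.

    label is None when the element carries no label.
    """
    return [(m.group(1), m.group(2)) for m in ELEMENT.finditer(exemplar)]

print(parse_elements("<a=INT><ID>^<n=INT>"))
print(parse_elements("(<e=expr>)"))
print(parse_elements("<...>+0<...>"))
```

Text outside the angle brackets, such as `^` or the parentheses, is ignored by this sketch; a real matcher would treat it as literal input to match.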
When using predefined labels in an exemplar pattern, watch out for possible ambiguity. Ambiguity can always be prevented by separating a label from the adjacent element with a space, but the extra space is not required when there is no ambiguity. The rule is that Morph always matches the longest run of alphanumeric characters. For example, in the input pattern "a x^n" above, a and n are predefined labels; the character ('x') right after a is alphanumeric, but the one ('^') before n is not. Without the space between a and 'x', Morph assumes you mean the string "ax". On the other hand, Morph recognizes n correctly even without a separating space, because '^' is not an alphanumeric character.
Here is another example:
"a1+1=a2" is recognizable, but "a1a2+0=a1a2" is not. Morph reads the first pattern, "a1+1=a2", as "<INT>+1=<INT>", but treats the second, "a1a2+0=a1a2", as the literal string "a1a2+0=a1a2" rather than the "<INT><INT>+0=<INT><INT>" you might expect.
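The longest-alphanumeric-run rule can be sketched in a few lines of Python. This is a simplified model of the behaviour described above, not Morph's implementation: the function name `resolve_labels` is hypothetical, whitespace is assumed to act only as a separator, and the examples assume a, n, a1 and a2 are predefined labels for INT.

```python
import re

def resolve_labels(exemplar, labels):
    """Split an exemplar into maximal alphanumeric runs and single symbols,
    then replace any run that is exactly a predefined label with <TOKEN>."""
    pieces = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", exemplar)
    return ["<" + labels[p] + ">" if p in labels else p for p in pieces]

labels = {"a": "INT", "n": "INT", "a1": "INT", "a2": "INT"}

print(resolve_labels("a x^n", labels))        # a and n are resolved
print(resolve_labels("ax^n", labels))         # "ax" is one run, so a is not seen
print(resolve_labels("a1+1=a2", labels))      # both labels are resolved
print(resolve_labels("a1a2+0=a1a2", labels))  # "a1a2" is one run, no labels seen
```

Because a label is only recognized when it is a whole alphanumeric run, "a1a2" never splits into a1 followed by a2; writing "a1 a2+0=a1 a2" would separate them.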
Warning: A concrete exemplar can only be used on its own in a morph rule alternative. It cannot be combined with other types of input patterns.
grammar fragment
A grammar fragment inherits its syntax from the ANTLR grammar. It also allows arbitrary rule actions and semantic predicates. Note that concrete exemplar patterns do not allow rule actions or semantic predicates.
tree grammar fragment
Under construction.
predicated expression
A predicated expression extends the ANTLR grammar syntax. It lets you state a condition on an element concisely instead of writing a long rule action. For example, to match a token INT with value 0, you can write INT<"0"> instead of INT {$INT.text.equals("0")}. Morph also accepts an arbitrary boolean expression within the angle brackets, in which rule/token attributes are accessed via special symbols. For example, the previous example can be extended to: INT<{$text.equals("0") || $text.equals("1")}>.
| Syntax | Description | Example |
|---|---|---|
| x<"y"> | x is a token/rule reference or label, and y is the value x should be equal to. | ID<"x"> + INT<"0"> |
| x<{...}> | ... is an arbitrary boolean expression. There is no need to put the rule/token name in front of the attribute. | expr<{!$text.equals("this")}> '.' ID<{$text.equals("this")}> |
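To show how the short forms relate to the longer rule-action form quoted above, here is a small Python sketch that rewrites x<"y"> and x<{...}> into that longer form. It is a textual illustration only, not how Morph processes predicated expressions; `desugar` and both regular expressions are hypothetical, and the sketch assumes `$text` in the short form stands for `$x.text`.

```python
import re

# Hypothetical patterns for the two predicated-expression forms.
LITERAL = re.compile(r'(\w+)<"([^"]*)">')   # x<"y">
EXPR = re.compile(r'(\w+)<\{(.*?)\}>')      # x<{...}>

def desugar(pattern):
    """Rewrite predicated expressions into the long form used in the text."""
    def literal(m):
        name, value = m.group(1), m.group(2)
        return name + ' {$' + name + '.text.equals("' + value + '")}'

    def expr(m):
        name, body = m.group(1), m.group(2)
        # Assumption: a bare $text refers to the element the predicate is attached to.
        return name + " {" + body.replace("$text", "$" + name + ".text") + "}"

    return EXPR.sub(expr, LITERAL.sub(literal, pattern))

print(desugar('INT<"0">'))
print(desugar('INT<{$text.equals("0") || $text.equals("1")}>'))
```

The sketch replaces every `$text` in the expression body, including one that happens to sit inside a string literal, so it is only suitable for simple expressions like those in the table.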