Introduction and language definition
Your goal in this project is to build a grammar that constructs ASTs for a simple object-oriented programming language that looks like the following samples:
class T {
int i = 3;
string foo() { return "Hi"; }
}
class Dog extends Animal {
Dog() {...} void speak() { print "woof"; }
}
There is no main() like there is in Java. Main programs are just files with statements like:
int i = 3;
print "i="+i;
list names = ["Ter", "Sriram"];
There are two built-in types: string, list, and int but everything is an object--there are no primitive types like there are in Java.
Expressions have only the following operators with the usual precedence: +, -, *, &&, ||, <, <=, >, and >=. Allow nested expressions in (...). Allow square bracket indexing for list member access like names[<expr>].
There are literals for integers and strings as you can see from the above as well as a list literal: a list of expressions enclosed in square brackets. These lists can actually be nested so that you can have a list of lists.
Method calls looks like Java: foo() or obj.foo(<args>).
Other details:
- Single inheritance
- Instance vars (assume one variable per declaration for instance, locals, and parameters)
- locals
- parameters
- no method overloading
- no packages
- java IF, WHILE, RETURN, and assignment statements
- {...} block statements
- print <expr>; statement
- new operator for new Dog()
- allow // and /.../ comments
When in doubt, do what Java does.
Tasks and deliverables
1. You must create a grammar that properly recognizes the language as we've loosely described. Everybody's grammar will be different and even the language definition will be slightly different. I am deliberately making this definition loose so that you have to do a lot of thinking not cutting and pasting from existing grammars. Work hard on the grammar because you will see it again in a future project. Call the grammar SL and put it in SL.g file. This will be a combined grammar with parser and lexer. Call the start rule compilationUnit. It should match either a class definition or a list of statements.
2. You must modify the grammar so that it constructs reasonable ASTs. Again, this is a very loose undefined specification. You must do something reasonable by looking at other grammars and reading the web and my book (and listening in class). You must be able to defend your choice of AST structures. Every input subphrase should have a reasonable AST subtree. You will need a number of imaginary tokens to represent some internal AST nodes.
3. Once you have your grammar building trees, you must generate DOT files so that you can visualize the ASTs. Here is some code that generates DOT and spits to standard output:
CharStream input = new ANTLRInputStream(System.in);
SLLexer lex = new SLLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lex);
SLParser parser = new SLParser(tokens);
SLParser. compilationUnit_return r = parser.compilationUnit();
Tree t = (Tree)r.tree;
DOTTreeGenerator gen = new DOTTreeGenerator();
StringTemplate st = gen.toDOT(t);
System.out.println(st);
Create a document that you can print out that contains a series of input sentences and their associated ASTs. You should create a series of sample input programs anyway for testing. Use these to write your document that shows the mapping of input to output AST.
4. Create a tree parser that walks the trees you have created in the previous phase. Call this SemanticPhase.g. This parser has few actions--it simply guarantees that the AST structure is correct that you built in the parser. You must print out "found class T", "found method T", "found instance variable T", "found parameter T", "found local variable T" for all of the declarations you see in the input tree. For example for the first example above you would see:
Submission
You will create a jar file called ast.jar containing source and *.class files and place in your lib directory:
You can use the svn account for development of the software to if you would like, but I will only be looking at your ast.jar file in the lib directory.
Grading
You will be graded quickly in class interactively. We will move the class to the CS labs before the end of class in order to do the grading. Generally the grading will occur on the day the project is due.
I will review your document describing the input to output mapping and look at your grammar. The quality and thoroughness of your testing is part of your grade. I will also grade cleanliness and quality of the grammar. The structure of your trees is extremely important here. For example, no structure (a linked list) gets a zero on that part.