Introduction
In this project, you will build a symbol table for the programming language from the AST Construction project. You will create two new tree grammars from the SemanticsPhase.g tree grammar I give you to add symbol table actions. You will need to track symbol definitions and references and place them in appropriate scopes. Further, you will have to chase references to classes by loading them from the disk (assume a file extension of .sl for our programming language) and adding them to the symbol table. To complete this project, you will need to study materials from the Symbol tables lecture carefully.
As an example of what I expect, consider the following two SL class definitions.
class S {
int a;
void blort() { a=1; }
}
and
class U extends T {
int i;
S s;
}
and
class T extends S { int i;
U u; void foo(string q) {
string w = q;
list x = bar();
print x;
u = new U();
if ( x==0 ) {
int i = 9;
print i;
}
a = 99; blort();
}
list bar() {
i = 3;
return ["dog", "cat"];
}
}
There will be two passes using two tree grammars: Def.g and Ref.g, which are essentially cut-and-paste from the Semantics.g tree grammar from the previous project. During the first pass, as you fill your symbol table, you will print out definitions of symbols. At the close of every scope, you will print out the list of symbols in that scope along with the name of that scope. For the above classes, you would see the following:
Then executing the tree parser from Ref.g, you must emit the following messages for the references:
Note at the Def.g tree grammar must chase references to classes by loading them from the disk in the current directory where is the reference Ref.g pass already sees them in the symbol table and does not need to reload them.
It is okay if you're output is slightly different or the order is slightly different, but you must track all of the information above and printed out in a sensible fashion.
Tasks and deliverables
- Your grammar file will build an AST as before and we will do symbol table management in the tree grammar Def.g. You need to add actions at definition. You also need to add actions to push and pop new scopes. You must allow forward references to fields and methods. Forward references are dealt with by having two passes.
- Ref.g is the same grammar as Def.g, but with different actions. You must add an action at each symbol reference spot and check to see that it is a valid symbol. During the second phase, you are not building the symbol table and, hence, you do not have ready access to the "current scope". If you look below, you will see that the AST node definition has a scope field. This field must be set during the Def.g phase. Your Ref.g phase must pull this pointer back out of the AST as it walks.
- Emit an error message if a symbol is not defined. If it is defined, print the message per the above.
- For each definition of a variable (parameter, field, or local variable), method, or class, you must print out a message per the above.
- For references to classes, load them from the disk during the definition phase. You do not need to run the Ref.g references pass on the loaded classes. You only run the Def.g phase so that we can learn about the fields and methods. Create a new Def.g tree parser instance each time you go to the disk looking for a type. Assume all types are in the current directory and with file extension ".sl".
- For each reference to a variable, method, or class, you must print out a message per the above.
- For expr.method() and expr.field references you will not know the type of expr and so you do not need to deal with these kinds of expressions. If you do the extra credit for semantic analysis, then you would be able to handle this.
- Once you reach the end of a scope (class definition, method definition, or {...} block), you must print the list of symbols and that scope per the above.
- Create a TestSym class with a main() method that accepts a file name on the commandline as the first argument. That method should open the file, parse and build and AST, run the Def.g tree parser on the tree, run the Ref.g tree parser on the tree, and then terminate.
- Deal properly with cyclic class references: C has a field of type D and D has a field of type C.
Sample symbol table objects
You should create a class hierarchy as described in the following diagram (except and we do not have to deal with interfaces or packages in SL):

The following code snippets may be useful to you.
public class Symbol {
/** All symbols at least have a name */
public String name;
/** Classes, methods, variables, and closures have types */
public Symbol type;
/** All symbols know what scope contains them. */
public Scope scope;
/** Often we want to know where in tree this symbol was defined. */
public Tree definition;
public Symbol(String name, Symbol type) {
this.name = name;
this.type = type;
}
}
public class SLAST extends CommonTree {
/** MethodSymbol etc... track symbol ptr */
public Symbol symbol;
/** If node defines a scope such as a class definition node or
* local {...} scope etc., this points to the Scope of symbols.
*/
public Scope scope;
}
class SymbolTable {
public static final String[] predefinedTypeNames = {
"int", "string", "list"
};
/** predefined Object */
ClassSymbol objectRoot;
/** List of all classes defined or loaded while parsing the input file */
List<ClassSymbol> classes = new ArrayList<ClassSymbol>();
public SymbolTable() { /* defined predefined IntTypeSymbol, Object etc. */ }
...
}
Extra credit
If you would like to get extra credit, you can implement some type analysis to verify that the two sides of an operator have the same type. For example, the following syntactically valid subphrases are semantically invalid due to type conflicts:
int i = "foo";
34 * "99"
Dog d = new Cat();
Be careful to allow assignments where the left-hand side is a supertype of the right-hand side:
To demonstrate that you have completed some static analysis, you will provide some sample input for which your program emits static analysis errors.
Submission
You will create a jar file called sym.jar containing grammars, source, and *.class files and place in your lib directory:
You can use the svn account for development of the software to if you would like, but I will only be looking at your sym.jar file in the lib directory.
Grading
Make sure your jar has the TestSym in the default package with a main method so that I can test your code. You can give me a number of samples in your jar at the top level so that I can test your code using your samples.
You will be graded on the actions within the grammar as well as the symbol table code. I will test your program on valid and invalid input.