Your goal for this project is to build a parser for Smalltalk that constructs suitable ASTs. You also need a tree grammar that walks those trees. I provide a test rig/driver for you: a class called smalltalk.compiler.Compiler. (Note that you will be using packages for the remainder of the semester.) Your deliverables, then, are Smalltalk.g and SmalltalkTreeWalker.g.
File format
Smalltalk traditionally had no file format, since it did everything within the development environment. It does have file-out and file-in mechanisms, but we will define our own, more readable version. Here is how we would define the Boolean class:
Classes with non-primitive methods and fields look like this:
The grammar
A file in our syntax is a list of classes followed optionally by a list of statements followed by EOF. Class definitions look like:
That's equivalent to the following Java code:
The file can also be just a list of statements like:
or both:
The implied class is called MainClass and the implied method is called doIt.
There are lots of Smalltalk grammars out there on the web, though I was not really happy with any of the ones I saw. I made a very clean grammar that is about 130 lines. You will have to abstract a grammar from your knowledge of Smalltalk and from the example programs. To aid you in your quest, I'm providing copious unit tests. See the attached files. There are a total of 93 unit tests, all of which pass with my solution.
As a hint, here is the list of my rule names: file, classDef, method, methodBlock, block, body, locals, expr, messageExpression, keywordExpression, binaryExpression, bop (as in binary operator), opchar, unaryExpression, primary, literal, array. My token names are: ID, KEYWORD, SYMBOL, COMMENT, CHAR, NUMBER, STRING, WS.
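The rule names keywordExpression, binaryExpression, and unaryExpression suggest the usual Smalltalk message precedence: unary messages bind tightest, then binary operators (left to right, with no arithmetic precedence), then keyword messages. The following is only a rough sketch of that layering as a hand-written recursive-descent parser in Java. The class name PrecedenceSketch, the whitespace-split tokens, and the parenthesized output shape are all assumptions made for illustration; they are not the grammar or the tree structure that the unit tests expect.

```java
// Hypothetical illustration only: mirrors the layering suggested by the
// rule names. Tokens are pre-split on whitespace to keep the sketch short,
// and the output format is invented (it is NOT the assignment's AST shape).
class PrecedenceSketch {
    private final String[] tokens;
    private int pos = 0;

    PrecedenceSketch(String input) {
        this.tokens = input.trim().split("\\s+");
    }

    static String parse(String input) {
        return new PrecedenceSketch(input).keywordExpression();
    }

    private String peek() {
        return pos < tokens.length ? tokens[pos] : null;
    }

    // A keyword token ends with a colon, e.g. "at:" or "put:".
    private boolean isKeyword(String t) {
        return t != null && t.endsWith(":");
    }

    // A unary selector is a plain identifier (starts with a letter, no colon).
    private boolean isId(String t) {
        return t != null && !isKeyword(t) && Character.isLetter(t.charAt(0));
    }

    private boolean isBinaryOp(String t) {
        return t != null && "+-*/<>=".indexOf(t.charAt(0)) >= 0;
    }

    // keywordExpression : binaryExpression (KEYWORD binaryExpression)*
    private String keywordExpression() {
        String receiver = binaryExpression();
        if (!isKeyword(peek())) return receiver;
        StringBuilder selector = new StringBuilder();
        StringBuilder args = new StringBuilder();
        while (isKeyword(peek())) {
            selector.append(tokens[pos++]);          // build "at:put:"
            args.append(' ').append(binaryExpression());
        }
        return "(" + selector + " " + receiver + args + ")";
    }

    // binaryExpression : unaryExpression (bop unaryExpression)*
    // Operators associate left to right; none outranks another.
    private String binaryExpression() {
        String left = unaryExpression();
        while (isBinaryOp(peek())) {
            String op = tokens[pos++];
            left = "(" + op + " " + left + " " + unaryExpression() + ")";
        }
        return left;
    }

    // unaryExpression : primary ID*
    private String unaryExpression() {
        String receiver = tokens[pos++];             // primary: one ID or NUMBER here
        while (isId(peek())) {
            receiver = "(" + tokens[pos++] + " " + receiver + ")";
        }
        return receiver;
    }
}
```

With this sketch, PrecedenceSketch.parse("1 + 2 * 3") groups as (* (+ 1 2) 3), since Smalltalk binary operators have no relative precedence, and "x at: 1 + 2 put: y size" groups the binary and unary messages inside the keyword message's arguments.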
Some of the lexical description is kind of ugly, so I will just give it to you here:
To implement Smalltalk, we need some methods that cannot be expressed in Smalltalk itself, or that would be awkward to express. We need "primitives" like this:
Your parser must handle this syntax.
All methods must end with ^self in case the programmer forgot to specify a return value.
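As a rough illustration of that rule, here is a minimal Java sketch that appends ^self to a method body unconditionally. It assumes statements are held as plain strings, and the names ImplicitReturn and withImplicitReturn are hypothetical; in the assignment itself this would presumably be handled while building the tree rather than on strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: statements are modeled as strings purely for illustration.
class ImplicitReturn {
    // Returns a copy of the method body with a trailing "^self" appended.
    // The append is unconditional: if the programmer already returned a
    // value earlier, the extra statement is simply never reached.
    static List<String> withImplicitReturn(List<String> statements) {
        List<String> result = new ArrayList<>(statements);
        result.add("^self");
        return result;
    }
}
```

Appending unconditionally keeps the rule simple: there is no need to inspect the body to decide whether the programmer supplied a return.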
Tree structure and walking
In order to construct the proper trees, you must abstract their structure from the unit tests. For example, input:
becomes tree structure:
You know this because you can see it in the unit tests such as equalObjects2().
Here are some more examples from the gunit tests:
Your tree walker should be derived from the parser grammar's tree construction rewrite rules (as shown in the ANTLR reference guide). My solution's tree walker grammar has 15 rules.
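To make the idea of a tree and a walk over it concrete, here is a small stand-alone Java sketch. The Node and WalkerSketch classes are hypothetical stand-ins, not ANTLR classes, and the example tree is invented rather than taken from the unit tests. The toStringTree method prints trees in the same parenthesized "(root child child)" style that ANTLR uses when it prints an AST, and the walk visits each root before its children, which is the order in which tree grammar patterns of the form ^(ROOT child ...) are written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal AST node; a stand-in for ANTLR's tree classes.
class Node {
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String text, Node... kids) {
        this.text = text;
        for (Node k : kids) children.add(k);
    }

    // LISP-style rendering: a leaf prints as its text,
    // an interior node prints as "(root child1 child2 ...)".
    String toStringTree() {
        if (children.isEmpty()) return text;
        StringBuilder sb = new StringBuilder("(").append(text);
        for (Node c : children) sb.append(' ').append(c.toStringTree());
        return sb.append(')').toString();
    }
}

class WalkerSketch {
    // Preorder walk: record the root's text, then walk each child in order.
    static List<String> preorder(Node root) {
        List<String> out = new ArrayList<>();
        visit(root, out);
        return out;
    }

    private static void visit(Node n, List<String> out) {
        out.add(n.text);
        for (Node c : n.children) visit(c, out);
    }
}
```

For example, new Node("=", new Node("x"), new Node("y")).toStringTree() yields "(= x y)", and WalkerSketch.preorder on the same tree visits "=", then "x", then "y".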
Submission
You will create a jar file called smalltalk-parser.jar containing source, grammar, and *.class files and place it in your build directory:
https://www/svn/userid/cs652/smalltalk-parser/build/smalltalk-parser.jar
I will run your code by executing the following:
$ java -cp "smalltalk-parser.jar:antlr-3.3.jar:$CLASSPATH" smalltalk.compiler.Compiler test.st
You can use the svn account for development of the software too if you would like, but I will only be looking at your jar file in the build directory.
For more information, see svn in CS601. Naturally you will have to substitute cs652 for cs601.