Smalltalk bytecode compiler

Your goal for this project is to compile Smalltalk code down to the bytecodes of our virtual machine by following the pages "Smalltalk to bytecode mapping" and "Compiling Smalltalk to bytecodes".

During initialization you will read in the standard library, image.st, and compile it. Then your Run class's main method must read Smalltalk code from standard input, or from a file named on the command line, and dump the disassembly and string table (see the Run.java attachment).

You will extend my solution to the previous parser/tree parser project so that we are all starting from the same base. I will give you a binary of the parser grammar, but I will give you my solution to the tree grammar in text form.

Compiler control-flow

First we parse the input into an AST, then define classes, then compile from the AST:

As before, here is how we get the AST:

You'll hook up the compiler, code generator, and VM using this constructor for the tree parser:

Deliverables

  1. DefineClasses.g which defines all classes found in a given AST
  2. CodeGenerator.g modified with code generation and symbol table management actions
  3. Compiler.java
  4. STClass.java
  5. CompiledMethod.java
  6. CompiledBlock.java

I strongly recommend that you use a functional style. In other words, don't use side effects within the rules. Have them all return the bytecode or compiled objects for the Smalltalk they parse. E.g.,

To generate code, we actually create a list of bytes. For example, to generate code for a return instruction (from my Compiler.java), we can do this:
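As a minimal sketch of the idea (this is not the course's Compiler.java; the class name, method name, and opcode value are all assumptions made for illustration), a return instruction can be produced as a fresh one-element byte list:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and the opcode value are assumptions,
// not the definitions used by the course VM.
final class ReturnSketch {
    static final int RETURN = 0x01; // assumed opcode value, for illustration only

    /** Builds a new byte list holding a single return instruction. */
    static List<Byte> returnInstruction() {
        List<Byte> code = new ArrayList<>();
        code.add((byte) RETURN);
        return code;
    }

    public static void main(String[] args) {
        // Print the generated byte list.
        System.out.println(returnInstruction());
    }
}
```

Because the method returns a new list each time and touches no shared state, a grammar rule can simply return its result, which fits the functional style recommended above.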

We then reference this method from within CodeGenerator.g:

Once we have a list of bytes and strings, we can inject them into a CompiledMethod or block:

You'll find the following methods handy for creating byte lists.
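Helpers of this kind might look like the following sketch. The class and method names are hypothetical, and the big-endian two-byte operand encoding is an assumption, so check the bytecode mapping page for the real operand layout before relying on it:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers: names and operand encoding are assumptions.
final class ByteLists {
    /** Builds a byte list from int values, truncating each to one byte. */
    static List<Byte> of(int... values) {
        List<Byte> out = new ArrayList<>();
        for (int v : values) out.add((byte) v);
        return out;
    }

    /** An opcode followed by a 16-bit operand, high byte first (assumed layout). */
    static List<Byte> withShortOperand(int opcode, int operand) {
        return of(opcode, (operand >> 8) & 0xFF, operand & 0xFF);
    }

    /** Concatenates byte lists in order, leaving the inputs untouched. */
    @SafeVarargs
    static List<Byte> join(List<Byte>... parts) {
        List<Byte> out = new ArrayList<>();
        for (List<Byte> p : parts) out.addAll(p);
        return out;
    }

    /** Flattens a byte list into the primitive array a compiled method would hold. */
    static byte[] toArray(List<Byte> code) {
        byte[] out = new byte[code.size()];
        for (int i = 0; i < out.length; i++) out[i] = code.get(i);
        return out;
    }

    public static void main(String[] args) {
        // Opcodes 5 and 1 are placeholders, not real VM opcodes.
        List<Byte> code = join(withShortOperand(5, 0x0102), of(1));
        System.out.println(code);
    }
}
```

A rule that compiles a sequence of statements can then return `join(...)` of the lists returned by its children, without any rule mutating a shared buffer.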

So that we can all get the same output, I include a number of useful routines in the CompiledMethod class as well as the bytecode disassembler (taken almost literally from the "Language Implementation Patterns" book).

Also, on page 242 of the printed book you'll see the bytecode format in a byte array. Page 247 talks about storing strings and other large constants in a constant pool. Since we're going straight to bytecodes from source code rather than using a bytecode assembler, we have to manage the string table ourselves; see page 254.
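One way to manage a string table by hand is sketched below. The class and method names are hypothetical, and reusing the index of a string that was already added is a design assumption rather than something the assignment specifies:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a per-method string table; not the course's class.
final class StringTable {
    private final Map<String, Integer> indexOf = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    /** Returns the index for s, adding it to the table on first use. */
    int add(String s) {
        Integer existing = indexOf.get(s);
        if (existing != null) return existing; // assumed: duplicates share one slot
        int index = strings.size();
        strings.add(s);
        indexOf.put(s, index);
        return index;
    }

    /** The table in index order, ready to store alongside the bytecode. */
    String[] toArray() {
        return strings.toArray(new String[0]);
    }

    public static void main(String[] args) {
        StringTable table = new StringTable();
        table.add("hello");
        table.add("world");
        table.add("hello"); // reuses the first slot
        System.out.println(String.join(",", table.toArray()));
    }
}
```

The index returned by `add` is what an instruction's operand would carry, so the bytecode refers to strings by position instead of embedding them.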

Error handling

You must emit an error if an ID is not a valid arg, local, field, or class name.

This includes superclass references in class defs.
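The check can be sketched as a lookup that records an error when every scope misses. Everything here is an assumption made for illustration: the class name, the lookup order, returning null for an unknown ID, and the wording of the messages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names, lookup order, and message text are assumptions.
final class IdResolver {
    enum Kind { ARG, LOCAL, FIELD, CLASS }

    private final Set<String> args, locals, fields, classes;
    final List<String> errors = new ArrayList<>();

    IdResolver(Set<String> args, Set<String> locals,
               Set<String> fields, Set<String> classes) {
        this.args = args;
        this.locals = locals;
        this.fields = fields;
        this.classes = classes;
    }

    static Set<String> set(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    /** Returns what the ID names, or null after recording an error. */
    Kind resolve(String id) {
        if (args.contains(id)) return Kind.ARG;
        if (locals.contains(id)) return Kind.LOCAL;
        if (fields.contains(id)) return Kind.FIELD;
        if (classes.contains(id)) return Kind.CLASS;
        errors.add("unknown identifier: " + id);
        return null;
    }

    /** A superclass named in a class definition must be a known class. */
    boolean checkSuperclass(String name) {
        if (classes.contains(name)) return true;
        errors.add("unknown superclass: " + name);
        return false;
    }

    public static void main(String[] unused) {
        IdResolver r = new IdResolver(set("x"), set("tmp"), set("count"), set("Object"));
        r.resolve("x");          // found as an argument
        r.resolve("nope");       // records an error
        r.checkSuperclass("Missing"); // records an error
        r.errors.forEach(System.out::println);
    }
}
```

Collecting errors in a list rather than throwing lets the compiler report every bad reference in one pass.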

Submission

You will create a jar file called smalltalk-compiler.jar containing source, grammar, and *.class files and place it in your build directory:

https://www/svn/userid/cs652/smalltalk-compiler/build/smalltalk-compiler.jar

I will run your code by executing the following:

$ java -cp "smalltalk-compiler.jar:antlr-3.3.jar:$CLASSPATH" smalltalk.tool.Run test.st

You can use the svn account for development of the software too if you would like, but I will only be looking at your jar file in the build directory.

For more information, see svn in CS601. Naturally you will have to substitute cs652 for cs601.
