Introduction
In this final project, you will create a real compiler for a subset of the C programming language by translating C code into the LLVM SSA intermediate representation (IR). Download llvm-2.4 and follow the build instructions (from source). If you have a Mac, binary executables are also available for a GCC front end that generates LLVM intermediate code; experimenting with it is a helpful way to learn about LLVM. Alternatively, you can use LLVM on the web.
You will need to handle arrays, but only a few operators and statements and only the int type. The features are:
- int type
- functions with void or int return type and multiple arguments
- if statements
- while statements
- return statements
- printf("s", n) call; support only this simplified form with a format string and a single integer argument. This is the only place a string is allowed (i.e., you can't use one in a general expression or as an argument to a user-defined function)
- operators: +, -, ==, !=
- arrays
- globals and locals
You do not have to mess with pointers.
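To make the target concrete, here is a minimal sketch of what emitting IR for one tiny function could look like in Java. It is only an illustration under stated assumptions: the class name EmitSketch and the methods opcode and emitBinaryFunction are hypothetical, and it covers just the + and - operators applied to two int arguments. A real translator would walk the parse tree from the grammar and would typically allocate locals with alloca and use load/store rather than operating on the arguments directly.

```java
public class EmitSketch {
    /** Map a C subset operator to an LLVM instruction name; only + and - are handled in this sketch. */
    public static String opcode(String cOperator) {
        switch (cOperator) {
            case "+": return "add";
            case "-": return "sub";
            default: throw new IllegalArgumentException("unsupported operator: " + cOperator);
        }
    }

    /**
     * Build IR text for a C function of the shape
     *   int name(int a, int b) { return a OP b; }
     * The function shape and the temporary name %t0 are assumptions of this sketch.
     */
    public static String emitBinaryFunction(String name, String cOperator) {
        StringBuilder ir = new StringBuilder();
        ir.append("define i32 @").append(name).append("(i32 %a, i32 %b) {\n");
        ir.append("entry:\n");
        ir.append("  %t0 = ").append(opcode(cOperator)).append(" i32 %a, %b\n");
        ir.append("  ret i32 %t0\n");
        ir.append("}\n");
        return ir.toString();
    }

    public static void main(String[] args) {
        // Print the IR for: int add(int a, int b) { return a + b; }
        System.out.print(emitBinaryFunction("add", "+"));
    }
}
```

Building the IR as a string before printing keeps the emitter easy to check in isolation; the comparison operators == and != would need an icmp instruction and are left out here.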
Here is an example translation. Input:
yields
Tasks
First, get LLVM running on a computer you have access to. Most likely you will have to build it from the source code, which is a good exercise for you. Please make sure you are using version 2.2 (the latest).
I am providing the grammar for the C subset so that we can all work on the exact same input. Your goal is to emit LLVM IR on standard output from a main program/class called CC. The full toolchain should look like this:
java CC < t > t.ll
llvm-as -f t.ll # translate the IR to bitcode in t.bc
# or, to remove memory loads/stores, use:
llvm-as < t.ll | opt -mem2reg > t.bc
# to see the optimized SSA, do this:
llvm-dis t.bc
llc -f t.bc # compile bitcode to assembly code
gcc -o go main.c t.s # compile, assemble, and link to the go executable
./go # execute
Notice that you will need a main program that invokes whatever you compile:
You do not need to handle invalid input; assume all input is correct.
Your translator class should be called CC (uppercase CC) and have the main method that reads C subset files from standard input.
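The required shape of the entry point can be sketched as follows. This is a skeleton under stated assumptions, not a working translator: the helper readAll and the placeholder translate are hypothetical names, and translate only emits an IR comment line where a real implementation would parse the input with the provided grammar and generate code.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

public class CC {
    /** Read every character from the given reader into one string. */
    public static String readAll(Reader in) throws IOException {
        StringBuilder buf = new StringBuilder();
        char[] chunk = new char[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.append(chunk, 0, n);
        }
        return buf.toString();
    }

    /**
     * Hypothetical placeholder for the real translation step.
     * It returns a single IR comment line (comments start with ';').
     */
    public static String translate(String source) {
        return "; translated from " + source.length() + " characters of input\n";
    }

    public static void main(String[] args) throws IOException {
        // Read the C subset program from standard input ...
        String source = readAll(new InputStreamReader(System.in));
        // ... and write the generated IR to standard output.
        System.out.print(translate(source));
    }
}
```

Keeping the class in the default package (no package declaration) matches the requirement that CC be runnable as java CC with input redirected from a file.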
Submission
You will create a jar file called cc.jar (lowercase cc) containing grammars, source, and *.class files and place it in your lib directory:
https://www/svn/userid/cs652/proj7/trunk/lib
You can use the svn account for development of the software too if you would like, but I will only be looking at your cc.jar file in the lib directory.
Please bring a printout of any tree grammar you build and its support code, as well as the output code generated from all examples shown above.
Grading
Make sure your jar has the CC class in the default package with a main method so that I can test your code. I will run a number of examples that you have not seen through your project.