General strategies
All language implementation strategies are technically translators. For the next 3.5 months, your life is going to be all about:

All translators have a common preliminary step: language recognition. To emit different output for different input, a translator must distinguish between input sentences. Translators emit output y for input statement x; i.e., y = f(x). There may be many many intermediate steps between x and y:

Even the recognition process is broken down into multiple phases: preprocessing, lexing, parsing:

The token stream is sent off to a parser that build an intermediate representation.
Interpretation
Simple interpreters, that you might call syntax directed interpreters, are typically one pass interpreters that read the input and emit some output. A calculator is a good example of this. More complicated translators generally follow the following strategy:
- Translate source to an intermediate representation. Usually this intermediate representation is high-level enough to recover the original source. Common representations: tokens (BASIC), trees (LISP), bytecodes (Java), high-level instructions (Forth, PostScript)

- Walk intermediate representation once or more times executing actions

Java VM is stack machine with a byte code representation.
yields
f: iload_1 iconst_1 iadd ireturn
Translation

demo Java profiling
Compilation
Compilation is a specific instance of a well-known class of translators that translate programming languages down to machine code. The term "compile" is used pretty loosely sometimes to include translation from source code to an intermediate representation; the javac compiler translates Java to byte codes.

demo LLVM
C code:
LLVM static single assignment (SSA) intermediate representation: (mul_add.ll)
define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
entry:
%tmp = mul i32 %x, %y
%tmp2 = add i32 %tmp, %z
ret i32 %tmp2
}
main.c:
Compile/execute:
llvm-as -f mul_add.ll llc -f mul_add.bc as -o mul_add.o mul_add.s gcc main.c mul_add.o ./a.out
emits r = 70
demo gcc and -S
$ gcc -S t.c # generates t.s assembly code not t.o $ cat t.s .file "t.c" .text .globl f .type f, @function f: pushl %ebp ; save frame pointer register movl %esp, %ebp ; stack pointer goes into frame pointer movl 8(%ebp), %eax ; get x into return register eax addl $1, %eax popl %ebp ; restore frame pointer ret .size f, .-f .ident "GCC: (GNU) 4.1.2 20070502 (Red Hat 4.1.2-12)" .section .note.GNU-stack,"",@progbits
Then assembler translates the assembly code to machine code: t.o file.
$ as -o t.o t.s $
Then a linker combines with a main program and standard libraries to make executable image, though in this case we have no main:
$ ld t.o ld: could not find entry point "start" (perhaps missing crt1.o) for inferred arc $
Java execution possibilities
Java initially was purely interpreted (after translation from Java code to byte codes). Now the story is much more complicated:

Tools
Tools we will use: ANTLRWorks, ANTLR, StringTemplate, JBurg, LLVM, DOT/Graphviz.