Dashboard > USF Computer Science 652 - Programming Languages > ... > Lectures > Overview of language implementation and supporting tools
  USF Computer Science 652 - Programming Languages Log In | Sign Up   View a printable version of the current page.  
  Overview of language implementation and supporting tools
Added by Terence Parr, last edited by Terence Parr on Jan 23, 2008  (view change)
Labels: 
(None)

General strategies

All language implementation strategies are technically translators. For the next 3.5 months, your life is going to be all about:

All translators have a common preliminary step: language recognition. To emit different output for different input, a translator must distinguish between input sentences. Translators emit output y for input statement x; i.e., y = f(x). There may be many many intermediate steps between x and y:

Even the recognition process is broken down into multiple phases: preprocessing, lexing, parsing:

The token stream is sent off to a parser that build an intermediate representation.

Interpretation

Simple interpreters, that you might call syntax directed interpreters, are typically one pass interpreters that read the input and emit some output. A calculator is a good example of this. More complicated translators generally follow the following strategy:

  1. Translate source to an intermediate representation. Usually this intermediate representation is high-level enough to recover the original source. Common representations: tokens (BASIC), trees (LISP), bytecodes (Java), high-level instructions (Forth, PostScript)
  2. Walk intermediate representation once or more times executing actions

Java VM is stack machine with a byte code representation.

int f(int x) { return x+1; }

yields

f:
	iload_1
	iconst_1
	iadd
	ireturn

demo PostScript case study

Translation

demo Java profiling

Compilation

Compilation is a specific instance of a well-known class of translators that translate programming languages down to machine code. The term "compile" is used pretty loosely sometimes to include translation from source code to an intermediate representation; the javac compiler translates Java to byte codes.

demo LLVM

C code:

int mul_add(int x, int y, int z) { return x*y+z; }

LLVM static single assignment (SSA) intermediate representation: (mul_add.ll)

define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
entry:
  %tmp = mul i32 %x, %y
  %tmp2 = add i32 %tmp, %z
  ret i32 %tmp2
}

main.c:

#include <stdio.h>

extern int mul_add(int,int,int);

int main(int argc, char *argv[]) {
        int r = mul_add(5, 10, 20);
        printf("r = %d\n", r);
}

Compile/execute:

llvm-as -f mul_add.ll 
llc -f mul_add.bc
as -o mul_add.o mul_add.s
gcc main.c mul_add.o
./a.out

emits r = 70

demo gcc and -S

int f(int x) { return x+1; }
$ gcc -S t.c   # generates t.s assembly code not t.o
$ cat t.s
	.file	"t.c"
	.text
.globl f
	.type	f, @function
f:
	pushl	%ebp ; save frame pointer register
	movl	%esp, %ebp ; stack pointer goes into frame pointer
	movl	8(%ebp), %eax ; get x into return register eax
	addl	$1, %eax
	popl	%ebp ; restore frame pointer
	ret
	.size	f, .-f
	.ident	"GCC: (GNU) 4.1.2 20070502 (Red Hat 4.1.2-12)"
	.section	.note.GNU-stack,"",@progbits

Then assembler translates the assembly code to machine code: t.o file.

$ as -o t.o t.s
$

Then a linker combines with a main program and standard libraries to make executable image, though in this case we have no main:

$ ld t.o
ld: could not find entry point "start" (perhaps missing crt1.o) for inferred arc
$

Java execution possibilities

Java initially was purely interpreted (after translation from Java code to byte codes). Now the story is much more complicated:

Tools

Tools we will use: ANTLRWorks, ANTLR, StringTemplate, JBurg, LLVM, DOT/Graphviz.

Site powered by a free Open Source Project / Non-profit License (more) of Confluence - the Enterprise wiki.
Learn more or evaluate Confluence for your organisation.
Powered by Atlassian Confluence, the Enterprise Wiki. (Version: 2.5.1 Build:#806 May 06, 2007) - Bug/feature request - Contact Administrators