This is the home page for Computer Science 652, Graduate Programming Languages, at the University of San Francisco.
Instructor: Terence Parr
Spring 2008: CO 327 MW 1:30pm - 3:15pm
Last day of class: Wednesday May 7, 2008
Final exam: Thursday May 15, 2008 @ 9:00AM
Materials use ANTLR v3.1 beta or above
No laptops are allowed in my lectures ('cept mine)
Abstract
Because programming languages are at the core of writing software, programmers should have a thorough understanding of how languages are designed, implemented, and manipulated. This course concerns itself specifically with the implementation and translation of computer languages, leaving an in-depth study of language design to another course. Students will learn the formalisms behind computer languages, but the focus will be on developing the ability to build languages and their translators.
This class is only moderately difficult for the most part, though some of the language formalisms take a while to sink in. Well, actually you have one major hurdle to get over and then it's easy--abstraction in the sense of recursion, meta-language, programs that generate other programs (or even themselves), etc... If you get a headache when you try to figure out how the first C compiler could have been written in C, you might invest in a big bottle of aspirin.
Students are required to build projects in Java unless specified otherwise. You should have a good understanding of data structures and algorithms. Prior experience with language implementation is helpful, but not required. You will be expected to write a lot of code this semester--lots of little projects and then a big language implementation project at the end.
Two classes is considered full-time and, hence, you can expect this class to require <= 20 hours/week including class time.
Student Learning Outcomes
At the end of this course, you will
- be familiar with the evolution of programming languages
- be able to build NFA, DFA, and related state machines
- understand the nature and structure of language
- be able to describe a variety of garbage collection strategies
- be able to describe how object-oriented languages implement inheritance, polymorphism, encapsulation, and data-hiding
- be very comfortable with grammars and parsing.
- know how to implement a simple interpreter
- be able to translate common language constructs to assembly by hand
- know how static, just-in-time, and on-the-fly compilers work
- be able to build symbol tables and intermediate representations
- be able to build lexers and parsers by hand and via ANTLR
- be able to build source-to-source translators, config files readers, XML parsers, interpreters, and code generators for language such as PostScript and HTML.
In short, you will be able to discuss intelligently the implementation and translation of programming languages and will have built several small projects that teach the fundamental implementation techniques.
Syllabus
I. Introduction (3.5 class periods)
Homeworks:
Labs:
II. Parsing (.75 periods)
- LL Parsing and recursive descent design pattern (required reading: 11.1-11.4 from ANTLR reference)
- ANTLR introduction: Chapter 3 from ANTLR reference
- Abstract syntax trees (ASTs): Chapter 7 from ANTLR reference
- Ambiguities and Nondeterminisms: Section 11.5 from ANTLR reference
- Using semantic and syntactic predicates: Chapter 12 from ANTLR reference (required reading: Chapters 13, 14 from ANTLR reference)
- Symbol tables
Labs:
III. Interpreters and program execution
Labs:
IV. Translators
Labs:
Projects
Parsing Graphviz DOT files the hard way Due Feb 6
Build a lexer and parser by hand that checks DOT files for correct syntactic structure.
AST Construction Due Feb 20
Build an AST for an object-oriented programming language.
Symbol table Due Mar 3 (Was Feb 27)
Build a symbol table for an object-oriented programming language.
Byte code assembler and interpreter Due Mar 12
Build an assembler and interpreter for a simple stack-based byte code.
Tree-based interpreter Due Apr 7
Build an interpreter that walks trees instead of byte codes to execute programs.
Adding Closures to Java Due Apr 21
Source-to-source translator that adds simple closures and map operator to Java.
C subset compiler Due May 7
Build a compiler for a subset of C with arrays and int types. Generates LLVM intermediate representation, which will compile the IR down to machine code.
There are no late projects and incomplete features do not count.
All projects will be graded and must run under UNIX. Projects will be graded interactively and may require students to meet with the instructor outside class.
SVN repository
All projects must be submitted in SVN. For example, here is how I would create my cs652 directory:
Exams
There will be two written exams. The midterm exam is on March 26 and the second exam is during the normal final exam: May 15, 2008 @ 9:00AM (well, it's 8:00AM, but I am not getting up that early).
Grading
Your grade will be computed according to the following relationship:
10% Labs
15% Exam 1
15% Exam 2 (canceled)
5% Homework/class participation
55% Projects (10% each except C compiler, which is 15%)
I consider an "A" grade to be above and beyond what most students have achieved. A "B" grade is an average grade or what you could call "competence" in a business setting. A "C" grade means that you either did not or could not put forth the effort to achieve competence. An "F" grade implies you did very little work or had great difficulty with the class compared to other students.
Unless you are sick or have a family emergency, I will not change deadlines for projects nor exam times. For example, I will not giveyou a special final exam just because you want to fly home early. Consult the university academic calendar before making travel plans.
Miscellaneous
Attendance is part of your grade. Even if you know this subject well, I will be providing a number of anecdotes from industry and other goodies during my lectures. Anything I say in class is fair game for an exam
Tardiness. All classes begin precisely at 1:30pm and will restart after the 10 minute break exactly on time.
Academic honesty. You must abide by the copyright laws of the United States and academic honesty policies of USF. If told you may for a particular project, use any code from the net that you find as long as it does not violate the software's license. You may not borrow code from other current or previous students. All suspicious activity will be investigated and, if warranted, passed to the Dean of Sciences for action.
Mailing list. You must all join:
http://cs.usfca.edu/mailman/listinfo/cs652
I will make lots of announcements on it.
Course text
The Definitive ANTLR Reference
. You will definitely need this as a reference for ANTLR.
Resources
How To Read C Declarations
Credits
I'd like to thank Kay Roepke for his help designing this course structure.