June 21, 2013

Technical Report

Expanded version of paper submitted for peer review:

Grammars

Sample input corpora

Sample bootstrapping runs

These runs produce the data for the parse-times graph. For example, the first command below bootstraps the tsql parser starting at rule aaa_translation_unit, running 200 trials of 500 files each and parsing only .sql files from the directory tsql-input:

java -Xmx1G Bootstrap tsql aaa_translation_unit -trials 200 -N 500 -files '.*\.sql' -toupper tsql-input
java -Xmx1G Bootstrap Java compilationUnit      -trials 200 -N 500 -files '.*\.java' /usr/local/javasrc-1.7
java -Xmx1G Bootstrap Verilog2001 source_text   -trials 200 -N 385 -files '.*\.v' verilog/preprocessed
java -Xmx1G Bootstrap JavaRats compilationUnit  -trials 200 -N 500 -files '.*\.java' /usr/local/javasrc-1.7

The following parsers are considerably slower, so we ran only 10 trials for each:

java -Xmx2G Bootstrap C11SLL compilationUnit     -trials 10 -N 500 -files '.*\.c' postgres python
java -Xmx2G Bootstrap C11 compilationUnit        -trials 10 -N 500 -files '.*\.c' postgres python
java -Xmx2G Bootstrap JavaSable compilation_unit -trials 10 -N 500 -files '.*\.java' /usr/local/javasrc-1.7/
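
The trial structure described above (a fixed number of trials, each over N files) can be illustrated with a minimal Python sketch. This is not the code in Bootstrap.java. It assumes that each trial draws its N files at random with replacement, which is the usual meaning of bootstrapping, and that the -files pattern is matched against the file name only. The names matching_files, bootstrap_times and parse are hypothetical, and parse stands in for whatever parser is being timed.

```python
import random
import re
import time
from pathlib import Path


def matching_files(root, pattern):
    """Return the files under root whose name fully matches the regex.

    Assumption: the pattern applies to the file name, not the whole path.
    """
    regex = re.compile(pattern)
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and regex.fullmatch(p.name)
    )


def bootstrap_times(files, parse, trials, n, seed=None):
    """Time `trials` resamples of `n` files each.

    Assumption: each trial samples with replacement, so one file may be
    parsed more than once within a trial.
    """
    rng = random.Random(seed)
    times = []
    for _ in range(trials):
        sample = rng.choices(files, k=n)  # n draws with replacement
        start = time.perf_counter()
        for path in sample:
            parse(path)  # hypothetical parser entry point
        times.append(time.perf_counter() - start)
    return times
```

Under these assumptions, a run comparable to the first command would be bootstrap_times(matching_files("tsql-input", r".*\.sql"), parse, trials=200, n=500), which returns one elapsed time in seconds per trial.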

Generating empirical graphs

Bootstrap.java

Python code to plot graphs:

Bootstrap data: