JSON Interpreter


JSON (JavaScript Object Notation) is a straightforward data interchange format and an alternative to XML. This page includes a combined "front end" parser/lexer that emits an AST, and a separate "back end" tree parser that generates the actual objects.

Why use an AST?

I had two goals with this design:

  1. Keep the syntax clear
  2. Make it easier to retarget the back end for different output languages. You only have to modify the tree parser and set the output language options.

Be aware that the parser passes all character escapes through unchanged: "\r" is passed as the two characters backslash and r. It is the tree parser's job to convert these to the appropriate characters (carriage return, 0x0D, in this case).
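As a rough illustration of that conversion step, here is a minimal Java sketch of the kind of helper a tree parser could call on the text of a string token. The class name JsonEscapes and the method unescape are hypothetical and are not taken from JsonTree.g; the sketch assumes the surrounding quotes have already been stripped and that only the escapes defined by strict JSON are accepted.

```java
// Hypothetical helper, not part of the attached grammars.
final class JsonEscapes {
    private JsonEscapes() {}

    /** Converts JSON escape sequences in a string body (quotes already removed). */
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i++);
            if (c != '\\') {
                out.append(c);          // ordinary character, copy as-is
                continue;
            }
            if (i >= raw.length()) {
                throw new IllegalArgumentException("Dangling backslash at end of string");
            }
            char esc = raw.charAt(i++);
            switch (esc) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case '/':  out.append('/');  break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;   // carriage return, 0x0D
                case 't':  out.append('\t'); break;
                case 'u': {
                    // Four hex digits follow the backslash-u prefix.
                    if (i + 4 > raw.length()) {
                        throw new IllegalArgumentException("Truncated unicode escape");
                    }
                    int code = 0;
                    for (int k = 0; k < 4; k++) {
                        int digit = Character.digit(raw.charAt(i++), 16);
                        if (digit < 0) {
                            throw new IllegalArgumentException("Bad hex digit in unicode escape");
                        }
                        code = code * 16 + digit;
                    }
                    out.append((char) code);
                    break;
                }
                default:
                    // Strict JSON: anything else (for example backslash-apostrophe) is rejected.
                    throw new IllegalArgumentException("Invalid escape: \\" + esc);
            }
        }
        return out.toString();
    }
}
```

Because the lexer leaves escapes alone, a helper like this is the only place that needs to change if the back end is retargeted to another output language.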

Error handling

This version now checks that numbers are in the correct format (essentially, no leading zeroes) and disables error handling so the exceptions propagate all the way up. Thanks to Nuno Job for pointing this out: http://www.nunojob.com/blog/2009/05/26/json-grammar/

You don't have to include the numeric validation code or the "disable error handling" code if you don't want to. Another option would be to move the numeric validity check into the tree parser (extractNumber), which would have the advantage of leaving the parser and lexer language-independent (and, truth be told, would probably be cleaner.)
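To make the second option more concrete, here is a hedged Java sketch of a numeric validity check that could live on the tree parser side. The class JsonNumbers and the method isValid are hypothetical names; the actual extractNumber in JsonTree.g may be structured differently. The regular expression follows the JSON number grammar: an optional minus sign, an integer part with no leading zeroes, an optional fraction, and an optional exponent.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the attached grammars.
final class JsonNumbers {
    private JsonNumbers() {}

    // -? (0 | nonzero digit followed by digits) (.digits)? (e or E, optional sign, digits)?
    private static final Pattern JSON_NUMBER =
            Pattern.compile("-?(0|[1-9][0-9]*)(\\.[0-9]+)?([eE][+-]?[0-9]+)?");

    /** Returns true if the whole text is a well-formed JSON number. */
    static boolean isValid(String text) {
        return JSON_NUMBER.matcher(text).matches();
    }
}
```

A tree parser action could call JsonNumbers.isValid on the token text and throw an exception before attempting the conversion, which would keep the parser and lexer free of target-language code.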

Here's the front end:

JSON.g

This back-end is written in Java. This implementation turns a JSON array into a Java List and a JSON object into a Map.

JsonTree.g
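To show the shape of the result, the following sketch builds by hand the Java objects that would correspond to the JSON text {"name": "antlr", "tags": ["parser", "json"]}. It does not call the tree parser; the class name and the choice of LinkedHashMap and ArrayList are assumptions made for illustration, and the attached implementation may use different List and Map implementations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the output shape, not generated by JsonTree.g.
final class JsonShapeExample {
    private JsonShapeExample() {}

    /** Hand-built equivalent of {"name": "antlr", "tags": ["parser", "json"]}. */
    static Map<String, Object> expectedResult() {
        // A JSON array becomes a List.
        List<Object> tags = new ArrayList<>();
        tags.add("parser");
        tags.add("json");

        // A JSON object becomes a Map keyed by member name.
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("name", "antlr");
        root.put("tags", tags);
        return root;
    }
}
```

Callers then navigate the result with ordinary collection calls, casting nested values to List or Map as needed.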

You can get the parser, tree parser, and unit tests as attachments to this page. You can also get the source repository at http://github.com/rdclark/json-antlr/tree/master

  1. May 12, 2009

    The following code should work to convert from an int to a Unicode character. (not tested)

  2. Jun 05, 2009

    I think that object and array rules need to allow empty objects and empty arrays.

    object : '{' members? '}'
    -> ^(OBJECT members?)
    ;

    array : '[' elements? ']'
    -> ^(ARRAY elements?)
    ;

    Why did you split up the number into 2 parts number and exponent?
    Looks like you are trying to parse strict JSON which is fine. \' is not a valid escape sequence.