**WORK IN PROGRESS 4/22/2009**

Summary
-------

OMeta is a shorthand notation for defining parsers. (This is not exactly
OMeta... OMeta is object-oriented. This isn't.)

Let's start with an example::

    ; Parse a binary number
    number = ws digits:d        -> (string->number d 2)
    digits = ('0' | '1')+       -> (substring *s* __start *i*)

That translates to this Scheme code::

    (define (number)
      (let ((__start *i*)
            (d #f) )
        (and (ws)                          ; ws = "skip whitespace, if any"
             (setf d (digits))
             (string->number d 2) )))      ; returns an integer

    (define (digits)
      (let ((__start *i*))
        (and (r+ (or (literal "0")         ; r+ = "one or more"
                     (literal "1") ))
             (substring *s* __start *i*) )))   ; returns the matched string

In general, you define a language as a set of BNF-like grammar productions in
this form::

    prod = rules -> action

These translate into Scheme as follows::

    prod    ->  (define (prod)
                  (let ((__start *i*) ...variables...)
                    (and
    rules   ->        ...a bunch of Scheme code...
    action  ->        action) ) )

*prod* is just a name, following the usual alphanumeric or '_' convention.

*rules* is one or more rules which follow this syntax:

    *prod-name* [``+`` | ``*``] [``:`` *variable-name*]

As in regular expressions, ``+`` matches 1 or more of the preceding
production, and ``*`` matches 0 or more. To do something with the text
matched by a rule, assign it to a variable. The variable is false (``#f``) if
the rule doesn't match anything.

You can also use parens ``( ... )`` for grouping, square brackets ``[ ... ]``
for optional rules, and a vertical bar ``|`` ("or") for alternatives.

*action* is just some Scheme code to append to your rules. When all the rules
match, the action is executed (note the ``and`` clause). Typically, you just
want to do something with the variables in the rules (with the
*:variable-name* syntax). You can also access the text matched by the whole
production with ``(substring *s* __start *i*)``, as ``digits`` does in the
example. You can do whatever you want in an action. You can use it purely for
side effects.
If you don't define an action, the production returns the result of its final
rule.

This whole syntax is defined in the file ``ometa.bnf``. It's very short, and
possibly more readable than the explanation just given, so take a look.
Actually, I lied a little... the low-level *axioms* are defined in
``om-core.scm``. That's not too bad either.

The parser generator script::

    Usage: om

Dependencies: PLT Scheme / MzScheme 4.x

Organization
------------

- ``om-core.scm``: Low-level parser routines -- handwritten Scheme
- ``ometa.bnf``: Grammar definition file -- OMeta defined in OMeta
- Scheme code generated from grammar definition (handwritten the first time)
- ``om``: The parser generator script -- glues the above together and adds a
  command-line interface
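
The generated code in the Summary relies on helpers from ``om-core.scm`` that
are not shown in this file. As a rough sketch of how that code could run, the
block below pairs the ``number`` and ``digits`` definitions from the example
with stand-in versions of ``*s*``, ``*i*``, ``ws``, ``literal``, ``r+`` and
``setf``. The stand-ins and the ``parse-binary`` driver are assumptions made
for illustration only; the real definitions in ``om-core.scm`` may well differ.

```scheme
;; Hypothetical stand-ins for the helpers in om-core.scm (assumptions).
(define *s* "")   ; input string being parsed
(define *i* 0)    ; current position in *s*

;; Assign and return the value, so a failed match (#f) stops the enclosing `and`.
(define-syntax setf
  (syntax-rules ()
    ((_ var expr) (let ((v expr)) (set! var v) v))))

;; Match the exact string `lit` at the current position; advance on success.
(define (literal lit)
  (let ((end (+ *i* (string-length lit))))
    (and (<= end (string-length *s*))
         (string=? lit (substring *s* *i* end))
         (begin (set! *i* end) #t))))

;; One or more repetitions: the body is re-evaluated until it fails.
;; (Assumes the body consumes input whenever it succeeds.)
(define-syntax r+
  (syntax-rules ()
    ((_ body)
     (and body
          (let loop ()
            (if body (loop) #t))))))

;; Skip whitespace, if any; always succeeds.
(define (ws)
  (let loop ()
    (if (and (< *i* (string-length *s*))
             (char-whitespace? (string-ref *s* *i*)))
        (begin (set! *i* (+ *i* 1)) (loop))
        #t)))

;; The two productions, as given in the Summary.
(define (digits)
  (let ((__start *i*))
    (and (r+ (or (literal "0")
                 (literal "1")))
         (substring *s* __start *i*))))

(define (number)
  (let ((__start *i*)
        (d #f))
    (and (ws)
         (setf d (digits))
         (string->number d 2))))

;; Hypothetical driver: reset the parser state, then run the top production.
(define (parse-binary str)
  (set! *s* str)
  (set! *i* 0)
  (number))
```

Under these assumptions, ``(parse-binary "  101")`` evaluates to ``5`` and
``(parse-binary "abc")`` evaluates to ``#f``. ``r+`` and ``setf`` are written
as macros here because the generated code passes them unevaluated expressions
rather than procedures.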