Sunday, January 20, 2013

Parsing Expression Grammar - part 2

At the end of part 1 I had a parser that will parse a PEG, but isn't that useful because its output is simply a recursive data structure of the matched elements.
The next step is to add the ability to process each matching production rule. I do this by associating a function with each rule, the function takes two arguments - the matching string and the parsed elements - and returns a parse result. The parse result type is extended to allow passing an arbitrary type as a parse result...
This adds the Parsed of 'a type to the discriminated union.
Because each grammar element now requires three pieces of information (name, expression, function) I define a record type to make handling this easier. The function declaration of parseExpression changes as expected.
The changes needed to parseExpression are small...
The original pattern match for NonTerminal simply recursively called parse with the definition for the NonTerminal name, now the same recursive call is made but, if the result is a successful match, the function associated with the match is called and the result of that function is passed as the result of the match.

Grammar example - 1

Taking the same grammar I used in part 1, I now need to add a function to each rule.
In this example, I am evaluating the expression as it is being parsed, this is the simplest approach, later on I will show a two stage approach where the expression is converted to an AST that can be evaluated later.
Hopefully each of these is self explanatory, the same function can be used to parse both & and | because they have equivalent structure; the first pattern in parseBin is taken when the Optional second element of the Sequence is not matched.
This code can now evaluate simple boolean expressions...

> parseExpression g "start" "true&false";;
val it : ParseResult<bool> * int = (Parsed false, 10)
> parseExpression g "start" "true|true&false";;
val it : ParseResult<bool> * int = (Parsed true, 15)
> parseExpression g "start" "(true|true)&false";;
val it : ParseResult<bool> * int = (Parsed false, 17)

Grammar example - 2

In this example the same grammar (with some minor changes, such as allowing whitespace) demonstrates parsing into an AST (the BooleanExpr type) which is then evaluated in a second stage.
The code for this artice is on github.
In part 3 I make things a little more practical, expressing the grammar as literal objects is very clumsy - I build a parser that parses textual PEG expressions...


  1. Success is a choice not a dream. So, work hard for it and always be open-minded. Visit my site for more information. Have a good day and keep on moving forward for that success.

  2. Thanks for sharing such a wonderful article, I hope you could inspire more people. Visit my site for an offer you wouldn't want to refuse.