RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Parsing with Active Patterns in F#

Discover how to use F#'s active patterns to build prototype parsers that you can test interactively.


ou may have come across the need to read textual input into a structured representation before. Consider interpreting your own domain-specific language (DSL), reading back marshaled data from the disk, turning XML into a typed representation, or simply reading a comma-separated file: When it comes to parsing you have a number of options. For many common parsing tasks (such as working with the various XML APIs, regular expressions, or string splitting for comma-separated files), .NET provides extensive library support, but once you start growing your input language you will need a uniform parsing approach that can handle your requirements.

The first half of this article discusses active patterns in F# and shows single-case, multi-case, and partial active patterns in action. Active patterns are one of the most versatile tools in F# programming; they let you convert and partition arbitrary values into shapes against which you can perform pattern matching. With this ability you can turn grammar parsing into a typed discipline directly expressible as pattern-matching rules. This in turn allows you to quickly prototype DSLs, custom parsers for your data, and the like—and it does so in an incredibly concise and elegant way.

Editor's Note: If you don't have F#, you can download it here.

The second part of the article shows how to use your new background in active patterns to develop parser rules using partial active patterns, and then combine them to form a complete parser for a small language of arithmetic expressions with function calls. You will also develop an evaluator for this language and see how quickly you can get from ground zero to a small working language that you can interact with.

Part 1: Discriminated Unions and Pattern Matching

In ordinary pattern matching over discriminated unions you define a union type with various shapes, and then examine those in a pattern match. You can write arbitrarily complex match cases to match very specific shapes of your union values, and the compiler gives you a warning if your match cases are not complete or if certain matches are never reached. This sort of data abstraction is central in functional programming for at least two reasons: it provides a type-safe way to deal with multi-shaped values, and recursive discriminated unions are an ideal way to express many common data scenarios. They go perfectly with recursive problem decomposition: you solve problems by decomposing them to smaller ones that are in turn decomposed until you are working with primitive values.

Consider the following simple arithmetic expression evaluator:

type Expr =
| Number of int
| Sum    of Expr * Expr
| Diff   of Expr * Expr
| Prod   of Expr * Expr
| Div    of Expr * Expr
let rec Eval = function
| Number i      -> i
| Sum (e1, e2)  -> Eval e1 + Eval e2
| Diff (e1, e2) -> Eval e1 - Eval e2
| Prod (e1, e2) -> Eval e1 * Eval e2
| Div (e1, e2)  -> Eval e1 / Eval e2

Sending this bit of code to F# Interactive allows you to quickly experiment with Expression values. For example, you can create a new instance and evaluate it on the fly:

let i = Prod (Sum (Number 11, Number 12), 
   Diff (Number 43, Number 34))
i |> Eval |> printf "The result is %d\n";;

The preceding code produces the output:

val i : Expr
The result is 207

You may find it awkward to define the arithmetic operators as separate shapes where in reality only their label (the operator name) is different. Instead you can express the above using a generic binary operation shape that carries the operation as a function and simply define additional members that map to this shape:

type Expr =
| Number of int
| BinOp  of (int -> int -> int) * Expr * Expr
    static member Sum  (e1, e2) = BinOp (( + ), e1, e2)
    static member Diff (e1, e2) = BinOp (( - ), e1, e2)
    static member Prod (e1, e2) = BinOp (( * ), e1, e2)
    static member Div  (e1, e2) = BinOp (( / ), e1, e2)
let rec Eval = function
| Number i          -> i
| BinOp (f, e1, e2) -> f (Eval e1) (Eval e2)

You can verify this with the previous test value, this time without retaining a binding to it:

> Expr.Prod (Expr.Sum (Number 11, Number 12), Expr.Diff (Number 43, Number 34))
|> fun i ->
    i |> Eval |> printf "The result is %d\n";;
The result is 207

Close Icon
Thanks for your registration, follow us on our social networks to keep up-to-date