Grouping Data Into Buckets Using Fsharp
Last week I came across an interesting data grouping problem, which groups the data into different buckets based on the given filters as shown in the image below
In this blog post I am going to share how can we solve this problem very expressively using fsharp.
Parsing filters and translate to an Abstract Syntax Tree (AST)
The first step to solve this problem is parsing the filters and translate to an AST. The strong type system in fsharp makes this step pretty easy.
1 2 3 4 |
|
The Filter
type represents the AST of the raw string filter. There are n number of ways to transform this incoming string filter to its strongly typed counterpart.
The most idiomatic way of achieving it in fsharp is through Partial Active Patterns
1 2 3 4 |
|
You can easily understand this partial active pattern by thinking it as a function with the following signature
1
|
|
i.e Takes an input, matches it against a pattern. If it matched, then returns the list of matched strings wrapped with the Some
type else returns the None
type.
Now you may think why we need to write this straight-forward function as a partial active pattern with some weird symbols (|_|)
?
We can certainly do that, but we can’t do pattern matching against a function.
The best way to understand this by seeing it in action
1 2 3 4 5 6 7 |
|
The above function transforms the raw string filter into its equivalent fsharp type that we have defined before. The pattern matching makes our job very easier by declaratively saying if it matches this, then do this.
One another important benefit that we are getting here, the return value (list of matched strings) is available as a last parameter in the pattern matching. It’s just coming for free.
I’d like to give you an exercise to appreciate this excellent feature in fsharp. Try to implement the above using a function instead of a partial active patterns.
Great! We are done with the first step. Let’s move to the next one.
Grouping the data using the filter
To do the actual grouping we need to have a function that filters the data based on the given filter. Let me call it as a predicate.
1 2 3 4 5 |
|
The createPredicate
function takes a filter and returns the predicate function int -> bool
With the functions createFilter
and createPredicate
in place, we can combine them and create a new function that takes a raw filter string and translate that to a predicate.
1 2 |
|
Let me create a new record type BucketRule
which holds this string and the predicate
1 2 3 4 |
|
Now we have all required functions and it’s time to do the data grouping
1 2 3 4 5 6 7 8 |
|
Summary
Though the problem that we have seen is so trivial, the great features in fsharp make it very pleasant to write expressive code to solve it. The complete code snippet is available in fssnipnet.