TF#I Friday #4 : In which I count some words
For a recent MSDN Flash article I wrote some simple code to calculate word frequency in C#. As I get to grips with F#, I’m learning that the most rewarding, but also the most difficult, aspect is thinking in a more functional way. To count words in an imperative style (as I did in my C# example) I would iterate through a collection of words and keep a running count. And, of course, you could write code in F# to do that. But what would be the point? How about approaching it in a different fashion?
So, with those questions in mind, I fired up Visual Studio and went about trying to bend my brain into a more F#-like shape. One of the things I like about F# is F# Interactive – a REPL which makes trying out and learning F# (as well as prototyping) easy – so that was where I started.
The first thing I needed to do was to create a list of words (since at this stage I’m concerned simply with calculating frequency and not with reading files or strings). It’s fairly simple to do that in F#:
let words = ["the"; "cat"; "sat"; "on"; "the"; "mat"];;
(The double semicolons signal the end of a statement to F# Interactive.) After reading a bit about processing sequences in F#, I spotted that there is a function, Seq.countBy, which counts the elements of a sequence by key. Since an F# list is also a sequence, it can easily be used against the whole list like this:
let count = words |> Seq.countBy(fun x -> x);;
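The countBy function takes a function that generates a key for each element, and counts how many elements share each key – in this case each word is its own key, so the function simply returns its argument. To see if that has worked, we can print out the contents of the result: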
printfn "%A" count;;
And in this case, I got the following result:
seq [("the", 2); ("cat", 1); ("sat", 1); ("on", 1); ...]
val it : unit = ()
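Which means it worked as intended: “the” is counted twice and the other words once each (the “...” is just F# Interactive eliding the rest of the sequence). There’s work to be done to make it the equivalent of the C# code, but the core counting is implemented in one line of code.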
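As a small sketch of where this could go next (this is not the C# version from the article, just an illustration), the key function passed to countBy does not have to return the word unchanged. The example below assumes we want case-insensitive counts, and it converts the result to a list sorted by descending count so that nothing is elided when printed. The names mixedCaseWords and frequencies are made up for the example; in F# Interactive you would finish the input with the usual double semicolons.

```fsharp
// Hypothetical input: the same sentence, but with mixed casing.
let mixedCaseWords = ["The"; "cat"; "sat"; "on"; "the"; "mat"]

// Key each word by its lower-case form so "The" and "the" share a count,
// then sort by descending count and materialise the result as a list.
let frequencies =
    mixedCaseWords
    |> Seq.countBy (fun (w: string) -> w.ToLowerInvariant())
    |> Seq.sortBy (fun (_, n) -> -n)
    |> Seq.toList

// A list is printed in full, so every word and its count is visible.
printfn "%A" frequencies
```

Because the result is a list rather than a lazy sequence, the most frequent word, ("the", 2), comes first and all five distinct words are printed.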