Archive for March 2011
TF#I Friday #5 : In which I return to C# for a bit
In my last post I counted some words using F#, which turned out to require a single, simple line of F#. When I’d done the same thing before in C# I had iterated over the words and kept a count as I went, which is a typically imperative approach. So, I wondered if you could apply the functional approach to C# – perhaps using LINQ. Turns out you can.
Firstly, it’s helpful to have some words to count. Here’s a simple approach:
string test = "The cat sat on the mat.";
string[] words = test.Split(' ');
(In F# I populated a list of words directly, so there’s an extra line of C# here – largely because I started with my previous C# code.) Right, now to the counting in a single LINQ query:
var result = from word in words
             let strippedWord = StripPunctuation(word).ToLower()
             where strippedWord.Length > 0
             group word by strippedWord into grouped
             select new { Word = grouped.Key, Count = grouped.Count() };
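The same query can also be written with LINQ's extension methods, which makes the pipeline of steps a little more explicit. The following is a minimal sketch, not the post's code: the class and method names are made up for illustration, the StripPunctuation call is replaced by a simple Trim over a fixed set of punctuation marks so that the block stands alone, and KeyValuePair is used in place of the anonymous type so the result can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MethodSyntaxSketch
{
    // Stand-in for the StripPunctuation helper (an assumption, not the original):
    // trims a fixed set of common punctuation marks from both ends of a word.
    private static readonly char[] Punctuation = { '.', ',', ';', ':', '!', '?' };

    // The query expression rewritten as chained method calls.
    public static List<KeyValuePair<string, int>> CountWords(string[] words)
    {
        return words
            // "let": carry the original word alongside its normalised form
            .Select(word => new { word, strippedWord = word.Trim(Punctuation).ToLower() })
            // "where": drop entries that were nothing but punctuation
            .Where(x => x.strippedWord.Length > 0)
            // "group word by strippedWord into grouped"
            .GroupBy(x => x.strippedWord, x => x.word)
            // "select": one key/count pair per distinct word
            .Select(grouped => new KeyValuePair<string, int>(grouped.Key, grouped.Count()))
            .ToList();
    }

    public static void Main()
    {
        string test = "The cat sat on the mat.";
        foreach (var entry in CountWords(test.Split(' ')))
        {
            Console.WriteLine("{0}\t{1}", entry.Key, entry.Value);
        }
    }
}
```

For the sample sentence this reports a count of 2 for "the" and 1 for each of the other four words. The query-expression form and the method form compile down to the same calls, so the choice between them is mostly a matter of taste.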
You may have noticed a call to StripPunctuation – a utility function I had in my previous C# code. Here it is (declared static as I was running it in a console application):
private static string StripPunctuation(string word)
{
    string result = word;
    if (result.Length > 0)
    {
        if (char.IsPunctuation(result[0]))
        {
            result = result.TrimStart(result[0]);
        }
        if (result.Length > 0)
        {
            if (char.IsPunctuation(result[result.Length - 1]))
            {
                result = result.TrimEnd(result[result.Length - 1]);
            }
        }
    }
    return result;
}
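One detail of the helper above is worth noting: TrimStart and TrimEnd remove every leading or trailing copy of the one character they are given, but nothing else. A word wrapped in mixed punctuation, such as ("mat")., would therefore come back as "mat") with its quotes and closing bracket still attached. If that matters for your input, a variant that strips any punctuation from both ends might look like the sketch below. StripAllPunctuation is a hypothetical name, not part of the original code.

```csharp
using System;

public static class StripPunctuationSketch
{
    // Hypothetical variant: removes every punctuation character found at
    // either end of the word and leaves interior punctuation untouched.
    public static string StripAllPunctuation(string word)
    {
        int start = 0;
        int end = word.Length;

        // Move past leading punctuation.
        while (start < end && char.IsPunctuation(word[start]))
        {
            start++;
        }

        // Move back over trailing punctuation.
        while (end > start && char.IsPunctuation(word[end - 1]))
        {
            end--;
        }

        return word.Substring(start, end - start);
    }

    public static void Main()
    {
        Console.WriteLine(StripAllPunctuation("(\"mat\")."));  // mat
        Console.WriteLine(StripAllPunctuation("don't"));       // don't
        Console.WriteLine(StripAllPunctuation("...").Length);  // 0
    }
}
```

Because only the ends are scanned, an apostrophe inside a word such as "don't" survives, which is usually what you want when counting words.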
And now, with a little sprinkling of dynamic capability, outputting the results to the console:
foreach (dynamic entry in result)
{
    Console.WriteLine("{0}\t{1}", entry.Word, entry.Count);
}
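The dynamic keyword is not strictly required here. When the loop sits in the same method as the query, var lets the compiler infer the anonymous type, so Word and Count are checked at compile time instead of being resolved at run time. A minimal sketch, assuming a plain array of lower-case words and a made-up FormatCounts method that collects the output lines so they can be inspected:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarLoopSketch
{
    // Hypothetical helper: counts the words and returns one formatted line per word.
    public static List<string> FormatCounts(string[] words)
    {
        var result = from word in words
                     group word by word into grouped
                     select new { Word = grouped.Key, Count = grouped.Count() };

        var lines = new List<string>();

        // 'var' resolves to the anonymous type, so a typo such as entry.Wrod
        // would be a compile-time error rather than a run-time failure.
        foreach (var entry in result)
        {
            lines.Add(string.Format("{0}\t{1}", entry.Word, entry.Count));
        }

        return lines;
    }

    public static void Main()
    {
        string[] words = { "the", "cat", "sat", "on", "the", "mat" };
        foreach (string line in FormatCounts(words))
        {
            Console.WriteLine(line);
        }
    }
}
```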
So it is possible to apply the more functional approach courtesy of LINQ, although there’s still more code than I had in F#. The C# is doing a couple of extra things (it strips out punctuation and is case insensitive) – but the point isn’t really the comparison between the two examples so much as the fact that grasping some functional concepts can change the way you write C# – which is a good reason to learn some F#.
TF#I Friday #4 : In which I count some words
For a recent MSDN Flash article I wrote some simple code to calculate word frequency in C#. As I get to grips with F#, I’m learning that the most rewarding but also the most difficult aspect is thinking in a more functional way. To count words in an imperative style (as I did in my C# example) I would iterate through a collection of words and keep a running count. And, of course, you could write code in F# to do that. But what would be the point? How about approaching it in a different fashion?
So, with those questions in mind, I fired up Visual Studio and went about trying to bend my brain into a more F#-like shape. One of the things I like about F# is F# Interactive – a REPL which makes trying out and learning F# (as well as prototyping) easy – so that was where I started. The first thing I needed to do was create a list of words (since at this stage I’m concerned simply with calculating frequency and not with reading files or strings). It’s fairly simple to do that in F#:
let words = ["the"; "cat"; "sat"; "on"; "the"; "mat"];;
(The double semicolons signal the completion of a statement to F# Interactive.) After reading a bit about processing sequences in F#, I spotted that there is a function to count the elements of a sequence by key – it can easily be used against the whole list like this:
let count = words |> Seq.countBy(fun x -> x);;
The countBy function takes a function to generate a key – in this case we can use each individual word. To see if that has worked, we can print out the contents of the result:
printfn "%A" count;;
And in this case, I got the following result:
seq [("the", 2); ("cat", 1); ("sat", 1); ("on", 1); …]
val it : unit = ()
Which means it worked as intended. There’s work to be done to make it the equivalent of the C# code, but the core counting is implemented in one line of code.
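For contrast, the imperative style described at the top of this post (iterate over the words and keep a running count) might look roughly like the following in C#. This is a sketch for comparison only, not the code from the MSDN Flash article, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ImperativeCountSketch
{
    // Walks the words once, keeping a running count for each in a dictionary.
    public static Dictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();

        foreach (string word in words)
        {
            int current;
            // TryGetValue leaves 'current' at 0 when the word has not been seen yet.
            counts.TryGetValue(word, out current);
            counts[word] = current + 1;
        }

        return counts;
    }

    public static void Main()
    {
        string[] words = { "the", "cat", "sat", "on", "the", "mat" };

        foreach (KeyValuePair<string, int> pair in CountWords(words))
        {
            Console.WriteLine("{0}\t{1}", pair.Key, pair.Value);
        }
    }
}
```

The loop has to spell out how the count is maintained, whereas Seq.countBy only asks for the key to count by. That difference is the shift in thinking the post describes.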