Re.Mark

My Life As A Blog

Dynamic Football Stats

with one comment

A couple of days ago I noticed that The Guardian had made data about the England vs USA game that took place last week available.  I downloaded the data (which is in a Google Apps spreadsheet) and saved each sheet as a CSV file.

Originally, I intended to read the data with IronPython.  Reading CSV data with Python is very simple – there’s a built in CSV module.  However, this module is written in C, which means it’s not available in IronPython – see here for more info.  There is aproject called IronClad that allows Python modules written in C to be used from IronPython.  At the moment, it’s built against .NET 2, which means that I could get it to work in .NET 2, but I had plans to use .NET 4 and the dynamic support in C#.   Time for another approach.

Using the CsvReader class,it’s easy to access the data in a CSV file.  I started with the Player Summaries sheet.  To make this dynamic (and, therefore, useful for each of these sheets and, potentially, other as yet unknown sheets) I created a class to hold each row of data.  Here it is:


public class DynamicDataObject : DynamicObject
{
    private readonly Dictionary<string, dynamic> data;

    public DynamicDataObject(Dictionary<string, dynamic> data)
    {
        this.data = data;
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = data[binder.Name];
        return (result != null);
    }
}

By inheriting from DynamicObject, it will be possible to call this class dynamically – meaning that I can use the names of the data fields as defined properties on the class.  Next I created a DataReader class that reads the data from the CSV file and stores it as an IEnumerable<DynamicDataObject>.  Here’s that class:


public class DataReader : IEnumerable<DynamicDataObject>
{
    private readonly List<DynamicDataObject> dataList;

    public DataReader(string filename)
    {
        this.dataList = new List<DynamicDataObject>();
        using (StreamReader streamReader = new StreamReader(filename))
        {
            using (CsvReader reader = new CsvReader(streamReader, true))
            {
                string[] headers = reader.GetFieldHeaders();
                Dictionary<string, string> cleanHeaders = CleanHeaders(headers);
                while (reader.ReadNextRecord())
                {
                    Dictionary<string, dynamic> data = new Dictionary<string, dynamic>();
                    foreach (string header in headers)
                    {
                        int result;
                        dynamic value;
                        if (int.TryParse(reader[header],  out result))
                        {
                            value = result;
                        }
                        else
                        {
                            value = reader[header];
                        }
                        data.Add(cleanHeaders[header], value);
                    }
                    this.dataList.Add(new DynamicDataObject(data));
                }
            }
        }
    }

    private Dictionary<string, string> CleanHeaders(string[] headers)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        foreach (string header in headers)
        {
            string cleanheader = header.Replace(' ', '_');
            cleanheader = cleanheader.Split('(')[0];
            result.Add(header, cleanheader);
        }
        return result;
    }

    #region IEnumerable<DynamicDataObject> Members

    public IEnumerator<DynamicDataObject> GetEnumerator()
    {
        return this.dataList.GetEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return this.dataList.GetEnumerator();
    }

    #endregion
}

There’s a couple of things worth pointing out.  The first is that I’ve cleaned up the field names so that they can be used in code (by replacing spaces with underscores and removing anything in brackets).  The second is that if a value is an integer, I’m storing it as an integer.  I’m storing these values of type dynamic, which will come in handy when we want to query the data.

Speaking of querying that data, I wanted to use LINQ.  Here’s some simple code I wrote in a console application to try it out:


static void Main(string[] args)
{
    DataReader reader = new DataReader(@"C:\Users\Mark\Downloads\Eng-USA Data\Player Summaries.csv");

    var result = from dynamic player in reader
                 where player.Goals > 0
                 select player;

    foreach (dynamic player in result)
    {
        Console.WriteLine(player.Player_Name + " - " + player.Goals + " goals");
    }

    Console.WriteLine("Press any key to exit...");
    Console.ReadKey();
}

And here’s the output:

image

By using the dynamic support in C#, the LINQ query just works and I can reference properties dynamically without having to create a class specifically for each sheet of data.  It’s important in the LINQ query to declare the player of type dynamic – otherwise C# will revert to its statically typed ways and inform you that the Goals property doesn’t exist, which, given that it only exists at runtime, is correct.  Now I can analyse the data easily.  Doesn’t change the result though…

Advertisements

Written by remark

June 17, 2010 at 12:41 pm

Posted in .NET, c#, Python

One Response

Subscribe to comments with RSS.

  1. This is a good post. This post offer actually quality information.I’m definitely getting to look into it.Really terribly helpful tips are provided here.thank you so much.Keep up the nice work.

    Cassie Leghorn

    January 15, 2011 at 10:56 am


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: