Archive for the ‘Python’ Category
Dynamic Football Stats
A couple of days ago I noticed that The Guardian had made data about the England vs USA game that took place last week available. I downloaded the data (which is in a Google Apps spreadsheet) and saved each sheet as a CSV file.
Originally, I intended to read the data with IronPython. Reading CSV data with Python is very simple – there’s a built-in csv module. However, this module is written in C, which means it’s not available in IronPython – see here for more info. There is a project called IronClad that allows Python modules written in C to be used from IronPython. At the moment, it’s built against .NET 2, which means that I could get it to work in .NET 2, but I had plans to use .NET 4 and the dynamic support in C#. Time for another approach.
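For comparison, here is a minimal sketch of what reading one of the sheets with the built-in csv module could look like in CPython. It is written for Python 3, and the sample data is made up for illustration; the real sheet's column names and values may differ.

```python
import csv
import io

# Hypothetical sample standing in for one exported sheet; the real
# column names and values in the downloaded data may differ.
SAMPLE = "Player Name,Goals\nPlayer A,1\nPlayer B,0\n"


def read_rows(text):
    """Return each CSV record as a dict keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(SAMPLE)
for row in rows:
    # Every value comes back as a string; nothing is converted to int here.
    print(row["Player Name"], row["Goals"])
```

With a real file you would pass an open file object to csv.DictReader instead of the io.StringIO wrapper.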
Using the CsvReader class, it’s easy to access the data in a CSV file. I started with the Player Summaries sheet. To make this dynamic (and, therefore, useful for each of these sheets and, potentially, other as-yet-unknown sheets) I created a class to hold each row of data. Here it is:
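public class DynamicDataObject : DynamicObject
{
    private readonly Dictionary<string, dynamic> data;
    public DynamicDataObject(Dictionary<string, dynamic> data)
    {
        this.data = data;
    }
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        // Look the member name up in the row data. Returning false for an
        // unknown field lets the binder report the missing member, rather
        // than the dictionary indexer throwing a KeyNotFoundException.
        dynamic value;
        bool found = data.TryGetValue(binder.Name, out value);
        result = value;
        return found;
    }
}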
By inheriting from DynamicObject, it will be possible to call this class dynamically – meaning that I can use the names of the data fields as defined properties on the class. Next I created a DataReader class that reads the data from the CSV file and stores it as an IEnumerable<DynamicDataObject>. Here’s that class:
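If you know Python better than C#, the closest analogue to overriding TryGetMember is probably `__getattr__`. The sketch below is only an illustration of the idea, written for Python 3; the class name DynamicRow and the sample values are made up and are not part of the C# code.

```python
class DynamicRow(object):
    """Expose the keys of a dict as attributes, loosely like DynamicObject."""

    def __init__(self, data):
        self._data = dict(data)

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        # Private names are rejected up front so that a missing _data
        # attribute can never cause infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)


# Hypothetical row; the field names mirror the cleaned CSV headers.
row = DynamicRow({"Player_Name": "Player A", "Goals": 1})
print(row.Player_Name, row.Goals)
```

Raising AttributeError for an unknown field plays the same role as returning false from TryGetMember: the caller gets the language's normal "no such member" failure.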
public class DataReader : IEnumerable<DynamicDataObject>
{
    private readonly List<DynamicDataObject> dataList;
    public DataReader(string filename)
    {
        this.dataList = new List<DynamicDataObject>();
        using (StreamReader streamReader = new StreamReader(filename))
        {
            using (CsvReader reader = new CsvReader(streamReader, true))
            {
                string[] headers = reader.GetFieldHeaders();
                Dictionary<string, string> cleanHeaders = CleanHeaders(headers);
                while (reader.ReadNextRecord())
                {
                    Dictionary<string, dynamic> data = new Dictionary<string, dynamic>();
                    foreach (string header in headers)
                    {
                        int result;
                        dynamic value;
                        if (int.TryParse(reader[header], out result))
                        {
                            value = result;
                        }
                        else
                        {
                            value = reader[header];
                        }
                        data.Add(cleanHeaders[header], value);
                    }
                    this.dataList.Add(new DynamicDataObject(data));
                }
            }
        }
    }
    private Dictionary<string, string> CleanHeaders(string[] headers)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        foreach (string header in headers)
        {
            string cleanheader = header.Replace(' ', '_');
            cleanheader = cleanheader.Split('(')[0];
            result.Add(header, cleanheader);
        }
        return result;
    }
    #region IEnumerable<DynamicDataObject> Members
    public IEnumerator<DynamicDataObject> GetEnumerator()
    {
        return this.dataList.GetEnumerator();
    }
    #endregion
    #region IEnumerable Members
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return this.dataList.GetEnumerator();
    }
    #endregion
}
There are a couple of things worth pointing out. The first is that I’ve cleaned up the field names so that they can be used in code (by replacing spaces with underscores and removing anything in brackets). The second is that if a value is an integer, I’m storing it as an integer. I’m storing these values as type dynamic, which will come in handy when we want to query the data.
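The two clean-up rules are small enough to sketch on their own. This is a Python 3 illustration rather than a translation of the C#: the function names and the sample headers are made up, and it strips the bracketed part before replacing spaces, so a header like the hypothetical "Distance (km)" comes out as Distance with no trailing underscore (the C# above replaces spaces first).

```python
def clean_header(header):
    """Drop any bracketed suffix, then replace spaces with underscores."""
    name = header.split("(")[0].strip()
    return name.replace(" ", "_")


def parse_value(text):
    """Return an int when the text parses as one, otherwise the text itself."""
    try:
        return int(text)
    except ValueError:
        return text


# Hypothetical headers and values, for illustration only.
print(clean_header("Player Name"))
print(clean_header("Distance (km)"))
print(parse_value("3"), parse_value("n/a"))
```

Keeping integers as integers is what makes a comparison such as Goals > 0 behave numerically later on, instead of comparing strings.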
Speaking of querying that data, I wanted to use LINQ. Here’s some simple code I wrote in a console application to try it out:
static void Main(string[] args)
{
    DataReader reader = new DataReader(@"C:\Users\Mark\Downloads\Eng-USA Data\Player Summaries.csv");
    var result = from dynamic player in reader
                 where player.Goals > 0
                 select player;
    foreach (dynamic player in result)
    {
        Console.WriteLine(player.Player_Name + " - " + player.Goals + " goals");
    }
    Console.WriteLine("Press any key to exit...");
    Console.ReadKey();
}
And here’s the output:
By using the dynamic support in C#, the LINQ query just works and I can reference properties dynamically without having to create a class specifically for each sheet of data. It’s important in the LINQ query to declare player as type dynamic – otherwise C# will revert to its statically typed ways and inform you that the Goals property doesn’t exist, which, given that it only exists at runtime, is correct. Now I can analyse the data easily. Doesn’t change the result though…
Finding Lyrics
I was looking some lyrics up online this week, so I wondered how hard it would be to write a simple application to find lyrics to your favourite song. Or to your least favourite song. Or, in fact, to any arbitrary song. Via programmableweb, I found the API to lyricsfly, which looked easy to use. Another IronPython console app beckoned.
Keeping it simple, I decided to use optparse to parse the command-line options and urllib to make the HTTP calls. This way the program can be called with the user_id that lyricsfly requires (head to their website and you can get a temporary weekly key to try this out) along with the artist name and song title. What I decided not to do at this stage was to process the resulting XML. Or handle any errors. Or handle cases where the user_id, artist or title is not supplied. But, although rudimentary, it works. Here’s the code:
from System import Console
import urllib
from optparse import OptionParser

print "Starting"
parser = OptionParser()
parser.add_option("-i", "--user_id",
                  action="store", type="string", dest="user_id",
                  help="The user id for the Lyrics Fly service")
parser.add_option("-a", "--artist",
                  action="store", type="string", dest="artist",
                  help="Artist name")
parser.add_option("-t", "--title",
                  action="store", type="string", dest="title",
                  help="Song title")
(options, args) = parser.parse_args()
print "Parsed options"
if (options.user_id):
    user_id = options.user_id
if (options.artist):
    artist = options.artist
if (options.title):
    title = options.title
print "Getting Lyrics for " + artist + " - " + title
query = urllib.urlencode([("i", user_id), ("a", artist), ("t", title)])
url = "http://api.lyricsfly.com/api/api.php?" + query
print url
data = urllib.urlopen(url)
print data.read()
print
print "Press any key to exit.."
Console.ReadKey()
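One of the gaps mentioned above, missing options, is cheap to close. The sketch below shows one way it might be done with optparse's own parser.error. It is written for Python 3 (hence urllib.parse; under IronPython 2 urlencode lives in urllib), it makes no network call, and the helper names and the demo-key value are made up for illustration.

```python
from optparse import OptionParser
from urllib.parse import urlencode

# Endpoint copied from the script above.
API_URL = "http://api.lyricsfly.com/api/api.php"


def build_parser():
    parser = OptionParser()
    parser.add_option("-i", "--user_id", dest="user_id",
                      help="The user id for the Lyrics Fly service")
    parser.add_option("-a", "--artist", dest="artist", help="Artist name")
    parser.add_option("-t", "--title", dest="title", help="Song title")
    return parser


def build_url(argv):
    """Validate the options in argv and return the request URL."""
    parser = build_parser()
    options, _ = parser.parse_args(argv)
    for name in ("user_id", "artist", "title"):
        if not getattr(options, name):
            # parser.error prints the usage message and exits with status 2.
            parser.error("--%s is required" % name)
    query = urlencode([("i", options.user_id),
                       ("a", options.artist),
                       ("t", options.title)])
    return API_URL + "?" + query


# "demo-key" is a placeholder, not a real lyricsfly key.
print(build_url(["-i", "demo-key", "-a", "The Beatles", "-t", "Help!"]))
```

Passing the argument list in explicitly, rather than letting optparse read sys.argv, keeps the function easy to try out from a REPL.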
IronPython Console App
Having installed the CTP of IronPython tooling for Visual Studio, I thought I’d better try it out and write a simple app. A Console Application seemed like a good choice – and I haven’t done much with getting data over the internet in Python, so I combined the two. Here’s the resulting simple (and, in its current form, not especially useful) application:
import urllib2
from System import Console
print "Starting..."
data = urllib2.urlopen("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml").read()
print data
Console.WriteLine()
Console.ReadKey()
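A possible next step would be to pull just the headlines out of the feed instead of printing the raw XML. The sketch below does that with the standard library's ElementTree, for Python 3. It parses a small hand-written RSS fragment rather than the live BBC feed, and I haven't checked whether ElementTree works under this IronPython release, so treat it as a CPython illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written RSS fragment; a real feed has more fields.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First headline</title><description>First summary</description></item>
    <item><title>Second headline</title><description>Second summary</description></item>
  </channel>
</rss>"""


def headlines(rss_text):
    """Return the title of every item in the feed, in document order."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title") for item in root.findall("channel/item")]


for title in headlines(SAMPLE_RSS):
    print(title)
```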
IronPython meet Visual Studio
On IronPython URLs, I just spotted that a CTP of IronPython tooling for Visual Studio 2010 is available. Now you can create IronPython projects in Visual Studio like this:
It’s an early preview, but you still get IntelliSense, a REPL, debugging and a lot more besides – there’s more detail here.
Dynamic Configuration with IronPython and .NET 4
A while back I posted a simple way of using IronPython to configure a .NET application. Then I spotted Herman’s question about whether the application would react to a change in the Configuration.py file. The way I wrote the original sample, the configuration is read in once. But it’s fairly simple to modify the code to react to changes in Configuration.py. Here’s the modified Program class:
class Program
{
    static void Main(string[] args)
    {
        FileSystemWatcher watcher = new FileSystemWatcher(AppDomain.CurrentDomain.BaseDirectory, "*.py");
        watcher.EnableRaisingEvents = true;
        watcher.NotifyFilter = NotifyFilters.LastWrite;
        DateTime modified = DateTime.Now;
        Console.TreatControlCAsInput = true;
        dynamic configuration = ConfigureFromIronPython();
        DisplayConfiguration(configuration);
        watcher.Changed += (sender, eventArgs) =>
        {
            if (DateTime.Now - modified > TimeSpan.FromMilliseconds(100))
            {
                modified = DateTime.Now;
                configuration = ConfigureFromIronPython();
                DisplayConfiguration(configuration);
            }
        };
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
        watcher.EnableRaisingEvents = false;
        watcher.Dispose();
        Console.WriteLine("Exiting...");
    }
    private static dynamic ConfigureFromIronPython()
    {
        dynamic configuration = new ExpandoObject();
        ScriptEngine engine = Python.CreateEngine();
        ScriptScope scope = engine.CreateScope();
        scope.SetVariable("configuration", configuration);
        engine.ExecuteFile("Configuration.py", scope);
        return configuration;
    }
    private static void DisplayConfiguration(dynamic configuration)
    {
        for (int i = 0; i < configuration.Count; i++)
        {
            Console.WriteLine(configuration.Text);
        }
    }
}
By adding a FileSystemWatcher, we can react to changes in Configuration.py and update the application configuration. (If you’re wondering, the check of when the last modification occurred is there to prevent handling the same change twice, which can occur when using the FileSystemWatcher.) The only other change is a minor refactoring to extract a method that displays the configuration.
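The duplicate-event check is a small debounce, and it may be easier to see in isolation. Here is a minimal Python 3 sketch of the same idea. The Debouncer class and its clock parameter are made up for illustration, and unlike the C# above (which starts the timer when the program starts) it always accepts the very first event.

```python
import time


class Debouncer(object):
    """Accept an event only if enough time has passed since the last accepted one."""

    def __init__(self, min_interval=0.1, clock=time.time):
        self.min_interval = min_interval  # seconds; 0.1 matches the 100 ms above
        self.clock = clock                # injectable so the logic can be tested
        self.last_accepted = None

    def should_handle(self):
        now = self.clock()
        if (self.last_accepted is not None
                and now - self.last_accepted <= self.min_interval):
            return False  # too soon after the last handled event: treat as a duplicate
        self.last_accepted = now
        return True


# Simulated event times in seconds: two close together, then a later one.
ticks = iter([0.0, 0.05, 0.5])
debouncer = Debouncer(min_interval=0.1, clock=lambda: next(ticks))
results = [debouncer.should_handle() for _ in range(3)]
print(results)
```

Injecting the clock is only there so the behaviour can be checked without sleeping; in real use the default time.time would do.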
Batteries Included
Sometimes you’ll hear the Python standard library referred to as “batteries included” – a little more info here. IronPython can also use these included batteries. As an example, today I needed to list the files in a folder. So, a simple Python script seemed like a good way of doing that (I’m sure there are better and more ingenious ways.) Here it is:
import os
import os.path
import sys
from optparse import OptionParser

def list_files(path, indent=0):
    for filename in sorted(os.listdir(path)):
        print " " * indent + filename
        full_path = os.path.join(path, filename)
        if (options.recursive) and (os.path.isdir(full_path)):
            list_files(full_path, indent + 2)

parser = OptionParser()
parser.add_option("-d", "--directory",
                  action="store", type="string", dest="path",
                  help="The directory to list.")
parser.add_option("-r", "--recursive",
                  action="store_true", default=False,
                  help="Whether to list subdirectories.")
parser.add_option("-f", "--output_file",
                  action="store", type="string", dest="output_file",
                  help="Directory contents will be listed to this file if specified.")
(options, args) = parser.parse_args()
if (options.path):
    path = options.path
else:
    path = sys.path[0]
out = sys.stdout
if (options.output_file):
    output_file = open(options.output_file, 'w')
    sys.stdout = output_file
list_files(path)
if (options.output_file):
    output_file.flush()
    output_file.close()
sys.stdout = out
As you can see, the script takes advantage of optparse to process the command line arguments, sys (to get the script’s folder as the default path and to get access to and redirect the output of the script), os (to list the contents of a folder) and os.path (to test if a given path is a folder). And, being IronPython, you also get a second set of batteries in the form of the .NET framework.
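As one of those possibly better ways, here is a sketch of a variant that returns the listing as a list of lines and writes it to whatever file object it is given, instead of reassigning sys.stdout. It is written for Python 3, and the function names list_tree and write_listing are made up for illustration.

```python
import os


def list_tree(path, recursive=False, indent=0):
    """Return the indented directory listing as a list of lines."""
    lines = []
    for name in sorted(os.listdir(path)):
        lines.append(" " * indent + name)
        full_path = os.path.join(path, name)
        if recursive and os.path.isdir(full_path):
            # Subdirectory contents are indented two spaces further.
            lines.extend(list_tree(full_path, recursive, indent + 2))
    return lines


def write_listing(path, out, recursive=False):
    """Write the listing for path to any object with a write() method."""
    for line in list_tree(path, recursive):
        out.write(line + "\n")
```

Calling write_listing(".", sys.stdout, recursive=True) would print to the console, and passing an open file would write there instead. Taking recursive as a parameter also means the function no longer depends on the module-level options object.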
Iron Python at QCon
I spent yesterday on the Microsoft stand at QCon 2010. I took a few IronPython samples with me to show to those who were interested. I wanted to be able to show three things: .NET runs Python, Python extends .NET and Python runs .NET.
.NET runs Python
To show that .NET can run Python I used the Text Processing sample I’ve blogged about before. I’ve subsequently added optparse to it so that it can be driven from the command line. The point of this sample is that it uses standard Python libraries, the whole application is written in Python (there’s a little XAML to describe the UI) and runs on the DLR courtesy of IronPython.
Python extends .NET
For a simple demonstration of extending a .NET application with Python, I took the sample application described here. This application allows the user to write Python (at runtime) that interacts with the application.
Python runs .NET
The last sample was an adaptation of the code here that reads a Twitter feed. Rather than use Twitter (with all the shortened URLs and abbreviations) I decided to use an RSS feed from the BBC to create an IronPython newsreader. The code is remarkably simple:
import clr
clr.AddReference('System.Speech')
clr.AddReference('System.Xml')
from System.Speech.Synthesis import SpeechSynthesizer
from System.Xml import XmlDocument, XmlTextReader

xmlDoc = XmlDocument()
xmlDoc.Load("http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml")
spk = SpeechSynthesizer()
itemsNode = xmlDoc.DocumentElement.SelectNodes("channel/item")
for item in itemsNode:
    print item.SelectSingleNode("title").InnerText
    news = "<?xml version='1.0'?><speak version='1.0' xml:lang='en-GB'><break />" + item.SelectSingleNode("description").InnerText + "</speak>"
    spk.SpeakSsml(news)
This is Python using standard .NET libraries to show how a Python programmer has the .NET framework available to them through IronPython.
Gestalt
The final thing I talked about was Gestalt, which allows you to run Python (and Ruby) in the browser. It does this by using the DLR, which is part of Silverlight – this is all encapsulated in JavaScript.