Python in Science

Friday, June 29, 2007

Generate Google Maps/Earth layers from shapefiles

In my work, I deal with geo-referenced data and models all the time, so being able to display the results of my simulations on top of a map is very important.

I use my own software, Epigrass, to run my geo-referenced dynamic population models. Epigrass is great for running complex models, but so far it hasn't helped much when it comes to presenting the results in a nice way.

I then decided to give Epigrass a major overhaul (which I hope to release soon) to include support for shapefiles (.shp). Shapefile is a very common map format, supported by every GIS package I know. Thankfully, there is a great library for handling this type of file (and others too) which has bindings for Python. It's called OGR and is distributed as part of another library called GDAL (apt-get install python-gdal). The following code is taken straight from the Epigrass-devel CVS tree (module epigdal.py) on SourceForge, so feel free to explore the rest of the code if you feel like it.

OGR loads vector maps (and any data associated with them) and makes them available for manipulation in Python.

from osgeo import ogr  # older GDAL releases also accepted a plain "import ogr"
map = ogr.Open('mymap.shp')
layer = map.GetLayer(0)

The code above extracts the first layer from "mymap.shp". A layer is a set of geometrical objects (points, lines or polygons) called features. A data source opened with OGR may contain several layers (a single .shp file normally holds just one). If you want to find out about them, you can write something like this:

nlay = map.GetLayerCount()
layer_list = [map.GetLayer(i) for i in range(nlay)]

Or you might want to get the layer names:

layer_namelist = [map.GetLayer(i).GetName() for i in range(map.GetLayerCount())]

Fortunately my map had a single layer, so I just proceeded to get hold of the features in that layer. In my case, I was only interested in polygons, in order to calculate their centroids (geometric centers). At this point, I have to refer the reader to a link to the code, since Blogger makes it hard to post decently formatted code. For the feature extraction code, look at lines 71-93 of this module.

In that code, I iterate over the layer's features, get the centroids from their geometries, and save both the centroids and the geometry objects in dictionaries, using the variable 'geocode' from the map's own database as the key. Note that I do this only for type 3 geometries (polygons).
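Since the linked module is not reproduced here, the loop described above might look roughly like the sketch below. It is not the Epigrass code: the function name collect_centroids and the constant WKB_POLYGON are made up for illustration, and returning plain (x, y) tuples is an assumption. The method calls (ResetReading, GetNextFeature, GetGeometryRef, GetGeometryType, Centroid, GetField, Clone) are the regular OGR Python API.

```python
WKB_POLYGON = 3  # same numeric value as ogr.wkbPolygon


def collect_centroids(layer, key_field='geocode'):
    """Return (centroids, geometries) dicts keyed by each feature's key_field."""
    centroids = {}
    geometries = {}
    layer.ResetReading()  # make sure we start from the first feature
    feature = layer.GetNextFeature()
    while feature is not None:
        geom = feature.GetGeometryRef()
        # Skip features without geometry and anything that is not a polygon
        if geom is not None and geom.GetGeometryType() == WKB_POLYGON:
            key = feature.GetField(key_field)
            point = geom.Centroid()  # an OGR point geometry
            centroids[key] = (point.GetX(), point.GetY())
            # Clone, because the reference belongs to the feature object
            geometries[key] = geom.Clone()
        feature = layer.GetNextFeature()
    return centroids, geometries
```

With the layer opened earlier, a call such as collect_centroids(layer) would give two dictionaries indexed by geocode. Cloning the geometry is a defensive choice: the object returned by GetGeometryRef is owned by its feature, so keeping it around after the feature is gone is unsafe.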

My next step is then to generate another map layer with centroid data included. This time it will be a layer of points instead of polygons. See how I do that in lines 96-138.

Now we have covered how to read a layer and how to create one, so we have the necessary skills to move on to the main topic of this post: creating a Google Maps/Earth layer from a layer derived from a shapefile, plus whatever data we may want to associate with it. Google uses its own XML schema, called KML, to represent GIS layers. I am not going to explain KML in detail here; try this tutorial or the Google documentation. I am going to create the KML directly using minidom from Python's standard library, and I am going to encapsulate the code in a class to organize it better. Look at the class KMLGenerator, which starts at line 259. To use this class, instantiate it, call the method addNodes (passing a layer object taken from a shapefile as shown above), and then call writeToFile to write your KML file.
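To give an idea of what building KML with minidom involves, here is a minimal sketch. It is not the KMLGenerator class from Epigrass: the function build_kml, its input format and the sample square are made up for illustration, and it uses the OGC KML 2.2 namespace. Each polygon becomes a Placemark whose description is the text shown in the pop-up balloon.

```python
from xml.dom import minidom


def build_kml(placemarks):
    """Build a KML DOM from (name, description, [(lon, lat), ...]) tuples."""
    doc = minidom.Document()
    kml = doc.createElement('kml')
    kml.setAttribute('xmlns', 'http://www.opengis.net/kml/2.2')
    doc.appendChild(kml)
    document = doc.createElement('Document')
    kml.appendChild(document)

    def add_text(parent, tag, text):
        # Helper: append <tag>text</tag> to parent
        node = doc.createElement(tag)
        node.appendChild(doc.createTextNode(text))
        parent.appendChild(node)
        return node

    for name, description, ring in placemarks:
        pm = doc.createElement('Placemark')
        document.appendChild(pm)
        add_text(pm, 'name', name)
        add_text(pm, 'description', description)
        # Nesting required by KML: Polygon > outerBoundaryIs > LinearRing
        parent = pm
        for tag in ('Polygon', 'outerBoundaryIs', 'LinearRing'):
            child = doc.createElement(tag)
            parent.appendChild(child)
            parent = child
        # KML coordinates are "lon,lat,alt" triples separated by spaces
        coords = ' '.join('%s,%s,0' % (lon, lat) for lon, lat in ring)
        add_text(parent, 'coordinates', coords)
    return doc


# Illustrative data only: a closed square ring (first point repeated at the end)
square = [(-43.2, -22.9), (-43.1, -22.9), (-43.1, -22.8),
          (-43.2, -22.8), (-43.2, -22.9)]
kml_doc = build_kml([('Area 1', 'example comment', square)])
kml_text = kml_doc.toprettyxml(indent='  ')
```

Writing kml_text to a file with a .kml extension should be enough for Google Earth to open it. In a real layer, the rings would come from the OGR geometries and the description from the layer's data fields.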

As I create the polygons, I color them according to the values of one of the data fields of the layer. I use matplotlib's cm module and the rgb2hex function to choose the color and convert it to hex format.
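One detail worth spelling out is that KML writes colors as aabbggrr (alpha, blue, green, red), not in the rrggbb order that rgb2hex produces. The sketch below shows the idea with a plain blue-to-red ramp instead of a matplotlib colormap, so that it has no dependencies. The function name value_to_kml_color and the default alpha are assumptions for illustration, not what Epigrass does.

```python
def value_to_kml_color(value, vmin, vmax, alpha=0.6):
    """Map value in [vmin, vmax] to a KML color string on a blue-to-red ramp."""
    if vmax == vmin:
        t = 0.0  # degenerate range: use the low end of the ramp
    else:
        t = (value - vmin) / float(vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp out-of-range values
    red = int(round(255 * t))
    green = 0
    blue = int(round(255 * (1 - t)))
    # KML expects aabbggrr, the reverse of the usual web order
    return '%02x%02x%02x%02x' % (int(round(255 * alpha)), blue, green, red)


low = value_to_kml_color(0.0, 0.0, 1.0, alpha=1.0)   # fully blue
high = value_to_kml_color(1.0, 0.0, 1.0, alpha=1.0)  # fully red
```

With matplotlib, the red, green and blue values would instead come from a colormap call such as cm.jet(t), which returns floats between 0 and 1, and the byte reordering would stay the same.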

To finish it off, the required screenshot:

Notice the polygons colored according to disease prevalence, and the comment which shows up in the pop-up balloon.

I hope you enjoyed reading this post. Please post a comment if you have any further questions or comments.

Thursday, May 10, 2007

Creating Rich Internet Applications with Python

I have been looking for the ideal tool for creating Rich Internet Applications (RIA) with Python. Although there are a ton of web frameworks out there, my favorite being TurboGears, all of them have so far failed to jump on the RIA bandwagon. I have been playing with OpenLaszlo, which, although quite visually stunning, is a bit hard to integrate with a Python backend (but I am working on it, stay tuned!).

So yesterday I came across another framework, called Rialto, which is marketed as language agnostic. It directly supports Python (among other dynamic languages). Its developers also say that their goal is to enable people to create RIAs without having to write DHTML or JavaScript, or to understand DOM concepts.

They have a tutorial integrating their toolkit with CherryPy, so to teach myself Rialto, I decided to port their tutorial to TurboGears. Read along to see how to make a simple app that looks like the figure below.

I assume readers of this howto will be somewhat familiar with TurboGears; otherwise, what's the point? So the first step is to generate a bare-bones TurboGears app with "tg-admin quickstart". Then download and install rialtoPython from the Rialto website.

Then go to your controllers.py and first add the required import line:

from rialtoPython import *

Then, in the class Root, empty the guts of the index method. Change the decorator of this method to:

@expose()
def index(self):

Now we can proceed to insert the Rialto content inside the method. RialtoPython works by assembling the HTML page as one big string, using Python objects to generate the JavaScript parts. That's easy for a simple example like this, but it might get out of hand quickly for larger projects. I believe the Python calls made within the index method could be done inside a Kid template, separating the view from the controller as is expected in a TurboGears application.

Here are the contents of the index method:

@expose()
def index(self):
    rp = rialtoPython.RialtoPython()
    out = []
    DIV = "document.body"
    LAYOUT = "ihmDemo"
    WINDOW = "myWindow"
    WINDOW_TITLE = "Rialto-Turbogears Integration Example"
    FRAME = "boxDemo"
    FORM = "myForm"
    URL = "do"
    # assembling the page
    out.append(rp.addImport(devMode=False))
    out.append('')
    out.append('\trialto/I18N.setLanguage("en");\n')
    # layout
    out.append(rp.openLayout(LAYOUT))
    # app components
    out.append(rp.windowTag(WINDOW, WINDOW_TITLE, DIV))
    out.append(rp.frameTag(FRAME, '30', '30', '300', '120', 'My 1st python Box', 'true', 'false', 'relative', 'false', WINDOW))
    out.append(rp.formTag(FORM, URL, FRAME))
    out.append(rp.labelTag('label1', '25', '10', 'Spam', FORM))
    out.append(rp.textTag('login', '25', '90', '200', 'A', FORM, 'true', 'true', 'Eggs'))

    out.append(rp.buttonTag('doLogon', 'Ni!', '70', '10', 'And now to something completely different', ['FORM', FORM], FORM))
    out.append(rp.buttonTag('default', 'Camelot!', '70', '100', 'The television reception isnt good here anyway', ['FORM', FORM], FORM))
    # close the layout tag
    out.append(rp.closeLayout(LAYOUT))
    out.append(rp.init(LAYOUT))
    # completing the HTML code
    out.append('\n')
    out.append('\n\n')
    return " ".join(out)

Let's now go over how it works. At the beginning of the index method, RialtoPython is instantiated and some variables are defined in order to make the following calls a little more readable.

All the "rp" methods called return strings, which are stored sequentially in the "out" list and then joined into a single string containing the complete code of the application page. The app contains a frame with a form in it; the form holds a text label, a text input box, and a couple of buttons. When we call the method that returns the form code, we specify a URL to handle the data coming from the form. In TurboGears, URLs are exposed methods of the Root class. So let us write this method:

@expose()
def do(self, login, action):
    res = rialtoPython.RialtoPython()
    out = []
    error_msg = []
    if action == 'doLogon':
        if login == "":
            error_msg.append("Empty Spam")
        elif login != "Eggs":
            error_msg.append("Unknown Spam")
        if len(error_msg) > 0:
            out.append(res.showAlert('myAlert', ','.join(error_msg)))
        else:
            out.append(res.showAlert('myAlert', 'This Spam was good!'))
    elif action == 'default':
        out.append(res.showAlert('myAlert', 'Are you sure, you want to go to camelot?'))

    else:
        out.append(res.showAlert('myAlert', 'I dont know anything about the action %s' % action))

    return ''.join(out)

The method "do" shows a floating alert window in response to pressing a button in the form. It takes two arguments: "login", which is the contents of the input text box of the same name, and "action", which is generated by the Rialto API and contains the name of the button pressed.
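If you want to test the decision logic without a running TurboGears server or Rialto installed, one option is to pull it out into a plain function. The sketch below is a suggested refactoring, not part of the Rialto tutorial; the name alert_message is made up. It returns the same messages as "do", which would then only need to wrap the result in a single res.showAlert('myAlert', ...) call.

```python
def alert_message(login, action):
    """Return the alert text for a form submission (same rules as "do")."""
    if action == 'doLogon':
        errors = []
        if login == '':
            errors.append('Empty Spam')
        elif login != 'Eggs':
            errors.append('Unknown Spam')
        if errors:
            return ','.join(errors)
        return 'This Spam was good!'
    if action == 'default':
        return 'Are you sure, you want to go to camelot?'
    # Any other button name falls through to here
    return 'I dont know anything about the action %s' % action


message = alert_message('Eggs', 'doLogon')
```

Keeping the rules in a function like this also makes it easier to move the page assembly into a template later.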

There are only a couple of other things left for us to do before our app is ready:

You have to download the Rialto JavaScript API source distribution and unzip it in the same directory as your controllers.py. After you do that, you should have a "rialtoEngine" directory there.
Now you have to add the following code to your app.cfg in order to let TurboGears find the Rialto code:

[/rialtoEngine]
static_filter.on=True
static_filter.dir="%(top_level_dir)s/rialtoEngine"

[/config.js]
static_filter.on=True
static_filter.file="%(top_level_dir)s/rialtoEngine/config.js"

[/javascript]
static_filter.on=True
static_filter.dir="%(top_level_dir)s/rialtoEngine/javascript"

[/rialto.js]
static_filter.on=True
static_filter.file="%(top_level_dir)s/rialtoEngine/javascript/rialto.js"

[/images]
static_filter.on=True
static_filter.dir="%(top_level_dir)s/rialtoEngine/images"

[/style]
static_filter.on=True
static_filter.dir="%(top_level_dir)s/rialtoEngine/style"

Now just start your application and point your browser to http://localhost:8080 to see your first rich internet application, written entirely in Python, running.

Tip: This code may require the latest version of RialtoPython, released in May 2007.

Saturday, May 5, 2007

Running OLPC on VirtualBox

I have a project that I have been meaning to port to the OLPC, but I had been postponing it while waiting for a more mature development environment.
Well, with the release date set for October, I decided that waiting is no longer an option. On my main machine I have the sugar-jhbuild environment, which is OLPC's GUI built from scratch within a Linux environment, and the QEMU image, which boots normally but is a bit too slow for my taste.

So I decided to try running OLPC from VirtualBox, a new GPL'd virtualization tool that I have been using quite happily lately (much better than VMware).

This post is intended as a "how to run OLPC from VirtualBox" type of article. So let's get started:

My starting point is Ubuntu Feisty with VirtualBox already installed. Installing VirtualBox on an Ubuntu box is a matter of pointing and clicking if you have Automatix installed.
First you need to obtain the latest development image from here.
Start VirtualBox by clicking its icon in the menu.
Click on the "New" button to create a new virtual machine. A wizard dialog will come up. Click next and pick a name, "OLPC" for example. In the combo box below, pick "Linux 2.6" as the OS type. Click next.
Pick the recommended memory size (128MB). Click next.
Since the image you downloaded is a live CD image, you don't have to create a hard disk, so just click next.
Click finish. You are almost done. You are taken back to the main window, and the panel on the left shows the VM you just created.
Click on the icon corresponding to the VM you created. On the right-side panel a description of it will show up.
Click on "CD/DVD-ROM". Check the box that says "mount CD/DVD drive".
Select the "ISO Image File" radio button.
Select the ISO image file you downloaded previously.
Click OK and you're done! Just go ahead and start your virtual machine!

Now VirtualBox will show a console where the typical Linux boot-up messages scroll by. After the boot is complete, you will be presented with the XO's login screen. Type your name, click on the ">" icon, and you will be taken to the Sugar desktop.

Hint: once you click on the desktop, the mouse and keyboard are captured by the VM. To release them, press the right CTRL key.
Enjoy!