Tuesday, September 29, 2009

The Internet Manifesto

This document is a must-read (see link at the end).

I want to add my own items to it:

1. Net neutrality is not only about protecting the profits of Internet content provider corporations.

We need to defend net neutrality in a way that goes beyond what is currently done: we need to establish fixed IP addresses for every private individual, so that publishing rights are not gatekept by large corporations such as Google and the like.

2. Copyright should not be a sellable item.

The source of most of the confusion about whether copyright is a good thing in the information age is the fact that most of the commercial exploitation of copyrighted material is not done by the original authors, but by large publishing houses. These either coerce authors into giving up the commercial rights to their work in exchange for pennies, or exploit material that should be in the public domain for as much as 70 years after the author's death.

3. Network infrastructure ownership should not be a privilege of large corporations.

The right to form open "ad hoc" wireless networks should be guaranteed globally, as we have had with amateur radio for decades. This is the only way to assure the basic freedom of association and expression.

in reference to: Internet-Manifesto (view on Google Sidewiki)

Monday, September 28, 2009

Python(x,y) A Scientific Python Distribution

I recently came across this interesting open-source scientific Python distribution for Windows. I normally don't pay much attention to Windows tools, but it's good to have something to recommend to Windows users when you want them to try out some Python code.

For Linux users, it's not really relevant, because we all have powerful package managers to help us get most Python packages installed very easily.

The downside is that it is a really big download, 300+ MB, and the only mirror available was giving me only about 15 Kbps today (I am on an 18MBps connection). It is that big because it bundles the Enthought Tool Suite and much more.

For users of more civilized operating systems, like Linux, it's worth checking out one of the bundled editors, Spyder, which is available through PyPI ("easy_install spyder"). It's a reincarnation of Pydee and, despite its beta status, it is already very good.

in reference to: python(x,y) - Python for Scientists (view on Google Sidewiki)

Thursday, September 24, 2009

Scientific Python Group at LinkedIn

I have just created a Scientific Python Group at LinkedIn. I was actually surprised that there wasn't one already.

Python is quickly becoming a major tool in scientific computing and we should all do our best to advertise its capabilities.

in reference to: Scientific Python Group | LinkedIn (view on Google Sidewiki)

Thursday, September 17, 2009

Violin Plot with Matplotlib

One of the things I sorely missed from matplotlib for a very long time was a violin plot implementation. Many a time I thought about implementing one myself, but never found the time.

Today, browsing through Matplotlib's documentation, I found the recently added fill_betweenx function. Finally it seemed to have become a piece of cake to implement a violin plot. I Googled for violin plot and Python, to no avail. So I decided to write it myself.

Violin plots are very similar to box-and-whisker plots; however, they offer a more detailed view of a dataset's variability. It's frequently a good idea to combine them on the same plot. So here is what I came up with:

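# -*- coding: utf-8 -*-
from matplotlib.pyplot import figure, show
from scipy.stats import gaussian_kde
from numpy.random import normal
from numpy import arange

def violin_plot(ax, data, pos, bp=False):
    """
    Create violin plots on an axis, optionally overlaying box plots.
    """
    dist = max(pos) - min(pos)
    w = min(0.15 * max(dist, 1.0), 0.5)  # maximum half-width of each violin
    for d, p in zip(data, pos):
        k = gaussian_kde(d)  # calculates the kernel density
        m = k.dataset.min()  # lower bound of violin
        M = k.dataset.max()  # upper bound of violin
        x = arange(m, M, (M - m) / 100.)  # support for violin
        v = k.evaluate(x)  # violin profile (density curve)
        v = v / v.max() * w  # scaling the violin to the available space
        # mirror the profile around position p (colour and alpha are arbitrary choices)
        ax.fill_betweenx(x, p, v + p, facecolor='y', alpha=0.3)
        ax.fill_betweenx(x, p, -v + p, facecolor='y', alpha=0.3)
    if bp:
        # overlay box-and-whisker plots at the same positions
        ax.boxplot(data, positions=pos)

if __name__ == "__main__":
    pos = list(range(5))
    data = [normal(size=100) for i in pos]
    fig = figure()
    ax = fig.add_subplot(111)
    violin_plot(ax, data, pos, bp=True)
    show()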

The next step now is to contribute this plot to Matplotlib, but before I do that, I'd like to get some comments on this particular implementation. Moreover, I don't know if it'd be acceptable for Matplotlib to add Scipy as a dependency. But since re-implementing kernel density estimation for a simple plot would be overkill, maybe the destiny of this implementation will be to live on as an example for others to adapt and use.

WARNING: This code requires matplotlib 0.99 (maybe 0.99.1rc1) to work because of the fill_betweenx function.
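On the SciPy dependency question: for the one-dimensional case a violin needs, a Gaussian kernel density estimate may be smaller than it sounds. Below is a minimal NumPy-only sketch. The function name gaussian_kde_1d is hypothetical (it is not part of Matplotlib or SciPy), and the bandwidth is assumed to follow Scott's rule, which is also the default in scipy.stats.gaussian_kde.

```python
import numpy as np

def gaussian_kde_1d(sample, grid):
    """Evaluate a 1-D Gaussian kernel density estimate of `sample` on `grid`.

    Hypothetical helper for illustration only. The bandwidth follows
    Scott's rule for one dimension: h = std * n**(-1/5).
    """
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = sample.size
    h = sample.std(ddof=1) * n ** (-1.0 / 5.0)  # Scott's rule bandwidth
    # standardised distance from every grid point to every observation
    z = (grid[:, None] - sample[None, :]) / h
    # average of the Gaussian kernels centred on each observation
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

# Example: estimate the density of 100 normal draws on a grid that
# extends a little beyond the data so the tails are captured.
rng = np.random.default_rng(0)
sample = rng.normal(size=100)
grid = np.linspace(sample.min() - 3, sample.max() + 3, 400)
density = gaussian_kde_1d(sample, grid)
```

In violin_plot, the returned array would play the role of v = k.evaluate(x). This covers only the one-dimensional, unweighted case, so it is a sketch rather than a drop-in replacement for gaussian_kde.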

Friday, September 11, 2009

Why Python

I recently came across a repository of jewels in the form of the unpublished manuscripts of E.W. Dijkstra. I just finished reading one entitled "Some Meditations on Advanced Programming". It is amazingly well written and still so relevant to present-day computer science that I recommend anyone with at least a passing interest in the subject to read it.

I was pleasantly surprised to find, in the last remark of the manuscript, something which can be considered, IMHO, the most fundamental design principle of Python and my main reason to love it. I quote:

"As my very last remark I should like to stress that the tool as a whole should have still another quality. It is a much more subtle one; whether we appreciate it or not depends much more on our personal taste and education and I shall not even try to define it. The tool should be charming, it should be elegant, it should be worthy of our love. This is no joke, I am terribly serious about this. In this respect the programmer does not differ from any other craftsman: unless he loves his tools it is highly improbable that he will create something of superior quality.

At the same time these considerations tell us the greatest virtues a program can show: Elegance and Beauty."

Wednesday, September 2, 2009

New-style String Formatting and LaTeX

The recommended new way (since 2.6) of doing string formatting in Python is to use the format method of string objects instead of the % operator and its %s-style placeholders.

I decided to try it out to generate some LaTeX tables programmatically. Bad idea. The method expects placeholders like this: "some string {key}".format(key=123), and that is a big problem when formatting LaTeX strings. Curly braces are very significant characters in LaTeX, so you will get KeyErrors all the time. The only solution I see is for format to silently ignore KeyErrors, but that is not currently an option.

Has anyone else devised a solution to this? The scary part of this is that the official documentation says that the % operator for string formatting will eventually be removed (?!?) from the language.
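Two workarounds that should help, sketched below with made-up table content: str.format treats doubled braces ({{ and }}) as literal braces, and string.Template from the standard library uses $name placeholders, so it leaves LaTeX braces alone. The names row_template, row, header_template and header are only for illustration.

```python
from string import Template

# Option 1: str.format with doubled braces.
# "{{" and "}}" come out as literal "{" and "}", while single braces
# still mark replacement fields such as {name} and {value:.2f}.
row_template = r"\textbf{{{name}}} & {value:.2f} \\"
row = row_template.format(name="alpha", value=3.14159)

# Option 2: string.Template, which uses $name placeholders and
# does not treat braces specially.
header_template = Template(r"\begin{tabular}{$cols}")
header = header_template.substitute(cols="lr")

print(row)
print(header)
```

The doubled braces get noisy in brace-heavy LaTeX, and Template trades that problem for another one: a literal $ (LaTeX math mode) has to be written as $$. Which one is less painful probably depends on the table.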