One of the things I sorely missed from matplotlib for a very long time, was a violin plot implementation. Many a time, I thought about implementing one myself, but never found the time.
Today, browsing through Matplotlib's documentation, I found the recently added fill_betweenx function. Finally it seemed to have become a piece of cake to implement a violin plot. I Googled for violin plot and Python, to no avail. So I decided to write it myself.
Violin Plots are very similar to Box and whiskers plots, however they offer a more detailed view of a dataset's variability. It's frequently a good idea to combine them on the same plot. So here is what I came up with:
# -*- coding: utf-8 -*-
from matplotlib.pyplot import figure, show
from scipy.stats import gaussian_kde
from numpy.random import normal
from numpy import arange
def violin_plot(ax,data,pos, bp=False):
create violin plots on an axis
dist = max(pos)-min(pos)
w = min(0.15*max(dist,1.0),0.5)
for d,p in zip(data,pos):
k = gaussian_kde(d) #calculates the kernel density
m = k.dataset.min() #lower bound of violin
M = k.dataset.max() #upper bound of violin
x = arange(m,M,(M-m)/100.) # support for violin
v = k.evaluate(x) #violin profile (density curve)
v = v/v.max()*w #scaling the violin to the available space
pos = range(5)
data = [normal(size=100) for i in pos]
ax = fig.add_subplot(111)
The next step now is to contribute this plot to Matplotlib, but before I do that, I'd like to get some comments on this particular implementation. Moreover, I don't know if it'd be acceptable for Matplotlib to add Scipy as a dependency. But since re-implementing kernel density estimation for a simple plot would be overkill, maybe the destiny of this implementation will be to live on as an example for others to adapt and use.
WARNING: This code requires maplotlib 0.99 (maybe 0.99.1rc1) to work because of the fill_betweenx function.