Author Archives: James Casbon

something about me

Variant Call Format: really? 4

The 1000 Genomes Project is making its genotypes available in Variant Call Format (VCF). As others have noticed, VCF isn’t the prettiest format around. There are a few things to dislike:

The data is in ‘wide’ format, which means that a file is fifteen screens wide and hides rare variation in a load of noise [...]
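To make the ‘wide’ versus ‘long’ point concrete, here is a minimal sketch that reshapes VCF lines into one record per sample per variant and drops the homozygous-reference and missing calls that make up most of the noise. The function name `vcf_to_long` and the sample data are made up for illustration; it assumes tab-separated VCF with a GT entry in the FORMAT column and is not a full parser.

```python
def vcf_to_long(lines):
    """Yield (chrom, pos, ref, alt, sample, genotype) for non-reference calls.

    Hypothetical helper for illustration: assumes tab-separated VCF lines
    with a '#CHROM' header row and a GT key in the FORMAT column.
    """
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        gt_index = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            genotype = call.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # Homozygous-reference and missing calls carry no variation.
            if all(allele in ("0", ".") for allele in alleles):
                continue
            yield (chrom, int(pos), ref, alt, sample, genotype)


example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT:DP\t0/0:10\t0/1:8\t./.:0",
]
records = list(vcf_to_long(example))
```

In the long form a rare variant shows up as a handful of rows rather than as a few non-zero cells in a very wide line.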

Looking into PiCloud’s Sandbox 2

EDIT: this was due to a Python 2.7 incompatibility and incorrect documentation. These examples all work with Python 2.6.
PiCloud looks very interesting. On-demand Python execution could give you a much greater level of control over cloud computing use. Of course, sandboxing Python is not easy, despite some well-known implementations.
So PiCloud [...]

Friday quiz: where and when does this describe? 1

But these evils, though great, were small compared to
those far more deep-seated signs of disease which now
showed themselves throughout the country. One of these
was the obliteration of thrift from the minds of the ***
people. The *** are naturally thrifty; but, with such
masses of money and with such uncertainty as to its future
value, the ordinary motives [...]

Siôn Simon’s Comments on Tom Watson in the DEB debate: Beyond Irony? 0

Siôn Simon decided to caricature the opponents of the Digital Economy Bill with a piece of ‘fan fiction’ based on Star Wars. As Chilling Effects notes, copyright law is not clear about the use of characters in derivative works. However, I’m sure that, as a staunch defender of copyright, he will have cleared [...]

pivot tables with sqlalchemy 0

If your database doesn’t support pivots, here is a quick technique to get pivot columns with sqlalchemy:

import operator
from sqlalchemy.sql import case, func, select

def pivot_report(report, pivot_on=None, pivot_columns=None, pivot_func=func.sum,
                 non_pivot_columns=None, group_by=None):
    """ produce a pivot [...]
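The excerpt cuts off before the function body, so the following is not the author’s `pivot_report`. It is a small sketch of the general idea that pivot columns can be emulated with one aggregated CASE expression per pivot value. It uses plain SQL through the standard-library `sqlite3` module instead of sqlalchemy; the `pivot_query` helper, the `sales` table and its columns are all hypothetical.

```python
import sqlite3


def pivot_query(table, row_key, pivot_on, pivot_values, value_column):
    """Build a SUM(CASE ...) pivot query (hypothetical helper).

    Table and column names are interpolated directly, so they must come
    from trusted code; the pivot values are passed as bound parameters.
    """
    cases = ", ".join(
        f'SUM(CASE WHEN {pivot_on} = ? THEN {value_column} ELSE 0 END) AS "{value}"'
        for value in pivot_values
    )
    sql = (
        f"SELECT {row_key}, {cases} FROM {table} "
        f"GROUP BY {row_key} ORDER BY {row_key}"
    )
    return sql, list(pivot_values)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", "q1", 10), ("north", "q2", 20), ("south", "q1", 5), ("north", "q1", 1)],
)

# One output column per pivot value, one output row per region.
sql, params = pivot_query("sales", "region", "quarter", ["q1", "q2"], "amount")
rows = conn.execute(sql, params).fetchall()
```

Each pivot value becomes its own column, and rows with no matching data fall back to 0 through the ELSE branch.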

object ceremony, dynamic languages, JSON and algebraic data types 0

The reason most people end up using a dynamic language is to avoid the boilerplate associated with object creation. You know, typing “FileWriter fout = new FileWriter("fred.txt");” gets boring quickly. I think this is a good enough reason to move to another language on its own. This boilerplate is also sometimes called [...]

cogent: the unsung hero of bioinformatics and python 0

I recently started using cogent, the COmparative GENomics Toolkit, and discovered that it is an excellent piece of kit. A Google search for ‘python ensembl’ doesn’t show it at all, yet it definitely has the best bindings for Ensembl available in Python. They’re based on sqlalchemy, making it easy enough to [...]

Installing python bioinformatics tools with virtualenv and pip 0

Python seems to have developed a decent set of tools for quickly building development environments. I want to store my notes on how to get a good environment for bioinformatics set up quickly.
First of all, if you haven’t already, install virtualenv and pip. Both are easily installable. Now install virtualenvwrapper.
Now we [...]

More money… 0

My company has met its goals and secured a second tranche of VC money. To celebrate, we’ve even got a website: Population Genetics Technologies.

Making textmate virtualenv aware 1

So I am using TextMate for my Python development, but I wanted it to pick up any virtualenv configured in a project. Here’s how to hack the Python bundle…
First off, the run script command needs to be aware of the virtualenv stuff. So open up the bundle editor, and replace this:
is_test_script = ENV["TM_FILEPATH"] [...]