Thursday, February 19, 2015

Ice coding

It's winter in Michigan. If you've never experienced a Midwest winter, you're missing out on one of the most remarkable tests of endurance a native Californian can undertake. Flowers are blooming up and down the west coast, but Ann Arbor is a world of ice, wind, and road salt. At the moment I write this, Mt. Everest base camp is actually a few degrees warmer (at -9F) than we are. We're around -12F. Just going outside is dangerous.

Since frostbite and ice fishing just aren't my things, I tend to spend a lot of the winter coding. This winter, my team has been working on some big problems in scientific data analysis, and we've made substantial progress in the last few weeks. Our task is to combine many smaller evolutionary trees (e.g., the tree of dog life, the tree of fish life, the tree of mushroom life, etc.) into one massive tree of all life. It's a nontrivial problem: many of the trees conflict with one another in subtle ways, and choosing how best to combine them becomes technically challenging even for small numbers of trees and relatively few species. We're attempting to combine hundreds of trees and produce a tree of life with over 2 million tips.
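
To make "conflict" a little more concrete, here is a minimal Python sketch. It is only an illustration, not the method we actually use: it assumes rooted trees written as nested tuples of species names, and the function names (clusters, conflicts) are made up for this example. Two trees conflict when one groups some species together in a way the other cannot accommodate, that is, when two groups overlap but neither contains the other.

```python
from itertools import product


def clusters(tree):
    """Collect the set of tip names under every internal node (post-order).

    A tree is a nested tuple of strings, e.g. (("dog", "wolf"), "fox").
    This representation is an assumption made for illustration only.
    """
    found = []

    def tips_below(node):
        if isinstance(node, str):  # a tip (species name)
            return frozenset([node])
        tips = frozenset().union(*(tips_below(child) for child in node))
        found.append(tips)  # the root's cluster is appended last
        return tips

    tips_below(tree)
    return found


def conflicts(tree_a, tree_b):
    """Return pairs of clusters, one from each tree, that are incompatible.

    Clusters are first restricted to the species both trees share; two
    clusters conflict if they overlap but neither contains the other.
    Assumes each tree has at least one internal node.
    """
    clusters_a, clusters_b = clusters(tree_a), clusters(tree_b)
    shared = clusters_a[-1] & clusters_b[-1]  # tips present in both trees
    found = set()
    for a, b in product(clusters_a, clusters_b):
        a, b = a & shared, b & shared
        if a & b and not (a <= b or b <= a):
            found.add((a, b))
    return found


# One tree pairs dog with wolf; the other pairs dog with fox.
tree_a = (("dog", "wolf"), "fox")
tree_b = (("dog", "fox"), "wolf")
print(conflicts(tree_a, tree_b))
```

Even this toy check compares every cluster in one tree against every cluster in the other, which hints at why the real problem gets hard with hundreds of trees and millions of tips.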

I've been working on this project for a couple of years now, but we've recently had some methodological breakthroughs. Some of them are great examples of applying classic algorithms, while others rely on much less classic techniques we've had to invent, particularly for merging taxonomic information with evolutionary trees. My next few posts will focus on these challenges and some of the solutions I've come up with to address them.

For now, I'm going to let the dog out and brave the short trip to the bus stop. Even though it's bitterly cold, it's also beautifully sunny today, so besides my hair freezing a few seconds after I walk out the door, it should be a nice walk.