Anna Nowogrodzki, Nature:
On 10 April, astrophysicists announced that they had captured the first ever image of a black hole. This was exhilarating news, but none of the giddy headlines mentioned that the image would have been impossible without open-source software. The image was created using Matplotlib, a Python library for graphing data, as well as other components of the open-source Python ecosystem. Just five days later, the US National Science Foundation (NSF) rejected a grant proposal to support that ecosystem, saying that the software lacked sufficient impact.
It’s a familiar problem: open-source software is widely acknowledged as crucially important in science, yet it is not sustainably funded. Support work is often handled ad hoc by overworked graduate students and postdocs, and can lead to burnout. “It’s sort of the difference between having insurance and having a GoFundMe when their grandma goes to the hospital,” says Anne Carpenter, a computational biologist at the Broad Institute of Harvard and MIT in Cambridge, Massachusetts, whose lab developed the image-analysis tool CellProfiler. “It’s just not a nice way to live.”
Scientists writing open-source software often lack formal training in software engineering, which means that they might never have learnt best practices for code documentation and testing. But poorly maintained software can waste time and effort, and hinder reproducibility. Biologists who use computational tools routinely spend “hours and hours” trying to get other researchers’ code to run, says Adam Siepel, a computational biologist at Cold Spring Harbor Laboratory in New York, and a maintainer of PHAST, a tool used for comparative and evolutionary genomics. “They try to find it and there’s no website, or the link is broken, or it no longer compiles, or crashes when they’ve tried to run it on their data.”
But there are resources that can help, and models to emulate. If your research group is planning to release open-source software, you can prepare for the support work and the questions that will arise as others begin to use it. It isn’t easy, but the effort can yield citations and name recognition for the developers, and improve efficiency in the field, says Wolfgang Huber, a computational biologist at the European Molecular Biology Laboratory in Heidelberg, Germany. Plus, he adds, “I think it’s fun.”