Massive 3-D Cell Library Teaches Computers How to Find Mitochondria

Using images of 6,000 stem cells, scientists have created the first-ever deep learning model to predict how cells are organized.

Graham Johnson is an artist with a curious muse: the human cell. He’s the Matisse of mitochondria, the Goya of the Golgi apparatus. Twenty years ago he graduated from a quiet corner of Johns Hopkins where students draw cadavers instead of cutting them up. At first, Johnson stuck to the medical illustrator canon, animating cells in a classic, cartoonish style. But he dreamed of constructing three-dimensional, data-driven models that could capture all their beautiful complexity. For that, he’d need computers, lots of them. And some really powerful microscopes.

Johnson found them both at the Allen Institute for Cell Science---a Seattle-based research center established in late 2014 by Microsoft co-founder Paul Allen. (Before getting recruited to the center, Johnson completed a PhD in computational biology and molecular graphics.) Today, he and the institute’s team of nearly 50 cell biologists, microscopy specialists, and computer programmers revealed what they’ve been working on the past two years: the Allen Cell Explorer. It’s the largest public collection of human cells ever visualized in 3-D, which serves as fuel for the project’s engine: the first-ever deep learning model to predict how cells are organized.

To create their model of the organic shapes and structures inside the cell, the Allen team trained deep learning algorithms on 3-D images of more than 6,000 induced pluripotent human stem cells. But first they had to make those images. They dyed each cell’s outer membrane and nuclear DNA to stand as lighthouses in a sea of cellular noise. Then they used Crispr/Cas9 gene editing to fluorescently tag well-known proteins in structures like microtubules and mitochondria. Powerful microscopes captured the multicolored light display.

The Allen team then used computer software to stitch dozens of these flat images into 3-D renderings of cells---just like radiologists do with CT scans. These are the images you can spin around in the platform’s browser tool; click here and see the mitochondria, click there and see the nucleus. “After decades of reductionist biology, it’s exciting to put the pieces back together into a whole picture of the cell,” says Johnson, who directs the institute’s animated cell group.
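As a rough illustration of that stitching step, and not the Allen team's actual software, the sketch below stacks a few flat image slices into a volume and then collapses it along the depth axis, the way a viewer might render a single flattened view. The function names and the tiny 2x2 slices are hypothetical, chosen only to keep the example readable.

```python
def stack_slices(slices):
    """Stack equally sized 2-D slices (lists of rows) into a volume indexed [z][y][x]."""
    if not slices:
        raise ValueError("need at least one slice")
    height, width = len(slices[0]), len(slices[0][0])
    for s in slices:
        if len(s) != height or any(len(row) != width for row in s):
            raise ValueError("all slices must have the same shape")
    # Copy every row so the volume does not share lists with the input.
    return [[list(row) for row in s] for s in slices]


def max_projection(volume):
    """Collapse the volume along z, keeping the brightest value at each (y, x)."""
    height, width = len(volume[0]), len(volume[0][0])
    return [
        [max(plane[y][x] for plane in volume) for x in range(width)]
        for y in range(height)
    ]


# Three made-up 2x2 slices standing in for microscope images at different depths.
slices = [
    [[0, 1], [2, 0]],
    [[5, 0], [0, 3]],
    [[1, 4], [0, 0]],
]
volume = stack_slices(slices)
projection = max_projection(volume)
```

Real microscope stacks hold far more pixels per slice and would normally live in an array library, but the idea is the same: each flat image becomes one layer of the volume.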

From that library, Johnson’s group took hundreds of measurements---from distances between structures to protein density. Their algorithms used those numbers to kick out predictions of where certain structures should go in a cell. If you give the model pictures of the nucleus and the cell membrane and the microtubules, it can tell you where to find the mitochondria, for instance.
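To make "distances between structures" concrete, here is a minimal sketch of one such measurement: the distance between the centers of two labeled structures in a 3-D image. This is an assumption about what a measurement of this kind could look like, not the institute's method; the voxel coordinates and function names are invented for illustration.

```python
import math


def centroid(voxels):
    """Mean (z, y, x) position of a list of voxel coordinates."""
    if not voxels:
        raise ValueError("structure has no voxels")
    n = len(voxels)
    return tuple(sum(v[i] for v in voxels) / n for i in range(3))


def centroid_distance(a, b):
    """Straight-line distance between the centroids of two structures."""
    return math.dist(centroid(a), centroid(b))


# Hypothetical voxel coordinates (z, y, x) for two tagged structures.
nucleus = [(0, 0, 0), (0, 0, 2), (0, 2, 0), (0, 2, 2)]  # centroid (0, 1, 1)
mitochondria = [(0, 3, 5), (0, 5, 5)]                   # centroid (0, 4, 5)

distance = centroid_distance(nucleus, mitochondria)
```

Repeating measurements like this across thousands of cells yields the kind of numeric table a predictive model can learn from.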

“It’s sort of like if you’ve got some images of the wheels of cars,” says Susanne Rafelski, who leads the institute’s assay development team. “Just from that limited information, it can predict the make and model of those cars.” The website currently showcases comparisons between the model’s predictions and 2-D image data, but future iterations will allow users to generate and explore cells in three dimensions. Down the road, Rafelski says they’ll be able to zoom around time as well as space---seeing what happens to a cell through its life when it’s growing, dividing, damaged, or dying.

But the explorer isn’t just for digital tinkering. Its cell catalog contains detailed information for all of the institute’s fluorescently tagged human stem cell lines, which any scientist can order online at the cost of distribution (around $600). Rafelski and Johnson are hoping that the suite of resources they’ve compiled convinces more cell biologists and drug developers to conduct their studies with stem cells instead of the cancer cell-derived cultures that are prevalent today.

Cancer cell lines are widely used to test drugs and vaccines for everything from HIV to heart disease. (HeLa, the most commonly used human cell line, is involved in more than 11,000 patents.) And while those lines are easy to work with and keep alive for a long time---HeLa is 66 years old this year---they also have high mutation rates, which means scientists using the same cell lines years apart might get wildly different results. “We’re really trying to make a standardization tool,” says Rafelski. “There’s so much variability when we develop prototypes to make different disease models. It’d be much better if we were all testing on the same basic cells.”

The timing is auspicious; stem cells are getting easier (and cheaper) to work with. And scientists are getting better and better at nudging them into becoming different kinds of cells, from neurons and cardioblasts to pancreas precursors and liver buds. Which means that Johnson may soon have more muses than he bargained for. But for now, he’s concentrating on the masterwork in front of him.