Fighting Cancer's Crisis of Confidence, One Study at a Time

The Reproducibility Project announces its initial cancer study results today.
An illustration of cancer cells undergoing cytokinesis. Getty Images

Every year the US government spends $5 billion on cancer research. That’s not counting outgoing Vice President Joe Biden’s $1 billion cancer moonshot, or private projects like Sean Parker’s $250 million cancer immunotherapy institute. And yet more than 8 million people still die every year from the disease—despite the frequent refrain that a cure is just around the corner. Exhibit A: this 2003 WIRED article, “The End of Cancer.”

It’s a tempting prediction. Scientists today are exploring more new technologies than ever before: whole-genome sequencing, liquid biopsies, mRNA vaccines, AI-powered imaging analysis. But more doesn’t necessarily mean more promising. No number of flashy new disruptors can fix cancer research’s real problem: much of its data can’t be trusted, because it was never validated. That’s why the Center for Open Science is reexamining the results of 29 of the most influential cancer papers of the last few years. Today, it published the first findings from its newest reproducibility project.

Do they restore faith in the foundations of cancer research? Not exactly. But they’re a start.

Revolutionizing Reproducibility

Scientists from two huge pharmaceutical companies, of all people, began sounding the alarm about cancer’s reproducibility crisis a little over five years ago. In 2011, a team from Bayer Healthcare reported that when they tried to replicate the results of basic cancer studies (something drug companies do routinely to direct new drug development), they could validate only 25 percent of them. Shortly after, the former head of cancer research at Amgen published a paper in Nature saying that his scientists could replicate only six out of 53 “landmark” cancer studies. The reports shocked the cancer community and kicked off a media storm questioning the legitimacy of cancer science, and of science in general.

Brian Nosek saw this as a challenge. A University of Virginia psychologist, Nosek began investigating the reproducibility of canonical psychology studies in 2011, then co-founded the Center for Open Science in 2013 to carry the work forward. A few years later he turned his attention to cancer biology, and with $2 million from replication advocates John and Laura Arnold he assembled a team to reconstruct the individual experiments of the 50 most influential cancer biology papers published between 2010 and 2012.

It started out simply enough. Nosek enlisted microbiologist Tim Errington to manage the project and Elizabeth Iorns at the rent-a-researcher firm Science Exchange to farm out tasks to her network of 900 privately contracted labs. They began emailing authors to collaborate on an approach to replicating the key findings of each paper, which ranged from investigations into prostate cancer-fighting microRNA to the impact of gut microbes on colon cancer. But that’s where the project began to stumble. Cells and mice and peptides, it turned out, are difficult to share. You can’t just go to the fridge and then pop down to your local FedEx.

“The norm for these kinds of studies is to not include all the raw data or a detailed protocol,” says Iorns. Instead, she and Errington had to track down that information from each of the original authors, a time-consuming process neither party much enjoyed. Not only did some of the researchers find the whole thing a nuisance, but sometimes the labs didn’t even know who did what on the original paper, because the graduate students or postdocs who did the bulk of the work had since moved on. “I was surprised how much institutions are not set up to support reproducibility,” says Errington. “They’re actually set up to protect materials against sharing.” After losing more time and money to this process than expected, the project downsized last summer from 50 papers to 29.

So far, the Center has completed seven of those studies, and eLife published the first five fully analyzed efforts today. In stark terms, one failed to replicate, two were inconclusive due to technical difficulties, and two generally supported the initial findings.

But Errington and Nosek caution that those surface judgments don’t mean much. Mostly they just show how hard it is to interpret replication results. Studies can fail a reproducibility test for a number of reasons, none of which mean the original results are false. Teasing that out is where it gets interesting.

Controlling for Cancer

Take, for example, Levi Garraway’s 2012 paper about melanoma. Garraway and his team showed that mutations in a gene called PREX2 sped up the growth of human tumor cells transplanted into mice. The finding, originally published in Nature, has been cited 422 times, including in papers evaluating PREX2 as a potential target for small-molecule therapies for skin cancer.

When Errington’s replicating team tried the same experiment with the same strain of mice and melanoma cells from Garraway’s lab at the Dana-Farber Cancer Institute, they found that the PREX2 mutations made no difference in tumor growth. But that’s because all the mice started getting tumors within a week or two, regardless of whether their cells carried the mutation. The team couldn’t observe any differences, because the tumors were progressing at turbo-charged speed.

Does that mean the study failed? Not really. Because the model didn’t act the way they expected, Errington’s team couldn’t even test the hypothesis. The cells could have changed in the intervening years, or the mice could have been raised differently, or any number of other things could have happened. There’s no way to know for sure.

Contrast that to a 2011 paper written by researchers at Stanford and the Lucile Packard Children’s Hospital describing a computer model to predict whether already-approved drugs would also be good at fighting cancers. They validated the approach by testing those drugs against controls on different kinds of tumors. When the replication team retested the same drugs on the same kinds of cancer cells, they found similar effects, but they weren’t as significant as the original findings. Now the conversation researchers can have is about statistical cutoffs, not whether or not the model works. And that’s a significant narrowing of scientific uncertainty.

Which is, after all, what science is all about. “Humans love certainty, but the evidence rarely provides it,” says Nosek. “Each study reduces the incompleteness of our understanding about the world. And we have to just do it over and over until we get to a place where we get consistent evidence.”

So, five studies down, only 4.5 million more to go. But who’s counting?