In 2012, the biotechnology firm Amgen attempted to replicate “landmark” research on blood disorders and cancer. Its scientists chose 53 widely cited and circulated studies, all of which described novel findings or approaches for cancer treatment.
But they were largely unsuccessful. Despite working closely with each study’s original authors to accurately reproduce the experiments’ materials and procedures, the team could replicate only six of the studies.
This isn’t an isolated incident: One study found that of 83 highly cited publications on psychiatric treatments, only 43 had been subject to replication attempts, and 27 were contradicted or found to have substantially smaller effects than initially reported. Another study estimates that up to $28.2 billion is spent annually in the U.S. on preclinical research that can’t be reproduced. These and other statistics sent shock waves through the scientific community, prompting debate, which continues to this day, over how to address the so-called “replicability crisis” in science.
I won’t pretend I can satisfyingly explain these phenomena in a single column. But it is worthwhile to explore what has led to this lack of replicability. How do we mend it? And more broadly, considering the inherent heterogeneity and uncertainty in science, is full replicability even possible?
When it comes to questions of replicability, it is easy to think of fraudulent conduct, like the scandal Harvard faced in 2018, when it emerged that 31 publications on heart muscle regeneration via stem cell treatment by Harvard Medical School researcher Piero Anversa contained fabricated or falsified data. Instances of dishonesty like these will, unfortunately, keep happening.
But oftentimes, studies can’t be replicated because of decisions that are made unintentionally or even unconsciously.
For example, the “same” cell lines can be dramatically different across labs. According to Phillip Sharp, a professor of biology at MIT and winner of the 1993 Nobel Prize in Physiology or Medicine, current models using cell lines vary immensely from lab to lab and experiment to experiment.
“A lot of the lack of reproducibility in some cancer models, which is really a major part of literature now, is the use of cells that we call HeLa, or other cancer cells,” he said in an interview. Sharp co-chaired a 2009 National Academies report on data integrity, accessibility, and stewardship.
According to him, these cells are “genetically unstable by their nature,” and tending to them “in different conditions, different nutrients, different times, different temperatures, whatever” can often produce wildly different results: “not one in a million but one in 10!”
While there are efforts to use more shared cell lines or animal models with immune systems that can better model human disease, a lack of replicability stems from a deeper, systemic incentive structure, according to Jonathan Kimmelman, director of the Biomedical Ethics Unit at McGill University.
“Oftentimes, incentives in research are structured in a way that is not necessarily maximally aligned to producing unbiased and precise estimates around causal claims,” Kimmelman said in an interview.
“If your salary, if your ability to earn grants is going to be based on publications in New England Journal of Medicine or publications in Nature, et cetera, there are lots of different ways that you might, as a scientist, subconsciously interpret some of your findings in ways that are more sympathetic to that positive result than they should be,” he said.
There are scholars who say there isn’t a replicability crisis at all. A 2019 report from the Committee on Reproducibility and Replicability in Science at the National Academies of Sciences, Engineering, and Medicine stated that “there seems to be an emerging consensus that it is not helpful, or justified, to refer to psychology as in a state of crisis.” Per the report, “we don’t have enough information to provide an estimate with any certainty for any individual field or even across fields in general.”
Sure, we still have to determine the exact extent of science’s lack of replicability, but that shouldn’t be an excuse to downplay the issue. That same committee notes that finding previous studies non-replicable can even be helpful when it reveals “inherent but uncharacterized uncertainties in the system being studied.”
Sharp says science is “a process by which the human mind has discerned that we can interrogate the physical external world to learn material truths.” According to him, “we’re always interpreting.”
If ten chefs follow the same recipe but produce nine different dishes, perhaps that reveals less about the chefs than about the inherent complexity and unpredictability of the process. The same is true of science. Better understanding this complexity and confronting science’s limitations, then, even if uncomfortable, is a worthwhile pursuit.
So what mechanisms can ensure scientific integrity? The committee’s report offers a few solutions, many of which seem more commonsensical than glamorous: Publish more negative results; establish requirements for thorough and easily accessible code and data, especially as computational research grows; implement training in record-keeping and research ethics; and create and secure open-source infrastructure.
These mechanisms won’t be enough, but they are a necessary start. Science’s lack of replicability is an issue with incredible nuance, one that warrants much more consideration than a declaration to publish all negative results or bolster funding incentives. Scientific integrity is paramount for researchers, administrators, and journals alike.
“Science has its value based on what it could do to promote human knowledge,” Insoo Hyun, a bioethics professor at Case Western Reserve University and faculty member at Harvard Medical School, told me. “It kind of has this consequentialist goal: better produce something worthwhile.”
Therefore, a lack of replicability, whether intentional or not, becomes unethical.
“You want your scientific experiment to be meaningful,” Hyun said. “But if it’s not — if it’s not set up to actually help expand knowledge or translate benefits for patients and society — you’re actually doing a great ethical disservice.”
Julie Heng ’24 is a Crimson Editorial editor. Her column runs on alternate Tuesdays.