Researchers at a Harvard Medical School laboratory are uncertain how they will continue supporting a large public genetic database after its primary source of funding expired last month.
The Allen Ancient DNA Resource is a manually curated collection of genetic data from thousands of ancient and present-day individuals, covering more than 1.2 million positions in the genome. The project’s eight-year grant from the Paul G. Allen Family Foundation ended in September, leaving its future unclear.
David E. Reich ’96, professor of genetics at HMS who leads the AADR, said the lab is still looking for a funding source and is at risk of shutting down the publicly accessible resource that has been downloaded more than 67,000 times by researchers across the world.
“There are no specific sources of funding that are currently supporting the database,” Reich wrote. “We cannot sustain this for more than a short term and would have to end the AADR without renewal of a funding source.”
The most recent version was published in September 2024. Reich’s lab anticipated an updated release for the spring of 2025, but it was postponed due to ongoing funding uncertainties.
His research team submitted a proposal to the National Institutes of Health to renew a grant that has sustained the Reich Lab’s research for fourteen years, with the specific intention to maintain the AADR.
Reich wrote that the proposal received an “excellent score” and would have been guaranteed funding under normal circumstances. But due to reductions in federal awards to Harvard, he expects that funding will not be continued.
The AADR originated from the Reich lab’s efforts to reconstruct human genetic history and make ancient DNA data broadly accessible. Reich said the lab initially used its own archaeological and genetic data but expanded to include data from other studies after a surge of ancient DNA research in 2010.
“We had an initiative within the laboratory to create a centralized data set that allowed us to access all this information in a very uniform way, in flat text files that were then linked to the actual genetic data,” Reich said.
The lab received support from the Allen Foundation in 2017 to curate global ancient DNA data through the AADR. The dataset also includes archaeological information, radiocarbon dates, and details such as individuals’ estimated age at death.
Originally accessible through the lab’s website, the data now sit in a “professional” repository that receives more than 200 downloads a day. The Reich lab’s bioinformatics director Shop Mallick said the project filled a major gap in the field by making human ancient DNA datasets easier to access and analyze.
“This seems to be a very unique thing in the human ancient DNA community, to create a centralized repository that people can use to just analyze from without going through the mechanical work of bringing different data sets together,” Mallick said.
“It's a very unique resource that people are able to use just to leverage all the different studies that have been created simultaneously,” he added.
According to Reich, each release adds newly published data, about 50 papers’ worth a year, while improving the quality and consistency of existing material. The current version includes about 1.24 million positions in the genome, and Reich said the lab plans to expand the next version to roughly 2 million, though that would still only represent a subsection of research studies in the field.
“The AADR is an opinionated data set,” Reich said. “It’s not a comprehensive version of all of the data that's ever been produced in the literature.”
Researchers worldwide rely on the database for large-scale genetic analysis. Harald Ringbauer, a population geneticist at the Max Planck Institute for Evolutionary Anthropology, said the AADR was essential to his work.
“It’s really an amazing database that David produced that he basically tries to keep up to date with almost all of the ancient DNA records for many labs around the world,” Ringbauer said. “It’s very useful, because we have it all in one place.”
Reich said he hopes other researchers will continue to make creative use of the wide variety of data types in the dataset.
“I would like people to use it in any way that their creativity allows to study population history, to study natural selection, to understand familial structure, to correlate with economic data, to correlate it with social data,” he said. “Hopefully the uniformity in the data set will be useful.”
In the face of limited funding, Reich said the lab views the database as a shared resource for the broader scientific community and feels a responsibility to maintain and expand it.
“We recognize that these samples, which are so precious and so important for understanding our own history, can be queried in many other ways from other people with other expertise,” he said.
“Not only do we have a responsibility to continue sharing this, but we have responsibilities to try and share it in different ways that expand beyond the form it’s currently in,” he added.
—Staff writer Nari Shin can be reached at nari.shin@thecrimson.com.