Genome-Wide Association Methodology Accurately Flags Covid-19 Gamma Variant, HSPH Research Shows


A longstanding methodology used to associate human genetic variations with disease risk can help flag more deadly or contagious variants of SARS-CoV-2, including the novel gamma variant, according to new research from the Harvard School of Public Health.

Over the past year, Christoph Lange, a professor at Harvard Medical School, and his research group analyzed more than 7,000 SARS-CoV-2 genomes from an online database to identify the most highly pathogenic strains. The researchers detected a mutation in a variant later identified in Brazil as the gamma variant, P.1. The group found that P.1 was linked to higher mortality rates, transmissibility, and pathogenicity.

Nan M. Laird, a biostatistics professor at HSPH who contributed to the study, said the group’s research was a novel use of an established method to predict disease outcomes.

“Historically, the genome-wide association analyses have looked at whether or not an individual's genetic makeup can predict disease in that individual,” Laird said. “What we’re doing here is quite different because we’re asking whether or not mutations in the virus can affect the course of the disease in the individual.”


Georg Hahn, a biostatistics instructor at HSPH and the lead author of the study, said this methodology could be used as a “pre-warning system” to flag and monitor any viral mutants.

Chloe Wu, an MIT graduate student who contributed to the study, noted that this approach and similar genetic methodologies have significant potential to identify and prevent the spread of more lethal Covid-19 variants as the pandemic rages on.

“Continuing to apply similar approaches as this pandemic continues to progress would be interesting to see,” Wu said. “The virus is continuing to mutate, and so if you can get an early sense of which mutations might end up being problematic, then we can potentially start to put some of our resources and energy into trying to contain that early on.”

However, the researchers said that current epidemiological databases pose logistical problems for future research with this methodology.

Their research was based on publicly available data in the GISAID database, an open-source platform where users can upload genetic sequences and any relevant clinical or epidemiological data. The database holds over 3 million submissions, but because submissions are not standardized, researchers may encounter duplicate entries that exaggerate findings.

Coming up with standardized pipelines for submissions to this database would make the data much easier to use, according to Hahn.

Lange said the research group aims to continue applying this methodology as the pandemic progresses, in hopes of identifying novel strains before they become prevalent.

“We will continue to apply it to see if other new strains surface and if we can detect anything new,” Lange said.

—Staff writer Ariel H. Kim can be reached at

—Staff writer Anjeli R. Macaranas can be reached at