According to researchers from Harvard T.H. Chan School of Public Health and MIT, the use of genome-wide association studies (GWAS) methodology to assess whole-genome sequencing data of SARS-CoV-2 mutations and COVID-19 mortality data can help identify highly pathogenic variants of the virus that should be flagged for containment.
Before the P.1 variant was publicly reported, the researchers used this biostatistical methodology to flag a mutation, later found in the P.1 (Gamma) variant, as being associated with increased mortality and, presumably, with higher transmissibility, higher infection rates, and higher pathogenicity.
The approach used by the researchers was published in the journal Genetic Epidemiology on June 23rd, 2021.
“Based on our experience, GWAS methodology might provide suitable tools that could be used to analyze potential links between mutations at specific locations in viral genomes and disease outcome. This could enable better real-time detection of novel, deleterious variants/new viral strains in pandemics.”
Christoph Lange, Study Senior Author and Professor, Biostatistics, Harvard Chan School
The first cases of the P.1 variant were reported in Brazil in January 2021, and within a few weeks the variant had driven a spike in cases in the city of Manaus.
In May 2020, the city had already been severely impacted by the pandemic, and researchers believed that the city’s residents had acquired population immunity because so many people in the area had generated antibodies to the virus during that initial wave.
Instead, P.1, which carries many mutations in the spike protein that the virus uses to bind to and invade a host cell, triggered a second wave of infections. It appeared to be more transmissible and more likely to cause death than the variants previously detected in the area.
In September 2020, months before the first P.1 patient was identified, the Harvard Chan School and MIT team adapted GWAS methodology, which is routinely used to associate genetic variations with specific diseases, to tease apart the relative pathogenicity of several SARS-CoV-2 mutations.
Using viral genomes from 7,548 COVID-19 patients, the researchers tested each mutation in the single-stranded RNA of SARS-CoV-2 for an association with mortality.
The researchers used data from the GISAID (Global Initiative on Sharing Avian Influenza Data) database, which contains genetic sequences as well as clinical and epidemiological information for SARS-CoV-2 and influenza viruses.
At locus 25,088 bp in the virus’s genome, the researchers discovered a mutation that affects the spike protein and is associated with a large increase in mortality among COVID-19 patients. The team flagged the variant carrying this mutation, which was eventually identified as part of P.1.
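The per-mutation scan described above can be illustrated with a minimal sketch. It is not the authors' pipeline: this summary does not specify their statistical model, so a two-sided Fisher exact test with a Bonferroni correction is swapped in here for simplicity. The function names (`fisher_exact_two_sided`, `scan_loci`), the four-base "genome," and the patient outcomes are all hypothetical and synthetic.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def scan_loci(genomes, reference, died, alpha=0.05):
    """Test each locus for an association between mutation and death.

    genomes:   aligned sequences, each the same length as `reference`
    reference: reference sequence (assumption: alignment is already done)
    died:      one boolean outcome per genome
    Returns a list of (1-based locus, p-value, significant) tuples.
    """
    tested = []
    for pos, ref_base in enumerate(reference):
        mutated = [g[pos] != ref_base for g in genomes]
        a = sum(m and y for m, y in zip(mutated, died))          # mutated, died
        b = sum(m and not y for m, y in zip(mutated, died))      # mutated, survived
        c = sum(not m and y for m, y in zip(mutated, died))      # reference, died
        d = sum(not m and not y for m, y in zip(mutated, died))  # reference, survived
        if a + b == 0 or c + d == 0:
            continue  # no variation at this locus, nothing to test
        tested.append((pos + 1, fisher_exact_two_sided(a, b, c, d)))
    # Bonferroni correction over the loci that were actually tested
    threshold = alpha / max(len(tested), 1)
    return [(pos, p, p < threshold) for pos, p in tested]


# Synthetic toy data: locus 3 tracks mortality, locus 1 does not.
reference = "ACGT"
genomes = (["GCTT"] * 5 + ["ACTT"] * 4   # locus-3 mutation, died
           + ["ACTT"]                    # locus-3 mutation, survived
           + ["GCGT"]                    # reference at locus 3, died
           + ["GCGT"] * 4 + ["ACGT"] * 5)  # reference at locus 3, survived
died = [True] * 9 + [False] + [True] + [False] * 9

for locus, p_value, significant in scan_loci(genomes, reference, died):
    print(f"locus {locus}: p = {p_value:.4g}, flagged = {significant}")
```

A real analysis would also need to account for confounders such as patient age, sampling location, and the shared ancestry of viral lineages, which a simple per-locus test like this one ignores.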
According to the team, their biostatistical methodology should be applicable well beyond the P.1 variant and SARS-CoV-2.
“We expect that this approach would work in similar scenarios involving other diseases, provided the quality of the data collected in public databases is sufficiently high.”
Georg Hahn, Research Associate and Instructor, Biostatistics, Harvard Chan School, and co-first author of the paper
Hahn, G., et al. (2021) Genome-wide association analysis of COVID-19 mortality risk in SARS-CoV-2 genomes identifies mutation in the SARS-CoV-2 spike protein that colocalizes with P.1 of the Brazilian strain. Genetic Epidemiology. doi.org/10.1002/gepi.22421.