Since the Human Genome Project mapped the whole human genome, advances in technology over the last 20 years have supported the development of updated reference genomes used for sequencing.
Image Credit: Baylor College of Medicine.
However, while the GRCh38 (hg38) human reference genome was released over seven years ago, the older GRCh37 (hg19) reference is still extensively used in many clinical and research laboratories.
Now, in a new study published in the American Journal of Human Genetics, researchers from Baylor College of Medicine’s Human Genome Sequencing Center have identified genetic variant inconsistencies between the two references, offering guidance for laboratories moving to the improved human reference genome.
“There’s a big push to update genomic sequencing resources to use the hg38 reference because the belief is that hg38 is a significant improvement over hg19. We wanted to identify the differences in sequencing readouts between the two references for labs that are still using hg19.”
Moez Dawood, Study Co-First Author and Student, Medical Scientist Training Program, Baylor College of Medicine
The Baylor team examined exome sequencing samples obtained from over 1,500 participants in the Baylor-Hopkins Center for Mendelian Genomics. They detected 206 genes with discordant variants between hg38 and hg19. These included eight genes associated with Mendelian disorders and 53 linked to common disease phenotypes.
The researchers observed that 73% of the discordant variants were concentrated in genome regions with known assembly issues. They termed these regions DISCordant Reference Patches (DISCREPs).
“This study isn’t a theoretical comparison of the two references; we looked at exome data from study participants and examined the impact of using the updated reference on Mendelian genes and pathogenic variants. We wanted to provide the list of 206 genes enriched with discordant variants and bring this issue to the attention of the labs working on these genes.”
Dr. Aniko Sabo, Study Senior Author and Assistant Professor, Human Genome Sequencing Center
“For variant interpretation in the 206 genes enriched for discordant variants, reference assembly differences should be accounted for in the analysis, especially when lifting over variant coordinates from one reference to the other,” stated Dr. He Li, the co-first author of the study and a postdoctoral associate at Baylor College of Medicine during the research.
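One way a lab might act on this advice is to set aside variants in the affected genes before running a liftover, so they can be checked against both references by hand. The snippet below is only a minimal sketch of that idea: the `Variant` record, the `split_for_liftover` helper, and the gene symbols are hypothetical placeholders, and the real 206-gene list would need to be loaded from the study itself.

```python
from dataclasses import dataclass

# Hypothetical placeholder symbols, NOT genes from the study.
# Replace with the published list of 206 genes enriched for discordant variants.
DISCORDANT_GENES = {"GENE_A", "GENE_B"}


@dataclass(frozen=True)
class Variant:
    """A simplified variant record (illustrative fields only)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str


def split_for_liftover(variants, discordant_genes):
    """Split variants into those flagged for manual review and the rest.

    A variant is flagged when its annotated gene is in the discordant set.
    """
    review, routine = [], []
    for v in variants:
        if v.gene in discordant_genes:
            review.append(v)
        else:
            routine.append(v)
    return review, routine


# Made-up example variants for illustration.
variants = [
    Variant("chr1", 12345, "A", "G", "GENE_A"),
    Variant("chr2", 67890, "C", "T", "GENE_C"),
]
review, routine = split_for_liftover(variants, DISCORDANT_GENES)
print("Flag for manual review:", [v.gene for v in review])
print("Routine liftover:", [v.gene for v in routine])
```

Filtering by gene symbol is the simplest possible approach; a lab could equally filter by genomic interval if it has coordinates for the affected regions.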
Transitioning from the hg19 reference to the hg38 reference requires substantial time and resources. With this large-scale assessment of sequencing data, the researchers aim to reduce the burden on laboratories that are contemplating the transition. The study measures both the advantages and disadvantages of the new reference and confirms its utility in laboratory settings.
“It’s one thing to make a better reference. It’s quite another to integrate it into useful practice. Some labs have been hesitant to use the new reference, but this study provides reassurance and guidance for those who are considering moving over.”
Dr. Richard Gibbs, Study Senior Author and Director, Human Genome Sequencing Center
Gibbs is also the Wofford Cain Chair and Professor of Molecular and Human Genetics at Baylor.
Li, H., et al. (2021) Exome variant discrepancies due to reference genome differences. American Journal of Human Genetics. doi.org/10.1016/j.ajhg.2021.05.011.