Scientists have developed a powerful, inclusive new tool for genomic research that boosts efforts to develop more precise treatments for many diseases by leveraging a better representation of the genetic diversity of people around the world.
The new tool will allow researchers to compare natural variations in our genes against genome sequences collected from a diverse group of people. Until now, scientists have compared these variations with a "reference genome" primarily sequenced from a few volunteers (~70% from one person) living near laboratories involved in the Human Genome Project almost 20 years ago. That reference represented the genomes of only a small number of people in a small number of countries.
The new software tool, called "Giraffe," enables the use of a reference point that is far more diverse and inclusive. Instead of relying on a single reference genome, Giraffe uses a "pangenome" that incorporates information about genome sequences from people around the world. This will give scientists a much more global perspective and help them understand why diseases often strike certain groups disproportionately.
"A major advantage of Giraffe is that it enables fast and sensitive comparison of short-read human genome sequences to a pangenome, which is essential for the widespread use of reference graphs that reduce bias in the human genome reference. Since the current effort in genomics is to move from a European-Caucasian base to a global representation, Giraffe can better define genetic variation in non-white populations and, as a result, have a major impact on precision medicine and application to understanding the genetic risk of disease."
Stephen S. Rich, PhD, Researcher, University of Virginia School of Medicine's Center for Public Health Genomics
A Giraffe's-eye view
Rich and UVA's Aakrosh Ratan, PhD, were part of a team of scientists who developed the new tool through the Trans-Omics for Precision Medicine (TOPMed) program backed by the National Institutes of Health's National Heart, Lung and Blood Institute.
Giraffe will make it easier for scientists to understand genetic variation in different populations. Instead of a single reference genome, the default point of comparison becomes the genomes of more than 5,000 people from many different backgrounds. That will help scientists better detect important patterns in a global population approaching 8 billion. It will also reduce unintentional biases in genomic data widely used by doctors and scientists.
Giraffe will prove especially helpful when scientists are examining larger, more complex stretches of our genetic code. It will make it much easier for scientists to compare these large "structural variants," as the swathes are known, and to understand what they do and what role they play in disease. That, ultimately, will help guide the development of new treatments.
"Giraffe has made a great impact on the discovery of structural variants, large and complicated regions in the human genome that could not be resolved by standard, short-read sequencing," said Ratan, who, like Rich, is part of the Center for Public Health Genomics and UVA's Department of Public Health Sciences. "This is critical as structural variation has been shown to be important for the risk of autism and other neuropsychiatric disorders, as well as many cancers. Giraffe is especially useful for detecting structural variation across diverse ethnic groups."
There are many other benefits as well, Rich added. "Giraffe has been shown to reduce bias, increase the speed of analysis, and improve discovery of large blocks of variation in the human genome across diverse ancestries. This single software tool increases inclusiveness and, hopefully, reduces health disparities in genomic studies by enabling the use of a more global pangenome reference."
The researchers have described Giraffe in the scientific journal Science. The research team consisted of Jouni Sirén, Jean Monlong, Xian Chang, Adam M. Novak, Jordan M. Eizenga, Charles Markello, Jonas A. Sibbesen, Glenn Hickey, Pi-Chuan Chang, Andrew Carroll, Namrata Gupta, Stacey Gabriel, Thomas W. Blackwell, Ratan, Kent D. Taylor, Rich, Jerome I. Rotter, David Haussler, Erik Garrison and Benedict Paten.
Sirén, J., et al. (2022) Pangenomics enables genotyping of known structural variants in 5202 diverse genomes. Science. doi.org/10.1126/science.abg8871.