Modern genetic research often works with what are known as reference genomes. Such a genome comprises data from DNA sequences that scientists have assembled as a representative example of the genetic makeup of a species.
To create a reference genome, researchers generally use DNA sequences from one or a few individuals, which can poorly represent the full genomic diversity of individuals or sub-populations. As a result, a reference does not always correspond exactly to the set of genes of a specific individual.
Until a few years ago, it was very laborious, expensive and time-consuming to generate such reference genomes. For this reason, researchers concentrated on human genomes and the most important biological model organisms, such as the roundworm C. elegans.
However, as researchers now have access to fast sequencing machines, sophisticated algorithms that assemble DNA sequence readouts into complete chromosomes, and much greater computing power, creating reference genomes for other species has become increasingly practical. If researchers are to better understand evolution and other fundamental questions of biology, they need high-quality reference genomes for as many species as possible.
This includes livestock. For domestic cattle (Bos taurus), only a single reference genome was available until recently: that of a Hereford cow called Dominette. Researchers had previously compared other DNA sequences of cattle against this reference to detect genetic variations and define corresponding genotypes. However, as it did not contain any genetic variants by which individuals differ, the previous reference did not reflect the diversity of the species.
A research team led by Hubert Pausch, Assistant Professor of Animal Genomics at ETH Zurich, has now filled this gap. Using the genomes of three further breeds of domestic cattle, including the Brown Swiss (Original Schweizer Braunvieh), the genomes of two closely related (sub-)species, the zebu and the yak, and the existing reference genome for domestic cattle, the researchers have created a "pangenome". The study detailing these findings has just been published in the scientific journal PNAS.
"This cattle pangenome integrates sequences contained in the six individual reference genomes. This means we can reveal very precisely which sequences are missing, for example, in the Hereford-based reference genome, but are present in, say, our Brown Swiss genome or the genomes of other cattle breeds and species."
Hubert Pausch, Assistant Professor of Animal Genomics, ETH Zurich
New genes and functionalities discovered
In this way, the ETH researchers discovered numerous DNA sequences and even whole genes that were missing in the previous reference genome of the Hereford cow. In a further step, the researchers investigated the transcripts of these genes (messenger RNA molecules), which allowed them to classify some of the newly discovered sequences as functionally and biologically relevant. Many of the genes they discovered are connected with immune functions: in animals that had contact with pathogenic bacteria, these genes were more or less active than in animals that had no contact with the pathogens.
This project was made possible by a new sequencing technology that has been available at the Functional Genomics Center Zurich for a year now. With this new technology, the researchers are able to precisely read out long DNA sections, reducing the complexity of the computing process needed to correctly assemble the analysed sections.
"The new technology simplifies the genome assembly process. Now we can create reference genomes quickly and precisely from scratch," Pausch says. In addition, such analyses also cost less, meaning that researchers can now generate genomes in reference quality from many individuals of a species.
The ETH researchers are collaborating closely with the Bovine Pangenome Consortium, which aims to create a reference genome of at least one animal from every cattle breed worldwide. It also plans to analyse the genetic makeup of wild relatives of domestic cattle in this way.
More targeted breeding possible
The consortium and ETH professor Pausch hope that the reference genome collection will help them make useful discoveries such as genetic variants that are no longer present in domesticated animals, but that their wild relatives still possess. This would provide clues as to which genetic characteristics were lost as a result of domestication.
"Things get really exciting when we compare our indigenous cattle with the zebu or with breeds that are adapted to other climate conditions," Pausch explains. This lets researchers find out which genetic variants make animals in tropical environments more heat tolerant. The next step could be to deliberately use crossbreeding to introduce these variants into other cattle breeds or precisely introduce them through genome editing. However, that is still a long way off. For the present, researchers can benefit from the greater speed and precision that the new cattle pangenome brings to the process of detecting the genes and DNA variants that differ between cattle breeds.
Crysnanto, D., et al. (2021) Novel functional sequences uncovered through a bovine multiassembly graph. Proceedings of the National Academy of Sciences. doi.org/10.1073/pnas.2101056118.