Two decades after the Human Genome Project delivered the first draft human genome sequence, researchers have published the first complete, gapless sequence of a human genome. According to the researchers, a complete, gap-free sequence of the roughly 3 billion bases (or “letters”) in human DNA is essential for understanding the full spectrum of human genomic variation and the genetic contributions to certain diseases.
The research was carried out by the Telomere to Telomere (T2T) consortium, led by researchers from the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health; the University of California, Santa Cruz; and the University of Washington, Seattle. NHGRI was the primary funder of the study.
Analyses of the complete genome sequence will considerably improve the understanding of chromosomes, including more precise maps for five chromosome arms, which opens up new research avenues. This work helps answer basic biology questions about how chromosomes properly segregate and divide.
The T2T consortium used the now-complete genome sequence as a reference to find nearly 2 million additional variants in the human genome. These findings improve the understanding of the genomic variants within 622 medically important genes.
“Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint. This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which in turn will empower genetic studies of human disease.”
Eric Green, MD, PhD, Director, National Human Genome Research Institute
The now-complete human genome sequence will be especially useful for research aimed at building a thorough picture of human genomic diversity, that is, how people’s DNA differs. Such knowledge is essential for understanding the genetic contributions to particular diseases and, in the future, for using genome sequencing as a routine part of clinical care.
Many research groups have already begun using a pre-release version of the complete human genome sequence in their studies.
The complete sequence builds on the work of the Human Genome Project, which mapped around 92% of the genome, and on subsequent studies. To decipher the remaining complex sequence, thousands of scientists developed improved laboratory tools, computational methods, and strategic approaches. Six papers describing the complete sequence are published in Science, along with companion papers in numerous other journals.
The remaining 8% is comparable in size to a full chromosome and contains several genes as well as repetitive DNA. To generate the complete sequence, the scientists used a rare cell line that contains two identical copies of each chromosome, unlike ordinary human cells, which carry two slightly different copies of each chromosome.
According to the researchers, the majority of the newly added DNA sequences were found near the repetitive telomeres (long, trailing ends of each chromosome) and centromeres (dense middle sections of each chromosome).
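The comparison between the remaining 8% and a full chromosome can be checked with back-of-the-envelope arithmetic. The sketch below uses only the rounded figures quoted in this article (about 3 billion bases, about 92% previously mapped). The two chromosome lengths are approximate values added here for scale and do not come from the article, and the function name `remaining_bases` is purely illustrative.

```python
# Rough check of the "8% is comparable to a full chromosome" statement.
GENOME_BASES = 3_000_000_000   # "nearly 3 billion bases" (rounded, from the article)
MAPPED_PERCENT = 92            # share mapped by earlier work (from the article)

# Approximate chromosome lengths in bases, included only for scale.
# These are outside figures, not taken from the article.
APPROX_CHROMOSOME_BASES = {
    "chromosome 1": 248_000_000,
    "chromosome X": 156_000_000,
}


def remaining_bases(total_bases: int, mapped_percent: int) -> int:
    """Return the number of bases left outside the mapped percentage."""
    return total_bases * (100 - mapped_percent) // 100


missing = remaining_bases(GENOME_BASES, MAPPED_PERCENT)
print(f"Remaining sequence: about {missing // 1_000_000} million bases")
for name, length in APPROX_CHROMOSOME_BASES.items():
    print(f"  relative to {name}: {missing / length:.2f}x")
```

With these rounded inputs the remainder comes to roughly 240 million bases, which falls within the size range of the larger human chromosomes.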
“Ever since we had the first draft human genome sequence, determining the exact sequence of complex genomic regions has been challenging. I am thrilled that we got the job done. The complete blueprint is going to revolutionize the way we think about human genomic variation, disease, and evolution.”
Evan Eichler, PhD, Researcher, University of Washington School of Medicine
Eichler was also the co-chair of the T2T consortium.
The cost of sequencing a human genome using “short-read” technology, which yields several hundred bases of DNA sequence at a time, has dropped dramatically since the Human Genome Project ended. Short-read approaches alone, however, still leave certain gaps in assembled genome sequences.
The steep drop in DNA sequencing costs has gone hand in hand with growing investment in new sequencing technologies that generate longer reads without sacrificing accuracy.
Two new DNA sequencing technologies have emerged in the last decade, both of which produce substantially longer sequence reads. The Oxford Nanopore DNA sequencing method can read up to 1 million DNA letters in a single read, though with modest accuracy, while the PacBio HiFi DNA sequencing method can read roughly 20,000 letters with near-perfect accuracy.
To obtain the full human genome sequence, scientists in the T2T consortium used both DNA sequencing approaches.
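One way to see why read length matters for repeat-rich regions is a simplified rule of thumb: a single read can place a repeat unambiguously only if it covers the whole repeat plus some unique sequence on both sides. The toy sketch below applies that rule to the read lengths mentioned above. It is an illustration under stated assumptions, not the consortium's assembly method: the 300-base short-read length, the 100-base flank, the 15,000-base repeat, and the function name `can_span` are all hypothetical, and real assemblers also depend on read accuracy and on overlaps among many reads.

```python
def can_span(read_length: int, repeat_length: int, flank: int = 100) -> bool:
    """Return True if one read can cover the repeat plus `flank` unique bases on each side."""
    return read_length >= repeat_length + 2 * flank


READ_LENGTHS = {
    "short-read": 300,              # assumption standing in for "several hundred bases"
    "PacBio HiFi": 20_000,          # approximate figure from the article
    "Oxford Nanopore": 1_000_000,   # upper bound quoted in the article
}

HYPOTHETICAL_REPEAT = 15_000  # illustrative repeat length, not a measured value

for technology, length in READ_LENGTHS.items():
    verdict = "can span" if can_span(length, HYPOTHETICAL_REPEAT) else "cannot span"
    print(f"{technology}: {verdict} a {HYPOTHETICAL_REPEAT:,}-base repeat")
```

Under this simplification, the longest reads bridge large repeats while the most accurate reads resolve fine differences between near-identical copies, which suggests why combining the two technologies is useful.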
“Using long-read methods, we have made breakthroughs in our understanding of the most difficult, repeat-rich parts of the human genome. This complete human genome sequence has already provided new insight into genome biology, and I look forward to the next decade of discoveries about these newly revealed regions.”
Karen Miga, PhD, Co-Chair, T2T Consortium
Miga’s research group at the University of California, Santa Cruz, is funded by NHGRI.
According to consortium co-chair Adam Phillippy, PhD, whose research group at the National Human Genome Research Institute spearheaded the final effort, sequencing a person’s entire genome should become more affordable and more straightforward in the near future.
“In the future, when someone has their genome sequenced, we will be able to identify all of the variants in their DNA and use that information to better guide their healthcare,” Phillippy concluded. “Truly finishing the human genome sequence was like putting on a new pair of glasses. Now that we can see everything, we are one step closer to understanding what it all means.”
Aganezov, S., et al. (2022) A complete reference genome improves analysis of human genetic variation. Science. doi.org/10.1126/science.abl3533.