Recent advances in next-generation sequencing technologies have revolutionized the field of medical genetics. Genomic medicine and medical genetics are relatively new fields that use genomic technologies for diagnostics and therapeutic decision-making. These genomics-based approaches are becoming increasingly feasible in clinical settings as the cost of whole-genome sequencing continues to fall.
Long-read sequencing is an emerging approach that can capture the complexity of the genome by identifying structural variation across the whole genome. Compared with short-read sequencing methods, its advantages extend beyond read length to speed and reduced fragmentation bias.
Long-Read Sequencing vs. Short-Read Sequencing
Long-read sequencing, often referred to as third-generation sequencing, is a DNA sequencing technique that determines the nucleotide sequence of long pieces of DNA, ranging in size from tens to hundreds of kilobases (kb). This technique enables efficient sequencing of DNA stretches longer than 10 kb in a single read, whereas short-read sequencing methods yield much smaller pieces of a few hundred base pairs (often 150 bp to 300 bp).
The advantages of long-read sequencing do not end with needing fewer reads. The approach can sequence long pieces of DNA in real time and without amplification, which makes it faster than traditional short-read sequencing methods.
Short-read sequencing methods pose technical hurdles that are especially pronounced for large and complex genomes such as the human genome. A single set of human chromosomes, i.e., the haploid human genome, is approximately 2.9 billion base pairs (2,900 Mb) long, of which more than one-half consists of repetitive sequences. This highly repetitive nature makes it more difficult to assemble and annotate nucleotide sequences obtained from short-read sequencing.
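To put "fewer reads" into numbers, the rough arithmetic can be sketched as below. This is only an illustration under stated assumptions: a haploid genome of about 2.9 billion base pairs, a hypothetical target depth of 30x, and idealized reads of uniform length (150 bp versus 10 kb). The function name `reads_needed` is made up for this example.

```python
def reads_needed(genome_bp: int, read_len_bp: int, coverage: int) -> int:
    """Smallest number of reads such that reads * read_len >= coverage * genome size."""
    total_bases = genome_bp * coverage
    return -(-total_bases // read_len_bp)  # ceiling division on integers


GENOME_BP = 2_900_000_000  # assumed haploid human genome size
COVERAGE = 30              # hypothetical target sequencing depth

short_reads = reads_needed(GENOME_BP, 150, COVERAGE)     # short-read platform
long_reads = reads_needed(GENOME_BP, 10_000, COVERAGE)   # long-read platform

print(f"150 bp reads needed: {short_reads:,}")
print(f"10 kb reads needed:  {long_reads:,}")
```

Under these assumptions the short-read run needs 580 million reads and the long-read run 8.7 million, each of which must then be placed correctly during assembly or alignment. Real runs have variable read lengths and uneven coverage, so these figures are only an order-of-magnitude guide.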
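Why repeats are a problem for short reads can be shown with a toy example. The sketch below is not a real aligner: it uses a made-up 50-base reference containing a tandem repeat and simply counts exact matches. All sequences and the helper `count_placements` are hypothetical.

```python
def count_placements(reference: str, read: str) -> int:
    """Count every position where the read matches the reference exactly."""
    count = 0
    start = reference.find(read)
    while start != -1:
        count += 1
        start = reference.find(read, start + 1)  # allow overlapping matches
    return count


# Toy reference: two unique flanks surrounding a 30-base tandem repeat.
repeat = "CAG" * 10
reference = "ATTGCCGTAA" + repeat + "GGCTTAACGT"

short_read = "CAGCAGCAG"                  # lies entirely inside the repeat
long_read = "CGTAA" + repeat + "GGCTT"    # spans the repeat plus both flanks

short_hits = count_placements(reference, short_read)
long_hits = count_placements(reference, long_read)

print("short read placements:", short_hits)
print("long read placements: ", long_hits)
```

The short read fits at eight different positions inside the repeat, so its true origin cannot be determined. The long read reaches unique sequence on both sides and fits in exactly one place.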
The Importance of the Long-Read Sequencing Technology in Genomic Medicine
Whole-exome sequencing is a technique focused on targeted sequencing of all the protein-coding regions of the genome (i.e., exome analysis). This technique is considered a simple, cost-effective, and efficient approach for diagnosing many genetic disorders. However, whole-exome sequencing cannot detect every disease-causing variant, since pathogenic mutations may also fall in the genome's non-coding (regulatory) regions.
Over the past decade, high-throughput whole-genome sequencing methods have revealed that the human genome contains over 20,000 structurally variable genomic regions. These regions include several thousand insertions and deletions (collectively referred to as indels), which together span on the order of dozens of megabase pairs. Genetic polymorphisms located in such variable regions are often difficult to detect with short-read sequencing platforms.
In addition to misidentifying structurally variable genomic regions, short-read sequencing methods have trouble determining which alleles lie on the same chromosome, a task known as allele phasing. By contrast, long-read sequencing technology is one of the most promising approaches for characterizing genetic variation in large and complex genomes and across entire genomic regions. This sequencing technology also identifies single nucleotide polymorphisms (SNPs) in loci associated with an increased risk of developing genetic and multifactorial diseases.
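The phasing problem can be illustrated with a small sketch. It assumes a made-up 40-base region with two heterozygous sites (positions 5 and 30), error-free reads, and reads that tile each haplotype without overlap. The helpers `make_haplotype`, `tile`, and `phase_pairs` are hypothetical names for this illustration, not part of any sequencing toolkit.

```python
from collections import Counter


def make_haplotype(allele_a: str, allele_b: str) -> str:
    """Build a 40-base toy haplotype that differs only at sites 5 and 30."""
    bases = ["T"] * 40
    bases[5], bases[30] = allele_a, allele_b
    return "".join(bases)


def tile(haplotype: str, read_len: int) -> list[tuple[int, str]]:
    """Cut a haplotype into non-overlapping reads, returned as (start, sequence)."""
    last_start = len(haplotype) - read_len
    return [(i, haplotype[i:i + read_len]) for i in range(0, last_start + 1, read_len)]


def phase_pairs(reads, site_a: int, site_b: int) -> Counter:
    """Tally allele pairs observed on single reads that cover both sites."""
    pairs = Counter()
    for start, seq in reads:
        end = start + len(seq)
        if start <= site_a < end and start <= site_b < end:
            pairs[(seq[site_a - start], seq[site_b - start])] += 1
    return pairs


hap1 = make_haplotype("A", "G")  # one chromosome copy
hap2 = make_haplotype("C", "T")  # the other chromosome copy

short_reads = tile(hap1, 10) + tile(hap2, 10)  # 10-base reads
long_reads = tile(hap1, 40) + tile(hap2, 40)   # reads spanning the region

short_phase = phase_pairs(short_reads, 5, 30)
long_phase = phase_pairs(long_reads, 5, 30)

print("short reads:", dict(short_phase))
print("long reads: ", dict(long_phase))
```

No short read covers both sites, so the short-read tally is empty and the alleles cannot be linked directly. Each long read carries both alleles from one chromosome copy, which recovers the A-G and C-T haplotypes.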
The Future of Long-Read Sequencing in Genomic Medicine
Long-read sequencing still faces technical challenges before it can be fully applied in genomic medicine. The most important problem may be the construction of suitable genomic libraries. Nonetheless, the technical issues hampering the move of long-read sequencing from lab bench to clinic are expected to be solved in the next few years.
Personalized genomic medicine is an emerging concept that uses data from a patient's genome to treat genetic diseases in a personalized manner, associating a particular genotype with a given phenotype or health outcome. Long-read sequencing promises to make personalized genomics easier and more efficient in clinical settings. The development of more accurate and less expensive long-read sequencing will widen access to this technology while delivering a modern concept of treatment that better fits patients' needs.