The human genome is the entire set of nucleic acid sequences for humans. It is encoded as DNA in the 23 chromosome pairs in the nucleus of the cell and in a small circular DNA molecule found within mitochondria. These two components are usually treated separately as the nuclear genome and the mitochondrial genome.
Trials in deciphering the human genome
In February 2001, the Human Genome Project published the first draft sequence and initial analysis of the human genome. Just one day after the Human Genome Project's announcement, Celera Corporation published its own genome assembly.
The Human Genome Project's draft was the more complete of the two assemblies, representing approximately 90% of the human genome, although only about 25% was in curated form. The assembly generated by Celera Corporation was less accurate because of the whole-genome shotgun approach it used.
Since then, the international collaboration has worked to turn this first draft into a complete genome sequence with high accuracy and maximum coverage.
In 2004, the Human Genome Project's sequencing was declared complete. Only 341 gaps remained in the sequence, which covered approximately 99% of the euchromatic genome with an error rate of approximately 1 event per 100,000 bases.
Most of the remaining euchromatic gaps are associated with segmental duplications. Work on the human reference genome has not stopped: the Genome Reference Consortium continues to improve it to this day.
Among all vertebrates, the human genome was the first to be sequenced with such high precision and to such a near-complete level.
Molecular organization of the human genome
Human DNA is packaged into chromosomes, which are threadlike structures of nucleic acids and protein found in the cell nucleus that carry genetic information in the form of genes. Diploid organisms, such as humans, carry two sets of genetic information: one set is inherited from the father and one from the mother.
A human somatic cell contains 46 chromosomes, and a single (haploid) copy of the genome spans approximately 3 billion base pairs. The 46 chromosomes are arranged in 23 pairs: 22 pairs of autosomal chromosomes plus a 23rd pair of sex chromosomes, XY in males and XX in females.
Therefore, each human somatic cell has 22 pairs of autosomal chromosomes and 1 pair of sex chromosomes. The 22 autosomal pairs are numbered roughly in descending order of size, with the exception of chromosomes 21 and 22: chromosome 21, not 22, is the smallest human chromosome. These are all large linear DNA molecules found within the nucleus of cells.
In the smallest human chromosomes, the DNA molecule contains approximately 50 million nucleotide pairs, whereas the largest chromosomes have nearly 250 million nucleotide pairs. The human genome also includes a small DNA molecule found within individual mitochondria.
The human genome is commonly classified into coding and non-coding DNA sequences. Coding sequences, the exons of protein-coding genes, represent only a small percentage of the genome (<2%). Non-coding DNA, which includes introns and the sequences between genes, does not encode proteins and represents more than 98% of the human genome.
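As a rough back-of-the-envelope illustration, the proportions above can be turned into base-pair counts. The sketch below assumes a genome length of 3 billion base pairs and treats 2% as an upper bound for the coding fraction; the function name `split_genome` is hypothetical and used only for illustration.

```python
# Approximate genome length in base pairs (an assumption for this sketch).
GENOME_BP = 3_000_000_000


def split_genome(total_bp, coding_fraction):
    """Return (coding_bp, noncoding_bp) for a given coding fraction."""
    coding_bp = round(total_bp * coding_fraction)
    return coding_bp, total_bp - coding_bp


# 2% is used here as an upper bound on the coding share of the genome.
coding, noncoding = split_genome(GENOME_BP, 0.02)
print(f"coding:     at most ~{coding:,} bp")
print(f"non-coding: at least ~{noncoding:,} bp")
```

Under these assumptions, fewer than about 60 million base pairs encode protein, while roughly 2.94 billion do not.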
Some parts of noncoding DNA regions contain genes for RNA molecules with significant biological functions such as ribosomal RNA (rRNA) and transfer RNA (tRNA).
ENCODE (Encyclopedia of DNA Elements), a contemporary genome research project, aims to explore the whole human genome, using a variety of novel experimental tools to uncover the functional role and evolutionary origin of non-coding DNA regions.
Various regulatory sequences control gene expression in the human genome. Although some studies report that these sequences make up as much as 8% of the genome, extrapolations from the ENCODE project suggest that 20-40% of the human genome could consist of gene regulatory sequences. Enhancers, one type of non-coding regulatory DNA, help determine when and where genes are expressed.
Mobile elements in the human genome can be divided into DNA transposons and retrotransposons, according to how they move. DNA transposons move within the genome by a cut-and-paste mechanism, whereas retrotransposons mobilize through a copy-and-paste mechanism that uses an RNA intermediate, a process called retrotransposition.
Human mobile elements represent nearly 45% of the human genome. Alu elements are the most abundant transposable genetic elements, with approximately 50,000 active copies.
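The difference between the cut-and-paste and copy-and-paste mechanisms described above can be shown with a toy model. The sketch below represents a genome as a plain Python list of labels and is not a biological simulation: the function names, the labels, and the index-based positions are all hypothetical, and the RNA intermediate of retrotransposition is reduced to a simple copy.

```python
def cut_and_paste(genome, src, dst):
    """DNA-transposon style: excise the element at src and reinsert it.

    dst is an index into the list as it stands after the excision.
    """
    result = genome.copy()
    element = result.pop(src)      # the element leaves its original site
    result.insert(dst, element)    # and lands at the new site
    return result


def copy_and_paste(genome, src, dst):
    """Retrotransposon style: the original stays and a copy is inserted.

    In a cell the copy is made through an RNA intermediate; here it is
    modeled as a direct duplication of the list entry.
    """
    result = genome.copy()
    result.insert(dst, result[src])
    return result


genome = ["geneA", "TE", "geneB", "geneC"]

moved = cut_and_paste(genome, src=1, dst=3)
copied = copy_and_paste(genome, src=1, dst=3)

print(moved)   # the element count is unchanged
print(copied)  # the element count has grown by one
```

Cut-and-paste leaves the number of elements constant, while every copy-and-paste event adds one. This is one way to see why retrotransposons can accumulate to such a large share of a genome.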
Nearly half of the human genome consists of repetitive DNA sequences. About 8% of the human genome consists of tandem repeats, which are short DNA sequences repeated head-to-tail a variable number of times, such as “CAGCAGCAG…”.
The repeated unit may be from two nucleotides to tens of nucleotides long. Because the number of repeats varies widely between individuals, tandem repeats are extremely valuable tools in forensic DNA analysis and genealogical DNA testing.
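To make the idea of a tandem repeat concrete, the sketch below scans a DNA string for a short unit repeated head-to-tail, using a regular expression with a backreference. It is a minimal illustration, not a forensic or production tool: the function name `find_tandem_repeats`, its default thresholds, and the example sequence are all assumptions chosen for this example, and only exact (uninterrupted) repeats are detected.

```python
import re


def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each exact tandem repeat in seq.

    A repeat is a unit of min_unit..max_unit bases that occurs at least
    min_copies times in a row. The lazy quantifier makes the regex try
    the shortest possible unit first.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    repeats = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        repeats.append((match.start(), unit, copies))
    return repeats


# Hypothetical sequence containing four head-to-tail copies of "CAG".
example = "TTGACAGCAGCAGCAGTTA"
for start, unit, copies in find_tandem_repeats(example):
    print(f"unit {unit} repeated {copies} times starting at index {start}")
```

Counting the copies at a given location is the quantity that differs between individuals, which is what makes such loci useful for identification.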