The genetic code has evolved into a three-nucleotide codon recognized by a complementary three-nucleotide anticodon, which sits within a seven-nucleotide anticodon loop of transfer RNA (tRNA) that facilitates the interaction. The evolution of the genetic code follows specific base preferences at the 2nd and 3rd anticodon positions: C > G > U >> A. Preferences are more extreme at the 3rd anticodon position because the 2nd anticodon position is central and easier to read.
Cytosine is highly favored because it is a pyrimidine (and therefore smaller) and can form three hydrogen bonds with guanine. Guanine often outranks uracil because it forms three rather than two hydrogen bonds. Adenine is strongly disfavored for the converse reasons: it is a larger purine and forms only two hydrogen bonds. This bias shaped the evolution of the genetic code. The 1st anticodon position (which pairs with the third, or “wobble”, codon position) shows a similar pattern: G > (U/C) >>>> A. This pattern is governed by a strong propensity to recognize pyrimidine-purine pairs.
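The codon-anticodon relationship and the preference order above can be sketched in a few lines of Python. This is a minimal illustration, not a model of real decoding: it assumes strict Watson-Crick pairing (wobble pairing at the 1st anticodon position is ignored), and the numeric scores are arbitrary ordinal labels chosen here to stand in for C > G > U >> A. The names `anticodon_for` and `position_scores` are hypothetical helpers.

```python
# Watson-Crick complements for RNA bases (wobble pairing is deliberately not modeled).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Ordinal stand-ins for the C > G > U >> A preference; the values themselves are arbitrary.
PREFERENCE_SCORE = {"C": 3, "G": 2, "U": 1, "A": 0}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


def position_scores(anticodon: str) -> tuple[int, int]:
    """Return the illustrative scores for the 2nd and 3rd anticodon positions."""
    return PREFERENCE_SCORE[anticodon[1]], PREFERENCE_SCORE[anticodon[2]]


if __name__ == "__main__":
    for codon in ("GGC", "GGA", "GGG", "UUU"):
        anticodon = anticodon_for(codon)
        print(codon, anticodon, position_scores(anticodon))
```

Under these assumptions, the glycine codons GGC, GGA, and GGG map to the anticodons GCC, UCC, and CCC, each with the most favored base (C) at both the 2nd and 3rd anticodon positions, whereas UUU maps to AAA, the least favored combination.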
Taking it back to the start
The evolution of the genetic code starts with the Last Universal Common Ancestor (LUCA). The genetic code centers on the evolution of the tRNA anticodon that recognizes the mRNA codon. Archaeal tRNAs are similar to the primordial tRNAs associated with early life. tRNA has a defined cloverleaf-like structure. This structure evolved from the ligation of three 31-nucleotide mini-helices of two different types (one D-loop mini-helix and two anticodon-loop or T-loop mini-helices). The D-loop possesses a single C nucleotide to recognize a G nucleotide of the mRNA; the anticodon loop carries an AAA repeat that recognizes a UUU mRNA codon.
It is posited that the early genetic code coded solely for glycine: all codons and anticodons specified this amino acid. Glycine is the simplest amino acid. It also retains the most favorable anticodons (GCC, UCC, CCC). The primordial tRNA is similar to glycine tRNA.
Polyglycine is posited as an early polypeptide cross-linker that stabilized proto-cells. It is a component of bacterial peptidoglycan cell walls. Hemolithin, recovered from meteorite samples, is a polyglycine peptide from outer space, suggesting that a polyglycine world may have existed even beyond the Earth environment.
The anticodon evolution
tRNA anticodon evolution shaped the expansion of the genetic code. This was facilitated by the evolution of other critical elements, such as aminoacyl-tRNA synthetases (aaRS), which charge tRNAs with amino acids for delivery to the ribosome. The code was initially sectored by ribozyme aaRS enzymes. Once the code had evolved, protein aaRS enzymes replaced them. More ancient tRNAomes are very diverse, whereas those of more advanced species are more defined, reflecting how the genetic code became fixed and protein-based structures replaced RNA-based ones.
Ribozyme aaRS enzymes are capable of rapid mutation and replication, enabling modification of the genetic code over time. The evolution of aaRS enzymes facilitated the diversification of the genetic code. There are two families of aaRS enzymes, with several subclasses (A-E). The aaRS-II enzymes evolved first; the aaRS-I family followed.
The expansion of the genetic code
The first significant expansion of the genetic code was from one to four amino acids: glycine, alanine, aspartic acid, and valine. The anticodons for these amino acids all carried cytosine at the 3rd anticodon position, the most favorable outcome. The 2nd anticodon position is the most critical: each of these amino acids differed at that position. As a result, the genetic code was sectored according to the middle anticodon base.
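The four-amino-acid stage can be sketched as a lookup keyed on the middle anticodon base. This is an illustrative reconstruction under stated assumptions, not an established table: the assignments are read off the G-initial codons of the modern standard code (GUN, GCN, GAN, GGN), strict Watson-Crick pairing is assumed, and the whole GAN sector is assigned to aspartic acid at this stage even though the modern code splits it with glutamic acid. The names `FOUR_AA_CODE` and `decode_four_aa` are hypothetical.

```python
from typing import Optional

# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed four-sector code: the middle anticodon base alone selects the amino acid.
FOUR_AA_CODE = {"A": "Val", "G": "Ala", "U": "Asp", "C": "Gly"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (5'->3') pairing with an mRNA codon (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


def decode_four_aa(codon: str) -> Optional[str]:
    """Decode a codon under the assumed four-amino-acid code.

    Only anticodons with C at the 3rd position are assigned; anything
    else returns None in this sketch.
    """
    anticodon = anticodon_for(codon)
    if anticodon[2] != "C":
        return None
    return FOUR_AA_CODE[anticodon[1]]


if __name__ == "__main__":
    for codon in ("GUG", "GCA", "GAU", "GGU", "UUU"):
        print(codon, anticodon_for(codon), decode_four_aa(codon))
```

In this sketch the 1st anticodon position plays no role in the lookup, which mirrors the idea that only the middle position had to be discriminated at this stage.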
The expansion to an eight-amino-acid genetic code required recognition of two anticodon positions. This allowed for the introduction of leucine, proline, glutamic acid, and arginine. Interestingly, the 1st, 3rd, and 4th columns (A, U, and C, respectively) form a different pattern from the 2nd column due to a different wobble position. Leucine, serine, and arginine became dominant, each occupying six codons. Serine also jumped from Column 2 to Column 4, requiring only a single nucleotide change (GGU→GCU), allowing for the more favorable 3rd codon position. This was possible because of its flexible aaRS enzyme, which has an expandable variable loop.
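The six-codon amino acids and serine's split across two columns can be checked against the modern standard genetic code. The sketch below builds the standard table from the conventional compact string (bases in UCAG order) and groups codons by amino acid. The variable names are illustrative, and the link between a codon's middle base and the column numbering used here is an assumption based on Watson-Crick pairing with the 2nd anticodon position.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated with each position in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (one-letter code; '*' marks STOP) they encode.
codons_by_aa = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_by_aa[aa].append(codon)

# Amino acids that occupy six codons in the standard code.
six_codon_aas = sorted(aa for aa, codons in codons_by_aa.items() if len(codons) == 6)

# Middle codon bases used by serine; each pairs with a different 2nd anticodon base.
serine_middle_bases = {codon[1] for codon in codons_by_aa["S"]}

if __name__ == "__main__":
    print("Six-codon amino acids:", six_codon_aas)
    print("Serine codons:", sorted(codons_by_aa["S"]))
    print("Serine middle codon bases:", sorted(serine_middle_bases))
```

Serine's codons carry either C or G in the middle position, which pair with G or C at the 2nd anticodon position, consistent with serine occupying both Column 2 and Column 4.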
The development of a functional EF-Tu GTPase latch facilitated the expansion of the 3rd anticodon position to four potential bases. In Column 2, this enabled further sectoring to incorporate proline and threonine. Due to the hydrophobic and neutral nature of these amino acids, there is no further sectoring of their codons. Threonine RS-IIA, proline RS-IIA, and serine RS-IIA are closely related aaRS enzymes by structure and sequence.
Column 1 contains valine, isoleucine, and leucine. These are all hydrophobic amino acids with conserved aaRS enzymes. Phenylalanine is a late addition to the code, occupying the unfavorable Row 1 position. Methionine serves as the start signal for translation. It occupies a single codon within the sector it shares with isoleucine. Different modification enzymes are employed to prevent misreading by tRNAIle and to differentiate between the AUA (Ile) and AUG (Met) codons.
Column 3 is the most innovative, as it is determined by the 1st and 2nd anticodon positions. The development of the three-nucleotide code accelerated this process. The alternating striped pattern of aspartate and glutamate gave rise to the incorporation of histidine, lysine, glutamine, and asparagine. Asparagine and glutamine are believed to have entered the genetic code via amidation of the related aspartate and glutamate by amidotransferases.
The unfavorable Row 1 was later occupied by tyrosine and a STOP codon. Glycine retained the most favorable position in Column 4. Arginine occupied Rows 2-3 until serine succeeded in translocating. Tryptophan, cysteine, and a STOP codon are late entries into the genetic code, relegated to the disfavored Row 1.
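The late entries confined to Row 1 can be recovered from the modern standard code with a short check. The sketch below assumes that Row 1 corresponds to anticodons with the disfavored A at the 3rd position, which under Watson-Crick pairing means codons beginning with U. That mapping is an inference made for this example, and the variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated with each position in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Collect the set of first codon bases used by each amino acid ('*' marks STOP).
first_bases_by_aa = defaultdict(set)
for codon, aa in CODON_TABLE.items():
    first_bases_by_aa[aa].add(codon[0])

# Assumed Row 1: codons starting with U (3rd anticodon position A).
# Keep only the entries whose codons all fall in that row.
row1_only = {aa for aa, firsts in first_bases_by_aa.items() if firsts == {"U"}}

if __name__ == "__main__":
    print("Confined to Row 1:", sorted(row1_only))
```

Under this assumption, the entries confined to Row 1 are phenylalanine (F), tyrosine (Y), cysteine (C), tryptophan (W), and the STOP codons. Leucine and serine also have codons beginning with U but are excluded because they occupy other rows as well.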
- Lei, L. and Burton, Z. F. (2020) ‘Evolution of Life on Earth: tRNA, Aminoacyl-tRNA Synthetases and the Genetic Code’, Life (Basel, Switzerland), 10(3), p. 21. doi: 10.3390/life10030021.
- Lei, L. and Burton, Z. F. (2021) ‘Evolution of the genetic code’, Transcription, 12(1), pp. 28–53. doi: 10.1080/21541264.2021.1927652.