Improved efficacy of advanced genome editing with machine learning

Researchers at the Wellcome Sanger Institute have developed a new tool to estimate the probability of successfully introducing a gene-edited DNA sequence into a cell’s genome using the prime editing method.


Prime editing, a development of CRISPR-Cas9 gene editing technology, holds enormous promise for treating human genetic diseases, including cystic fibrosis and cancer. However, it is still unclear which factors determine how efficiently an edit is made.

The study, published in Nature Biotechnology, evaluated thousands of distinct DNA sequences that were inserted into the genome using prime editors.

These data were then used to train a machine learning model that helps researchers design the best fix for a given genetic flaw, which should speed up efforts to bring prime editing into clinical settings.

CRISPR-Cas9, created in 2012, was the first easily programmable gene editing tool. These “molecular scissors” allowed researchers to cut DNA at any point in the genome in order to add, delete, or change specific segments of DNA sequence.

With this technology, it has become possible to identify the genes that play a role in a variety of diseases, from cancers to rare disorders, and to create medicines that can correct or silence those genes or their harmful mutations.

Base editors, an innovation built on CRISPR-Cas9, are described as “molecular pencils” because they can change a single DNA base. Prime editors, developed in 2019, are the newest gene editing technology. They have earned the nickname “molecular word processors” for their ability to perform precise search-and-replace operations directly on the genome.

The ultimate goal of these technologies is to correct harmful gene mutations in humans. More than 16,000 small deletion variants, in which a few DNA bases have been removed from the genome, have been causally linked to disease.

One example is cystic fibrosis, where just three DNA nucleotides are deleted in 70% of cases. Gene editing has already reached patients: in 2022, base-edited T-cells successfully treated a patient’s leukemia after chemotherapy and a bone marrow transplant had failed.

In this new study, scientists at the Wellcome Sanger Institute designed 3,604 DNA sequences ranging from one to 69 DNA bases in length. These sequences were inserted into three different human cell lines, using different prime editor delivery methods and in different DNA repair contexts.

The cells’ genomes were sequenced a week later to determine whether the edits had been successful.

The insertion efficiency, or success rate, of each sequence was then assessed to identify the characteristics that successful edits had in common. Both the length of the sequence and the type of DNA repair mechanism involved proved to be crucial variables.

The variables involved in successful prime edits of the genome are many, but we’re beginning to discover what factors improve the chances of success. Length of sequence is one of these factors, but it’s not as simple as the longer the sequence the more difficult it is to insert. We also found that one type of DNA repair prevented the insertion of short sequences, whereas another type of repair prevented the insertion of long sequences.

Jonas Koeppel, PhD Student, Wellcome Sanger Institute

The researchers used machine learning to detect patterns in these data, such as sequence length and the type of DNA repair involved, that predict insertion success. After being trained on the existing data, the algorithm was tested on new data and was found to predict insertion success accurately.
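The train-then-test procedure described above can be illustrated with a minimal Python sketch. It is not the study's model or data: the straight-line fit stands in for the machine learning algorithm, the `holdout_evaluate` helper is a hypothetical name, and the toy records are synthetic values generated from an arbitrary relationship purely so the example runs.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def holdout_evaluate(records, test_fraction=0.2, seed=0):
    """Fit on one part of the data, then score predictions on the unseen part.

    records: list of (feature_value, measured_efficiency) pairs.
    Returns the correlation between predicted and measured values
    on the held-out records only.
    """
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(2, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    slope, intercept = fit_line([r[0] for r in train], [r[1] for r in train])
    predicted = [slope * r[0] + intercept for r in test]
    measured = [r[1] for r in test]
    return pearson(predicted, measured)


# Synthetic toy data (NOT from the study): an arbitrary linear trend plus noise.
rng = random.Random(42)
toy_records = [(x, 50.0 - 0.5 * x + rng.gauss(0, 3)) for x in range(1, 70)]
held_out_r = holdout_evaluate(toy_records)
print(f"Correlation on held-out data: {held_out_r:.2f}")
```

The point of the design is that the score is computed only on records the model never saw during fitting, which is what makes it a fair estimate of how well predictions would carry over to new edits.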

Put simply, several different combinations of three DNA letters can encode for the same amino acid in a protein. That is why there are hundreds of ways to edit a gene to achieve the same outcome at the protein level. By feeding these potential gene edits into a machine learning algorithm, we have created a model to rank them on how likely they are to work. We hope this will remove much of the trial and error involved in prime editing and speed up progress considerably.

Juliane Weller, PhD Student, Wellcome Sanger Institute
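The redundancy of the genetic code mentioned in the quote can be made concrete with a short Python sketch. It uses the standard genetic code to enumerate every DNA sequence that encodes the same short stretch of protein. The function names and the example peptides are illustrative assumptions, and the study's ranking model is not reproduced here; this only shows where the many candidate edits come from.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of codons that encode it.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(AA_TO_CODONS[aa]) for aa in peptide)


def synonymous_sequences(peptide):
    """Yield every DNA sequence that translates to the given peptide."""
    for codons in product(*(AA_TO_CODONS[aa] for aa in peptide)):
        yield "".join(codons)


def translate(dna):
    """Translate a DNA sequence, read in frame, into one-letter amino acid codes."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical example peptides, chosen only for illustration.
print(count_encodings("MKL"))    # 12
print(count_encodings("LSRA"))   # 864
candidates = list(synonymous_sequences("MKL"))
```

Even a four-residue stretch such as the hypothetical `LSRA` has 864 possible DNA encodings, which is why a model that ranks candidates by predicted success can save so much trial and error.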

To better understand whether and how prime editing could be used to treat all known human genetic diseases, the team’s next step will be to create models for each condition. This work will involve several Sanger Institute research teams as well as the Institute’s partners.

The potential of prime editing to improve human health is vast, but first we need to understand the easiest, most efficient and safest ways to make these edits. It’s all about understanding the rules of the game, which the data and tool resulting from this study will help us to do.

Dr Leopold Parts, Group Leader, Function of Human DNA and its Variation, Wellcome Sanger Institute

Journal reference:

Koeppel, J., et al. (2023). Prediction of prime editing insertion efficiencies using sequence features and DNA repair determinants. Nature Biotechnology.

