FLSHclust Unveils 188 Novel CRISPR-Associated Gene Modules

Using a new algorithm named FLSHclust (“flash clust”), scientists have uncovered 188 rare, previously unknown CRISPR-associated gene modules among billions of protein sequences. The findings include a novel type VII CRISPR-Cas system.

This innovative approach and its revelations open up fresh possibilities for utilizing CRISPR systems and gaining insights into the extensive functional diversity of microbial proteins. CRISPR systems have been pivotal in developing a diverse range of biomolecular methods, notably CRISPR/Cas-mediated genome editing.

The revelation of previously unknown CRISPR systems holds promise for advancing biotechnologies, leading to safer and more efficient genomic therapeutics.

While the CRISPR toolbox has expanded through computational searches of protein sequence databases, the prevalent algorithmic methods have become impractical for navigating the exponentially growing datasets containing billions of proteins.

In response to this limitation, Han Altae-Tran and collaborators devised FLSHclust (fast locality-sensitive hashing-based clustering) – an algorithm designed for clustering proteins based on sequence similarity. Unlike current methods, FLSHclust can rapidly and efficiently analyze extensive protein sequence databases.
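The general idea behind locality-sensitive hashing can be illustrated with a toy sketch. The Python below is not FLSHclust and does not reproduce the authors' implementation. It assumes MinHash signatures over 3-letter sequence fragments and a simple banding scheme, and every function name and parameter in it is hypothetical. It shows only why hashing avoids comparing every protein against every other: sequences that land in the same hash bucket are grouped directly.

```python
import hashlib
from collections import defaultdict


def kmers(seq, k=3):
    """Return the set of overlapping k-letter fragments of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stable_hash(text, seed):
    """Deterministic 64-bit hash of a string under a given integer seed."""
    digest = hashlib.blake2b(
        text.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(seq, num_hashes=32, k=3):
    """For each seeded hash, keep the minimum value over all fragments.

    Assumes the sequence has at least k letters.
    """
    grams = kmers(seq, k)
    return [min(stable_hash(g, seed) for g in grams) for seed in range(num_hashes)]


def lsh_clusters(seqs, num_hashes=32, bands=16, k=3):
    """Group sequences whose signatures agree on at least one whole band."""
    rows = num_hashes // bands
    parent = list(range(len(seqs)))  # union-find forest over sequence indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    buckets = {}  # (band index, band values) -> first sequence seen there
    for idx, seq in enumerate(seqs):
        sig = minhash_signature(seq, num_hashes, k)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            if key in buckets:
                parent[find(idx)] = find(buckets[key])
            else:
                buckets[key] = idx

    groups = defaultdict(list)
    for idx in range(len(seqs)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Toy input: two identical sequences and one with no fragments in common.
toy = ["MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQRQISFVKSHFSRQ", "GGGGGGGGGGWWWWWWWWWW"]
print(lsh_clusters(toy))  # [[0, 1], [2]]
```

Each sequence is hashed once, so the work grows roughly in proportion to the number of sequences rather than the number of pairs. That property is what makes hashing-based approaches attractive at the scale of billions of proteins.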

To validate their methodology, Altae-Tran et al. used FLSHclust to search for rare CRISPR systems in an 8.8-terabase-pair metagenomic database containing 8 billion proteins and 10.2 million CRISPR arrays. The analysis brought to light 188 previously unknown CRISPR-associated gene modules.

Furthermore, the researchers identified and characterized a novel Cas14-containing CRISPR system, designated type VII, which acts on RNA. The newly identified systems are rare: many were confined to a single cluster among the nearly 130,000 CRISPR-linked clusters that FLSHclust produced.

“The discovery of previously unknown cas genes and CRISPR systems substantially expands the known CRISPR diversity, emphasizing the functional versatility of CRISPR whereby previously undiscovered proteins and domains are often recruited, either replacing preexisting components or conferring newly identified functions to the preexisting scaffold of Cas proteins.”

Han Altae-Tran, Massachusetts Institute of Technology

Altae-Tran added, “Taken together, the results of the work reveal unprecedented organizational and functional flexibility and modularity of CRISPR systems but also demonstrates that most variants are rare and only found in relatively unusual bacteria and archaea.”

Journal reference:

Altae-Tran, H., et al. (2023) Uncovering the functional diversity of rare CRISPR-Cas systems with deep terascale clustering. Science. doi.org/10.1126/science.adi1910.

