Study Shows How AI and Single Cell Science are Changing Modern Medicine

Recent technological advances have pushed the boundaries of cellular biology far beyond what was conceivable even decades ago. While DNA remains a cornerstone of cell study, it is now joined by other molecules, including RNA and proteins.

Together, these simple units form a complex narrative about life.

At Yale School of Medicine (YSM), researchers are developing advanced methods for gathering and analyzing cellular data to understand human development, aging, disease, and more. They are fine-tuning these methods to pick up smaller and smaller signals and incorporating artificial intelligence (AI) to process the vast quantities of data produced by modern techniques.

From Many Cells to Single Cells

Studying cells is both a way to understand how things work and a route to resolving problems by identifying therapeutic targets.

"Omics" refers to the characterization of entire sets of biological molecules. Genomics, or whole genome study, came first, followed by transcriptomics, which looks at copied RNA "transcripts" that leave the nucleus; proteomics, or comprehensive protein study; and more.

At first, omics were applied to tissue samples containing many different types of cells. Experiments produced results that reflected an average of all cells present, such as the proteins common in the liver or the genes active in lung tissue. The volume of data was unprecedented, but it lacked the specificity to answer nuanced questions about particular cells.
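The averaging effect described above can be pictured with a small sketch. The cell-type labels and expression values below are invented purely for illustration and do not come from the research described here; the point is only that a single tissue-wide average can hide a signal carried by a minority of cells.

```python
from statistics import mean

# Hypothetical expression of one gene (arbitrary units) in individual cells.
# Cell-type labels and values are invented for illustration only.
cells = [
    ("hepatocyte", 0.0), ("hepatocyte", 0.1), ("hepatocyte", 0.0),
    ("hepatocyte", 0.2), ("immune", 9.8), ("immune", 10.2),
]

def bulk_average(cells):
    """Average over every cell, as a bulk tissue measurement would."""
    return mean(value for _, value in cells)

def per_type_average(cells):
    """Average within each cell type, as single-cell data allows."""
    groups = {}
    for cell_type, value in cells:
        groups.setdefault(cell_type, []).append(value)
    return {cell_type: mean(values) for cell_type, values in groups.items()}

bulk = bulk_average(cells)
by_type = per_type_average(cells)

print(f"bulk average: {bulk:.2f}")
for cell_type, avg in by_type.items():
    print(f"{cell_type}: {avg:.2f}")
```

In this toy sample the bulk value sits between the two cell types and describes neither of them well, which is the limitation that cell-by-cell analysis is meant to address.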

Now, researchers have the technology to analyze cells type by type.

Single-cell sequencing, commercialized in 2013, allows researchers to catalog the types of cells within a tissue and ask questions about each one, highlighting complex dynamics within organs and tissues, and the tremendous amount of diversity each person contains.

As researchers improved their ability to interpret single-cell data, they began to consider the importance of the cell's physical environment.

"Cells often need to be interpreted and measured in their spatial context," says Siyuan (Steven) Wang, PhD, associate professor of genetics and cell biology at YSM. A cell's behavior is determined not only by its type but also by its interactions with other cells. The same type of neuron in different regions of the brain, for example, may function differently.

Wang's lab uses imaging to view cells in three dimensions, as they exist in the body. The approach is complex; it involves sending small probes into individual cells to bind to specific targets. The researchers can then image the probes to reveal the locations and identities of their targets, often genes or transcripts.

A significant part of their research program focuses on visualizing the shape of the genome, that is, how DNA is folded up in the cell nucleus. Because DNA is tightly coiled, the location of a gene can determine when, or if, it gets expressed. Last year, Wang and colleagues published the first single-cell 3D genome atlas for cancer and showed how the shape of the genome changes during cancer progression.

"We found that just by having this new single-cell 3D genome data, we can better identify cancer-driving genes, and these are basically new therapeutic targets. We showed that cancer cells preferentially use 3D genome changes to control expression of some very important genes."

Siyuan (Steven) Wang, PhD, Associate Professor, Genetics and Cell Biology, Yale School of Medicine

Small Protein Fragments With Big Implications

Etienne Caron, PhD, has a hypothesis about how immune cells in the body recognize illness or injury. His idea has evolved over years of studying the small protein fragments called peptides that sit on the outside of all cells.

Cells modify their surface peptides to reflect their internal state, and immune cells use these fragments to monitor for problems. Tumor cells, for example, present certain peptides that only exist in tumors, thereby flagging cancer to the immune system. However, these tumor-specific peptides are nested among many other benign surface peptides, called self-peptides, that don't trigger a reaction.

"For many years, people have thought that the tumor-specific peptides are the only ones that matter. I believe that the self-peptides (the normal peptides) play a critical role in shaping how immune cells will react when they recognize a tumor-specific peptide," says Caron, an assistant professor of immunobiology at YSM.

Caron's field, "immunopeptidomics," is one of the newer omics, and it encompasses all the immune peptides that sit on cell surfaces in the body. Caron, a pioneer in the field, initiated and chaired the Human Immunopeptidome Project, an international collaboration focused on peptide research, for five years before joining the faculty at Yale in 2023.

"When I started, there were only five labs in the world doing this work. I saw an opportunity to build a community because, with so few researchers, collaboration makes much more sense than competition," says Caron.

Caron's hypothesis would likely have been impossible to test a decade ago, but today's technology is much more sensitive. Researchers study these samples with mass spectrometry, a technique that separates the molecules in a biological sample by their mass-to-charge ratio. The surface peptides that form the immunopeptidome are both short and relatively scarce, which makes them difficult to capture. But thanks to advancements in mass spectrometry, Caron can take smaller samples and find even smaller and more elusive fragments.
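To make the mass-to-charge ratio concrete, here is a minimal sketch of how the expected value for a short peptide can be worked out. It assumes an unmodified peptide that picks up one proton per unit of positive charge, and it uses standard monoisotopic residue masses. The function names are hypothetical, and the example sequence, SIINFEKL (a widely used model peptide), is chosen only for illustration; none of this is taken from Caron's workflow.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide accounts for the free termini
PROTON = 1.00728   # mass of each charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge):
    """Expected m/z for the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


# Illustrative 8-residue peptide; the sequence choice is an assumption.
example = "SIINFEKL"
for z in (1, 2):
    print(f"{example} at charge {z}+: m/z = {mass_to_charge(example, z):.4f}")
```

The same peptide appears at a lower m/z when it carries more charge, which is one reason a single molecule can show up as several peaks in a spectrum.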

"If we can see those protein fragments, then we can start educating the immune system," says Caron.

For example, a cell that is infected by a virus will display surface proteins that betray the infection. Designing drugs that can target those proteins could help promote faster recovery. And the tumor-specific peptides in cancer are promising targets for vaccines, which could train the immune system to target and kill cancerous cells. The possible applications for this relatively new branch of science are endless.

"Before it was just based on predictions, but now that we can measure the physical molecules, we can make the science much more accurate and ultimately more efficient," says Caron.

Immunopeptidomics and the Placenta

Since coming to Yale, Caron has met numerous researchers who are interested in applying immunopeptidomics to their work. Liza Konnikova, MD, PhD, an associate professor of neonatal-perinatal medicine and immunobiology at YSM, plans to use immunopeptidomics to understand how the immune system develops.

For the immune system, becoming a good gatekeeper means learning how to recognize potential threats. That means knowing how to distinguish its own cells from foreign ones, and how to identify pathogens, or disease-causing agents, circulating in the body. The immune system familiarizes itself with a pathogen through peptides on the cell surface. Once familiar, it can recognize threats faster in the future.

Konnikova and colleagues recently demonstrated that this process, called immune programming, begins earlier than researchers once thought. They found a specific type of immune cell involved with long-term memory in fetal intestinal tissue samples just halfway through pregnancy and wondered if proteins produced by bacteria, either pathogenic or beneficial, could travel across the placenta from mother to baby.

"People have presumed that this immune programming doesn't happen until after birth, which is true in mice, but in humans it seems more and more likely that it happens during pregnancy," says Konnikova.

To identify the factors contributing to immune programming, Konnikova's group is now studying the signals these cells send and receive. With immunopeptidomics, they are looking for evidence that proteins from microbes help program the fetal immune system.

Konnikova's lab also uses single-cell sequencing methods to address other developmental questions. Because this approach can generate valuable information from small samples, it is especially useful when biological material is limited.

"I think that this technology has really transformed how people study early life because the samples are tiny, tiny, particularly from preterm infants," says Konnikova.

How AI is Making Big Data Bite-Sized

Whether the samples are tiny or huge, the volume of data produced by these techniques can be overwhelming. Researchers were once limited by the rate of DNA sequencing, but now the data are so plentiful that they sometimes bury meaningful results.

To contend with this issue, biologists often call upon AI, which excels at pattern recognition, making it a valuable tool for sequence- and image-based studies. With the right training, AI algorithms can sift through vast quantities of single-cell, transcriptomic, and even immunopeptidomic data many times faster than a human can. But although biology was quick to adopt AI, not all AI was designed with science in mind.

Smita Krishnaswamy, PhD, develops deep learning methods that can detect patterns and make predictions based on biomedical and neuroscientific data. Krishnaswamy started her career as a computer scientist and pivoted with the goal of developing computational methods for medicine, joining the faculty at Yale in 2015.

"We're looking for problems that are challenging to solve with existing techniques," says Krishnaswamy, an associate professor of genetics and of computer science at YSM. "Biological data is often noisy and very high dimensional. A lot of our solutions are based on the idea that there is a latent structure within that data."

Despite its complexity, biological data is not random. Genes and proteins interact in specific ways, which defines the underlying "shape" of the data. Krishnaswamy's group has developed algorithms that sort through the noise to visualize the data, and others that de-noise data.
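One generic way to picture de-noising that exploits structure in the data is to smooth each cell's measurements toward those of its most similar cells. The sketch below is a simple nearest-neighbor average and is not the algorithms developed by Krishnaswamy's group, which are not detailed here; the function name, the parameter `k`, and the toy values are all assumptions made for illustration.

```python
import math


def knn_smooth(profiles, k=2):
    """Replace each profile with the mean of itself and its k nearest neighbors."""
    smoothed = []
    for i, profile in enumerate(profiles):
        # Rank the other profiles by Euclidean distance to this one.
        others = sorted(
            (j for j in range(len(profiles)) if j != i),
            key=lambda j: math.dist(profile, profiles[j]),
        )
        group = [profile] + [profiles[j] for j in others[:k]]
        # Average each gene (column) across the group.
        smoothed.append([sum(column) / len(group) for column in zip(*group)])
    return smoothed


# Toy data: two groups of cells, two genes each, with small measurement noise.
noisy = [
    [1.2, 0.1], [0.8, -0.1], [1.0, 0.0],   # group A, scattered around [1, 0]
    [0.1, 1.1], [-0.1, 0.9], [0.0, 1.0],   # group B, scattered around [0, 1]
]
denoised = knn_smooth(noisy, k=2)

for before, after in zip(noisy, denoised):
    print(before, "->", [round(value, 3) for value in after])
```

Because each cell's nearest neighbors fall within its own group in this toy data, the smoothing pulls the noisy measurements toward a shared group profile without blending the two groups together. Real single-cell data has thousands of genes and far less tidy structure, so practical methods are considerably more involved.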

"Learning the shape of the data is key. It sets us up to study the dynamics of the cells," says Krishnaswamy.

Like people, cells are constantly moving and changing in response to stimuli. Tracking one person or cell is challenging enough, but studying millions at once is unrealistic. Cells are easier to study in isolation, or frozen in time, but doing so limits what researchers can learn about their future progression.

Deep learning methods have aided Krishnaswamy's group in understanding cellular dynamics and representing how cells interact with other cells and change over time. Their algorithms can capture cells over a much longer time scale, showing, for example, how rapid cell division drives the development of an embryo.

One of her recent projects traced two different cell populations from the same tissue sample, one of which formed a tumor while the other did not. This allowed her group to work backwards and find genomic differences between the cells that might be responsible for their divergent fates.

An Algorithm for ALS

Sai Zhang, PhD, assistant professor of biomedical informatics and data science at YSM, uses AI to integrate different kinds of biological data. Zhang works with human genetic data and gene expression patterns to understand the consequences of a genetic mutation.

This approach helps resolve a common point of confusion in genome studies: association vs. causation. When researchers look for disease-causing genes or proteins, they often compare genetic data from many people with the same condition. Trends emerge, but it is difficult to determine whether these commonalities are causing disease or are merely coincidental.

"With genetic data, we can identify a bunch of variants associated with a disease, but we don't know which is causal because they are highly correlated, they always appear together," says Zhang.

By integrating the functional datasets showing expression patterns, the researchers can pick out genes that play a role in disease from the batch of related variants. Those genes become drug targets.

Zhang's lab just completed this process with amyotrophic lateral sclerosis (ALS). They trained an algorithm to scan genetic and gene expression data to find genes associated with a particularly deadly form of ALS. The model flagged a gene called CCDC14.

"We predicted that higher expression level of CCDC14 is bad and can reduce patient survival," says Zhang. "And when we tested it in cells, we saw that reducing expression of that gene restored the cells to a normal, healthy state."

Although the results were promising, the researchers weren't sure how they would translate to living animals. They designed a small molecule to target the gene and tested it in mouse models for ALS. The mice that received this therapeutic lived longer than their counterparts. The researchers are still examining how the drug has this effect and plan to continue testing it as a potential treatment.

Single-cell approaches enable Zhang's lab to achieve an even more nuanced understanding of the effects of mutations. By integrating genetic data with single-cell data, they can determine which mutations matter for which cells. They described this approach in a Nature Biotechnology paper last year.

"The same mutation is present in every cell in the human body, but they have different functions," says Zhang. Understanding the context in which a mutation becomes problematic helps researchers fine-tune drug development to reach the right target.

Efforts to understand why people get sick are motivated by a desire to restore health. Along the way, they reveal details about human biology that contribute to an ever-expanding understanding of basic science.

"If you want to design an effective drug you need to know the biology of the disease," says Zhang. "When you figure out why it is developed and which genes are disrupted, then you can design therapeutics that work."
