In this interview, we speak to Professor Trey Ideker and Yue Qin about their latest research in cell biology and how artificial intelligence could be used to discover new components within cells.
Please can you introduce yourself and tell us what inspired your latest research into artificial intelligence (AI) and cell biology?
My name is Trey Ideker. I am a professor at the University of California San Diego in the Department of Medicine. Our latest research in AI for cell biology was just published last week in the journal Nature, in a study led by Yue Qin, a graduate student in my laboratory. The inspiration for this study comes from thinking about cell biology and how we can make it more systematic, like genomics.
Although the first human genome was very expensive to produce, it pioneered a systematic approach by which many more human genomes could then be sequenced. Nowadays, sequencing your genome costs less than $1,000 and takes about a day.
What if we could apply the same systematic mapping technology to the rest of cell biology, not just the DNA in the nucleus? What if we could push a button and read out the exact structure of any one of your cells? What techniques would be required? Which of these are available now, and which have yet to be invented?
Yue’s paper demonstrates an approach to map cells systematically, like genomes. It relies on two major data types that are commonly used in cell biology but are rarely combined. These two technologies are cellular imaging, on the one hand, and biochemical purifications, on the other.
We can trace most human diseases using our understanding of cells and their components. How is this?
Cancer mutations activate components of cells that signal them to proliferate (many of these components are so-called protein complexes). Alternatively, they deactivate the components responsible for the repair of DNA, so that even more mutations can occur.
Likewise, many of the genetic variants that cause autism affect proteins of the synapse, a major subcellular organelle. Many metabolic disorders are caused by problems with another organelle, the mitochondrion. Nearly all diseases can be traced to different parts of the cell like this.
Although we have a fairly good understanding of cells and their components, there is still a lot we have yet to uncover. Why is this, and do you believe that, as technology and techniques continue to evolve, we will keep making new discoveries?
As I mentioned above, there has not been a reliable technology for mapping cells. For that reason, much of cell biology (and the cell’s components and functions) remains to be discovered. This fact does seem remarkable since all of those biology textbooks (including Molecular Biology of the Cell and others) seem so authoritative. But the truth is there is still much we don’t know.
Can you describe how you carried out your latest research into cell components? What did you discover?
Using protein fluorescence imaging and protein biophysical association data, we leveraged AI techniques to fuse both data types into a Multi-Scale Integrated Cell (MuSIC v1), which resolves 69 subcellular protein systems, about half of which lack previous documentation. By looking at those putatively novel systems, we have already made several interesting new discoveries.
For example, collaborating with our colleague Gene Yeo, we have identified a new complex of proteins that binds RNA and likely plays an important role in splicing, an essential step in processing RNA before it is translated into protein.
Your latest multidisciplinary research combined microscopy, biochemistry, and AI techniques. Why did you choose to combine these disciplines and what advantages did combining these techniques have in your research?
Microscopy and biochemistry are two popular ways of characterizing proteins, and they provide complementary information. Microscopy gives you a global view of the cellular position of a protein, often in reference to cellular landmarks such as the nucleus. Physical protein associations, which are often assayed using biochemical techniques, provide a very localized view of protein position relative to other nearby proteins.
Given the complementary view provided by these two platforms, we were intrigued by the question of how to properly combine them so that we can systematically chart the intricate cellular organization at vastly different scales. AI can be really helpful due to its ability to recognize patterns in massive amounts of data while learning the protein-protein relationships shared by the local and global views that the two types of data provide.
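As a rough illustration of the idea of combining a global view (imaging) with a local view (physical association), here is a toy Python sketch. It is not the method used in the study: the protein names, feature vectors, equal weighting, and thresholds are all hypothetical, and simple cosine similarity with connected components stands in for whatever learning and clustering the authors actually applied.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fused_similarity(image_feats, assoc_feats):
    """Average the similarity from each data type for every protein pair.

    Equal weighting is an assumption made for illustration only.
    """
    proteins = sorted(image_feats)
    return {
        (p, q): 0.5 * (cosine(image_feats[p], image_feats[q])
                       + cosine(assoc_feats[p], assoc_feats[q]))
        for p, q in combinations(proteins, 2)
    }


def communities(proteins, sim, threshold):
    """Group proteins linked by fused similarity >= threshold (union-find)."""
    parent = {p: p for p in proteins}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (p, q), s in sim.items():
        if s >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: P1-P3 look alike under the microscope (same region),
# but only P1 and P2 also share physical association partners.
image_feats = {"P1": [1, 0], "P2": [1, 0], "P3": [1, 0], "P4": [0, 1]}
assoc_feats = {"P1": [1, 0, 0], "P2": [1, 0, 0], "P3": [0, 1, 0], "P4": [0, 0, 1]}

sim = fused_similarity(image_feats, assoc_feats)
small_scale = communities(sorted(image_feats), sim, threshold=0.9)
large_scale = communities(sorted(image_feats), sim, threshold=0.4)
print(small_scale)  # [['P1', 'P2'], ['P3'], ['P4']]
print(large_scale)  # [['P1', 'P2', 'P3'], ['P4']]
```

In this toy setting, a strict threshold recovers only the tightly associated pair, which is loosely analogous to a complex, while a looser threshold also pulls in the protein that merely shares a location, loosely analogous to a larger compartment. Sweeping the threshold is one simple way to see structure at more than one scale.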
Do you hope that your research will show other researchers the importance of combining techniques to gain a deeper insight and encourage more researchers to adopt similar approaches in their future work?
Of course! This is exactly what we hope our study could inspire researchers to do. As we have demonstrated in our study, both protein confocal images and protein biophysical associations are essential for gaining a complete understanding of cellular architecture, and each data modality informs us about cells at different scales.
In the past, integrating data of such distinct qualities and resolutions has been technically difficult and extremely hard to achieve at a systematic level, but our study presented an elegant solution that opens up the wide possibility of incorporating other diverse types of data into building proteome-wide cell maps.
How could your latest research be used to help further our understanding of human disease? What further research needs to be carried out before this can happen?
The multi-scale cell map we’ve prototyped in our current study looks at only one cell type, the human embryonic kidney cell line HEK293. While HEK293 is a popular cellular model often used by scientists for understanding biological mechanisms that are disease-relevant, a more direct approach for understanding human disease would be to build such multi-scale cell maps for pathological cells and tissues. This will enable us to identify potential disease-specific protein assemblies and understand, for example, how numerous genotypic perturbations converge on certain protein communities and at which scale this occurs.
Artificial intelligence and deep learning are continuing to emerge as powerful techniques within the scientific sector. What role do you think AI plays in scientific research and do you see its role changing over the next 10 years?
In much current research, AI mainly summarizes patterns from the existing data. Domain experts then examine AI-derived patterns and generate hypotheses by picking up unfamiliar or surprising ones.
Over the next 10 years, it is possible that AI could advance from pattern recognition to a hypothesis-generation role, further lowering the barrier between different domains.
What are the next steps for your research?
The clear next step is to blow through the entire human cell and then move to different cell types, people and species.
Eventually, we might be able to better understand the molecular basis of many diseases by comparing what’s different between healthy and diseased cells.
Where can readers find more information?
About Professor Trey Ideker
Trey Ideker, Ph.D. is a Professor in the Departments of Medicine, Bioengineering, and Computer Science at UC San Diego. Additionally, he is the Director or Co-Director of the National Resource for Network Biology (NRNB), the Cancer Cell Map Initiative (CCMI), the Psychiatric Cell Map Initiative (PCMI), and the UCSD Bioinformatics Ph.D. Program, and former Chief of Genetics in the Department of Medicine.
Dr. Ideker received Bachelor’s and Master’s degrees from MIT in Electrical Engineering and Computer Science and his Ph.D. from the University of Washington in Molecular Biology under the supervision of Dr. Leroy Hood. He currently serves on the Editorial Boards for Cell, Cell Reports, Molecular Systems Biology, and PLoS Computational Biology and is a Fellow of AAAS and AIMBE. He is a member of the Web of Science Highly Cited Researchers list, reserved each year for the top 1% of scientists by citations.
He was named a Top 10 Innovator by Technology Review and was the recipient of the Overton Prize from the International Society for Computational Biology. His work has been featured in news outlets such as NPR, BBC, New York Times, Scientific American, Smithsonian, Discover, Forbes magazine, Popular Mechanics, and People Magazine.
The Ideker Laboratory seeks to map the molecular networks governing cancer and neurological disorders and to use these maps in artificially intelligent systems for precision medicine. The laboratory also produces the Cytoscape ecosystem of network analysis tools, which has been cited over 25,000 times.
About Yue Qin
Yue obtained her Bachelor of Science in Bioinformatics at UC San Diego, where she then continued her Ph.D. training in the Bioinformatics and Systems Biology program, mentored by Dr. Trey Ideker.
Her research develops machine learning approaches to build structurally descriptive and functionally predictive models for human cells. She was awarded the NCI Predoctoral to Postdoctoral Fellow Transition Award (F99/K00) in 2021.