Recent research shows that machine learning can identify “genes of importance” that help crops grow with less fertilizer. Machine learning can also predict additional traits in plants and disease outcomes in animals, demonstrating its applications beyond agriculture. The research was published in the journal Nature Communications.
Corn (maize) growing in the New York University (NYU) Rose Sohn Zegar Greenhouse on the roof of the NYU Center for Genomics & Systems Biology. Image Credit: New York University Coruzzi Lab.
Using genomic data to predict outcomes in medicine and agriculture is both a promise and a challenge for systems biology. Scientists have been working to determine how best to use the vast amount of available genomic data to predict how organisms respond to changes in nutrition, toxins, and pathogen exposure, which in turn would inform crop improvement, disease prognosis, epidemiology, and public health.
However, accurately predicting such complex outcomes in medicine and agriculture from genome-scale information remains a major challenge.
Scientists from New York University (NYU) and collaborators in the United States and Taiwan addressed this challenge using machine learning, a type of artificial intelligence used to detect patterns in data.
“We show that focusing on genes whose expression patterns are evolutionarily conserved across species enhances our ability to learn and predict ‘genes of importance’ to growth performance for staple crops, as well as disease outcomes in animals.”
Gloria Coruzzi, Study Senior Author and Carroll & Milton Petrie Professor, Department of Biology, Center for Genomics and Systems Biology, New York University
Chia-Yi Cheng of NYU’s Center for Genomics and Systems Biology and National Taiwan University, the lead author of the study, said, “Our approach exploits the natural variation of genome-wide expression and related phenotypes within or across species.”
Cheng added, “We show that paring down our genomic input to genes whose expression patterns are conserved within and across species is a biologically principled way to reduce the dimensionality of the genomic data, which significantly improves the ability of our machine learning models to identify which genes are important to a trait.”
As a proof of principle, the scientists showed that genes whose responsiveness to nitrogen is evolutionarily conserved between two different plant species (Arabidopsis, a small flowering plant widely used as a model organism in plant biology, and varieties of corn, America’s largest crop) substantially improved the ability of machine learning models to predict genes of importance for how efficiently plants use nitrogen.
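The filtering idea described above can be illustrated with a toy sketch. It is not the study's pipeline: the gene names, ortholog map, expression values, and trait scores are all invented, and ranking genes by their correlation with the trait is only a simple stand-in for the feature-importance scores a trained machine learning model would provide. The sketch first keeps only genes whose response is conserved in both species, then ranks the surviving genes against a trait.

```python
"""Toy sketch: conservation-based gene filtering followed by gene ranking."""
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def conserved_genes(responsive_a, responsive_b, orthologs):
    """Keep species-A genes that respond to the treatment and whose
    ortholog in species B also responds."""
    return sorted(
        gene for gene in responsive_a
        if orthologs.get(gene) in responsive_b
    )


def rank_genes(expression, trait, genes):
    """Rank genes by absolute correlation between expression and the trait
    (a stand-in for model-derived feature importance)."""
    scores = {g: abs(pearson(expression[g], trait)) for g in genes}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: every name and number below is invented for illustration.
orthologs = {"At_g1": "Zm_g1", "At_g2": "Zm_g2", "At_g3": "Zm_g3"}
responsive_arabidopsis = {"At_g1", "At_g2", "At_g3", "At_g4"}
responsive_corn = {"Zm_g1", "Zm_g3"}

expression = {  # expression of each gene across five samples
    "At_g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "At_g2": [5.0, 1.0, 4.0, 2.0, 3.0],
    "At_g3": [2.0, 1.0, 4.0, 3.0, 5.0],
    "At_g4": [3.0, 3.0, 3.0, 3.0, 3.0],
}
trait = [1.1, 1.9, 3.2, 3.9, 5.1]  # e.g. a nitrogen-use score per sample

kept = conserved_genes(responsive_arabidopsis, responsive_corn, orthologs)
ranking = rank_genes(expression, trait, kept)
print(kept)     # ['At_g1', 'At_g3']
print(ranking)  # ['At_g1', 'At_g3']
```

In this toy example, four candidate genes shrink to two before any ranking happens, which is the dimensionality reduction the researchers describe, here at a trivially small scale.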
Nitrogen is an important nutrient for plants and the major component of fertilizer. Crops that utilize nitrogen more efficiently show improved growth and require less fertilizer, which has both environmental and economic benefits.
The scientists carried out experiments that validated eight master transcription factors as genes of importance to nitrogen use efficiency. They showed that altering the expression of these genes in Arabidopsis or corn could increase plant growth in low-nitrogen soils, which they tested both in the laboratory at NYU and in cornfields at the University of Illinois.
“Now that we can more accurately predict which corn hybrids are better at using nitrogen fertilizer in the field, we can rapidly improve this trait. Increasing nitrogen use efficiency in corn and other crops offers three key benefits by lowering farmer costs, reducing environmental pollution, and mitigating greenhouse gas emissions from agriculture.”
Stephen Moose, Study Author and Alexander Professor, Crop Sciences, University of Illinois
Furthermore, the scientists demonstrated that this evolutionarily informed machine learning approach can be applied to other traits and species by predicting additional traits in plants, including biomass and yield in both Arabidopsis and corn. They also showed that the approach can predict genes of importance to drought resistance in another staple crop, rice, as well as disease outcomes in animals, through the study of mouse models.
Coruzzi states, “Because we showed that our evolutionarily informed pipeline can also be applied in animals, this underlines its potential to uncover genes of importance for any physiological or clinical traits of interest across biology, agriculture, or medicine.”
“Many key traits of agronomic or clinical importance are genetically complex and hence it’s difficult to pin down their control and inheritance. Our success proves that big data and systems level thinking can make these notoriously difficult challenges tractable.”
Ying Li, Study Author and Faculty, Department of Horticulture and Landscape Architecture, Purdue University
Cheng, C.-Y., et al. (2021) Evolutionarily informed machine learning enhances the power of predictive gene-to-phenotype relationships. Nature Communications. doi.org/10.1038/s41467-021-25893-w.