A team of New York University computer scientists has created a neural network that can explain how it reaches its predictions. The work reveals what accounts for the functionality of neural networks, the engines that drive artificial intelligence and machine learning, thereby illuminating a process that has largely been concealed from users.
The breakthrough centers on a specific usage of neural networks that has become popular in recent years: tackling challenging biological questions. Among these are examinations of the intricacies of RNA splicing, the focal point of the study, which plays a role in transferring information from DNA to functional RNA and protein products.
"Many neural networks are black boxes: these algorithms cannot explain how they work, raising concerns about their trustworthiness and stifling progress into understanding the underlying biological processes of genome encoding," says Oded Regev, a computer science professor at NYU's Courant Institute of Mathematical Sciences and the senior author of the paper, which appears in the Proceedings of the National Academy of Sciences. "By harnessing a new approach that improves both the quantity and the quality of the data for machine-learning training, we designed an interpretable neural network that can accurately predict complex outcomes and explain how it arrives at its predictions."
Regev and the paper's other authors, Susan Liao, a faculty fellow at the Courant Institute, and Mukund Sudarshan, a Courant doctoral student at the time of the study, created a neural network based on what is already known about RNA splicing.
Specifically, they developed a model, the data-driven equivalent of a high-powered microscope, that allows scientists to trace and quantify the RNA splicing process, from input sequence to output splicing prediction.
"Using an 'interpretable-by-design' approach, we've developed a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information. Our model revealed that a small, hairpin-like structure in RNA can decrease splicing."
Oded Regev, computer science professor at NYU's Courant Institute of Mathematical Sciences
The researchers confirmed the insights their model provides through a series of experiments. The results matched the model's discovery: whenever the RNA molecule folded into a hairpin configuration, splicing was halted, and the moment the researchers disrupted this hairpin structure, splicing was restored.
The research was supported by grants from the National Science Foundation (MCB-2226731), the Simons Foundation, the Life Sciences Research Foundation, an Additional Ventures Career Development Award, and a PhRMA Fellowship.
Journal reference:
Liao, S. E., et al. (2023) Deciphering RNA splicing logic with interpretable machine learning. PNAS. doi.org/10.1073/pnas.2221165120.