A research team led by experts at the University of California, Riverside, has developed a computational workflow for analyzing massive data sets in metabolomics, the study of small molecules found within cells, biofluids, tissues, and entire ecosystems.
The group most recently used the new technique to examine contaminants in seawater off Southern California, quickly identifying probable contamination sources and recording the chemical profiles of coastal habitats.
“We are interested in understanding how such pollutants get introduced into the ecosystem. Figuring out which molecules in the ocean are important for environmental health is not straightforward because of the ocean’s sheer chemical diversity. The protocol we developed greatly speeds up this process. More efficient sorting of the data means we can understand problems related to ocean pollution faster.”
Daniel Petras, Assistant Professor and Research Lead, Department of Biochemistry, University of California, Riverside
Petras et al. explained in the journal Nature Protocols that their procedure is intended for both novice and expert researchers, making it a valuable tool for students and early-career scientists.
In addition to the computational workflow, a web application with a graphical user interface makes metabolomics data analysis accessible to non-experts, allowing users to obtain statistical insights into their data in a matter of minutes.
“This tool is accessible to a broad range of researchers, from absolute beginners to experts, and is tailored for use in conjunction with the molecular networking software my group is developing. For beginners, the guidelines and code we provide make it easier to understand common data processing and analysis steps. For experts, it accelerates reproducible data analysis, enabling them to share their statistical data analysis workflows and results.”
Mingxun Wang, Study Co-Author and Assistant Professor, Department of Computer Science and Engineering, University of California, Riverside
According to Petras, the paper is unusual in that it is a large educational resource assembled by a virtual research group, the Virtual Multiomics Lab (VMOL). VMOL is an open-access, community-driven international network of more than 50 scientists.
It aims to make chemical analysis more accessible and democratic so that researchers around the world can use it, regardless of their resources or backgrounds.
“I am incredibly proud to see how this project evolved into something impactful, involving experts and students from across the globe. By removing physical and economic barriers, VMOL provides training in computational mass spectrometry and data science and aims to launch virtual research projects as a new form of collaborative science.”
Abzer Pakkir Shah, Doctoral Student and Study First Author, University of California, Riverside
All of the software the group created is free and open source. The team founded VMOL in 2022 and began developing the software during a summer course on non-targeted metabolomics at the University of Tübingen.
Petras expects the approach to be particularly valuable to environmental researchers, biomedical researchers, and scientists conducting clinical studies in microbiome science.
“The versatility of our protocol extends to a wide range of fields and sample types, including combinatorial chemistry, doping analysis, and trace contamination of food, pharmaceuticals, and other industrial products,” he said.
Journal reference:
Pakkir Shah, A. K., et al. (2024) Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data. Nature Protocols. doi.org/10.1038/s41596-024-01046-3.