New Probabilistic Model Unifies the Generation and Inference for Single-Cell, Spatial Omics Data

UCLA researchers have developed an "all-in-one," next-generation statistical simulator capable of assimilating a wide range of information to generate realistic synthetic data and provide a benchmarking tool for medical and biological researchers who use advanced technologies to study diseases and potential therapies. Specifically, the new computer-modeling – or "in silico" – system can help researchers evaluate and validate computational methods.

Single-cell RNA sequencing, also called single-cell transcriptomics, is the foundation for analyzing the genome-wide gene expression levels of individual cells. The introduction of additional "omics" technologies offered detail on a range of molecular features, and in recent years, spatial transcriptomic technologies have made it possible to profile gene expression levels together with the spatial location of cell "neighborhoods," showing precise locations and movements of cells within tissue.

"Thousands of computational methods have been developed to analyze single-cell and spatial omics data for a variety of tasks, making method benchmarking a pressing challenge for method developers and users."

Jingyi Jessica Li, PhD, UCLA researcher and professor in statistics, biostatistics, computational medicine, and human genetics

Li is also affiliated with the Gene Regulation research area at the UCLA Jonsson Comprehensive Cancer Center and leads a research group called the Junction of Statistics and Biology.

"Although simulators have evolved and become more powerful, there are numerous limitations. Few can generate realistic single-cell RNA sequencing data from continuous cell trajectories by mimicking real data, and most lack the ability to simulate data of multi-omics and spatial transcriptomics. We introduced the scDesign3, which we believe is the most realistic and versatile simulator to date, to fill the gap between researchers' benchmarking needs and the limitations of existing tools," said Li, senior author of a study published May 11 in Nature Biotechnology.

The UCLA researchers say they believe scDesign3 "offers the first probabilistic model that unifies the generation and inference for single-cell and spatial omics data. Equipped with interpretable parameters and a model likelihood, scDesign3 is beyond a versatile simulator and has unique advantages for generating customized in silico data, which can serve as negative and positive controls for computational analysis, and for assessing the goodness-of-fit of inferred cell clusters, trajectories, and spatial locations in an unsupervised way." Goodness-of-fit is a measure of how well a statistical model fits a set of observations.

According to the authors, the system's "transparent modeling and interpretable parameters can help users explore, alter, and simulate data. Overall, scDesign3 is a multi-functional suite for benchmarking computational methods and interpreting single-cell and spatial omics data."

This study was led by Li's student Dongyuan Song, a 4th-year Ph.D. student in the UCLA Interdepartmental Bioinformatics Ph.D. program.
