Pseudotime trajectory inference is a computational method for studying and representing, in one dimension, the dynamic changes in a cell or cell population over time.
Comparing trajectories can provide critical information, but current approaches are limited in their ability to capture complex patterns such as sequential mismatches.
A recent study published in Nature Methods introduced a Bayesian information-theoretic framework known as Genes2Genes, which can accurately infer alignments in simulated and real datasets of single-cell trajectories and capture sequential mismatches and matches in individual genes.
Study: Gene-level alignment of single-cell trajectories.
Background
Single-cell ribonucleic acid sequencing (scRNAseq) has substantially improved our ability to study biological processes at a single-cell level by allowing scientists to observe and measure the activity of thousands of genes within each cell.
This has helped decipher how cells transition from one state to another, such as changes in response to drug treatments or during development.
To explore these temporal changes in dynamic cellular processes, researchers use a method called pseudotime trajectory inference, which creates a one-dimensional timeline of cellular events or changes.
However, comparing two or more timelines, such as between diseased and healthy cells or treatment and control groups, has proved challenging.
Traditional methods for comparing timelines, such as dynamic time warping, align them by matching similar points in time. This assumes that every time point has a match in the other timeline, so these methods cannot resolve mismatches in trajectories that could indicate unobserved cellular states or other pertinent differences.
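The limitation described above can be seen in a minimal sketch of classic dynamic time warping. The function name `dtw_path` and the toy expression values are illustrative assumptions, not taken from the study or from any of the tools it compares.

```python
def dtw_path(ref, query):
    """Classic dynamic time warping between two 1-D series.

    Returns (total_cost, path), where path is a list of
    (ref_index, query_index) pairs.
    """
    n, m = len(ref), len(query)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - query[j - 1])
            # Every cell extends a previous pairing; there is no "skip" move.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Trace the cheapest path back from the end to the start.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Hypothetical toy data: the two middle query values have no real counterpart.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
query = [0.0, 1.0, 9.0, 9.0, 4.0]
total_cost, path = dtw_path(reference, query)
print(total_cost, path)
```

Even though the two middle query values differ sharply from anything in the reference, every index of both series appears in the returned path. The mismatch only shows up as a higher total cost, not as an explicitly unmatched region.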
About the Study
In the present study, the researchers presented the new Genes2Genes framework that improves upon dynamic time warping by identifying mismatches and matches at the single-gene level.
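To illustrate what it means for an alignment to report mismatches as well as matches, the sketch below extends a dynamic-programming alignment with explicit gap moves. It is a deliberately simplified toy: the fixed `gap_penalty`, the absolute-difference match cost, and the function name `align_with_gaps` are assumptions made for illustration, and it does not reproduce the Bayesian information-theoretic scoring that Genes2Genes uses.

```python
def align_with_gaps(ref, query, gap_penalty=1.0):
    """Toy alignment of two 1-D series that allows unmatched points.

    Returns a string with one letter per step:
      M = reference and query points are matched
      D = reference point has no counterpart in the query
      I = query point has no counterpart in the reference
    """
    n, m = len(ref), len(query)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                c = cost[i - 1][j - 1] + abs(ref[i - 1] - query[j - 1])
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "M"
            if i > 0:
                c = cost[i - 1][j] + gap_penalty
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "D"
            if j > 0:
                c = cost[i][j - 1] + gap_penalty
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "I"

    # Walk back from the end, collecting the chosen move at each step.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        op = move[i][j]
        ops.append(op)
        if op == "M":
            i, j = i - 1, j - 1
        elif op == "D":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(ops))


# Hypothetical toy data: the middle of the query diverges from the reference.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
query = [0.0, 1.0, 9.0, 9.0, 4.0]
print(align_with_gaps(reference, query))
```

With these toy values, the start and end points are matched while the divergent middle region is reported as unmatched points in both series, rather than being forced into pairs. The choice of penalty controls how readily the alignment declares a mismatch, which is the kind of trade-off a principled scoring scheme is meant to settle.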
The study also presented a proof-of-concept experiment using Genes2Genes, where the framework was used to align in vivo and in vitro T-cell development trajectories and other simulated and real datasets.
Three major experiments were conducted to assess the alignment of cell trajectories using Genes2Genes and compare it with that of two existing methods, CellAlign and TrAGEDy. In the first experiment, a simulated dataset of 3,500 pairs of gene trajectories was created across seven alignment patterns, which included convergence, divergence, and matching.
Each trajectory contained 300 cells distributed across pseudotime, and 15 interpolation time points were used to analyze the patterns. Accuracy rates were then used to tune the Genes2Genes parameters and to determine how well the method aligned the matched and mismatched regions between trajectories.
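As a rough sketch of how expression from cells scattered across pseudotime can be summarized at a fixed number of time points, the example below computes a Gaussian kernel-weighted mean at 15 equally spaced pseudotime values for 300 simulated cells. The cell count and the number of points follow the figures quoted above. The equal spacing, the Gaussian kernel, the `bandwidth` value, the function name, and the simulated data are all assumptions for illustration.

```python
import math
import random


def interpolate_expression(pseudotime, expression, n_points=15, bandwidth=0.1):
    """Kernel-weighted mean expression at equally spaced pseudotime points.

    Returns (grid, means): the interpolation points and the smoothed values.
    """
    lo, hi = min(pseudotime), max(pseudotime)
    grid = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    means = []
    for t in grid:
        # Cells close to t in pseudotime contribute more to the estimate.
        weights = [math.exp(-((p - t) ** 2) / (2 * bandwidth ** 2)) for p in pseudotime]
        total = sum(weights)
        means.append(sum(w * e for w, e in zip(weights, expression)) / total)
    return grid, means


# Hypothetical data: 300 cells with expression rising along pseudotime plus noise.
rng = random.Random(0)
cell_pseudotime = [rng.random() for _ in range(300)]
cell_expression = [2.0 * p + rng.gauss(0.0, 0.2) for p in cell_pseudotime]

grid, means = interpolate_expression(cell_pseudotime, cell_expression)
for t, value in zip(grid, means):
    print(f"{t:.2f}\t{value:.2f}")
```

Summarizing each trajectory on a shared grid like this gives two series of equal length, which is the form that alignment methods of the kind discussed here typically operate on.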
In a second experiment, the researchers used a real scRNAseq dataset of 1,845 cells derived from murine pancreatic β-cells and applied Genes2Genes and TrAGEDy to the dataset. They then introduced perturbations into the dataset by simulating mismatches either as changes or deletions to the cell trajectories at different time points.
Two other datasets, one of lung epithelial cells from healthy individuals and idiopathic pulmonary fibrosis patients and another comparing in vivo and in vitro human T-cell development, were also used to assess the performance of Genes2Genes.
The researchers also conducted a third experiment as a negative control, using two completely unrelated trajectories that should show no matches. All the methods were applied to both trajectories to check whether they reported a 100% mismatch, which established a benchmark for their ability to detect unrelated trajectories.
Major Findings
The study showed that the Genes2Genes framework accurately identified gene expression patterns and effectively detected mismatches, performing significantly better than the two other methods, TrAGEDy and CellAlign.
In the simulated dataset containing 3,500 gene pairs, Genes2Genes was able to detect different alignment patterns with 98% to 100% accuracy, which was significantly higher than that of the other two methods.
Genes2Genes also consistently outperformed the other methods in the real biological datasets and successfully identified gene expression mismatches and matches between the alignments. In situations where small mismatches can indicate important cellular changes, this ability to identify mismatches accurately is critical.
Even when there were substantial differences between the cells, as in the case of in vivo and in vitro β-cells in the murine pancreatic development dataset, Genes2Genes was able to effectively capture gene expression mismatches, making it a more suitable method for capturing trends in gene expression than TrAGEDy.
In the dataset comparing healthy lung epithelial cells with those from idiopathic pulmonary fibrosis patients, Genes2Genes was able to detect the expected mismatches, especially in the late stages of cell development.
It was also able to detect the abnormal basaloid cells that are characteristic of idiopathic pulmonary fibrosis, with the early mismatches for some genes highlighting their potential use as biomarkers or therapeutic targets.
Genes2Genes was also able to reveal key differences in the trajectories of in vivo and in vitro T-cell development, such as the absence of tumor necrosis factor (TNF) signaling in the in vitro cells. Furthermore, the genes sex-determining region Y-box 4 (SOX4) and forkhead box protein P1 (FOXP1) were identified as potential targets to refine T-cell development based on differences in gene expression.
Conclusions
Overall, the study showed that the Genes2Genes framework was highly effective in identifying patterns and mismatches in gene expression trajectories in various biological systems.
Its performance surpassed that of existing methods in both simulated and real datasets, and the findings provided valuable insights for improving cell-based experimental models and identifying potential therapeutic targets.
Journal reference:
Sumanaweera, D., Suo, C., Cujba, A., Muraro, D., Dann, E., Polanski, K., Steemers, A. S., Lee, W., Oliver, A. J., Park, J., Meyer, K. B., Dumitrascu, B., & Teichmann, S. A. (2024). Gene-level alignment of single-cell trajectories. Nature Methods. doi:10.1038/s41592-024-02378-4. https://www.nature.com/articles/s41592-024-02378-4