GET Model Unveils Gene Regulation with Unmatched Accuracy Across Cell Types

Transcriptional regulation is a fundamental biological process in which proteins interact with deoxyribonucleic acid (DNA) to control gene activity. Although this process is critical, understanding how it operates across diverse human cell types remains challenging.

A recent study published in Nature introduced a computational model called the General Expression Transformer (GET), which was designed to predict gene expression accurately across human cell types, including cell types absent from its training data.

The researchers demonstrated that GET could utilize chromatin accessibility and sequence data to predict gene activity and identify regulatory elements and transcription factor interactions. This offers insights into cell-specific gene regulation and potential links to disease-related genetic variations.

Study: A foundation model of transcription across human cell types. Image Credit: Butusova Elena/Shutterstock.com

Background

Gene expression is the coordinated result of complex networks involving transcription factors, chromatin structure, and DNA sequences. While the fundamental mechanisms of transcriptional regulation are largely conserved, cell-specific variations make it difficult to generalize findings across cell types.

Computational models such as ExPecto and Enformer have attempted to predict gene expression, but they often require training on specific cell types, which limits their generalizability.

Additionally, existing methods struggle to incorporate diverse regulatory contexts, such as distal regulatory elements or chromatin accessibility patterns. Recent advancements in foundation models, which are trained on vast datasets for myriad tasks, show promise in addressing these challenges.

However, no foundation model to date has effectively predicted transcriptional regulation across the chromatin landscape or a wide range of cell types and conditions.

The Current Study

The present study applied the GET model to investigate transcriptional regulation across 213 human cell types. The researchers designed GET to interpret chromatin accessibility data and DNA sequence information and employed a two-stage process involving pretraining and fine-tuning.

In the pretraining phase, the model was taught to identify regulatory patterns by analyzing chromatin accessibility data from diverse cell types, such as data generated by the assay for transposase-accessible chromatin using sequencing (ATAC-seq).

The researchers achieved this through self-supervised learning, where the model predicted masked chromatin features, enabling it to capture complex regulatory relationships.
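The masking idea can be illustrated with a small sketch. It is not GET's implementation: the region-by-feature matrix, the `mask_regions` helper, and the 50% mask fraction are hypothetical choices made only for illustration. A fraction of regions is hidden, and a model would then be trained to reconstruct the hidden rows from the visible ones.

```python
import random


def mask_regions(features, mask_fraction=0.5, seed=0):
    """Hide a random subset of rows (regions) in a region-by-feature matrix.

    Returns the masked copy and the sorted indices of the hidden rows.
    Hypothetical helper for illustration; not taken from the GET codebase.
    """
    rng = random.Random(seed)
    n_regions = len(features)
    n_masked = max(1, round(n_regions * mask_fraction))
    masked_idx = sorted(rng.sample(range(n_regions), n_masked))
    hidden = set(masked_idx)
    # Hidden rows are replaced by zeros; visible rows are copied unchanged.
    masked = [
        [0.0] * len(row) if i in hidden else list(row)
        for i, row in enumerate(features)
    ]
    return masked, masked_idx


# Toy input: 4 accessible regions, each described by 3 made-up feature scores.
regions = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.3],
    [0.0, 0.4, 0.8],
    [0.5, 0.5, 0.1],
]
masked, masked_idx = mask_regions(regions, mask_fraction=0.5, seed=0)
# A self-supervised objective would score how well a model recovers
# regions[i] for each i in masked_idx, given only `masked`.
```

Because the training signal comes from the data itself, this kind of objective needs no expression labels, which is why it can be applied to accessibility data alone.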

The fine-tuning process involved integrating paired chromatin accessibility and ribonucleic acid (RNA) sequencing data, which allowed GET to translate regulatory information into gene expression predictions.

The model used embeddings and an attention-based architecture to identify interactions between regulatory elements and transcription factors within local genomic regions. Additional normalization techniques and clustering methods were applied to enhance the model's generalizability and ensure robustness across diverse cell types.
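As a rough illustration of what "attention-based" means here, the sketch below computes single-head scaled dot-product self-attention over a few region embeddings. It is a generic textbook form under simplifying assumptions, not GET's architecture: there are no learned projection weights, and the embeddings and the `self_attention` function are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Single-head scaled dot-product self-attention.

    Simplification: queries, keys, and values are the raw embeddings,
    with no learned projections. Returns (outputs, weights), where
    weights[i][j] is how strongly region i attends to region j.
    """
    dim = len(embeddings[0])
    weights = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of all embeddings.
    outputs = [
        [sum(w * value[d] for w, value in zip(row, embeddings)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Three made-up region embeddings of dimension 2.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings)
```

Each row of `weights` sums to 1, so it can be read as a distribution over the other regions in the window. Inspecting such weights is one common way to ask which regions a model relates to one another.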

Various analyses, including benchmarking GET against other models and in silico experiments, were conducted to assess the model’s ability to predict regulatory activity in unseen cell types.

The study also evaluated GET's predictions for precision and scalability, leveraging various experimental datasets, including fetal chromatin accessibility atlases and multi-omic sequencing data.

Major Findings

The study demonstrated that the GET model can predict gene expression with high accuracy across diverse human cell types, including previously unseen ones. In benchmarking, GET outperformed existing models in predicting transcriptional activity, especially in contexts requiring the identification of distal regulatory elements.

For example, GET achieved experimental-level precision in predicting expression from chromatin accessibility data.

Notably, GET identified unique regulatory interactions and elements, such as distal enhancers influencing fetal hemoglobin levels in erythroblasts. These findings highlighted GET's ability to uncover previously unknown regulatory mechanisms, as earlier models missed these regions.

In B cells, GET revealed interactions between specific transcription factors, such as nuclear receptors, and paired box gene 5 (PAX5), which codes for a transcription factor that regulates B cell development. This finding helps explain the functional implications of genetic variants associated with leukemia risk.

The model also displayed exceptional versatility in adapting to diverse sequencing platforms and assay types, including single-cell and bulk chromatin accessibility datasets.

Additionally, GET successfully predicted regulatory element activity in zero-shot experiments, which are machine learning scenarios in which a model makes predictions for categories or conditions it saw no examples of during training. This emphasized the model’s utility in contexts lacking prior experimental data.
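A zero-shot evaluation of this kind can be sketched as a leave-one-cell-type-out split: every sample from one cell type is withheld from training, and predictions for that cell type are then compared with measurements. The sketch below assumes hypothetical sample records, made-up numbers, and the helper names `split_by_cell_type` and `pearson`. Pearson correlation is used as one common agreement metric; the article does not state which metric the study used.

```python
import math


def split_by_cell_type(samples, held_out):
    """Withhold every sample from one cell type so the model never sees it."""
    train = [s for s in samples if s["cell_type"] != held_out]
    test = [s for s in samples if s["cell_type"] == held_out]
    return train, test


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Made-up records: observed expression per gene and cell type.
samples = [
    {"cell_type": "cell_type_A", "gene": "gene_1", "observed": 2.0},
    {"cell_type": "cell_type_A", "gene": "gene_2", "observed": 0.5},
    {"cell_type": "cell_type_B", "gene": "gene_1", "observed": 1.0},
    {"cell_type": "cell_type_B", "gene": "gene_2", "observed": 3.0},
    {"cell_type": "cell_type_C", "gene": "gene_1", "observed": 0.2},
    {"cell_type": "cell_type_C", "gene": "gene_2", "observed": 1.5},
]
train, test = split_by_cell_type(samples, held_out="cell_type_C")

# Placeholder predictions for the held-out cell type; a real model trained
# only on `train` would produce these values.
predicted = [0.3, 1.4]
observed = [s["observed"] for s in test]
score = pearson(predicted, observed)
```

Splitting by cell type, instead of by random sample, is what makes the test zero-shot: no measurement from the held-out cell type can leak into training.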

The researchers also demonstrated GET's ability to map transcription factor interactions and construct structural catalogs of regulatory networks. For instance, GET identified motif-motif interactions and established correlations with protein-protein interactions, providing insights into both direct and cofactor-mediated transcription factor interactions.

Conclusions

Overall, the findings showed that through its interpretable framework, GET provided detailed maps of transcription factor interactions and regulatory networks.

Furthermore, the model’s capacity to analyze extensive datasets, adapt to new sequencing platforms, and predict gene expression without prior knowledge of specific cell types underscored its potential for comprehensively exploring transcriptional regulation.

Its adaptability and scalability also make it a promising tool for exploring transcriptional dynamics and understanding disease-associated genetic variations.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Sidharthan, Chinta. (2025, January 20). GET Model Unveils Gene Regulation with Unmatched Accuracy Across Cell Types. AZoLifeSciences. Retrieved on January 20, 2025 from https://www.azolifesciences.com/news/20250120/GET-Model-Unveils-Gene-Regulation-with-Unmatched-Accuracy-Across-Cell-Types.aspx.

  • MLA

    Sidharthan, Chinta. "GET Model Unveils Gene Regulation with Unmatched Accuracy Across Cell Types". AZoLifeSciences. 20 January 2025. <https://www.azolifesciences.com/news/20250120/GET-Model-Unveils-Gene-Regulation-with-Unmatched-Accuracy-Across-Cell-Types.aspx>.

  • Chicago

    Sidharthan, Chinta. "GET Model Unveils Gene Regulation with Unmatched Accuracy Across Cell Types". AZoLifeSciences. https://www.azolifesciences.com/news/20250120/GET-Model-Unveils-Gene-Regulation-with-Unmatched-Accuracy-Across-Cell-Types.aspx. (accessed January 20, 2025).

  • Harvard

    Sidharthan, Chinta. 2025. GET Model Unveils Gene Regulation with Unmatched Accuracy Across Cell Types. AZoLifeSciences, viewed 20 January 2025, https://www.azolifesciences.com/news/20250120/GET-Model-Unveils-Gene-Regulation-with-Unmatched-Accuracy-Across-Cell-Types.aspx.
