Generative AI Decodes the 3D Structure of DNA

Although every cell in the body contains the same genetic sequence, only a subset of genes is expressed in each cell. This selective gene expression, which differentiates a brain cell from a skin cell, is regulated by the three-dimensional structure of genetic material that determines gene accessibility.

MIT chemists have now developed a generative AI-based method to predict these 3D genome structures significantly faster than current experimental techniques. Their approach can generate thousands of structures within minutes, offering a powerful tool for studying how genome organization influences gene expression.

Predicting 3D Genome Structures

“Our goal was to predict the three-dimensional genome structure from the underlying DNA sequence. Now that we can do that, on par with cutting-edge experimental techniques, it opens up many exciting opportunities,” said Bin Zhang, an associate professor in MIT’s Department of Chemistry and the study’s senior author.

The study, published in Science Advances, was co-authored by MIT graduate students Greg Schuette and Zhuohan Lao.

From Sequence to Structure

Cells fit nearly two meters of DNA into a nucleus just one-hundredth of a millimeter in diameter by packaging it as chromatin, a complex of DNA and proteins with multiple levels of organization. At the most basic level, DNA strands wrap around histone proteins, forming structures that resemble beads on a string.

Epigenetic modifications, chemical tags attached to specific DNA sites, influence chromatin folding and gene accessibility. These modifications help determine which genes are active in different cell types or at different times.

For the past two decades, scientists have used experimental methods like Hi-C to map chromatin structures. This technique chemically links DNA segments that sit close together in the nucleus, then breaks the DNA into fragments and sequences them to identify which segments were neighbors. Hi-C can be applied to single cells or large populations, but it is time-consuming: generating data from a single cell often takes about a week.
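To make the idea of a contact map concrete, here is a minimal sketch of how a 3D structure translates into the kind of "which segments are neighbors" information that Hi-C measures. It is only an illustration: the bead coordinates, the `contact_map` function, and the distance cutoff are hypothetical, and real Hi-C data comes from sequencing rather than from known coordinates.

```python
import math

def contact_map(coords, cutoff):
    """Return a symmetric 0/1 matrix marking bead pairs closer than `cutoff`."""
    n = len(coords)
    contacts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Two beads count as "in contact" if they are spatially close,
            # regardless of how far apart they sit along the chain.
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts[i][j] = contacts[j][i] = 1
    return contacts

# Hypothetical toy chain of four beads folded into a square (arbitrary units)
toy_chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
toy_contacts = contact_map(toy_chain, cutoff=1.2)

for row in toy_contacts:
    print(row)
```

In this toy chain, the first and last beads are in contact even though they are the farthest apart along the sequence, which is the kind of long-range folding information a contact map records.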

AI-Powered Structural Predictions

To overcome these limitations, Zhang and his team developed an AI model that rapidly analyzes DNA sequences and predicts their chromatin conformations.

“Deep learning excels at pattern recognition. It enables us to analyze long DNA segments—thousands of base pairs—and extract crucial structural information,” Zhang explained.

Their model, ChromoGen, consists of two key components:

  1. A deep learning model trained to analyze DNA sequences and chromatin accessibility data, which varies by cell type.

  2. A generative AI model trained on over 11 million chromatin conformations to predict physically accurate structures.

Using experimental data from 16 human B lymphocyte cells obtained via Dip-C (a variation of Hi-C), the model successfully captured sequence-structure relationships. The first component informs the generative model about how different cell environments affect chromatin formation, allowing the AI to generate multiple possible structures for a given DNA sequence.

“Predicting genome structure is complex because there isn’t a single correct solution. Each genomic region can adopt a variety of conformations,” said Greg Schuette, study co-author. “Our model predicts this high-dimensional distribution with remarkable accuracy.”
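One way to picture what "a variety of conformations" means in practice is to look at how the distance between two sites varies across a set of structures for the same region. The sketch below is purely illustrative and is not ChromoGen's method: the two conformations, the `pair_distance_stats` helper, and the units are all made up for the example.

```python
import math
import statistics

def pair_distance_stats(ensemble, i, j):
    """Mean and population standard deviation of the bead i-j distance."""
    distances = [math.dist(coords[i], coords[j]) for coords in ensemble]
    return statistics.mean(distances), statistics.pstdev(distances)

# Hypothetical three-bead region in two conformations (arbitrary units)
folded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.8, 0.0)]
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ensemble = [folded, extended]

# Neighboring beads stay at a fixed distance; the end-to-end distance varies.
mean_near, spread_near = pair_distance_stats(ensemble, 0, 1)
mean_far, spread_far = pair_distance_stats(ensemble, 0, 2)

print(f"beads 0-1: mean {mean_near:.2f}, spread {spread_near:.2f}")
print(f"beads 0-2: mean {mean_far:.2f}, spread {spread_far:.2f}")
```

A single "average" structure would hide the spread in the end-to-end distance, which is why a model that outputs many structures per region carries more information than one that outputs a single answer.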

Rapid and Accurate Analysis

Once trained, the model operates at unprecedented speed. “Whereas traditional experiments might take six months to analyze a few dozen structures in one cell type, our model can generate a thousand structures in just 20 minutes using a single GPU,” Schuette noted.
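The scale of that difference is easier to see as a back-of-the-envelope calculation. The model figures below come from the quote; the experimental figures are assumptions made only for illustration, since "a few dozen" and "six months" are not exact numbers.

```python
# Figures quoted in the article
model_structures = 1000
model_minutes = 20

# Assumptions for illustration only: "a few dozen" taken as 36 structures,
# "six months" taken as 180 days.
experiment_structures = 36
experiment_days = 180

seconds_per_model_structure = model_minutes * 60 / model_structures
seconds_per_experiment_structure = (
    experiment_days * 24 * 3600 / experiment_structures
)
speedup = seconds_per_experiment_structure / seconds_per_model_structure

print(f"Model: {seconds_per_model_structure:.1f} s per structure")
print(
    "Experiment (assumed): "
    f"{seconds_per_experiment_structure / 86400:.1f} days per structure"
)
print(f"Rough speedup under these assumptions: {speedup:,.0f}x")
```

The model side works out to roughly 1.2 seconds per structure. The speedup figure depends entirely on the assumed experimental numbers and should be read as an order-of-magnitude estimate, not a measured result.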

Comparing over 2,000 AI-predicted structures with experimental data, the researchers found strong alignment between their model’s results and actual chromatin conformations. The model also successfully generalized to cell types it wasn’t specifically trained on, suggesting broad applicability in exploring chromatin structures across different cell types.
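The article does not say which metrics the researchers used for this comparison. As a rough sketch of one common way to compare two structures, the example below computes the pairwise distance map of each and then the Pearson correlation between the two maps. The coordinates and helper names are hypothetical, and the choice of Pearson correlation is an assumption, not the study's stated method.

```python
import math

def distance_map(coords):
    """Matrix of Euclidean distances between every pair of beads."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def upper_triangle(matrix):
    """Flatten the entries above the diagonal (each pair counted once)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical "predicted" and "experimental" structures (arbitrary units)
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
experimental = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.0, 0.9, 0.0), (0.1, 1.0, 0.0)]

similarity = pearson(
    upper_triangle(distance_map(predicted)),
    upper_triangle(distance_map(experimental)),
)
print(f"Distance-map correlation: {similarity:.2f}")
```

Comparing distance maps rather than raw coordinates has the advantage that the result does not change if one structure is rotated or shifted relative to the other.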

Future applications include studying how chromatin variations influence gene expression and investigating how DNA mutations affect chromatin conformation, potentially shedding light on disease mechanisms.

“There are many exciting questions we can address with this model,” Zhang said.

The researchers have made ChromoGen and its associated data publicly available, allowing scientists worldwide to explore genome organization and its role in health and disease.

Journal reference:

Zhang, B., et al. (2025) ChromoGen: Diffusion model predicts single-cell chromatin conformations. Science Advances. doi.org/10.1126/sciadv.adr8265
