Using artificial intelligence, scientists at Yale University, the Broad Institute of MIT and Harvard, and The Jackson Laboratory have created thousands of novel DNA switches that can precisely regulate a gene's expression in many cell types. Their method makes it possible to regulate gene expression in the body in previously unattainable ways, which could improve human health and medical research.
“What is special about these synthetically designed elements is that they show remarkable specificity to the target cell type they were designed for. This creates the opportunity for us to turn the expression of a gene up or down in just one tissue without affecting the rest of the body.”
Ryan Tewhey, Ph.D., Associate Professor and Study Co-Senior Author, The Jackson Laboratory
Scientists can now change the DNA inside living cells thanks to genome editing tools and other gene therapy techniques. However, it has been challenging to alter genes only in particular cell types or tissues rather than throughout the entire organism.
This is partly due to the ongoing difficulty of understanding the DNA switches, known as cis-regulatory elements (CREs), that control when genes are expressed and repressed.
According to research published in the journal Nature, Tewhey and his colleagues created novel, never-before-seen synthetic CREs that successfully activated genes in blood, liver, or brain cells without activating those genes in other cell types.
Tissue- and Time-Specific Instructions
Even though all of an organism's cells contain the same genes, not every gene is needed in every cell or at all times. For example, CREs help prevent genes necessary for early development from being activated in adults, and keep skin cells from using genes needed by the brain. CREs are distinct regulatory DNA sequences that are frequently found close to the genes they regulate; they are not part of the genes themselves.
Scientists have identified thousands of distinct CREs in the human genome, each with a slightly different function. However, it has been challenging to understand the syntax of CREs, “with no straightforward rules that control what each CRE does. This limits our ability to design gene therapies that only affect certain cell types in the human body,” explained Rodrigo Castro, Ph.D., a Computational Scientist in the Tewhey lab at JAX and Study Co-First Author.
“This project essentially asks the question: ‘Can we learn to read and write the code of these regulatory elements?’ If we think about it in terms of language, the grammar and syntax of these elements are poorly understood. And so, we tried to build machine learning methods that could learn a more complex code than we could do on our own.”
Steven Reilly, Ph.D., Assistant Professor of Genetics and Study Senior Author, Yale University
The team trained a deep learning model, a type of artificial intelligence (AI), on hundreds of thousands of DNA sequences from the human genome whose CRE activity they had measured in the lab in three cell types: blood, liver, and brain.
With the AI model, the researchers could predict the activity of any sequence among the nearly limitless possible combinations. By examining these predictions and finding new patterns in the DNA, they learned how the grammar of CRE sequences affects how much RNA is produced, which serves as a proxy for how strongly a gene is activated.
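To make the idea of feeding DNA to a model concrete, here is a minimal Python sketch of one-hot encoding, a common way to turn a DNA sequence into numbers for a sequence model. It is an illustration only: the article does not describe how the team's model represents its input, and the function name `one_hot` is an assumption made for this example.

```python
# Illustrative sketch only; not taken from the study.
BASES = "ACGT"

def one_hot(seq):
    """Return one row per base, with 1.0 in the column for that base (A, C, G, T)."""
    rows = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows

encoded = one_hot("ACGT")
# A model of the kind described above would map a matrix like this to
# three numbers: predicted activity in blood, liver, and brain cells.
```

Each row of the result marks which of the four bases occupies that position, so a sequence of any length becomes a grid of zeros and ones.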
The team, which included Pardis Sabeti, MD, DPhil, Co-Senior Author of the study, core institute member at the Broad Institute, and Professor at Harvard, developed a platform called CODA (Computational Optimization of DNA Activity). CODA efficiently designs thousands of entirely new CREs with desired characteristics, such as activating a specific gene in human liver cells but not in human blood or brain cells.
The researchers enhanced the program's capacity to forecast the biological impact of each CRE and made it possible to design unique CREs never before observed in nature by combining iterative “wet” and “dry” research methods. They first built and then validated computational models using experimental data.
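The design goal described above, high activity in one cell type and low activity in the others, can be sketched as a search problem. The Python example below is a toy under stated assumptions, not CODA itself: the motifs, the `toy_predict` stand-in for a trained model, the "target minus highest off-target" score, and the simple hill-climbing search are all hypothetical choices made for illustration.

```python
import random

BASES = "ACGT"

# Hypothetical stand-in for a trained model: each cell type responds
# to one made-up motif. Real regulatory grammar is far more complex.
TOY_MOTIFS = {"blood": "GATA", "liver": "TGAC", "brain": "CAGC"}

def toy_predict(seq):
    """Return a made-up activity per cell type (count of its motif)."""
    return {cell: float(seq.count(motif)) for cell, motif in TOY_MOTIFS.items()}

def specificity(seq, target):
    """Predicted target activity minus the highest off-target activity."""
    pred = toy_predict(seq)
    off_target = max(v for cell, v in pred.items() if cell != target)
    return pred[target] - off_target

def design(target, length=40, steps=2000, seed=0):
    """Hill climbing: mutate one base at a time, keep changes that do not hurt."""
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]
    best = specificity("".join(seq), target)
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(BASES)
        score = specificity("".join(seq), target)
        if score >= best:
            best = score      # accept the mutation
        else:
            seq[pos] = old    # revert the mutation
    return "".join(seq), best

liver_seq, liver_score = design("liver")
print(liver_seq, liver_score, toy_predict(liver_seq))
```

The score rewards a sequence only when it beats its strongest off-target prediction, which mirrors the stated aim of switching a gene on in one tissue without affecting the rest of the body.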
“Natural CREs, while plentiful, represent a tiny fraction of possible genetic elements and are constrained in their function by natural selection. These AI tools have immense potential for designing genetic switches that precisely tune gene expression for novel applications, such as biomanufacturing and therapeutics, that lie outside the scope of evolutionary pressures.”
Sager Gosai, Ph.D., Postdoctoral Fellow and Study Co-First Author, Broad Institute of MIT and Harvard
Pick-and-Choose Organ
Tewhey and his colleagues tested the novel AI-designed synthetic CREs by introducing them into cells and measuring how well they activated genes in the target cell type while avoiding gene expression in other cells. They found that the novel CREs were much more cell-type specific than naturally occurring CREs known to be associated with those cell types.
“The synthetic CREs semantically diverged so far from natural elements that predictions for their effectiveness seemed implausible. We initially expected many of the sequences would misbehave inside living cells,” said Gosai.
“It was a thrilling surprise to us just how good CODA was at designing these elements,” said Castro.
Tewhey and his colleagues investigated why the synthetic CREs outperformed naturally occurring CREs. They found that the cell-type-specific synthetic CREs contained combinations of sequences responsible for expressing genes in the target cell types, together with sequences that repressed, or turned off, the gene in off-target cell types.
Lastly, the team successfully tested a number of the synthetic CRE sequences in mice and zebrafish. For example, one CRE activated a fluorescent protein in developing zebrafish livers but not in any other part of the fish.
“This technology paves the way toward the writing of new regulatory elements with pre-defined functions. Such tools will be valuable for basic research but also could have significant biomedical implications where you could use these elements to control gene expression in very specific cell types for therapeutic purposes,” said Tewhey.
Journal reference:
Gosai, S. J., et al. (2024) Machine-guided design of cell-type-targeting cis-regulatory elements. Nature. doi.org/10.1038/s41586-024-08070-z.