AI Tool for Accelerating Biological Discovery

Brian Hie leads the Laboratory of Evolutionary Design at Stanford, focusing on the intersection of artificial intelligence and biology. Recently, Hie considered a compelling question: If a tool like ChatGPT can generate original sentences by analyzing patterns in vast collections of written text, what would happen if written words were substituted with genetic code?

Evo, a generative AI model that writes genetic code, is the answer to that seemingly simple question. Hie and colleagues from the Arc Institute and the University of California, Berkeley presented Evo in a study published in the journal Science.

According to Hie, scientists could use Evo to better understand how viral and microbial genomes function, design previously unimagined proteins, including potential drugs, and rewire microbes to perform useful tasks such as consuming microplastics in the oceans or enhancing photosynthesis to sequester carbon and increase crop yields.

“Instead of having to use brute force testing or mining promising sequences from nature, all of which are quite unpredictable, we now have an AI model for generating systems of interest, allowing researchers to focus only on the most promising possibilities. Evo puts the genomes of whole lifeforms within reach and accelerates the bioengineering design process.”

Brian Hie, Assistant Professor, Arc Institute, Stanford University

Evo may even result in new treatments, a better understanding of genetic illnesses, and a deeper comprehension of evolution itself, all accomplished on a computer rather than in a lab.

Natural Insight

Nature itself serves as the source of inspiration. DNA contains the instructions for all life. A better understanding of the intricate interactions between DNA, RNA, and proteins and how these interactions have changed over time will result in deeper knowledge and the capacity to rewire microbes into practical technologies.

However, things are not as simple as they appear. The genomes of even the most basic microorganisms contain millions of base pairs.

Two of Evo's main improvements over comparable existing tools are its resolution, which reaches the scale of individual nucleotides, the building blocks of DNA, and its “context window,” the length of sequence the model can process at once, which increases from about 8,000 base pairs to over 131,000 base pairs.
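To make the two ideas concrete, the following is a minimal Python sketch of single-nucleotide tokenization and of cutting a long sequence into context-window-sized pieces. It is an illustration only: the four-letter vocabulary, the function names, and the rounded window size of 131,000 are assumptions made for this example and are not taken from Evo's actual code.

```python
# Hypothetical vocabulary: one integer token per nucleotide.
# Evo's real tokenizer may differ; this only illustrates the idea.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}

# Approximate context window quoted in the article, in base pairs.
CONTEXT_WINDOW = 131_000


def tokenize(sequence: str) -> list[int]:
    """Map each nucleotide to one token (single-nucleotide resolution)."""
    return [VOCAB[base] for base in sequence.upper()]


def split_into_windows(tokens: list[int], window: int = CONTEXT_WINDOW) -> list[list[int]]:
    """Cut a token list into consecutive chunks no longer than `window`."""
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]


print(tokenize("ATGCGT"))                       # [0, 3, 2, 1, 2, 3]
print(len(split_into_windows([0] * 300_000)))   # 3
```

With one token per nucleotide, a 300,000-base-pair sequence does not fit in a single window of this size and has to be handled in three pieces, which is why a longer context window matters for genome-scale work.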

Evo was trained on 300 billion nucleotides drawn from 2.7 million prokaryotic and phage genomes, the genomes of 80,000 microbes, and smaller DNA loops called plasmids. To prevent Evo from being used to develop bioweapons, the team omitted the genomes of viruses known to infect humans and some other organisms.

According to Hie, Evo can produce DNA sequences of over a million base pairs, more than seven times the context window of 131,000 base pairs, and can learn how slight variations in nucleotide sequences impact the evolutionary fitness of entire organisms. The researchers point out that the smallest “minimal” bacterial genomes are roughly 580,000 base pairs long.
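The figures in this paragraph can be checked with a few lines of Python. The constants below are the rounded numbers quoted in the article, and the variable names are chosen only for this example.

```python
import math

# Rounded figures quoted in the article, in base pairs.
CONTEXT_WINDOW_BP = 131_000
GENERATED_BP = 1_000_000
MINIMAL_GENOME_BP = 580_000

# How many times longer a generated sequence is than one context window.
ratio = GENERATED_BP / CONTEXT_WINDOW_BP

# How many context windows are needed to span a minimal bacterial genome.
windows_for_minimal_genome = math.ceil(MINIMAL_GENOME_BP / CONTEXT_WINDOW_BP)

print(f"{ratio:.2f}")               # 7.63
print(windows_for_minimal_genome)   # 5
```

A million-base-pair output is therefore about 7.6 context windows long, and it exceeds the length of a roughly 580,000-base-pair minimal bacterial genome.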

Proof of Concept

As a proof of concept of Evo's design capabilities, Hie and colleagues prompted Evo to generate novel synthetic CRISPR-Cas molecular complexes and systems. CRISPR-Cas systems resemble tiny molecular machines in which proteins and RNA work together to modify DNA.

In response to that prompt, Evo generated a fully functional, previously unknown CRISPR system, which was validated after 11 candidate designs were tested. According to Hie, Evo's CRISPR work is the first instance of simultaneous protein-RNA codesign with a language model.

Hie is already working on extending his research beyond the microbial world to human and other genomes, improving Evo's capacity to process larger genomic sequences, and gaining more control over its outputs.

“Evo opens up a lot of very interesting research at the intersection of machine learning and biology. It creates opportunities for discoveries that were previously unimaginable and accelerates our ability to engineer life itself.”

Brian Hie, Assistant Professor, Arc Institute, Stanford University

Journal reference:

Nguyen, E., et al. (2024) Sequence modeling and design from molecular to genome scale with Evo. Science. doi.org/10.1126/science.ado9336.
