Study develops a memory-efficient approach to visualize genomes

De Gruyter, Apr 27 2020

Researchers at Ulm University in Germany have devised a new method for constructing pan-genome subgraphs at various granularities without waiting hours or days for software to process the complete genome.

With the method developed by Kadir Dede and Dr. Enno Ohlebusch, researchers can now create visualizations of pan-genomes at various scales much faster.

The study, titled “Dynamic construction of pan-genome subgraphs,” has been published in De Gruyter’s open access journal Open Computer Science. To analyze specific parts of a genome, researchers need to visualize the sections under investigation, which consumes a great deal of processing power and time.

The Computational Pan-Genomics Consortium urges researchers to make sure that all information in a data structure is easily accessible to the human eye through visualization support at various scales. But a pan-genome graph can include thousands to millions of nodes, far too many to take in at a glance.

In an experiment, Dede and Ohlebusch used 10 human genomes to build a graph that includes a portion of the large repetitive central exon of the human MUC5AC gene. Previously, researchers had to build an entire index structure of the genomes first, which requires around 8.5 hours and 38.5 GB of memory.

With Dede and Ohlebusch’s method, researchers only have to compute two bit-vectors (on which the subgraph construction is based) and the subgraph that contains the reference path and its neighborhood.
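To make the idea of "a reference path and its neighborhood" concrete, the following is a minimal sketch on a toy graph. It does not reproduce the bit-vector-based construction from the paper. It simply collects every node within a chosen number of hops of the reference path using breadth-first search. The function name `neighborhood_subgraph`, the `radius` parameter, and the toy node labels are all hypothetical and chosen only for illustration.

```python
from collections import deque


def neighborhood_subgraph(adj, reference_path, radius):
    """Return (nodes, edges) within `radius` hops of the reference path.

    `adj` maps each node to a list of successor nodes (directed edges).
    Distance is measured ignoring edge direction, so branches that leave
    the path and branches that rejoin it are both picked up.
    Illustrative only: this is plain BFS, not the authors' algorithm.
    """
    # Build an undirected view of the graph for the distance computation.
    undirected = {}
    for u, successors in adj.items():
        undirected.setdefault(u, set())
        for v in successors:
            undirected[u].add(v)
            undirected.setdefault(v, set()).add(u)

    # Multi-source BFS: every node on the reference path starts at distance 0.
    dist = {node: 0 for node in reference_path}
    queue = deque(reference_path)
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested radius
        for nxt in undirected.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    nodes = set(dist)
    # Keep only the original directed edges whose endpoints both survived.
    edges = {(u, v) for u in nodes for v in adj.get(u, ()) if v in nodes}
    return nodes, edges


# Toy graph: reference path A -> B -> C -> D, a bubble via X, a side chain via P.
toy_graph = {
    "A": ["B", "X"],
    "B": ["C"],
    "X": ["C"],
    "C": ["D", "P"],
    "P": ["Q"],
    "Q": ["R"],
    "D": [],
    "R": [],
}
reference = ["A", "B", "C", "D"]

nodes, edges = neighborhood_subgraph(toy_graph, reference, radius=1)
print(sorted(nodes))
```

Raising `radius` widens the view around the path and lowering it narrows the view, which is one simple way to think about inspecting a region "at various granularities."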

Instead of taking more than eight hours, the software built the subgraph in only 24.5 minutes, of which around 10 minutes went to computing the bit-vectors, and it needed only 39.6 GB of main memory. The subgraph itself needed just 15 KB of memory.
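As a quick check on the running times quoted above, the short calculation below works out the implied speed-up. It uses only the figures given in the article (8.5 hours, 24.5 minutes, and about 10 minutes for the bit-vectors); the variable names are illustrative.

```python
# Running times as quoted in the article.
full_index_minutes = 8.5 * 60   # building the entire index structure
subgraph_minutes = 24.5         # building the subgraph, bit-vectors included
bitvector_minutes = 10.0        # approximate share spent on the bit-vectors

speedup = full_index_minutes / subgraph_minutes
remaining_minutes = subgraph_minutes - bitvector_minutes

print(f"Speed-up over the full index build: about {speedup:.1f}x")
print(f"Time spent beyond the bit-vectors: about {remaining_minutes:.1f} minutes")
```

On these numbers the subgraph is ready roughly 20 times sooner than the full index, and a little under 15 minutes of the total is spent on steps other than computing the bit-vectors.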

“Based on solid theory, Dede and Ohlebusch present a new method for the flexible and efficient exploration of suspicious genomic regions, highlighting for example pathogenic genes that distinguish new variants of a virus from all previously known genomes.”

Dr Jens Stoye, Professor and Head of the Genome Informatics Team, Faculty of Technology, Bielefeld University

Source:

De Gruyter

Journal reference:

Dede, K & Ohlebusch, E (2020) Dynamic construction of pan-genome subgraphs. Open Computer Science. doi.org/10.1515/comp-2020-0018.
