Genomics, transcriptomics, proteomics, metabolomics, and others: the "omics" technologies keep revolutionizing medicine and the biomedical sciences.
They generate large amounts of valuable data but pose almost as many challenges. How can individual research institutions store such huge volumes of data? How can they safely share them with other national and international centers? Can they ensure that the privacy of patients is always respected? Can different research centers draw on their collective experience in data analysis?
Is it possible to establish a platform that offers all of the needed infrastructure and resources? The German Human Genome-Phenome Archive (GHGA), one of the nine consortia funded by the German Research Foundation (DFG) within the National Research Data Infrastructure (NFDI) initiative, was built specifically to answer these questions.
The DRESDEN-concept Genome Center is one of four DFG-funded next-generation sequencing (NGS) competence centers that now work within the GHGA consortium. Over the next five years, scientists from Dresden, together with their GHGA colleagues, will help shape international standards for the exchange of human omics data. The initial focus is on developing, harmonizing, and optimizing data collection processes. Metadata, i.e., additional information about the sequencing data, is a key term here.
"We need to establish good foundations, but it is a nuanced task. On the one hand, we would like to have as much information as possible to ensure that datasets are comprehensive and can be analyzed for as many purposes as possible. But at the same time, we have to ensure that the privacy of patients is still the number one priority."
Mathias Lesche, Researcher, DRESDEN-concept Genome Center
To address the legal and ethical questions, researchers are working closely with lawyers who specialize in national and international privacy regulations.
The Center for Information Services and High Performance Computing (ZIH) of the TU Dresden, together with four other German High Performance Computing (HPC) centers, has taken on the task of developing the technical infrastructure.
"We will use state-of-the-art HPC, Cloud and storage technologies to build a distributed infrastructure accessible for all interested researchers and clinicians. The platform will integrate the ability to access, analyze, manage, and archive the human omics data," says Dr. Ralph Mueller-Pfefferkorn, head of the department for Distributed and Data Intensive Computing at ZIH.
The goal is to follow the so-called FAIR guiding principles and establish a framework in which data is findable, accessible, interoperable, and reusable. "We want to provide a comprehensive set of tools for data management and processing and allow researchers to reuse each other's data. We also want to be a connection hub that will foster novel initiatives and let researchers and clinicians set up new collaborations," adds Dr. Mueller-Pfefferkorn.
"We are very pleased that we have the TU Dresden on board as part of the GHGA. Our consortium is formed by leading centers for genome and HPC research in Germany. Together we want to maximize our potential by bringing together genome datasets in a standardized way," says Prof. Oliver Stegle from the German Cancer Research Center (DKFZ) in Heidelberg, who acts as speaker of the GHGA Board of Directors. As part of the NFDI initiative, GHGA will collaborate with other consortia focused on chemistry, biology, and other disciplines to establish a Germany-wide interdisciplinary data infrastructure for research. In Dresden, the operative phase of the project will start on March 1st, 2021.