A new method developed at the University of Bergen lets researchers refine protein analysis by taking population diversity into account. The study is published in the journal Nature Methods.
In cohort studies, protein sequences are analyzed by comparing participant data against protein sequences predicted from the human genome.
– Today, the same reference proteins are used for all participants, says Associate Professor Marc Vaudel of the Department of Clinical Science at the University of Bergen – but we are all different! We found that the small genetic changes that make us who we are create a bias: for those who differ from the reference, current informatic methods are blind to parts of their proteins.
To solve this problem, the researchers in Bergen developed new models to build sequences from large genetic panels. Their new method captures the protein sequences that are most likely to be observed in the population.
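The bias described above can be illustrated with a toy sketch. It is not the researchers' implementation: the protein sequences, the variant position, and the helper names (`tryptic_peptides`, `apply_substitution`) are invented for illustration, and exact string matching stands in for a real peptide search engine. It only shows why a peptide carrying a genetic variant goes unmatched in a reference-only database, and how adding a variant-containing sequence recovers it.

```python
# Toy illustration only: sequences, variant, and helper names are hypothetical.

def tryptic_peptides(protein: str, min_len: int = 5) -> set[str]:
    """Cut after every K or R (simplified trypsin rule, no proline exception)."""
    peptides, start = set(), 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            if i + 1 - start >= min_len:
                peptides.add(protein[start:i + 1])
            start = i + 1
    if len(protein) - start >= min_len:  # trailing piece without K/R at the end
        peptides.add(protein[start:])
    return peptides


def apply_substitution(protein: str, pos: int, ref: str, alt: str) -> str:
    """Apply a single amino-acid substitution at a 1-based position."""
    if protein[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}")
    return protein[:pos - 1] + alt + protein[pos:]


# Invented reference protein and an invented variant (A -> T at position 18).
reference = "MSLTAEKGVDPLNRQWEAVTFGHKLLSPEDIR"
haplotype = apply_substitution(reference, 18, "A", "T")

# Peptides assumed to be observed in a participant who carries the variant.
observed = {"MSLTAEK", "QWETVTFGHK", "LLSPEDIR"}

reference_db = tryptic_peptides(reference)
extended_db = reference_db | tryptic_peptides(haplotype)

missed_with_reference = observed - reference_db
missed_with_extended = observed - extended_db

print("Missed with reference only:", sorted(missed_with_reference))
print("Missed with variant sequences added:", sorted(missed_with_extended))
```

In this sketch the variant peptide is invisible to the reference-only search, while the extended database matches every observed peptide.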
– Accounting for greater diversity among cohort participants is key to tailoring medical research to individual profiles, adds PhD candidate Jakub Vašíček, who designed the method.
However, many groups of people are not well represented in genetic panels, and diversity is challenging to integrate into current data models.
– The road is still long before we can treat all patients fairly, but we are making good progress, Vašíček adds.
The new method has just been published in the prestigious journal Nature Methods.
Journal reference:
Vašíček, J., et al. (2024). ProHap enables human proteomic database generation accounting for population diversity. Nature Methods. doi.org/10.1038/s41592-024-02506-0.