Scientists from the University of Oxford’s Big Data Institute have taken a major step towards mapping the entirety of genetic relationships among humans: a single genealogy that traces the ancestry of everyone.
Human migration out of Africa. Image Credit: Wohns, et al., Science (2022)
The last twenty years have seen extraordinary advances in human genetic research, producing genomic data for hundreds of thousands of people, including thousands of prehistoric people. This raises the exciting possibility of tracing the origins of human genetic diversity to produce a complete map of how people around the world are related to each other.
Until now, the main challenges to this goal were finding a way to combine genome sequences from many different databases and developing algorithms that can handle data of this size. A new method published by scientists from the University of Oxford’s Big Data Institute, however, can easily combine data from multiple sources and scale to accommodate millions of genome sequences.
“We have basically built a huge family tree, a genealogy for all of humanity that models as exactly as we can the history that generated all the genetic variation we find in humans today. This genealogy allows us to see how every person's genetic sequence relates to every other, along all the points of the genome.”
Dr Yan Wong, Evolutionary Geneticist, Big Data Institute, University of Oxford
Dr Wong is also one of the principal authors of this study.
Because each individual genomic region is inherited from only one parent, either the mother or the father, the ancestry of each point on the genome can be thought of as a tree. The set of trees, called a “tree sequence” or “ancestral recombination graph,” links genetic regions back through time to the ancestors in which the genetic variation first appeared.
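The idea of one tree per stretch of genome can be illustrated with a toy example. The following is a minimal sketch only, not the authors' implementation: the class `LocalTree`, the helper functions, and the node numbers are all hypothetical, and the data are invented for illustration. Samples 0, 1 and 2 stand for present-day genomes, while nodes 3, 4 and 5 stand for inferred ancestors.

```python
from dataclasses import dataclass


@dataclass
class LocalTree:
    """Ancestry of one genomic interval, stored as child -> parent links."""
    left: int     # start of the interval (inclusive)
    right: int    # end of the interval (exclusive)
    parent: dict  # maps each child node to its parent node


def tree_at(trees, position):
    """Return the local tree covering a given genome position."""
    for tree in trees:
        if tree.left <= position < tree.right:
            return tree
    raise ValueError(f"position {position} is not covered by any tree")


def lineage(tree, node):
    """List the node followed by its ancestors, up to the root."""
    path = [node]
    while path[-1] in tree.parent:
        path.append(tree.parent[path[-1]])
    return path


def mrca(tree, a, b):
    """Most recent common ancestor of nodes a and b in one local tree."""
    ancestors_of_a = set(lineage(tree, a))
    for node in lineage(tree, b):
        if node in ancestors_of_a:
            return node
    return None


# Invented toy data: two adjacent intervals with different ancestry.
TREES = [
    LocalTree(0, 50, {0: 3, 1: 3, 2: 4, 3: 4}),
    LocalTree(50, 100, {0: 5, 1: 3, 2: 3, 3: 5}),
]

if __name__ == "__main__":
    for position in (10, 75):
        tree = tree_at(TREES, position)
        print(position, lineage(tree, 1), mrca(tree, 0, 1))
```

In this sketch, samples 0 and 1 share ancestor 3 at position 10 but only meet at ancestor 5 at position 75, which is how the relationship between two people can differ from one point on the genome to the next.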
“Essentially, we are reconstructing the genomes of our ancestors and using them to form a vast network of relationships. We can then estimate when and where these ancestors lived. The power of our approach is that it makes very few assumptions about the underlying data and can also include both modern and ancient DNA samples.”
Dr Anthony Wilder Wohns, Study Lead author and Postdoctoral Researcher, Broad Institute of MIT and Harvard
He carried out this research as part of his PhD at the Big Data Institute.
The research combined data on modern and prehistoric human genomes from eight different databases and included a total of 3,609 individual genome sequences from 215 populations.
The prehistoric genomes included samples found around the world, with ages ranging from thousands of years to over 100,000 years. The algorithms predicted where common ancestors must be present in the evolutionary trees to explain the patterns of genetic variation. The resulting network comprised around 27 million ancestors.
After adding location data for these sample genomes, the authors used the network to estimate where the predicted common ancestors had lived. The findings successfully recaptured key events in human evolutionary history, including the migration out of Africa.
The genealogical map is already a very rich resource; however, the research team plans to make it even more comprehensive by continuing to add genetic data as it becomes available. Since tree sequences store data in a highly efficient way, the dataset can easily accommodate millions of additional genomes.
Dr Yan Wong added, “This study is laying the groundwork for the next generation of DNA sequencing. As the quality of genome sequences from modern and ancient DNA samples improves, the trees will become even more accurate and we will eventually be able to generate a single, unified map that explains the descent of all the human genetic variation we see today.”
“While humans are the focus of this study, the method is valid for most living things, from orangutans to bacteria. It could be particularly beneficial in medical genetics, in separating out true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history.”
Dr Anthony Wilder Wohns, Study Lead author and Postdoctoral Researcher, Broad Institute of MIT and Harvard
Journal reference:
Wohns, A. W. et al. (2022) A unified genealogy of modern and ancient genomes. Science. doi.org/10.1126/science.abi8264.