What is Ancient DNA (aDNA), and what is the importance of accessing it?
Ancient DNA research emerged in 1984, when DNA fragments from the quagga, an extinct zebra, were extracted, sequenced, and analyzed. For the first time, researchers could examine the genetic information of an extinct animal at the molecular level to understand its evolution.
Since then, major advances have made it possible to analyze samples that are half a million years old or older. Genomes have been sequenced from many extinct animals and ancient humans, including woolly mammoths, Neanderthals, and Vikings.
The study of aDNA is important because it not only allows extinct species to be identified but also provides a greater understanding of the environments these species lived in and of how they evolved toward their modern relatives. This has become possible through the development of next-generation sequencing and computational biology.
Without computational biology and bioinformatics, it would be extremely difficult to process and analyze aDNA to identify its origins and conduct further research. This article will discuss some examples of the role computational biology/bioinformatics plays in accessing aDNA.
How is aDNA accessed and analyzed?
The first aDNA sample was analyzed using bacterial cloning, in which the extracted DNA fragments were inserted into bacteria and amplified as the bacteria multiplied. The next milestone in accessing aDNA came with the development of the polymerase chain reaction (PCR), which allowed the routine amplification of DNA.
Alongside PCR, Sanger sequencing was the standard method for determining the order of nucleotides in DNA. Its major limitation was that only a small amount of DNA could be processed at a time. The early 21st century saw the development of next-generation sequencing (NGS), which revolutionized the study of aDNA by greatly increasing the number of DNA bases that can be sequenced and reducing the cost of sequencing.
How can computational biology/bioinformatics be used to study aDNA?
Since the first aDNA sample was extracted and analyzed, technological advances have enabled ever larger amounts of aDNA to be studied, processed, and analyzed. This is largely due to NGS and bioinformatics platforms, which allow large numbers of samples to be stored and accessed in order to reveal patterns of genetic variation through history. These systems support high-throughput sequencing and data analysis.
Additionally, the study of aDNA has become so vast that it requires a great collaborative effort between aDNA laboratories across the globe and careful coordination of their samples. This reinforces the importance of computational/bioinformatic systems, as the easiest and most efficient way to access records and samples is digital. Computational tools such as CASCADE and mapDATAge offer new ways of visualizing and studying aDNA.
The use of CASCADE and mapDATAge to access aDNA
Custom-made Archiving System for the Conservation of Ancient DNA Experimental data (CASCADE)
CASCADE is a laboratory information management system (LIMS) that makes the study of aDNA more automated and makes collaboration and the coordination of samples more efficient. The platform was developed to meet the increased capacity of sample processing, allowing multiple steps of the experimental procedure to be recorded and tracked.
CASCADE enables users to track the creation, amplification, and sequencing of aDNA libraries on a single platform. It is expected to improve how aDNA is conserved, shared, and traced, allowing wider accessibility worldwide.
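To make the idea of tracing a library back to its source concrete, the following is a minimal Python sketch of how a LIMS might link records. It is not CASCADE's actual data model: the Record and Registry classes, the record kinds, and the identifiers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Record:
    """One tracked item: a sample, extract, library, or sequencing run."""
    record_id: str
    kind: str                          # hypothetical labels, e.g. "sample", "library"
    parent_id: Optional[str] = None    # the record this one was derived from
    notes: Dict[str, str] = field(default_factory=dict)


class Registry:
    """In-memory stand-in for a LIMS database (hypothetical, not CASCADE's schema)."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def add(self, record: Record) -> Record:
        # Reject duplicate ids and links to parents that were never registered.
        if record.record_id in self._records:
            raise ValueError(f"duplicate id: {record.record_id}")
        if record.parent_id is not None and record.parent_id not in self._records:
            raise KeyError(f"unknown parent: {record.parent_id}")
        self._records[record.record_id] = record
        return record

    def lineage(self, record_id: str) -> List[str]:
        """Walk parent links back to the original sample, oldest first."""
        chain: List[str] = []
        current: Optional[str] = record_id
        while current is not None:
            record = self._records[current]
            chain.append(f"{record.kind}:{record.record_id}")
            current = record.parent_id
        return list(reversed(chain))


# Made-up identifiers, for illustration only.
registry = Registry()
registry.add(Record("S1", "sample", notes={"material": "bone"}))
registry.add(Record("E1", "extract", parent_id="S1"))
registry.add(Record("L1", "library", parent_id="E1"))
registry.add(Record("R1", "sequencing_run", parent_id="L1"))

print(registry.lineage("R1"))
# ['sample:S1', 'extract:E1', 'library:L1', 'sequencing_run:R1']
```

Storing only a parent link on each record keeps every experimental step traceable to the sample it came from, which is the kind of traceability a shared system needs when several laboratories handle the same material.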
mapDATAge
As mentioned, the study of aDNA has been revolutionized by next-generation technologies, which identify patterns of genetic variation from genetic markers, single nucleotide polymorphisms (SNPs), and entire genomes. This allows researchers to reconstruct the environments of the planet's previous inhabitants and to study aspects such as migration.
Although these patterns could be identified, there was no way to visualize them on a geographical scale through space and time until the development of mapDATAge. This software allows users to explore geographical patterns in aDNA, such as the distribution of different alleles or ancestry, by setting specific geographic and temporal parameters.
In one example of its application, three aDNA datasets with geographic and time-stamped data were prepared. The first covered lactose tolerance in ancient and modern Europeans, including the frequency of an allele that causes it and other related information. The second and third covered, respectively, the genetic profiles of ancient horses and alleles from previously published ancient horse data at genetic loci responsible for movement, stature, and coat color phenotypes.
After the type of visualization, the time period, and the geographical area of interest were chosen, a panel/map was created to show the change in horse genomics and the expansion of certain genes. The software therefore plays an important role in accessing aDNA, as visualizing genomic changes through space and time provides deeper insight into how past genetic changes have shaped the modern-day world.
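The kind of query described above, restricting samples to a geographic and temporal window and then following an allele through time, can be sketched in a few lines of Python. This is not mapDATAge's code or interface (mapDATAge is an R Shiny package). The Sample fields, the function name, the toy data, and the assumption of two allele copies per individual are all illustrative.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Sample(NamedTuple):
    sample_id: str
    lat: float
    lon: float
    age_bp: int        # age in years before present
    alt_alleles: int   # copies of the allele of interest (0, 1, or 2)


def allele_frequency_by_bin(
    samples: List[Sample],
    lat_range: Tuple[float, float],
    lon_range: Tuple[float, float],
    age_range: Tuple[int, int],
    bin_size: int,
) -> Dict[int, float]:
    """Allele frequency per time bin for samples inside the chosen window."""
    counts = defaultdict(lambda: [0, 0])  # bin start -> [allele copies, total copies]
    for s in samples:
        in_space = (lat_range[0] <= s.lat <= lat_range[1]
                    and lon_range[0] <= s.lon <= lon_range[1])
        in_time = age_range[0] <= s.age_bp <= age_range[1]
        if not (in_space and in_time):
            continue
        bin_start = (s.age_bp // bin_size) * bin_size
        counts[bin_start][0] += s.alt_alleles
        counts[bin_start][1] += 2  # assumption: diploid calls, two copies each
    return {b: alt / total for b, (alt, total) in sorted(counts.items())}


# Invented toy data, not taken from any published dataset.
samples = [
    Sample("A", 50.0, 10.0, 5200, 0),
    Sample("B", 48.5, 8.0, 5800, 1),
    Sample("C", 52.0, 12.0, 3100, 1),
    Sample("D", 51.0, 9.5, 3900, 2),
    Sample("E", 30.0, 100.0, 3500, 2),  # outside the geographic window
    Sample("F", 49.0, 11.0, 900, 2),
]

freqs = allele_frequency_by_bin(
    samples, lat_range=(40.0, 60.0), lon_range=(0.0, 20.0),
    age_range=(0, 6000), bin_size=2000,
)
print(freqs)
# {0: 1.0, 2000: 0.75, 4000: 0.25}
```

In this toy example the allele rises from 25% in the oldest bin to 100% in the most recent one. Real aDNA genotypes are often low-coverage, so a real analysis would need to handle missing data and uncertain calls rather than assume clean diploid genotypes.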
Computational biology/bioinformatics has paved a new path in aDNA research, making it more accessible and automated, and has improved the accuracy of results. However, various issues still arise when research becomes automated. NGS generates large amounts of sequence data, within which the relevant DNA/genes must first be identified.
For example, when aDNA from bacteria is studied, the relevant DNA is identified by comparing the sequences against the genome of a closely related species and against microbial sequences in a database, and looking for similarities.
The problem arises when similarity is not detected, or is detected incorrectly, because the aDNA is damaged, resulting in unreliable matches. Additionally, the microbial sequences in the database may not be similar enough to the extinct species to allow confident matching and detection. Despite these issues, computational tools/bioinformatics greatly aid in accessing aDNA and furthering the field. To continue these advances, platforms must become more user-friendly and accessible so that they help maintain a collaborative environment across the world.
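A small Python sketch can illustrate how damage lowers measured similarity. The article does not name the type of damage. The sketch assumes the commonly described pattern of cytosine deamination, which makes a reference C appear as T in a read (and G appear as A on the opposite strand). The function name, the sequences, and the simple gap-free comparison are illustrative assumptions, not the method of any particular alignment tool.

```python
def identity(read: str, reference: str, tolerate_deamination: bool = False) -> float:
    """Fraction of matching positions between two equal-length, gap-free sequences.

    With tolerate_deamination=True, a reference C read as T and a reference G
    read as A are counted as matches (assumed damage pattern, see lead-in).
    """
    if len(read) != len(reference):
        raise ValueError("sequences must have the same length")
    if not read:
        raise ValueError("sequences must not be empty")
    damage_pairs = {("C", "T"), ("G", "A")}  # (reference base, read base)
    matches = 0
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base == read_base:
            matches += 1
        elif tolerate_deamination and (ref_base, read_base) in damage_pairs:
            matches += 1
    return matches / len(read)


# Invented sequences: the read differs from the reference only by
# C->T changes at its start and G->A changes at its end.
reference = "CCGATTACAGGTCAAGTCGG"
read      = "TTGATTACAGGTCAAGTCAA"

print(identity(read, reference))                             # 0.8
print(identity(read, reference, tolerate_deamination=True))  # 1.0
```

A strict comparison scores this read at 80% identity, which could fall below a matching threshold, while the damage-tolerant comparison recovers a full match. The trade-off is lower specificity: treating these substitutions as matches also hides genuine C/T and G/A differences between species, so this is a sketch of the problem rather than a complete solution.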