The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change

Keywords

Time series
Covariances
Selection
Admixture
Human evolution
Ancient DNA
Authors

Simon, A.

Coop, G.

DOI

10.1073/pnas.2312377121

Citation (APA 7)

Simon, A., & Coop, G. (2024). The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change. Proceedings of the National Academy of Sciences, 121(9), e2312377121. https://doi.org/10.1073/pnas.2312377121

Abstract

Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 years, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus, despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.
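
The decomposition described in the abstract rests on a general identity: the variance of the total allele frequency change across loci equals the sum of the per-interval variances plus the covariances between intervals. The sketch below illustrates only that identity on a toy frequency matrix. It is not the authors' estimator: the function names (`temporal_covariances`, `covariance_share`) and the toy frequencies are hypothetical, and the sketch ignores sampling noise and the admixture correction that a real ancient DNA analysis would need.

```python
from statistics import fmean, pvariance


def temporal_covariances(freqs):
    """Covariance matrix of allele frequency changes between time intervals.

    freqs[t][l] is the allele frequency at time point t for locus l
    (hypothetical input layout chosen for this sketch).
    """
    # Allele frequency change per locus in each interval between samples.
    deltas = [
        [later - earlier for earlier, later in zip(freqs[t], freqs[t + 1])]
        for t in range(len(freqs) - 1)
    ]
    means = [fmean(d) for d in deltas]
    n_loci = len(freqs[0])

    def cov(i, j):
        # Population covariance across loci (divides by n_loci).
        return sum(
            (a - means[i]) * (b - means[j])
            for a, b in zip(deltas[i], deltas[j])
        ) / n_loci

    k = len(deltas)
    return [[cov(i, j) for j in range(k)] for i in range(k)]


def covariance_share(cov_matrix):
    """Return total variance and the fraction carried by off-diagonal terms."""
    total = sum(sum(row) for row in cov_matrix)
    diagonal = sum(cov_matrix[i][i] for i in range(len(cov_matrix)))
    return total, (total - diagonal) / total


# Toy data (made up for illustration): 3 time points, 4 loci.
freqs = [
    [0.10, 0.50, 0.30, 0.80],
    [0.15, 0.45, 0.35, 0.70],
    [0.22, 0.40, 0.33, 0.65],
]

cov_matrix = temporal_covariances(freqs)
total, share = covariance_share(cov_matrix)

# Variance of the net change between the first and last time points.
net_change = [last - first for first, last in zip(freqs[0], freqs[-1])]
net_variance = pvariance(net_change)

print("total variance from covariance matrix:", total)
print("variance of net change:", net_variance)
print("fraction due to between-interval covariances:", share)
```

The diagonal entries are the per-interval variances and the off-diagonal entries are the between-interval covariances. Under pure drift in a closed population the off-diagonal terms are expected to be zero, which is why a nonzero covariance share is informative there; with gene flow, as the abstract notes, covariances can arise without selection, so this share cannot be read as a selection signal on its own.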