The genetics of human population history: men versus women

Gone are the days when peering back into human history was limited to archaeological finds. Now the field of evolutionary genetics brings new insights into human population history through examining mitochondrial and Y chromosome DNA, providing clues to our maternal and paternal past, respectively. Subsequent comparisons of maternal and paternal population histories have elucidated the genetic impact of sex-biased processes, such as migration. However, the robustness of these comparisons has been questioned because different molecular methods are used to generate sequences from mitochondrial versus Y chromosome DNA. In a recent study in Investigative Genetics, Mark Stoneking from the Max Planck Institute for Evolutionary Anthropology, Germany, and colleagues present a simple method to remove this bias in sequence generation. Here Stoneking explains where this bias comes from, what their results revealed about the relative genetic contributions of men and women to human populations, and how their approach will impact future studies into human demography.

What led to your interest in human evolutionary genetics?

I started at university as an undeclared major, having no idea what I might want to study or do with my life. I took a few courses in anthropology, got interested in that, and through anthropology got exposed to human population genetics. That in turn got me interested in genetics, and so I went to Penn State University to pursue a master’s degree in genetics and thereby learn more about some of the different areas encompassed by genetics. I ended up working on the evolutionary genetics of salmonid fish, using protein electrophoresis to study genetic variation, which I thought was very cool. At about the time I was finishing my master’s degree, the first mitochondrial DNA (mtDNA) studies were coming out, and that to me seemed like the up-and-coming thing, so I went to Berkeley, to Allan Wilson’s lab, to learn how to work with mtDNA.

At the time I didn’t really care what organism I worked with, I just wanted to learn about mtDNA; of the various projects that Allan had going on with mtDNA, following up on work that another graduate student, Becky Cann, had initiated on human mtDNA variation seemed most interesting to me. As I worked on human mtDNA variation, my interest in anthropology was re-awakened, and I came to realise that when it comes to evolutionary genetics, humans are an ideal organism as there is so much other information from archaeology, fossils, and linguistics that we can tap into for new ideas to test and for comparison with the genetic results. Plus, a lot more people care about human evolution than about salmonid fish evolution! And so it was as a grad student that I decided I would stay in human evolutionary genetics.

What did you aim to learn from your recent study in Investigative Genetics?

The main goal of this study was to develop and test an approach for generating unbiased Y chromosome sequences for demographic analysis. The usual way people analyse human Y chromosome variation is to genotype particular single nucleotide polymorphisms (SNPs) of interest, sometimes in combination with Y chromosome short tandem repeat (Y-STR) loci. While this is a very informative approach for gaining insights into Y chromosome variation, it is heavily biased (you only find out about the SNPs you genotype), and the data are not amenable to making demographic inferences. So we wanted to take advantage of the next-generation sequencing platforms to develop a rapid and efficient approach for obtaining unbiased Y chromosome sequence information – in the same way that we routinely obtain unbiased mtDNA sequence information – that can then be used to make inferences about divergence times, population size changes, etc.

In your study, what was the benefit of using both Y chromosome DNA and mitochondrial DNA?

The Y chromosome is paternally inherited, and so tells us about the paternal history of humans. MtDNA is maternally inherited, and so tells us about the maternal history of humans. When you compare the two, sometimes they tell you the same thing, but often they don’t, because males and females are often doing different things when it comes to migration, reproduction, etc. So comparing Y chromosome to mtDNA variation is a powerful approach for learning about sex-biased processes during human evolution.

Your study investigates the effective population size of out-of-Africa migrations and female-to-male ratios. What is the difference between an effective population size and an actual population size?

Evolution doesn’t care about you, it only cares about your genes and what you pass on to your offspring. Similarly, population geneticists don’t care about the number of people in a population, they care about how much genetic variation there is, and how much of that variation is likely to be transmitted to the next generation. The effective population size is a sort of imaginary number that is based on genetic variation: it is the size of an imaginary population, in which everyone has the same chances of reproducing, that has the same amount of genetic variation as the actual population. And since in reality not everyone has the same chance of reproducing, the effective population size is invariably smaller than the actual population size.
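To make the gap between actual and effective size concrete, here is a small illustrative sketch. It uses Wright's textbook formula for a population with unequal numbers of breeding males and females at autosomal loci, Ne = 4·Nm·Nf / (Nm + Nf). This formula is not taken from the study, which works with mtDNA and Y chromosome data; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def effective_size_unequal_sexes(n_males: float, n_females: float) -> float:
    """Wright's textbook approximation for autosomal loci:
    Ne = 4 * Nm * Nf / (Nm + Nf), where Nm and Nf are the numbers
    of breeding males and breeding females."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both breeding counts must be positive")
    return 4.0 * n_males * n_females / (n_males + n_females)


# Hypothetical population: 1000 adults (500 men, 500 women),
# but only 100 of the men father children.
census_size = 1000
ne = effective_size_unequal_sexes(n_males=100, n_females=500)
print(f"census size = {census_size}, effective size = {ne:.1f}")
```

When every adult breeds and the sexes are balanced (500 and 500), the same formula returns the census size of 1000, so the reduction in this example comes entirely from unequal chances of reproducing.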

How did you go about calculating effective population size?

There are a number of approaches one can use, and we used two different methods. The first, called a Bayesian skyline plot, is a curve that shows changes in population size over time. The basic idea is that if you have a tree relating your sequences (which is relatively straightforward to produce from mtDNA or Y chromosome sequences), and if you have dates for the branching events in your tree (which you can get, with a lot of assumptions, using a molecular clock approach), then you can move across the tree, and at each time point you can use the number of different lineages to estimate how much variation there was at that time, which in turn can be used to estimate the effective population size.
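The idea of reading population size off a dated tree can be sketched with the simpler, non-Bayesian "classic" skyline estimator. This is a swapped-in simplification, not the Bayesian skyline plot used in the study, which additionally groups intervals and averages over uncertainty in the tree and dates. The sketch assumes a haploid, uniparentally inherited marker, all sequences sampled at the same time, and coalescent (branching) times measured in generations; the function name and the example times are hypothetical.

```python
from math import comb


def classic_skyline(coalescent_times, n_samples):
    """Return a list of (interval_start, interval_end, estimated_size).

    coalescent_times: ages (generations before present) of the n_samples - 1
    branching events in the tree. While k lineages are present, the expected
    waiting time to the next coalescence is Ne / C(k, 2) under the haploid
    coalescent, so each interval gives the estimate
    Ne = interval_length * C(k, 2).
    """
    times = sorted(coalescent_times)
    if len(times) != n_samples - 1:
        raise ValueError("a tree of n tips has exactly n - 1 coalescent events")
    estimates = []
    previous_time = 0.0
    lineages = n_samples
    for event_time in times:
        interval_length = event_time - previous_time
        estimates.append(
            (previous_time, event_time, interval_length * comb(lineages, 2))
        )
        previous_time = event_time
        lineages -= 1  # each coalescence merges two lineages into one
    return estimates


# Hypothetical dated tree of 4 sequences with branching events
# 10, 30 and 90 generations ago.
for start, end, size in classic_skyline([10.0, 30.0, 90.0], n_samples=4):
    print(f"{start:>5.0f} to {end:>5.0f} generations ago: Ne estimate {size:.0f}")
```

In this made-up example every interval gives the same estimate, which is what a constant population size would look like; intervals that are short relative to the number of lineages would instead indicate a smaller population at that time.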

The second method was a simulation-based approach. We began with a model of population history, estimated divergence times for that model, and specified a prior distribution of effective population sizes. We then drew an effective population size at random (from the prior distribution) for each population, simulated mtDNA or Y chromosome sequences under the model of population history and the chosen effective population sizes, calculated summary statistics from the simulated data, and compared them to the summary statistics for the observed data. We then repeated this process several million times; those simulated histories that gave sequences most similar to what we actually observed were taken as the best fit to the data, and the effective population sizes used in the best-fitting simulations became our estimates.
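The simulation-based procedure described above (draw from a prior, simulate, summarise, keep the closest matches) can be sketched as a toy rejection scheme. This is a heavily simplified illustration under stated assumptions, not the study's implementation: it assumes a single haploid population of constant size rather than a multi-population history, a uniform prior, one summary statistic (the number of segregating sites), and a few thousand simulations instead of several million. All function names and parameter values are hypothetical.

```python
import random
from math import comb


def simulate_segregating_sites(n_e, n_samples, mu, rng):
    """Simulate the number of segregating sites for n_samples sequences
    from a constant-size haploid population of effective size n_e.
    mu is the mutation rate per sequence per generation (assumed known)."""
    tree_length = 0.0
    for k in range(n_samples, 1, -1):
        # With k lineages, the waiting time to the next coalescence is
        # exponential with rate C(k, 2) / n_e (in generations).
        waiting_time = rng.expovariate(comb(k, 2) / n_e)
        tree_length += k * waiting_time
    # Mutations fall on the tree as a Poisson process with rate mu;
    # count how many exponential gaps fit into the total branch length.
    mutations = 0
    position = rng.expovariate(mu)
    while position < tree_length:
        mutations += 1
        position += rng.expovariate(mu)
    return mutations


def abc_rejection(observed, n_samples, mu, prior_low, prior_high,
                  n_sims, keep_fraction, seed=0):
    """Draw n_e from a uniform prior, simulate, and keep the draws whose
    simulated summary statistic is closest to the observed one."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        n_e = rng.uniform(prior_low, prior_high)
        simulated = simulate_segregating_sites(n_e, n_samples, mu, rng)
        draws.append((abs(simulated - observed), n_e))
    draws.sort()  # smallest distance to the observed data first
    n_keep = max(1, int(n_sims * keep_fraction))
    return [n_e for _, n_e in draws[:n_keep]]


# Hypothetical data: 20 sequences showing 35 segregating sites.
accepted = abc_rejection(observed=35, n_samples=20, mu=1e-3,
                         prior_low=500, prior_high=20000,
                         n_sims=5000, keep_fraction=0.05)
print(f"kept {len(accepted)} draws, mean Ne = {sum(accepted) / len(accepted):.0f}")
```

The accepted draws approximate a posterior distribution for the effective size rather than a single value, so in practice one would report its spread as well as its centre. A real analysis would also use several summary statistics and a richer demographic model.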

What new insights into the paternal and maternal histories of human populations do your findings provide? Were you surprised by any of your results?

We applied our method to the CEPH Human Genetic Diversity Panel, as this is a resource that is widely used, and so we thought having complete mtDNA genome sequences along with the ~500 kilobases of Y chromosome sequence from this panel would be a valuable resource. At the same time, we could check on some aspects of human history that were contentious. For example, early studies of mtDNA and Y chromosome variation in human populations had found bigger genetic differences for the Y chromosome than for mtDNA, which was suggested to reflect the prevalent patrilocal residence pattern of human populations. But later studies did not find such big differences; the different methods (and different population samples) used to assay mtDNA versus Y chromosome variation could clearly influence the results. Our method and dataset are currently the best available to address this fundamental question, and our results do support bigger genetic differences between populations for the Y chromosome than for mtDNA, although not as large as the early studies suggested. More importantly, we found substantial regional variation – in some geographic regions the mtDNA differences are bigger than the Y chromosome differences. So focusing on worldwide patterns misses this important regional variation.

Another question of interest is whether the bottleneck as human populations moved out of Africa was bigger for males than for females (or vice versa). Different studies, using different methods and datasets, came up with different results. Our results do suggest that more females than males were involved in the migration out of Africa, and moreover that over time there has been faster growth in female than in male effective population size. We think these results make some sense, given that in most populations a smaller proportion of males than of females tends to contribute to the next generation.

How will the methods used in your study benefit future research into human demographic history?

We are hopeful that others will use this (or other approaches) to generate more Y chromosome sequence data from other populations. The methods are changing so fast – we can already generate 2 megabases of Y chromosome sequence now for the same cost as was used to generate 500 kilobases. We are also hopeful that as more of this sort of data becomes available, other approaches for analysing the data (in particular, incorporating migration and selection into the demographic models) will be developed.

What’s next for your research?

We have changed over in the lab to this sequence-based approach to generate Y chromosome data from our own population samples, so we are excited about further insights into the comparative maternal and paternal history of human populations that will be forthcoming.

Mark Stoneking