Importance sampling on coalescent histories. II: Subdivided population models

Published online by Cambridge University Press:  01 July 2016

Maria De Iorio*
Affiliation:
Imperial College London
Robert C. Griffiths**
Affiliation:
University of Oxford
* Postal address: Department of Mathematics, Imperial College London, 180 Queen's Gate, London SW7 2BZ, UK
** Postal address: Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK. Email address: griff@stats.ox.ac.uk

Abstract

De Iorio and Griffiths (2004) developed a new method of constructing sequential importance-sampling proposal distributions on coalescent histories of a sample of genes for computing the likelihood of a type configuration of genes in the sample by simulation. The method is based on approximating the diffusion-process generator describing the distribution of population gene frequencies, leading to an approximate sample distribution and finally to importance-sampling proposal distributions. This paper applies that method to construct an importance-sampling algorithm for computing the likelihood of samples of genes in subdivided population models. The importance-sampling technique of Stephens and Donnelly (2000) is thus extended to models with a Markov chain mutation mechanism between gene types and migration of genes between subpopulations. An algorithm for computing the likelihood of a sample configuration of genes from a subdivided population in an infinitely-many-alleles model of mutation is derived, extending Ewens's (1972) sampling formula in a single population. Likelihood calculation and ancestral inference in gene trees constructed from DNA sequences under the infinitely-many-sites model are also studied. The Griffiths-Tavaré method of likelihood calculation in gene trees of Bahlo and Griffiths (2000) is improved for subdivided populations.
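For orientation, the single-population result that the abstract says is extended, Ewens's (1972) sampling formula, can be evaluated in closed form. The sketch below assumes the standard statement of that formula, P(a_1, ..., a_n) = n! / (θ(θ+1)...(θ+n-1)) × Π_j (θ/j)^{a_j} / a_j!, where a_j is the number of allele types seen exactly j times in a sample of n genes. The function name `ewens_probability` and its argument layout are illustrative choices, not taken from the paper, and the code does not implement the paper's importance-sampling algorithm for subdivided populations.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(counts, theta):
    """Probability of an allele-frequency spectrum under Ewens's sampling formula.

    counts[j - 1] is a_j, the number of allele types represented exactly
    j times in the sample; theta is the scaled mutation rate.
    (Illustrative helper: name and signature are assumptions.)
    """
    theta = Fraction(theta)
    # Sample size n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))

    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i

    prob = Fraction(factorial(n)) / rising
    for j, a in enumerate(counts, start=1):
        prob *= (theta / j) ** a / factorial(a)
    return prob


# All spectra for a sample of n = 3 genes: three singletons,
# one singleton plus one doubleton, or a single type seen three times.
spectra_n3 = [[3, 0, 0], [1, 1, 0], [0, 0, 1]]
total = sum(ewens_probability(s, 1) for s in spectra_n3)
```

Exact rational arithmetic is used so that the probabilities over all spectra for a fixed sample size can be checked to sum to exactly one.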

Type
General Applied Probability
Copyright
Copyright © Applied Probability Trust 2004 

Footnotes

Supported by BBSRC Bioinformatics grant 43/BIO14435.

References

Bahlo, M. and Griffiths, R. C. (2000). Inference from gene trees in a subdivided population. Theoret. Pop. Biol. 57, 79–95.
Bahlo, M. and Griffiths, R. C. (2001). Coalescence time for two genes from a subdivided population. J. Math. Biol. 43, 397–410.
Beerli, P. and Felsenstein, J. (1999). Maximum likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach. Genetics 152, 763–773.
Carbone, I. and Kohn, M. (2001). A microbial population–species interface: nested cladistic and coalescent inference with multilocus data. Molecular Ecology 10, 947–964.
De Iorio, M. and Griffiths, R. C. (2004). Importance sampling on coalescent histories. I. Adv. Appl. Prob. 36, 417–433.
De Iorio, M., Griffiths, R. C., Leblois, R. and Rousset, F. (2004). Stepwise mutation likelihood computation by sequential importance sampling in subdivided population models. Tech. Rep., Oxford University.
Ewens, W. J. (1972). The sampling theory of selectively neutral alleles. Theoret. Pop. Biol. 3, 87–112.
Fearnhead, P. and Donnelly, P. (2001). Estimating recombination rates from population genetics data. Genetics 159, 1299–1318.
Griffiths, R. C. and Tavaré, S. (1994a). Ancestral inference in population genetics. Statist. Sci. 9, 307–319.
Griffiths, R. C. and Tavaré, S. (1994b). Sampling theory for neutral alleles in a varying environment. Proc. R. Soc. London B 344, 403–410.
Griffiths, R. C. and Tavaré, S. (1994c). Simulating probability distributions in the coalescent. Theoret. Pop. Biol. 46, 131–159.
Griffiths, R. C. and Tavaré, S. (1996). Markov chain inference methods in population genetics. Math. Comput. Modelling 23, 141–158.
Griffiths, R. C. and Tavaré, S. (1997). Computational methods for the coalescent. In Progress in Population Genetics and Human Evolution (IMA Vols Math. Appl. 87), eds Donnelly, P. and Tavaré, S., Springer, Berlin, pp. 165–182.
Griffiths, R. C. and Tavaré, S. (1999). The ages of mutations in gene trees. Ann. Appl. Prob. 9, 567–590.
Herbots, H. M. (1997). The structured coalescent. In Progress in Population Genetics and Human Evolution (IMA Vols Math. Appl. 87), eds Donnelly, P. and Tavaré, S., Springer, Berlin, pp. 231–255.
Nath, M. and Griffiths, R. C. (1996). Estimation in an island model using simulation. Theoret. Pop. Biol. 3, 227–253.
Notohara, M. (1990). The coalescent and the genealogical process in geographically structured populations. J. Math. Biol. 29, 59–75.
Stephens, M. and Donnelly, P. (2000). Inference in molecular population genetics. J. R. Statist. Soc. B 62, 605–655.