Abstract
We introduce a Bayesian method for estimating hidden population substructure using multilocus molecular markers and geographical information provided by the sampling design. The joint posterior distribution of the substructure and allele frequencies of the respective populations is available in an analytical form when the number of populations is small, whereas an approximation based on a Markov chain Monte Carlo simulation approach can be obtained for a moderate or large number of populations. Using the joint posterior distribution, posteriors can also be derived for any evolutionary population parameters, such as the traditional fixation indices. A major advantage compared to most earlier methods is that the number of populations is treated here as an unknown parameter. What is traditionally considered as two genetically distinct populations, either recently founded or connected by considerable gene flow, is here considered as one panmictic population with a certain probability based on marker data and prior information. Analyses of previously published data on the Moroccan argan tree (Argania spinosa) and of simulated data sets suggest that our method is capable of estimating a population substructure, while not artificially enforcing a substructure when it does not exist. The software (BAPS) used for the computations is freely available from http://www.rni.helsinki.fi/~mjs.
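To make the idea of treating the number of populations as unknown more concrete, here is a minimal sketch of how a posterior over groupings of sampled populations could be computed exactly when only a few populations are involved. It is not the BAPS implementation. It assumes a symmetric Dirichlet prior (hypothetical parameter `alpha`) on allele frequencies, a uniform prior over partitions, and independent loci; the function names and the toy allele counts are invented for illustration.

```python
from math import lgamma, exp

def partitions(items):
    """Yield every set partition of a list (only feasible for a few items)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` in a block of its own ...
        yield [[first]] + part
        # ... or add it to each existing block in turn.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of allele counts at one locus, with the
    allele frequencies integrated out under a symmetric Dirichlet(alpha) prior."""
    k, n = len(counts), sum(counts)
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out

def partition_posterior(data, alpha=1.0):
    """data maps a population name to a list of per-locus allele-count lists.
    Returns (partition, posterior probability) pairs, assuming a uniform
    prior over partitions (an assumption of this sketch)."""
    names = sorted(data)
    n_loci = len(data[names[0]])
    scored = []
    for part in partitions(names):
        logp = 0.0
        for block in part:
            for locus in range(n_loci):
                # Pool allele counts over the populations merged in this block.
                pooled = [sum(col) for col in zip(*(data[p][locus] for p in block))]
                logp += log_marginal(pooled, alpha)
        scored.append((part, logp))
    top = max(lp for _, lp in scored)
    weights = [exp(lp - top) for _, lp in scored]  # subtract max for stability
    total = sum(weights)
    return [(part, w / total) for (part, _), w in zip(scored, weights)]

# Toy data (invented): one biallelic locus, three sampled populations.
toy = {"A": [[20, 0]], "B": [[20, 0]], "C": [[0, 20]]}
for part, prob in sorted(partition_posterior(toy), key=lambda x: -x[1]):
    print(part, round(prob, 3))
```

In this toy case the partition that merges A and B while keeping C separate receives most of the posterior mass, which mirrors the abstract's point that two sampled populations can be regarded as one panmictic population with a certain probability. Exhaustive enumeration grows with the Bell numbers, which is why a simulation-based approximation becomes necessary for larger numbers of populations.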
Related Publications
BAPS 2: enhanced possibilities for the analysis of genetic population structure
Summary: Bayesian statistical methods based on simulation techniques have recently been shown to provide powerful tools for the analysis of genetic population structure...
On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)
Summary: New methodology for fully Bayesian mixture analysis is developed, making use of reversible jump Markov chain Monte Carlo methods that are capable of jumping between the ...
A Model-Based Method for Identifying Species Hybrids Using Multilocus Genetic Data
We present a statistical method for identifying species hybrids using data on multiple, unlinked markers. The method does not require that allele frequencies be known i...
Inference of Population Structure Under a Dirichlet Process Model
Inferring population structure from genetic data sampled from some number of individuals is a formidable statistical problem. One widely used approach considers the num...
Sampling-Based Approaches to Calculating Marginal Densities
Stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm can be viewed as three alternative sampling- (or Monte Carlo-) based approa...
Publication Info
- Year: 2003
- Type: article
- Volume: 163
- Issue: 1
- Pages: 367-374
- Citations: 874
- Access: Closed
Identifiers
- DOI: 10.1093/genetics/163.1.367