Abstract
In this article, we use supermatrix data-mining methods to reconstruct a large, highly inclusive phylogeny of Cyperaceae from nucleotide data available on GenBank. We explore the properties of these trees and their utility for phylogenetic inference, and show that even the highly incomplete alignments characteristic of supermatrix approaches may yield very good estimates of phylogeny. We present a novel pipeline for filtering sparse alignments to improve their phylogenetic utility by maximizing the partial decisiveness of the matrices themselves through a technique we call "phylogenetic scaffolding," and we present a new method of scoring tip instability (i.e., identifying "rogue taxa") based on the I statistic implemented in the software Mesquite. The modified statistic, which we call I(S), is somewhat more straightforward to interpret than similar statistics, and our implementation of it may be applied to large sets of large trees. The largest sedge trees presented here contain more than 1500 tips (about one quarter of all sedge species) and are based on multigene alignments with more than 20 000 sites and more than 90% missing data. These trees match well with previously supported phylogenetic hypotheses, but have lower overall support values and less resolution than more heavily filtered trees. Our best-resolved trees are characterized by stronger support values than any previously published sedge phylogenies, and show some relationships that are incongruent with previous studies. Overall, we show that supermatrix methods offer a powerful means of pursuing phylogenetic study, and that these tools have high potential value for many systematic biologists.
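To make the notions of a sparse supermatrix and of taxon coverage more concrete, the following is a minimal Python sketch. It is not the authors' pipeline and does not compute partial decisiveness as they define it. It only computes two simple summaries of a taxon-by-locus coverage table: the fraction of empty cells, and the fraction of taxon quadruples sampled together for at least one locus, which serves here as a crude proxy for how informative the coverage pattern is. The function names, locus names, and toy data are all hypothetical.

```python
from itertools import combinations


def missing_fraction(coverage, loci):
    """Fraction of taxon-by-locus cells that have no sequence."""
    total = len(coverage) * len(loci)
    present = sum(len(sampled & set(loci)) for sampled in coverage.values())
    return 1 - present / total


def quartet_coverage(coverage):
    """Fraction of taxon quadruples sampled together for at least one locus.

    This is an illustrative proxy only, not the partial decisiveness
    measure used in the article.
    """
    quads = list(combinations(sorted(coverage), 4))
    covered = sum(
        1 for quad in quads
        if set.intersection(*(coverage[t] for t in quad))
    )
    return covered / len(quads)


# Hypothetical toy data: which loci have been sequenced for which taxon.
loci = ["ITS", "ETS", "trnL-F"]
coverage = {
    "taxon_A": {"ITS", "ETS", "trnL-F"},
    "taxon_B": {"ITS", "ETS"},
    "taxon_C": {"ITS"},
    "taxon_D": {"ITS", "trnL-F"},
    "taxon_E": {"ETS"},
}

print(missing_fraction(coverage, loci))
print(quartet_coverage(coverage))
```

In a filtering step of the general kind the abstract describes, summaries like these could be recomputed after dropping poorly sampled taxa or loci to see whether the coverage pattern improves. The actual criteria used in the article are not stated in this abstract.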
Publication Info
- Year: 2012
- Type: article
- Volume: 62
- Issue: 2
- Pages: 205-219
- Citations: 92
- Access: Closed
Identifiers
- DOI: 10.1093/sysbio/sys088