Sunday, May 22, 2011

Branch Support: confidence, stability, credibility?

By Susana Ortiz

One way of assessing whether a clade recovered in a phylogenetic reconstruction really is part of the true phylogeny is to evaluate its support. Support may be established by estimating confidence through resampling methods (Bootstrap and Jackknife), or through Bremer support, which uses the difference in length between trees as a measure of stability. These approaches, however, are not independent of the search strategy, because they are sensitive to its effectiveness (Freudenstein and Davis, 2010). A highly supported clade is therefore not necessarily real; it may simply be the kind of result that fits the resources used (e.g. the search strategy). Posterior probabilities in Bayesian analyses have been used as a probabilistic measure of support (e.g. Goloboff et al., 2003; Pickett and Randle, 2005), because they quantify credibility: how likely a certain clade is to be correct, given the data, model and priors (Huelsenbeck et al., 2002). A comparison between Bayesian posterior probabilities and nonparametric bootstrapping was proposed by Efron et al. (1996), in which the bootstrap confidence level can be thought of as an assessment of error for the estimated tree. However, posterior probabilities are sensitive to the prior for internal branch lengths (Yang and Rannala, 2005), and are significantly higher than the corresponding nonparametric bootstrap frequencies when the models used for analysis are underparameterized (Goloboff et al., 2003). Although there have been several attempts to reconcile the different approaches under certain conditions, these approaches cannot be applied freely under all phylogenetic criteria, given restrictions that are not only methodological but also conceptual.

Bootstrap and Jackknife are techniques that resample the original data to infer the variability of the estimate, in this case the phylogeny. The variation among the resulting trees provides an adequate indication of the uncertainty (Felsenstein, 1985). Bootstrap has also been proposed as a tool to assess robustness with regard to small changes in the data (Holmes, 2003): it is not a test of how accurate a topology is, but it provides information about its stability and helps to assess whether the data are adequate to validate the topology (Berry and Gascuel, 1996). As for repeatability, unless the data set is perfectly Hennigian (Felsenstein, 1985), variation between replicates is expected. One might therefore think that more replicates would mean greater precision about which groups are monophyletic, but according to Pattengale et al. (2009) a rather small number of Bootstrap replicates (typically 100–500) produces support values that correlate at better than 99.5% with the reference values on the best ML trees.
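
The resampling idea described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the character matrix is invented, there are just four taxa (so only three unrooted topologies exist), and the "tree search" is reduced to picking the split backed by the most parsimony-informative binary characters. The names `MATRIX`, `best_split` and `bootstrap_support` are hypothetical and do not come from any phylogenetics package.

```python
import random
from collections import Counter

# Hypothetical toy matrix: 4 taxa x 10 binary characters (invented data).
MATRIX = {
    "A": "1101010111",
    "B": "1101001001",
    "C": "0010110001",
    "D": "0010101101",
}


def best_split(matrix, columns):
    """Return the quartet split backed by the most informative characters.

    For four taxa and binary characters, this split defines the most
    parsimonious unrooted tree. Returns None when there is a tie or when
    no sampled character is informative (an unresolved result).
    """
    counts = Counter()
    for j in columns:
        a, b, c, d = (matrix[t][j] for t in "ABCD")
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def bootstrap_support(matrix, split, n_reps=200, seed=0):
    """Fraction of bootstrap replicates in which `split` is the best one."""
    rng = random.Random(seed)
    n_chars = len(next(iter(matrix.values())))
    hits = 0
    for _ in range(n_reps):
        # Draw n_chars characters with replacement from the original matrix.
        columns = [rng.randrange(n_chars) for _ in range(n_chars)]
        if best_split(matrix, columns) == split:
            hits += 1
    return hits / n_reps


print("Best split on the original data:", best_split(MATRIX, range(10)))
print("Bootstrap frequency of AB|CD:", bootstrap_support(MATRIX, "AB|CD"))
```

The frequency printed at the end describes how stable the split is when characters are resampled; it says nothing about whether the split is true. With real data the call to `best_split` would be replaced by a full tree search, which is where the sensitivity to search effectiveness comes in.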

This holds even though the stopping criteria can recommend very different numbers of replicates for different data sets of comparable size. Likewise, none of the above means that a clade is or is not monophyletic depending on its support; support only indicates the certainty with which a particular node can be found in the topology. If a node is not in the Bootstrap consensus, this could mean there is a polytomy caused by multiple resolutions of the node, perhaps due to incongruence between characters. Mort et al. (2000) compared Bootstrap and Jackknife, and their findings show a relation between bootstrap values and the deletion proportion chosen for the Jackknife. In favor of the Jackknife, it has been proposed as a rapid and efficient method to identify strongly supported clades (Farris et al., 1996), and assigning equal deletion probabilities to characters reduces the problem of competition between informative and noninformative characters (Freudenstein and Davis, 2010).
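
A small Python sketch can make the two resampling schemes concrete by expressing each replicate as a vector of character weights. It also illustrates one well-known reason why bootstrap values relate to the jackknife deletion proportion: a given character is absent from a bootstrap replicate with probability (1 - 1/n)^n, which approaches e^-1 (about 0.37) as the number of characters n grows. This is offered as background arithmetic, not as a summary of the analysis in Mort et al. (2000); the function names are hypothetical.

```python
import math
import random


def bootstrap_weights(n_chars, rng):
    """Weight of each character = times it is drawn in n_chars draws with replacement."""
    weights = [0] * n_chars
    for _ in range(n_chars):
        weights[rng.randrange(n_chars)] += 1
    return weights


def jackknife_weights(n_chars, deletion_prob, rng):
    """Each character is deleted independently with the same probability."""
    return [0 if rng.random() < deletion_prob else 1 for _ in range(n_chars)]


def expected_bootstrap_dropout(n_chars):
    """Probability that a given character is absent from one bootstrap replicate."""
    return (1 - 1 / n_chars) ** n_chars


rng = random.Random(1)
n = 1000
boot = bootstrap_weights(n, rng)
jack = jackknife_weights(n, math.exp(-1), rng)
print("bootstrap: fraction of characters left out =", boot.count(0) / n)
print("jackknife: fraction of characters left out =", jack.count(0) / n)
print("limit e^-1 =", round(math.exp(-1), 3))
```

In the bootstrap some characters receive weights of 2 or more, while in the jackknife every retained character has weight 1; this is the difference behind the equal-deletion-probability argument above.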

Bremer support (Bremer, 1984) is another way to measure support, although only under the Parsimony criterion. This method measures the difference in length between the most parsimonious cladogram and the shortest suboptimal cladogram that lacks the clade of interest (Grant and Kluge, 2008). In Bremer support, a strongly supported branch therefore means a large increase in the length of the suboptimal trees. Absolute (Bremer, 1984) and relative Bremer support (Goloboff and Farris, 2001) are variants that differ in the type of evidence they take into account. The first measures the absolute amount of favorable evidence, and the second the ratio between favorable and contradictory evidence for the group; the two represent aspects of support that can vary independently (Goloboff et al., 2003). Bremer support has been interpreted as a stability measure, and therefore as independent of the influence of autapomorphies and of lower frequencies for better supported groups. Objections have been raised to this view, however, since stability depends on the specific scenario; as Goloboff et al. (2003) noted, “a group stable under additions of characters may be very unstable under addition of taxa or under recoding of some characters”, whereas Bremer support is based only on the available evidence.
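
The length difference behind absolute Bremer support can be written as a short Python sketch. The candidate trees, their lengths and the clade sets below are invented for illustration, and `TREES` and `bremer_support` are hypothetical names; in a real analysis the set of suboptimal trees comes from a tree search, so the value inherits any weakness of that search.

```python
# Hypothetical candidate trees: parsimony length (steps) and clades contained.
TREES = [
    {"length": 100, "clades": {frozenset("AB"), frozenset("ABC")}},
    {"length": 102, "clades": {frozenset("AB"), frozenset("ABD")}},
    {"length": 103, "clades": {frozenset("AC"), frozenset("ABC")}},
    {"length": 107, "clades": {frozenset("BC"), frozenset("BCD")}},
]


def bremer_support(trees, clade):
    """Extra steps needed by the shortest tree that lacks `clade`."""
    clade = frozenset(clade)
    best_length = min(t["length"] for t in trees)
    lengths_without = [t["length"] for t in trees if clade not in t["clades"]]
    if not lengths_without:
        raise ValueError("no candidate tree lacks this clade")
    return min(lengths_without) - best_length


print(bremer_support(TREES, "AB"))   # 3: shortest tree without AB has 103 steps
print(bremer_support(TREES, "ABC"))  # 2: shortest tree without ABC has 102 steps
```

A clade absent from the shortest tree gets a value of 0 here, since the shortest tree itself lacks it.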

Homoplasy is another factor affecting the estimation of support. Clades delimited by “unique and unreversed” or relatively less homoplastic character states are often considered more strongly supported (Grant and Kluge, 2008), although not all support measures are equally sensitive to it. According to Freudenstein and Davis (2010), the values on branches not affected by homoplasy are slightly higher for the bootstrap than for the jackknife, but the addition of homoplastic characters causes support on branches affected by homoplasy to drop substantially more as measured by the bootstrap than as measured by the jackknife. This differs from Bremer support, which takes the distribution of homoplasy into account (Sanderson, 1995). Incongruence between characters, the proportion of homoplastic versus homologous characters, additivity, and character weighting (in the bootstrap) are key topics in the evaluation of support. The number of nonhomoplastic synapomorphies supporting a clade provides a numerical estimate of the support for a hypothesis, but it may not provide evidence that favors that hypothesis over an alternative (Wilkinson et al., 2003). I agree with Grant and Kluge (2003) that support measures do not test phylogenetic hypotheses; they evaluate the relative degree or strength of evidence.


References

- Berry, V., Gascuel, O. 1996. On the interpretation of bootstrap trees: appropriate threshold of clade selection and induced gain. Molecular Biology and Evolution 13: 999–1011.
- Efron, B., Halloran, E., Holmes, S. 1996. Bootstrap confidence levels for phylogenetic trees. Proceedings of the National Academy of Sciences 93(14): 7085–7090.
- Erixon, P., Svennblad, B., Britton, T., Oxelman, B. 2003. Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Systematic Biology 52: 665–673.
- Farris, J. S. 1996. Jac. Computer program distributed by the author. Molekylärsystematiska laboratoriet, Naturhistoriska riksmuseet, Stockholm, Sweden.
- Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783–791.
- Freudenstein, J. V., Davis, J. I. 2010. Branch support via resampling: an empirical study. Cladistics 26: 1–14.
- Goloboff, P. A., Farris, J. S., Källersjö, M., Oxelman, B., Ramírez, M. J., Szumik, C. A. 2003. Improvements to resampling measures of group support. Cladistics 19: 324–332. doi: 10.1016/S0748-3007(03)00060-4.
- Grant, T., Kluge, A. G. 2003. Data exploration in phylogenetic inference: scientific, heuristic, or neither. Cladistics 19: 379–418.
- Grant, T., Kluge, A. G. 2008. Clade support measures and their adequacy. Cladistics 24: 1051–1064.
- Holmes, S. 2003. Bootstrapping phylogenetic trees: theory and methods. Statistical Science 18(2): 241–255.
- Huelsenbeck, J. P., Larget, B., Miller, R. E., Ronquist, F. 2002. Potential applications and pitfalls of Bayesian inference of phylogeny. Systematic Biology 51: 673–688.
- Mort, M. E., Soltis, P. S., Soltis, D. E., Mabry, M. L. 2000. Comparison of three methods for estimating internal support on phylogenetic trees. Systematic Biology 49: 160–171.
- Pattengale, N. D., Alipour, M., Bininda-Emonds, O. R. P., Moret, B. M. E., Stamatakis, A. 2009. How many bootstrap replicates are necessary? In: Research in Computational Molecular Biology (RECOMB 2009), Lecture Notes in Computer Science 5541: 184–200.
- Pickett, K. M., Randle, C. P. 2005. Strange Bayes indeed: uniform topological priors imply non-uniform clade priors. Molecular Phylogenetics and Evolution 34.
- Sanderson, M. J. 1995. Objections to bootstrapping: a critique. Systematic Biology 44: 299–320.
- Wilkinson, M., Lapointe, F.-J., Gower, D. J. 2003. Branch lengths and support. Systematic Biology 52: 127–130.
- Yang, Z., Rannala, B. 2005. Branch-length prior influences Bayesian posterior probability of phylogeny. Systematic Biology 54(3): 455–470.