Tuesday, December 12, 2017

Phylogenetic reconstruction in practice

Thousands of data points can be analyzed to build a phylogenetic reconstruction. The question is: how do you do it?
To choose the best hypothesis of phylogenetic relationships, scientific inference requires two analyses. The first is character analysis, which is largely empirical. The second is the exploration of tree space, which requires evaluating alternative topologies and applying a quantitative decision rule to select the optimal phylogenetic hypothesis. Finally, the reliability of the resulting phylogenetic hypotheses must be evaluated (De Luna et al., 2005).

The researcher should choose a method by taking into account its properties and the characteristics of the data to be evaluated. The basic properties of phylogenetic methods are statistical consistency and robustness. Felsenstein (1978) described a hypothetical case in which parsimony would be statistically inconsistent, and this advantage of likelihood has been used as a reason to abandon parsimony. Nevertheless, other studies show that maximum likelihood can also be inconsistent under realistic conditions (Farris, 1999). In the four-taxon case with two edges evolving at one rate and three edges at another, the region of parameter space where parsimony is immune and likelihood performs poorly is called the Farris Zone, and the effect is termed "long-branch repulsion" (Siddall, 1998). In contrast, the region where "long-branch attraction" occurs is called the Felsenstein Zone.

There are therefore circumstances in which estimation methods are not statistically consistent. Compatibility methods and Farris's parsimony method for estimating unrooted Wagner trees fail the test of consistency when parallelism is expected to occur frequently, and pass it when parallelism is rare (Felsenstein, 1978).
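The long-branch attraction case can be checked numerically. The sketch below is only an illustration under assumptions that are not spelled out in the text above: a two-state symmetric model, a true tree ((A,B),(C,D)), a change probability p on the branches leading to A and C, and a change probability q on the other three branches. The function names are hypothetical. It enumerates every site pattern and compares the expected frequency of the three parsimony-informative pattern classes; with unlimited data, parsimony recovers the true tree only if the class supporting AB|CD is the most frequent one.

```python
from itertools import product


def transition(x, y, change_prob):
    """Probability of observing state y at the end of a branch that starts in state x."""
    return change_prob if x != y else 1.0 - change_prob


def pattern_support(p, q):
    """Expected frequency of each parsimony-informative pattern class.

    Assumed true tree: ((A,B),(C,D)). Branches to A and C change with
    probability p; branches to B and D and the internal branch change
    with probability q.
    """
    support = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    for a, b, c, d in product((0, 1), repeat=4):
        prob = 0.0
        # Sum over the unobserved states u and v at the two internal nodes.
        for u, v in product((0, 1), repeat=2):
            prob += (0.5
                     * transition(u, a, p) * transition(u, b, q)
                     * transition(u, v, q)
                     * transition(v, c, p) * transition(v, d, q))
        if a == b and c == d and a != c:
            support["AB|CD"] += prob
        elif a == c and b == d and a != b:
            support["AC|BD"] += prob
        elif a == d and b == c and a != b:
            support["AD|BC"] += prob
    return support


def parsimony_is_consistent(p, q):
    """True if the pattern class supporting the true tree is strictly the most frequent."""
    support = pattern_support(p, q)
    return support["AB|CD"] > max(support["AC|BD"], support["AD|BC"])


if __name__ == "__main__":
    for p, q in [(0.1, 0.1), (0.3, 0.2), (0.4, 0.05)]:
        print(p, q, parsimony_is_consistent(p, q))
```

Under these assumptions, the two long branches are grouped together whenever p squared exceeds q(1 - q), so adding more characters only strengthens support for the wrong tree.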

Researchers have proposed several ways to evaluate the robustness of a phylogenetic method (Holder, Zwickl, & Dessimoz, 2008; Bak, Otu, Tasci, Meydan, & Sezerman, 2013; Tang, Humphreys, Fontaneto, & Barraclough, 2014). Robustness is the ability of a method to return a correct result even when its assumptions have been violated.

In addition, when searching for the most parsimonious trees, the technique must be chosen according to the size of the data set. There are several traditional techniques. The first is the Wagner tree, which is built by adding the taxa sequentially, each at the most parsimonious available branch. The next is branch swapping, which evaluates the parsimony of each of a series of branch rearrangements of a tree. There are three branch-swapping algorithms, of which tree bisection and reconnection (TBR) is the most widely used. However, TBR has limitations when the number of taxa exceeds 40. For data sets with more than 50 and fewer than 150 taxa, one strategy is to combine TBR with random addition sequences (RAS) (Goloboff, 1999).
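The Wagner procedure described above can be sketched in a few lines. This is a minimal illustration, not the implementation used by any particular program: trees are represented as nested tuples of taxon names, tree length is counted with Fitch's algorithm for unordered characters, and the function names and the toy matrix are hypothetical.

```python
def fitch_length(tree, chars):
    """Fitch parsimony length of a rooted binary tree given as nested tuples."""
    total = 0

    def state_sets(node):
        nonlocal total
        if isinstance(node, str):                 # leaf: one observed state per character
            return [{state} for state in chars[node]]
        left, right = state_sets(node[0]), state_sets(node[1])
        merged = []
        for a, b in zip(left, right):
            if a & b:                             # shared states: no change needed
                merged.append(a & b)
            else:                                 # disjoint sets: count one step
                merged.append(a | b)
                total += 1
        return merged

    state_sets(tree)
    return total


def insertions(tree, taxon):
    """Yield every tree obtained by attaching `taxon` as sister to one subtree."""
    yield (tree, taxon)
    if not isinstance(tree, str):
        left, right = tree
        for new_left in insertions(left, taxon):
            yield (new_left, right)
        for new_right in insertions(right, taxon):
            yield (left, new_right)


def wagner_tree(order, chars):
    """Add taxa one at a time, each at the most parsimonious available position."""
    tree = (order[0], order[1])
    for taxon in order[2:]:
        tree = min(insertions(tree, taxon), key=lambda t: fitch_length(t, chars))
    return tree


# Toy matrix (hypothetical): four characters support A+B, one supports B+D.
chars = {"A": "11000", "B": "11001", "C": "00110", "D": "00111"}

if __name__ == "__main__":
    tree = wagner_tree(["A", "B", "C", "D"], chars)
    print(tree, fitch_length(tree, chars))
```

Passing the taxa in a shuffled order to `wagner_tree` gives the random addition sequence idea: each order can lead to a different starting tree for branch swapping.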

Data sets can be so large that they contain regions or sectors that can be treated as sub-problems, so a search has to deal with both "local" and "global" optima. An alternative to the traditional strategies is the ratchet, a technique for analyzing large data sets. It slightly perturbs the data so that TBR does not get stuck: a TBR search is repeated on the perturbed data using the same tree as the starting point, and the resulting tree is then used to search again under the original data (Goloboff, 2002). Other techniques include sectorial searches, tree fusing, tree drifting and combined methods (Goloboff, 2002), which are chosen according to the characteristics of the data set in order to be as accurate and efficient as possible.
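The ratchet loop described above can be sketched as follows. Several simplifications are assumptions of this example and not part of the published method: the data are perturbed by doubling the weight of a random subset of characters, the local search prunes and regrafts single leaves as a small stand-in for TBR, and only the best tree found so far is kept between iterations. All function names and the toy matrix are hypothetical.

```python
import random


def fitch_length(tree, chars, weights):
    """Weighted Fitch length of a rooted binary tree given as nested tuples."""
    total = 0

    def state_sets(node):
        nonlocal total
        if isinstance(node, str):
            return [{state} for state in chars[node]]
        left, right = state_sets(node[0]), state_sets(node[1])
        merged = []
        for i, (a, b) in enumerate(zip(left, right)):
            if a & b:
                merged.append(a & b)
            else:
                merged.append(a | b)
                total += weights[i]               # a step costs the character's weight
        return merged

    state_sets(tree)
    return total


def leaves(tree):
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])


def prune(tree, taxon):
    """Remove the leaf `taxon` and return the remaining tree."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if left == taxon:
        return right
    if right == taxon:
        return left
    return (prune(left, taxon), prune(right, taxon))


def insertions(tree, taxon):
    """Yield every tree obtained by attaching `taxon` as sister to one subtree."""
    yield (tree, taxon)
    if not isinstance(tree, str):
        for new_left in insertions(tree[0], taxon):
            yield (new_left, tree[1])
        for new_right in insertions(tree[1], taxon):
            yield (tree[0], new_right)


def hill_climb(tree, chars, weights):
    """Leaf prune-and-regraft until no rearrangement shortens the tree (stand-in for TBR)."""
    best = fitch_length(tree, chars, weights)
    improved = True
    while improved:
        improved = False
        for taxon in leaves(tree):
            for candidate in insertions(prune(tree, taxon), taxon):
                score = fitch_length(candidate, chars, weights)
                if score < best:
                    tree, best, improved = candidate, score, True
    return tree


def ratchet(tree, chars, iterations=20, fraction=0.25, seed=0):
    rng = random.Random(seed)
    n_chars = len(next(iter(chars.values())))
    original = [1] * n_chars
    best = hill_climb(tree, chars, original)
    for _ in range(iterations):
        # Perturb the data: upweight a random subset of characters.
        perturbed = [2 if rng.random() < fraction else 1 for _ in range(n_chars)]
        shaken = hill_climb(best, chars, perturbed)      # search under the perturbed data
        candidate = hill_climb(shaken, chars, original)  # search again under the original data
        if fitch_length(candidate, chars, original) < fitch_length(best, chars, original):
            best = candidate
    return best


# Toy matrix (hypothetical) and a deliberately poor starting tree.
chars = {"A": "11000", "B": "11001", "C": "00110", "D": "00111"}
start = (("A", "C"), ("B", "D"))

if __name__ == "__main__":
    result = ratchet(start, chars)
    print(result, fitch_length(result, chars, [1] * 5))
```

With four taxa the local search alone already reaches the shortest tree, so the example only shows the control flow; the perturbation step matters on large matrices where branch swapping stalls on local optima.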

As a recipe, the steps in phylogenetic reconstruction are as follows. First, a set of characters is collected. Next, a model is selected that specifies the cost of changes between character states and is used to select the optimal trees. Parsimony models include Wagner, Fitch, Dollo and Camin-Sokal, among others, and probabilistic models include Jukes-Cantor and Kimura two-parameter, among others. The analysis is run in programs such as PAUP or MrBayes, which report the likelihood or Bayesian probability of the competing trees (De Luna et al., 2005). There are three families of phylogenetic reconstruction methods: parsimony methods (Farris, 1983) and the two probabilistic approaches, maximum likelihood (ML) methods (Felsenstein, 2004) and Bayesian methods (Rannala and Yang, 1996). In some cases, obtaining the final tree requires computing a consensus tree, such as a strict consensus. Finally, the results are displayed in programs such as FigTree.
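To make the two probabilistic models named above more concrete, the sketch below computes the pairwise distance each one implies for two aligned DNA sequences. It uses the standard closed-form corrections for the Jukes-Cantor and Kimura two-parameter models; the function names and the example sequences are hypothetical, and sites containing gaps or ambiguity codes are simply skipped.

```python
import math

# Transitions are changes within purines (A, G) or within pyrimidines (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def difference_proportions(seq1, seq2):
    """Return (P, Q): proportions of sites with a transition or a transversion."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    transitions = sum(1 for pair in pairs if pair in TRANSITIONS)
    transversions = sum(1 for a, b in pairs
                        if a != b and (a, b) not in TRANSITIONS)
    return transitions / len(pairs), transversions / len(pairs)


def jukes_cantor_distance(seq1, seq2):
    """Jukes-Cantor distance: all substitutions are equally likely."""
    P, Q = difference_proportions(seq1, seq2)
    return -0.75 * math.log(1.0 - 4.0 * (P + Q) / 3.0)


def kimura_2p_distance(seq1, seq2):
    """Kimura two-parameter distance: transitions and transversions have separate rates."""
    P, Q = difference_proportions(seq1, seq2)
    return -0.5 * math.log((1.0 - 2.0 * P - Q) * math.sqrt(1.0 - 2.0 * Q))


if __name__ == "__main__":
    s1 = "AAAAAAAAAA"
    s2 = "GAAAAAAAAC"   # one transition (A to G) and one transversion (A to C)
    print(jukes_cantor_distance(s1, s2), kimura_2p_distance(s1, s2))
```

Both corrections give a distance larger than the raw proportion of differing sites, because they account for multiple changes at the same site. They are undefined when the sequences are too divergent, in which case `math.log` raises a `ValueError`.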

Each data set is unique, so you must find the combination of methods that is as accurate as possible in terms of explanatory power and efficiency. Practice therefore cannot be separated from theory, which is the basis for these decisions. As De Luna et al. (2005) state, current phylogenetic analyses and the growing variety of data types require knowledge of areas such as the science of science, statistics, probability theory and molecular evolution in order to master the work both operationally and theoretically.


Bibliography
Bak, Y., Otu, H. H., Tasci, N., Meydan, C., & Sezerman, U. O. (2013). Testing robustness of relative complexity measure method constructing robust phylogenetic trees for Galanthus L. using the relative complexity measure.
Farris, J. S. (1983). The Logical Basis of Phylogenetic Analysis.
Farris, J. S. (1999). Likelihood and Inconsistency, 204, 199–204.
Felsenstein, J. (1978). Cases in Which Parsimony or Compatibility Methods Will Be Positively Misleading, (1965), 401–410.
Goloboff, P. A. (1999). Analyzing Large Data Sets in Reasonable Times: Solutions for Composite Optima. Cladistics, 15, 415–428.
Goloboff, P. A. (2002). 4 Techniques for Analyzing Large Data Sets, (August). https://doi.org/10.1007/978-3-0348-8125-8
Holder, M. T., Zwickl, D. J., & Dessimoz, C. (2008). Evaluating the robustness of phylogenetic methods to among-site variability in substitution processes, (October), 4013–4021. https://doi.org/10.1098/rstb.2008.0162
Luna, E. De, José A, G., & Tania Chew, T. (2005). Sistemática biológica: avances y direcciones en la teoría y los métodos de la reconstrucción filogenética [Systematic biology: advances and directions in theory and methods of phylogenetic reconstruction], 15(3), 351–370.
Siddall, M. E. (1998). Success of Parsimony in the Four-Taxon Case: Long-Branch Repulsion by Likelihood in the Farris Zone, 220, 209–220.
Tang, C. Q., Humphreys, A. M., Fontaneto, D., & Barraclough, T. G. (2014). Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data. https://doi.org/10.1111/2041-210X.12246

Software
Swofford, D., & Begle, D. P. (1993). PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1, March 1993. Center for Biodiversity, Illinois Natural History Survey.
Ronquist, F., & Huelsenbeck, J. P. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19(12), 1572–1574.
Rambaut, A. FigTree, Version 1.4.2. Institute of Evolutionary Biology, University of Edinburgh.

