There are currently three main approaches to inferring phylogenetic trees: Maximum Parsimony (MP), Maximum Likelihood (ML) and Bayesian analysis, and a wide variety of programs implement them. Which method is best has always been in question, however, and after many years of discussion no agreement has been reached. Here I lay out the menu available to a researcher who wants to carry out a phylogenetic analysis, presenting the three main courses with their advantages and disadvantages. The first is the most traditional: it was used for a long time but has lost ground in recent years.
Maximum Parsimony was proposed by Edwards and Cavalli-Sforza (1963) and is characterized by not using an explicit model of evolution or any other kind of assumption prior to the analysis. It is based on Ockham's razor: the tree chosen is the one with the lowest cost, that is, the one that explains the data with the smallest number of changes. There are several variants of the method, which differ in the character transformations they allow; among them are Fitch, Wagner, Camin-Sokal and the general matrix (Wiley & Lieberman, 2011). The main problem with this analysis is that branch lengths cannot be obtained, because without an explicit evolutionary model the amount of change is not taken into account. In addition, parsimony can be inconsistent when the number of taxa is large and the rates of change are high (Kim, 1996).
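To make the idea of "lowest cost" concrete, here is a minimal sketch of Fitch parsimony for a single character on a fixed rooted binary tree. The tree encoding (nested tuples), the function name `fitch_score` and the four-taxon data are assumptions made for illustration; real programs also search over tree topologies, which this sketch does not do.

```python
def fitch_score(tree, states):
    """Return (possible_states, cost) for one character on a rooted binary tree.

    tree   -- a leaf name (str) or a tuple (left_subtree, right_subtree)
    states -- dict mapping each leaf name to its observed character state
    """
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0

    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)

    common = left_set & right_set
    if common:
        # The children can share a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical data: one nucleotide site observed in four taxa.
states = {"A": "G", "B": "G", "C": "T", "D": "T"}

tree_1 = (("A", "B"), ("C", "D"))  # groups taxa with the same state
tree_2 = (("A", "C"), ("B", "D"))  # groups taxa with different states

cost_1 = fitch_score(tree_1, states)[1]
cost_2 = fitch_score(tree_2, states)[1]
print(cost_1, cost_2)
```

Under this criterion `tree_1` is preferred because it needs fewer changes than `tree_2`. Note that the score is only a count of changes; nothing in it corresponds to a branch length.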
On the probabilistic, more recent side is Maximum Likelihood (ML). This method uses a previously specified model of sequence evolution; given that model, the analysis finds the hypothesis that best explains the data, that is, the one that maximizes the probability of the data given the model (Salemi et al., 2009). Thanks to the model, an additional quantity can be estimated on the topology: the branch lengths, which represent the amount of change accumulated since the split (Swofford et al., 1996). However, this method is not free of uncertainty: it has been empirically demonstrated that it can fail under certain conditions in the same way as MP (Farris, 1999).
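As a rough illustration of "the probability of the data given the model", the sketch below estimates a single branch length between two aligned sequences. It assumes the Jukes-Cantor (JC69) model, which is my choice for the example and is not named in the text; the function names and the counts (100 sites, 10 differences) are hypothetical.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of a pairwise alignment under JC69.

    d       -- branch length in expected substitutions per site (d > 0)
    n_sites -- number of aligned sites
    n_diff  -- number of sites at which the two sequences differ
    """
    # Probability that a site differs after d substitutions per site.
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    p_same = 1.0 - p_diff
    # 0.25 is the frequency of the first base; each of the three possible
    # mismatches has probability p_diff / 3.
    return (n_diff * math.log(0.25 * p_diff / 3.0)
            + (n_sites - n_diff) * math.log(0.25 * p_same))


def jc69_ml_distance(n_sites, n_diff):
    """Closed-form maximum likelihood estimate of d under JC69."""
    return -0.75 * math.log(1.0 - 4.0 * n_diff / (3.0 * n_sites))


# Hypothetical data: 10 differences in 100 aligned sites.
n_sites, n_diff = 100, 10

# Brute-force search over candidate branch lengths 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda d: jc69_log_likelihood(d, n_sites, n_diff))
analytic = jc69_ml_distance(n_sites, n_diff)
print(best, analytic)
```

The grid search and the closed-form estimate should agree to within the grid spacing. The estimated length is slightly larger than the raw proportion of differences, because the model accounts for multiple changes at the same site, which a simple count of changes cannot see.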
The Bayesian method is also on the probabilistic side, with weaknesses similar to those of ML. It is the newest of the three and the most widely used in recent times. It asks the converse of the Maximum Likelihood question: what is the probability of the model given the data? It also relies on previous knowledge, expressed as priors, among which are the evolutionary model of the sequences, rate heterogeneity and the distribution of the data. The probability space is explored with Markov chain Monte Carlo (MCMC), which favors the nodes with the highest posterior probability; branch lengths based on the model are also taken into account (Holmes, 2003).
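The role of the prior and of MCMC can be sketched with a toy example. By Bayes' theorem the posterior is proportional to the likelihood times the prior, and a Metropolis sampler draws values in proportion to that product without ever computing the normalizing constant. Everything specific below is an assumption for illustration: the JC69 likelihood, the exponential prior with rate 10, the proposal width, the seed and the data. Real Bayesian phylogenetics programs sample tree topologies and many parameters, not a single branch length.

```python
import math
import random


def log_posterior(d, n_sites, n_diff, prior_rate):
    """Unnormalized log posterior of a branch length d under JC69."""
    if d <= 0:
        return -math.inf  # branch lengths must be positive
    # Probability that a site differs; expm1 keeps this accurate for tiny d.
    p_diff = -0.75 * math.expm1(-4.0 * d / 3.0)
    log_lik = (n_diff * math.log(p_diff / 3.0)
               + (n_sites - n_diff) * math.log(1.0 - p_diff))
    # Exponential prior on d (an assumption made for this sketch).
    log_prior = math.log(prior_rate) - prior_rate * d
    return log_lik + log_prior


def metropolis(n_steps, n_sites, n_diff, prior_rate=10.0, step=0.05, seed=1):
    """Sample branch lengths with a random-walk Metropolis algorithm."""
    rng = random.Random(seed)
    d = 0.1
    current = log_posterior(d, n_sites, n_diff, prior_rate)
    samples = []
    for _ in range(n_steps):
        proposal = d + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, n_sites, n_diff, prior_rate)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            d, current = proposal, candidate
        samples.append(d)
    return samples


# Hypothetical data: 10 differences in 100 aligned sites.
samples = metropolis(20000, n_sites=100, n_diff=10)
kept = samples[2000:]  # discard the first steps as burn-in
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

The output is a sample from the posterior distribution rather than a single best value, so the uncertainty in the estimate can be read directly from the spread of `kept`.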
It seems to me that each of these analyses can be useful depending on the data. Researchers must know their data and their research question very well, since these determine which method is appropriate for the problem. Although all of the methods have shortcomings, the usefulness of branch lengths for dating and the philosophical support for MP are qualities that are difficult to dismiss.
Bibliography
- Farris, J. S. (1999). Likelihood and inconsistency. Cladistics, 15(2), 199-204.
- Holmes, S. (2003). Bootstrapping phylogenetic trees: theory and methods. Statistical Science, 18(2), 241-255.
- Kim, J. (1996). General inconsistency conditions for maximum parsimony: effects of branch lengths and increasing numbers of taxa. Systematic Biology, 45(3), 363-374.
- Salemi, M., Vandamme, A.-M., & Lemey, P. (2009). The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing. Cambridge University Press.
- Swofford, D. L., Olsen, G. J., Waddell, P. J., & Hillis, D. M. (1996). Phylogenetic inference. In Molecular Systematics (pp. 407-514).
- Wiley, E. O., & Lieberman, B. S. (2011). Phylogenetics: theory and practice of phylogenetic systematics. John Wiley & Sons.