Monday, December 18, 2017

Character Weighting on Binary Data




Introduction
Homoplasy is understood as similarity between taxa that results from convergence or parallelism rather than from common ancestry (Rendall & Di Fiore, 2007), in contrast to the phylogenetic definition of homology, in which a character is similar because the two taxa share a common ancestor (Vogt & Vogt, 2002). Farris (1983) established the negative relationship between homoplasy and the search for the most parsimonious tree. Taking the homoplasy of each character into account should therefore be useful when corroborating phylogenetic hypotheses. Implied weighting assigns a weight to each character according to how well it fits the tree, so the weight of a character is a value based on its homoplasy (Goloboff, 1993).
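As a rough illustration of how homoplasy is turned into a weight, the sketch below uses the concave fit function k / (k + h) from Goloboff (1993), where h is the number of extra steps a character requires on a given tree and k is the concavity constant. The function names are hypothetical and this is not TNT code; it only shows that a small k penalizes homoplasy strongly, while a very large k approaches equal weights.

```python
def implied_weight_fit(extra_steps, k):
    """Fit of one character: k / (k + extra_steps), after Goloboff (1993)."""
    if k <= 0:
        raise ValueError("the concavity constant k must be positive")
    if extra_steps < 0:
        raise ValueError("extra_steps (homoplasy) cannot be negative")
    return k / (k + extra_steps)


def total_fit(extra_steps_per_character, k):
    """Sum of character fits for one tree; higher means a better overall fit."""
    return sum(implied_weight_fit(h, k) for h in extra_steps_per_character)


if __name__ == "__main__":
    # Fit of a character with 0, 1, 2 or 3 extra steps under a few k values.
    for k in (3, 9, 999):
        fits = [round(implied_weight_fit(h, k), 3) for h in range(4)]
        print(f"k = {k}: {fits}")
```

A character with no homoplasy always has a fit of 1. With a small k the fit drops quickly with each extra step. With a large k each extra step costs almost the same small amount, which is why high concavity values behave much like an unweighted search.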


Methods
A tree of 20 terminals without branch lengths was generated with the "ape" package in R (Paradis, 2012), and binary morphological data were then simulated in Mesquite 3.31 under the Mk1 model (Lewis, 2001). Three matrices of 100, 500 and 1000 characters were generated. Maximum parsimony analyses with implied weighting were run in TNT (Goloboff, Farris, & Nixon, 2008) under different values of the concavity constant (k = 3, k = 9, k = 15, k = 30, k = 500 and k = 999) and compared with a search without weighting. Finally, the differences between the resulting topologies were measured with the distance proposed by Robinson and Foulds in 1981.
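The simulation step can be sketched in a few lines of Python. This is not Mesquite's implementation. It assumes a two-state symmetric (Mk-type) model in which the states at the two ends of a branch differ with probability 0.5 * (1 - exp(-2t)). Since the original tree had no branch lengths, it also assumes a hypothetical length t = 0.1 on every branch. The nested-tuple tree and all function names are illustrative.

```python
import math
import random


def flip_probability(branch_length):
    """Probability that a binary character differs between the ends of a branch."""
    return 0.5 * (1.0 - math.exp(-2.0 * branch_length))


def simulate_character(tree, rng, branch_length=0.1):
    """Simulate one binary character down the tree; return {terminal: state}."""
    states = {}

    def descend(node, state):
        if isinstance(node, str):  # a terminal: record its state
            states[node] = state
            return
        for child in node:
            changed = rng.random() < flip_probability(branch_length)
            descend(child, state ^ 1 if changed else state)

    descend(tree, rng.randint(0, 1))  # root state drawn with equal frequencies
    return states


def simulate_matrix(tree, n_chars, seed=0, branch_length=0.1):
    """Return {terminal: [state of each simulated character]}."""
    rng = random.Random(seed)
    columns = [simulate_character(tree, rng, branch_length) for _ in range(n_chars)]
    return {taxon: [column[taxon] for column in columns] for taxon in columns[0]}


# Small hypothetical tree; the study used 20 terminals.
tree = ((("A", "B"), "C"), ("D", "E"))
matrix = simulate_matrix(tree, n_chars=100)
```

Changing `n_chars` to 500 or 1000 gives matrices of the other two sizes used here.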


Results and Discussion
The minimum RF value was 0.4. Because RF ranges from 0 to 1, the Robinson-Foulds distances indicate that the weighting method does not produce large differences with respect to the search without weighting, which also had a value close to 0.4. The shortest distances were obtained with k = 9, although the differences are too small to support conclusions about a positive contribution to recovering the topology. The number of characters had little influence on the topology obtained, since the three matrices performed similarly overall in the tree search.
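For readers unfamiliar with the measure, the following minimal sketch computes a Robinson-Foulds distance between two trees on the same terminals. It counts the non-trivial splits (bipartitions) found in one tree but not the other. To obtain a value between 0 and 1 it divides by the total number of non-trivial splits in both trees, which is one common normalization and an assumption here, not necessarily the one used for the values above. Trees are written as nested tuples and the function names are illustrative.

```python
def leaf_set(node):
    """All terminal names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(tree):
    """Non-trivial bipartitions of a tree, each stored as the side that
    does not contain a fixed reference terminal."""
    all_taxa = leaf_set(tree)
    reference = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - clade if reference in clade else clade
        if 1 < len(side) < len(all_taxa) - 1:  # skip trivial splits
            found.add(side)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b, normalized=True):
    """Number of splits present in only one tree, optionally scaled to 0..1."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("both trees must have the same terminals")
    splits_a, splits_b = splits(tree_a), splits(tree_b)
    distance = len(splits_a ^ splits_b)
    if not normalized:
        return distance
    maximum = len(splits_a) + len(splits_b)
    return distance / maximum if maximum else 0.0


tree_1 = ((("A", "B"), "C"), ("D", ("E", "F")))
tree_2 = ((("A", "C"), "B"), ("D", ("E", "F")))
rf_raw = robinson_foulds(tree_1, tree_2, normalized=False)
rf_norm = robinson_foulds(tree_1, tree_2)
```

Under this normalization a value of 0 means identical topologies and a value of 1 means the two trees share no non-trivial split.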






Figure 1: Robinson-Foulds distances of the trees obtained under each weighting setting. Trees obtained with the 100-character matrix are shown in red, those obtained with the 500-character matrix in orange, and those obtained with the 1000-character matrix in yellow.

In conclusion, although character weighting is generally considered to improve the search for the most parsimonious tree, it did not produce large differences for the matrices simulated under this model.

References
Farris, J. S. (1983). The logical basis of phylogenetic analysis. Advances in Cladistics 2: Proceedings of the Second Meeting of the Willi Hennig Society, (September), 7–36. https://doi.org/10.1080/106351591007453
Goloboff, P. A. (1993). Estimating character weights during tree search. Cladistics, 9(1), 83–91. https://doi.org/10.1111/j.1096-0031.1993.tb00209.x
Goloboff, P. A., Farris, J. S., & Nixon, K. C. (2008). TNT, a free program for phylogenetic analysis. Cladistics, 24(5), 774–786. https://doi.org/10.1111/j.1096-0031.2008.00217.x
Lewis, P. O. (2001). A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology, 50(6), 913–925. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.467.221&rep=rep1&type=pdf
Paradis, E. (2012). Analysis of phylogenetics and evolution with R. Springer.
Rendall, D., & Di Fiore, A. (2007). Homoplasy, homology, and the perceived special status of behavior in evolution. Journal of Human Evolution, 52(5), 504–521. https://doi.org/10.1016/j.jhevol.2006.11.014

Vogt, L., & Vogt, L. (2002). Testing and weighting characters. Organisms, Diversity and Evolution, 2(4), 319–333. https://doi.org/10.1078/1439-6092-00051
