Sunday, 24 August 2008

Biogeography and polytomies: a fit-based approach

Most, if not all, of the implementations currently used in historical biogeography analyses cannot deal with polytomic trees (e.g. Page, 1993; Ronquist, 1996; 2001). Ronquist (2001; 2002) suggested a “solution” to this problem: giving weights to all, or some, of the dichotomic resolutions of a polytomy according to the “confidence” one has in each resolution. In my opinion, this approach suffers from one major dilemma: the same phylogeny ends up both accepting and rejecting hypotheses, because the dichotomic resolutions contradict each other. My goal in this study is therefore to suggest a different technique for dealing with polytomies, based on the fit of each resolution to the general pattern.

Methods

Two controlled data sets were used, with six and seven areas respectively; each consisted of four dichotomic trees and three trees with a trichotomy. A real data set with seven areas, made up of four dichotomic phylogenies and three phylogenies each with a trichotomy, was also used. Heuristic searches were performed in the software TREEFITTER (Ronquist, 2002) (hold=1000, neighbourhood=20). For each data set, the first search was done weighting the polytomy resolutions. The subsequent searches were done as follows: each resolution of a given polytomy was searched separately, without weighting. Only the resolution with the best fit in the general reconstruction was kept and added to the data set; the resolutions of the next phylogeny were then evaluated with the same procedure. Three orders of entry were evaluated.
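The sequential procedure described above can be sketched in Python. This is only an illustration, not TREEFITTER code: `cost` is a hypothetical callback standing in for a full search (assumed to return a value where lower means a better fit), trees are plain strings, and `toy_cost` is an invented stand-in that merely counts distinct topologies. The sketch tries every order of entry, whereas only three orders were evaluated here.

```python
from itertools import permutations
from typing import Callable, Sequence

Tree = str  # assumption: a tree is just a label, e.g. a Newick-like string


def greedy_resolve(
    resolved: Sequence[Tree],
    polytomous: Sequence[Sequence[Tree]],
    cost: Callable[[Sequence[Tree]], float],
) -> tuple[list[Tree], float]:
    """Add one resolution per polytomic phylogeny, in the given order of entry."""
    data = list(resolved)
    for resolutions in polytomous:
        # One un-weighted search per candidate resolution; keep the best-fitting one.
        best = min(resolutions, key=lambda r: cost(data + [r]))
        data.append(best)
    return data, cost(data)


def try_entry_orders(resolved, polytomous, cost):
    """Repeat the greedy procedure for every order of entry of the polytomic phylogenies."""
    results = {}
    for order in permutations(range(len(polytomous))):
        ordered = [polytomous[i] for i in order]
        data, total = greedy_resolve(resolved, ordered, cost)
        # Store the chosen resolutions (in entry order) and the final cost.
        results[order] = (data[len(resolved):], total)
    return results


def toy_cost(trees: Sequence[Tree]) -> float:
    """Hypothetical stand-in for a real search: fewer distinct topologies = better fit."""
    return float(len(set(trees)))
```

Calling `try_entry_orders(resolved_trees, polytomy_resolutions, toy_cost)` returns one entry per order of entry, which makes it easy to see whether different orders lead to different chosen resolutions or final costs.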

Additionally, with the real data set, two polytomic phylogenies were analyzed separately. Here, searches in which two of the three resolutions of a trichotomy were weighted were compared with searches using the remaining resolution.
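The pair-versus-remainder comparisons can be enumerated with a short Python sketch. The function names and the string representation of trees are assumptions made for illustration; the order in which the resolutions are listed is arbitrary and does not necessarily match the numbering 1, 2, 3 used in this post.

```python
from itertools import combinations


def trichotomy_resolutions(a, b, c):
    """Return the three dichotomic resolutions of the trichotomy (a, b, c)."""
    return [f"(({a},{b}),{c})", f"(({a},{c}),{b})", f"(({b},{c}),{a})"]


def pair_vs_remainder(resolutions):
    """Yield (weighted pair, remaining resolution) for each of the three splits."""
    for pair in combinations(resolutions, 2):
        # Exactly one resolution is left out of each pair.
        (rest,) = [r for r in resolutions if r not in pair]
        yield pair, rest
```

Each yielded tuple corresponds to one comparison: a search with the pair weighted, against a search with the remaining resolution alone.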


Results and discussion

With the first data set, all searches obtained the same topology; this is a byproduct of the high congruence among the resolved clades used. In the second data set, the weighted search recovered two topologies, whereas the other method found one of these topologies with one order of entry (fit of 21.41) and the other topology with another order of entry (fit of 22.39). What this result shows is not only that weighting is not the better way to handle polytomies in biogeographic analysis, but also that the order of entry in TREEFITTER under this polytomy treatment can be critical and has to be randomized several times.

With the real data set, the weighted and the randomized searches obtained the same topology, probably because of the nature of the data. In the last two explorations the results (see Table 1) were not conclusive. With the first phylogeny (hereafter A; the second phylogeny is B), the best-fitting topology was obtained with resolution 1, but in the searches with weighted pairs that included resolution 1 only two nodes of the reconstruction were recovered, and when the other two resolutions were used the same topology was obtained, although with a worse score. With phylogeny B, both the best fit and the structure of the reconstruction were due to resolution 1, which could indicate that when one resolution is highly congruent with the rest of the data, it is the one that confers the structure.


Table 1. Results of the comparison between the searches with weighted pairs of resolutions of a trichotomy and the searches with the remaining resolution. With phylogeny A, the best fits are obtained where resolution 1 is present, but the same topology recovered with 1 is recovered with 2-3. With phylogeny B, the structure of the reconstruction is the same in all cases where 1 is present, which is why only two nodes are shared in all comparisons.

Phylogeny A

Weighted pairs   Fit     Single un-weighted   Fit     Shared nodes
1-2              23.51   3                    24.20   3
2-3              23.83   1                    23.20   6
1-3              23.18   2                    24.20   3

Phylogeny B

Weighted pairs   Fit     Single un-weighted   Fit     Shared nodes
1-2              23.19   3                    25.19   2
2-3              23.82   1                    23.24   2
1-3              22.86   2                    25.19   2
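As a quick check of the statement that resolution 1 gives the best fit in both phylogenies, the single-resolution fits from Table 1 can be compared directly. The sketch below assumes that lower values mean a better fit, which is how the table is read in the text; the variable and function names are only illustrative.

```python
# Fits of the single un-weighted searches, copied from Table 1.
single_fit = {
    "A": {1: 23.20, 2: 24.20, 3: 24.20},
    "B": {1: 23.24, 2: 25.19, 3: 25.19},
}


def best_resolution(fits):
    """Return the resolution with the lowest fit value (assumed: lower is better)."""
    return min(fits, key=fits.get)


# Best single resolution for each phylogeny.
best = {name: best_resolution(fits) for name, fits in single_fit.items()}
```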

Although the results obtained with these explorations do not allow certainty about their causes, one thing has to be remarked: since parsimony is about finding the best fit of the available characters, polytomies should be resolved not by giving false confidence grades to the resolutions but according to their fit with the rest of the data.


Bibliography

  1. Page, R. D. M. 1993. Component 2.0.

  2. Ronquist, F. 1996. DiVa.

  3. Ronquist, F. 2001. TreeFitter 1.3b.

  4. Ronquist, F. 2002. Parsimony analysis of coevolving species associations. In: Cospeciation (R. D. M. Page, Ed.). University of Chicago Press, Chicago.

1 comment:

Salva said...

I am not sure that using fit is the ideal way to tackle the problem... for example, with the polytomy (A B C D) combined with the tree (A (B (C D))), the best fit would support the second topology, but it could happen that one later adds more polytomic topologies and, once that decision has been made, the subsequent additions only increase the "support" for (B (C D)). It would then seem that the data set strongly supports that group when in fact it is very ambiguous about it, with almost the same support for (D (B C)). If anything, this supports the idea of also producing polytomic answers!

Worse still, if one does not use the "primary" trees, one could artificially "support" groups that never appear in those trees! (For example, the polytomy is due to (D (C (B A))) and (B (D (A C))), which do not support (A (B (C D)))!!)

Polytomies are ambiguous data, and therefore one must treat them in such a way that there are indeed several solutions with exactly the same fit! (That is why we have the solution as a polytomy.) So fit cannot be the way to solve the problem of polytomies.

P.S. I recently wrote a post on this problem, in case you want to take a look ;)