Sunday, 24 August 2008

Biogeography and polytomies: a fit based approach

Most, if not all, implementations currently used in historical biogeography cannot handle polytomous trees (e.g., Page, 1993; Ronquist, 1996; 2001). Ronquist (2001; 2002) suggested a “solution” to this problem: give weights to all, or some, of the dichotomous resolutions of a polytomy according to the “confidence” one has in each resolution. In my opinion this approach suffers from one major dilemma: the same phylogeny simultaneously accepts and rejects different hypotheses, because the dichotomous resolutions contradict each other. My goal in this study is therefore to suggest a different technique for dealing with polytomies, based on the fit of each resolution to the general pattern.

Methods

Two controlled data sets were used, one with six areas and one with seven; each comprised four dichotomous trees and three trees with a trichotomy. A real data set with seven areas, four dichotomous phylogenies and three phylogenies with a trichotomy each was also used. Heuristic searches were performed in TREEFITTER (Ronquist, 2002) (hold=1000, neighbourhood=20). For each data set, the first search weighted the polytomy resolutions. The subsequent searches proceeded as follows: each resolution of a given polytomy was searched separately, without weighting. Only the resolution with the best fit in the general reconstruction was kept and added to the data set; the resolutions of the next phylogeny were then evaluated with the same procedure. Three orders of entrance were evaluated.
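
The sequential procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trees are nested tuples of area names, `toy_fit` is a made-up stand-in cost (the number of distinct sister pairs across trees, lower meaning better fit) and is not the cost computed by TREEFITTER, and all function names are hypothetical.

```python
from itertools import permutations

def resolutions(trichotomy):
    """Return the three dichotomous resolutions of a trichotomy (a, b, c)."""
    a, b, c = trichotomy
    return [((a, b), c), ((a, c), b), ((b, c), a)]

def cherries(tree):
    """Collect sister pairs of leaves (cherries) in a nested-tuple tree."""
    if isinstance(tree, str):
        return set()
    left, right = tree
    found = cherries(left) | cherries(right)
    if isinstance(left, str) and isinstance(right, str):
        found.add(frozenset((left, right)))
    return found

def toy_fit(trees):
    """Toy cost (assumption): distinct cherries across trees, lower = more congruent."""
    return len(set().union(*(cherries(t) for t in trees)))

def resolve_in_order(resolved_trees, polytomies, order, fit):
    """Add polytomic phylogenies one at a time, keeping only the
    resolution that gives the lowest cost with the data held so far."""
    data = list(resolved_trees)
    for name in order:
        best = min(resolutions(polytomies[name]), key=lambda r: fit(data + [r]))
        data.append(best)
    return data, fit(data)

def best_over_orders(resolved_trees, polytomies, fit):
    """Repeat the procedure for every order of entrance; return the
    (cost, order, trees) triple with the lowest final cost."""
    results = []
    for order in permutations(sorted(polytomies)):
        data, cost = resolve_in_order(resolved_trees, polytomies, order, fit)
        results.append((cost, order, data))
    return min(results, key=lambda item: item[0])

# Hypothetical data: one resolved tree and two trichotomies.
resolved = [(("A", "B"), ("C", "D"))]
polys = {"P1": ("A", "B", "C"), "P2": ("C", "D", "A")}
cost, order, trees = best_over_orders(resolved, polys, toy_fit)
print(cost, order)
```

Ties are broken by taking the first candidate, which is one reason the order of entrance can matter; with many polytomies, a random sample of orders would replace the full set of permutations.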

Additionally, two polytomous phylogenies from the real data set were analyzed separately. Here, searches in which two of the three resolutions of a trichotomy were weighted were compared with searches using the remaining resolution alone.


Results and discussion

With the first data set, all searches obtained the same topology; this is a byproduct of the high congruence among the resolved clades used. In the second data set, the weighted search recovered two topologies, whereas the fit-based method found one of these topologies under one order of entrance, with a fit of 21.41, and the other under a different order of entrance, with a fit of 22.39. This result shows not only that weighting is not the best way to handle polytomies in biogeographic analysis, but also that the order of entrance in TREEFITTER under this polytomy treatment can be critical and has to be randomized several times.

With the real data set, the weighted and the randomized searches obtained the same topology, probably because of the nature of the data. In the last two explorations the results (see Table 1) were not conclusive. For the first phylogeny (hereafter A; the second is B), the best-fitting topology was obtained with resolution 1; however, in the pair searches that included resolution 1 only two nodes of the reconstruction were recovered, whereas when the other two resolutions were used the same topology was obtained, although with a worse score. For phylogeny B, both the best fit and the structure of the reconstruction were due to resolution 1, which could indicate that when one resolution is highly congruent with the rest of the data, it is the one that confers the structure.


Table 1. Comparison between searches with weighted pairs of trichotomy resolutions and searches with the remaining resolution alone. For phylogeny A, the best fits are obtained where resolution 1 is present, but the topology recovered with resolution 1 alone is also recovered with the pair 2-3. For phylogeny B, the structure of the reconstruction is the same in all cases where resolution 1 is present, which is why only two nodes are shared in all comparisons.

Phylogeny A

Weighted pair    Fit      Single unweighted    Fit      Shared nodes
1-2              23.51    3                    24.20    3
2-3              23.83    1                    23.20    6
1-3              23.18    2                    24.20    3

Phylogeny B

Weighted pair    Fit      Single unweighted    Fit      Shared nodes
1-2              23.19    3                    25.19    2
2-3              23.82    1                    23.24    2
1-3              22.86    2                    25.19    2

Although these explorations do not allow certainty about the causes of the results, one point should be stressed: since parsimony is about finding the best fit of the available characters, polytomies should be resolved not by assigning false degrees of confidence to the resolutions, but according to their fit with the rest of the data.


Bibliography

  1. Page, R. D. M. 1993. Component 2.0.

  2. Ronquist, F. 1996. DiVa.

  3. Ronquist, F. 2001. TreeFitter 1.3b.

  4. Ronquist, F. 2002. Parsimony analysis of coevolving species associations. In: Cospeciation (R. D. M. Page, Ed.). University of Chicago Press, Chicago.

Saturday, 23 August 2008

Endemism vs Richness: an example

Introduction

Biodiversity hotspots have a prominent role in conservation biology (Myers et al., 2000), but it remains controversial to what extent different types of hotspot are congruent (Bonn et al., 2002). Several authors state that richness (number of species per area) is equivalent to endemism (Thomas & Mallorie, 1985; Soria-Auza & Kessler, 2008). Orme et al. (2005), however, disagree: in their analysis the richest areas are not congruent with centers of endemism. The simplest way to estimate richness is to count the number of species per area. Several other estimators have been proposed, among them those of Chao (1984; 1987), Burnham & Overton (1978; 1979), Heltshe & Forrester (1983), Smith & van Belle (1984), and Raaijmakers (1987) (see the DIVA-GIS manual). An area of endemism is an area of nonrandom distributional congruence among taxa (Platnick, 1991). Several authors have developed techniques to identify areas of endemism. N.D.M. implements an optimality criterion based on the presence or absence of species in the grid cells of a given area (the number of species that compose the area, that is, species found nowhere else).

Methodology

Georeferenced records were taken from the endemism analyses of Bolívar and Miranda-Esquivel (2009). The data were organized in DIVA-GIS v. 5.4. A richness analysis (number of species per area) was conducted in DIVA-GIS, using the Chao 2 richness estimator (Chao, 1987) to estimate the number of species per area. Chao 2 is based on the number of samples in an area; to create samples, DIVA-GIS divides each grid cell into 4 or 9 sub-areas. The grid sizes used in the richness analyses were 1° x 1° and 0.5° x 0.5°.
The endemism analyses were performed in N.D.M. v. 2.5 (Goloboff, 2006), using grid sizes of 0.5° x 0.5° (n=2000; postchk; m=10; M=100) and 0.25° x 0.25° (n=2000; postchk; m=10; M=15). Several searches were conducted in N.D.M. with different parameters to identify changes in the resulting endemic areas. The endemic areas were then compared with the richest areas.
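
As a rough illustration of the richness step, the sketch below bins georeferenced records into grid cells, splits each cell into sub-areas that act as samples, and applies the classic Chao 2 formula, S = S_obs + Q1² / (2·Q2), where Q1 and Q2 are the numbers of species found in exactly one and exactly two samples. It is a minimal sketch, not the DIVA-GIS implementation: the function names, the record format and the example coordinates are assumptions, and the bias-corrected fallback used when Q2 = 0 is one common convention rather than something stated in the methods.

```python
import math
from collections import defaultdict

def cell_of(lon, lat, size):
    """Index (column, row) of the grid cell of `size` degrees containing a point."""
    return (math.floor(lon / size), math.floor(lat / size))

def chao2(samples):
    """Chao 2 richness estimate from a list of samples (each a set of species)."""
    counts = defaultdict(int)
    for sample in samples:
        for species in sample:
            counts[species] += 1
    m = len(samples)
    if m == 0:
        return 0.0
    s_obs = len(counts)
    q1 = sum(1 for n in counts.values() if n == 1)  # species in exactly one sample
    q2 = sum(1 for n in counts.values() if n == 2)  # species in exactly two samples
    if q2 == 0:
        # Assumed fallback: bias-corrected form, which avoids division by zero.
        return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / 2
    return s_obs + q1 * q1 / (2 * q2)

def richness_by_cell(records, size, splits=2):
    """Observed richness and Chao 2 estimate per grid cell.

    records: iterable of (species, lon, lat). Each cell is divided into
    splits x splits sub-areas (2 -> 4 sub-areas, 3 -> 9), used as samples.
    Only sub-areas that contain at least one record count as samples.
    """
    sub = size / splits
    cells = defaultdict(lambda: defaultdict(set))
    for species, lon, lat in records:
        cells[cell_of(lon, lat, size)][cell_of(lon, lat, sub)].add(species)
    result = {}
    for cell, subcells in cells.items():
        samples = list(subcells.values())
        observed = set().union(*samples)
        result[cell] = (len(observed), chao2(samples))
    return result

# Hypothetical records: (species, longitude, latitude).
records = [
    ("sp1", -73.2, 4.1), ("sp1", -73.7, 4.6), ("sp2", -73.2, 4.1),
    ("sp3", -73.8, 4.2), ("sp4", -72.5, 4.5),
]
print(richness_by_cell(records, size=1.0))
```

Points that fall exactly on a cell boundary may be assigned inconsistently because of floating-point rounding, especially with 9 sub-areas; a production analysis would need an explicit rule for such cases.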

Results

The South-Western zone is the richest part of my study area (see Fig. 1). Medium richness levels are found in the Western and Central regions. 90% of the species are in these zones (more than 3400 species). In the two richness analyses (1° x 1° and 0.5° x 0.5°), the species-rich areas are partially different. However, the general pattern is similar between them, showing significant richness levels in the South-Western region.

The analysis using a grid size of 0.5° x 0.5° generates 53 endemic areas (Fig. 2). Among them, five show the maximum Endemicity Index (EI ≥ 100). 43 endemic areas lie in the South-Western region of the study area; other endemic areas are found in the Western and Central regions. The second analysis shows identical endemic areas (Fig. 3). The South-Western region is the most endemic one, and the endemic areas with lower EI than those of the South-Western region are found in the Western and Central regions.

In the comparison at 0.5° x 0.5° vs 1° x 1° grid sizes, the endemism and richness patterns are similar. The richest areas are congruent with the most endemic areas (EI ≥ 100; Chao 2 ≥ 3400). Further, endemic areas with medium EI (between 40.0 and 70.0) are located in the same regions as the areas with medium richness levels (Chao 2 between 1700 and 3400).







In the comparison at 0.25° x 0.25° vs 0.5° x 0.5° grid sizes, the endemism and richness patterns show some incongruence. Although the general endemic/rich areas are recovered, the most endemic areas are not identified as the richest areas (Fig. 2). Likewise, some endemic areas are not estimated as rich areas (see Appendix 1).








Discussion

In my results, the fit between endemism centers and richness is conditioned by the resolution of the analyses. Likewise, the hierarchy resulting from the optimality criterion (Szumik & Goloboff, 2004; 2007) can be considered equivalent to the degree of richness estimated with Chao 2. Varying the parameters in N.D.M. does not affect the similarity between the analyses. Thus, richness is equivalent to endemism, but their similarity depends on the resolution of the analysis.

My results, which support a rather weak relationship between richness and endemism indices, agree with the recent observation that patterns of avian species richness are determined by widely distributed species rather than by restricted-range species (Lennon et al., 2004). In birds, endemic species richness is thought to be a product of either refugia from past extinctions or high rates of ecological and allopatric speciation.

This incongruence among resolutions has important implications for understanding the ecological and evolutionary mechanisms that underlie the origin and maintenance of biodiversity (Orme et al., 2005).

Additionally, the lack of congruence among approaches has implications for the use of areas or hotspots in conservation. If congruence among hotspot types is high, then it may not matter which index of diversity is used to guide conservation policy, because any such index could act as an effective surrogate for other aspects of diversity (Orme et al., 2005).

Bibliography

Myers, N., Mittermeier, R. A., Mittermeier, C. G., da Fonseca, G. A. B. & Kent, J. Biodiversity hotspots for conservation priorities. Nature 403, 853-858 (2000).
Bonn, A., Rodrigues, A. S. L. & Gaston, K. J. Threatened and endemic species: are they good indicators of patterns of biodiversity on a national scale? Ecol. Lett. 5, 733-741 (2002).
Thomas, C. D. & Mallorie, H. C. Rarity, species richness and conservation: butterflies of the Atlas Mountains in Morocco. Biol. Conserv. 33, 95-117 (1985).
Berg, A. & Tjernberg, M. Common and rare Swedish vertebrates - distribution and habitat preferences. Biodivers. Conserv. 5, 101-128 (1996).
Jetz, W., Rahbek, C. & Colwell, R. K. The coincidence of rarity and richness and the potential signature of history in centres of endemism. Ecol. Lett. 7, 1180-1191 (2004).
Lennon, J. J., Koleff, P., Greenwood, J. J. D. & Gaston, K. J. Contribution of rarity and commonness to patterns of species richness. Ecol. Lett. 7, 81-87 (2004).
Jetz, W. & Rahbek, C. Geographic range size and determinants of avian species richness. Science 297, 1548-1551 (2002).
Orme, C. D. L., Davies, R. G., Burgess, M., Eigenbrod, F., Pickup, N., Olson, V. A., Webster, A. J., Ding, T., Rasmussen, P. C., Ridgely, R. S., Stattersfield, A. J., Bennett, P. M., Blackburn, T. M., Gaston, K. J. & Owens, I. P. F. Global hotspots of species richness are not congruent with endemism or threat. Nature 436, 1016-1019 (2005).
Soria-Auza, R. W. & Kessler, M. The influence of sampling intensity on the perception of the spatial distribution of tropical diversity and endemism: a case study of ferns from Bolivia. Diversity and Distributions 14, 123-130 (2008).
Szumik, C. A., Cuezzo, F., Goloboff, P. A. & Chalup, A. E. An optimality criterion to determine areas of endemism. Systematic Biology 51, 806-816 (2002).
Szumik, C. A. & Goloboff, P. A. Areas of endemism: an improved optimality criterion. Systematic Biology 53, 968-977 (2004).

Thursday, 21 August 2008

Inferring the Geographic Range Evolution



Introduction
Some methods in biogeography are based on the assumption that there is a single branching pattern among areas caused by vicariance and that this pattern is common to many different groups of organisms (Nelson, 1974; Rosen, 1976; Nelson and Platnick, 1981). Other approaches focus on reconstructing the distribution history of individual groups (taxon biogeography) and on the search for general area relationships (area biogeography); the latter use character optimization methods, which allow the reconstruction of ancestral distributions without constraining area relationships to hierarchical patterns (Bremer, 1992; Ronquist, 1994). Among these methods are dispersal-vicariance analysis (Ronquist, 1997), which uses a Fitch optimization, and the dispersal-extinction-cladogenesis (DEC) model (Ree and Smith, 2008), which implements a maximum likelihood optimization. The objective of the present study was to reconstruct ancestral distributions using both approaches and to contrast the findings.

Methods
Previously published sequence data for the avian genera Pipilo and Toxostoma (Zink et al., 1998; 1999) were used. For Pipilo sp., the mitochondrial control region and the cytochrome b and NADH dehydrogenase subunit 2 genes were considered. For Toxostoma, only the mitochondrial control region and the cytochrome b gene were included. Each gene for each taxon was analyzed separately. The sequences were aligned with Muscle (Edgar, 2004) using the default parameters. The best-fit model of nucleotide substitution was determined using a hierarchical likelihood ratio test (Posada and Crandall, 2001) as implemented in Modeltest (Posada and Crandall, 1998). Maximum likelihood (ML) analyses were done in PhyML (Guindon and Gascuel, 2003). The distributions of the taxa and their ancestral areas were described in terms of the areas proposed by Zink et al. (2000) with minor modifications (Figure 1): California plus Baja California (area A), Sonoran desert (area B), Chihuahuan desert plus Sinaloan shrubland (area C), and the highlands of southern Mexico (area D). To reconstruct the ancestral distributions for the areagrams, a dispersal-vicariance optimization (Ronquist, 1997) was undertaken in DIVA (Ronquist, 1996) and a dispersal-extinction-cladogenesis (DEC) model was fitted in Lagrange (Ree and Smith, 2008).
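
Each step of a hierarchical likelihood ratio test compares two nested substitution models through the statistic 2(lnL_alt − lnL_null), referred to a chi-square distribution whose degrees of freedom equal the number of extra free parameters. The sketch below shows a single such comparison using only the Python standard library. It is a simplified illustration, not the Modeltest procedure: the log-likelihood values are hypothetical, the function names are my own, and refinements such as mixed chi-square distributions for parameters on a boundary are left out.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in sqrt(x).
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods: a simpler model vs. one with an extra parameter.
stat, p = likelihood_ratio_test(lnl_null=-1502.3, lnl_alt=-1498.1, df=1)
print(round(stat, 2), p < 0.01)
```

In a hierarchical scheme, the model accepted at one step becomes the null model of the next comparison, so the final choice can depend on the order in which parameters are added.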

Results and Discussion
The final data sets for each gene included six taxa for Pipilo sp. and seven taxa for Toxostoma sp. (GenBank accession numbers available upon request). The lengths of the alignments obtained with Muscle (Edgar, 2004) are presented in Figure 2. The Hasegawa-Kishino-Yano model with Γ-distributed rates (HKY + Γ) (Hasegawa et al., 1985) was the best fit to each data set, with a shape parameter α of 0.3 for the mitochondrial control region of Pipilo sp. and 0.02 for the remaining data sets. The ML phylogenetic trees are shown in Figure 3. The same relationships were found with each gene for each genus, although branch lengths differed among genes. Only one areagram resulted for each genus, as shown in Figure 4. Under the DEC model, the most likely ancestral areas for Pipilo sp. and Toxostoma sp. were area B and area D respectively, with other areas having lower likelihoods for each genus (−ln(L) values available upon request) (Figure 5). The dispersal-vicariance optimization gave the union of areas B and D (BD) as the ancestral area for Pipilo sp., and the combinations AD, BD and CD for Toxostoma. Because in the DEC model widespread ranges are the direct outcome of dispersal events, some optimizations (see Figure 5 for Pipilo sp./NADH gene) are the outcome solely of dispersal and extinction. In all scenarios, the number of biogeographical events required to explain the present distribution is lower in the DIVA reconstruction than in any of the DEC reconstructions, because DIVA explains some cladogenesis events as the result of vicariance from a widespread ancestor rather than as dispersals followed by extinctions in the original area. Like Fitch optimization, DIVA minimizes dispersal and extinction, and it is based on allopatric speciation (vicariance) rather than on sympatric speciation.
On the other hand, the DEC model assumes that if an ancestor is widespread, speciation arises either between a single area and the rest of the range (allopatric speciation) or within a single area (sympatric speciation) (Ree et al., 2005). Nonetheless, the results presented here show that the DEC model preferred the former, with one daughter species always inheriting a single-area range and the other inheriting the remainder. To examine how branch lengths affect the dispersal and extinction outcome, all branch lengths in the phylogenetic tree for the NADH gene in the genus Pipilo were set to 1.000 and to 0.001 in two separate analyses. The results showed an inverse relation between branch lengths and the dispersal/extinction rates: a long branch indicated that the taxon had less chance of dispersing and going extinct. Hence, under the DEC model we have to assume that the rate of evolutionary change is equal throughout the tree and, furthermore, that such change can be related to the potential of a taxon to expand or reduce its geographic range. Finally, it seems that restricting the root to a single area could be problematic when searching for general area relationships using different hypotheses and trying to fit areagrams to them; it may only work for island scenarios, where we can refer to colonization and geographic range expansion to the nearest islands (a pure dispersalist approach). In our data analyses, the DEC model suggested different ancestral areas for each genus, whereas DIVA considered both possibilities in each case.
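
To make the two cladogenesis scenarios concrete, the following sketch enumerates the daughter ranges that a given ancestral range can produce. It reflects my reading of the scenarios in Ree et al. (2005): in the allopatric case one daughter inherits a single area and the other the remainder, while in the within-area case one daughter inherits a single area and the other keeps the whole ancestral range. The function name and the string encoding of ranges are assumptions, and no likelihoods are computed.

```python
def range_inheritance(ancestral_range):
    """Enumerate (scenario, daughter_1, daughter_2) for an ancestral range.

    Ranges are strings of single-letter area codes, e.g. "BD".
    """
    areas = sorted(set(ancestral_range))
    whole = "".join(areas)
    if len(areas) == 1:
        # Single-area ancestor: both daughters inherit that area.
        return [("within-area", whole, whole)]
    events = []
    for area in areas:
        rest = "".join(a for a in areas if a != area)
        # Allopatric: one daughter takes a single area, the other the remainder.
        events.append(("allopatric", area, rest))
        # Within a single area: the other daughter keeps the whole range.
        events.append(("sympatric", area, whole))
    return events

for event in range_inheritance("BD"):
    print(event)
```

The daughters are listed in a fixed order, so for a two-area range the allopatric split appears twice (B|D and D|B); whether these count as one event or two depends on whether the descendant branches are treated as distinguishable.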





Figure 1. General distribution of the areas: California plus Baja California (area A), Sonoran desert (area B), Chihuahuan desert plus Sinaloan shrubland (area C), and the highlands of southern Mexico (area D).



Figure 2. Length of the obtained alignments for the genera Pipilo and Toxostoma.



Figure 3. Maximum likelihood trees for the genera Pipilo and Toxostoma using different genes.



Figure 4. Areagrams for the genera Pipilo and Toxostoma.



Figure 5. Optimal ancestral distribution for the genera Pipilo and Toxostoma using dispersal-vicariance optimization and the Dispersal-extinction-cladogenesis (DEC) model approach with different genes.