The objective of this work is to evaluate how analysing physically linked genes as separate partitions affects the phylogenetic reconstruction of the “dubium - signatum clade” (Enallagma – Zygoptera) under parsimony and maximum likelihood inference.
For this purpose, sequences of a mitochondrial region containing the COI, tRNA and COII genes were taken from GenBank (AF064995, AF064992, AF065038, AF065034, AF065033, AF065028, AF65013), and two morphological data matrices were reviewed (Brown et al., 2000; May, 2002). Characters shared by both matrices were included only once. Some characters were split, either because they referred to two characters (i.e. the presence of a structure and its form; see May, 2002, character 19) or because their states combined different independent characters recognisable by topological correspondence (Rieppel & Kearney, 2002; see Brown et al., 2000, characters 1, 20 and 22). Recoding the characters did not affect the relationships among the species of the group suggested in previous analyses. The data were analysed by partition and in different combinations, in order to evaluate the influence of partitioning on the groups recovered and on the evolutionary model selected, under the parsimony and maximum likelihood (ML) criteria. The parsimony analysis was done with NONA (Goloboff, 1999), and the characters were mapped with WinClada (Nixon, 2002). The evolutionary models were selected using the hierarchical likelihood ratio test (hLRT) as implemented in Modeltest (Posada & Crandall, 2001), and the ML searches were done with PAUP* (Swofford, 2002).
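The model selection step can be illustrated with a minimal sketch of a single likelihood ratio test between two nested substitution models. This is not Modeltest's code: the function names, the default significance level of 0.01 and the log-likelihood values in the example are assumptions made for illustration only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2.0) / i
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x)
    chi = math.sqrt(x)
    phi = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, series = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        series += term
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * phi * series

def likelihood_ratio_test(lnl_simple, lnl_complex, extra_params, alpha=0.01):
    """Compare two nested models; alpha=0.01 is an assumed threshold."""
    statistic = 2.0 * (lnl_complex - lnl_simple)
    p_value = chi2_sf(statistic, extra_params)
    return statistic, p_value, p_value < alpha

# Hypothetical log-likelihoods (not from this study): JC versus F81,
# which adds three free base-frequency parameters.
stat, p, prefer_complex = likelihood_ratio_test(-1250.0, -1238.5, 3)
print(f"2*delta lnL = {stat:.2f}, p = {p:.2g}, prefer complex model: {prefer_complex}")
```

In a hierarchical scheme, tests of this kind are chained: the model retained at one step becomes the simpler model of the next comparison.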
To check the performance of different models of evolution within the same sequence, with a known true topology as reference, data matrices of different lengths were generated under different models using the software Seq-Gen version 1.3.2 (Rambaut, 2007). The data matrices were analysed with parsimony and ML as in the previous part.
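As a rough illustration of this kind of simulation, the sketch below evolves sequences down a fixed tree under the Jukes-Cantor (JC) model. It is not Seq-Gen's implementation; the tree, branch lengths, taxon names and function names are hypothetical.

```python
import math
import random

BASES = "ACGT"

def evolve_jc69(seq, branch_length, rng):
    """Evolve a sequence along one branch (length in expected substitutions per site)."""
    # JC69 probability that a site shows the same base at both ends of the branch
    p_same = 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)
    out = []
    for base in seq:
        if rng.random() < p_same:
            out.append(base)
        else:
            # Each of the three other bases is equally likely
            out.append(rng.choice([b for b in BASES if b != base]))
    return "".join(out)

def simulate(tree, n_sites, seed=0):
    """tree: nested tuples (name, branch_length, children); leaves have children == ()."""
    rng = random.Random(seed)
    root_seq = "".join(rng.choice(BASES) for _ in range(n_sites))
    alignment = {}

    def walk(node, parent_seq):
        name, length, children = node
        seq = evolve_jc69(parent_seq, length, rng)
        if not children:
            alignment[name] = seq  # keep tip sequences only
        for child in children:
            walk(child, seq)

    walk(tree, root_seq)
    return alignment

# Hypothetical four-taxon tree ((A,B),(C,D)); branch lengths are illustrative
TOY_TREE = ("root", 0.0, (
    ("AB", 0.05, (("A", 0.1, ()), ("B", 0.1, ()))),
    ("CD", 0.05, (("C", 0.1, ()), ("D", 0.1, ()))),
))

for taxon, seq in simulate(TOY_TREE, 60, seed=1).items():
    print(taxon, seq)
```

Because the generating topology is known, any tree inferred from such an alignment can be scored by the number of true nodes it recovers.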
Results
Real data
Partition | Model | Base frequencies | Alpha (gamma distribution) |
COI | HKY + G | 0.4322 0.2264 0.0721 | 0.1220 |
COII | TrN + G | 0.3729 0.1737 0.1521 | 0.1582 |
tRNA | K80 | equal | |
COI-tRNA | HKY + G | 0.3856 0.2315 0.1138 | 0.0930 |
tRNA-COII | TrN + G | 0.3652 0.1783 0.1584 | 0.1553 |
COI-tRNA-COII | TrN + G | 0.3762 0.1846 0.1467 | 0.1630 |
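To give a sense of what gamma shape values of this size imply, the following sketch draws site rates from a gamma distribution with mean 1 and reports how many sites are nearly invariant. The function names, the threshold of 0.01 and the comparison value of 2.0 are assumptions for illustration; only the shape 0.1220 is taken from the table above.

```python
import random

def sample_site_rates(alpha, n_sites, seed=0):
    """Draw relative site rates from a gamma distribution with mean 1.

    random.gammavariate takes shape and scale, so scale = 1/alpha gives mean 1.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]

def fraction_below(rates, threshold):
    """Proportion of sites evolving more slowly than `threshold`."""
    return sum(r < threshold for r in rates) / len(rates)

# 0.1220 is the COI shape from the table; 2.0 is an arbitrary comparison value
for alpha in (0.1220, 2.0):
    rates = sample_site_rates(alpha, 50000, seed=1)
    print(f"alpha = {alpha}: fraction of sites with rate < 0.01 = "
          f"{fraction_below(rates, 0.01):.3f}")
```

Small shape values concentrate most sites near a rate of zero while a few sites change quickly, which is the pattern of strong among-site rate heterogeneity.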
Simulations
Matrix | Length | Simulated model | Calculated model | Nodes recovered by likelihood | Nodes recovered by parsimony |
1 | 200 | HKY | K2P | all | all |
2 | 600 | HKY | HKY | all | all |
3 | 600 | F81 | F81 | all | all |
1 + 2 | 200 + 600 | | HKY | all | all |
1 + 3 | 200 + 600 | | HKY | all | all |
4 | 120 | F81 + G (0.0075) | F81 + G + I | none | none |
5 | 680 | HKY + G (0.0591) | HKY + G + I | all | all |
6 | 60 | JC | F81 | all | all |
7 | 60 | JC + I | JC | two | all |
4 + 5 | 120 + 680 | | HKY + G + I | all | all |
4 + 5 + 6 | 120 + 680 + 60 | | TrN + G + I | all | all |
4 + 5 + 7 | 120 + 680 + 60 | | HKY + G + I | all | all |
As other authors have reported (Brown et al., 2000; May, 2002), the group was monophyletic under both phylogenetic inferences (parsimony and ML), with high jackknife support values, which shows that the amount of favourable evidence is greater than the amount of contradictory evidence (Goloboff et al., 2003). The internal relationships of the group were the same under both inference methods, but they disagree with the previous hypothesis at two nodes. In the present analysis the clade (pollutum,(dubium,signatum)) was recovered only in the morphological analysis (not shown); otherwise dubium appears as the sister group of the rest of the “dubium – signatum clade”, and pollutum and signatum form a group. This result could be due to the complex morphology within the genus Enallagma, and especially to the lack of clear character definitions (e.g. states that could be anything, such as “other colour”).
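The idea behind jackknife support can be sketched as follows: characters are deleted at random, the reduced matrix is reanalysed, and the support of a group is the percentage of replicates in which it appears. The sketch is not the procedure implemented in NONA; the function names, the toy matrix, the deletion probability and the replicate results are hypothetical, and the tree search itself is left out.

```python
import random

def jackknife_matrix(matrix, p_delete, rng):
    """Return a copy of the matrix with each character deleted with probability p_delete.

    matrix: dict mapping taxon name -> string of character states (equal lengths).
    """
    n_chars = len(next(iter(matrix.values())))
    kept = [i for i in range(n_chars) if rng.random() >= p_delete]
    return {taxon: "".join(states[i] for i in kept)
            for taxon, states in matrix.items()}

def group_support(replicate_clades, group):
    """Percentage of replicates whose tree contains `group`.

    replicate_clades: one set of frozensets (the clades of the tree found) per replicate.
    """
    group = frozenset(group)
    hits = sum(1 for clades in replicate_clades if group in clades)
    return 100.0 * hits / len(replicate_clades)

# Toy matrix and assumed deletion probability, for illustration only
toy = {"dubium": "0101101", "pollutum": "0111001", "signatum": "0111011"}
resampled = jackknife_matrix(toy, p_delete=0.36, rng=random.Random(1))

# Hypothetical clade sets, as if returned by a tree search on four replicates
replicates = [
    {frozenset({"pollutum", "signatum"})},
    {frozenset({"pollutum", "signatum"})},
    {frozenset({"dubium", "signatum"})},
    {frozenset({"pollutum", "signatum"})},
]
print(group_support(replicates, {"pollutum", "signatum"}))
```

A group supported by many characters and contradicted by few survives most deletions, which is why high values are read as favourable evidence outweighing contradictory evidence.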
Although the different partitions gave different results, these were not contradictory (except for the morphology, which was congruent with the previous hypothesis), and resolution and support increased with the number of characters (sequence length). This behaviour is not a rule for real data, because the clades recovered by two genes can differ; in this case, where the linkage is physical, it was expected. The support values increased because, in this particular case, the addition of new information increased the number of synapomorphies while the number of (self-congruent) contradictory characters remained low.
In both real and simulated data the model calculation behaved in the same general way: a combined data set took the most complex model found among its partitions. In the simulated data, when the simulation included variation in substitution rate among sites, the simulated model was not recovered by the hierarchical test; this is expected, because with different rates of evolution the estimation can become inconsistent (Yang, 2006). In most cases ML and parsimony recovered the true topology; ML had problems when the simulated evolution was very slow and other parameters, such as invariant sites, were involved.
Nevertheless, although the different portions of the sequence evolve under different models and at different rates, the groups recovered are not sensitive to this variation. The “dubium – signatum clade” is a very stable group: partitioning does not compromise it as a unit, and its support and resolution consistently improve as the number of characters increases.