Wednesday, 10 March 2021

Philosophical Posture

 

PHILOSOPHICAL POSTURE

Iver Leandro Daza

Traditionally, the objective of parsimony has been reduced to the maxim attributed to William of Ockham: "it is useless to do with more what can be done with less", which is said to be the essence of the principle of parsimony [1]. Parsimony, among other principles of inference, has been proposed as a way to reach conclusions about phylogenetic relationships [2]. Maximum parsimony (MP) seeks the tree topology that requires the fewest character-state changes to account for the characteristics of the terminals in the tree and, additionally, provides an ordering of the resulting trees from "best" to "worst" [3]. This inference method requires the researcher to distinguish between ancestral and derived characters, that is, plesiomorphies from apomorphies, in the data set to be used [4]. Under this criterion, the best genealogy is the one that contains the least homoplasy [2].

One of the main arguments against parsimony is that it implicitly assumes doubtful propositions about evolutionary processes [3]. Authors who oppose its use, such as Felsenstein, argue that it may be inconsistent under a simplified process model and consider that it has a higher probability of failure with empirical data [5]. Another argument is that parsimony tries to estimate too many parameters: if each character in the data is allowed its own branch-length vector, the parsimony results are similar to those of maximum likelihood [6]. Parsons also argues against it, noting that the simplest explanation is not always correct and that there is no prima facie reason to believe that a theory is correct because of the number of objects or entities it contains [7].

Despite this opposition, others favor the parsimony perspective. Platnick and Gaffney, in a discussion of Popper and systematics, suggested that the best method for phylogenetic inference is the one for which strong tests can be developed in the Popperian sense, which could offer a justification for the use of cladistic parsimony [8]. In fact, Sober maintains that phylogenetic reconstructions are better interpreted in terms of Popperian corroboration [5]. Although it is generally accepted that likelihood is more appropriate for molecular analyses and parsimony for morphological data, this has not been proven; it is more "a social fact and not something logically inevitable". In other words, the data carry no explicit indicator of which method is better for the analysis [3].

The interference of homoplasy in the study of phylogenetic relationships within the clade Caudata does not seem to have been fully resolved by these approaches. For that reason, the implied weighting method of MP has a certain appeal: no other study, not even the amphibian tree of life phylogeny [9], has tried to reweight characters despite the rampant homoplasy that some authors claim to have found [10,11]. The justification for the use of parsimony depends heavily on philosophical and statistical inference [1], just as corroboration and probability are linked, as discussed by Popper [8]. In maximum parsimony, implied weighting follows the same inverse relationship that holds between corroboration and probability: the more homoplasy a character shows on a tree, the less impact it has on the subsequent construction of trees.

The debate over which method is better will continue, and I do not claim that future work using Bayesian inference or maximum likelihood will be unhelpful. The current approach is a proposal to address the present situation and to emphasize the importance of the parsimony method for the family-level relationships of salamanders.

Bibliography

1.Sober E. 2004 The contest between parsimony and likelihood. Syst. Biol. 53, 644–653.

2.Rodríguez A et al. 2017 Inferring the shallow phylogeny of true salamanders (Salamandra) by multiple phylogenomic approaches. Mol. Phylogenet. Evol. 115, 16–26.

3.Coelho MTP, Diniz-Filho JA, Rangel TF. 2019 A parsimonious view of the parsimony principle in ecology and evolution. Ecography (Cop.). 42, 968–976.

4.Sober E. 1983 Parsimony in Systematics. Ann. Rev. Ecol. Syst. 14, 335–357.

5.Popper K. 2005 The logic of scientific discovery. Routledge.

6.Faith DP, Cranston PS. 1992 Probability, parsimony, and Popper. Syst. Biol. 41, 252–257.

 

Parsimony analysis using implied weighting on an Asian salamander family (Amphibia, Caudata)


Introduction

Caudata is a group of 578 species in 67 genera that, together with modern Anura and extant Gymnophiona, forms the Lissamphibia [1,2]. These species belong to 10 families: Cryptobranchidae, Hynobiidae, Proteidae, Sirenidae, Dicamptodonidae, Ambystomatidae, Salamandridae, Plethodontidae, Amphiumidae and Rhyacotritonidae [2]. Since some genera of salamanders have been used as study models [3–6], phylogenetic analyses with more robust resolution at the family level are necessary [3]. Such investigations search for evolutionary patterns through comparative studies, for example of the mechanisms of tissue regeneration, and they can identify unique and derived characters, as well as homoplasy, as long as they rely on phylogenies with adequate resolution [7]. Homoplasy is frequently found within Caudata [8,9]. Many of the characters present in Caudata are not synapomorphies, which suggests that this tendency may also be reflected at the molecular level [10]. A phylogeny therefore makes it possible to answer questions about the behaviour, origin or development of certain characters; however, arguments that explain characteristics in plethodontids may not work for hynobiids [10]. To search for these arguments, parsimony inference has been used to build the phylogenies [3,8,9,11–13]. Yet despite the abundance of homoplasy throughout Caudata, parsimony analyses have used equal weights. The relationship between character weighting and parsimony was discussed previously, and it was stated that "the most parsimonious cladogram is the hypothesis with the most explanatory power, given the weight that each character deserves", concluding that results from appropriately weighted data should always be preferred [14]. Furthermore, reducing the weight of characters with greater homoplasy has been treated favourably on several occasions [15].
Implied character weighting, as this method of weighting against homoplasy is known, improves analysis results even on large molecular data sets [16]. If the outcome of a given implied weighting analysis is not satisfactory, "the results can only be criticized on the grounds that the weights have not been assigned in the best way" [15]. The weighting is done through a concavity function, a decreasing function that weights characters according to their homoplasy so that, when the fit of trees is compared, the influence of that homoplasy is properly rescaled [17]. The influence of homoplasy is adjusted with a constant, K. However, relying on a single optimal value can be misleading, because a group may appear only under that value, and the monophyly of the group cannot be considered firmly established just because it occurs at or close to that value [18].
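
As an illustration of how the concavity constant rescales homoplasy, the following minimal Python sketch uses the commonly cited form of the implied weighting fit, k / (k + h), where h is the number of extra steps of a character on a tree. The function names and the homoplasy values are hypothetical and chosen only for illustration; this is not the internal implementation of any program.

```python
def implied_weight_fit(extra_steps, k):
    """Concave fit k / (k + h) for a character with h extra steps (homoplasy)."""
    if k <= 0:
        raise ValueError("k must be positive")
    if extra_steps < 0:
        raise ValueError("extra_steps must be non-negative")
    return k / (k + extra_steps)


def total_fit(extra_steps_per_character, k):
    """Sum of character fits; under implied weighting, trees with higher total fit are preferred."""
    return sum(implied_weight_fit(h, k) for h in extra_steps_per_character)


if __name__ == "__main__":
    # Hypothetical extra steps for four characters on one tree (assumption, not real data).
    hypothetical_homoplasy = [0, 1, 3, 10]
    for k in (3, 23):
        fits = [round(implied_weight_fit(h, k), 3) for h in hypothetical_homoplasy]
        print(f"K = {k}: fits = {fits}, total = {total_fit(hypothetical_homoplasy, k):.3f}")
```

A small K penalizes homoplastic characters strongly, while a large K brings the fits closer to 1 for every character, which approaches equal weighting.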

Methodology

For this analysis, data were obtained from Weisrock et al. [12]: a total of 14 terminals within the family Hynobiidae, with 1 mitochondrial gene and 1 nuclear gene (12S-16S and ND2-COI). The sequences were aligned with MUSCLE [19] using Andrias japonicus as the outgroup, and gaps were treated as missing data. The analysis was carried out under maximum parsimony in TNT v.1.5 [20], setting the K value according to previous guidelines [21]; in this work, 5 values from those guidelines were evaluated (20, 21, 23, 26, 30). A heuristic search was then performed with the "mult" command using 100 replications, TBR, 10 drifting iterations and a random addition sequence. Tree support was assessed with the "resample" command using 500 bootstrap replicates, and each tree was saved with its support values.
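
The general idea behind bootstrap support can be sketched as follows: alignment columns are resampled with replacement to build pseudo-replicate matrices, and a tree is searched for each one. The Python below is only a conceptual sketch with a hypothetical function name and a toy alignment; it does not reproduce how the "resample" command works internally.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment maps each terminal name to its aligned sequence (equal lengths).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # Draw as many column indices as there are sites, allowing repeats.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in alignment.items()}


if __name__ == "__main__":
    # Toy alignment with made-up terminals (assumption, not the data used here).
    toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "TCGAAC"}
    rng = random.Random(42)
    replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
    print(len(replicates), "pseudo-replicate matrices generated")
```

Each pseudo-replicate would then be analyzed with the same search settings, and the support of a group is the proportion of replicates in which it is recovered.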

Results and discussions

Each value of K produced a tree to which bootstrap was applied. The topologies recover the same phylogenetic relationships, except for Hynobius amjiensis and H. leechii, which form a polytomy under K = 26 and K = 30. Figure 1 shows the tree with the best support obtained across the different values of K.

Fig. 1 Phylogeny of the clade Hynobiidae obtained with K = 23, including the outgroup A. japonicus and bootstrap values




Fig. 2 Phylogeny of the clade Hynobiidae obtained with K = 30, including the outgroup A. japonicus and bootstrap values

The recovered relationships are mostly similar; however, the differences between the topologies obtained with K = 26 and K = 30 lead us to support the topology shown in figure 2. The bootstrap supports of the maximum parsimony tree in the original work are higher than those obtained with TNT in this analysis. This may be due to the genera sampled for the present analysis. On the other hand, the phylogenetic relationships obtained here do not differ much from those obtained by Weisrock et al. [12], a result that may also be associated with the number of species sampled.

Conclusions

Implied weighting was used in the parsimony analysis, and larger differences in phylogenetic relationships were expected, reflecting the down-weighting of characters according to their homoplasy. However, the results show no such impact, so an evaluation with more data and more terminals in the data matrix is needed to show whether there is a significant difference. Given the results obtained here, it is possible that implied weighting will not differ markedly from equal weighting in maximum parsimony, even for clades where homoplasy is pervasive across characters.

Bibliography

1. Blackburn D, Wake D. 2011 Class Amphibia Gray, 1825. In: Zhang, Z.-Q.(Ed.) Animal biodiversity: An outline of higher-level classification and survey of taxonomic richness. Zootaxa 3148, 39–55.

2. Pearson M. 2016 Phylogeny and systematic history of early salamanders. University College London.

3. Zhang P, Wake DB. 2009 Higher-level salamander relationships and divergence dates inferred from complete mitochondrial genomes. Molecular Phylogenetics and Evolution. 53, 492–508. 

4. Arenas Gómez CM, Gómez Molina A, Zapata JD, Delgado JP. 2017 Limb regeneration in a direct-developing terrestrial salamander, Bolitoglossa ramosi (Caudata: Plethodontidae). Regeneration 4, 227–235.

5. Liu Q, Zhang Y, Wang J, Yang H, Hong L. 2020 Modeling of the neural mechanism underlying the terrestrial turning of the salamander. Biological Cybernetics. 114, 317–336.

6. Parish CL, Beljajeva A, Arenas E, Simon A. 2007 Midbrain dopaminergic neurogenesis and behavioural recovery in a salamander lesion-induced regeneration model. Development 134, 2881–2887.

7. Dwaraka VB, Voss SR. 2019 Towards comparative analyses of salamander limb regeneration. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution

8. Gao KQ, Shubin NH. 2001 Late Jurassic salamanders from northern China. Nature 410, 574–577.

9. Mueller RL, Macey JR, Jaekel M, Wake DB, Boore JL. 2004 Morphological homoplasy, life history evolution, and historical biogeography of plethodontid salamanders inferred from complete mitochondrial genomes. Proceedings of the National Academy of Sciences of the United States of America. 101, 13820–13825.

10. Wake DB. 2009 What salamanders have taught us about evolution. Annual Review of Ecology, Evolution, and Systematics. 40, 333–352.

11. Faivovich N et al. 2006 The amphibian tree of life. Bulletin of the American Museum of Natural History

12. Weisrock DW, Macey JR, Matsui M, Mulcahy DG, Papenfuss TJ. 2013 Molecular phylogenetic reconstruction of the endemic Asian salamander family Hynobiidae (Amphibia, Caudata). Zootaxa 3626, 77–93.

13. Weisrock DW, Harmon LJ, Larson A. 2005 Resolving deep phylogenetic relationships in salamanders: Analyses of mitochondrial and nuclear genomic data. Systematic Biology. 54, 758–777.

14. Farris J. 1983 The logical basis of phylogenetic taxonomy. Systematic Biology. 54, 595–619.

15. Goloboff PA. 1995 Parsimony and weighting: a reply to Turner and Zandee. Cladistics, 91–104.

16. Goloboff PA. 2013 Extended implied weighting. Cladistics 1, 1–13.

17. Goloboff PA. 1995 Parsimony and weighting: a reply to Turner and Zandee. Cladistics, 91–104.

18. Goloboff PA, Carpenter JM, Arias JS, Esquivel DRM. 2008 Weighting against homoplasy improves phylogenetic analysis of morphological data sets. Cladistics 24, 758–773.

19. Madeira F et al. 2019 The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Research. 47, W636–W641.

20. Giribet G. 2005 TNT: Tree Analysis Using New Technology. Systematic Biology. 54, 176–178.

21. Mirande JM. 2018 Morphology, molecules and the phylogeny of Characidae (Teleostei, Characiformes). Cladistics 35, 282–300.


Philosophy in phylogenetic analyses

 

Philosophy in Phylogenetic Analysis

María Cristina Navas Serrano 2170058.


In this essay, I will explain why, in my view, Bayesian philosophy offers a more credible framework of beliefs than likelihoodism.

First of all, Bayesian philosophy, in general terms, uses Bayes' theorem to support its beliefs; Richard Royall himself uses it to answer the question of what we should believe given the evidence. But why believe in Bayesian philosophy and not in likelihoodism? The line of thought proposed by Bayesian philosophy is firm: it states, on a mathematical foundation, whether or not we should believe a hypothesis given the evidence. Likelihoodism, in contrast, has no answer to this question. It offers instead a measure of how strongly the evidence supports a hypothesis, which is a different approach [1, 2].

Bayesianism does not focus only on the results obtained from our present observations (the posterior probabilities); it also takes into account the probabilities of the hypotheses before we have the observations (the prior probabilities) [1]. Likelihoodism may resemble Bayesianism in some aspects of probability, but the two differ in this respect. Likelihoodism does not take prior probabilities into account and focuses only on how the evidence may or may not support one hypothesis compared with an alternative. Prior probabilities influence posterior probabilities, and this initial expectation of the possible results, what Sober (2008) calls an "anchor" for the results, is necessary [1].

Returning to the mathematical basis of Bayesian philosophy: as I said above, it relies on Bayes' theorem, which gives its ideas a mathematical foundation. Likelihoodism, however, is not based on any theorem. It is built on the likelihood function proposed and defended by R. A. Fisher, who argued that likelihood measures the fit of the hypothesis to the evidence [1, 2]. It is not concerned with whether the evidence changes the probabilities of the hypotheses; it only compares hypotheses with determinate probabilities. This is a narrower approach, and it could lead to mistakes when we want to know the reason for the changes. This does not mean that it is not applicable, or that it does not work to support theories or hypotheses; in fact, my project is based on this philosophy. I have no other option given the nature of my data, and it allows me to leave out some positions that can be subjective in Bayesianism. Because no prior probabilities are assigned, subjectively or otherwise, to the hypotheses, my empirically acquired knowledge does not influence the analysis I am making.

Likelihoodism is more efficient, but that does not make it more precise. In addition to telling us what we should believe from the evidence, Bayesianism lets us assess the reliability of a hypothesis, that is, how probable it is that the hypothesis is true given the observations, and how sensitive the observations and procedures are to changes when answering the hypothesis. Likelihoodism instead depends on the ratio of likelihoods (a measure of how much the observations support one hypothesis over another) to give a degree of credibility to the hypotheses [1]. Likelihoodism does not measure reliability or sensitivity, because it does not compute posterior probabilities, and it does not analyze a single hypothesis in isolation: it always needs an alternative hypothesis for comparison.

In likelihoodism, any hypothesis can have a degree of credibility. This brings us back to the topic of ignoring prior probabilities, and I will use the example proposed by Sober (2008) to explain it [1]:

Suppose we hear noises in our attic. Analyzing the situation, we might think that gremlins are bowling up there. Under both the likelihoodist and the Bayesian frameworks, the likelihood of the gremlin hypothesis is high, because if gremlins were bowling in the attic, they would certainly make a lot of noise. The prior probability that there are gremlins in the attic, however, is very, very low. Bayesianism takes this prior probability into account, which gives us more information, affects our posterior probability and yields a more precise result [1]. Likelihoodism does not, so under the likelihoodist framework we could state any hypothesis, however improbable, and if it is supported by the observations it can come out better supported than other, more realistic hypotheses. The fact that it is possible to construct any hypothesis does not justify its correctness, and it gives this method a degree of subjectivity, since very strange hypotheses can end up highly favored and any hypothesis can be raised.
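
The contrast in the gremlin example can be made concrete with a small numerical sketch in Python. All numbers are invented for illustration (they are assumptions, not values given by Sober), and the function names are hypothetical. The likelihood ratio compares how well two hypotheses predict the noise, while Bayes' theorem combines the same likelihoods with a prior.

```python
def likelihood_ratio(lik_h1, lik_h2):
    """Ratio P(evidence | H1) / P(evidence | H2), the likelihoodist measure of support."""
    return lik_h1 / lik_h2


def posterior(prior_h, lik_h, lik_not_h):
    """Bayes' theorem for a hypothesis H against its negation.

    P(H | E) = P(H) P(E | H) / [P(H) P(E | H) + P(not H) P(E | not H)]
    """
    evidence = prior_h * lik_h + (1.0 - prior_h) * lik_not_h
    return prior_h * lik_h / evidence


if __name__ == "__main__":
    # Assumed values: gremlins bowling would almost surely produce noise,
    # noise without gremlins is less expected, and gremlins are extremely improbable a priori.
    p_noise_given_gremlins = 0.99
    p_noise_given_no_gremlins = 0.05
    prior_gremlins = 1e-9

    print("likelihood ratio:", likelihood_ratio(p_noise_given_gremlins, p_noise_given_no_gremlins))
    print("posterior of gremlins:", posterior(prior_gremlins, p_noise_given_gremlins, p_noise_given_no_gremlins))
```

With these assumed numbers the likelihood ratio favors the gremlin hypothesis, yet the posterior probability of gremlins stays tiny because the prior is tiny, which is the point of the example.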

Putting together everything described here, Bayesian philosophy gives more precise answers that cover more information about the full probabilistic framework of a hypothesis, offers more possibilities for analyzing the data, and takes prior probabilities into account. Its solid mathematical base provides a better analysis of the data, even though it is not very flexible in all its terms and has some subjective aspects that could affect our analysis [1]. Because my question concerns how much the data and methods influence the result, it can be addressed under a likelihoodist framework, focusing on the methods I use and the differences between them; nevertheless, I still place my philosophical position on the side of Bayesianism.

 

 

BIBLIOGRAPHY

1. Sober, E. (2008). Evidence and Evolution: The logic behind the science. Cambridge University Press.

2. Royall, R. (1997). Statistical Evidence. A Likelihood Paradigm. Chapman and Hall.

 

 

 

 

How the number of fossils and the type of molecular clock change the estimation of ages

 

HOW DO THE NUMBER OF FOSSILS AND THE TYPE OF MOLECULAR CLOCK CHANGE THE ESTIMATION OF AGES?

DATE: 10/03/2021

María Cristina Navas Serrano 2170058.


INTRODUCTION

In recent years, DNA sequences have been used increasingly to date relevant evolutionary events, especially the divergence times of clades and species (Rutschmann, 2006). Zuckerkandl and Pauling (1965), after comparing the differences in the hemoglobin protein sequences of different species against the estimated ages of the species' fossils, postulated that the differences between the DNA sequences of two species are a function of the time since they diverged, so that they evolve at a rate that is constant over time, in line with the postulates of Motoo Kimura's neutral theory (Rutschmann, 2006; Bromham and Penny, 2003).

Since the discovery of the molecular clock, various techniques other than a constant-rate clock have been proposed. Langley and Fitch (1974) reported that evolutionary rates in primates differed from those of other mammals, so a molecular clock with strict rates will be imprecise in some cases. Over time, it has been shown that constant rates of evolution may be the exception rather than the rule and that species therefore have different evolutionary rates; clocks that allow variable rates are called "relaxed clocks" (Welch and Bromham, 2005).

Various techniques have been developed to estimate strict (constant-rate) and relaxed clocks, often using geological events or the estimated ages of fossils to calibrate the topologies. One of the simplest methods for estimating divergence times with a single rate of change is the Langley-Fitch method (Langley and Fitch, 1974; Sanderson, 2003), which uses maximum likelihood to optimize the substitution rate on phylogenies with known branch lengths, re-estimating the branch lengths and calculating the divergence times (Rutschmann, 2006). Among relaxed-clock methods, nonparametric rate smoothing (NPRS) estimates unknown divergence times while smoothing the speed at which rates change along the phylogeny, using a nonparametric function that penalizes rates that change too quickly from branch to branch (Sanderson, 1997, 2003; Rutschmann, 2006). Penalized likelihood (PL) combines the two previous methods: it is a semi-parametric method that uses a smoothing penalty whose value can be estimated from the data; a high value leads to clock-like (strict) models, and a low value leads to models with nearly unconstrained rate variation, as in NPRS (Sanderson, 2002, 2003; Rutschmann, 2006). Other techniques also estimate divergence times while incorporating heterogeneous rates, such as heuristic rate smoothing (AHRS) and Bayesian implementations such as PHYBAYES (Rutschmann, 2006).

Over time, the growth of molecular information and DNA sequences and the improvement of technology have allowed the development of programs that implement these techniques. This project therefore tests the sensitivity of divergence time estimates to the number of fossil tips and to the type of molecular clock, using the program r8s v1.81 (Sanderson, 2003) under the Langley-Fitch method (Langley and Fitch, 1974; Sanderson, 2003) on 6 topologies of turtles of the clade Pelomedusoides, suborder Pleurodira.

 

 

METHODOLOGY

Sampling

To build the topologies, the largest matrix of morphological characters of extant and extinct turtles of the suborder Pleurodira currently known was used, compiled by Ferreira et al. (2018) (101 taxa x 245 characters). It was edited to retain only 17 terminals representing the 3 families of Pelomedusoides: Podocnemididae, Pelomedusidae and Bothremydidae. Seven fossil terminals and 10 extant terminals were chosen randomly, ensuring that the extant terminals had sufficient molecular information. The terminals include 2 outgroups: Chelus fimbriata, an extant outgroup of Pelomedusoides, and Proganochelys quenstedti, a fossil outgroup of Pleurodira.

For the extant terminals, molecular data were sampled from the public database GenBank (Benson et al., 2015) for 8 loci, with the mitochondrial genes CYTB, COI, 12S and the nuclear genes RAG1, RAG2, R35, because sequences of these genes were available for most terminals. When multiple sequences were available for a given species, the longest sequence was used. The gene sequences were aligned individually with the ClustalW algorithm (Wilm et al., 2007) in the program MEGA-X v11.0 (Kumar et al., 2018), and all gene sequences were concatenated into a single molecular matrix using MESQUITE v3.61 (Maddison and Maddison, 2019). The program jModelTest v2.1.10 (Darriba et al., 2012) was used to infer an adequate DNA substitution model for the set of genes, using the Akaike information criterion (AIC).
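
For reference, model selection by AIC compares candidate models through AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood; the model with the lowest AIC is preferred. The Python sketch below uses made-up log-likelihoods and parameter counts (assumptions for illustration only, not output of the software used here), and the function names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical values; parameter counts here ignore branch lengths.
    candidates = {
        "JC": (-5300.0, 0),
        "HKY": (-5150.0, 4),
        "GTR": (-5120.0, 8),
        "GTR+I+G": (-5050.0, 10),
    }
    for name, (lnl, k) in candidates.items():
        print(name, aic(lnl, k))
    print("preferred model:", best_model(candidates))
```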

The morphological character matrix was concatenated by hand with the molecular matrix into a single total-evidence matrix of 8493 morphological and molecular characters.

 

Phylogenetic Analyses

To assess the effect of the number of fossil terminals, the matrix with 7 fossils was edited by hand to generate 2 further matrices, with 5 and 3 fossils respectively, removing the fossil terminals Bothremys maghrebiana and Bairdemys thalassica for the 5-fossil matrix, and Ummulisani rutgersensis and Caninemys tridentata for the 3-fossil matrix. The fossil terminals removed were chosen randomly.

Subsequently, a maximum likelihood analysis was performed on each matrix in PAUP* v4.0a (Swofford, 2003), using a heuristic search with 100 replicates and the tree bisection and reconnection (TBR) algorithm, under the parameters of the model that best fit the molecular data, GTR + I + G, both without enforcing a clock (relaxed clock) and enforcing the clock (strict clock). Each search yielded a single best tree with branch lengths.


Divergence Time Analyses

For the estimation of divergence times, all topologies with branch lengths were analyzed in r8s v1.81 (Sanderson, 2003). The trees built with a clock were treated as ultrametric topologies using the ultrametric command, without applying any algorithm that changes the branch lengths; in this case the uncalibrated ages are immediately available for the tree, and the program scales the times to the absolute age of one specific node (Sanderson, 2003). The trees built with the relaxed clock were edited to remove the root branch, which had a length of 0, and the branch of the outgroup Proganochelys quenstedti was used as the root, because PAUP roots the trees at the closest sister group, leaving a root of length 0 and forcing a basal trichotomy. To fix this, Proganochelys quenstedti was used as an extra outgroup, following the recommendations of the program. The topologies built with the relaxed clock were analyzed under the Langley-Fitch method (Langley and Fitch, 1974; Sanderson, 2003), trying 3 initial points.
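
The rescaling of an ultrametric tree to absolute time can be illustrated with a minimal Python sketch. It assumes that the relative height of each node (its distance to the tips, in substitutions per site) is already known and that one node has a fixed calibration age; the node names, heights and function name are hypothetical, and this is a conceptual illustration rather than the code of r8s.

```python
def scale_node_ages(relative_heights, calibration_node, calibration_age):
    """Convert relative node heights to absolute ages using one calibrated node.

    On an ultrametric tree every node age is proportional to its height,
    so a single factor (calibration_age / height of the calibrated node) rescales all nodes.
    """
    reference = relative_heights[calibration_node]
    if reference <= 0:
        raise ValueError("the calibrated node must have a positive height")
    factor = calibration_age / reference
    return {node: height * factor for node, height in relative_heights.items()}


if __name__ == "__main__":
    # Hypothetical heights in substitutions per site (assumption, not the trees analyzed here).
    heights = {"root": 0.20, "calibrated_node": 0.10, "young_node": 0.04}
    ages = scale_node_ages(heights, "calibrated_node", 60.0)
    for node, age in ages.items():
        print(f"{node}: {age:.2f} Ma")
```

Because all ages depend on one factor, any error in the calibrated node's age or position propagates proportionally to every other node.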

All topologies built with both clocks were calibrated with the fossil node of Acleistochelys maliensis, with an approximate age of 60 Ma, and the topologies built with the relaxed clock were given a root age range of 140 Ma to 160 Ma, according to the age estimates of the Pelomedusoides clade made by Ferreira et al. (2018).

All trees resulting from the analyses were visualized and edited in FigTree v.1.4.4 and with the ape package (Paradis et al., 2004) in R (R Core Team, 2020).

 

 

RESULTS

The best molecular model selected by jModelTest v2.1.10 (Darriba et al., 2012) using the Akaike information criterion (AIC) was GTR with invariant sites and gamma (GTR + I + G).

The maximum likelihood analyses yielded one best tree for each of the 6 searches across the different clocks.

In the strict-clock topologies, Bairdemys thalassica is the sister taxon of the clade comprising the taxa from Carbonemys cofrinii to Podocnemis lewyana (Figure 1). In the relaxed-clock topologies, Bairdemys thalassica is the sister taxon of Caninemys tridentata, and together they form a single clade, sister to Podocnemis expansa (Figure 2). The relationships within the family Podocnemididae also differ. In the strict-clock topologies, the clade formed by Podocnemis expansa and Podocnemis unifilis is sister to the clade formed by Podocnemis erythrocephala and Podocnemis lewyana. Erymnochelys madagascariensis and Peltocephalus dumerilianus also form a single clade.

The results of the divergence time analysis for the trees made with a strict clock showed different time intervals with the different number of fossils used.

The topologies built enforcing a strict clock show a time interval from 19.22 Ma to 126.09 Ma with 7 fossils, from 18.31 Ma to 120.37 Ma with 5 fossils, and from 14.84 Ma to 97.97 Ma with 3 fossils (Figure 1). These intervals are narrower than those of the relaxed-clock topologies, and the time estimates differ for all nodes. All estimates keep the age of 60 Ma for the node of Acleistochelys maliensis.

The relaxed-clock topologies show a time interval from 11.49 Ma to 157.83 Ma with 7 fossils, from 13.78 Ma to 163.1 Ma with 5 fossils, and from 12.57 Ma to 163.48 Ma with 3 fossils (Figure 2), with similar ages for the root, the node of Chelus fimbriata, and the nodes of the clade of the family Pelomedusidae: Pelomedusa subrufa, Pelusios castanoides and Pelusios castaneus. All estimates keep the age of 60 Ma for the node of Acleistochelys maliensis and the age range of 140 Ma to 160 Ma for the root.


Figure 1. Divergence times resulting from topologies built with a strict clock and different numbers of fossils.


Figure 2. Divergence times resulting from topologies built with a relaxed clock and different numbers of fossils.

 


DISCUSSION

The estimates made with a greater number of fossils are those with the highest values in the age interval of the divergence time estimates (Figures 1 and 2). This can be explained by the extra information that the fossils provide: a greater number of fossils, as in the estimate with 7 fossils, allows a more precise result than the estimates with 3 fossils, since more fossils increase the total branch length of the trees (Schwartz and Mueller, 2010).

Of all the topologies, those built with a relaxed clock showed an estimated maximum of the time interval closest to the midpoint of the interval assigned to the root, 140 Ma to 160 Ma (Figure 2). This suggests that r8s behaves better with relaxed-clock, that is, non-ultrametric, topologies. First, they permit a more complete calibration, since whole nodes can be constrained, as was done with the root node, and the analysis receives more information on the ages of the fossil nodes for dating. Second, the Langley-Fitch model (Langley and Fitch, 1974; Sanderson, 2003) re-estimates the branch lengths and ages, giving a parametric validation of the age estimates and a more precise calculation than the one r8s performs for ultrametric trees (Figure 1). For those trees, r8s does not use a model to estimate divergence ages; it simply scales the existing branch lengths of the given topologies into ages relative to the single calibration age provided (Sanderson, 2002, 2003).

However, the ways in which fossil evidence is used to estimate divergence times, and how fossils affect those estimates, have long been controversial. In the estimations made here, the node selected for calibration, Acleistochelys maliensis, may have influenced the results (Ho and Phillips, 2009; Lukoschek et al., 2012; Saladin et al., 2017). The position of this fossil node in each topology and the absence of information from certain fossils could have changed the calibrations, since this fossil may in reality represent a point along a branch rather than a specific node. In addition, fossils that would have allowed more precise estimates, such as Ummulisani rutgersensis and Caninemys tridentata, may have been removed from the topologies made with a strict clock (Figure 1). Alternatively, the results may reflect the precision of the estimation method: in the topologies made with a relaxed clock, the estimations with 5 and 3 fossils have similar time intervals, which suggests that the fossils removed between those two estimations, Ummulisani rutgersensis and Caninemys tridentata, did not represent informative nodes (Figure 2) (Lukoschek et al., 2012; Saladin et al., 2017).

Moreover, divergence-time analyses are subject to several sources of error that can affect the estimates, such as poor selection of the evolutionary model, poor molecular and morphological sampling and, as mentioned above, errors in the selection of calibration nodes (Ho and Phillips, 2009; Lukoschek et al., 2012; Tamura et al., 2012).

In conclusion, the divergence time estimates are sensitive to the number of fossils contained in the topology and to the type of clock with which the topology is made. Different numbers of fossils generate different age intervals: the topologies with the highest number of fossils tend to be the most precise, and the topologies that subsequently underwent a divergence-time analysis under the Langley-Fitch model (Langley and Fitch, 1974; Sanderson, 2003) were more precise still (Figures 1 and 2).

 

 

BIBLIOGRAPHY

·         Bromham, L., and Penny, D. (2003). The modern molecular clock. Nature Reviews Genetics, 4, 216–224.

·         Darriba, D., Taboada, G. L., Doallo, R., and Posada, D. (2012). jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9(8), 772.

·         Ferreira, G., Bronzati, M., Langer, M., and Sterli, J. (2018). Phylogeny, biogeography and diversification patterns of side-necked turtles (Testudines: Pleurodira). Royal Society Open Science, 5, 171773.

·         Ho, S. Y. W., and Duchêne, S. (2014). Molecular-clock methods for estimating evolutionary rates and timescales. Molecular Ecology, 23, 5947–5965.

·         Ho, S. Y. W., and Phillips, M. J. (2009). Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Systematic Biology, 58(3), 367–380.

·         Kumar, S., Stecher, G., Li, M., Knyaz, C., and Tamura, K. (2018). MEGA X: Molecular Evolutionary Genetics Analysis across computing platforms. Molecular Biology and Evolution, 35(6), 1547–1549.

·         Langley, C. H., and Fitch, W. M. (1974). An examination of the constancy of the rate of molecular evolution. Journal of Molecular Evolution, 3, 161–177.

·         Larkin, M. A., Blackshields, G., Brown, N. P., Chenna, R., McGettigan, P. A., McWilliam, H., Valentin, F., Wallace, I. M., Wilm, A., Lopez, R., Thompson, J. D., Gibson, T. J., and Higgins, D. G. (2007). Clustal W and Clustal X version 2.0. Bioinformatics, 23, 2947–2948.

·         Lukoschek, V., Keogh, J. S., and Avise, J. (2012). Evaluating fossil calibrations for dating phylogenies in light of rates of molecular evolution: a comparison of three approaches. Systematic Biology, 61(1), 22.

·         Maddison, W. P., and Maddison, D. R. (2019). Mesquite: a modular system for evolutionary analysis. Version 3.61. http://www.mesquiteproject.org.

·         Paradis, E., Claude, J., and Strimmer, K. (2004). APE: analyses of phylogenetics and evolution in R language. Bioinformatics, 20, 289–290.

·         R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

·         Rutschmann, F. (2006). Molecular dating of phylogenetic trees: a brief review of current methods that estimate divergence times. Diversity and Distributions, 12, 35–48.

·         Saladin, B., Leslie, A. B., Wüest, R. O., et al. (2017). Fossils matter: improved estimates of divergence times in Pinus reveal older diversification. BMC Evolutionary Biology, 17, 95.

·         Sanderson, M. J. (1997). A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution, 14, 1218–1231.

·         Sanderson, M. J. (2002). Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution, 19, 101–109.

·         Sanderson, M. J. (2003). r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics, 19, 301–302.

·         Sauquet, H. (2013). A practical guide to molecular dating. Comptes Rendus Palevol, 12(6), 355–367.

·         Schwartz, R. S., and Mueller, R. L. (2010). Branch length estimation and divergence dating: estimates of error in Bayesian and maximum likelihood frameworks. BMC Evolutionary Biology, 10, 5.

·         Swofford, D. L. (2003). PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods) [Computer program]. Version 4. Sunderland: Sinauer Associates.

·         Tamura, K., Battistuzzi, F. U., Billing-Ross, P., Murillo, O., Filipski, A., and Kumar, S. (2012). Estimating divergence times in large molecular phylogenies. Proceedings of the National Academy of Sciences, 109(47), 19333–19338.

·         Welch, J. J., and Bromham, L. (2005). Molecular dating when rates vary. Trends in Ecology and Evolution, 20(6), 320–327.

·         Zuckerkandl, E., and Pauling, L. (1965). Evolutionary divergence and convergence in proteins. In: Evolving Genes and Proteins (ed. by V. Bryson and H. Vogel), pp. 97–166. Academic Press, New York.