**Introduction**

Estimating
divergence times can be performed by methods based on variation of
substitution rates. The AR method establishes
that the rates of change between ancestor and descendant are
autocorrelated and the rates can follow a log-normal distribution
with a mean equal to the parent rate (Thorne, Kishino and Painter,
1998; Kishino, Thorne and Bruno, 2001; Thorne and Kishino, 2002) or
can be determined by a non-central χ 2 distribution (Lepage

*et al.,*2007). Another method used in molecular clock analysis is the RR method which assigns random independent rates for each lineage and these rates are drawn from a single underlying parametric distribution such as an exponential or log-normal (Drummond*et al.,*2006; Rannala and Yang, 2007; Lepage*et al.,*2007).
RR method has been implemented in

**BEAST**the AR method has been implemented in**MULTIDIVTIME.**In terms of the programs, other difference is the priors on node ages, where**MULTIDIVTIME**uses a dirichlet distribution without an explicit assumption about the biological processes (Kishino, Thorne and Bruno, 2001; Thorne and Kishino, 2002) and in the other hand,**BEAST**uses a Yule prior and the Birth-Death prior (Yule, 1924; Rannala and Yang, 1996; Yang and Rannala, 1997).
These
two methods have been widely used to dating phylogenies and it is
recommended to use several calibration points; however, it is not
common to find a number of fossil equivalent to the number of nodes
in the topology. At the moment, it has not been established what is
the minimum number of constraints to achieve correctness divergence
times.

**Figure 1.**Number of citations. Comparison between Bayesian inference and Maximum Likelihood programs.

**Objectives**

**General objective**

To
determine what are the minimum number of calibration points related
to the number of tips of a topology using the Autocorrelated rate and
Random rate methods.

**Specific objectives**

To
correlate the divergence time estimated with the divergence time
simulated.

To
determine the delta of variation when the number of points increase.

**Methods**

**1. Simulations**

The
trees will be simulated considering four different number of tips:
10, 25, 50 and 100, plus an age of 31Mya at the root, with the
package phytools v.0.4-56 (Revell, 2012) in the R language and it
will be replicated 3
times. Based on the these trees, the sequences will be simulated
using the program Seq-gen v1.3.3 (Rambaut and Grass, 1997), under the
model HKY, with base frequencies 0.30[T], 0.25[C], 0.30[A], 0.15[G];
a transition-transversion rate K=5, gamma parameter alpha = 0.5 and a
length of 1000 bp.

**2. Molecular clock analysis and calibration**

The
analyzes will be performed using the AR and RR methods respectively
in

**MULTIDIVTIME**v9.25.03 (Thorne and Kishino, 2002) (MDT) and**BEAST**v.1.8.2 (Drummond and Rambaut, 2007; Drummond*et al.,*2012). In**MDT**the divergence time will be calculated after 10⁶ generations employing a correlated relaxed lognormal clock. In**BEAST**the analyzes will be done under an uncorrelated relaxed lognormal clock, Yule speciation process, a normal distribution for the tmrca and different number of generations depending on the authors's recommendation about the relation between it and the number of tips. For both methods the calibration points will be chosen randomly, leaving the base node of the ingroup fixed to 30 Mya. The number of points will be replicated for every tree simulation.**3. Correlation and additional comparisons**

The
times simulated and the outcomes of these analyzes will be compared
and related by a Pearson correlation for each topology of different
size and replica. The standard
deviation among replicas was calculated and the delta between number
of points, with the aim of observe variance in the results.

**Results and Discussion**

Accurate
in dating with AR and RR methods varies according to the number of
tips in the phylogeny; by increasing the number of tips the
correlation coefficient (R

^{2}) decrease, thus the number of calibration points must be in proportion to the number of tips. It is evidenced when for the two methods there is a point from which the correlation decrease (Figure 2, a, b).B |

Figura 2. Variation of the correlation coefficient after increase the number of calibrated nodes. A. Autocorrelated rate method. B. Random rate method. |

The
addition of the second point of calibration does not improve the
dating of the phylogeny in
approximately 90% of
the cases reviewed (Figure
2).
Similar
results were obtained by Battistuzi

*et al.,*2010, they assess the effect of multiples calibrations into the credibility Intervals (CrIs) size, and they do not found differences in dating with one or two points. On the other hand, the use of exact divergence times can generate this behavior, in which differences between one and two points are not perceived (Benton*et al.,*2009). Consequently, the minimum number of calibration points is one, because with just one point the value of the correlation coefficient is greater than 0.89, in all cases. However, when I add three points, the coeficient (R^{2}) was high and in almost 70% of the cases was the maximum value.
Standard
deviation among
replicas
was of 10⁻¹,
although cases of standard
deviation
of 10⁻³ were
recorded , it
demonstrated that there is no changes in
using different points and multiples
runs
of the programs. This is consistent with Hedges

*et al.,*2004, who found that along many methods if the data and times are constant the estimations will be similar. Both methods evaluated in this work are based on Bayesian Inference and the*priors*play a key role in the estimations of divergence time and in all the analyses, but Brown and Yang in 2010 tested the influence of the*priors*and conclude that it affect the estimation, however, one can obtain similar results.**Figure 3.**Standard deviation among replicas for each number of tips.

**A.**MultiDivTime.

**B.**BEAST.

The
delta of increasing the number of points varied in the order of 10⁻¹
until 10⁻³. Thus,
demonstrating, that the differences among number of calibrations vary
in a little range.
Then,
when the position of the unique point was assessed, the position
imports, is not the same dating towards the tips that dating the base
node of the ingroup, but
the value of the coefficient of correlation is not below 0.75
in both cases (Figure 4). According to Battistuzi

*et al.,*2010, there is no significant differences in the relationship of the calibration nodes and the estimated nodes, “this suggests that the relative positions of the two nodes do not affect the time estimations as long as the two nodes are closely related (Battistuzi*et al.,*2010).” In contrast, with empirical data, the estimations depends on the position, the method, and the dataset (Hug and Roger, 2007), idea discussed by many other authors who assess this problem*via*simulations and empirical data and found similar results among different methods, or the parameters of the analysis.Figure 4. Variation of the correlation coefficient according to different positions of the calibrated node. |

In
addition, to reduce the error by 5% and increase the coefficient of
correlation in larger topologies compared to that obtained in small
topologies, with a single point calibration, a number of points four
time major than the points evaluated are required. However, a
solution would be dated by small separate clades within the topology
in order to obtain a high coefficient correlation for all phylogeny.

**Conclusions**

Finally,
the number
of tips
influence
the estimation,
regardless the method used in estimating divergence times. Although
a point is the minimum number, when
the number
of tips
increase
with
the
error occurs
exactly the same.

**References**

Battistuzzi,
F. U., Filipski, A., Hedges, S. B., & Kumar, S. (2010).
Performance of relaxed-clock methods in estimating evolutionary
divergence times and their credibility intervals.

*Molecular biology and evolution*,*27*(6), 1289-1300.
Brown,
R. P., & Yang, Z. (2010). Bayesian dating of shallow phylogenies
with a relaxed clock.

*Systematic biology*,*59*(2), 119-131.
Donoghue,
P., Benton, M., Ziheng, Y., & Inoue, J. (2009). Calibrating
and constraining the molecular clock.

*J**ournal of vertebrate paleontology*,*29*, 89A-89A.
Drummond,
A. J., Ho, S. Y., Phillips, M. J., & Rambaut, A. (2006). Relaxed
phylogenetics and dating with confidence.

*PLoS biology*,*4*(5), 699.
Drummond,
A. J., & Rambaut, A. (2007). BEAST: Bayesian evolutionary
analysis by sampling trees.

*BMC evolutionary biology*,*7*(1), 214.
Drummond,
A. J., Suchard, M. A., Xie, D., & Rambaut, A. (2012). Bayesian
phylogenetics with BEAUti and the BEAST 1.7.

*Molecular biology and evolution*,*29*(8), 1969-1973.
Hedges,
S. B., Blair, J. E., Venturi, M. L., & Shoe, J. L. (2004). A
molecular timescale of eukaryote evolution and the rise of complex
multicellular life.

*BMC evolutionary biology*,*4*(1), 2.
Hug,
L. A., & Roger, A. J. (2007). The impact of fossils and taxon
sampling on ancient molecular dating analyses.

*Molecular biology and evolution*,*24*(8), 1889-1897.
Kishino,
H., Thorne, J. L., & Bruno, W. J. (2001). Performance of a
divergence time estimation method under a probabilistic model of rate
evolution.

*Molecular Biology and Evolution*,*18*(3), 352-361.
Lepage,
T., Bryant, D., Philippe, H., & Lartillot, N. (2007). A general
comparison of relaxed molecular clock models.

*Molecular biology and evolution*,*24*(12), 2669-2680.
Rambaut,
A., & Grass, N. C. (1997). Seq-Gen: an application for the Monte
Carlo simulation of DNA sequence evolution along phylogenetic
trees.

*Computer applications in the biosciences: CABIOS*,*13*(3), 235-238.
Rannala,
B., & Yang, Z. (1996). Probability distribution of molecular
evolutionary trees: a new method of phylogenetic inference.

*Journal of molecular evolution*,*43*(3), 304-311.
Rannala,
B., & Yang, Z. (2007). Inferring speciation times under an
episodic molecular clock.

*Systematic Biology*,*56*(3), 453-466.
Revell,
L. J. (2012). phytools: an R package for phylogenetic comparative
biology (and other things).

*Methods in Ecology and Evolution*,*3*(2), 217-223.
Thorne,
J. L., Kishino, H., & Painter, I. S. (1998). Estimating the rate
of evolution of the rate of molecular evolution.

*Molecular Biology and Evolution*,*15*(12), 1647-1657.
Thorne,
J. L., & Kishino, H. (2002). Divergence time and evolutionary
rate estimation with multilocus data.

*Systematic Biology*,*51*(5), 689-702.
Yang,
Z., & Rannala, B. (1997). Bayesian phylogenetic inference using
DNA sequences: a Markov chain Monte Carlo method.

*Molecular biology and evolution*,*14*(7), 717-724.
Yule,
G. U. (1924). An introduction to the theory of statistics.

*Bull. Amer. Math. Soc. 30 (1924), 465-466*
## No hay comentarios:

Publicar un comentario en la entrada