Introduction
Dengue
Virus serotype 2(DENV-2) is one of the four serotypes of the
arthropod borne viruses (arbovirus) that belongs to the Flaviridae
family. As the other serotypes, it is responsible for causing Dengue
fever and severe Dengue. The genome is formed by a single stranded
positive-sense RNA of approximately 11kb long. This viral RNA serves
as an mRNA for the translation of 10 viral proteins: 3
structural(Capside C, Membrane precursor preM, Envelope E), that
assemble the nucleocapside and envelope of mature virions; and 7
non-structural(NS1, NS2a, NS2b, NS3, NS4a, NS4B, NS5), which are
mainly responsible for viral replication and other cell functions.
Due
to its considerable diversity, DENV-2, has been grouped into 6
lineages, designated as genotypes (Rico-hesse et al., 1997, 1998; S
Susanna Twiddy et al., 2002). 1.) Asian I genotype, represents
strains from Thailand, Malaysia, Cambodia, Myanmar, Vietnam and
Australia. 2.) Asian II, represents strains from China, Indonesia,
The Philippines, Taiwan, Sri Lanka, India, Honduras and Mexico. 3.)
Southeast (SE) Asian/American genotype comprises all strains from
Southeast Asia, and strains collected in Central and South America
and the Caribbean over the last 30 years. 4.) The Cosmopolitan
genotype, represents strains distributed in a wide geographic area,
including East and West Africa, the Middle East, the Indian
subcontinent, Indian and Pacific Ocean Islands and Australia. 5.)
American genotype, representing strains from Central and South
America, the Caribbean and older strains collected in the Indian
subcontinent and the Pacific Islands. Lastly, there is the Sylvatic
genotype, representing strains from humans, canopy-dwelling arboreal
mosquitoes, and non-human primates collected in West Africa and
Southeast Asia. Moreover, the great genetic diversity within the
DENV-2 genotypes, reflect their continual divergence and diverse
geographic distribution (Chen & Vasilakis, 2011).
Several
studies have been aimed at understanding the epidemiology of DENV,
rates and dates of evolution, and selection pressure in its different
genes and genotypes (Rico-hesse & Rico-hesse, 1990;
Rico-Hesse et al., 1997; S. S. Twiddy, Holmes, & Rambaut, 2003;
Weaver & Vasilakis, 2009; Zanotto, Gould, Gao, Harvey, &
Holmes, 1996). Further, a relatively recent review (Chen
& Vasilakis, 2011) reports large scale phylogenetic studies of
all dengue serotypes and discusses how factors such as rates of
evolution, selection pressures, population sizes and evolutionary
constraints influences DENV transmission, pathogenesis and emergence.
New approaches reveal the patterns in the spread and epidemiology of
viral diseases by phylogeographic studies. On a global scale,
DENV-2 has been reported as the serotype with the highest
effective population size (Costa, Voloch, & Schrago,
2012), and the time to the last common ancestor as 338.1
(227.1–462.9) (Patil et al., 2011). Nonetheless, all
these studies have been based only in estimations made from the E
gene sequences.
The
objectives of this study were to infer the time to the Most recent
common ancestor MRCA of the DENV2 genotypes, and estimate the
evolutionary rates of the data set, using the complete genome and
individual genes.
Materials
and Methods
Sequence
Data
Complete
genome sequences of DENV-1 were retrieved from
GenBank(1093 sequences with collection dates (1944 to 2014), from
(48) distinct countries, database up to June 2015). In
order to delete duplicates, recombinants, and chimeras, I used uclust
v.4.0.38 (Edgar, 2010), with a cut-off id value of 90%. An
statistical test for detection of significant recombination events in
the dataset was run using SplitsTree v.4.2 (Bruen, Philippe, &
Bryant, 2006). Sequences were aligned using MUSCLE
v3.8.31(Edgar, 2004). From this alignment, I obtained 11
partitions corresponding to the individual genes(10), and the
complete coding sequence(1). Substitution model was selected under
the Akaike information criterion, using PhyML (Guindon,
Dufayard, Lefort, & Anisimova, 2010), called with the
function phymltest(package ape, R v.3.2.1.(R Core Team 2014)).
Phylogenetic
Signal
In
order to test the phylogenetic signal, ergo, the topological
resolution yielded by the data set of each partition, a Likelihood
mapping analysis was run for 10 million random quartets using
TreePuzzle v.5.3.(Schmidt, Strimmer, Vingron, & von Haeseler,
2002). The quartets were reconstructed following a ML
approach under the selected substitution models for each partition.
Thus, an independent analysis was run for each dataset. The results
were plotted as a triangle with fully resolved quartets shown as dots
concentrated near the vertices, and the unresolved ones are located
in the center. (See anexes)
Genotypes
Assignation
a
Maximum likelihood (ML) tree was inferred using RaxML, using the
complete coding sequence (CDS), with the model GTR+G+I.
Estimation
of Evolutionary rates and TMCRA
Tree
topologies, model parameters, evolutionary rates for each partition
dataset, and times of the MCRA for each genotype were
co-estimated, using BEAST v.8.1.2.(A. J. Drummond, Suchard, Xie,
& Rambaut, 2012). Inferrences were performed using a
relaxed molecular clock (uncorrelated
lognormal[rate: 7.5*10-4, variation: 0.4 (Costa
et al., 2012)]. A Bayesian skyline plot was used as a
coalescent prior during the inference (a. J. Drummond,
Rambaut, Shapiro, & Pybus, 2005; Ho & Shapiro, 2011), for
it is an approach with no dependence on a pre-specified
parametric model of demographic history, such as logistic,
exponential. Analysis was run for twice for 50 million
generations, sampling every 10000. Topology was rooted with a
sequence of Japanese Encephalitis Virus(EF107523) from 1988.
Results
and Discussion
The
mean evolutionary rate for each gene ranged 5.3 10-4 –1.5
× 10-4 nucleotide substitutions per site per year,
being the NS5, NS2B, and E, the ones with the highest
substitution rates. This estimations can be compared to the ones
obtained by(S. S. Twiddy et al., 2003). The complete coding
sequence was not account for comparisons, because converge of this
parameter is not guaranteed (ESS<100) for the partition. In
general, evolutionary rates for the dataset obtained from the E gene
sequence were similar compared to the non-structural genes NS2B,
NS4A, NS4B, and NS5.
Table
1. Estimated Mean Nucleotide Substitution Rates per Site per
Year for each partition of Dengue Virus Serotype 2.
Regarding the time the most recent common ancestor (MRCA), results are consistent to the times found by (Walimbe, Lotankar, Cecilia, & Cherian, 2014), and are comparable to the times of outbreaks and introductions summarized in (Chen & Vasilakis, 2011)
Rate(subs/site/year) | Rate 95%HPD | |
C | 4.10E-004 | [9.5E-4, 2.4E-4] |
preM | 4.20E-004 | [8.6E-4, 2.5E-4] |
E | 2.30E-004 | [4.9E-4, 1.4E-4] |
NS1 | 3.80E-004 | [6.4E-4, 1.1E-4] |
NS2A | 5.30E-004 | [8.1E-4, 1.8E-4] |
NS2B | 1.40E-004 | [4.1E-4, 7.3E-3] |
NS3 | 2.80E-004 | [5.2E-4, 1.1E-4] |
NS4A | 2.50E-004 | [6.5E-4, 1.3E-4] |
NS4B | 2.60E-004 | [6.5E-4, 2.4E-4] |
NS5 | 1.50E-004 | [7.5E-4, 3.1E-3] |
CDS | 4.30E-004 | [9.4E-4, 1.2E-3] |
Regarding the time the most recent common ancestor (MRCA), results are consistent to the times found by (Walimbe, Lotankar, Cecilia, & Cherian, 2014), and are comparable to the times of outbreaks and introductions summarized in (Chen & Vasilakis, 2011)
In
average, the partitions whose TMCRA for Asian/American genotype were
closest to the proposed time in other studies (using the E gene
sequence: 42.757-*44.4), were NS1:46.4, NS4A:45.6,
and NS4B:43.7. For genotype Asian1/Asian2, the reference
value (using the E gene sequence: 64.189-*70.45), was
nearly recovered by NS1: 70.7, and preM:
67.8. For the Cosmopolitan genotype, (using the E gene
sequence: 59.8-*67.9), the TMCRA was roughly recovered
by NS2B: 63.885, NS4B: 60.929, and
the CDS: 77.779. The TMCRA of the sylvatic2 genotype
(using the E gene sequence: 121.1-*158.4), was recovered
by NS5:157.1, CDS:157.2, NS1:120.1, NS2A:135.3, and NS4B: 126.9. The
only exception for this case, would be the American 2 genotype, whose
TMCRA estimations were different (103.3*), from the
obtained in this work (56.2). * Values obtained by (Patil et
al., 2011; Walimbe et al., 2014).
Genes
pre-M, NS1, NS2A, NS4B, showed the closest values to the references.
Thence, they could be an alternative to use when estimating
evolutionary processes in DENV2. Besides, regarding the potential
resolution of phylogenetic relationships between clades, these
genes have shown good phylogenetic signal (>90%). Also, they
have relatively smaller sizes compared to the E gene (with exception
of NS1), which may be even practical when running long analyses.
Table
2. Estimated time to the most recent common ancestor (MRCA)
for every partition, estimated under a Bayesian skyline
coalescent tree prior.
root | 95%HDP | AA | 95%HDP | A1/A2 | 95%HDP | Cosmo | 95%HDP | A2 | 95%HDP | Sylvatic | 95%HDP | |
C | 2174.6 | [1100.8, 3381.1] | 55.1 | [38.5, 74.4] | 57.2 | [50, 72.1] | 64.9 | [46.2, 87.7] | 54.9 | [40.1, 77.1] | 100.8 | [56.7, 160.1] |
preM | 3268.7 | [1866.7, 4821.1] | 55.1 | [39.9, 72.5] | 67.8 | [54.3, 83.4] | 54.9 | [43.9, 68.4] | 46.8 | [38.3, 59.2] | 107.4 | [64.3, 165.1] |
E | 6175.5 | [3780, 7200.1] | 42.8 | [34.5, 52.1] | 64.2 | [53.8, 77.5] | 59.8 | [45.5, 76.4] | 56.7 | [42.3, 78.9] | 121.1 | [70.2, 179.5] |
NS1 | 5452.1 | [3371.3, 7790.7] | 46.4 | [34.9, 58.7] | 70.7 | [56.2, 87.3] | 56.5 | [44.9, 70.7] | 59.5 | [40.1, 71.5] | 120.1 | [69.9, 181.3] |
NS2A | 3362.9 | [1944.8, 4966.4] | 43.5 | [34.9, 54.5] | 67.1 | [54.1, 83.7] | 59.1 | [46.6, 74.7] | 53.3 | [39.5, 72.4] | 135.3 | [72.4, 206.5] |
NS2B | 2712.5 | [1497.1, 4033.4] | 49.9 | [35.9, 66.4] | 62.9 | [51.4, 78.7] | 63.9 | [45.3, 89.1] | 57.2 | [43.2, 74.8] | 101.4 | [61.2, 156.5] |
NS3 | 2914.1 | [1502.1, 4102.2] | 56.9 | [38.3, 73.5] | 63.7 | [54.1, 76.1] | 59.1 | [42.8, 83.9] | 58.8 | [46.4, 71.6] | 119.1 | [58.2, 215.6] |
NS4A | 3989.9 | [2283.7, 5881.3] | 45.6 | [35.3, 57.8] | 65.3 | [52.4, 82.1] | 55.1 | [43.9, 66.5] | 45.1 | [38.1, 53.3] | 118 | [69.6, 187.7] |
NS4B | 2454.2 | [1290.3, 3688.6] | 43.7 | [35.3, 54.2] | 60.9 | [51.6, 72.4] | 60.9 | [46.1, 79.8] | 46.1 | [37.9, 58.4] | 126.9 | [73.1, 187.9] |
NS5 | 5675.6 | [3371.4, 7790.7] | 58.6 | [39.9,72.5] | 77.9 | [56.2, 95.1] | 67.3 | [52.5, 83.5] | 63.5 | [49.3, 76.9] | 157.1 | [72.4, 206.5] |
CDS | 10944.7 | [3459.8, 25368.5] | 105 | [36.6, 550.8] | 117.6 | [53.6, 337.7] | 77.8 | [43.9, 172.7] | 76.6 | [39.5, 145] | 157.2 | [75, 290.6] |
AA:American/Asian. A1A2:Asian1/Asian2. Cosmo: Cosmopolitan. A2: American2.
One visible problem with the analysis was that the convergence of parameter for the inference based on complete coding region sequence was not reached. Although this problem was expectable, increasing the number of generations to the double of the used in this study, would probably yield better estimations in terms of ESS, and I consider that this is a worth-doing comparison.
One visible problem with the analysis was that the convergence of parameter for the inference based on complete coding region sequence was not reached. Although this problem was expectable, increasing the number of generations to the double of the used in this study, would probably yield better estimations in terms of ESS, and I consider that this is a worth-doing comparison.
Anexes
A2.) Substitution models for each partition
Partition | Subs Model |
C | GTR+G+I |
preM | GTR+G+I |
E | GTR+G+I |
NS1 | GTR+G+I |
NS2A | GTR+G+I |
NS2B | TN93+G+I |
NS3 | GTR+G+I |
NS4A | TN93+G |
NS4B | GTR+G+I |
NS5 | GTR+G+I |
CDS | * |
A3.) Phylogenetic signal for each partition.
A. Gene C; B. Gene preM; C. Gene E; D. Gene NS1; E. Gene NS2a; F. Gene NS2b; G. Gene NS3; H. Gene NS4a; I. Gene NS4b; J. Gen NS5; K. Complete CDS.
RQ= Resolved quartets.
RQ= Resolved quartets.
References
Bruen,
T. C., Philippe, H., & Bryant, D. (2006). A simple and robust
statistical test for detecting the presence of
recombination. Genetics, 172(4),
2665–2681. doi:10.1534/genetics.105.048975.
Chen, R., & Vasilakis, N. (2011). Dengue--quo tu et quo vadis? Viruses, 3(9), 1562–608. doi:10.3390/v3091562.
Costa, R. L., Voloch, C. M., & Schrago, C. G. (2012). Comparative evolutionary epidemiology of dengue virus serotypes. Infection, Genetics and Evolution, 12(2), 309–314. doi:10.1016/j.meegid.2011.12.011.
Drummond, a. J., Rambaut, a., Shapiro, B., & Pybus, O. G. (2005). Bayesian coalescent inference of past population dynamics from molecular sequences. Molecular Biology and Evolution, 22(5), 1185–1192. doi:10.1093/molbev/msi103.
Drummond, A. J., Suchard, M. a, Xie, D., & Rambaut, A. (2012). Bayesian P hylogenetics with BEAUti and the BEAST 1 . 7. Molecular Biology and Evolution, 29(8), 1969–1973. doi:10.1093/molbev/mss075.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792–1797. doi:10.1093/nar/gkh340.
Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461. doi:10.1093/bioinformatics/btq461.
Guindon, S., Dufayard, J.-F., Lefort, V., & Anisimova, M. (2010). New Alogrithms and Methods to Estimate Maximum- Likelihoods Phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology, 59(3), 307–321.
Ho, S. Y. W., & Shapiro, B. (2011). Skyline-plot methods for estimating demographic history from nucleotide sequences. Molecular Ecology Resources, 11(3), 423–434. doi:10.1111/j.1755-0998.2011.02988.x.
Patil, J. a., Cherian, S., Walimbe, a. M., Patil, B. R., Sathe, P. S., Shah, P. S., & Cecilia, D. (2011). Evolutionary dynamics of the American African genotype of dengue type 1 virus in India (1962-2005). Infection, Genetics and Evolution, 11(6), 1443–1448. doi:10.1016/j.meegid.2011.05.011.
Rico-hesse, R., Harrison, L. M., Nisalak, A., Vaughn, D. W., Kalayanarooj, S., Green, S., … Ennis, F. A. (1998). MOLECULAR EVOLUTION OF DENGUE TYPE 2 VIRUS IN THAILAND, 58(1), 96–101.
Rico-Hesse, R., Harrison, L. M., Salas, R. a, Tovar, D., Nisalak, a, Ramos, C., … da Rosa, a T. (1997). Origins of dengue type 2 viruses associated with increased pathogenicity in the Americas. Virology,230(2), 244–251. doi:10.1128/CVI.00363-08.
Rico-hesse, R., Harrison, L. M., Salas, R. A., Tovar, D., Nisalak, A., Ramos, C., … Nogueira, R. M. R. (1997). Origins of Dengue Type 2 Viruses Associated with Increased Pathogenicity in the Americas, 251(230), 244–251.
Rico-hesse, R., & Rico-hesse, R. (1990). Molecular Evolution and Distribution of Dengue Viruses Type 1 and 2 in Nature, 493, 479–493.
Schmidt, H. a, Strimmer, K., Vingron, M., & von Haeseler, A. (2002). TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics (Oxford, England), 18(3), 502–504. doi:10.1093/bioinformatics/18.3.502.
Twiddy, S. S., Farrar, J. J., Chau, N. V., Wills, B., Gould, E. A., Gritsun, T., … Holmes, E. C. (2002). Phylogenetic Relationships and Differential Selection Pressures among Genotypes of Dengue-2 Virus 1, 72, 63–72. doi:10.1006/viro.2002.1447.
Twiddy, S. S., Holmes, E. C., & Rambaut, a. (2003). Inferring the Rate and Time-Scale of Dengue Virus Evolution. Molecular Biology and Evolution, 20(1), 122–129. doi:10.1093/molbev/msg010.
Walimbe, A. M., Lotankar, M., Cecilia, D., & Cherian, S. S. (2014). Global phylogeography of Dengue type 1 and 2 viruses reveals the role of India. Infection, Genetics and Evolution, 22, 30–39. doi:10.1016/j.meegid.2014.01.001.
Weaver, S. C., & Vasilakis, N. (2009). Molecular evolution of dengue viruses: contributions of phylogenetics to understanding the history and epidemiology of the preeminent arboviral disease. Infection, Genetics and Evolution : Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases, 9(4), 523–40. doi:10.1016/j.meegid.2009.02.003.
Zanotto, P. M., Gould, E. a, Gao, G. F., Harvey, P. H., & Holmes, E. C. (1996). Population dynamics of flaviviruses revealed by molecular phylogenies. Proceedings of the National Academy of Sciences of the United States of America, 93(2), 548–553.
Zhang, Z., Li, J., & Yu, J. (2006). Computing Ka and Ks with a consideration of unequal transitional substitutions. BMC Evolutionary Biology, 6, 44. doi:10.1186/1471-2148-6-44.
Chen, R., & Vasilakis, N. (2011). Dengue--quo tu et quo vadis? Viruses, 3(9), 1562–608. doi:10.3390/v3091562.
Costa, R. L., Voloch, C. M., & Schrago, C. G. (2012). Comparative evolutionary epidemiology of dengue virus serotypes. Infection, Genetics and Evolution, 12(2), 309–314. doi:10.1016/j.meegid.2011.12.011.
Drummond, a. J., Rambaut, a., Shapiro, B., & Pybus, O. G. (2005). Bayesian coalescent inference of past population dynamics from molecular sequences. Molecular Biology and Evolution, 22(5), 1185–1192. doi:10.1093/molbev/msi103.
Drummond, A. J., Suchard, M. a, Xie, D., & Rambaut, A. (2012). Bayesian P hylogenetics with BEAUti and the BEAST 1 . 7. Molecular Biology and Evolution, 29(8), 1969–1973. doi:10.1093/molbev/mss075.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792–1797. doi:10.1093/nar/gkh340.
Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461. doi:10.1093/bioinformatics/btq461.
Guindon, S., Dufayard, J.-F., Lefort, V., & Anisimova, M. (2010). New Alogrithms and Methods to Estimate Maximum- Likelihoods Phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology, 59(3), 307–321.
Ho, S. Y. W., & Shapiro, B. (2011). Skyline-plot methods for estimating demographic history from nucleotide sequences. Molecular Ecology Resources, 11(3), 423–434. doi:10.1111/j.1755-0998.2011.02988.x.
Patil, J. a., Cherian, S., Walimbe, a. M., Patil, B. R., Sathe, P. S., Shah, P. S., & Cecilia, D. (2011). Evolutionary dynamics of the American African genotype of dengue type 1 virus in India (1962-2005). Infection, Genetics and Evolution, 11(6), 1443–1448. doi:10.1016/j.meegid.2011.05.011.
Rico-hesse, R., Harrison, L. M., Nisalak, A., Vaughn, D. W., Kalayanarooj, S., Green, S., … Ennis, F. A. (1998). MOLECULAR EVOLUTION OF DENGUE TYPE 2 VIRUS IN THAILAND, 58(1), 96–101.
Rico-Hesse, R., Harrison, L. M., Salas, R. a, Tovar, D., Nisalak, a, Ramos, C., … da Rosa, a T. (1997). Origins of dengue type 2 viruses associated with increased pathogenicity in the Americas. Virology,230(2), 244–251. doi:10.1128/CVI.00363-08.
Rico-hesse, R., Harrison, L. M., Salas, R. A., Tovar, D., Nisalak, A., Ramos, C., … Nogueira, R. M. R. (1997). Origins of Dengue Type 2 Viruses Associated with Increased Pathogenicity in the Americas, 251(230), 244–251.
Rico-hesse, R., & Rico-hesse, R. (1990). Molecular Evolution and Distribution of Dengue Viruses Type 1 and 2 in Nature, 493, 479–493.
Schmidt, H. a, Strimmer, K., Vingron, M., & von Haeseler, A. (2002). TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics (Oxford, England), 18(3), 502–504. doi:10.1093/bioinformatics/18.3.502.
Twiddy, S. S., Farrar, J. J., Chau, N. V., Wills, B., Gould, E. A., Gritsun, T., … Holmes, E. C. (2002). Phylogenetic Relationships and Differential Selection Pressures among Genotypes of Dengue-2 Virus 1, 72, 63–72. doi:10.1006/viro.2002.1447.
Twiddy, S. S., Holmes, E. C., & Rambaut, a. (2003). Inferring the Rate and Time-Scale of Dengue Virus Evolution. Molecular Biology and Evolution, 20(1), 122–129. doi:10.1093/molbev/msg010.
Walimbe, A. M., Lotankar, M., Cecilia, D., & Cherian, S. S. (2014). Global phylogeography of Dengue type 1 and 2 viruses reveals the role of India. Infection, Genetics and Evolution, 22, 30–39. doi:10.1016/j.meegid.2014.01.001.
Weaver, S. C., & Vasilakis, N. (2009). Molecular evolution of dengue viruses: contributions of phylogenetics to understanding the history and epidemiology of the preeminent arboviral disease. Infection, Genetics and Evolution : Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases, 9(4), 523–40. doi:10.1016/j.meegid.2009.02.003.
Zanotto, P. M., Gould, E. a, Gao, G. F., Harvey, P. H., & Holmes, E. C. (1996). Population dynamics of flaviviruses revealed by molecular phylogenies. Proceedings of the National Academy of Sciences of the United States of America, 93(2), 548–553.
Zhang, Z., Li, J., & Yu, J. (2006). Computing Ka and Ks with a consideration of unequal transitional substitutions. BMC Evolutionary Biology, 6, 44. doi:10.1186/1471-2148-6-44.
No hay comentarios:
Publicar un comentario