West Nile virus (WNV) is a mosquito-borne flavivirus that causes neurologic diseases such as encephalitis, meningitis, and acute flaccid paralysis (Lim, Koraka, Osterhaus, & Martina, 2011). Like other flaviviruses, WNV is an enveloped virus with a single-stranded, positive-sense RNA genome of ∼11 kb, and its strains are grouped into at least 7 genetic lineages. WNV was first isolated in Uganda in 1937. The first large outbreak of West Nile neuroinvasive disease (WNND) was recorded in Romania in 1996, with 393 confirmed cases (Tsai, Popovici, Cernescu, Campbell, & Nedelcu, 1998). Three years later, WNV became a global public health concern after its introduction into North America, and subsequently into Central and South America (Lanciotti et al., 1999). Since then, major outbreaks of WNV fever and encephalitis have occurred on every continent except Antarctica, causing human and animal deaths. Although its enzootic cycle is maintained mainly between mosquitoes and birds, the virus can also infect horses, humans, and other vertebrates (Hayes et al., 2005). Despite this variety of hosts, studies on the host structure and its influence on the spatiotemporal structure are still scarce. Since host genetic factors have a significant influence on disease distribution patterns, the overall purpose of this study is to assess the host structure of the phylogenetic relationships of WNV in a phylogeographic context, taking the spatiotemporal structure into account.
To identify the lineages of each viral strain.
To infer the main host-shift events.
To determine the transmission paths within the spatiotemporal structure.
1. Sequence Data: All available complete-genome sequences of WNV, with their collection times and geographic locations, will be retrieved from GenBank. To identify and remove recombinants, clones, and duplicates from the data set, I will use Uclust v1.2.22q with a 99% identity threshold. Sequences of Ilheus virus (ILHV), Usutu virus (USUV), and Japanese encephalitis virus (JEV) will be used as the outgroup. Subsequently, all the WNV sequences will be aligned with the multiple sequence alignment algorithm implemented in MUSCLE v3.8.31 (Edgar, 2004).
2. Evolutionary rates: From this alignment, I will obtain a set of 11 partitions, corresponding to the complete genome and the 10 genes that constitute it (C, prM/M, E, NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5). The evolutionary rate of every gene will be explored with the Distance Rates (DistR) method (Bevan, Lang, & Bryant, 2005) to obtain a first approximation of the molecular evolution of the genes, which is one of the key determinants of the occurrence of cross-species transmission (Longdon, Brockhurst, Russell, Welch, & Jiggins, 2014; Vrancken et al., 2015).
3. Lineage identification: The substitution model will be selected using the Akaike information criterion (AIC) with jModelTest 2 (Darriba, Taboada, Doallo, & Posada, 2012). With this model, a maximum likelihood (ML) inference will be performed using ExaML v3.0.X, with 20 searches and 100 bootstrap replicates, which are considered sufficient for large data sets (Kozlov, Aberer, & Stamatakis, 2015). Every lineage will be assumed to be a monophyletic group, as suggested by MacKenzie and Williams (2009), and all the clades obtained will be reviewed against previous studies.
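The AIC comparison in step 3 can be illustrated with a minimal sketch. It assumes each candidate model has already been fitted, so that its log-likelihood and number of free parameters are known. The model names, log-likelihoods, and parameter counts below are hypothetical values chosen only for illustration, and the functions `aic` and `rank_models` are not part of jModelTest 2.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """Return (model, AIC, delta AIC) tuples sorted from best to worst.

    `fits` maps a model name to a (log-likelihood, free parameters) pair.
    """
    scored = [(name, aic(lnl, k)) for name, (lnl, k) in fits.items()]
    scored.sort(key=lambda item: item[1])
    best = scored[0][1]
    return [(name, score, score - best) for name, score in scored]


# Hypothetical fits, for illustration only (not results of this study).
example_fits = {
    "JC69": (-25000.0, 0),
    "HKY85": (-24200.0, 4),
    "GTR+G": (-24000.0, 9),
}
ranking = rank_models(example_fits)
for name, score, delta in ranking:
    print(f"{name}\tAIC={score:.1f}\tdeltaAIC={delta:.1f}")
```

The model with the lowest AIC is preferred; the delta AIC column shows how far each alternative lies from it.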
4. Phylodynamics: In order to evaluate every lineage independently, the data set will be downsampled. The tree topologies, model parameters, evolutionary rates, MRCA, and variation in viral population size over time will then be co-estimated independently for the resulting lineages, using an uncorrelated log-normal relaxed clock model (rationale given in May, Davis, Tesh, & Barrett, 2011) and the MCMC method implemented in the BEAST package v1.8.2 (Drummond, Suchard, Xie, & Rambaut, 2012). The Bayesian skyline plot will be used as the coalescent prior to estimate changes over time in the effective population size scaled by generation time (Ne.g). The MCMC analysis will be run twice for 50 million generations, sampling every 10,000 generations. MCMC convergence will be assessed by estimating the effective sample size (ESS) with Tracer v1.5 (http://tree.bio.ed.ac.uk/software/tracer/). Uncertainties will be reported as 95% highest posterior density (95% HPD) intervals. The results of the two runs will be combined for the final analysis and for estimating Bayes factor (BF) support for host shifts. Transition rates with BF > 3 will be considered significantly supported host shifts between species. The topologies obtained will be summarized in a maximum clade credibility (MCC) tree and annotated with TreeAnnotator (http://beast.bio.ed.ac.uk/treeannotator).
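Two of the summaries named in step 4, the 95% HPD interval and the Bayes factor, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used by BEAST or Tracer: `hpd_interval` takes the narrowest window of sorted posterior samples, and `bayes_factor` divides posterior odds by prior odds, where the posterior and prior probabilities that a transition rate is included are assumed to be supplied by the user (the prior depends on how the analysis is set up). The function names and numeric inputs are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing a fraction `mass` of the sampled values."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples")
    window = max(1, math.ceil(mass * n))  # number of samples inside the interval
    best_lo, best_hi = ordered[0], ordered[window - 1]
    # Slide the window over the sorted samples and keep the narrowest one.
    for start in range(1, n - window + 1):
        lo, hi = ordered[start], ordered[start + window - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


def bayes_factor(posterior_prob, prior_prob):
    """Posterior odds divided by prior odds that a transition rate is included."""
    for p in (posterior_prob, prior_prob):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


# Hypothetical inclusion probabilities for one host-to-host transition rate.
supported = bayes_factor(posterior_prob=0.8, prior_prob=0.3) > 3
```

With the BF > 3 threshold stated above, `supported` flags whether this hypothetical rate would count as a supported host shift.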
5. Host-Shift Events: To determine whether cross-species transmission (CST) influences genetic divergence more strongly than within-species transmission, I will compute genetic distances in PAUP* v.4.0b10 (http://paup.csit.fsu.edu/) using nucleotide substitution models specific to each lineage, and compare them with a cutoff value. Subsequently, transmission of WNV will be quantified by Metropolis-coupled Markov chain Monte Carlo (MC3) coalescent simulation of migration rates, implemented in the program Migrate-N v3.6 (Beerli & Palczewski, 2010). The model of transmission (asymmetrical, bidirectional, or symmetrical, among others) will be assessed, and the transmission web will be visualized using this software. To estimate the potential of the strains to jump into new hosts (sensu Frost & Volz, 2010), or to predict viral emergence, I will estimate the per capita cross-species transmission rate Rij and the effective reproductive number of the pathogen Re.
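The pairwise genetic distances in step 5 can be illustrated with a minimal sketch. It swaps in the simplest possible case, the uncorrected p-distance and its Jukes-Cantor (JC69) correction, rather than the lineage-specific substitution models that would be used in PAUP*. It assumes the two sequences are already aligned and written with DNA letters; the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

Distances computed this way for sequence pairs from the same host and from different hosts could then be compared against the chosen cutoff value.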
6. Host Phylogeny and the Spatiotemporal Structure: The population genetic predictors Ne.g, Rij, and Re will be plotted as functions of time.
Beerli, P., & Palczewski, M. (2010). Unified framework to evaluate panmixia and migration direction among multiple sampling locations. Genetics, 185(1), 313–326. doi:10.1534/genetics.109.112532
Bevan, R. B., Lang, B. F., & Bryant, D. (2005). Calculating the evolutionary rates of different genes: a fast, accurate estimator with applications to maximum likelihood phylogenetic analysis. Systematic Biology, 54(6), 900–915. doi:10.1080/10635150500354829
Darriba, D., Taboada, G. L., Doallo, R., & Posada, D. (2012). jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9(8), 772. doi:10.1038/nmeth.2109
Drummond, A. J., Suchard, M. A., Xie, D., & Rambaut, A. (2012). Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29(8), 1969–1973. doi:10.1093/molbev/mss075
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792–1797. doi:10.1093/nar/gkh340
Frost, S. D. W., & Volz, E. M. (2010). Viral phylodynamics and the search for an “effective number of infections”. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 365(1548), 1879–1890. doi:10.1098/rstb.2010.0060
Hayes, E. B., Komar, N., Nasci, R. S., Montgomery, S. P., O’Leary, D. R., & Campbell, G. L. (2005). Epidemiology and transmission dynamics of West Nile virus disease. Emerging Infectious Diseases, 11(8), 1167–1173. doi:10.3201/eid1108.050289a
Kozlov, A. M., Aberer, A. J., & Stamatakis, A. (2015). ExaML Version 3: A Tool for Phylogenomic Analyses on Supercomputers. Bioinformatics, (March), 1–3. doi:10.1093/bioinformatics/btv184
Lanciotti, R. S., Roehrig, J. T., Deubel, V., Smith, J., Parker, M., Steele, K., … Gubler, D. J. (1999). Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern United States. Science (New York, N.Y.), 286(5448), 2333–2337. doi:10.1126/science.286.5448.2333
Lim, S. M., Koraka, P., Osterhaus, A. D. M. E., & Martina, B. E. E. (2011). West Nile virus: Immunity and pathogenesis. Viruses, 3(6), 811–828. doi:10.3390/v3060811
Longdon, B., Brockhurst, M. A., Russell, C. A., Welch, J. J., & Jiggins, F. M. (2014). The Evolution and Genetics of Virus Host Shifts. PLoS Pathogens, 10(11). doi:10.1371/journal.ppat.1004395
MacKenzie, J. S., & Williams, D. T. (2009). The zoonotic flaviviruses of southern, south-eastern and eastern Asia, and Australasia: The potential for emergent viruses. Zoonoses and Public Health, 56(6-7), 338–356. doi:10.1111/j.1863-2378.2008.01208.x
May, F. J., Davis, C. T., Tesh, R. B., & Barrett, A. D. T. (2011). Phylogeography of West Nile virus: from the cradle of evolution in Africa to Eurasia, Australia, and the Americas. Journal of Virology, 85(6), 2964–2974. doi:10.1128/JVI.01963-10
Tsai, T. F., Popovici, F., Cernescu, C., Campbell, G. L., & Nedelcu, N. I. (1998). West Nile encephalitis epidemic in southeastern Romania. Lancet, 352(9130), 767–771. doi:10.1016/S0140-6736(98)03538-7
Vrancken, B., Lemey, P., Rambaut, A., Bedford, T., Longdon, B., Günthard, H. F., & Suchard, M. A. (2015). Simultaneously estimating evolutionary history and repeated traits phylogenetic signal: applications to viral and host phenotypic evolution. Methods in Ecology and Evolution, 6(1), 67–82. doi:10.1111/2041-210X.12293