Book contents
- Frontmatter
- Contents
- List of Contributors
- Preface
- 1 An Introduction to Next-Generation Biological Platforms
- 2 An Introduction to The Cancer Genome Atlas
- 3 DNA Variant Calling in Targeted Sequencing Data
- 4 Statistical Analysis of Mapped Reads from mRNA-Seq Data
- 5 Model-Based Methods for Transcript Expression-Level Quantification in RNA-Seq
- 6 Bayesian Model-Based Approaches for Solexa Sequencing Data
- 7 Statistical Aspects of ChIP-Seq Analysis
- 8 Bayesian Modeling of ChIP-Seq Data from Transcription Factor to Nucleosome Positioning
- 9 Multivariate Linear Models for GWAS
- 10 Bayesian Model Averaging for Genetic Association Studies
- 11 Whole-Genome Multi-SNP-Phenotype Association Analysis
- 12 Methods for the Analysis of Copy Number Data in Cancer Research
- 13 Bayesian Models for Integrative Genomics
- 14 Bayesian Graphical Models for Integrating Multiplatform Genomics Data
- 15 Genetical Genomics Data: Some Statistical Problems and Solutions
- 16 A Bayesian Framework for Integrating Copy Number and Gene Expression Data
- 17 Application of Bayesian Sparse Factor Analysis Models in Bioinformatics
- 18 Predicting Cancer Subtypes Using Survival-Supervised Latent Dirichlet Allocation Models
- 19 Regularization Techniques for Highly Correlated Gene Expression Data with Unknown Group Structure
- 20 Optimized Cross-Study Analysis of Microarray-Based Predictors
- 21 Functional Enrichment Testing: A Survey of Statistical Methods
- 22 Discover Trend and Progression Underlying High-Dimensional Data
- 23 Bayesian Phylogenetics Adapts to Comprehensive Infectious Disease Sequence Data
- Index
- Plate section
23 - Bayesian Phylogenetics Adapts to Comprehensive Infectious Disease Sequence Data
Published online by Cambridge University Press: 05 June 2013
Summary
Introduction
Comprehensive RNA viral surveillance is now possible thanks to new sequencing platforms and reduced costs (Mardis, 2008). This coverage has transformed the resources available for understanding viral evolution. Despite this promise, clinical applications continue to lag, in part because of a lack of statistical inference tools (Holmes, 2009). Vaccine development and viral eradication rely on determining how viral populations will respond to evolutionary pressures and on pinpointing key drug-resistant mutations (Chen and Lee, 2006). The current infusion of data from modern viral surveillance may provide answers to these questions.
We focus on rapidly evolving RNA viruses (Drummond et al., 2003), a true challenge for vaccine development, which must respond to genetically diverse viral populations (Levin et al., 1999). This diversity derives from a fast mutation rate, up to a million-fold faster than in DNA replication, due to poor proofreading (Holland et al., 1982) and a rapid replication time (Belshaw et al., 2008). These conditions result in populations that can be observed evolving on a human timescale (Duffy et al., 2008), providing a uniquely tractable microcosm. Although viral evolution is affected by difficult-to-measure effects such as environmental factors and travel, other effects, such as genetic mutation, reassortment, and recombination, are scientifically tractable when framed within phylogenetics (Morens et al., 2004).
In contrast to the 1918 Spanish influenza pandemic, for which the three available samples provide an ambiguous evolutionary history (Reid et al., 1999; Gibbs et al., 2001; Worobey et al., 2002; Taubenberger et al., 2005), analysis of twenty-first century outbreaks such as the 2009 swine flu will be aided by hundreds (Smith et al., 2009), or even thousands, of contemporaneous sequences.
- Type: Chapter
- Information: Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput Data, pp. 460-476
- Publisher: Cambridge University Press
- Print publication year: 2013