Book contents
- Frontmatter
- Contents
- List of Contributors
- Preface
- 1 An Introduction to Next-Generation Biological Platforms
- 2 An Introduction to The Cancer Genome Atlas
- 3 DNA Variant Calling in Targeted Sequencing Data
- 4 Statistical Analysis of Mapped Reads from mRNA-Seq Data
- 5 Model-Based Methods for Transcript Expression-Level Quantification in RNA-Seq
- 6 Bayesian Model-Based Approaches for Solexa Sequencing Data
- 7 Statistical Aspects of ChIP-Seq Analysis
- 8 Bayesian Modeling of ChIP-Seq Data from Transcription Factor to Nucleosome Positioning
- 9 Multivariate Linear Models for GWAS
- 10 Bayesian Model Averaging for Genetic Association Studies
- 11 Whole-Genome Multi-SNP-Phenotype Association Analysis
- 12 Methods for the Analysis of Copy Number Data in Cancer Research
- 13 Bayesian Models for Integrative Genomics
- 14 Bayesian Graphical Models for Integrating Multiplatform Genomics Data
- 15 Genetical Genomics Data: Some Statistical Problems and Solutions
- 16 A Bayesian Framework for Integrating Copy Number and Gene Expression Data
- 17 Application of Bayesian Sparse Factor Analysis Models in Bioinformatics
- 18 Predicting Cancer Subtypes Using Survival-Supervised Latent Dirichlet Allocation Models
- 19 Regularization Techniques for Highly Correlated Gene Expression Data with Unknown Group Structure
- 20 Optimized Cross-Study Analysis of Microarray-Based Predictors
- 21 Functional Enrichment Testing: A Survey of Statistical Methods
- 22 Discover Trend and Progression Underlying High-Dimensional Data
- 23 Bayesian Phylogenetics Adapts to Comprehensive Infectious Disease Sequence Data
- Index
- Plate section
7 - Statistical Aspects of ChIP-Seq Analysis
Published online by Cambridge University Press: 05 June 2013
- Frontmatter
- Contents
- List of Contributors
- Preface
- 1 An Introduction to Next-Generation Biological Platforms
- 2 An Introduction to The Cancer Genome Atlas
- 3 DNA Variant Calling in Targeted Sequencing Data
- 4 Statistical Analysis of Mapped Reads from mRNA-Seq Data
- 5 Model-Based Methods for Transcript Expression-Level Quantification in RNA-Seq
- 6 Bayesian Model-Based Approaches for Solexa Sequencing Data
- 7 Statistical Aspects of ChIP-Seq Analysis
- 8 Bayesian Modeling of ChIP-Seq Data from Transcription Factor to Nucleosome Positioning
- 9 Multivariate Linear Models for GWAS
- 10 Bayesian Model Averaging for Genetic Association Studies
- 11 Whole-Genome Multi-SNP-Phenotype Association Analysis
- 12 Methods for the Analysis of Copy Number Data in Cancer Research
- 13 Bayesian Models for Integrative Genomics
- 14 Bayesian Graphical Models for Integrating Multiplatform Genomics Data
- 15 Genetical Genomics Data: Some Statistical Problems and Solutions
- 16 A Bayesian Framework for Integrating Copy Number and Gene Expression Data
- 17 Application of Bayesian Sparse Factor Analysis Models in Bioinformatics
- 18 Predicting Cancer Subtypes Using Survival-Supervised Latent Dirichlet Allocation Models
- 19 Regularization Techniques for Highly Correlated Gene Expression Data with Unknown Group Structure
- 20 Optimized Cross-Study Analysis of Microarray-Based Predictors
- 21 Functional Enrichment Testing: A Survey of Statistical Methods
- 22 Discover Trend and Progression Underlying High-Dimensional Data
- 23 Bayesian Phylogenetics Adapts to Comprehensive Infectious Disease Sequence Data
- Index
- Plate section
Summary
Introduction: The Purpose of the ChIP-seq Experiment
Every cell is host to a diverse ecosystem of proteins, each protein having its own functional properties. Cell behaviors and processes, such as growth, are dependent on the levels of these various proteins.
Through microarrays, and subsequently high-throughput sequencing, we have become adept at quantifying the expression levels of genes, in various cell types under different conditions. These expression levels provide us with a proxy, albeit an imperfect one, for the levels of protein production in a cell. Indeed, estimates of the correlation between mRNA levels and protein levels are typically very variable [Gry et al., 2009].
We can infer links between cause and effect by perturbing a cell in some way and measuring which genes' mRNA levels change in response. However, this information is, by itself, unsatisfying – specifically, we seek to understand the regulatory mechanisms underlying these links. This has clinical significance – ultimately, we may be able to manipulate these mechanisms ourselves, potentially leading to novel therapies.
Consider the following example: in breast cancer, malignant tumors typically exhibit an invasive, proliferative behavior characterized by abnormal growth (Weinberg, 2007). A critical factor in establishing this behavior is the estrogen receptor (ER). In particular, ER is known to encourage mitotic cell division. This function is, in itself, not dangerous – cells in healthy tissue commonly divide to replace lost cells. However, ER is known to be a key factor in establishing cancer, and approximately 70% of breast tumors are labeled as ER-positive – that is, their cells have ER content above a certain threshold (Tannock and Hill, 1998; Mohibi et al., 2011).
- Type
- Chapter
- Information
- Advances in Statistical BioinformaticsModels and Integrative Inference for High-Throughput Data, pp. 138 - 169Publisher: Cambridge University PressPrint publication year: 2013
- 1
- Cited by