Book contents
- Frontmatter
- Contents
- Contributors
- Introduction
- Part A Horizontal Meta-Analysis
- Part B Vertical Integrative Analysis (General Methods)
- Part C Vertical Integrative Analysis (Methods Specialized to Particular Data Types)
- 12 eQTL and Directed Graphical Model
- 13 MicroRNAs: Target Prediction and Involvement in Gene Regulatory Networks
- 14 Integration of Cancer Omics Data into a Whole-Cell Pathway Model for Patient-Specific Interpretation
- 15 Analyzing Combinations of Somatic Mutations in Cancer Genomes
- 16 A Mass-Action-Based Model for Gene Expression Regulation in Dynamic Systems
- 17 From Transcription Factor Binding and Histone Modification to Gene Expression: Integrative Quantitative Models
- 18 Data Integration on Noncoding RNA Studies
- 19 Drug-Pathway Association Analysis: Integration of High-Dimensional Transcriptional and Drug Sensitivity Profile
- Index
- Color plates
18 - Data Integration on Noncoding RNA Studies
from Part C - Vertical Integrative Analysis (Methods Specialized to Particular Data Types)
Published online by Cambridge University Press: 05 September 2015
Abstract
Recent genome-wide studies have revealed that the human genome encodes over 10,000 long non-coding RNAs (lncRNAs) with little protein-coding capacity. Growing evidence suggests that many lncRNAs have important functions in complex diseases and are potentially a new class of therapeutic targets for treating them. In contrast to the fast pace of cataloguing lncRNAs in the human genome, the functions of the vast majority of lncRNAs remain unknown. In this chapter, we describe data integration strategies for identifying lncRNAs that are associated with cancer subtypes and clinical prognosis, and for predicting those that are potential drivers of cancer progression.
Introduction
Advances in high-throughput technologies such as microarrays and next-generation sequencing (NGS) have greatly facilitated cost-effective, large-scale data generation. As a result, the amount of genomic data deposited in public data sources such as Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) and ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) has grown tremendously in the past several years. Taking the NCBI Short Read Archive database (http://www.ncbi.nlm.nih.gov/sra) as an example, the amount of data in this database went from about 10 terabytes (TB) in 2008 to about 1000 TB in 2012, a roughly 100-fold increase in only four years. These public data sources not only provide the raw data that researchers need to reproduce the discoveries reported in the original studies, but also provide opportunities to use the same data for new discoveries. Moreover, integrating data across individual studies, either horizontally or vertically, offers unique opportunities to make novel discoveries that would be impossible with the data from a single study. The integration of genomic data from the same individual under a specific disease condition is particularly powerful for disease-relevant discoveries. In such genomics-based clinical studies, orthogonal genomic data and the corresponding clinical information are systematically collected from the same group of human subjects. These data can be integrated to discover genes that play important roles in the etiology of the disease and genes that may serve as diagnostic, prognostic, and predictive biomarkers.
- Type: Chapter
- Information: Integrating Omics Data, pp. 403-424. Publisher: Cambridge University Press. Print publication year: 2015