Abbreviation: GST, glutathione S-transferase
Technologic advances now make it possible to collect large amounts of genetic, epigenetic, proteomic, metabolomic and gut microbiome data. Many of the applications of this multi-dimensional data have been in the areas of disease detection, prognosis and treatment. However, such approaches may also lend themselves towards characterising healthy phenotypes and more effectively informing dietary recommendations for maintaining or improving the health of individuals on a personal level. Omics data have the potential to transform our approach towards nutrition counselling by allowing us to recognise and embrace the metabolic, physiologic and genetic differences among individuals. The ultimate goal would be to integrate these multi-dimensional data so as to characterise the health status and disease risk of an individual and to provide personalised dietary recommendations to maximise health. To this end, accurate and predictive system-based measures of health are needed that incorporate molecular signatures of genes, transcripts, proteins, metabolites and microbes.
Nutrition, as a science, has a long tradition of determining the nutrient requirements of heterogeneous populations eating a wide variety of diets and of providing dietary recommendations for health. This has typically involved simplifying the inherent complexity into manageable recommendations in the form of dietary guidance for the purpose of preventing disease in a population. Despite the application of biostatistical approaches with the goal to be as inclusive of the population as possible, there are limitations due to assumptions that metabolic organisational structure is uniform among individuals and that direct cause–effect relationships exist. In reality, the large number of functional redundancies and adaptive mechanisms that provide for homoeostasis( Reference Jones, Park and Ziegler 1 ) make evaluating the complexities and nuances challenging.
The concept of a ‘nutritional phenotype’, i.e. an integrated set of genetic, proteomic, metabolomic, functional and behavioural factors that, when measured, could provide the basis for assessment of human nutritional status, was introduced several years ago by Zeisel et al. ( Reference Zeisel, Freake and Bauman 2 ). It was proposed as a way to integrate the effects of diet on disease/wellness and provide a quantitative indication of the paths by which genes and environment exert their effects on health( Reference Jones, Park and Ziegler 1 ). The concept provides a good base from which to begin to establish approaches to personalised dietary recommendations; however, several questions need to be addressed. These include, but are not necessarily limited to: What data will we need on an individual in order to personalise dietary recommendations? How can we use controlled feeding studies and other dietary interventions to generate a nutritional phenotypic framework? How can we most effectively integrate omics data so as to be able to apply them towards personalised nutrition?
What data will we need on an individual in order to personalise dietary recommendations?
Numerous factors contribute to variation in nutritional requirements and responses to diet, including sex, stage of life cycle, disease, physical activity level, genetic background, gut microbial community and environmental exposures. Several of these are already considered in the construction of personalised nutritional recommendations; for example, sex, age, adiposity and activity level are routinely used in determining nutrient requirements in healthy individuals and understanding the contributions of disease state to nutritional requirements is a hallmark of therapeutic nutrition. To date, the more complex factors such as genomics, host microbial community structure and environmental exposures are often not included in the equation.
Genetic polymorphisms are well-recognised sources of variation in human response to some aspects of diet, including taste preference, food tolerance, nutrient absorption, transport and metabolism, and effects at target tissues( Reference Lampe, Potter, Costa and Eaton 3 ). Typically, in past studies, one particular genetic variant has been considered in relation to intake of one particular nutrient. For example, two polymorphisms in the MTHFR gene (C677T and A1298C) are associated with reduced methylenetetrahydrofolate reductase activity and higher homocysteine concentrations( Reference Lampe, Potter, Costa and Eaton 3 ). Carriers of these polymorphisms are at higher risk of CVD; thus sufficient intake of folate is particularly important. Other examples include iron overload and haemochromatosis, copper malabsorption and Menkes disease, and glucose-6-phosphate dehydrogenase and consumption of fava beans, high in pro-oxidant glycosides (favism; reviewed in( Reference Lampe, Potter, Costa and Eaton 3 )). Further, genomics may contribute to phenotypic differences in health behaviour and modify response to interventions designed to change health behaviours( Reference Bryan and Hutchison 4 ).
Several genome-wide association studies have evaluated the association between multiple SNP and metabolomics profiles. In a sample of 284 men, Gieger et al. ( Reference Gieger, Geistlinger and Altmaier 5 ) integrated genome-wide association study data with serum metabolomics-based quantitation of 363 metabolites. They reported associations of frequent SNP with differences in metabolic homoeostasis, explaining up to 12% of the observed variance. Using ratios of certain metabolite concentrations as a proxy for enzymatic activity, up to 28% of the variance could be explained (P-values 10^-16 to 10^-21). Four variants in genes coding for enzymes (FADS1, LIPC, SCAD and MCAD) were identified where a corresponding metabolic phenotype (metabotype) clearly matched the biochemical pathways in which these enzymes are active.
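The reasoning behind using metabolite ratios can be illustrated with a minimal sketch. The data below are invented for illustration and are not taken from Gieger et al.; the variable names (`allele_count`, `substrate`, `product`) and the simple straight-line fit are assumptions. The sketch shows why a product/substrate ratio can track genotype more closely than the product concentration alone: dividing by the substrate removes between-subject variation in substrate supply.

```python
def r_squared(x, y):
    """Fraction of variance in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data: copies of a variant allele at one SNP, and the
# concentrations of an enzyme's substrate and product in six subjects.
allele_count = [0, 0, 1, 1, 2, 2]
substrate = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
product = [1.0, 3.0, 1.5, 4.5, 2.0, 6.0]

# The ratio acts as a crude proxy for enzymatic activity.
ratio = [p / s for p, s in zip(product, substrate)]

r2_product = r_squared(allele_count, product)
r2_ratio = r_squared(allele_count, ratio)

print(f"variance explained, product alone: {r2_product:.2f}")
print(f"variance explained, product/substrate ratio: {r2_ratio:.2f}")
```

In this contrived example the genotype explains all of the variance in the ratio but only a small part of the variance in the product concentration, because the latter is dominated by differences in substrate level. Real data would be far noisier.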
More recently, Suhre et al. ( Reference Suhre, Wallaschofski and Raffler 6 ) conducted an analysis of genotype-dependent metabolic phenotypes using a genome-wide association study with non-targeted metabolomics in a sample of 1768 individuals. They identified thirty-seven genetic loci associated with blood metabolite concentrations, of which twenty-five showed effect sizes that accounted for 10–60% difference in metabolite levels per allele copy. These results provided functional insights into disease-related associations that have been reported in previous studies, including those for cardiovascular and renal disorders, type 2 diabetes, cancer, gout, venous thromboembolism and Crohn's disease.
The human gut microbial community also shapes host exposure to dietary constituents by modulating absorption, storage and energy harvest from the diet. It is a large, complex ecosystem, with the number of different species of bacteria estimated to range from 300 to 1000 and the majority of the species diversity distributed between the phyla Firmicutes and Bacteroidetes( Reference Guarner and Malagelada 7 , Reference Qin, Li and Raes 8 ). There is high inter-individual variation in the composition of communities, mostly at the species level( Reference Gill, Pop and Deboy 9 ), whereas the distribution of bacterial functional genes is less varied. This functional redundancy is a hallmark of a stable symbiosis in which many different species carry out the same functional role.
Recent studies suggest that individuals can be clustered into distinct groups based on their gut microbiome composition and functional metabolism( Reference Arumugam, Raes and Pelletier 10 ). The underlying metabolism of the dominant bacteria that define these groups is the degradation of plant polymers (e.g. dietary fibre) via different metabolic pathways; long-term dietary habits have been associated with these groupings( Reference Arumugam, Raes and Pelletier 10 ). Through the metabolism of dietary constituents, the gut microbiome can influence the magnitude and flux of metabolites to which the host is exposed and some of the variations in what have been identified as genotype-dependent metabolic phenotypes actually may be due to the composition and activity of the gut microbiome( Reference Donohoe and Bultman 11 , Reference Donohoe, Garge and Zhang 12 ). Indeed, of the genotype-dependent metabolic phenotypes identified by Suhre et al. ( Reference Suhre, Wallaschofski and Raffler 6 ), an altered microbiome has been associated with CVD( Reference Ordovas and Mooser 13 ), type 2 diabetes( Reference Qin, Li and Cai 14 ), some cancers( Reference Marchesi, Dutilh and Hall 15 , Reference Plottel and Blaser 16 ) and Crohn's disease( Reference Willing, Dicksved and Halfvarson 17 ). However, the relationships between the gut microbiome, diet and metabolic phenotypes need to be addressed in rigorous experimental settings using approaches that integrate metabolomics, host genomics and the gut microbiome.
How can we use controlled feeding studies and other dietary interventions to develop phenotype profiles?
Controlled feeding studies in healthy human subjects have been used for over a century to establish the quantitative requirements and confirm essentiality of nutrients in human subjects. Typically, these studies had small sample sizes, were intensively controlled, and often focused on restriction and re-feeding of specific nutrients or nutrient sources. They were used to evaluate the acute effects of food deprivation, show experimentally the effects of dietary restrictions on development of deficiency diseases, establish specific amino acid requirements and describe vitamin metabolism( Reference Lampe 18 ). Consequently, they were crucial in determining recommended daily dietary allowances. Controlled interventions and defined background diets have also been useful for testing response to varying doses of a dietary constituent( Reference Navarro, Chang and Peterson 19 ) and for testing and monitoring biomarkers of disease susceptibility and dietary exposure( Reference Cross, Major and Sinha 20 ). More recently, dietary interventions have been used to test the effects of particular dietary patterns( Reference Neuhouser, Schwarz and Wang 21 ) and to test genotype–phenotype interactions( Reference Navarro, Chen and Li 22 ).
Controlled feeding studies, particularly with randomised crossover designs where each person serves as their own control, are a useful venue in which to test genotype–diet interactions as well as genotype–phenotype associations. In the latter case, the relationship between genotype and phenotype can sometimes be better characterised on the background of the same dietary exposures (i.e. a controlled diet)( Reference Navarro, Chen and Li 22 ). Participant screening protocols for recruitment into controlled feeding studies also can be set up to enrich a priori for particular genotypes or phenotypes so as to provide more equal distributions of sample sizes in subgroups, particularly if the prevalence of a particular variant is low, and to increase statistical power to compare these subgroups.
Controlled feeding studies also provide a useful approach in which to characterise host-gut microbial interactions and to determine gut microbial community response to diet. In the context of controlled dietary interventions, gut bacterial community composition has been shown to differ significantly when participants consume different diets( Reference Russell, Gratz and Duncan 23 , Reference Li, Hullar and Schwarz 24 ), although the overall response of the gut bacterial community is often unique for each individual( Reference Li, Hullar and Schwarz 24 , Reference Tuohy, Kolida and Lustenberger 25 ). Most studies have tested effects of fermentable complex carbohydrates (e.g. dietary fibres, resistant starch; Table 1). Network analysis of the gut microbial community reveals niche specialisation based on a metabolic interconnection between different bacteria that are often specialised in one enzymatic transformation in the pathway of dietary metabolism( Reference Arumugam, Raes and Pelletier 10 , Reference Arumugam, Raes and Pelletier 26 – Reference Lozupone, Faust and Raes 28 ). The type of carbohydrate ingested often influences the prevalence of certain groups of gut bacteria and the subsequent composition of the microbial metabolic end products to which the host is exposed (e.g. SCFA; Table 1). Differences in gut microbial metabolism of various phytochemicals also contribute to gut bacterial metabolic phenotypes that influence dietary exposures( Reference Bolca, Van de Wiele and Possemiers 29 ). Being able to test for the effects of these phenotypes in the context of nutrition interventions is important, since some subgroups may be more responsive to the intervention than others. For example, Niculescu et al. ( Reference Niculescu, Pop and Fischer 30 ) reported differential lymphocyte gene expression by bacterial metabolic phenotype in postmenopausal women receiving an isoflavone supplement; a greater increase in oestrogen-responsive genes was observed in women who carried the bacteria capable of converting the soya isoflavone daidzein to equol.
DF, dietary fibre; F, female; FISH, fluorescent in-situ hybridisation of 16S rRNA genes; FOS, fructo-oligosaccharide; HPMC, high-protein and moderate-carbohydrate diet; HPLC, high-protein and low-carbohydrate diet; LKF, lupin kernel fibre; M, male; PCA, principal components analysis; PHGG, partially hydrolysed guar gum; WB, wheat bran; WG, whole grain.
‘Omics’ – transcriptomics, proteomics and metabolomics – approaches have been hypothesised to revolutionise our understanding of the interactions of the various systems that are often studied in isolation and have the potential to revolutionise many aspects of our study of nutrition and health promotion. Despite the excitement, at this stage, the technologies still require rigorous evaluation and validation. Controlled feeding studies are a useful approach in which to validate and test the robustness of these omics approaches with the goal of ultimately being able to use them to evaluate the effects of totality of diet on totality of response in human subjects. In addition, they provide important details on the behaviour of proteins, transcripts and metabolites under controlled conditions.
Several studies have used the construct of controlled feeding interventions to test effects of diet on omics measures (Table 2). The majority of these have utilised metabolomics to characterise response to phytochemical-containing foods (fruits, vegetables, tea, nuts) compared with a control in healthy individuals. Many of the metabolites identified typically correspond to dietary biomarkers of the intervention foods consumed (e.g. proline betaine after consumption of citrus fruits). Although many studies also yield a handful of endogenous metabolites that differ in abundance between the interventions, these compounds are often generally reported as differences in metabolite profiles owing to a lack of adequate pathway analysis tools. Thus, it is often unclear whether differences in metabolite profiles are indicative of perturbations in specific pathways or molecular targets in response to the dietary intervention, or are unrelated compounds identified by chance. Some investigators have explored pathways manually. For example, Solanky et al. ( Reference Solanky, Bailey and Beckwith-Hall 31 ) found that soya consumption was associated with osmolyte fluctuations and differences in energy metabolism. Work in our laboratory (DH May, SL Navarro, I Ruczinski et al., unpublished results) suggests potential differences in energy utilisation from glucose to fat between a diet devoid of fruits and vegetables compared with a diet high in crucifers, citrus and soya. These examples provide provocative views of other mechanisms through which plant foods may promote health; however, even with manual analyses, the interpretation is still broad, speculative and incomplete.
BW, body weight; F, female; GSTM1, glutathione S-transferase M1; M, male; SMCSO, S-methyl-L-cysteine sulfoxide; TMAO, trimethylamine-N-oxide; TTR, transthyretin; ZAG, zinc α-2-glycoprotein.
Other investigations have employed alternative omics technologies to study response to diet and have evaluated other endpoints beyond differences in metabolite profiles. Brauer et al. ( Reference Brauer, Libby and Mitchell 32 ) interrogated the proteome in response to 2 weeks of a diet high in cruciferous vegetables, and assessed whether response differed by glutathione S-transferase (GST)M1 genotype. GST enzymes metabolise a variety of exogenous compounds, including isothiocyanates from cruciferous vegetables, and the GSTM1 variants resulting in a complete lack of gene product are common( Reference Navarro, Chen and Li 22 ). Twenty-four distinct peaks were associated with cruciferous vegetable consumption compared with a fruit- and vegetable-free diet, two of which were identified that changed in a GSTM1-genotype-dependent manner. Another study provides an example of a novel use of omics to link metabolic phenotypes with dietary preferences. Taking a targeted approach, Rezzi et al. ( Reference Rezzi, Ramadan and Martin 33 ) used lipidomics to determine metabolites associated with chocolate ‘desiring’ or chocolate ‘indifferent’ preferences among individuals consuming 50 g/d chocolate or bread as a placebo. Heinzmann et al. ( Reference Heinzmann, Merrifield and Rezzi 34 ) used metabolomics to study the stability of phenotypic response to diet through sequential dietary challenges. They found that inter-individual differences were often greater than differences within an individual in response to dietary modulation, providing evidence that individuals each have a unique metabolic phenotype. Moreover, intra-individual differences between consecutive dietary challenges were linked to differences in excretion of microbial co-metabolites, suggesting flexibility in gut microbiome function in response to dietary modulation. As the authors point out, these differences illustrate the importance of assessing response to diet in the context of a crossover rather than parallel study design in order to move towards personalised nutrition. As a whole, these controlled feeding studies illustrate the potential for omics technology in characterising individual nutritional phenotypes, but make evident the challenges (i.e. compound identification, pathway analysis) that still exist.
How can we most effectively integrate omics data so as to be able to apply them towards personalised nutrition?
Given that cellular functions are carried out via orchestrated activities of multiplex components of biological systems, data from different omics platforms can shed light on cellular activities at different levels. Methods that integrate omics data from different molecular profiling studies, e.g. data from transcriptomics, proteomics or metabolomics studies, have the potential to provide new insight into how different components of biological systems interact with each other and form the basis of an individual's health. Here, we provide an overview of available methods of data integration from multiple omics platforms, provide examples of each of different approaches, and discuss their advantages and limitations.
Current methods for integrative analysis of omics data from multiple data platforms can be broadly grouped into three categories. The first class of models, which we refer to as concordance analysis methods, studies concordance/correlation between two omics datasets, e.g. comparing the gene expression levels and proteomics datasets on the same set of subjects. The objective of such an approach is to identify genes/proteins/metabolites with an orchestrated activity in a given biological setting. To this end, methods of multivariate analysis, including different variations of principal component analysis, partial least squares, self-organising maps, as well as methods of network visualisation and analysis, have been used to assess the associations among multiple datasets. For instance, Hirai et al. ( Reference Hirai, Yano and Goodenowe 35 ) applied principal component analysis as well as self-organising maps to discover relationships between transcriptome and metabolome in Arabidopsis. In another study, Hirai et al. ( Reference Hirai, Klein and Fujikawa 36 ) analysed the network of gene-to-gene and gene-to-metabolite associations. More recently, Cao et al. ( Reference Lê Cao, Martin and Robert-Granié 37 ) proposed a sparse partial least squares procedure for comparative analysis of data from two omics platforms and applied their method to data from cDNA and Affymetrix chips in NCI60 cancer cell lines.
Concordance analysis methods provide interesting information about components of biological systems that interact with each other in a given setting. Moreover, such analyses can lend themselves to better classificatory models based on a combination of biomarkers from different platforms. However, these approaches often provide limited new insight into the underlying biological mechanisms as omics data from different platforms often show low levels of correlation due to complex mappings of genes to proteins and metabolites, and various post-transcriptional events( Reference Gygi, Rochon and Franza 38 ). Further, the underlying assumption in the majority of these methods is that omics measurements are obtained on the same set of individuals, or more formally, share a common dimension. Van Deun et al. ( Reference Van Deun, Smilde and van der Werf 39 ) reviewed these different approaches for analysis of multiple omics data, in the setting where the datasets share a common set of features.
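As a minimal sketch of the simplest form of concordance analysis, the example below computes a per-gene Pearson correlation between transcript and protein abundance measured on the same subjects. The gene names, values and the 0.8 cut-off are hypothetical, and a plain correlation stands in for the multivariate methods (principal component analysis, partial least squares, self-organising maps) used in the cited studies.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: both platforms measured on the same five subjects,
# in the same subject order (the "common dimension" the methods assume).
transcripts = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 1.0, 4.0, 3.0, 5.0],
}
proteins = {
    "geneA": [2.0, 4.0, 6.0, 8.0, 10.0],
    "geneB": [5.0, 3.0, 4.0, 1.0, 2.0],
}

# Concordance score per gene, then keep genes above an arbitrary cut-off.
concordance = {g: pearson(transcripts[g], proteins[g]) for g in transcripts}
concordant = [g for g, r in concordance.items() if r >= 0.8]

for gene, r in concordance.items():
    print(f"{gene}: r = {r:.2f}")
print("concordant genes:", concordant)
```

Here `geneB` shows the kind of weak transcript–protein agreement that the text attributes to post-transcriptional events; a concordance-only analysis would discard it even if it were biologically relevant.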
The second class of integrative models, which we refer to as sequential integration, includes methods that incorporate multiple sets of omics data in order to discover new biomarkers or delineate biological mechanisms of complex phenotypes. It uses multiple omics datasets, in a sequential manner, to further narrow down, or expand, the set of biomarkers. Sequential integration methods can exploit different methods of data analysis, from simple differential expression analysis to gene-set enrichment analysis or analysis of networks. In examples of such an approach, Putluri et al. ( Reference Putluri, Shojaie and Vasu 40 ) first identified the set of differentially active metabolites, and then used meta genomic data to identify pathways associated with prostate cancer progression. In another study, Putluri et al. ( Reference Putluri, Shojaie and Vasu 41 ) coupled this approach with a concordance analysis based on metabolomics flux measurements to delineate pathways and biomarkers associated with bladder cancer. More recently, Imielinski et al. ( Reference Imielinski, Cha and Rejtar 42 ) used gene-set enrichment analysis coupled with the knowledge of biological networks and compared two sequential approaches, called ‘gene-centric’ and ‘protein-centric,’ in a study of molecular bases of breast cancer. In each of these approaches, the authors first evaluated the enrichment of biological pathways based on one source of data (transcriptomic or proteomic) and then filtered the set of identified pathways based on the second source of data. The authors also compared the results of these methods with a concordance-based approach, where the pathways were identified based on gene and protein pairs that demonstrated orchestrated levels of activity.
Sequential integration methods offer an opportunity to gain new insight based on multiple sources of omics data. Moreover, these methods do not require the omics measurements to be necessarily observed for the same set of individuals. Finally, unlike methods of concordance analysis, which cannot be directly extended to analysis of more than two sets of omics data, sequential integration methods offer the flexibility of analysing multiple omics datasets. However, the power of these methods is clearly limited by the ability of the omics data chosen for the first stage of analysis to capture important biological mechanisms. As the study by Imielinski et al. ( Reference Imielinski, Cha and Rejtar 42 ) indicates, the results of the analysis can vary depending on the omics platform used for the first stage of analysis. This sensitivity to the order of analysis can potentially hinder the applications of sequential integration methods, and additional studies are needed to determine whether data-driven criteria can be developed to assess the optimal order of analysis in these methods.
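The sensitivity to the order of analysis can be seen in a deliberately simplified sketch. It is not the procedure of Imielinski et al.: the pathway names, enrichment scores, `top_k` and `threshold` are all hypothetical, and a top-k selection followed by a threshold filter stands in for gene-set enrichment analysis. The first platform selects candidate pathways and the second platform filters them.

```python
def sequential_filter(first_scores, second_scores, top_k, threshold):
    """Two-stage selection of pathways.

    Stage 1: keep the top_k pathways ranked by the first platform's score.
    Stage 2: keep only those whose second-platform score reaches threshold.
    """
    stage_one = sorted(first_scores, key=first_scores.get, reverse=True)[:top_k]
    return sorted(p for p in stage_one if second_scores[p] >= threshold)


# Hypothetical enrichment scores for four pathways on two platforms.
transcript_scores = {
    "glycolysis": 3.1,
    "beta_oxidation": 2.4,
    "tca_cycle": 2.1,
    "urea_cycle": 0.9,
}
protein_scores = {
    "glycolysis": 1.2,
    "beta_oxidation": 2.8,
    "tca_cycle": 2.7,
    "urea_cycle": 2.6,
}

# Same data, same settings; only the order of the two platforms differs.
gene_centric = sequential_filter(
    transcript_scores, protein_scores, top_k=2, threshold=2.0
)
protein_centric = sequential_filter(
    protein_scores, transcript_scores, top_k=2, threshold=2.0
)

print("gene-centric:   ", gene_centric)
print("protein-centric:", protein_centric)
```

With these invented scores the two orders return different pathway sets, because a pathway eliminated at the first stage can never be recovered at the second.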
The final group of omics integration techniques, which we refer to as concurrent integration methods, includes emerging approaches that attempt to address some of the shortcomings of the afore-mentioned two sets of approaches. Similar to sequential integration methods, concurrent integration methods try to exploit the information content of multiple sets of omics data. However, these methods often include measures of activity of biological pathways, or their components, based on multiple omics data. This is often achieved by defining a combined score for the activity of each pathway based on activities of its members measured by different omics datasets. Poisson et al. ( Reference Poisson, Taylor and Ghosh 43 ) compared the performance of a number of methods for combining data from multiple omics platforms, by considering different summary measures defined based on individual test statistics, with methods based on a single omics data source, and showed that the integrative approaches can improve the power of the analysis. In a recent study, Jauhiainen et al. ( Reference Jauhiainen, Nerman and Michailidis 44 ) proposed a multivariate approach, using a mixed linear model, to assess the association of transcriptomics and metabolomics measurements with cancer progression. The proposed model requires measurements to be observed on the same set of samples, but offers the potential for discovering novel biological mechanisms, as well as biomarker identification. On the other hand, Shojaie et al. (A Shojaie, K Panzitt, N Putluri et al., unpublished results) propose a network-based method, based on the NetGSA method( Reference Shojaie and Michailidis 45 ), for integrating multiple sources of omics data, which can be applied to data from different samples. This procedure does not lend itself directly to selection of biomarkers, and follow-up analyses are needed to determine which components of the selected pathways should be used as biomarkers.
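One way to picture a combined pathway score is Stouffer's method, which pools per-platform z-statistics into a single statistic. This is offered only as an illustration: it is an assumption that such a summary resembles those compared by Poisson et al., the pathway names and z-values are invented, and the method treats the platforms as independent, which may not hold when all platforms are measured on the same samples.

```python
import math


def stouffer_z(z_scores):
    """Combine independent z-statistics into one: sum(z) / sqrt(k)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


# Hypothetical per-platform z-statistics for two pathways.
pathway_z = {
    "pathway_1": {"transcriptomics": 1.5, "proteomics": 1.5, "metabolomics": 1.5},
    "pathway_2": {"transcriptomics": 2.5, "proteomics": 0.1, "metabolomics": -0.2},
}

CRITICAL_Z = 1.96  # two-sided 5% level for a standard normal statistic

combined = {
    name: stouffer_z(list(by_platform.values()))
    for name, by_platform in pathway_z.items()
}

for name, by_platform in pathway_z.items():
    single_hits = [p for p, z in by_platform.items() if abs(z) > CRITICAL_Z]
    print(
        f"{name}: combined z = {combined[name]:.2f}, "
        f"platforms significant on their own: {single_hits}"
    )
```

In this toy case `pathway_1` is not significant on any single platform but exceeds the critical value once the modest, consistent signals are pooled, whereas `pathway_2` is significant on one platform alone but not after pooling.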
Concurrent integrative methods have also been proposed for gaining insight into biological mechanisms in the cell. An example of such an approach is the proposal of Shojaie et al. (A Shojaie, A Jauhiainen, M Kallitsis and G Michailidis, unpublished results) to integrate perturbation screens and steady-state gene expression profiles for discovering causal genetic regulatory mechanisms. In this study, the authors compare their proposed integrative approach with state-of-the-art methods based on a single source of omics data, and show that superior estimates of regulatory networks can be obtained by combining multiple omics data. Table 3 summarises the different classes of integration methods.
DE, differential expression analysis; GSEA, gene-set enrichment analysis; CA, correlation analysis; PCA, principal component analysis; SOM, self-organising maps; PLS, partial least squares; OCM, Oncomine concept mapping; NetGSA, network-based gene-set analysis.
Novel biomedical technologies continue to improve the quality of the omics data, as well as to reduce the cost of obtaining such data. In nutrition studies, biological experiments now generate multiple sources of omics data including transcriptomic, proteomic, metabolomic and gut microbial community measurements. The main challenge is now integrating such measurements in a systematic way, in order to provide a holistic view of biological systems. As more and more measurements become available, the complexity of the analysis, i.e. the number of variables in statistical models, increases. This poses additional challenges for design of trials, and necessitates the use of advanced statistical models appropriate for analysis of high-dimensional problems. A potential solution for this challenge is to incorporate available biological knowledge, including information on biological pathways and genetic, protein interaction and metabolic networks. Incorporating biological information can both reduce the dimensionality of the problem, and also improve the power and reproducibility of analysis methods.
A way forward to personalised nutrition
There is still a lot of effort needed to establish a robust health phenotype framework on which to develop personalised dietary recommendations. The improving omic technologies and the ability to integrate various omics platforms in a systematic fashion will facilitate providing a holistic view of cellular functions related to healthy phenotypes; however, the characterisation of the contribution of diet to the biochemical and metabolic parameters associated with healthy phenotypes would benefit from systematic evaluation under controlled conditions in well-described groups of individuals. Controlled human feeding studies are a useful experimental setting in which to conduct this work. Nonetheless, these types of studies are expensive and funding multiple, new large-scale dietary interventions that capture a variety of dietary patterns and intakes is likely to be prohibitive.
An efficient and effective way to develop some of the necessary omics databases under experimental conditions may be to take a collaborative approach, leveraging existing samples from previously conducted human interventions. Stored samples from controlled feeding studies are stashed away in freezers around the globe and in many cases are well characterised and ideal for further omic analysis. Statistical techniques for integrating multiple omics data from a common platform but different study populations, i.e. meta-analysis techniques, already exist; they improve statistical power by integrating samples from multiple related studies( Reference Coviello, Haring and Wellons 46 – Reference Rhodes, Barrette and Rubin 54 ) and also allow for testing of reproducibility of results across studies. Looking towards future studies, the adoption of standardised sample and metadata collection protocols would allow for easier pooling of data across studies.
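The gain in statistical power from pooling related studies can be sketched with a fixed-effect, inverse-variance meta-analysis. This is one standard technique chosen for illustration, not necessarily the one used in the cited references, and the effect estimates and standard errors below are hypothetical.

```python
import math


def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effect of a diet on one metabolite, estimated separately
# in three feeding studies that used the same omics platform.
estimates = [0.30, 0.50, 0.40]
std_errors = [0.20, 0.20, 0.10]

pooled, pooled_se = fixed_effect_pool(estimates, std_errors)

print(f"pooled estimate: {pooled:.3f}")
print(f"pooled standard error: {pooled_se:.3f}")
print(f"smallest single-study standard error: {min(std_errors):.3f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which pooling improves power. A fixed-effect model assumes the studies estimate the same underlying effect; where diets or populations differ substantially, a random-effects model would be the more cautious choice.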
Overall, the careful collection and integration of omics data from controlled dietary interventions may provide us with the data necessary to successfully move towards a goal of more personalised dietary recommendations. Nonetheless, even with the generation of expansive, integrated datasets that allow for in-depth characterisation of health phenotypes, several factors need to be considered if personalised nutrition is to move towards being a part of routine health practice. Adherence to dietary recommendations for chronic disease prevention at the population level, such as those of national and international associations (e.g. US Department of Agriculture, World Cancer Research Fund, American Heart Association), is associated with lower risk of chronic disease; for example, greater adherence to the 2005 US Dietary Guidelines was inversely associated with risk of CHD, stroke, diabetes and total cancer( Reference Chiuve, Fung and Rimm 55 ). In theory, tailored recommendations may be an improvement over general, population-based dietary recommendations; however, whether more extensive phenotyping, beyond current approaches, is cost-effective in promoting health and preventing disease will need to be determined. Further, in practice, finding individualised approaches that facilitate and maintain desired dietary behaviour on the heels of a personalised diet prescription for health will likely remain an ongoing challenge for nutrition practitioners.
Acknowledgements
This work was supported in part by Fred Hutchinson Cancer Research Center, NCI U01 CA162077 and US NSF DMS-1161565. All authors declare no conflict of interest. The paper was written by J. W. L., S. L. N., M. A. J. H., and A. S. All authors read the draft critically and J. W. L. had the responsibility for final content.