1. Introduction
Asthma is one of the most common chronic inflammatory respiratory diseases worldwide. It is caused by a combination of genetic, environmental, and lifestyle factors and is characterized by inflammation, airway hyperresponsiveness, and variable airflow obstruction [Reference Willis-Owen, Cookson and Moffatt1]. Asthma not only affects the health and life of adults but also seriously influences the physical and mental health of children at the growth stage. Approximately one-third to one-half of children with moderate to severe asthma may have it persist into adulthood, which has exerted a heavy burden on the healthcare system [Reference Van Aalderen2]. It has been reported that the prevalence, incidence, and mortality of childhood asthma worldwide are increasing year by year [Reference Li, Song and Zhu3]. With the current rising trend, childhood asthma is likely to become an increasingly severe public health problem. Currently, corticosteroids, ß-agonists, magnesium sulfate, and anticholinergic drugs are the primary medications used for treating childhood asthma [Reference Shein, Speicher, Proença Filho, Gaston and Rotta4]. However, these drugs require long-term use, which can contribute to side effects and drug resistance in children. Additionally, herbs, including Glycyrrhiza glabra (licorice), have also been used as medicines to treat respiratory diseases such as asthma [Reference Afzal, Ahmad and Jabbar5]. However, the specific treatment mechanism remains unclear. Despite its prevalence and global impact, the physiological and pathological mechanisms underlying the occurrence and progression of childhood asthma remain elusive. This has motivated this study and its search for novel biomarkers for the diagnosis of and therapeutic strategies for asthma.
Long noncoding RNAs (lncRNAs) are a class of noncoding RNAs with a length greater than 200 nt and are regulators of various biological processes, including chromatin structure changes, transcriptional activation/inhibition, intracellular trafficking, and microRNA (miRNA) chelation and translation efficiency regulation [Reference Ponting, Oliver and Reik6, Reference Moran, Perera and Khalil7]. Although the roles lncRNAs may play in disease development have not been fully elucidated, lncRNAs have been reported to participate in the development of cancer due to their function as a gene regulator. LncRNAs have been suggested as potential biomarkers for many diseases, such as hepatocellular carcinoma [Reference Niu, Niu and Wang8], melanoma [Reference Yu, Zheng, Tse, Chan and Wu9], and cardiovascular diseases [Reference Poller, Dimmeler and Heymans10]. A previous study by Wei et al. [Reference Wei and Wang11] showed that lncRNA MEG3 was highly expressed in healthy tissues adjacent to gastric cancer tissues and that its overexpression could inhibit the proliferation and metastasis of gastric cancer cells through the p53 signaling pathway. Cao et al. [Reference Cao, Liu, Huang, Yue and Xi12] reported that lncRNA RMRP was significantly upregulated in bladder cancer tissues and was closely related to tumor size, lymph node metastasis, and survival time of patients. LncRNA RMRP can promote the proliferation, migration, and invasion of bladder cancer cell lines [Reference Cao, Liu, Huang, Yue and Xi12]. All these research studies indicated that lncRNAs are involved in the occurrence and development of various cancers. Furthermore, Dolcino et al. [Reference Dolcino, Tinazzi, Puccetti and Lunardi13] analyzed the transcript profiles of rheumatoid arthritis (RA) patients and healthy donors and found that lncRNA RP11-4989.15 plays an important role in RA pathogenesis. Another study screened eight important lncRNAs interacting with 52 differentially expressed genes enriched in asthma-associated pathways, and these lncRNAs have the potential to serve as underlying biomarkers useful for the future development of therapeutic strategies [Reference Liu, Zhang, Jiang, Jiang and Gao14]. Nevertheless, specific biomarkers for childhood asthma have not been identified, and the roles of lncRNAs in childhood asthma remain unclear.
Weighted gene coexpression network analysis (WGCNA) has been successfully used to identify highly correlated gene clusters (coexpression modules) and to identify potential biomarkers or therapeutic targets significantly associated with clinical phenotypes [Reference Wang, Wang and Zhong15, Reference Liu, Sun and Tian16]. Zhao and Fan [Reference Zhao and Fan17] successfully used WGCNA and univariate Cox regression analysis methods to identify five lncRNAs (GAS5, HCP5, PART1, SNHG11, and SNHG5) to predict the prognosis of ovarian cancer. Therefore, this study aims to identify differentially expressed RNAs (DERs) closely related to asthma progression in children using WGCNA. These findings may provide novel diagnostic biomarkers and therapeutic targets for the development of childhood asthma.
2. Materials and Methods
2.1. Data Source
On November 5, 2020, datasets that met the following criteria were screened in the NCBI GEO database (https://www.ncbi.nlm.nih.gov/) [Reference Barrett, Troup and Wilhite18] with the search keywords “asthma, child, childhood” and the following search restrictions: (1) the samples are from the same type (both solid or blood or tissue samples); (2) the samples are classified as disease and control groups; (3) the total number of samples included in the analysis is more than 20, and each type has at least two duplicate samples. Two sets of available data, GSE65204 [Reference Yang, Pedersen and Liu19] and GSE19187 [Reference Giovannini-Chami, Marcet and Moreilhon20], were screened and downloaded. GSE65204 contained 69 samples of nasal epithelial tissue from inner-city children aged 10–12 y, including 33 healthy samples and 36 asthma samples. GSE19187 consisted of 11 healthy samples and 13 asthma samples from children aged 6–17 y. The detection platforms used for the samples in GSE65204 and GSE19187 were Agilent-028004 SurePrint G3 Human GE 8x60K Microarray and Affymetrix Human Gene 1.0 ST Array, respectively.
2.2. Differentially Expressed RNA (DER) Screen
The detailed annotation information of the platforms was downloaded from Ensembl genome browser, release 96 (https://asia.ensembl.org/index.html), including annotation files, transcript IDs, and RefSeq IDs. The expression profiles in GSE65204 and GSE19187 were reannotated into mRNAs and lncRNAs.
The Bioconductor limma package (version 3.34.0, https://bioconductor.org/packages/release/bioc/html/limma.html) compiled in R 3.6.1 [Reference Ritchie, Phipson and Wu21] was used to identify differential gene expression profiles of DERs in both GSE65204 and GSE19187 datasets, including differentially expressed mRNAs (DE-mRNAs) and differentially expressed lncRNAs (DE-lncRNAs). The threshold values for selecting DERs were a false discovery rate (FDR) of <0.05 and |log2 fold change (FC) |>0.5. By comparing DERs between GSE65204 and GSE19187, the DERs with the same differential expression were chosen as the overlapped DERs (overlapped DE-lncRNAs and DE-mRNAs) related to asthma. These screened overlapped DE-mRNAs were utilized to perform biological process of Gene Ontology (GO BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses using DAVID version 6.8 (https://david.ncifcrf.gov/) [Reference Huang, Sherman and Lempicki22]. A P value of <0.05 was used as the cutoff criterion for enrichment significance.
2.3. Screen of the Significantly Stable Modules Related to Asthma by WGCNA
The WGCNA package (version 1.61, https://cran.r-project.org/web/packages/WGCNA/index.html) compiled in R [Reference Langfelder and Horvath23] was used to screen the significantly stable modules related to asthma. GSE65204 contained a relatively large number of samples and served as the training set, and GSE19187 was instead used as the validation set. The parameters were set as follows: (1) ≥100 RNAs were included in one module; (2) cutHeight = 0.995. Subsequently, the overlapped DE-mRNAs were mapped to each module from WGCNA, and then the fold enrichment parameter and P p value of the DE-mRNAs were calculated using a hypergeometric algorithm [Reference Cao and Zhang24]. P < 0.05 and fold enrichment >1 served as the thresholds for module filtering. The formula of the hypergeometric algorithm was , where N denotes all genes involved in WGCNA analysis, M denotes the number of genes in each module, n denotes the number of overlapping DE-mRNAs, and k represents the number of DE-mRNAs mapped to the corresponding module.
2.4. Coexpression Network Construction
The Pearson correlation coefficient (PCC) between DE-mRNAs enriched in the modules and overlapped DE-lncRNAs from the earlier differential expression screen was calculated using the cor function in R (https://77.66.12.57/R-help/cor.test.html). Subsequently, a coexpression network was constructed, and Cytoscape 3.6.1 (https://www.cytoscape.org/) was used for visualization [Reference Shannon, Markiel and Ozier25]. Afterwards, the DAVID tool was utilized to analyze the KEGG pathway enrichment of the genes in the coexpressed network (P < 0.05).
2.5. Identification of RNAs Directly Related to Asthma and Principal Component Analysis (PCA)
With “asthma” as the keyword, Comparative Toxicogenomics Database (2019 update) (https://ctd.mdibl.org/) [Reference Davis, Grondin and Johnson26] was utilized to search the KEGG pathways directly associated with asthma. By comparing pathways after searching to those in the coexpressed network, we used the overlapping pathways to identify the important genes directly related to asthma.
PCA is a dimension reduction technique that transforms multiple variables into a few principal components that can reflect most of the information of the original variables [Reference Alizadeh, Lyons, Castle and Prasad27]. PCA was carried out on these identified asthma-related genes using the Personality Project Psych package version 2.1.6 (https://CRAN.R-project.org/package=psych) compiled in R.
3. Results
3.1. Identification of DE-lncRNAs and DE-mRNAs
After reannotation and processing, 135 lncRNAs and 11,808 mRNAs were obtained. Afterwards, DEGs between normal samples and asthma samples were identified. In total, there were 1067 DERs (7 DE-lncRNAs and 1060 DE-mRNAs) found in GSE65204 and 1034 DERs (7 DE-lncRNAs and 1027 DE-mRNAs) found in GSE19187 (Figure 1(a)). By comparing the DERs between the two datasets, 535 overlapping genes were found (Figure 1(b)), but only 336 overlapping genes had the same differential expression, including 2 overlapped DE-lncRNAs (SNHG8 and LINC01559) and 334 overlapped DE-mRNAs.
3.2. Functional Enrichment Analyses of Overlapped DE-mRNAs
These 334 overlapped DE-mRNAs were used for GO BP and KEGG analyses. It was found that these DE-mRNAs were enriched in 28 biological processes, such as “negative regulation of endopeptidase activity,” “oxidation-reduction process,” “epidermis development,” “T-cell costimulation,” “O-glycan processing,” “protein N-linked glycosylation via asparagine,” “glutathione derivative biosynthetic process,” “extracellular matrix organization,” and “bone coagulation” (Figure 1(c)). Additionally, these overlapped DE-mRNAs were enriched into 12 KEGG pathways: “hematopoietic cell lineage,” “mineral absorption,” “thyroid hormone synthesis,” “arginine and proline metabolism,” “metabolic pathways,” “mucin-type O-glycan biosynthesis,” “phagosome,” “complement and coagulation cascades,” “endocrine and other factor-regulated calcium reabsorption,” “histidine metabolism,” “glutathione metabolism,” and “insulin secretion” (Figure 1(d)).
3.3. Significantly Stable Asthma-Related Modules Identified by WGCNA
WGCNA was performed on all DERs in GSE65204 and GSE19187, using GSE65204 as the training set and GSE19187 as the validation set. To ensure that the gene expression levels in each dataset were comparable, we first analyzed the expression level and connectivity consistency of all gene expression values detected in these two datasets. It was found that the gene expression levels and network connections were significantly positively correlated between the training and validation sets, which indicated that the two datasets were indeed comparable (Figure 2(a)). We subsequently identified significantly stable asthma-related modules in the two datasets. Based on GSE65204, the power value was 12, and the average degree of the coexpression network was 1 (Figure 2(b)). Hierarchical cluster trees were then generated from both datasets (including 11 modules from GSE65204) (Figure 2(c)). The heat map of correlations between the clinical information of samples and each module obtained from the training set is displayed in Figure 2(d). The results showed that the purple, black, and yellow modules were significantly positively correlated with asthma development (Figure 2(d)).
Then, the stability of the modules was assessed. By analyzing, it was found that there were 6 significantly stable modules with a preservation Z-score higher than 5 (Table 1). According to the hypergeometric algorithm, the above 334 overlapped DE-mRNAs were mapped to each module identified by WGCNA, and a total of 242 overlapping genes were found. Two of the modules (black and yellow) were significantly enriched, with 14 black and 58 yellow consistent DE-mRNAs found. The 72 consistent DE-mRNAs were utilized for further study.
3.4. Establishment of the Coexpression Network
PPCs between the 72 consistent DE-mRNAs found in the previous analysis and the aforementioned 2 overlapped DE-lncRNAs were calculated. The connection pairs between mRNAs and lncRNAs were retained with PCC >0.4. Subsequently, a coexpression network, including 2 DE-lncRNAs and 63 DE-mRNAs, was generated using these connection pairs (Figure 3). KEGG pathway analysis found that these DE-mRNAs were significantly enriched in five KEGG pathways: “ECM-receptor interaction,” “focal adhesion,” “beta-alanine metabolism,” “PI3K-Akt signaling pathway,” and “pathways in cancer” (Table 2).
A total of 195 KEGG pathways directly related to asthma were obtained from the CTD database. After comparison with the DE-mRNA-enriched pathways in the coexpression network, the five enriched KEGG pathways (containing eight genes) were shown to be directly associated with asthma (Table 2). It is clear that the von Willebrand factor (VWF), laminin subunit beta 3 (LAMB3), and laminin subunit alpha 4 (LAMA4) genes were involved in “ECM-receptor interaction,” “focal adhesion,” and “PI3K-Akt signaling pathway,” while LAMB3 and LAMA4 were also involved in “pathways in cancer.” In addition, caveolin-1 (CAV1) was involved in “focal adhesion,” and aldehyde dehydrogenase 1 family member A3 (ALDH1A3) and spermine oxidase (SMOX) were associated with “beta-alanine metabolism.” G protein subunit gamma 4 (GNG4) was related to the “PI3K-Akt signaling pathway” and “pathways in cancer.” Additionally, peroxisome proliferator-activated receptor gamma (PPARG) was involved in “pathways in cancer.” The coexpression network found that lncRNA LINC01559 may be coexpressed with VWF, LAMB3, CAV1, ALDH1A3, SMOX, and GNG4. lncRNA SNHG8 might coexpress with VWF, LAMB3, LAMA4, ALDH1A3, SMOX, and PPARG. This analysis suggests that these lncRNAs, mRNAs, and pathways may be closely associated with childhood asthma development.
3.5. PCA
PCA was performed on the eight genes involved in the five asthma-related pathways. Through the calculation and analysis of gene expression values in GSE65204 and GSE19187, a total of eight principal components (PCs) were obtained by fitting. Among them, the cumulative contribution rate of PC1, PC2, and PC3 was over 80%, which implied that these three PC factors contained significant information about the original variables (gene expression values). Thereafter, PC1, PC2, and PC3 were used to construct three-dimensional diagrams of the sample distribution and to build receiver operating characteristic (ROC) curves. As shown in Figures 4(a) and 4(b), PC1, PC2, and PC3 could significantly distinguish between the control and asthma samples. The area under the curve (AUC) of the ROC curves in GSE65204 and GSE19187 was 0.899 and 0.888, respectively (Figures 4(c) and 4(d)). These results further demonstrate that VWF, LAMB3, LAMA4, CAV1, ALDH1A3, SMOX, GNG4, and PPARG are closely associated with the occurrence and development of childhood asthma.
4. Discussion
Asthma is a common chronic respiratory disease in children, seriously affecting children’s health and growth [Reference James and Rosenbaum28]. Since the etiology of childhood asthma is complex and its pathogenesis is still unclear, future research must focus on exploring potential diagnostic and therapeutic biomarkers for disease progression. This study identified a total of 1067 DERs (7 DE-lncRNAs and 1060 DE-mRNAs) and 1034 DERs (7 DE-lncRNAs and 1027 DE-mRNAs) in the GSE65204 and GSE19187 datasets, respectively. After comparison between GSE65204 and GSE19187, two overlapped DE-lncRNAs (SNHG8 and LINC01559) and 334 overlapped DE-mRNAs had the same differential expression. After that, significantly stable modules related to asthma were identified by WGCNA, and a coexpression network was built with 2 DE-lncRNAs and 63 DE-mRNAs. After functional analysis of genes in the coexpression network, five KEGG pathways (“ECM-receptor interaction,” “focal adhesion,” “beta-alanine metabolism,” “PI3K-Akt signaling pathway,” and “pathways in cancer”) contained 8 genes (VWF, LAMB3, LAMA4, CAV1, ALDH1A3, SMOX, GNG4, and PPARG) which were found to be directly related to childhood asthma development and have potential for use as disease progression biomarkers.
Previous studies have shown that lncRNAs play important roles in the development of pediatric airway diseases, including asthma [Reference Narozna, Langwinski and Szczepankiewicz29, Reference Austin, Tsitsiou and Boardman30]. Our study found that lncRNAs SNHG8 and LINC01559 were found to be associated with the progression of childhood asthma. SNHG8, located on 4q26, regulates tumorigenesis and metastasis and controls the progression of multifarious diseases, such as hepatocellular carcinoma [Reference Dong, Teng, Guo, Yang, Ding and Fu31], pancreatic adenocarcinoma [Reference Song, Zou, Li, Shen, Cai and Wu32], endometrial carcinoma [Reference Yang, Zhang and Zhou33], and gastric carcinoma [Reference Liu, Yang and Gu34]. Wang et al. indicated that LINC01559 was upregulated in gastric cancer tissues and could stimulate the PI3K-Akt signaling pathway to accelerate the progression of gastric cancer [Reference Wang, Bo and Yi35]. Therefore, we speculate that lncRNAs SNHG8 and LINC01559 may participate in the occurrence and development of childhood asthma by regulating cell proliferation, migration, and PI3K-Akt signaling pathway.
In addition, from the lncRNA-mRNA coexpression network, LINC01559 might coexpress with VWF, LAMB3, CAV1, ALDH1A3, SMOX, and GNG4, and SNHG8 might coexpress with VWF, LAMB3, LAMA4, ALDH1A3, SMOX, and PPARG, which were enriched in five KEGG pathways, including “ECM-receptor interaction,” “focal adhesion,” “beta-alanine metabolism,” “PI3K-Akt signaling pathway,” and “pathways in cancer.” PCA further verified that these eight genes are important in childhood asthma. VWF, a large plasma glycoprotein, plays a vital role in mediating the balance between blood clotting and bleeding [Reference Xiang and Hwa36]. A study by Yao et al. [Reference Yao, Wang and Li37] showed that VWF was highly expressed in an IL-25-induced murine asthma model and was associated with bronchial mucosal vascular remodeling in asthma. LAMB3 and LAMA4 belong to the laminin subunit family and have been reported to be involved in the metastasis and invasion of some types of cancer [Reference Jung, Lim and Liu38, Reference Li, Guan and Liu39]. A study by Zhang et al. [Reference Zhang, Pan and Cheung40] demonstrated that LAMB3 could mediate apoptosis, proliferation, migration, and invasion of pancreatic ductal adenocarcinoma cells through the PI3K/Akt signaling pathway. Another study found that high expression of LAMA4 may be a new marker of tumor invasion and metastasis in human hepatocellular carcinoma [Reference Huang, Ji, Wu, Wan and Yu41]. CAV1 is a carcinogenic membrane protein associated with extracellular matrix tissue, cell migration, cholesterol distribution, endocytosis, and signal transduction [Reference Nwosu, Ebert, Dooley and Meyer42]. In our study, CAV1 participated in focal adhesion, thus contributing to the pathogenesis of childhood asthma. GNG4 is one of the fourteen γ-subunit proteins of the G protein trimer complex [Reference Milligan and Kostenis43], and Liu et al. [Reference Liu, Huang and Cheng44] found that GNG4 associated with the PI3K-Akt signaling pathway to participate in rectal cancer through using a bioinformatics approach. PPARG is one of the three subtypes of peroxisome proliferator-activated receptors, and its activation has been reported to regulate the synthesis and release of immunomodulatory cytokines from various cell types [Reference Oh, Park and Lee45]. A previous study has shown that PPARG modulates inflammation and may have an effect on regulating the long-term control of asthma in children and young adults [Reference Palmer, Doney and Ismail46].
Additionally, ALDH1A3 and SMOX were also essential for childhood asthma and were enriched in the “beta-alanine metabolism” pathway. ALDH1A3 is widely distributed in normal tissues and abnormally expressed in a variety of cancers [Reference Duan, Cai, Guo, Bian and Yu47]. Li et al. [Reference Li, Li and Liu48] indicated that ALDH1A3 can serve as an activator of mesenchymal differentiation and may be a marker for predicting the survival rate of glioblastoma patients. SMOX, a member of the polyamine oxidases, catalyzes the oxidative degradation of polyamine spermidine to produce spermidine and is caused by a variety of stimuli, including bacterial infection and oxidative stress [Reference Cervelli, Leonetti and Cervoni49]. Jain et al. [Reference Jain, Raina and Gheware50] found low expression of SMOX in bronchial epithelial cells (BECs) of asthmatic lung samples, and SMOX knockdown resulted in asthma in naive mice, such as airway hyperresponsiveness, remodeling, and BEC apoptosis. Combined with our results, we speculate that lncRNAs LINC01559 and SNHG8 may be coexpressed with VWF, LAMB3, LAMA4, CAV1, ALDH1A3, SMOX, GNG4, and PPARG, and these lncRNAs and mRNAs may be direct biomarkers related to the occurrence and progression of childhood asthma. However, the relationships between lncRNAs and their target genes need to be further characterized, and use of these identified lncRNAs and mRNAs as biomarkers will require further validation in a clinical setting.
5. Conclusion
In conclusion, LINC01559 and SNHG8 may be coexpressed with VWF, LAMB3, LAMA4, CAV1, ALDH1A3, SMOX, GNG4, and PPARG. Additionally, these lncRNAs and mRNAs may be directly involved in the pathogenesis of childhood asthma through ECM-receptor interaction, focal adhesion, beta-alanine metabolism, PI3K-Akt signaling pathway, and pathways in cancer. The two lncRNAs and eight genes could explain potential, novel, pathological, and molecular mechanisms for childhood asthma and may serve as candidates for therapeutic strategies for this disease.
Data Availability
The datasets used and/or analyzed in this study are available from the corresponding author upon reasonable request.
Disclosure
This research was performed as part of the authors’ employment at their respective affiliated organizations.
Conflicts of Interest
The authors declare no conflicts of interest.
Authors’ Contributions
Min Hao and Jinling Zan conceptualized and investigated the study, provided methodology, and performed formal analysis. Min Hao wrote the article. Jinling Zan edited the article and supervised the study.