SEMIPARAMETRIC ESTIMATION AND VARIABLE SELECTION FOR SPARSE SINGLE INDEX MODELS IN INCREASING DIMENSION

Chaohua Dong; Yundong Tu

doi:10.1017/S0266466624000021

Published online by Cambridge University Press: 08 February 2024

Chaohua Dong: Zhongnan University of Economics and Law
Yundong Tu*: Peking University
*: Address correspondence to Yundong Tu, Guanghua School of Management and Center for Statistical Science, Peking University, Beijing, China; e-mail: yundong.tu@gsm.pku.edu.cn.

Abstract

This paper considers semiparametric sieve estimation in high-dimensional single index models. The use of Hermite polynomials in approximating the unknown link function provides a convenient framework to conduct both estimation and variable selection. The estimation of the index parameter is formulated from solutions obtained by the routine penalized weighted linear regression procedure, where the weights are used in order to tackle the unbounded support of the regressors. The resulting index parameter estimator is shown to be consistent and sparse, and the asymptotic normality for the estimators of both the index parameter and the link function is established. To perform variable selection in the ultra-high dimension case, we further suggest a forward regression screening method, which is shown to enjoy the sure independence screening property. This screening procedure can be used before the penalized variable selection to reduce the burden of dimensionality. Numerical results show that both the variable selection procedures and the associated estimators perform well in finite samples.
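As a rough illustration of the sieve idea described in the abstract, the sketch below approximates a link function by a truncated Hermite polynomial expansion fitted with weighted least squares. It is not the authors' estimator: the index vector is treated as known rather than estimated, no penalty is applied, and the Gaussian weight `exp(-u**2 / 2)`, the truncation degree, the function names `fit_link_hermite` and `predict_link`, and the simulated data are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander


def fit_link_hermite(u, y, degree=5):
    """Weighted least squares fit of y on Hermite polynomials of the index u."""
    # Design matrix with columns He_0(u), ..., He_degree(u)
    # (probabilists' Hermite polynomials).
    basis = hermevander(u, degree)
    # Assumed Gaussian weights: they down-weight observations with a large
    # index value, where polynomial terms grow quickly.
    sqrt_w = np.exp(-0.25 * u**2)  # square root of exp(-u^2 / 2)
    coef, *_ = np.linalg.lstsq(basis * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef


def predict_link(u, coef):
    """Evaluate the fitted Hermite expansion at the index values u."""
    return hermevander(u, len(coef) - 1) @ coef


# Hypothetical data: a sparse unit-norm index and a quadratic link.
rng = np.random.default_rng(0)
n, p = 500, 4
theta = np.array([0.6, 0.8, 0.0, 0.0])
x = rng.standard_normal((n, p))
u = x @ theta
y = u**2 - 1 + 0.1 * rng.standard_normal(n)  # He_2(u) = u^2 - 1, plus noise

coef = fit_link_hermite(u, y, degree=4)
fitted = predict_link(u, coef)
```

Because the simulated link equals the second Hermite polynomial, the fitted coefficient on `He_2` should dominate the others. In practice the index would have to be estimated jointly with the link, which is the part of the problem this sketch leaves out.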

Type: ARTICLES
Information: Econometric Theory, First View, pp. 1–43

DOI: https://doi.org/10.1017/S0266466624000021
Copyright: © The Author(s), 2024. Published by Cambridge University Press

Footnotes

The authors thank the Editor, the Co-Editor, and two anonymous referees for constructive comments that helped improve the presentation of the paper. Dong acknowledges partial support from the National Natural Science Foundation of China (Grant No. 72073143) and the Fundamental Research Funds for the Central Universities, Zhongnan University of Economics and Law (Grant No. 2722022EG001). Tu acknowledges partial support from the National Natural Science Foundation of China (Grant Nos. 72073002, 12026607, 92046021), the Center for Statistical Science at Peking University, and the Key Laboratory of Mathematical Economics and Quantitative Finance (Peking University), Ministry of Education.

Dong and Tu supplementary material

File 223.6 KB
