
SEMIPARAMETRIC ESTIMATION AND VARIABLE SELECTION FOR SPARSE SINGLE INDEX MODELS IN INCREASING DIMENSION

Published online by Cambridge University Press:  08 February 2024

Chaohua Dong
Affiliation:
Zhongnan University of Economics and Law
Yundong Tu*
Affiliation:
Peking University
*
Address correspondence to Yundong Tu, Guanghua School of Management and Center for Statistical Science, Peking University, Beijing, China; e-mail: yundong.tu@gsm.pku.edu.cn.

Abstract

This paper considers semiparametric sieve estimation in high-dimensional single index models. The use of Hermite polynomials in approximating the unknown link function provides a convenient framework to conduct both estimation and variable selection. The estimation of the index parameter is formulated from solutions obtained by the routine penalized weighted linear regression procedure, where the weights are used in order to tackle the unbounded support of the regressors. The resulting index parameter estimator is shown to be consistent and sparse, and the asymptotic normality for the estimators of both the index parameter and the link function is established. To perform variable selection in the ultra-high dimension case, we further suggest a forward regression screening method, which is shown to enjoy the sure independence screening property. This screening procedure can be used before the penalized variable selection to reduce the burden of dimensionality. Numerical results show that both the variable selection procedures and the associated estimators perform well in finite samples.
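The sieve idea in the abstract can be illustrated with a minimal sketch: approximate a link function by a truncated Hermite polynomial expansion and fit the coefficients by weighted least squares. This is not the authors' estimator. The function names, the physicists' Hermite basis, the weight exp(-u^2/2), the truncation level k = 4, and the simulated data are all assumptions made for illustration, and the index parameter is treated as known here, whereas the paper estimates it by a penalized procedure.

```python
import numpy as np
from numpy.polynomial.hermite import hermvander  # physicists' Hermite basis H_0, H_1, ...


def fit_hermite_sieve(u, y, k, weights=None):
    """Weighted least squares of y on H_0(u), ..., H_{k-1}(u); returns k coefficients."""
    basis = hermvander(u, k - 1)  # design matrix of shape (n, k)
    if weights is None:
        weights = np.ones_like(u)
    root_w = np.sqrt(weights)
    # Scaling rows by sqrt(w_i) turns ordinary least squares into weighted least squares.
    coef, *_ = np.linalg.lstsq(basis * root_w[:, None], y * root_w, rcond=None)
    return coef


def predict_hermite_sieve(u, coef):
    """Evaluate the fitted sieve approximation at the index values u."""
    return hermvander(u, len(coef) - 1) @ coef


# Hypothetical simulated data: a sparse, unit-norm index parameter and a quadratic link.
rng = np.random.default_rng(0)
n = 400
theta = np.array([0.6, 0.8, 0.0, 0.0])
X = rng.standard_normal((n, theta.size))
u = X @ theta                      # single index, unbounded support
y = u**2                           # noiseless link g(u) = u^2, kept noiseless for clarity
w = np.exp(-u**2 / 2)              # illustrative weight that downweights large |u|

coef = fit_hermite_sieve(u, y, k=4, weights=w)
fitted = predict_hermite_sieve(u, coef)
max_err = np.max(np.abs(fitted - y))
```

Because u^2 = 0.5 H_0(u) + 0.25 H_2(u) for the physicists' Hermite polynomials, the fitted coefficients should be close to (0.5, 0, 0.25, 0) in this noiseless setting. With noisy data and an unknown index parameter, the truncation level and the weight would need to be chosen with more care.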

Type
ARTICLES
Copyright
© The Author(s), 2024. Published by Cambridge University Press


Footnotes

The authors thank the Editor, Co-Editor, and the two anonymous referees for constructive comments that helped improve the presentation of the paper. Dong thanks the partial support from the National Natural Science Foundation of China (Grant No. 72073143) and Fundamental Research Funds for the Central Universities, Zhongnan University of Economics and Law (Grant No. 2722022EG001). Tu acknowledges the partial support from the National Natural Science Foundation of China (Grant Nos. 72073002, 12026607, 92046021), the Center for Statistical Science at Peking University, and Key Laboratory of Mathematical Economics and Quantitative Finance (Peking University), Ministry of Education.

Supplementary material

Dong and Tu supplementary material (File, 223.6 KB)