Taylor & Francis Group
uasa_a_1514305_sm1501.pdf (275.3 kB)

Composite Coefficient of Determination and Its Application in Ultrahigh Dimensional Variable Screening

journal contribution
posted on 2018-08-27, 20:38 authored by Efang Kong, Yingcun Xia, Wei Zhong

In this article, we propose to measure the dependence between two random variables through a composite coefficient of determination (CCD) of a set of nonparametric regressions. These regressions take consecutive binarizations of one variable as the response and the other variable as the predictor. The resulting measure is invariant to monotonic transformations of each marginal variable, which makes it robust against heavy-tailed distributions and outliers and convenient for independence testing. CCD can be estimated through kernel smoothing, with a root-n consistency rate. CCD is a natural measure of the importance of variables in regression, and its sure screening property when used for variable screening is also established. Comprehensive simulation studies and real data analysis show that the proposed measure is often preferred over existing methods in both independence testing and variable screening. Supplementary materials for this article are available online.
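
To make the construction described above more concrete, here is a minimal Python sketch of the general idea: binarize one variable at a series of thresholds, regress each binary indicator on the other variable with a kernel smoother, and combine the resulting coefficients of determination. It is not the authors' estimator. The function name `composite_r2_sketch`, the Nadaraya-Watson smoother with a Gaussian kernel, the rank transform of the predictor, the fixed bandwidth, the choice of thresholds at order statistics, and the pooled "explained variance over total variance" aggregation are all assumptions made for illustration.

```python
import numpy as np

def composite_r2_sketch(x, y, n_thresholds=19, bandwidth=0.1):
    """Illustrative composite R^2 of kernel regressions of 1{y <= t} on x.

    Hypothetical helper: the aggregation rule and tuning choices are
    assumptions, not the definition used in the article.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Replace x by its scaled ranks so the result depends on x only
    # through its ordering (invariance to increasing transforms of x).
    u = (np.argsort(np.argsort(x)) + 0.5) / n

    # Nadaraya-Watson weights with a Gaussian kernel on the rank scale.
    diffs = (u[:, None] - u[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    weights /= weights.sum(axis=1, keepdims=True)

    # Thresholds are order statistics of y, chosen with integer
    # arithmetic so they depend on y only through its ordering.
    idx = (np.arange(1, n_thresholds + 1) * n) // (n_thresholds + 1)
    thresholds = np.sort(y)[idx]

    explained = 0.0
    total = 0.0
    for t in thresholds:
        z = (y <= t).astype(float)   # binarized response
        fitted = weights @ z         # kernel estimate of E[z | x]
        explained += np.var(fitted)  # variance explained by x
        total += np.var(z)           # total variance of the indicator
    return explained / total

rng = np.random.default_rng(0)
x = rng.normal(size=400)
noise = rng.normal(size=400)
print(composite_r2_sketch(x, noise))            # x unrelated to the response
print(composite_r2_sketch(x, x + 0.1 * noise))  # response driven by x
```

Because both the rank transform and the order-statistic thresholds depend only on orderings, this sketch returns the same value when `x` or `y` is replaced by a strictly increasing transformation of itself, which mirrors the invariance property stated in the abstract. The smoother builds an n-by-n weight matrix, so it is only practical for moderate sample sizes.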

Funding

Efang Kong's research was supported by National Natural Science Foundation of China (NNSFC) grant 11771066. Yingcun Xia's research was supported by NNSFC grant 71371095, Singapore MOE grant MOE2014-T2-1-072, and NUS AcRF grant R-155-000-193-114. Wei Zhong's research was supported by NNSFC grants 11671334 and 11401497, the University Distinguished Young Researchers Program in Fujian Province, and the Fundamental Research Funds for the Central Universities grant 20720181004.

History