Taylor & Francis Group
ubes_a_1635486_sm8097.pdf (118.62 kB)

Regression Analysis with Individual-Specific Patterns of Missing Covariates

Version 2 2019-08-19, 13:42
Version 1 2019-06-26, 16:24
journal contribution
posted on 2019-08-19, 13:42 authored by Huazhen Lin, Wei Liu, Wei Lan

It is increasingly common in practice to collect data from heterogeneous sources. Two major challenges complicate the statistical analysis of such data. First, only a small proportion of units have complete information across all sources. Second, the missing-data patterns vary across individuals. Our motivating online-loan data have 93% missing covariates, and the missing pattern is individual-specific. Existing regression methods for missing covariates are either inefficient or require additional modeling assumptions on the covariates. We propose a simple yet efficient iterative least squares estimator of the regression coefficient for data with individual-specific missing patterns. Our method has several desirable features. First, it does not require any modeling assumptions on the covariates. Second, the imputation of the missing covariates involves only feasible one-dimensional nonparametric regressions, and it makes maximal use of the information across units and the relationships among the covariates. Third, the iterative least squares estimate is both computationally and statistically efficient. We study the asymptotic properties of our estimator and apply it to the motivating online-loan data. Supplementary materials for this article are available online.

KEY WORDS: High missing rate; Individual-specific missing; Iterative least squares; Missing covariates.
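
The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of alternating one-dimensional nonparametric imputation with least squares. It is not the authors' estimator. Everything in it is an assumption made for the sketch: the function names (`nw_predict`, `impute`, `iterative_least_squares`), the Gaussian kernel, the rule-of-thumb bandwidth, the choice of the observed-covariate index as the one-dimensional regressor, the column-mean fallback, the absence of an intercept, and the simulated data.

```python
import numpy as np


def nw_predict(s_train, x_train, s_eval, bandwidth):
    """Nadaraya-Watson estimate of E[x | s] at one point (Gaussian kernel)."""
    w = np.exp(-0.5 * ((s_train - s_eval) / bandwidth) ** 2)
    total = w.sum()
    if total <= 1e-12:  # no donor close enough: fall back to the donor mean
        return float(x_train.mean())
    return float(w @ x_train / total)


def impute(X, beta, min_donors=5):
    """Fill each NaN in X using a 1-D kernel regression on a scalar index.

    Assumed scheme (hypothetical): for unit i, the index is the linear
    combination of the covariates that unit i observes, weighted by beta.
    """
    observed = ~np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X_filled = X.copy()
    for i in range(X.shape[0]):
        obs_i = observed[i]
        if obs_i.all():
            continue
        # Candidate donors observe every covariate that unit i observes.
        share_obs = observed[:, obs_i].all(axis=1)
        s_i = X[i, obs_i] @ beta[obs_i]
        for j in np.flatnonzero(~obs_i):
            donors = share_obs & observed[:, j]
            if donors.sum() < min_donors:
                X_filled[i, j] = col_means[j]  # too few donors: column mean
                continue
            s_donor = X[donors][:, obs_i] @ beta[obs_i]
            # Rule-of-thumb bandwidth; the small constant avoids division by zero.
            h = 1.06 * s_donor.std() * donors.sum() ** (-0.2) + 1e-8
            X_filled[i, j] = nw_predict(s_donor, X[donors, j], s_i, h)
    return X_filled


def iterative_least_squares(X, y, n_iter=20, tol=1e-6):
    """Alternate between imputing covariates and refitting least squares."""
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(np.isnan(X), col_means, X)  # mean-imputed start
    beta = np.linalg.lstsq(X_filled, y, rcond=None)[0]
    for _ in range(n_iter):
        X_filled = impute(X, beta)
        beta_new = np.linalg.lstsq(X_filled, y, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, X_filled


# Simulated data (illustrative only; not the online-loan data).
rng = np.random.default_rng(0)
n, p = 300, 3
X_full = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X_full @ beta_true + 0.1 * rng.normal(size=n)

X_miss = X_full.copy()
X_miss[rng.random((n, p)) < 0.3] = np.nan  # each unit gets its own pattern

beta_hat, X_imputed = iterative_least_squares(X_miss, y)
print("estimated coefficients:", beta_hat)
```

Each imputation here conditions on a single scalar index rather than on the full vector of observed covariates, which is what keeps the nonparametric step one-dimensional. The missing rate in the simulation is far lower than the 93% reported for the online-loan data, and the sketch carries none of the efficiency guarantees claimed for the published estimator.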

Funding

Lin’s research was partially supported by the National Natural Science Foundation of China (Nos. 11571282 and 11829101) and the Fundamental Research Funds for the Central Universities (Nos. JBK140507 and JBK120509) of China.

History