Taylor & Francis Group

Spike-and-Slab Group Lassos for Grouped Regression and Sparse Generalized Additive Models

Version 2 2020-06-08, 19:05
Version 1 2020-05-28, 14:35
Dataset posted on 2020-06-08, 19:05, authored by Ray Bai, Gemma E. Moran, Joseph L. Antonelli, Yong Chen, Mary R. Boland

Abstract–We introduce the spike-and-slab group lasso (SSGL) for Bayesian estimation and variable selection in linear regression with grouped variables. We further extend the SSGL to sparse generalized additive models (GAMs), thereby introducing the first nonparametric variant of the spike-and-slab lasso methodology. Our model simultaneously performs group selection and estimation, while our fully Bayes treatment of the mixture proportion allows for model complexity control and automatic self-adaptivity to different levels of sparsity. We develop theory to uniquely characterize the global posterior mode under the SSGL and introduce a highly efficient block coordinate ascent algorithm for maximum a posteriori estimation. We further employ de-biasing methods to provide uncertainty quantification of our estimates. Thus, implementation of our model avoids the computational intensiveness of Markov chain Monte Carlo in high dimensions. We derive posterior concentration rates for both grouped linear regression and sparse GAMs when the number of covariates grows at nearly exponential rate with sample size. Finally, we illustrate our methodology through extensive simulations and data analysis. Supplementary materials for this article are available online.
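The abstract describes a mixture prior of two group-lasso (multivariate Laplace) components, a "spike" with a large scale parameter and a "slab" with a small one, and a MAP algorithm whose shrinkage adapts per group. A minimal sketch of that adaptive-shrinkage idea is below; the function names (`pstar`, `adaptive_penalty`) and the exact parameterization of the mixture are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pstar(beta_g, theta, lam0, lam1):
    """Conditional probability that group g's coefficients come from the
    slab component, under a mixture of two multivariate Laplace densities
    with rates lam0 (spike, large) and lam1 (slab, small).
    Parameterization is an illustrative assumption."""
    m = beta_g.size
    norm = np.linalg.norm(beta_g)
    # log density ratio slab/spike, up to constants shared by both components
    log_ratio = m * (np.log(lam1) - np.log(lam0)) - (lam1 - lam0) * norm
    ratio = np.exp(log_ratio)
    return theta * ratio / (theta * ratio + (1.0 - theta))

def adaptive_penalty(beta_g, theta, lam0, lam1):
    """Effective shrinkage applied to group g: near lam0 (heavy) for groups
    close to zero, near lam1 (light) for groups with large norm."""
    p = pstar(beta_g, theta, lam0, lam1)
    return lam1 * p + lam0 * (1.0 - p)
```

This captures the self-adaptivity the abstract highlights: a block coordinate ascent scheme can apply group soft-thresholding with this coefficient-dependent penalty, so weak groups are shrunk aggressively toward zero while strong groups are left nearly unpenalized.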

Funding

Ray Bai and Mary Boland were supported in part by generous funding from the Perelman School of Medicine, University of Pennsylvania. The work of Ray Bai and Yong Chen was supported in part by National Institutes of Health grants 1R01LM012607 (R.B., Y.C.) and 1R01AI130460 (Y.C.).

History