Supervised Sparse and Functional Principal Component Analysis

Li, Gen; Shen, Haipeng; Huang, Jianhua Z.

doi:10.6084/m9.figshare.1569737.v1

ucgs_a_1064434_sm2452.pdf (417.29 kB)

journal contribution

Posted on 2015-08-05, authored by Gen Li, Haipeng Shen, Jianhua Z. Huang

Principal component analysis (PCA) is an important tool for dimension reduction in multivariate analysis. Regularized PCA methods, such as sparse PCA and functional PCA, have been developed to incorporate special features in many real applications. Sometimes additional variables (referred to as supervision) are measured on the same set of samples, which can potentially drive low-rank structures of the primary data of interest. Classical PCA methods cannot make use of such supervision data. In this article, we propose a supervised sparse and functional principal component (SupSFPC) framework that can incorporate supervision information to recover underlying structures that are more interpretable. The framework unifies and generalizes several existing methods and flexibly adapts to the practical scenarios at hand. The SupSFPC model is formulated in a hierarchical fashion using latent variables. We develop an efficient modified expectation-maximization (EM) algorithm for parameter estimation. We also implement fast data-driven procedures for tuning parameter selection. Our comprehensive simulation and real data examples demonstrate the advantages of SupSFPC. Supplementary materials for this article are available online.
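As a rough illustration of the idea of supervision driving low-rank structure, the following sketch simulates primary data X whose latent scores U depend partly on supervision variables Y, then compares loadings from classical PCA with loadings from a naive supervised alternative. The model form (X = U V^T + E with U = Y B + F), all dimensions, noise levels, and variable names are assumptions made for this example; the record above does not state them. The "regress X on Y, then take an SVD" step is a simple stand-in, not the SupSFPC estimator or its modified EM algorithm, and it applies no sparsity or smoothness regularization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: n samples, p primary variables, q supervision variables, rank r.
n, p, q, r = 200, 30, 3, 2

# Hierarchical latent-variable simulation (assumed form, for illustration only):
#   U = Y B + F      latent scores driven partly by supervision Y
#   X = U V^T + E    primary data with low-rank signal plus noise
Y = rng.standard_normal((n, q))
B = rng.standard_normal((q, r))
F = 0.1 * rng.standard_normal((n, r))
U = Y @ B + F

V, _ = np.linalg.qr(rng.standard_normal((p, r)))  # p x r, orthonormal columns
E = 0.1 * rng.standard_normal((n, p))
X = U @ V.T + E

# Classical PCA: SVD of the column-centered data; ignores Y entirely.
Xc = X - X.mean(axis=0)
_, _, Vt_pca = np.linalg.svd(Xc, full_matrices=False)
V_pca = Vt_pca[:r].T

# Naive supervised alternative: project X onto the span of Y by least squares,
# then take the leading right singular vectors of the fitted values.
Yc = Y - Y.mean(axis=0)
coef, *_ = np.linalg.lstsq(Yc, Xc, rcond=None)
_, _, Vt_sup = np.linalg.svd(Yc @ coef, full_matrices=False)
V_sup = Vt_sup[:r].T


def subspace_cosines(A, C):
    """Cosines of the principal angles between the column spaces of A and C.

    Both inputs are assumed to have orthonormal columns.
    """
    return np.linalg.svd(A.T @ C, compute_uv=False)


cos_pca = subspace_cosines(V, V_pca)
cos_sup = subspace_cosines(V, V_sup)
print("PCA vs. true loadings (cosines):       ", np.round(cos_pca, 3))
print("Supervised vs. true loadings (cosines):", np.round(cos_sup, 3))
```

Cosines near 1 indicate that an estimated loading subspace is close to the simulated one. How the two approaches compare depends on the assumed noise levels and on how strongly Y drives U, so the printed values should be read as a property of this toy setup only.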