Taylor & Francis Group

Graph-Assisted Inverse Regression for Count Data and Its Application to Sequencing Data

Version 3 2021-09-29, 16:19
Version 2 2020-01-29, 19:02
Version 1 2019-12-19, 16:44
dataset
Posted on 2021-09-29, 16:19; authored by Tao Wang

Multivariate count data, such as sequencing reads in genomics, are often connected to a clinical phenotype of interest. We develop a flexible framework for dimension reduction in regression, with predictors that are correlated counts, by modeling the conditional distribution of the predictors, given the response, using a pairwise Poisson graphical model. This new framework, called network-based inverse regression for counts, allows us to derive a sufficient reduction of the predictors, while adjusting for the dependence structure among them. We propose a regularized criterion for estimating both the reduction structure and the network structure. The estimation algorithm can be implemented efficiently on a parallel computer. We also introduce an adaptive version and a sparse variant of the proposed procedure. The methods are evaluated on simulated data and are applied to a gut microbiome sequencing dataset. Supplementary materials for this article are available online.

Funding

This research was supported in part by the National Natural Science Foundation of China (11601326, 11971017), National Key R&D Program of China (2018YFC0910500), Shanghai Municipal Science and Technology Major Project (2017SHZDZX01), and Neil Shen’s SJTU Medical Research Fund.

History