Untangle the structural and random zeros in statistical modelings

Tang, W.; He, H.; Wang, W.J.; Chen, D.G.

doi:10.6084/m9.figshare.5536669.v1

journal contribution

posted on 2017-10-25, 03:51 authored by W. Tang, H. He, W.J. Wang, D.G. Chen

Count data with structural zeros are common in public health applications. Considerable research has focused on zero-inflated models, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, for such data when they serve as the response variable. However, when such variables are used as predictors, the difference between structural and random zeros is often ignored, which may result in biased estimates. One remedy is to include an indicator of the structural zero as a predictor in the model, if it is observed. In practice, however, structural zeros are often not observed, in which case no statistical method is available to address the bias. This paper aims to fill this methodological gap by developing parametric methods, based on the maximum likelihood approach, for modeling zero-inflated count data used as predictors. The response variable can be of any type, including continuous, binary, count, or even zero-inflated count responses. Simulation studies assess the numerical performance of the new approach when the sample size is small to moderate. A real data example demonstrates the application of the method.
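
The bias described above, and the indicator remedy for the case where structural zeros are observed, can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not the paper's maximum likelihood method for unobserved structural zeros. The data-generating model (a linear response with Gaussian noise), the parameter values, and the helper names `simulate` and `ols` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n, p_structural=0.3, lam=3.0, b0=1.0, b1=0.5, b2=-2.0, seed=0):
    """Draw a zero-inflated Poisson predictor and a continuous response.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    r = rng.random(n) < p_structural          # True marks a structural zero
    counts = rng.poisson(lam, size=n)         # at-risk counts (may be random zeros)
    x = np.where(r, 0, counts)                # observed zero-inflated predictor
    # Structural zeros shift the response by b2, on top of the count effect b1.
    y = b0 + b1 * x + b2 * r + rng.normal(0.0, 1.0, size=n)
    return x, r.astype(float), y


def ols(columns, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


x, r, y = simulate(20000)
naive = ols([x], y)         # treats structural and random zeros alike
adjusted = ols([x, r], y)   # adds the observed structural-zero indicator

print("true slope on x:      0.5")
print("naive slope on x:    ", round(float(naive[1]), 3))
print("adjusted slope on x: ", round(float(adjusted[1]), 3))
```

In this setup the naive fit attributes part of the structural-zero effect to the count itself, so its slope drifts away from the true value, while the fit that includes the indicator recovers it. When the indicator is not observed, this adjustment is unavailable, which is the gap the paper addresses.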