Taylor & Francis Group
ucgs_a_1753530_sm4461.zip (265.45 kB)

A Tree-Based Semi-Varying Coefficient Model for the COM-Poisson Distribution

Version 3 2021-09-29, 16:25
Version 2 2020-05-15, 20:19
Version 1 2020-04-16, 18:31
dataset
posted on 2021-09-29, 16:25 authored by Suneel Babu Chatla, Galit Shmueli

We propose a tree-based semi-varying coefficient model for the Conway–Maxwell–Poisson (CMP or COM-Poisson) distribution, a two-parameter generalization of the Poisson distribution that is flexible enough to capture both under-dispersion and over-dispersion in count data. The advantage of tree-based methods is their scalability to high-dimensional data. We develop CMPMOB, an estimation procedure for a semi-varying coefficient model, using model-based recursive partitioning (MOB). The proposed framework is broader than the existing MOB framework because it allows node-invariant effects to be included in the model. To reduce the computational burden of the exhaustive search employed in the original MOB algorithm, we propose a new split point estimation procedure that borrows tools from change point estimation methodology. The proposed method uses only the estimated score functions, without fitting a model for each candidate split point, and is therefore computationally simpler. Because tree-based methods provide only a piecewise constant approximation to the underlying smooth function, we further propose the CMPBoost semi-varying coefficient model, which uses a gradient boosting procedure for estimation. The usefulness of the proposed methods is illustrated using simulation studies and a real example from a bike sharing system in Washington, DC. Supplementary files for this article are available online.
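To make the dispersion claim in the abstract concrete, the following is a minimal sketch of the standard COM-Poisson probability mass function, P(Y = y) proportional to lambda^y / (y!)^nu. It is not taken from the supplementary files. The function names `cmp_pmf` and `mean_var`, the parameter names `lam` and `nu`, and the truncation point `max_count` are illustrative assumptions. The normalizing constant is an infinite series, approximated here by truncation.

```python
import math


def cmp_pmf(lam, nu, max_count=200):
    """Approximate P(Y = y), y = 0..max_count, for a COM-Poisson(lam, nu).

    Assumption: the infinite normalizing series is truncated at max_count,
    which is adequate only when the tail mass beyond it is negligible.
    """
    # Log of the unnormalized weights: y*log(lam) - nu*log(y!)
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_count + 1)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def mean_var(pmf):
    """Mean and variance of a distribution on 0, 1, 2, ... given its pmf."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var


if __name__ == "__main__":
    # nu = 1 recovers the Poisson; nu > 1 gives under-dispersion
    # (variance below the mean); nu < 1 gives over-dispersion.
    for lam, nu in [(3.0, 1.0), (3.0, 2.0), (1.5, 0.5)]:
        m, v = mean_var(cmp_pmf(lam, nu))
        print(f"lam={lam}, nu={nu}: mean={m:.4f}, var={v:.4f}")
```

With `nu = 1` the weights reduce to those of a Poisson distribution, so the mean and variance coincide. Moving `nu` above or below 1 shifts the variance below or above the mean, which is the flexibility the model relies on.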

Funding

This research is partially supported by the Ministry of Science and Technology, Taiwan, under grant 105-2410-H-007-034-MY3 (both authors) and grant 107-2811-M-007-1047 (first author).
