Classification with minimum ambiguity under distribution heterogeneity

Liu, Yongxin; Lin, Lu

doi:10.6084/m9.figshare.8107631.v1

journal contribution

posted on 2019-05-10, 05:52 authored by Yongxin Liu, Lu Lin

Traditional classification assumes that the distribution of the indicator variable X within each class is homogeneous. However, when the data in one class come from heterogeneous distributions, the likelihood ratio of the two classes is not unique. In this paper, we construct a classifier based on an ambiguity criterion for the case where the distribution of X within a single class is heterogeneous. The historical data are separated by situation, and each situation's data are used to estimate its own thresholds. The final boundary is then given by the maximum and minimum of the thresholds over all situations. Our approach attains the minimum ambiguity while maintaining high classification accuracy, which allows for precise decisions. In addition, we derive a nonparametric estimator of the classification region and establish its theoretical properties. A simulation study and a real data analysis demonstrate the effectiveness of our method.
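The threshold-combination step described in the abstract can be illustrated with a minimal sketch. This is one plausible reading, not the authors' implementation. It assumes that each situation yields a single scalar threshold on some classification score, and that points scoring between the smallest and largest threshold are left ambiguous. The function names `combine_thresholds` and `classify`, the score scale, and the class labels are all hypothetical.

```python
from typing import Sequence, Tuple


def combine_thresholds(thresholds: Sequence[float]) -> Tuple[float, float]:
    """Return (lower, upper) as the minimum and maximum per-situation thresholds.

    Assumption: one scalar threshold has already been estimated per situation.
    """
    if not thresholds:
        raise ValueError("at least one per-situation threshold is required")
    return min(thresholds), max(thresholds)


def classify(score: float, lower: float, upper: float) -> str:
    """Assign a label, or 'ambiguous' when the score lies in [lower, upper].

    Assumption: larger scores favor class 1, smaller scores favor class 0.
    """
    if score > upper:
        return "class 1"
    if score < lower:
        return "class 0"
    return "ambiguous"


# Hypothetical thresholds estimated separately in three situations.
lower, upper = combine_thresholds([0.40, 0.55, 0.50])
labels = [classify(s, lower, upper) for s in (0.30, 0.45, 0.70)]
```

Under this reading, a point is assigned to a class only when every situation's threshold agrees on the decision, and the ambiguous band is exactly the range over which the per-situation thresholds disagree.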