Sensitivity analysis and choosing between alternative polytomous IRT models using Bayesian model comparison criteria

Polytomous Item Response Theory (IRT) models are used by specialists to score assessments and questionnaires whose items have multiple response categories. In this article, we study the performance of five model comparison criteria for choosing between the graded response model and the generalized partial credit model when, for a given dataset, the choice between the two is unclear. A simulation study is conducted under a Bayesian approach to analyze sensitivity to the choice of priors and to compare the performance of the criteria, with estimation carried out using the No-U-Turn Sampler algorithm. The results are then used to select a model for an application to mental health data.