Sample Size for Multiple Hypothesis Testing in Biosimilar Development

Biosimilar development often involves multiple endpoints within a study, multiple doses and routes of administration, or multiple studies in different populations. Regulatory guidelines require that equivalence of the biosimilar and the reference drug be shown for all comparisons, which typically demands a large sample size for the clinical development program. When m null hypotheses are considered, one way to reduce the sample size is to require that only k < m of them be rejected to obtain approval. This is a realistic requirement: despite its guidelines, the European Medicines Agency (EMA) has already approved applications for biosimilars in which not all primary endpoints met the equivalence criteria. In this article, we investigate the properties of the test for the success of at least k out of m endpoints and discuss several multiplicity adjustments that might be useful in practice. We illustrate the impact of multiple hypothesis testing on the sample size using three real-world examples of pharmacokinetic studies that were submitted to the EMA for the approval of biosimilars. Supplementary materials for this article are available online.
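
To give a rough sense of why requiring only k out of m rejections can lower the sample size, the following minimal sketch computes the probability that at least k of m tests reject. It rests on simplifying assumptions that are not taken from the article: the m tests are treated as independent, each has the same marginal power p, and no multiplicity adjustment is applied. The function name `prob_at_least_k` and the numbers in the usage lines are purely illustrative.

```python
from math import comb


def prob_at_least_k(m: int, k: int, p: float) -> float:
    """Probability that at least k of m tests reject.

    Illustrative assumptions (not from the article): the m tests are
    independent and each rejects with the same marginal power p, so the
    number of rejections follows a Binomial(m, p) distribution.
    """
    if not 0 <= k <= m:
        raise ValueError("k must satisfy 0 <= k <= m")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Upper tail of the binomial distribution: sum over j = k, ..., m.
    return sum(comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(k, m + 1))


if __name__ == "__main__":
    # Hypothetical setting: m = 3 endpoints, marginal power 0.8 each.
    print(prob_at_least_k(3, 3, 0.8))  # all three endpoints must succeed
    print(prob_at_least_k(3, 2, 0.8))  # at least two of three must succeed
```

Under these assumptions, with m = 3 and p = 0.8, the overall power is about 0.512 when all three endpoints must succeed and about 0.896 when two of three suffice, so a given overall power can be reached with a lower marginal power and hence fewer subjects. In practice the endpoints are usually correlated and relaxing the requirement calls for a multiplicity adjustment, both of which change these numbers.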