%0 Generic
%A Lei, Jing
%D 2019
%T Cross-Validation With Confidence
%U https://tandf.figshare.com/articles/dataset/Cross-Validation_with_Confidence/9976901
%R 10.6084/m9.figshare.9976901.v2
%2 https://tandf.figshare.com/ndownloader/files/18232058
%2 https://tandf.figshare.com/ndownloader/files/18232061
%2 https://tandf.figshare.com/ndownloader/files/18232064
%2 https://tandf.figshare.com/ndownloader/files/18232067
%K Cross-validation
%K Hypothesis testing
%K Model selection
%K Overfitting
%K Tuning parameter selection
%X

Cross-validation is one of the most popular model and tuning parameter selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to overfit because they ignore the uncertainty in the testing sample. We develop a novel, statistically principled inference tool based on cross-validation that takes this uncertainty into account. The method outputs a set of highly competitive candidate models that contains the optimal one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for tuning parameter selection, the method provides a different trade-off between prediction accuracy and model interpretability compared with existing variants of cross-validation. We demonstrate the performance of the proposed method in several simulated and real data examples. Supplemental materials for this article can be found online.

%I Taylor & Francis