Tuning the regularisation and kernel hyperparameters is a vital step in optimising the generalisation performance of kernel methods, such as the support vector machine (SVM). This is most often performed by minimising a resampling/cross-validation based model selection criterion; however, there is little practical guidance on the most suitable form of resampling.
In this article, the authors present the results of an extensive empirical evaluation of resampling procedures for SVM hyperparameter selection, designed to address this gap in the machine learning literature. They tested 17 different resampling procedures for selecting the best SVM hyperparameters on 121 binary classification data sets.
The conclusion is that the 2-fold procedure should be used for data sets with more than 1000 points. In these cases, the user may expect a difference of −0.0031 to 0.0031 in the error rate of the classifier relative to using a 5-fold procedure, which the authors regard as the limit of what one should consider an irrelevant change in the classifier error rate. For smaller data sets, they could not detect any significant difference (on average) between 5-fold and computationally more costly procedures such as 10-fold, 5 to 20 times repeated bootstrap, or 2 times repeated 5-fold. Thus, a 5-fold procedure is appropriate for smaller data sets.
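As a concrete illustration of the procedure being tuned, the following is a minimal sketch (assuming scikit-learn) of selecting the SVM regularisation parameter C and RBF kernel width gamma by minimising k-fold cross-validation error, with the number of folds chosen according to the data set size as the authors recommend. The synthetic data set and parameter grid are illustrative choices, not the paper's experimental setup.

```python
# Sketch: k-fold cross-validation grid search for SVM hyperparameters,
# choosing k by data set size per the paper's recommendation.
# (Data set and grid are illustrative, not from the paper.)
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Illustrative binary classification data set.
X, y = make_classification(n_samples=1500, n_features=20, random_state=0)

# Recommendation: 2-fold suffices above 1000 points, 5-fold below.
k = 2 if len(y) > 1000 else 5

# Hypothetical grid over C and the RBF kernel width gamma.
grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}

search = GridSearchCV(
    SVC(kernel="rbf"),
    grid,
    cv=StratifiedKFold(n_splits=k, shuffle=True, random_state=0),
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

With only 2 folds on a large data set, each grid point is fitted twice instead of five times, so the search is markedly cheaper while, per the paper's findings, the selected hyperparameters are expected to perform essentially as well.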
WAINER, Jacques; CAWLEY, Gavin. Empirical evaluation of resampling procedures for optimising SVM hyperparameters. Journal of Machine Learning Research, 2016.