Meta-regression based pool size prediction scheme for dynamic selection of classifiers

Roy, Anandarup and Cruz, Rafael M.O. and Sabourin, Robert and Cavalcanti, George D.C.

Proceedings – International Conference on Pattern Recognition 2016

Abstract: Dynamic selection (DS) is a mechanism that selects one competent classifier, or an ensemble of them, from a pool of base classifiers in order to classify a specific test sample. The size of this pool is user-defined, yet it is crucial for controlling the computational complexity and performance of a DS method. An appropriate pool size depends on the choice of base classifiers, the underlying DS method and, more importantly, the characteristics of the given problem. Once the DS method and the base classifiers are chosen, an appropriate pool size for a given problem can be obtained by repeatedly applying the DS method with a variety of pool sizes and selecting the best one. Since this brute-force approach is computationally expensive, researchers usually set the pool size to a pre-specified value. However, this strategy may reduce the performance of the DS method. Instead, we propose a meta-regression model that predicts a suitable pool size based on the intrinsic classification complexity of a problem. In our strategy, we obtain the best pool sizes for a number of data sets using the brute-force approach. Additionally, we extract meta-features that represent the classification complexity of each problem. These two pieces of information are associated by means of meta-regression models. Finally, for an unseen problem, we predict the pool size using this model and the classification complexity information. We carry out the experiments on 64 two-class data sets with several well-known DS methods. We also consider variants of meta-regression techniques and report the prediction results. We further analyze these results using a statistical test. Finally, we investigate the performance of a DS method and observe that it performs equivalently for the predicted and the best pool sizes.
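The strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the meta-features, the regression models, or the candidate pool sizes, so the ridge regressor, the three synthetic meta-features, the candidate grid, and every function name below are assumptions made purely for illustration.

```python
import numpy as np


def brute_force_best_pool_size(score_fn, candidate_sizes):
    """Evaluate every candidate pool size and return the best-scoring one.

    score_fn is a hypothetical callable mapping a pool size to a DS
    performance score (higher is better). Ties go to the smallest pool.
    """
    scores = [score_fn(m) for m in candidate_sizes]
    best = max(scores)
    return min(m for m, s in zip(candidate_sizes, scores) if s == best)


def fit_meta_regressor(meta_features, best_sizes, ridge=1e-3):
    """Fit a ridge regression from meta-features to best pool sizes.

    Ridge regression is a stand-in here; the abstract only says that
    variants of meta-regression techniques were considered.
    """
    X = np.column_stack([np.ones(len(meta_features)), meta_features])
    penalty = ridge * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ np.asarray(best_sizes, float))


def predict_pool_size(weights, meta_feature_vector, candidate_sizes):
    """Predict a pool size for an unseen problem.

    Snapping the raw prediction to the nearest candidate size is an
    assumption of this sketch, not something stated in the abstract.
    """
    raw = weights[0] + np.dot(weights[1:], meta_feature_vector)
    sizes = np.asarray(candidate_sizes)
    return int(sizes[np.argmin(np.abs(sizes - raw))])


# Synthetic stand-in for 64 training problems with 3 complexity meta-features.
rng = np.random.default_rng(0)
candidate_sizes = list(range(10, 101, 10))
train_meta = rng.uniform(0.0, 1.0, size=(64, 3))

# Hypothetical brute-force step: each problem's score peaks at a pool size
# that grows with its first meta-feature (made-up relationship).
train_best = [
    brute_force_best_pool_size(
        lambda m, target=10 + 80 * x[0]: -abs(m - target), candidate_sizes
    )
    for x in train_meta
]

weights = fit_meta_regressor(train_meta, train_best)
unseen_meta = np.array([0.5, 0.2, 0.9])
predicted = predict_pool_size(weights, unseen_meta, candidate_sizes)
print("Predicted pool size for the unseen problem:", predicted)
```

The expensive brute-force search runs only on the training problems; an unseen problem needs just its meta-features and a single prediction, which is the computational saving the abstract argues for.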