Meta-learning recommendation of default size of classifier pool for META-DES

Roy, Anandarup and Cruz, Rafael M.O. and Sabourin, Robert and Cavalcanti, George D.C.

Neurocomputing 2016

Abstract: Dynamic ensemble selection (DES) is a mechanism for selecting an ensemble of competent classifiers from a pool of base classifiers in order to classify a particular test sample. The size of this pool is user-defined, yet it is crucial for controlling the computational complexity and performance of a DES method. An appropriate pool size depends on the choice of base classifiers, the underlying DES method used, and, more importantly, the characteristics of the given problem. After the DES method and the base classifiers are selected, an appropriate pool size for a given problem can be obtained by repeatedly applying the DES method with a variety of sizes and then selecting the best one. Since this brute-force approach is computationally expensive, researchers set the pool size to a pre-specified value. This strategy may, however, further complicate the DES method and reduce its performance. Instead, we propose a framework akin to meta-learning that predicts a suitable pool size from the intrinsic classification complexity of a problem. In our strategy, we collect meta-features corresponding to classification complexity from a number of data sets. Additionally, we obtain the best pool sizes for these data sets using the brute-force approach. The association between these two pieces of information is captured using meta-regression models. Finally, for an unseen problem, we predict the pool size using this model and the classification complexity information. We carry out experiments on 65 two-class data sets with a recent DES method, namely META-DES. We also consider variants of meta-regression techniques, report their prediction results, and carry out a statistical comparison among them. Moreover, we investigate the performance of META-DES and observe that it performs equivalently for both the predicted and the best pool sizes.
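The pipeline described in the abstract (brute-force search for the best pool size on known data sets, then a meta-regressor that maps complexity meta-features to a pool size) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the meta-regression models or the complexity measures, so the sketch uses a plain k-nearest-neighbour regressor as a stand-in, and the function names, the two-dimensional meta-feature vectors, and all numbers below are made-up assumptions for illustration only.

```python
from math import dist

def brute_force_pool_size(candidates, score):
    """Return the candidate pool size with the highest score.

    `score` is a hypothetical callable mapping a pool size to the accuracy
    obtained by running the DES method with that size. Ties are broken in
    favour of the smallest pool.
    """
    return max(sorted(candidates), key=score)

def predict_pool_size(meta_features, train_features, train_sizes, k=3):
    """Stand-in meta-regressor: average the best pool sizes of the k
    training data sets whose meta-features are closest (Euclidean)."""
    ranked = sorted(
        zip(train_features, train_sizes),
        key=lambda pair: dist(pair[0], meta_features),
    )
    nearest = [size for _, size in ranked[:k]]
    return round(sum(nearest) / len(nearest))

# Toy accuracies per pool size for one data set (invented values).
toy_scores = {10: 0.81, 50: 0.86, 100: 0.86}
best_size = brute_force_pool_size(toy_scores, toy_scores.get)

# Toy meta-data set: complexity meta-features paired with best pool sizes
# found by brute force (invented values).
train_features = [(0.2, 0.1), (0.3, 0.2), (0.8, 0.9), (0.9, 0.7)]
train_sizes = [10, 20, 80, 100]

# Predict a pool size for an unseen problem from its meta-features.
predicted = predict_pool_size((0.25, 0.15), train_features, train_sizes, k=2)
print(best_size, predicted)
```

The brute-force step is run once per training data set to build the meta-data set; for an unseen problem only the inexpensive prediction step is needed, which is the saving the abstract argues for.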