Dynamic ensembles of exemplar-SVMs for still-to-video face recognition

Bashbaghi, Saman and Granger, Eric and Sabourin, Robert and Bilodeau, Guillaume Alexandre

Pattern Recognition 2017

Abstract: Face recognition (FR) plays an important role in video surveillance by enabling accurate recognition of individuals of interest over a distributed network of cameras. Systems for still-to-video FR are exposed to challenging operational environments. The appearance of faces changes when they are captured under unconstrained conditions, due to variations in pose, scale, illumination, occlusion, blur, etc. Moreover, the facial models used for matching may not be robust to intra-class variations because they are typically designed a priori with one reference facial still per person. Indeed, faces captured during enrollment (using still cameras) may differ considerably from those captured during operations (using surveillance cameras). In this paper, an efficient multi-classifier system (MCS) is proposed for accurate still-to-video FR based on multiple face representations and domain adaptation (DA). An individual-specific ensemble of exemplar-SVM (e-SVM) classifiers is designed to improve robustness to intra-class variations. During enrollment of a target individual, an ensemble is used to model the single reference still, where multiple face descriptors and random feature subspaces make it possible to generate a diverse pool of patch-wise classifiers. To adapt these ensembles to the operational domains, e-SVMs are trained using labeled face patches extracted from the reference still versus patches extracted from cohort and other non-target stills, mixed with unlabeled patches extracted from the corresponding face trajectories captured with surveillance cameras. During operations, the most competent classifiers for a given probe face are dynamically selected and weighted based on internal criteria determined in the feature space of the e-SVMs. This paper also investigates the impact of using different training schemes for DA, as well as of the validation set of non-target faces extracted from stills and video trajectories of unknown individuals in the operational domain.
The performance of the proposed system was validated using videos from the COX-S2V and Chokepoint datasets. Results indicate that the proposed system can surpass state-of-the-art accuracy, yet with a significantly lower computational complexity. Indeed, dynamic selection and weighting make it possible to combine only the most relevant classifiers for each input probe.
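
To make the pipeline described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It trains one linear e-SVM per random feature subspace (a single positive exemplar versus many negatives), then selects and weights classifiers per probe. Several parts are assumptions made for illustration: the synthetic feature vectors stand in for real face descriptors and patches, the subgradient-descent trainer and its hyperparameters are arbitrary, and the competence criterion (probe closer to the reference exemplar than to the nearest negative in that subspace) is a hypothetical stand-in for the paper's internal criteria. The domain adaptation step with unlabeled video patches is omitted.

```python
import numpy as np

def train_exemplar_svm(pos, negs, lam=0.01, lr=0.05, epochs=300):
    """Linear SVM for one positive exemplar vs. many negatives.
    Trained by subgradient descent on a class-weighted hinge loss
    (an assumed trainer, chosen only to keep the sketch dependency-free)."""
    X = np.vstack([pos, negs])
    y = np.concatenate([[1.0], -np.ones(len(negs))])
    # Class weights balance the single positive against all negatives.
    c = np.concatenate([[1.0], np.full(len(negs), 1.0 / len(negs))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1.0   # samples violating the margin
        coef = c * y * active
        w -= lr * (lam * w - coef @ X)
        b -= lr * (-coef.sum())
    return w, b

def build_ensemble(ref, negs, n_classifiers=10, subspace_dim=10, seed=0):
    """Enrollment: one e-SVM per random feature subspace of the reference."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_classifiers):
        idx = rng.choice(ref.shape[0], size=subspace_dim, replace=False)
        w, b = train_exemplar_svm(ref[idx], negs[:, idx])
        ensemble.append({"idx": idx, "w": w, "b": b,
                         "ref": ref[idx], "negs": negs[:, idx]})
    return ensemble

def classify(ensemble, probe):
    """Operations: dynamically select and weight classifiers for one probe."""
    scores, competence = [], []
    for clf in ensemble:
        x = probe[clf["idx"]]
        scores.append(x @ clf["w"] + clf["b"])
        # Hypothetical competence: distance to the nearest negative minus
        # distance to the reference exemplar, measured in the subspace.
        d_ref = np.linalg.norm(x - clf["ref"])
        d_neg = np.linalg.norm(clf["negs"] - x, axis=1).min()
        competence.append(d_neg - d_ref)
    scores, competence = np.array(scores), np.array(competence)
    selected = competence > 0
    if not selected.any():
        # No classifier is deemed competent: fall back to a uniform average.
        weights = np.full(len(ensemble), 1.0 / len(ensemble))
    else:
        weights = np.where(selected, competence, 0.0)
        weights = weights / weights.sum()
    return float(scores @ weights), weights

# Synthetic stand-ins for descriptor vectors (assumption, not real face data).
rng = np.random.default_rng(42)
negs = rng.normal(0.0, 1.0, size=(60, 40))          # cohort / non-target stills
ref = np.full(40, 3.0)                              # single reference still
ensemble = build_ensemble(ref, negs)

target_probe = ref + rng.normal(0.0, 0.3, size=40)  # same identity, perturbed
impostor_probe = rng.normal(0.0, 1.0, size=40)      # unknown individual
target_score, target_weights = classify(ensemble, target_probe)
impostor_score, impostor_weights = classify(ensemble, impostor_probe)
```

In this sketch, only classifiers with positive competence contribute to the fused score, which mirrors the idea that combining fewer, more relevant classifiers per probe reduces the work done at matching time.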