Visual and acoustic identification of bird species

Marini, A. and Turatti, A. J. and Britto, A. S. and Koerich, A. L.

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing – Proceedings 2015

Abstract: This paper presents a novel approach to bird species identification that relies on both visual features extracted from unconstrained bird images and acoustic features extracted from bird vocalizations. The Scale-Invariant Feature Transform (SIFT) detects local features in bird images, which are then used to train a support vector machine (SVM) classifier. Instances that the visual classifier cannot classify with sufficient confidence are rejected and, if a recording of the bird's song is available, reclassified using Mel-frequency cepstral coefficients (MFCCs) extracted from it. Experiments on a dataset of 50 bird species, comprising images from CUB200-2011 and audio samples from Xeno-Canto, show that using an acoustic classifier to re-process the instances rejected by the visual classifier yields improvements of between 1.2 and 15.7 percentage points, depending on the rejection level.
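The reject-and-reclassify scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `cascade_predict`, the use of the highest class probability as the confidence measure, the `threshold` parameter, and the fallback to the visual decision when no audio exists are all assumptions made for illustration. The abstract does not specify the rejection rule or what happens to rejected instances without a recording.

```python
from typing import List, Optional, Sequence, Tuple


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score (first one on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def cascade_predict(
    visual_probs: Sequence[Sequence[float]],
    acoustic_probs: Sequence[Optional[Sequence[float]]],
    threshold: float,
) -> List[Tuple[int, str]]:
    """Combine two classifiers with a rejection step (hypothetical helper).

    visual_probs[i]   : per-class scores from the image classifier.
    acoustic_probs[i] : per-class scores from the audio classifier,
                        or None when no recording is available.
    threshold         : assumed rejection level; a visual decision whose
                        top score is below it is rejected.
    """
    decisions = []
    for v, a in zip(visual_probs, acoustic_probs):
        if max(v) >= threshold:
            # Confident visual decision: accept it.
            decisions.append((argmax(v), "visual"))
        elif a is not None:
            # Rejected by the visual stage: re-process with audio.
            decisions.append((argmax(a), "acoustic"))
        else:
            # Assumption: without audio, keep the visual decision.
            decisions.append((argmax(v), "visual-fallback"))
    return decisions


if __name__ == "__main__":
    visual = [[0.9, 0.05, 0.05], [0.4, 0.35, 0.25], [0.34, 0.33, 0.33]]
    acoustic = [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8], None]
    print(cascade_predict(visual, acoustic, threshold=0.6))
```

Raising `threshold` sends more instances to the acoustic stage, which corresponds to the "rejection level" the abstract refers to.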