Native language detection using the I-vector framework

Senoussaoui, Mohammed and Cardinal, Patrick and Dehak, Najim and Koerich, Alessandro L.

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2016

Abstract: Native-language identification is the task of determining a speaker's native language based only on their speech in a second language. In this paper we propose using the well-known i-vector representation of the speech signal to detect the native language of a speaker of English. The i-vector representation has shown excellent performance on the closely related task of distinguishing between different languages. We evaluated different ways of extracting i-vectors in order to adapt them to the specifics of the native language detection task. Experimental results on the test set of the 2016 ComParE Native Language Sub-Challenge show that the proposed system, based on a conventional i-vector extractor, outperforms the baseline system with a 42% relative improvement.
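To make the idea of classifying fixed-length utterance vectors concrete, the sketch below shows one common way such vectors can be scored: length normalization followed by cosine similarity against per-class mean vectors. This is an illustration only. The abstract does not say which back-end classifier the authors used, the vectors here are synthetic stand-ins rather than real i-vectors, and every name (`length_normalize`, `train_class_means`, `predict`, `lang_a`, `lang_b`) is hypothetical.

```python
import math
import random


def length_normalize(vec):
    """Scale a vector to unit Euclidean norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train_class_means(vectors, labels):
    """Average the length-normalized vectors of each class, then renormalize."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        unit = length_normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    return {
        label: length_normalize([s / counts[label] for s in total])
        for label, total in sums.items()
    }


def predict(vec, class_means):
    """Return the class whose mean has the highest cosine similarity to vec."""
    unit = length_normalize(vec)
    return max(
        class_means,
        key=lambda label: sum(u * m for u, m in zip(unit, class_means[label])),
    )


# Synthetic stand-in data (assumption): two hypothetical native-language
# classes, each a random centre plus small Gaussian noise. Real i-vectors
# would come from a trained extractor, which is not shown here.
rng = random.Random(0)
DIM = 50


def noisy(center):
    """Return a noisy copy of a class centre."""
    return [c + 0.1 * rng.gauss(0, 1) for c in center]


centers = {
    "lang_a": [rng.gauss(0, 1) for _ in range(DIM)],
    "lang_b": [rng.gauss(0, 1) for _ in range(DIM)],
}

train_vectors, train_labels = [], []
for label, center in centers.items():
    for _ in range(20):
        train_vectors.append(noisy(center))
        train_labels.append(label)

class_means = train_class_means(train_vectors, train_labels)

# Classify one fresh noisy sample per class.
predictions = [predict(noisy(centers[label]), class_means) for label in centers]
print(predictions)
```

Cosine scoring is used here only because it is simple and needs no extra dependencies; a real system might replace it with a discriminative classifier trained on the extracted vectors.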