I-vectors in the context of phonetically-constrained short utterances for speaker verification

Posted September 25, 2013

Larcher, A.; Bousquet, P.; Lee, K. A.; Matrouf, D.; Li, H.; Bonastre, J.-F. (2012)

Short speech duration remains a critical cause of performance degradation when deploying a speaker verification system. To overcome this difficulty, many commercial applications impose the use of fixed pass-phrases. In this context, we show that the performance of the popular i-vector approach can be greatly improved by taking advantage of the phonetic information that i-vectors convey. Moreover, as i-vectors require a conditioning process to reach high accuracy, we show that further improvements are possible by exploiting this phonetic information within the normalisation process. We compare two methods, Within Class Covariance Normalization (WCCN) and Eigen Factor Radial (EFR), both relying on parameters estimated on the same development data. Our study suggests that WCCN is more robust to data mismatch but less efficient than EFR when the development data has a better match with the test data.
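For readers unfamiliar with WCCN, the following is a minimal sketch of the standard technique, not the authors' implementation. It estimates the average within-class covariance W on labelled development i-vectors, then projects with a matrix B such that B Bᵀ = W⁻¹, so that within-class variability becomes roughly isotropic before cosine scoring. The function names, the `ridge` regulariser, and the use of cosine scoring are assumptions made for illustration; the abstract does not say what defines a class (for example a speaker, or a speaker and pass-phrase pair), so the labels here are arbitrary identifiers.

```python
import numpy as np


def wccn_projection(ivectors, labels, ridge=1e-6):
    """Return B with B @ B.T equal to the inverse within-class covariance.

    ivectors: array of shape (n_vectors, dim), development i-vectors.
    labels:   array of shape (n_vectors,), one class id per i-vector.
    ridge:    hypothetical small diagonal term that keeps W invertible.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    labels = np.asarray(labels)
    dim = ivectors.shape[1]
    classes = np.unique(labels)

    # Average the per-class covariance matrices over all classes.
    within = np.zeros((dim, dim))
    for c in classes:
        members = ivectors[labels == c]
        centred = members - members.mean(axis=0)
        within += centred.T @ centred / len(members)
    within /= len(classes)
    within += ridge * np.eye(dim)

    # Cholesky factor of the inverse: inv(W) = B @ B.T
    return np.linalg.cholesky(np.linalg.inv(within))


def cosine_score(enrol, test, B):
    """Cosine similarity between two i-vectors after WCCN projection."""
    a, b = B.T @ enrol, B.T @ test
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In use, B would be estimated once on development data and then applied to both enrolment and test i-vectors, which is why the match between development and test conditions matters for this kind of normalisation.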
