The IBM Speaker Recognition System: Recent Advances and Error Analysis

Sadjadi, Seyed Omid and Pelecanos, Jason W and Ganapathy, Sriram (2017) The IBM Speaker Recognition System: Recent Advances and Error Analysis. In: 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), September 8-12, 2016, San Francisco, CA, pp. 3633-3637.

PDF: UND_SPE_PRO_HUM_MAC_3633_2017.PDF - Published Version (186kB)
Restricted to registered users only
Official URL: http://doi.org/10.21437/Interspeech.2016-1159

Abstract

We present recent advances in, along with an error analysis of, the IBM speaker recognition system for conversational speech. Some of the key advancements that contribute to our system include: a nearest-neighbor discriminant analysis (NDA) approach (as opposed to LDA) for intersession variability compensation in the i-vector space, the application of speaker- and channel-adapted features derived from an automatic speech recognition (ASR) system for speaker recognition, and the use of a DNN acoustic model with a very large number of output units (approximately 10k senones) to compute the frame-level soft alignments required in the i-vector estimation process. We evaluate these techniques on the NIST 2010 SRE extended core conditions (C1-C9), as well as the 10sec-10sec condition. To our knowledge, the results achieved by our system represent the best performance published to date on these conditions. For example, on the extended tel-tel condition (C5) the system achieves an EER of 0.59%. To better understand the remaining errors on C5, we examine the recordings associated with the low-scoring target trials and identify various issues with the problematic recordings and trials. Interestingly, we observe that correcting the pathological recordings improves the scores not only for the target trials but also for the non-target trials.
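
The EER quoted in the abstract is the operating point at which the miss rate on target trials equals the false-alarm rate on non-target trials. The following is a minimal sketch of how such a figure can be computed from a list of trial scores. The function name `equal_error_rate`, the simple threshold sweep, and the toy scores are illustrative assumptions; they are not taken from the paper or from the NIST scoring tools.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the equal error rate (EER) of a verification system.

    Assumed convention: higher scores mean "more likely the same speaker",
    and a trial is accepted when its score is >= the decision threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))

    best_gap, eer = None, None
    for t in thresholds:
        # Miss rate: fraction of target trials rejected at threshold t.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: fraction of non-target trials accepted at t.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, 0.5 * (p_miss + p_fa)
    return eer


# Toy scores (hypothetical, not from the evaluation described above).
targets = [2.0, 3.0, 4.0, 5.0]
nontargets = [-1.0, 0.0, 1.0, 2.5]
print(equal_error_rate(targets, nontargets))  # prints 0.25
```

With a finite number of trials the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the closest threshold; evaluation toolkits typically interpolate instead.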

Item Type: Conference Paper
Series: Interspeech
Additional Information: Copyright for this article belongs to the ISCA-INT SPEECH COMMUNICATION ASSOC, C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, BAIXAS, F-66390, FRANCE
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 30 Oct 2017 03:39
Last Modified: 30 Oct 2017 03:39
URI: http://eprints.iisc.ac.in/id/eprint/58126