
Hierarchical classification of speaker and background noise and estimation of SNR using sparse representation

Girish, Vijay KV and Ramakrishnan, AG and Ananthapadmanabha, TV (2017) Hierarchical classification of speaker and background noise and estimation of SNR using sparse representation. In: 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), Sep 08-12, 2016, San Francisco, CA, pp. 2972-2976.

Official URL: http://doi.org/10.21437/Interspeech.2016-175

Abstract

One motivation for analysing recordings of conversations is to identify the nature of the background noise, as a means of identifying the possible geographical location of a speaker. In a high-noise environment, it is also desirable to automatically locate only those segments of the recording that contain speech, so as to minimize manual analysis. The next task is to identify whether the speech is from one of the known people. A dictionary learning and block sparsity based source recovery approach has been used to estimate the SNR of noisy speech recordings, simulated at different SNRs using ten different noise sources. Given a test utterance, a noise label is assigned using a block sparsity approach; subsequently, the speaker is classified using the sum of weights recovered from the concatenation of the speaker dictionaries and the identified noise source dictionary. Using the dictionaries of the identified speaker and noise sources, framewise speech and noise energies are estimated using a source recovery method. The energy estimates are then used to identify the segments where speech is present. We obtain 100% accuracy for background classification and around 90% for speaker classification at an SNR of 10 dB.
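
The pipeline described in the abstract (block-wise noise labelling, then source recovery and SNR estimation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the dictionaries are toy orthonormal atoms rather than learned ones, the noise names ("traffic", "babble", "factory") are hypothetical, and a plain orthogonal matching pursuit stands in for whatever sparse solver the authors actually used.

```python
import numpy as np


def omp(D, y, k):
    """Greedy orthogonal matching pursuit: select at most k columns (atoms) of D."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        scores = np.abs(D.T @ residual)
        if support:
            scores[support] = 0.0  # never pick the same atom twice
        j = int(np.argmax(scores))
        if scores[j] < 1e-12:  # residual already explained
            break
        support.append(j)
        # Re-fit all selected atoms jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef
    return x


def block_weights(dicts, y, k):
    """Sparse-code y over the concatenated dictionaries; sum |weights| per block."""
    names = list(dicts)
    x = omp(np.hstack([dicts[n] for n in names]), y, k)
    weights, start = {}, 0
    for n in names:
        width = dicts[n].shape[1]
        weights[n] = float(np.sum(np.abs(x[start:start + width])))
        start += width
    return weights


def estimate_snr_db(D_speech, D_noise, y, k, eps=1e-12):
    """Recover speech and noise parts of one frame and return their energy ratio in dB."""
    x = omp(np.hstack([D_speech, D_noise]), y, k)
    n_s = D_speech.shape[1]
    speech = D_speech @ x[:n_s]
    noise = D_noise @ x[n_s:]
    return 10.0 * np.log10((np.sum(speech**2) + eps) / (np.sum(noise**2) + eps))


# Toy setup (hypothetical): disjoint identity columns act as dictionaries.
I = np.eye(12)
noise_dicts = {"traffic": I[:, 0:3], "babble": I[:, 3:6], "factory": I[:, 6:9]}
speaker_dict = I[:, 9:12]

# Step 1: label the background from a noise-only frame.
noise_frame = 2.0 * I[:, 3] + 1.0 * I[:, 4]
weights = block_weights(noise_dicts, noise_frame, k=2)
noise_label = max(weights, key=weights.get)

# Step 2: split a noisy frame into speech and noise energy using the identified noise dictionary.
noisy_frame = 3.0 * I[:, 9] + 1.0 * I[:, 10] + 1.0 * I[:, 3]
snr_db = estimate_snr_db(speaker_dict, noise_dicts[noise_label], noisy_frame, k=3)

print(noise_label, round(snr_db, 2))
```

In this toy case the noise frame is built only from "babble" atoms, so that block collects all of the recovered weight, and the noisy frame carries ten times more speech energy than noise energy. With learned, non-orthogonal dictionaries the recovery would only be approximate, and the same framewise energy estimates could be thresholded to flag the segments where speech is present.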

Item Type: Conference Paper
Additional Information: Copyright for this article belongs to the ISCA-INT SPEECH COMMUNICATION ASSOC, C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, BAIXAS, F-66390, FRANCE
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Depositing User: Id for Latest eprints
Date Deposited: 30 Oct 2017 03:37
Last Modified: 22 Feb 2019 09:30
URI: http://eprints.iisc.ac.in/id/eprint/58124
