ePrints@IIScePrints@IISc Home | About | Browse | Latest Additions | Advanced Search | Contact | Help

LZW Based Distance Measures for Spoken Language Identification

Basavaraja, SV and Sreenivas, TV (2006) LZW Based Distance Measures for Spoken Language Identification. In: IEEE Odyssey 2006: The Speaker and Language Recognition Workshop, 2006., 28-30 June 2006 , San Juan.

[img] PDF
LZW_BASED_DISTANCE.pdf - Published Version
Restricted to Registered users only

Download (435kB) | Request a copy
Official URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumb...

Abstract

We present a new approach to spoken language modeling for language identification (LID) using the Lempel-Ziv-Welch (LZW) algorithm. The LZW technique is applicable to any kind of tokenization of the speech signal. Because of the efficiency of LZW algorithm to obtain variable length symbol strings in the training data, the LZW codebook captures the essentials of a language effectively. We develop two new deterministic measures for LID based on the LZW algorithm namely: (i) Compression ratio score (LZW-CR) and (ii) weighted discriminant score (LZW-WDS). To assess these measures, we consider error-free tokenization of speech as well as artificially induced noise in the tokenization. It is shown that for a 6 language LID task of OGI-TS database with clean tokenization, the new model (LZW-WDS) performs slightly better than the conventional bigram model. For noisy tokenization, which is the more realistic case, LZW-WDS significantly outperforms the bigram technique

Item Type: Conference Paper
Publisher: IEEE
Additional Information: Copyright 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Department/Centre: Division of Electrical Sciences > Electrical Communication Engineering
Date Deposited: 11 Nov 2011 06:13
Last Modified: 11 Nov 2011 06:13
URI: http://eprints.iisc.ac.in/id/eprint/42029

Actions (login required)

View Item View Item