Voice source characterization using pitch synchronous discrete cosine transform for speaker identification

Ramakrishnan, AG and Abhiram, B and Prasanna, Mahadeva SR (2015) Voice source characterization using pitch synchronous discrete cosine transform for speaker identification. In: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 137 (6). EL469-EL475.

Official URL: http://dx.doi.org/10.1121/1.4921679

Abstract

A characterization of the voice source (VS) signal by the pitch synchronous (PS) discrete cosine transform (DCT) is proposed. With the integrated linear prediction residual (ILPR) as the VS estimate, the PS DCT of the ILPR is evaluated as a feature vector for speaker identification (SID). On TIMIT and YOHO databases, using a Gaussian mixture model (GMM)-based classifier, it performs on par with existing VS-based features. On the NIST 2003 database, fusion with a GMM-based classifier using MFCC features improves the identification accuracy by 12% in absolute terms, proving that the proposed characterization has good promise as a feature for SID studies. © 2015 Acoustical Society of America
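
As a rough illustration of the idea summarized in the abstract, the sketch below computes a DCT over each pitch cycle of a voice-source estimate and keeps the leading coefficients as a per-cycle feature vector. It is not the authors' implementation: the abstract does not specify how cycle boundaries are found, how cycles are normalized, or how many coefficients are retained. The function names, the `epochs` input (assumed to be precomputed cycle boundaries), the unit-energy normalization, and the `num_coeffs` default are all assumptions made for this example.

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II of a sequence (plain Python, O(N^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def pitch_synchronous_dct(residual, epochs, num_coeffs=20):
    """Return one truncated DCT vector per pitch cycle.

    residual   : samples of a voice-source estimate (for example an ILPR),
                 assumed to be computed elsewhere
    epochs     : increasing sample indices marking cycle boundaries,
                 assumed to be computed elsewhere
    num_coeffs : hypothetical number of DCT coefficients kept per cycle
    """
    features = []
    for start, end in zip(epochs[:-1], epochs[1:]):
        cycle = residual[start:end]
        if len(cycle) < 2:
            continue  # skip degenerate cycles
        energy = math.sqrt(sum(v * v for v in cycle))
        if energy == 0.0:
            continue  # skip silent cycles
        # Assumption: normalize each cycle to unit energy before the DCT.
        cycle = [v / energy for v in cycle]
        coeffs = dct2_ortho(cycle)[:num_coeffs]
        # Zero-pad cycles shorter than num_coeffs to a fixed dimension.
        coeffs += [0.0] * (num_coeffs - len(coeffs))
        features.append(coeffs)
    return features
```

Because the transform is taken over exactly one cycle at a time, the vector length is fixed by `num_coeffs` regardless of the pitch period. Vectors of this kind could then be passed to a GMM-based classifier such as the one the abstract mentions.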

Item Type:	Journal Article
Publication:	JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
Publisher:	ACOUSTICAL SOC AMER AMER INST PHYSICS
Additional Information:	Copyright for this article belongs to the ACOUSTICAL SOC AMER AMER INST PHYSICS, STE 1 NO 1, 2 HUNTINGTON QUADRANGLE, MELVILLE, NY 11747-4502 USA
Department/Centre:	Division of Electrical Sciences > Electrical Engineering
Date Deposited:	31 Jul 2015 13:26
Last Modified:	31 Jul 2015 13:26
URI:	http://eprints.iisc.ac.in/id/eprint/51959


