Pitch-synchronous discrete cosine transform features for speaker identification and verification

Meghanani, A and Ramakrishnan, AG (2020) Pitch-synchronous discrete cosine transform features for speaker identification and verification. In: ICPRAM 2020 - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods 2020, 22 - 24 February 2020, Valletta, Malta, pp. 395-401.


Official URL: https://doi.org/10.5220/0008911503950401

Abstract

We propose a feature called the pitch-synchronous discrete cosine transform (PS-DCT), derived from the voiced part of speech, for speaker identification (SID) and speaker verification (SV) tasks. PS-DCT features are derived from the ‘time-domain, quasi-stationary waveform shape’ of voiced sounds. We test the PS-DCT feature on the TIMIT, Mandarin and YOHO datasets. On TIMIT with 168 speakers and Mandarin with 855 speakers, we obtain SID accuracies of 99.4% and 96.1%, respectively, using a Gaussian mixture model-based classifier. In the i-vector-based SV framework, fusing the ‘PS-DCT-based system’ with the ‘MFCC-based system’ at the score level reduces the equal error rate (EER) for both the YOHO and Mandarin datasets. In the case of limited test data and session variabilities, we obtain a significant reduction in EER, up to 5.8% (for test data of duration < 3 sec).
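
The abstract does not spell out how the features are computed, so the following is only an illustrative sketch of one way a pitch-synchronous DCT feature could be formed, not the authors' implementation. It assumes that pitch marks (sample indices of cycle boundaries in voiced speech) are already available from some external detector, that each pitch cycle is linearly resampled to a fixed length and normalised to unit energy, and that the first few DCT-II coefficients are kept. The function names and the parameters cycle_len and n_coeffs are hypothetical.

```python
import math


def resample_linear(cycle, length):
    """Linearly interpolate one pitch cycle to a fixed number of samples."""
    if len(cycle) < 2 or length < 2:
        raise ValueError("need at least two samples on both sides")
    step = (len(cycle) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = min(int(pos), len(cycle) - 2)  # keep lo + 1 inside the cycle
        frac = pos - lo
        out.append((1.0 - frac) * cycle[lo] + frac * cycle[lo + 1])
    return out


def dct2(x):
    """Unnormalised DCT-II of a sequence, computed directly from its definition."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def ps_dct(signal, pitch_marks, cycle_len=64, n_coeffs=20):
    """Return one DCT feature vector per pitch cycle (assumed recipe, see lead-in)."""
    feats = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        cycle = signal[start:end]
        if len(cycle) < 2:
            continue  # too short to resample
        fixed = resample_linear(cycle, cycle_len)
        norm = math.sqrt(sum(v * v for v in fixed))
        if norm == 0.0:
            continue  # silent cycle, nothing to describe
        fixed = [v / norm for v in fixed]  # remove loudness, keep waveform shape
        feats.append(dct2(fixed)[:n_coeffs])
    return feats


# Toy usage: a 100 Hz sinusoid sampled at 8 kHz has an 80-sample period,
# so pitch marks are placed every 80 samples.
fs, f0 = 8000, 100
signal = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]
pitch_marks = list(range(0, 401, fs // f0))
features = ps_dct(signal, pitch_marks)
```

In this toy case every cycle has the same shape, so all feature vectors come out essentially identical; with real voiced speech the vectors would vary from cycle to cycle, and it is such per-cycle vectors that a classifier like the Gaussian mixture model mentioned above could be trained on.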

Item Type:	Conference Paper
Publication:	ICPRAM 2020 - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods
Publisher:	SciTePress
Additional Information:	The copyright of the article belongs to the Authors.
Keywords:	Continuous speech recognition; Gaussian distribution; Loudspeakers; Time domain analysis; Equal error rate; Gaussian Mixture Model; MFCC; Pitch synchronous; Quasi-stationary; Speaker identification; Speaker verification; Waveform shape; Discrete cosine transforms
Department/Centre:	Division of Electrical Sciences > Electrical Engineering
Date Deposited:	29 Sep 2020 11:11
Last Modified:	05 Dec 2023 09:45
URI:	https://eprints.iisc.ac.in/id/eprint/65176
