Upadhyaya, Prashant and Mittal, Sanjeev Kumar and Varshney, Yash Vardhan and Farooq, Omar and Abidi, Musiur Raza (2017) Speaker Adaptive Model for Hindi Speech using Kaldi Speech Recognition toolkit. In: International Conference on Multimedia, Signal Processing and Communication Technologies (IMPACT), NOV 24-26, 2017, Aligarh, INDIA, pp. 222-226.
Abstract
Speech communication is fast gaining market penetration as a preferred input for human-computer interfaces (HCI) and is finding its way from the academic research setup into commercial applications. For public applications, acceptance is determined not only by accuracy and reliability but also by ease of use and habituation. In this work, we show that the accuracy of a system can be enhanced using speaker adaptive training (SAT). The Kaldi speech recognition toolkit was used to evaluate the performance of our Hindi speech model. Acoustic features were extracted using MFCC and PLP from 1000 phonetically balanced Hindi sentences from the AMUAV corpus. The acoustic model was trained using Hidden Markov Models and Gaussian Mixture Models (HMM-GMM), and decoding was performed using Weighted Finite State Transducers (WFSTs). A maximum improvement of 6.93% in word error rate over the monophone model is obtained when speaker adaptive training is used along with the Linear Discriminant Analysis-Maximum Likelihood Linear Transform (LDA-MLLT) model.
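The abstract reports its result as a change in word error rate (WER). As a reading aid, the sketch below computes WER in its usual form: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. It is a minimal illustration, not the scoring procedure used in the paper; the function name `word_error_rate` and the sample strings are hypothetical, and the abstract does not state whether the 6.93% figure is an absolute or a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper (hypothetical name); words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (one row of the Levenshtein table).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder transcripts: one substitution and one deletion over four words.
example_wer = word_error_rate("a b c d", "a x c")
```

Comparing two systems then amounts to computing this value for each system over the same test set and taking the difference.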
Item Type: | Conference Proceedings |
---|---|
Publisher: | IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA |
Additional Information: | Copyright of this article belongs to IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA |
Department/Centre: | Division of Electrical Sciences > Electrical Communication Engineering |
Date Deposited: | 28 Jun 2018 14:31 |
Last Modified: | 28 Jun 2018 14:31 |
URI: | http://eprints.iisc.ac.in/id/eprint/60104 |