Singh, Astha and Meenakshi, G Nisha and Ghosh, Prasanta Kumar (2018) Relating articulatory motions in different speaking rates. In: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018, 2-6 Sept. 2018, Hyderabad International Convention Centre (HICC), Hyderabad, pp. 2992-2996.
Abstract
Movements of articulators (e.g., tongue, lips and jaw) at different speaking rates are related in a complex manner. In this work, we examine the underlying function that transforms the articulatory movements involved in producing speech at a neutral speaking rate into those at fast and slow speaking rates (N2F and N2S). For this, we use articulatory movement data collected from five subjects with an electromagnetic articulograph at neutral, fast and slow speaking rates. As candidate transformation functions (TF), we use affine transformations with a diagonal matrix and with a full matrix, and a nonlinear function modeled by a deep neural network (DNN). Since the duration of an utterance typically differs across speaking rates, the articulatory movement trajectories must be time-aligned, and the alignment in turn affects the TF learnt. Therefore, we propose an iterative algorithm that alternately optimizes the TF and the time alignments. Subject-specific experiments reveal that while the N2F transformation can be well described by an affine transformation with a full matrix, the N2S transformation is better represented by a more complex nonlinear function modeled by a DNN. This could be because subjects exhibit gross articulatory movements during fast speech and hyper-articulate while producing slow speech.
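The alternating optimization described in the abstract can be illustrated with a minimal sketch. The abstract does not state the alignment method, the objective, or the stopping rule, so everything below is an assumption: dynamic time warping (DTW) is used for the time alignment, the TF is the diagonal-matrix affine candidate fitted by per-dimension least squares, and a fixed number of iterations stands in for a convergence check. All function names (`dtw_path`, `fit_diagonal_affine`, `apply_tf`, `alternate_tf_and_alignment`) are hypothetical and are not taken from the paper.

```python
import math


def dtw_path(a, b):
    """Return a DTW warping path (list of index pairs) between frame lists a and b."""
    ta, tb = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (tb + 1) for _ in range(ta + 1)]
    cost[0][0] = 0.0
    for i in range(1, ta + 1):
        for j in range(1, tb + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the last frame pair; ties prefer the diagonal step.
    i, j, path = ta, tb, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    return path[::-1]


def fit_diagonal_affine(x, y):
    """Least-squares fit of y[k] ~ scale[k] * x[k] + offset[k], one pair per dimension."""
    n, dims = len(x), len(x[0])
    scales, offsets = [], []
    for k in range(dims):
        xs = [frame[k] for frame in x]
        ys = [frame[k] for frame in y]
        mx, my = sum(xs) / n, sum(ys) / n
        var = sum((u - mx) ** 2 for u in xs)
        cov = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
        s = cov / var if var > 0 else 1.0  # fall back to unit scale for a flat channel
        scales.append(s)
        offsets.append(my - s * mx)
    return scales, offsets


def apply_tf(x, scales, offsets):
    """Apply the diagonal affine TF to every frame of x."""
    return [[s * v + o for v, s, o in zip(frame, scales, offsets)] for frame in x]


def alternate_tf_and_alignment(neutral, target, n_iter=5):
    """Alternate between fitting the TF on aligned frames and re-aligning (assumed scheme)."""
    path = dtw_path(neutral, target)  # initial alignment on the raw trajectories
    for _ in range(n_iter):
        x = [neutral[i] for i, _ in path]
        y = [target[j] for _, j in path]
        scales, offsets = fit_diagonal_affine(x, y)  # TF step, alignment held fixed
        path = dtw_path(apply_tf(neutral, scales, offsets), target)  # alignment step, TF held fixed
    return scales, offsets, path
```

Here `neutral` and `target` would each be a list of frames, one list of sensor coordinates per time step. Swapping `fit_diagonal_affine` for a full-matrix least-squares fit or a trained DNN would give the other two candidate TFs named in the abstract while leaving the alternating loop unchanged.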
Item Type: | Conference Paper |
---|---|
Series: | Interspeech |
Publisher: | ISCA-INT SPEECH COMMUNICATION ASSOC |
Additional Information: | 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Hyderabad, INDIA, SEP 02-06, 2018 |
Keywords: | Electromagnetic Articulography; Speaking rate; Deep Neural Network |
Department/Centre: | Division of Electrical Sciences > Electrical Communication Engineering > Electrical Communication Engineering - Technical Reports; Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 29 Jun 2020 11:01 |
Last Modified: | 29 Jun 2020 11:01 |
URI: | http://eprints.iisc.ac.in/id/eprint/62928 |