
Relating articulatory motions in different speaking rates

Singh, Astha and Meenakshi, G Nisha and Ghosh, Prasanta Kumar (2018) Relating articulatory motions in different speaking rates. In: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018, 2-6 Sept. 2018, Hyderabad International Convention Centre (HICC), Hyderabad, pp. 2992-2996.

Official URL: https://dx.doi.org/10.21437/Interspeech.2018-1862


Movements of articulators (e.g., tongue, lips and jaw) at different speaking rates are related in a complex manner. In this work, we examine the underlying function that transforms articulatory movements involved in producing speech at a neutral speaking rate into those at fast and slow speaking rates (N2F and N2S). For this, we use articulatory movement data collected from five subjects using an electromagnetic articulograph at neutral, fast and slow speaking rates. As candidate transformation functions (TFs), we use affine transformations with a diagonal matrix and with a full matrix, and a nonlinear function modeled by a deep neural network (DNN). Since the duration of an utterance typically differs across speaking rates, the articulatory movement trajectories must be time-aligned, and the alignment in turn affects the TF learnt. Therefore, we propose an iterative algorithm to alternately optimize the TF and the time alignments. Subject-specific experiments reveal that while the N2F transformation can be well described by an affine transformation with a full matrix, the N2S transformation is better represented by a more complex nonlinear function modeled by a DNN. This could be because subjects exhibit gross articulatory movements during fast speech and hyper-articulate while producing slow speech.
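
The alternating optimization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the time alignment is computed, so dynamic time warping (DTW) with a Euclidean frame cost is assumed here, and only the full-matrix affine TF is shown (the diagonal and DNN variants are omitted). The function names `dtw_path`, `fit_affine` and `alternate_fit`, and the `n_iter` parameter, are hypothetical.

```python
import numpy as np

def dtw_path(X, Y):
    """DTW alignment between the rows of X and Y (assumed alignment method).

    Returns a list of (i, j) frame pairs from (0, 0) to (len(X)-1, len(Y)-1).
    """
    n, m = len(X), len(Y)
    # Pairwise Euclidean distance between every frame of X and every frame of Y.
    cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the end of both sequences to the start.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def fit_affine(X, Y):
    """Least-squares full-matrix affine map such that Y is approximately X @ A + b."""
    Xa = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    W, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    return W[:-1], W[-1]

def alternate_fit(neutral, target, n_iter=5):
    """Alternate between aligning (DTW) and refitting the affine TF.

    neutral, target: arrays of shape (frames, channels) with equal channel
    counts; n_iter must be at least 1.
    """
    dim = neutral.shape[1]
    A, b = np.eye(dim), np.zeros(dim)  # start from the identity transform
    for _ in range(n_iter):
        # Step 1: align the currently transformed neutral trajectory to the target.
        path = dtw_path(neutral @ A + b, target)
        idx_n, idx_t = map(np.array, zip(*path))
        # Step 2: refit the TF on the aligned frame pairs.
        A, b = fit_affine(neutral[idx_n], target[idx_t])
    return A, b, path
```

Calling `alternate_fit(neutral, target)` on two trajectories of different lengths returns the affine parameters together with the alignment used in the last refit. The sketch runs a fixed number of iterations for simplicity; a convergence check on the aligned error would be a natural alternative.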

Item Type: Conference Paper
Series: Interspeech
Additional Information: 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Hyderabad, INDIA, SEP 02-06, 2018
Keywords: Electromagnetic Articulography; Speaking rate; Deep Neural Network
Department/Centre: Division of Electrical Sciences > Electrical Communication Engineering > Electrical Communication Engineering - Technical Reports
Division of Electrical Sciences > Electrical Engineering
Date Deposited: 29 Jun 2020 11:01
Last Modified: 29 Jun 2020 11:01
URI: http://eprints.iisc.ac.in/id/eprint/62928
