
Estimation of the air-tissue boundaries of the vocal tract in the mid-sagittal plane from electromagnetic articulograph data

Parida, Satyabrata and Kumar, Pattern Ashok and Ghosh, Prasanta Kumar (2015) Estimation of the air-tissue boundaries of the vocal tract in the mid-sagittal plane from electromagnetic articulograph data. In: 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Sep 06-10, 2015, Dresden, Germany, pp. 2147-2151.

Official URL: http://spire.ee.iisc.ernet.in/spire/papers_pdf/sat...

Abstract

An electromagnetic articulograph (EMA) provides movement data from sensors attached to a few flesh points on different speech articulators, including the lips, jaw, and tongue, while a subject speaks. In this work, we quantify the amount of information these flesh points provide about the vocal tract (VT) shape in the mid-sagittal plane. The VT shape is described by the air-tissue boundaries (ATBs), which are obtained manually from real-time magnetic resonance imaging (rtMRI) recordings of a set of utterances spoken by a subject for whom EMA recordings of the same utterances are also available. We propose a two-stage approach for reconstructing the VT shape from the EMA data. The first stage co-registers the EMA data with the VT shape from the rtMRI frames. The second stage estimates the ATBs from the co-registered EMA points. Co-registration is done by a spatio-temporal alignment of the VT shapes from the rtMRI frames and the EMA sensor data, while a radial basis function (RBF) network is used for estimating the ATBs. Experiments with the EMA and rtMRI recordings of five sentences spoken by one male and one female speaker show that the VT shape in the mid-sagittal plane can be recovered from the EMA flesh points with average reconstruction errors of 2.55 mm and 2.75 mm, respectively.
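
As an illustration of the second stage only, the following is a minimal sketch of how an RBF network could map co-registered sensor coordinates to boundary points. It is not the authors' implementation: the Gaussian kernel, the kernel width, the random choice of centers, the ridge term, the sensor and boundary-point counts, and the synthetic data are all assumptions made for the example, and the error it computes says nothing about the results reported in the abstract.

```python
import numpy as np

def gaussian_design(X, centers, width):
    """Gaussian RBF activations for each (input, center) pair."""
    # Squared Euclidean distance between every input row and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def add_bias(Phi):
    """Append a constant column so the output layer has a bias term."""
    return np.hstack([Phi, np.ones((Phi.shape[0], 1))])

def fit_rbf(X, Y, centers, width, ridge=1e-6):
    """Solve for linear output weights by ridge-regularised least squares."""
    Phi = add_bias(gaussian_design(X, centers, width))
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y)

def predict_rbf(X, centers, width, W):
    """Evaluate the fitted network on new inputs."""
    return add_bias(gaussian_design(X, centers, width)) @ W

def mean_point_error(Y_true, Y_pred):
    """Mean Euclidean distance between matching 2-D boundary points."""
    diff = (Y_true - Y_pred).reshape(Y_true.shape[0], -1, 2)
    return float(np.linalg.norm(diff, axis=2).mean())

# Synthetic stand-in data (assumption): 6 sensors and 40 boundary points,
# each with (x, y) coordinates in the mid-sagittal plane.
rng = np.random.default_rng(0)
n_frames, n_sensors, n_boundary = 200, 6, 40
X = rng.normal(size=(n_frames, 2 * n_sensors))
M = rng.normal(size=(2 * n_sensors, 2 * n_boundary))
Y = 10.0 * np.tanh(X @ M / 4.0)  # smooth hypothetical sensor-to-boundary map

# Hold out the last 50 frames for evaluation.
X_train, Y_train = X[:150], Y[:150]
X_test, Y_test = X[150:], Y[150:]

# Hypothetical hyperparameters: 30 centers drawn from the training frames.
centers = X_train[rng.choice(len(X_train), size=30, replace=False)]
width = 5.0

W = fit_rbf(X_train, Y_train, centers, width)
Y_pred = predict_rbf(X_test, centers, width, W)
err = mean_point_error(Y_test, Y_pred)
print(f"mean per-point error on synthetic data: {err:.3f}")
```

In a real setting the inputs would be the co-registered EMA sensor positions and the targets the manually obtained ATB points from the matching rtMRI frames; the width and number of centers would need to be tuned on held-out data.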

Item Type: Conference Proceedings
Additional Information: Copyright for this article belongs to the ISCA-INT SPEECH COMMUNICATION ASSOC, C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, BAIXAS, F-66390, FRANCE
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 28 Oct 2016 07:01
Last Modified: 28 Oct 2016 07:01
URI: http://eprints.iisc.ac.in/id/eprint/55129
