Sainathan, Abhilash and Rudresh, Sunil and Seelamantula, Chandra Sekhar (2018) An Optimization Framework for Recovery of Speech From Phase-Encoded Spectrograms. In: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018, 2 September 2018 through 6 September 2018, Hyderabad International Convention Centre (HICC), Hyderabad, India, pp. 741-745.
Abstract
In general, reconstruction of a speech signal from the spectrogram is non-unique because of the unavailability of the phase spectrum. Simply assuming zero phase does not resolve the ambiguity, since it yields a symmetric (zero-phase) signal rather than the original one. This limitation is overcome by computing the recently introduced phase-encoded spectrogram. In this approach, one modifies each frame of a speech signal to possess the causal, delta-dominant (CDD) property prior to computing the spectrogram. In an earlier publication, we showed that finite-length CDD sequences can be retrieved exactly from their magnitude spectra using a cepstrum technique. Although exactness is guaranteed in principle, practical implementations result in a limited, but high, reconstruction accuracy. In this paper, we focus on increasing the reconstruction accuracy. We formulate the reconstruction problem within an optimization framework and deploy a recently proposed iterative, alternating direction method of multipliers (ADMM) algorithm called autocorrelation retrieval and Kolmogorov factorization (CoRK). Experimental validations show that the CoRK algorithm results in a reconstruction accurate up to machine precision. We also show that both CoRK and cepstrum techniques are robust and invariant to the choice of the window duration, the amount of overlap between consecutive speech frames, the strength of the delta used to impart the CDD property, and the presence of noise.
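To make the CDD idea concrete, the following is a minimal sketch of the cepstrum baseline mentioned in the abstract, not of the CoRK algorithm and not the authors' implementation. It assumes that a frame is made CDD by adding a delta at n = 0 whose strength is a multiple of the frame's absolute sum, and that the magnitude spectrum is sampled on a heavily zero-padded DFT grid to keep cepstral aliasing small. The function names `make_cdd` and `recover_from_magnitude`, the `strength` rule, and the FFT size are all hypothetical choices made for illustration.

```python
import numpy as np

def make_cdd(frame, strength=2.0):
    """Add a delta at n = 0 so that x[0] exceeds the sum of |x[n]| for n >= 1."""
    x = np.asarray(frame, dtype=float).copy()
    delta = strength * np.sum(np.abs(x))  # assumed rule for choosing the delta
    x[0] += delta
    return x, delta

def recover_from_magnitude(mag, length):
    """Minimum-phase recovery from an even-length DFT magnitude via the cepstrum."""
    n_fft = len(mag)
    # Real cepstrum: inverse DFT of the log-magnitude spectrum.
    real_cep = np.fft.ifft(np.log(mag)).real
    # Fold the real cepstrum onto non-negative time to get the complex
    # cepstrum of the minimum-phase sequence with this magnitude.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    min_phase_spec = np.exp(np.fft.fft(fold * real_cep))
    return np.fft.ifft(min_phase_spec).real[:length]

# Synthetic stand-in for one speech frame (not real speech data).
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)

cdd, delta = make_cdd(frame)
mag = np.abs(np.fft.fft(cdd, 4096))      # phase is discarded here
rec = recover_from_magnitude(mag, len(cdd))
rec[0] -= delta                          # undo the delta to get the frame back
max_err = np.max(np.abs(rec - frame))
print(max_err)
```

The recovery works because a sequence dominated by its first sample has all its zeros inside the unit circle, so it is the minimum-phase sequence associated with its magnitude spectrum. The residual error in this sketch comes from floating-point arithmetic and from cepstral aliasing on the finite DFT grid.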
Item Type: | Conference Paper |
---|---|
Series: | Interspeech |
Publisher: | ISCA-INT SPEECH COMMUNICATION ASSOC |
Additional Information: | 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Hyderabad, India, 2-6 September 2018 |
Keywords: | Phase-encoded spectrogram; causal delta dominant sequence; autocorrelation retrieval and Kolmogorov factorization (CoRK); cepstrum |
Department/Centre: | Division of Electrical Sciences > Electronic Systems Engineering (Formerly Centre for Electronic Design & Technology); Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 12 Mar 2020 08:46 |
Last Modified: | 12 Mar 2020 08:46 |
URI: | http://eprints.iisc.ac.in/id/eprint/62916 |