Sharma, Neeraj and Potadar, Shreepad and Chetupalli, Srikanth Raj and Sreenivas, TV (2017) Mel-scale sub-band modelling for perceptually improved time-scale modification of speech and audio signals. In: 23rd National Conference on Communications, NCC 2017, 02-04 March 2017, Chennai, India, pp. 1-5.
Abstract
Good-quality time-scale modification (TSM) of speech and audio is a long-standing challenge. The crux of the challenge is to maintain the perceptual subtleties of temporal variations in pitch and timbre even after time-scaling the signal. Widely used approaches, such as the phase vocoder and waveform overlap-add (OLA), are based on a quasi-stationary assumption, and the time-scaled signals have perceivable artifacts. In contrast to these approaches, we propose the application of time-varying sinusoidal modeling for TSM, without any quasi-stationary assumption. The proposed model comprises a mel-scale nonuniform-bandwidth filter bank and the instantaneous amplitude (IA) and instantaneous phase (IP) factorization of sub-band time-varying sinusoids. TSM of the signal is done by time-scaling the IA and IP in each sub-band. The lowpass nature of the IA and IP allows for time-scaling via interpolation. Formal listening tests on speech and music (solo and polyphonic) show a reduction in TSM artifacts such as phasiness and transient smearing. Further, the proposed approach gives improved quality in comparison to waveform synchronous OLA (WSOLA), the phase vocoder with identity phase locking, and the recently proposed harmonic-percussive separation (HPS) based TSM methods. The obtained improvement in TSM quality highlights that speech analysis can benefit from an appropriate choice of time-varying signal models.
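The IA/IP interpolation step described in the abstract can be illustrated with a minimal sketch for a single narrowband signal. This is not the authors' implementation: the mel-scale filter bank is omitted, the IA and IP are estimated here from an FFT-based analytic signal (an assumption, since the abstract does not say how they are obtained), linear interpolation is used for simplicity, and the function names `analytic_signal` and `time_scale_subband` are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array using the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def time_scale_subband(x, alpha):
    """Stretch one narrowband signal by a factor alpha (assumed > 0).

    Assumption for illustration: IA and IP come from the analytic signal
    and are time-scaled by linear interpolation.
    """
    z = analytic_signal(x)
    ia = np.abs(z)                 # instantaneous amplitude
    ip = np.unwrap(np.angle(z))    # instantaneous phase, unwrapped

    n_in = len(x)
    n_out = int(round(alpha * n_in))
    t_in = np.arange(n_in)
    # Output sample m reads the envelopes at input time m / alpha.
    t_query = np.arange(n_out) / alpha

    ia_scaled = np.interp(t_query, t_in, ia)
    # Multiplying the stretched phase by alpha keeps the instantaneous
    # frequency (the phase derivative) the same as in the input.
    ip_scaled = alpha * np.interp(t_query, t_in, ip)
    return ia_scaled * np.cos(ip_scaled)
```

For example, calling `time_scale_subband(x, 1.5)` on a steady sinusoid `x` returns a signal 1.5 times as long whose frequency is unchanged. A full system along the lines of the abstract would apply such a step to every sub-band of the filter bank and sum the results.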
Item Type: | Conference Paper |
---|---|
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Additional Information: | The copyright of this article belongs to the Institute of Electrical and Electronics Engineers Inc. |
Keywords: | Locks (fasteners); Bandwidth filters; Instantaneous amplitude; Instantaneous phase; Quasi-stationary; Sinusoidal model; Temporal variation; Time varying signal; Time-scale modification; Vocoders |
Department/Centre: | Division of Electrical Sciences > Electrical Communication Engineering |
Date Deposited: | 13 Jun 2022 05:55 |
Last Modified: | 13 Jun 2022 05:55 |
URI: | https://eprints.iisc.ac.in/id/eprint/73303 |