Ramakrishnan, AG and Krishnan, G and Srivathsan, S (2018) Voice activity detection from the breathing pattern of the speaker. In: 14th IEEE India Council International Conference, INDICON 2017, 15 - 17 December 2017, Roorkee.
Abstract
In this paper, we propose a method to perform voice activity detection using only the breathing signal of a speaker. Human breathing and speech production go hand in hand, but normal respiration and respiration during speech have different profiles: the former is generally symmetric, whereas the latter is asymmetric. Impedance pneumography provides a mechanism to capture chest expansions and compressions due to breathing. We have recorded the breathing signal along with the speech audio for 44 subjects while they were speaking and while they were quiet. We have classified breathing cycles into two classes, namely during speech and normal, using the cycle-synchronous discrete cosine transform coefficients of the breathing signal with different classifiers. The best accuracy of 96.4% is obtained using the k-nearest neighbor classifier. From the classified breathing cycles, we determine the intervals when a subject is quiet and when the subject is speaking. We use the corresponding timeframes on the simultaneously recorded audio and achieve a good accuracy in voice activity detection. Compared to the earlier reported time resolution of 30 sec, we obtain a decision for every breathing cycle, which works out to an average resolution of about 3 sec.
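The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic cycle shapes, the resampling length (64 points), the number of DCT coefficients (8), the value k = 3, and every function name below are assumptions made purely for illustration. Each breathing cycle is resampled to a fixed length so that the DCT is computed once per cycle (one plausible reading of "cycle-synchronous"), and a plain k-nearest-neighbor vote labels the cycle as "normal" or "speech".

```python
import math
import random

def resample(cycle, n):
    """Linearly interpolate one breathing cycle to n samples."""
    m = len(cycle)
    out = []
    for i in range(n):
        pos = i * (m - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        out.append(cycle[lo] * (1 - frac) + cycle[hi] * frac)
    return out

def dct2(x, n_coeffs):
    """First n_coeffs unnormalised DCT-II coefficients of x."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n_coeffs)]

def features(cycle, n_points=64, n_coeffs=8):
    """Per-cycle feature vector: DCT of the length-normalised cycle."""
    return dct2(resample(cycle, n_points), n_coeffs)

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    nearest = sorted((math.dist(tx, x), ty) for tx, ty in zip(train_x, train_y))
    votes = [label for _, label in nearest[:k]]
    return max(set(votes), key=votes.count)

def synth_cycle(rng, inhale_fraction):
    """Hypothetical cycle: rise during inhalation, fall during exhalation."""
    n = rng.randint(80, 160)                      # cycles differ in duration
    split = max(2, int(n * inhale_fraction))
    rise = [0.5 - 0.5 * math.cos(math.pi * i / (split - 1)) for i in range(split)]
    fall = [0.5 + 0.5 * math.cos(math.pi * i / (n - split - 1))
            for i in range(n - split)]
    return [v + rng.gauss(0, 0.02) for v in rise + fall]

def make_set(rng, per_class):
    """Assumed shapes: 'normal' is near-symmetric, 'speech' has a short inhalation."""
    xs, ys = [], []
    for _ in range(per_class):
        xs.append(features(synth_cycle(rng, rng.uniform(0.45, 0.55))))
        ys.append("normal")
        xs.append(features(synth_cycle(rng, rng.uniform(0.12, 0.25))))
        ys.append("speech")
    return xs, ys

def demo(seed=0):
    """Train on synthetic cycles and return accuracy on held-out synthetic cycles."""
    rng = random.Random(seed)
    train_x, train_y = make_set(rng, 40)
    test_x, test_y = make_set(rng, 20)
    hits = sum(knn_predict(train_x, train_y, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)

if __name__ == "__main__":
    print("accuracy on synthetic cycles:", demo())
```

Because one label is produced per breathing cycle, the time resolution of the resulting speech/quiet decision equals the cycle duration, which is the point the abstract makes when comparing roughly 3 sec against the earlier 30 sec. The accuracy printed by this sketch reflects only the synthetic data and says nothing about the 96.4% reported for the recorded subjects.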
Item Type: | Conference Paper |
---|---|
Publication: | 2017 14th IEEE India Council International Conference, INDICON 2017 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Additional Information: | The copyright for this article belongs to the IEEE. |
Keywords: | Discrete cosine transforms; Image coding; Nearest neighbor search; Speech; Support vector machines; Asymmetric profile; Breathing patterns; Cycle-synchronous DCT; Discrete cosine transform coefficients; Impedance pneumography; K-nearest neighbor classifier; Speech production; Voice activity detection; Speech recognition |
Department/Centre: | Division of Electrical Sciences > Electrical Engineering |
Date Deposited: | 03 Aug 2022 06:41 |
Last Modified: | 03 Aug 2022 06:41 |
URI: | https://eprints.iisc.ac.in/id/eprint/75211 |