Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection

Koluguri, Nithin Rao and Meenakshi, G Nisha and Ghosh, Prasanta Kumar (2017) Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection. In: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25 (6). pp. 1183-1192. ISSN 2329-9290

Official URL: https://doi.org/10.1109/TASLP.2017.2690562

Abstract

Bird sound detection from real-field recordings is essential for identifying bird species in bioacoustic monitoring. Variations in recording devices and environmental conditions, together with vocalizations from other animals, make bird sound detection very challenging. To overcome these challenges, we propose an unsupervised algorithm comprising two main stages. In the first stage, a spectrogram enhancement technique is proposed using a multiple window Savitzky-Golay (MWSG) filter. We show that the spectrogram estimate using the MWSG filter is unbiased and has lower variance than its single-window counterpart. It is known that bird sounds are highly structured in the time-frequency (T-F) plane. In the second stage, we exploit this structure, namely the prominence of T-F activity in specific directions in the enhanced spectrogram, for bird sound detection. To this end, we use a set of four moving average filters that, when applied to the enhanced spectrogram, yield directional spectrograms that capture direction-specific information. We propose a thresholding scheme on the time-varying energy profile computed from each of these directional spectrograms to obtain frame-level binary decisions of bird sound activity. These individual decisions are then combined to obtain the final decision. Experiments are performed with three different datasets, with varying recording and noise conditions. Frame-level F-score is used as the evaluation metric for bird sound detection. We find that the proposed method, on average, achieves a 10.24% relative improvement in F-score over the best of the six baseline schemes considered in this work.
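
The two-stage pipeline described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not give the window lengths, the polynomial order, the rule for combining the windows, the filter directions, the energy definition, the threshold, or the rule for combining decisions. All of these are assumptions made here for illustration (averaging over windows 5, 9 and 13 with order 2; horizontal, vertical and two diagonal directions; squared-magnitude energy; a mean-plus-k-standard-deviations threshold; a logical OR across directions), and every function name is hypothetical. The spectrogram is assumed to be a 2-D array with axes (frequency, time).

```python
import numpy as np

def sg_kernel(window, order):
    """Savitzky-Golay smoothing weights for an odd window length."""
    half = window // 2
    # Least-squares polynomial fit over the window: row 0 of the
    # pseudo-inverse evaluates the fitted polynomial at the window centre.
    A = np.vander(np.arange(-half, half + 1, dtype=float), order + 1, increasing=True)
    return np.linalg.pinv(A)[0]

def mwsg_enhance(spec, windows=(5, 9, 13), order=2):
    """Average SG-smoothed spectrograms over several windows (assumed rule)."""
    out = np.zeros(spec.shape, dtype=float)
    for w in windows:
        k = sg_kernel(w, order)
        # Smooth every frequency bin along time; edges are zero-padded.
        out += np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, spec)
    return out / len(windows)

# Assumed directions, given as (frequency step, time step).
DIRECTIONS = {"horizontal": (0, 1), "vertical": (1, 0),
              "diag_up": (1, 1), "diag_down": (-1, 1)}

def directional_average(spec, step, length=5):
    """Moving average along one direction; np.roll wraps around at the borders."""
    half = length // 2
    shifted = [np.roll(spec, (j * step[0], j * step[1]), axis=(0, 1))
               for j in range(-half, half + 1)]
    return np.mean(shifted, axis=0)

def detect(spec, k=1.0):
    """Frame-level boolean mask of detected activity (assumed threshold and OR rule)."""
    enhanced = mwsg_enhance(spec)
    decisions = []
    for step in DIRECTIONS.values():
        directional = directional_average(enhanced, step)
        energy = (directional ** 2).sum(axis=0)  # energy profile over time
        decisions.append(energy > energy.mean() + k * energy.std())
    return np.any(decisions, axis=0)
```

Calling detect(spec) on a (frequency, time) array returns one boolean per frame. The squared energy is what makes the direction matter in this sketch: activity aligned with a filter's direction survives the averaging, while activity in other directions is attenuated before the energy is computed.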

Item Type: Journal Article
Publication: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this article belongs to the Institute of Electrical and Electronics Engineers Inc.
Keywords: Bioacoustic monitoring; bird sound detection; directional spectrograms; Savitzky-Golay filter
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Date Deposited: 13 Jun 2022 11:53
Last Modified: 13 Jun 2022 11:53
URI: https://eprints.iisc.ac.in/id/eprint/73400
