Hazarika, Devamanyu and Gorantla, Sruthi and Poria, Soujanya and Zimmermann, Roger (2018) Self-Attentive Feature-level Fusion for Multimodal Emotion Detection. In: IEEE Conference on Multimedia Information Processing and Retrieval 2018, 10-12 April 2018, Miami, FL, USA, pp. 196-201.
Abstract
Multimodal emotion recognition is the task of detecting emotions present in user-generated multimedia content. Such resources contain complementary information across multiple modalities. A stiff challenge often faced is the complexity associated with feature-level fusion of these heterogeneous modes. In this paper, we propose a new feature-level fusion method based on the self-attention mechanism. We also compare it with traditional fusion methods such as concatenation and the outer product. Evaluated on textual and speech (audio) modalities, our results suggest that the proposed fusion method outperforms the others for utterance-level emotion recognition in videos.
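The abstract contrasts self-attentive fusion with simple concatenation of modality features. The following is a minimal NumPy sketch of the general idea, not the paper's exact architecture: the text and audio feature vectors (names and dimensions here are illustrative assumptions) are stacked into a short sequence, scaled dot-product self-attention re-weights each modality against the other, and the attended vectors are flattened into a single fused representation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attentive_fusion(text_feat, audio_feat):
    """Illustrative self-attention fusion of two modality vectors.

    text_feat, audio_feat: 1-D arrays of equal length d (hypothetical
    utterance-level features; the paper's actual extractors differ).
    Returns a fused vector of length 2*d.
    """
    X = np.stack([text_feat, audio_feat])      # (2, d) modality "sequence"
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)              # (2, 2) pairwise affinities
    weights = softmax(scores, axis=-1)         # each row sums to 1
    attended = weights @ X                     # (2, d) cross-modal mixtures
    return attended.reshape(-1)                # fused (2*d,) representation

def concat_fusion(text_feat, audio_feat):
    # Baseline: plain concatenation, as compared against in the paper
    return np.concatenate([text_feat, audio_feat])
```

Both fusion functions emit a vector of the same length, so they can be swapped behind a downstream emotion classifier for comparison; the self-attentive variant differs in that each modality's contribution is re-weighted by its affinity with the other.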
Item Type: Conference Proceedings
Publisher: IEEE
Additional Information: Copyright for this article belongs to IEEE
Keywords: Multimodal emotion recognition; Feature-level fusion; Self-attention
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 23 May 2019 09:34
Last Modified: 23 May 2019 09:34
URI: http://eprints.iisc.ac.in/id/eprint/62756