ePrints@IISc

Online Estimation of Evolving Human Visual Interest

Katti, Harish and Rajagopal, Anoop Kolar and Kankanhalli, Mohan and Kalpathi, Ramakrishnan (2014) Online Estimation of Evolving Human Visual Interest. In: ACM Transactions on Multimedia Computing, Communications, and Applications, 11(1).

PDF: acm_tra_mul_com_com_app_11-1_2014.pdf - Published Version (10MB; restricted to registered users only)
Official URL: http://dx.doi.org/10.1145/2632284


Regions in video streams that attract human interest contribute significantly to human understanding of the video. Predicting salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions, and smart insertion of dialog (closed-caption text) into the video stream, can be significantly improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency in yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical and algorithmic method, gaze buffering, is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, making videos suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalization of the application to suit the biases and preferences of individual users.
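The abstract describes gaze buffering only at a high level: buffering recent gaze samples, handling temporal discontinuities from sudden eye movements, and fusing the result with content-based features. A minimal sketch of that idea, under the assumption that buffering means averaging recent fixations, discarding saccade-like jumps, and linearly weighting gaze against a content-derived ROI (all names, thresholds, and weights below are illustrative, not the authors' implementation):

```python
# Hypothetical sketch of "gaze buffering": keep a short buffer of recent
# (x, y) gaze samples, reset it on saccade-like jumps (temporal
# discontinuities), and fuse the smoothed fixation with a content-based
# ROI center. Parameters here are illustrative assumptions.
from collections import deque

class GazeBuffer:
    def __init__(self, maxlen=10, saccade_thresh=100.0):
        self.buf = deque(maxlen=maxlen)       # recent (x, y) gaze samples
        self.saccade_thresh = saccade_thresh  # pixel jump treated as a saccade

    def add(self, x, y):
        if self.buf:
            px, py = self.buf[-1]
            # A sudden large displacement is treated as a saccade or noise:
            # clear the buffer rather than blend across the discontinuity.
            if ((x - px) ** 2 + (y - py) ** 2) ** 0.5 > self.saccade_thresh:
                self.buf.clear()
        self.buf.append((x, y))

    def fixation(self):
        """Smoothed fixation estimate: mean of buffered samples, or None."""
        if not self.buf:
            return None
        n = len(self.buf)
        return (sum(p[0] for p in self.buf) / n,
                sum(p[1] for p in self.buf) / n)

def fuse(gaze_xy, content_xy, w_gaze=0.7):
    """Linear fusion of gaze-based and content-based ROI centers."""
    if gaze_xy is None:
        return content_xy
    return (w_gaze * gaze_xy[0] + (1 - w_gaze) * content_xy[0],
            w_gaze * gaze_xy[1] + (1 - w_gaze) * content_xy[1])

gb = GazeBuffer()
for x, y in [(100, 100), (102, 98), (101, 101)]:
    gb.add(x, y)
roi = fuse(gb.fixation(), content_xy=(120, 90))
```

The saccade reset is the key design choice: averaging across a discontinuity would place the predicted ROI between two unrelated fixation targets, so the buffer restarts instead.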

Item Type: Journal Article
Additional Information: Copyright for this article belongs to the Association for Computing Machinery, 2 Penn Plaza, Ste 701, New York, NY 10121-0701, USA.
Department/Centre: Division of Biological Sciences > Centre for Neuroscience
Date Deposited: 18 Oct 2014 06:48
Last Modified: 18 Oct 2014 06:48
URI: http://eprints.iisc.ac.in/id/eprint/50032
