Automated localization of affective objects and actions in images via caption text-cum-eye gaze analysis

Subramanian Ramanathan, Harish Katti, Raymond Huang, Tat Seng Chua, Mohan Kankanhalli

Research output: A Conference proceeding or a Chapter in Book › Conference contribution › peer-review

25 Citations (Scopus)

Abstract

We propose a novel framework to localize and label affective objects and actions in images through a combination of text, visual, and gaze-based analysis. Human gaze provides useful cues for inferring the locations and interactions of affective objects. While the concepts (labels) associated with an image can be determined from its caption, we demonstrate that these concepts can be localized by learning a statistical affect model for world concepts. The affect model is derived from non-invasively acquired fixation patterns on labeled images, and it guides the localization of affective objects (e.g., faces, reptiles) and actions (e.g., look, read) from fixations in unlabeled images. Experimental results on a database of 500 images confirm the effectiveness and promise of the proposed approach.
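To make the gaze-driven localization idea concrete, below is a minimal sketch, not the authors' actual method: assuming an (N, 2) array of fixation coordinates and optional per-fixation weights (e.g., scores from a learned affect model), a weighted centroid and spread give a rough bounding box for the captioned concept. The function name, weighting scheme, and padding parameter are all illustrative assumptions.

```python
import numpy as np

def localize_concept(fixations, weights=None, pad=30.0):
    """Hypothetical sketch: estimate a bounding box for a captioned
    concept from eye-gaze fixations.

    fixations : (N, 2) array of (x, y) fixation coordinates
    weights   : optional per-fixation scores, e.g. from an affect
                model (assumed here; not the paper's actual model)
    pad       : padding in pixels to loosen the box
    """
    fixations = np.asarray(fixations, dtype=float)
    if weights is None:
        weights = np.ones(len(fixations))
    weights = np.asarray(weights, dtype=float)

    # Weighted centroid of fixations approximates the object's center.
    center = np.average(fixations, axis=0, weights=weights)

    # Weighted standard deviation gives a rough spatial extent.
    spread = np.sqrt(np.average((fixations - center) ** 2,
                                axis=0, weights=weights))

    x0, y0 = center - spread - pad
    x1, y1 = center + spread + pad
    return (x0, y0, x1, y1)

# Usage: dense fixations around (200, 150) localize an object there.
rng = np.random.default_rng(0)
fix = rng.normal(loc=(200, 150), scale=20, size=(40, 2))
print(localize_concept(fix))
```

In practice, per-concept fixation statistics learned from labeled images would replace the uniform weights, so that fixations consistent with the caption's concept dominate the estimate.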

Original language: English
Title of host publication: MM'09 - Proceedings of the 2009 ACM Multimedia Conference, with Co-located Workshops and Symposiums
Editors: Wen Gao, Yong Rui, Alan Hanjalic, Changsheng Xu, Eckehard Steinbach, Abdulmotaleb El Saddik, Michelle Zhou
Place of publication: United States
Publisher: Association for Computing Machinery (ACM)
Pages: 729-732
Number of pages: 4
ISBN (Print): 9781605586083
DOIs
Publication status: Published - 2009
Externally published: Yes
Event: 17th ACM International Conference on Multimedia, MM'09, with Co-located Workshops and Symposiums - Beijing, China
Duration: 19 Oct 2009 – 24 Oct 2009

Publication series

Name: MM'09 - Proceedings of the 2009 ACM Multimedia Conference, with Co-located Workshops and Symposiums

Conference

Conference: 17th ACM International Conference on Multimedia, MM'09, with Co-located Workshops and Symposiums
Country/Territory: China
City: Beijing
Period: 19/10/09 – 24/10/09
