MIP-GAF: A MLLM-annotated Benchmark for Important Person Localization and Group Context Understanding

Surbhi Madan, Shreya Ghosh, Lownish Rai Sookha, M.A. Ganaie, Ramanathan Subramanian, Abhinav Dhall, Tom Gedeon

Research output: A Conference proceeding or a Chapter in Book, Conference contribution, peer-review

Abstract

Estimating the Most Important Person (MIP) in any social event setup is a challenging problem, mainly due to contextual complexity and the scarcity of labeled data. Moreover, the causality aspects of MIP estimation are quite subjective and diverse. To this end, we aim to address the problem by annotating a large-scale 'in-the-wild' dataset for identifying human perceptions about the 'Most Important Person (MIP)' in an image. The paper provides a thorough description of our proposed Multimodal Large Language Model (MLLM) based data annotation strategy, together with a detailed data quality analysis. Further, we perform comprehensive benchmarking of the proposed dataset using state-of-the-art MIP localization methods, which shows a significant drop in performance compared to existing datasets. This performance drop shows that existing MIP localization algorithms must become more robust to 'in-the-wild' situations. We believe the proposed dataset will play a vital role in building the next generation of social situation understanding methods. The code and data are available at https://github.com/surbhimadan92/MIP-GAF
Original language: English
Title of host publication: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Editors: Sharon X. Huang, Peyman Milanfar, Vishal M. Patel, Soma Biswas, Hadar Averbuch-Elor, Vitomir Štruc, Yezhou Yang
Place of Publication: United States
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1-10
Number of pages: 10
Publication status: Published - 2025
Event: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) - Tucson, United States
Duration: 28 Feb 2025 to 4 Mar 2025
https://wacv2025.thecvf.com/

Publication series

Name: Winter Conference on Applications of Computer Vision (WACV)
Publisher: IEEE
ISSN (Print): 2379-8920
ISSN (Electronic): 2379-8939

Conference

Conference: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Abbreviated title: WACV2025
Country/Territory: United States
City: Tucson
Period: 28/02/25 to 4/03/25
