Mining informativeness in scene graphs: Prioritizing informative relations in Scene Graph Generation for enhanced performance in applications

Maëlic Neau, Paulo E Santos, Anne-Gwenn Bosser, Alistair Macvicar, Cédric Buche

Research output: Contribution to journal › Article › peer-review

Abstract

Learning to compose visual relationships from raw images in the form of scene graphs is a highly challenging Computer Vision task, yet it is essential for applications related to scene understanding. However, no current approaches in Scene Graph Generation (SGG) aim at providing useful graphs for downstream tasks. Instead, the main focus has been on unbiasing the data distribution in order to predict more fine-grained relations. That being said, not all fine-grained relations are equally relevant to a given task, and at least a subset of them is of no use for real-world applications. In this work, we address the issue of the relevance of relations in Scene Graphs from the perspective of the quantity of information they bring to the understanding of the scene. To this end, we introduce a new evaluation metric for the task of SGG, called InformativeRecall@K, that aims at evaluating the ability of models to produce accurate and informative relations. We show that selecting relations based on this informativeness criterion is beneficial for the downstream tasks of Image Generation, Visual Question Answering, and Image Captioning. Finally, we provide a new taxonomy of relations linked to the informativeness value for the task of Image Generation.

Original language: English
Pages (from-to): 64-70
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 189
Early online date: 25 Jan 2025
Publication status: Published - Mar 2025

Keywords

  • Image Captioning
  • Image Generation
  • Scene Graph Generation
  • Scene Understanding
  • Visual Question Answering
