CAPTION: Caption Analysis with Proposed Terms, Image of Objects, and Natural Language Processing

Leonardo A. Ferreira, Douglas De Rizzo Meneghetti, Marcos Lopes, Paulo E. Santos

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a novel algorithm, called CAPTION, for identifying and correcting errors in automatically generated image captions. The algorithm combines Deep Learning (DL) for object detection in images with Natural Language Processing (NLP) techniques. CAPTION has been tested on three tasks: (1) classifying a caption as correct or incorrect; (2) detecting wrong words in the caption; and (3) suggesting text corrections. Results show that, in the error correction task, our method outperforms the other methods evaluated on the same data set, which are generally based exclusively on DL models. This work shows that, although semantics has not yet been exploited to its full extent in this type of task, combining DL with NLP tools achieves better overall performance than using DL methods alone.
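
The three tasks named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not describe the internals of CAPTION, so everything below is an assumption made for illustration. The function `analyze_caption`, the `KNOWN_OBJECTS` vocabulary, and the simple set-comparison rule are all hypothetical, and the `detected_labels` argument merely stands in for the output of a DL object detector.

```python
import re

# Hypothetical vocabulary of object nouns a detector could report (an assumption).
KNOWN_OBJECTS = {"dog", "cat", "horse", "frisbee", "ball", "car"}


def tokenize(caption):
    """Lowercase the caption and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", caption.lower())


def analyze_caption(caption, detected_labels):
    """Return (is_correct, wrong_words, suggestions) for a caption.

    detected_labels stands in for the labels produced by an object detector.
    """
    detected = set(detected_labels)
    tokens = tokenize(caption)
    # Task 2: flag object nouns in the caption that the detector did not find.
    wrong_words = [t for t in tokens if t in KNOWN_OBJECTS and t not in detected]
    # Task 1: accept the caption only if no word was flagged.
    is_correct = not wrong_words
    # Task 3: propose detected objects that the caption does not already mention.
    unused = sorted(detected - set(tokens))
    suggestions = {word: unused for word in wrong_words}
    return is_correct, wrong_words, suggestions


if __name__ == "__main__":
    print(analyze_caption("A cat catches a frisbee", ["dog", "frisbee"]))
```

A real system would need far more than set comparison (synonyms, plurals, attributes, and relations between objects, for example); the sketch only shows how detector output and caption text could feed the three tasks.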

Original language: English
Article number: 390
Number of pages: 16
Journal: SN Computer Science
Volume: 3
Publication status: Published - 23 Jul 2022

Keywords

  • Computer vision
  • Image captioning
  • Machine learning
  • NLP
