AEHRC CSIRO at ImageCLEFmed caption 2021

Aaron Nicolson, Jason Dowling, Bevan Koopman

Research output: Contribution to conference (non-published works), Paper

2 Citations (Scopus)


We describe our participation in the ImageCLEFmed Caption task of 2021. The task required participants to automatically compose coherent captions for a set of medical images. To this end, we employed a sequence-to-sequence model for caption generation, whose encoder and decoder were initialised with pre-trained Transformer checkpoints. In addition, we investigated the use of Self-Critical Sequence Training (SCST), which offered a marginal improvement, and pre-training on five external medical image datasets. Overall, our approach was kept intentionally general so that it might be applied to tasks other than medical image captioning. AEHRC CSIRO placed third amongst the participating teams in terms of BLEU score, with a score 0.078 lower than that of the first-placed participant. Our best-performing submission had the simplest configuration: it did not use SCST or pre-training on any of the external datasets. An overview of ImageCLEFmed Caption 2021 is available at:
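
As a rough illustration of the SCST objective mentioned in the abstract, the sketch below computes the per-caption loss in plain Python. It is not the authors' implementation: the function names `unigram_precision` and `scst_loss` are hypothetical, and the toy unigram-precision reward is only an assumed stand-in for whatever caption metric is actually optimised. The idea is that a sampled caption is rewarded relative to the greedily decoded caption, which serves as the baseline.

```python
from collections import Counter


def unigram_precision(candidate, reference):
    """Toy reward (assumption): clipped unigram precision of candidate vs. reference."""
    if not candidate:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    # Each candidate token is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[tok]) for tok, count in cand_counts.items())
    return matched / len(candidate)


def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical loss for one caption.

    sample_logprobs: log-probabilities of the tokens in the sampled caption.
    sample_reward:   reward of the sampled caption.
    greedy_reward:   reward of the greedily decoded caption (the baseline).
    """
    advantage = sample_reward - greedy_reward
    # Minimising this raises the likelihood of samples that beat the greedy baseline
    # and lowers the likelihood of samples that fall short of it.
    return -advantage * sum(sample_logprobs)


# Hypothetical example captions, for illustration only.
reference = "chest x-ray showing pneumonia".split()
sampled = "chest x-ray showing effusion".split()
greedy = "a chest x-ray".split()

loss = scst_loss(
    sample_logprobs=[-0.5, -1.0, -0.7, -1.2],
    sample_reward=unigram_precision(sampled, reference),
    greedy_reward=unigram_precision(greedy, reference),
)
```

In a real training loop the log-probabilities would come from the decoder and the loss would be averaged over a batch before back-propagation; those details are omitted here.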

Original language: English
Number of pages: 12
Publication status: Published - 2021
Externally published: Yes
Event: 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 - Virtual, Bucharest, Romania
Duration: 21 Sept 2021 to 24 Sept 2021


Conference: 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021
City: Virtual, Bucharest
