Abstract
We describe our participation in the ImageCLEFmed Caption task of 2021. The task required participants to automatically compose coherent captions for a set of medical images. To this end, we employed a sequence-to-sequence model for caption generation, where its encoder and decoder were initialised with pre-trained Transformer checkpoints. In addition, we investigated the use of Self-Critical Sequence Training (SCST), which offered a marginal improvement, and pre-training on five external medical image datasets. Overall, our approach was kept intentionally general so that it might be applied to tasks other than medical image captioning. AEHRC CSIRO placed third amongst the participating teams in terms of BLEU score, with a score 0.078 lower than that of the first-placed team. Our best-performing submission had the simplest configuration: it did not use SCST or pre-training on any of the external datasets. An overview of ImageCLEFmed Caption 2021 is available at: https://www.imageclef.org/2021/medical/caption.
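The SCST objective mentioned in the abstract can be illustrated with a minimal sketch. In SCST, the model samples a caption, and the reward of that sample (e.g. a BLEU-style score against the reference) is baselined by the reward of the greedily decoded caption; the advantage then weights the negative log-likelihood of the sampled tokens. The function below is a simplified, self-contained illustration of this objective, not the authors' implementation; all names are hypothetical.

```python
def scst_loss(token_logprobs, sampled_reward, greedy_reward):
    """Self-critical sequence training loss for one sampled caption.

    token_logprobs: list of log-probabilities of the sampled caption's tokens.
    sampled_reward: reward (e.g. BLEU) of the sampled caption.
    greedy_reward: reward of the greedily decoded caption (the baseline).

    Captions that beat the greedy baseline (positive advantage) have their
    log-likelihood increased; captions that score worse are suppressed.
    """
    advantage = sampled_reward - greedy_reward  # self-critical baseline
    return -advantage * sum(token_logprobs)


# Example: a sampled caption that outscores the greedy baseline
loss = scst_loss([-0.5, -1.0], sampled_reward=0.6, greedy_reward=0.4)
```

Because the baseline is the model's own greedy output rather than a learned critic, SCST directly optimises the test-time decoding behaviour, which is why it is a popular fine-tuning step for captioning models.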
Original language | English |
---|---|
Pages | 1317-1328 |
Number of pages | 12 |
Publication status | Published - 2021 |
Externally published | Yes |
Event | 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 - Virtual, Bucharest, Romania |
Duration | 21 Sept 2021 → 24 Sept 2021 |
Conference
Conference | 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 |
---|---|
Country/Territory | Romania |
City | Virtual, Bucharest |
Period | 21/09/21 → 24/09/21 |