The Role of Data Curation in Image Captioning

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

The Role of Data Curation in Image Captioning. / Li, Wenyan; Lotz, Jonas F.; Qiu, Chen; Elliott, Desmond.

EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference. ed. / Yvette Graham; Matthew Purver. Association for Computational Linguistics (ACL), 2024. p. 1074-1088.

Harvard

Li, W, Lotz, JF, Qiu, C & Elliott, D 2024, The Role of Data Curation in Image Captioning. in Y Graham & M Purver (eds), EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference. Association for Computational Linguistics (ACL), pp. 1074-1088, 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024, St. Julian's, Malta, 17/03/2024. <https://aclanthology.org/2024.eacl-long.65/>

APA

Li, W., Lotz, J. F., Qiu, C., & Elliott, D. (2024). The Role of Data Curation in Image Captioning. In Y. Graham & M. Purver (Eds.), EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1074-1088). Association for Computational Linguistics (ACL). https://aclanthology.org/2024.eacl-long.65/

Vancouver

Li W, Lotz JF, Qiu C, Elliott D. The Role of Data Curation in Image Captioning. In Graham Y, Purver M, editors, EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference. Association for Computational Linguistics (ACL). 2024. p. 1074-1088

Author

Li, Wenyan ; Lotz, Jonas F. ; Qiu, Chen ; Elliott, Desmond. / The Role of Data Curation in Image Captioning. EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference. editor / Yvette Graham ; Matthew Purver. Association for Computational Linguistics (ACL), 2024. pp. 1074-1088

Bibtex

@inproceedings{0f35f206b3b5492a8410b67391202bd3,

title = "The Role of Data Curation in Image Captioning",

abstract = "Image captioning models are typically trained by treating all samples equally, neglecting to account for mismatched or otherwise difficult data points. In contrast, recent work has shown the effectiveness of training models by scheduling the data using curriculum learning strategies. This paper contributes to this direction by actively curating difficult samples in datasets without increasing the total number of samples. We explore the effect of using three data curation methods within the training process: complete removal of a sample, caption replacement, or image replacement via a text-to-image generation model. Experiments on the Flickr30K and COCO datasets with the BLIP and BEiT-3 models demonstrate that these curation methods do indeed yield improved image captioning models, underscoring their efficacy.",

author = "Wenyan Li and Lotz, {Jonas F.} and Chen Qiu and Desmond Elliott",

note = "Publisher Copyright: {\textcopyright} 2024 Association for Computational Linguistics.; 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024 ; Conference date: 17-03-2024 Through 22-03-2024",

year = "2024",

language = "English",

pages = "1074--1088",

editor = "Yvette Graham and Matthew Purver",

booktitle = "EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference",

publisher = "Association for Computational Linguistics (ACL)",

address = "United States",

}

RIS

TY - GEN

T1 - The Role of Data Curation in Image Captioning

AU - Li, Wenyan

AU - Lotz, Jonas F.

AU - Qiu, Chen

AU - Elliott, Desmond

PY - 2024

Y1 - 2024

N2 - Image captioning models are typically trained by treating all samples equally, neglecting to account for mismatched or otherwise difficult data points. In contrast, recent work has shown the effectiveness of training models by scheduling the data using curriculum learning strategies. This paper contributes to this direction by actively curating difficult samples in datasets without increasing the total number of samples. We explore the effect of using three data curation methods within the training process: complete removal of a sample, caption replacement, or image replacement via a text-to-image generation model. Experiments on the Flickr30K and COCO datasets with the BLIP and BEiT-3 models demonstrate that these curation methods do indeed yield improved image captioning models, underscoring their efficacy.

M3 - Article in proceedings

AN - SCOPUS:85189930294

SP - 1074

EP - 1088

BT - EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

A2 - Graham, Yvette

A2 - Purver, Matthew

PB - Association for Computational Linguistics (ACL)

T2 - 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024

Y2 - 17 March 2024 through 22 March 2024

ER -

Department of Computer Science