A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives. / Rethmeier, Nils; Augenstein, Isabelle.

In: ACM Computing Surveys, Vol. 55, No. 10, 203, 2023.


Harvard

Rethmeier, N & Augenstein, I 2023, 'A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives', ACM Computing Surveys, vol. 55, no. 10, 203. https://doi.org/10.1145/3561970

APA

Rethmeier, N., & Augenstein, I. (2023). A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives. ACM Computing Surveys, 55(10), [203]. https://doi.org/10.1145/3561970

Vancouver

Rethmeier N, Augenstein I. A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives. ACM Computing Surveys. 2023;55(10). 203. https://doi.org/10.1145/3561970

Author

Rethmeier, Nils; Augenstein, Isabelle. / A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives. In: ACM Computing Surveys. 2023; Vol. 55, No. 10.

Bibtex

@article{574d698ce5954e5d860f7dce4461f2f0,
title = "A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives",
abstract = "Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various downstream tasks. These pretraining methods are frequently extended with recurrence, adversarial, or linguistic property masking. Recently, contrastive self-supervised training objectives have enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. In NLP however, a single token augmentation can invert the meaning of a sentence during input-input contrastive learning, which led to input-output contrastive approaches that avoid the issue by instead contrasting over input-label pairs. In this primer, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, zero to few-shot learning, pretraining data-efficiency, and specific NLP tasks. We overview key contrastive learning concepts with lessons learned from prior research and structure works by applications. Finally, we point to open challenges and future directions for contrastive NLP to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining. ",
keywords = "Contrastive learning",
author = "Nils Rethmeier and Isabelle Augenstein",
note = "Publisher Copyright: {\textcopyright} 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.",
year = "2023",
doi = "10.1145/3561970",
language = "English",
volume = "55",
journal = "ACM Computing Surveys",
issn = "0360-0300",
publisher = "Association for Computing Machinery, Inc.",
number = "10",

}

RIS

TY - JOUR

T1 - A Primer on Contrastive Pretraining in Language Processing

T2 - Methods, Lessons Learned, and Perspectives

AU - Rethmeier, Nils

AU - Augenstein, Isabelle

N1 - Publisher Copyright: © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.

PY - 2023

Y1 - 2023

N2 - Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various downstream tasks. These pretraining methods are frequently extended with recurrence, adversarial, or linguistic property masking. Recently, contrastive self-supervised training objectives have enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. In NLP however, a single token augmentation can invert the meaning of a sentence during input-input contrastive learning, which led to input-output contrastive approaches that avoid the issue by instead contrasting over input-label pairs. In this primer, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, zero to few-shot learning, pretraining data-efficiency, and specific NLP tasks. We overview key contrastive learning concepts with lessons learned from prior research and structure works by applications. Finally, we point to open challenges and future directions for contrastive NLP to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining.

AB - Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various downstream tasks. These pretraining methods are frequently extended with recurrence, adversarial, or linguistic property masking. Recently, contrastive self-supervised training objectives have enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. In NLP however, a single token augmentation can invert the meaning of a sentence during input-input contrastive learning, which led to input-output contrastive approaches that avoid the issue by instead contrasting over input-label pairs. In this primer, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, zero to few-shot learning, pretraining data-efficiency, and specific NLP tasks. We overview key contrastive learning concepts with lessons learned from prior research and structure works by applications. Finally, we point to open challenges and future directions for contrastive NLP to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining.

KW - Contrastive learning

UR - http://www.scopus.com/inward/record.url?scp=85146295166&partnerID=8YFLogxK

U2 - 10.1145/3561970

DO - 10.1145/3561970

M3 - Journal article

AN - SCOPUS:85146295166

VL - 55

JO - ACM Computing Surveys

JF - ACM Computing Surveys

SN - 0360-0300

IS - 10

M1 - 203

ER -
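
Note: the abstract above describes contrastive objectives only in prose. As a purely illustrative sketch (not taken from the surveyed paper), the following PyTorch snippet shows a minimal InfoNCE-style loss with in-batch negatives, i.e., the kind of input-input (or input-label) pair contrast the abstract refers to. The function name info_nce_loss and the temperature value are assumptions made here for illustration only.

    # Illustrative sketch only (not from the paper): a minimal InfoNCE-style
    # contrastive objective over a batch of paired embeddings, assuming PyTorch.
    import torch
    import torch.nn.functional as F

    def info_nce_loss(anchors, positives, temperature=0.07):
        """anchors, positives: (batch, dim) embeddings of matched pairs.
        Each anchor treats its own positive as the target and every other
        positive in the batch as a negative (in-batch negatives)."""
        a = F.normalize(anchors, dim=-1)
        p = F.normalize(positives, dim=-1)
        logits = a @ p.t() / temperature           # (batch, batch) similarity matrix
        targets = torch.arange(a.size(0), device=a.device)
        return F.cross_entropy(logits, targets)    # correct pair sits on the diagonal

In the input-input setting described in the abstract, the two embeddings of a pair would come from two augmented views of the same text; in the input-output setting, from a text and its label description.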

ID: 337589600