Visual Prompt Tuning

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

  • Menglin Jia
  • Luming Tang
  • Bor-Chun Chen
  • Claire Cardie
  • Serge Belongie
  • Bharath Hariharan
  • Ser-Nam Lim

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision. Taking inspiration from recent advances in efficiently tuning large language models, VPT introduces only a small number of trainable parameters (less than 1% of model parameters) in the input space while keeping the model backbone frozen. Via extensive experiments on a wide variety of downstream recognition tasks, we show that VPT achieves significant performance gains compared to other parameter-efficient tuning protocols. Most importantly, VPT even outperforms full fine-tuning in many cases across model capacities and training data scales, while reducing per-task storage cost. Code is available at github.com/kmnp/vpt.
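To make the idea concrete, below is a minimal PyTorch sketch of the shallow variant of this technique: learnable prompt tokens are prepended to the input token sequence of a frozen Vision Transformer, and only the prompts and a new classification head are trained. This is an illustration, not the authors' implementation (see github.com/kmnp/vpt for the official code); the backbone helpers `patch_embed` and `encoder`, the prompt count, and the uniform initialization range are assumptions made here for brevity.

```python
# Minimal VPT-Shallow sketch (assumed API, not the official implementation).
import torch
import torch.nn as nn


class VPTShallow(nn.Module):
    """Prepend learnable prompt tokens to a frozen Transformer backbone."""

    def __init__(self, backbone: nn.Module, embed_dim: int = 768,
                 num_prompts: int = 50, num_classes: int = 100):
        super().__init__()
        self.backbone = backbone
        # Freeze every backbone parameter: only prompts + head are trained,
        # which is where the <1% trainable-parameter count comes from.
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Learnable prompt tokens living in the input (token) space.
        self.prompts = nn.Parameter(torch.empty(1, num_prompts, embed_dim))
        nn.init.uniform_(self.prompts, -0.5, 0.5)  # illustrative init
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hypothetical backbone helpers: patch_embed() returns [B, N, D]
        # token embeddings including the CLS token; encoder() runs the
        # frozen Transformer blocks over an arbitrary-length sequence.
        tokens = self.backbone.patch_embed(x)               # [B, N, D]
        cls_tok, patches = tokens[:, :1], tokens[:, 1:]
        prompts = self.prompts.expand(x.shape[0], -1, -1)   # [B, P, D]
        # Input sequence becomes [CLS, prompt tokens, patch tokens].
        seq = torch.cat([cls_tok, prompts, patches], dim=1)
        feats = self.backbone.encoder(seq)                  # frozen pass
        return self.head(feats[:, 0])                       # classify on CLS
```

In training, only the parameters with `requires_grad=True` (prompts and head) would be handed to the optimizer, e.g. `torch.optim.AdamW(p for p in model.parameters() if p.requires_grad)`, so per-task storage reduces to the prompt tokens plus the head.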

Original language: English
Title: Computer Vision – ECCV 2022: 17th European Conference, Proceedings
Editors: Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner
Number of pages: 19
Publisher: Springer
Publication date: 2022
Pages: 709-727
ISBN (print): 978-3-031-19826-7
ISBN (electronic): 978-3-031-19827-4
Status: Published - 2022
Event: 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel
Duration: 23 Oct 2022 – 27 Oct 2022

Conference

Conference: 17th European Conference on Computer Vision, ECCV 2022
Country: Israel
City: Tel Aviv
Period: 23/10/2022 – 27/10/2022
Series: Lecture Notes in Computer Science
Volume: 13693 LNCS
ISSN: 0302-9743

Bibliographical note

Funding Information:
Acknowledgement. Menglin is supported by a Meta AI research grant awarded to Cornell University; Luming and Bharath are supported by NSF IIS-2144117; Serge is supported in part by the Pioneer Centre for AI, DNRF grant number P1. We would like to thank Alexander Rush and Yin Cui for valuable suggestions and discussion.

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
