Visual Prompt Tuning

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

  • Menglin Jia
  • Luming Tang
  • Bor-Chun Chen
  • Claire Cardie
  • Serge Belongie
  • Bharath Hariharan
  • Ser-Nam Lim

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision. Taking inspiration from recent advances in efficiently tuning large language models, VPT introduces only a small number of trainable parameters (less than 1% of model parameters) in the input space while keeping the model backbone frozen. Via extensive experiments on a wide variety of downstream recognition tasks, we show that VPT achieves significant performance gains compared to other parameter-efficient tuning protocols. Most importantly, VPT even outperforms full fine-tuning in many cases across model capacities and training data scales, while reducing per-task storage cost. Code is available at github.com/kmnp/vpt.
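To make the idea concrete, below is a minimal PyTorch sketch of the shallow variant of this technique: learnable prompt tokens are prepended to the input token sequence of a frozen Vision Transformer, and only the prompts and a new classification head are trained. This is an illustration, not the authors' implementation (see github.com/kmnp/vpt for the official code); the backbone helpers `patch_embed` and `encoder`, the prompt count, and the uniform initialization range are assumptions made here for brevity.

```python
# Minimal VPT-Shallow sketch (assumed API, not the official implementation).
import torch
import torch.nn as nn


class VPTShallow(nn.Module):
    """Prepend learnable prompt tokens to a frozen Transformer backbone."""

    def __init__(self, backbone: nn.Module, embed_dim: int = 768,
                 num_prompts: int = 50, num_classes: int = 100):
        super().__init__()
        self.backbone = backbone
        # Freeze every backbone parameter: only prompts + head are trained,
        # which is where the <1% trainable-parameter count comes from.
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Learnable prompt tokens living in the input (token) space.
        self.prompts = nn.Parameter(torch.empty(1, num_prompts, embed_dim))
        nn.init.uniform_(self.prompts, -0.5, 0.5)  # illustrative init
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hypothetical backbone helpers: patch_embed() returns [B, N, D]
        # token embeddings including the CLS token; encoder() runs the
        # frozen Transformer blocks over an arbitrary-length sequence.
        tokens = self.backbone.patch_embed(x)               # [B, N, D]
        cls_tok, patches = tokens[:, :1], tokens[:, 1:]
        prompts = self.prompts.expand(x.shape[0], -1, -1)   # [B, P, D]
        # Input sequence becomes [CLS, prompt tokens, patch tokens].
        seq = torch.cat([cls_tok, prompts, patches], dim=1)
        feats = self.backbone.encoder(seq)                  # frozen pass
        return self.head(feats[:, 0])                       # classify on CLS
```

In training, only the parameters with `requires_grad=True` (prompts and head) would be handed to the optimizer, e.g. `torch.optim.AdamW(p for p in model.parameters() if p.requires_grad)`, so per-task storage reduces to the prompt tokens plus the head.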

Original language: English
Title: Computer Vision – ECCV 2022: 17th European Conference, Proceedings
Editors: Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner
Number of pages: 19
Publisher: Springer
Publication date: 2022
Pages: 709-727
ISBN (print): 978-3-031-19826-7
ISBN (electronic): 978-3-031-19827-4
Status: Published - 2022
Event: 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel
Duration: 23 Oct 2022 – 27 Oct 2022

Conference

Conference: 17th European Conference on Computer Vision, ECCV 2022
Country: Israel
City: Tel Aviv
Period: 23/10/2022 – 27/10/2022
Series: Lecture Notes in Computer Science
Volume: 13693 LNCS
ISSN: 0302-9743

Bibliographical note

Funding Information:
Acknowledgement. Menglin is supported by a Meta AI research grant awarded to Cornell University; Luming and Bharath are supported by NSF IIS-2144117; Serge is supported in part by the Pioneer Centre for AI, DNRF grant number P1. We would like to thank Alexander Rush and Yin Cui for valuable suggestions and discussion.

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
