Minimax and Neyman–Pearson Meta-Learning for Outlier Languages
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
Standard
Minimax and Neyman–Pearson Meta-Learning for Outlier Languages. / Ponti, Edoardo Maria; Aralikatte, Rahul; Shrivastava, Disha; Reddy, Siva; Søgaard, Anders.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Association for Computational Linguistics, 2021, pp. 1245-1260.
RIS
TY - GEN
T1 - Minimax and Neyman–Pearson Meta-Learning for Outlier Languages
AU - Ponti, Edoardo Maria
AU - Aralikatte, Rahul
AU - Shrivastava, Disha
AU - Reddy, Siva
AU - Søgaard, Anders
PY - 2021
Y1 - 2021
N2 - Model-agnostic meta-learning (MAML) has been recently put forth as a strategy to learn resource-poor languages in a sample-efficient fashion. Nevertheless, the properties of these languages are often not well represented by those available during training. Hence, we argue that the i.i.d. assumption ingrained in MAML makes it ill-suited for cross-lingual NLP. In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as Bayes criterion. To increase its robustness to outlier languages, we create two variants of MAML based on alternative criteria: Minimax MAML reduces the maximum risk across languages, while Neyman–Pearson MAML constrains the risk in each language to a maximum threshold. Both criteria constitute fully differentiable two-player games. In light of this, we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium. We evaluate both model variants on two popular NLP tasks, part-of-speech tagging and question answering. We report gains for their average and minimum performance across low-resource languages in zero- and few-shot settings, compared to joint multi-source transfer and vanilla MAML. The code for our experiments is available at https://github.com/rahular/robust-maml.
U2 - 10.18653/v1/2021.findings-acl.106
DO - 10.18653/v1/2021.findings-acl.106
M3 - Article in proceedings
SP - 1245
EP - 1260
BT - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
PB - Association for Computational Linguistics
T2 - Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Y2 - 1 August 2021 through 6 August 2021
ER -
ID: 300446234
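
Note: the abstract frames Minimax MAML as a fully differentiable two-player game, with a learner minimising a weighted risk while an adversary shifts weight toward the worst-performing language. The following PyTorch loop is a minimal illustrative sketch of that game only, not the paper's optimiser (the authors' actual code is at https://github.com/rahular/robust-maml); the toy data, the linear model, and the step sizes are hypothetical, and the MAML inner-loop adaptation is omitted for brevity.

    # Illustrative sketch only: a toy minimax-weighted training loop.
    # NOT the paper's method; see https://github.com/rahular/robust-maml
    # for the authors' implementation. Toy data and step sizes are made up.
    import torch

    torch.manual_seed(0)

    # Toy per-"language" regression datasets (3 languages).
    n_langs = 3
    data = [(torch.randn(32, 4), torch.randn(32, 1)) for _ in range(n_langs)]

    model = torch.nn.Linear(4, 1)
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)

    # Player 2: a distribution over languages, updated by exponentiated
    # gradient ascent so that high-risk (outlier) languages gain weight.
    log_w = torch.zeros(n_langs)  # uniform prior, as in the Bayes criterion
    eta = 0.5                     # ascent step size for the adversary

    for step in range(200):
        losses = torch.stack([
            torch.nn.functional.mse_loss(model(x), y) for x, y in data
        ])
        # Player 1: descend on the weighted risk (minimax objective).
        w = torch.softmax(log_w, dim=0)
        opt.zero_grad()
        (w * losses).sum().backward()
        opt.step()
        # Player 2: shift weight toward the worst-performing language.
        log_w = log_w + eta * losses.detach()

    # Losses from the last iteration (computed before the final step).
    print("per-language losses:", losses.detach())
    print("language weights:", torch.softmax(log_w, dim=0))

Keeping the weight vector fixed and uniform recovers the expected-risk (Bayes) objective the abstract attributes to vanilla MAML, while letting the adversary move approximates the minimax objective. The Neyman–Pearson variant described in the abstract would instead constrain each language's risk to a maximum threshold; that constraint is not implemented in this sketch.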