Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
We consider zero-shot cross-lingual transfer in legal topic classification using the recent Multi-EURLEX dataset. Since the original dataset contains parallel documents, which is unrealistic for zero-shot cross-lingual transfer, we develop a new version of the dataset without parallel documents. We use it to show that translation-based methods vastly outperform cross-lingual fine-tuning of multilingually pre-trained models, the best previous zero-shot transfer method for Multi-EURLEX. We also develop a bilingual teacher-student zero-shot transfer approach, which exploits additional unlabeled documents of the target language and performs better than a model fine-tuned directly on labeled target language documents.
Original language | English |
---|---|
Title | Proceedings of the 12th Hellenic Conference on Artificial Intelligence, SETN 2022 |
Number of pages | 8 |
Publisher | Association for Computing Machinery, Inc. |
Publication date | 2022 |
Article number | 19 |
ISBN (Electronic) | 9781450395977 |
DOI | |
Status | Published - 2022 |
Event | 12th Hellenic Conference on Artificial Intelligence, SETN 2022 - Corfu, Greece. Duration: 7 Sep 2022 → 9 Sep 2022 |
Conference
Conference | 12th Hellenic Conference on Artificial Intelligence, SETN 2022 |
---|---|
Country | Greece |
City | Corfu |
Period | 07/09/2022 → 09/09/2022 |
Sponsor | Hellenic Artificial Intelligence Society, Humanistic and Social Informatics Laboratory (HILab), Ionian University, Department of Informatics |
Name | ACM International Conference Proceeding Series |
---|---|
Bibliographical note
Funding Information:
This work is partly funded by the Innovation Fund Denmark (IFD) under File No. 0175-00011A. This research has also been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE ( 2 -03849).
Publisher Copyright:
© 2022 ACM.
ID: 342927381