When the student becomes the master: learning better and smaller monolingual models from mBERT

Publication type
C1
Publication status
Published
Authors
Singh, P., & Lefever, E.
Series
Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022)
Pagination
4434-4441
Publisher
International Committee on Computational Linguistics
Conference
29th International Conference on Computational Linguistics (Gyeongju, Republic of Korea)

Abstract

In this research, we present pilot experiments to distil monolingual models from a jointly trained model for 102 languages (mBERT). We demonstrate that the distilled monolingual model can outperform the original multilingual model on the target language, even with a basic distillation setup. We evaluate our methodology for six languages with varying amounts of resources and belonging to different language families.
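
For illustration, a "basic distillation setup" typically combines a soft-target loss against the teacher's output distribution with the usual hard-label loss. The sketch below shows such a generic objective in PyTorch; it is an assumption about what a minimal setup looks like, not the exact configuration, hyperparameters, or loss used in the paper.

# Minimal sketch of a generic soft-label distillation objective for a
# masked-language-modelling student; not the paper's exact setup.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Combine soft-target KL (teacher -> student) with hard-label cross-entropy.

    student_logits, teacher_logits: (batch, seq_len, vocab) tensors.
    labels: (batch, seq_len) gold token ids, -100 at unmasked positions.
    temperature, alpha: illustrative hyperparameters, not taken from the paper.
    """
    # Soften both distributions; the T^2 factor keeps gradient magnitudes stable.
    soft_targets = F.log_softmax(teacher_logits / temperature, dim=-1)
    soft_preds = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_preds, soft_targets, log_target=True,
                  reduction="batchmean") * temperature ** 2

    # Standard cross-entropy on the masked positions (hard labels).
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         labels.view(-1), ignore_index=-100)

    return alpha * kd + (1.0 - alpha) * ce

In such a setup the teacher would be the multilingual model (mBERT) and the student a smaller monolingual model trained on target-language text; the weighting between the two loss terms is one of the choices a distillation experiment has to make.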