Word sense disambiguation for specific purposes: an example sentence-based methodology for Intelligent Computer-Assisted Language Learning

Publication type
U
Publication status
Published
Authors
Degraeuwe, JRD, & Goethals, P.
Conference
The 31st Meeting of Computational Linguistics in the Netherlands (CLIN31), Gent (online)

Abstract

In this poster, we present the main challenges of word sense disambiguation (WSD) for the specific purpose of Intelligent Computer-Assisted Language Learning, and report the results of a first experiment. The starting point is that applying WSD could considerably enrich existing language learning and teaching resources: it would, for instance, make it possible to query corpora for usage examples of a specific sense of a vocabulary item. However, most existing WSD methods are based on WordNet and BabelNet sense distinctions, even though the very fine-grained nature of these distinctions makes them unsuitable for many NLP applications (Hovy et al., 2013). Moreover, it has been argued that a single set of word senses is unlikely to be appropriate for different NLP applications, since “different corpora, and different purposes, will lead to different senses” (Kilgarriff, 1997).
In other words, WSD for specific purposes is an open problem that requires research into tailoring the sense inventory to the particularities of the specific purpose, and into designing methodologies that require little human-curated input (Degraeuwe et al., in press). For the latter challenge, word embeddings could be a key factor: the rise of neural networks, and with it the introduction of static (e.g. word2vec [Mikolov et al., 2013]) and especially contextualised (e.g. BERT [Devlin et al., 2019]) word embedding models, marked an important breakthrough for the WSD task, pushing performance levels to new heights (Loureiro et al., 2021). In a first experiment, focused on Spanish as a foreign language, we developed a customised, coarse-grained sense inventory in which each sense is represented by prototypical usage examples. We then used a pretrained BERT model to convert those sentences into “sense embeddings” and predicted the sense of unseen ambiguous instances through cosine similarity calculations. On a 25-item lexical sample, this methodology achieves a promising average F1 score of 0.9.
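The prediction step described above can be illustrated with a minimal sketch. It is not the implementation used in the experiment: the toy three-dimensional vectors stand in for the contextualised BERT embeddings, the Spanish item "banco" and its two sense labels are hypothetical examples that are not taken from the lexical sample, and averaging the example vectors into one sense embedding is an assumption about how the example sentences are combined.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_sense_embeddings(example_vectors_by_sense):
    """One embedding per sense: the mean of its example vectors (assumption)."""
    return {
        sense: mean_vector(vectors)
        for sense, vectors in example_vectors_by_sense.items()
    }


def predict_sense(target_vector, sense_embeddings):
    """Return the sense whose embedding is most similar to the target vector."""
    return max(
        sense_embeddings,
        key=lambda sense: cosine(target_vector, sense_embeddings[sense]),
    )


# Toy vectors standing in for BERT embeddings of "banco" in prototypical
# usage examples (hypothetical data, two examples per sense).
examples = {
    "bank_financial": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "bench_seat": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}

sense_embeddings = build_sense_embeddings(examples)

# Toy vector standing in for the embedding of an unseen ambiguous instance.
predicted = predict_sense([0.7, 0.3, 0.1], sense_embeddings)
print(predicted)
```

In a real setting, the toy vectors would be replaced by the target word's contextualised embedding extracted from a pretrained BERT model for each sentence, while the nearest-sense lookup would stay the same.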