D-Terminer: online demo for monolingual and bilingual automatic term extraction

Publication type
C1
Publication status
Published
Authors
Rigouts Terryn, A., Hoste, V., & Lefever, E.
Editor
Rute Costa, Sara Carvalho, Ana Ostroski Anic and Anas Fahad Khan
Series
Proceedings of the Workshop on Terminology in the 21st century: many faces, many places
Pagination
33-40
Publisher
European Language Resources Association (ELRA) (Marseille, France)
Conference
LREC 2022 Workshop: Terminology in the 21st century: many faces, many places (Term21) (Marseille, France)

Abstract

This contribution presents D-Terminer: an open access, online demo for monolingual and multilingual automatic term extraction from parallel corpora. The monolingual term extraction is based on a recurrent neural network, with a supervised methodology that relies on pretrained embeddings. Candidate terms can be tagged in their original context and there is no need for a large corpus, as the methodology will work even for single sentences. With the bilingual term extraction from parallel corpora, potentially equivalent candidate term pairs are extracted from translation memories and manual annotation of the results shows that good equivalents are found for most candidate terms. Accompanying the release of the demo is an updated version of the ACTER Annotated Corpora for Term Extraction Research (version 1.5).