Posted on June 1, 2021
Last Friday (28 May 2021), Ayla successfully defended her PhD dissertation, D-TERMINE: Data-Driven Term Extraction Methodologies Investigated. She was supervised by Professor Els Lefever and Professor Véronique Hoste.
In her dissertation, Ayla looked into automatic term extraction (ATE). She created and publicly released a new dataset (ACTER) in which both terms and named entities were manually identified in three languages and four domains. Using this novel dataset, she explored two different machine learning methodologies for ATE: a hybrid approach and a sequential labeling approach. The first can be considered the more traditional one and relies on both linguistic and statistical information to filter a list of candidate terms using supervised machine learning. For the sequential labeling approach, two methods were compared: a feature-based conditional random fields classifier and a recurrent neural network with word embeddings. The results show that the hybrid and neural network approaches perform comparably. An elaborate evaluation revealed several strengths and weaknesses for each of the approaches, which offer interesting challenges for future research.
Congratulations, Dr. Rigouts Terryn!