A hybrid approach to domain-independent taxonomy learning

Publication type: A1
Publication status: Published
Author: Lefever, E.
Journal: APPLIED ONTOLOGY
Volume: 11
Issue: 3
Pagination: 255-278
Publisher: IOS PRESS (NIEUWE HEMWEG 6B, 1013 BG AMSTERDAM, NETHERLANDS)

Abstract

Domain ontologies are usually created by teams of knowledge engineers and domain experts, a process considered time-consuming and difficult. As a result, scientists have started to develop automatic approaches to ontology learning and population. This research focuses on the central subtask of ontology learning, hypernym detection, in which the system has to detect hierarchical semantic relationships (i.e. hypernym–hyponym relationships) between domain-specific terms, resulting in a domain-specific taxonomy. We propose a hybrid approach to automatic taxonomy learning that combines a data-driven and a knowledge-based component. The data-driven component is composed of a lexico-syntactic pattern-based module, a morpho-syntactic analyzer and a distributional model, whereas the knowledge-based component extracts structured semantic information from the Linked Open Data cloud (DBpedia) and WordNet. The proposed methodology has been applied to three knowledge domains: food, equipment and science. A thorough quantitative and qualitative evaluation shows promising results for all test domains, and the results show that every module contributes to the automatic taxonomy learning task. Although all modules leave room for improvement, our approach outperforms the state-of-the-art systems that participated in the SemEval “Taxonomy Extraction Evaluation” task when the automatically constructed taxonomy is compared against a manually verified gold standard taxonomy. As all modules run automatically, the system provides a flexible and domain-independent approach to automatic taxonomy learning and could be an important step towards solving the knowledge acquisition bottleneck in ontology learning.
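To make the data-driven component more concrete, the following is a minimal sketch of two of its ideas: a lexico-syntactic pattern ("X such as Y and Z") and a morpho-syntactic head-modifier heuristic ("chocolate cake" is a kind of "cake"). It is not the paper's implementation. The function names, the single regular expression, the restriction to single-word terms in the pattern, and the plain set union used to combine the two outputs are all assumptions made for illustration; the distributional model and the DBpedia and WordNet lookups are left out.

```python
import re

# Assumed, simplified pattern: "<hypernym> such as <w1>, <w2> and <w3>".
# Only single-word terms are handled and no lemmatization is applied.
SUCH_AS = re.compile(r"(\w+),? such as ((?:\w+, )*\w+(?: (?:and|or) \w+)?)")


def pattern_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    match = SUCH_AS.search(sentence)
    if match is None:
        return set()
    hypernym = match.group(1).lower()
    hyponyms = re.split(r",\s*|\s+(?:and|or)\s+", match.group(2))
    return {(h.lower(), hypernym) for h in hyponyms if h}


def head_modifier_pairs(terms):
    """Link a multi-word term to its longest suffix that is also a known term."""
    term_set = {t.lower() for t in terms}
    pairs = set()
    for term in term_set:
        words = term.split()
        # Try progressively shorter suffixes: "dark chocolate cake" is
        # checked against "chocolate cake" before "cake".
        for i in range(1, len(words)):
            suffix = " ".join(words[i:])
            if suffix in term_set:
                pairs.add((term, suffix))
                break
    return pairs


def hybrid_pairs(sentences, terms):
    """Combine both modules by set union (an assumed combination strategy)."""
    pairs = head_modifier_pairs(terms)
    for sentence in sentences:
        pairs |= pattern_pairs(sentence)
    return pairs
```

Calling `hybrid_pairs(["Fruits such as apples, pears and bananas are healthy."], ["cake", "chocolate cake"])` would yield pairs such as `("apples", "fruits")` and `("chocolate cake", "cake")`. A realistic system would additionally need term normalization, multi-word pattern matching, and a way to resolve conflicting evidence between modules.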