Exploiting Grammatical Relations for Protein Relation Extraction and Role Labeling

Publication type: C1
Publication status: Published
Authors: Fayruzov, T., De Cock, M., Cornelis, C., & Hoste, V.
Editors: E. Hoenkamp, M. De Cock, and V. Hoste
Series: Proceedings of the Dutch-Belgian Information Retrieval workshop (DIR-2008)
Pagination: 37-44

Abstract

Automatic mining of protein interactions from natural language text, and automatic identification of the agent and target proteins (i.e., role labeling), are challenging problems that attract considerable attention because of the growing volume of biomedical text. We propose a novel approach that relies exclusively on parsing and dependency information. We deliberately omit context information such as keywords or parts of speech, in order to abstract as far as possible from the given corpora, and examine whether the grammatical relations correspond to the semantic relations in the text and how close this correspondence is. In particular, we construct a feature vector for each sentence using only the grammatical relations and some parsing information. We then feed this vector to standard machine learning algorithms to decide whether a sentence describes a protein interaction and which roles the interaction participants play. Evaluation on benchmark datasets shows that our method is competitive with existing state-of-the-art algorithms for the extraction of protein interactions, and gives promising results for protein role detection.
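
The idea of building a sentence-level feature vector purely from grammatical relations can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sentences, the relation labels (`nsubj`, `dobj`, `prep_in`, `conj_and`), the `PROT` placeholder, the feature encoding, and the perceptron classifier are all assumptions chosen for illustration. The abstract states only that the features come from grammatical relations and parsing information and that standard learning algorithms are applied.

```python
from collections import Counter

# Hypothetical toy data: each sentence is a list of dependency triples
# (relation, head, dependent) plus a label (1 = describes an interaction).
# Protein mentions are assumed to be pre-replaced by the placeholder "PROT".
SENTENCES = [
    ([("nsubj", "binds", "PROT"), ("dobj", "binds", "PROT")], 1),
    ([("nsubj", "activates", "PROT"), ("dobj", "activates", "PROT")], 1),
    ([("nsubj", "expressed", "PROT"), ("prep_in", "expressed", "cells")], 0),
    ([("conj_and", "PROT", "PROT"), ("nsubj", "studied", "PROT")], 0),
]


def featurize(triples):
    """Count relation features, discarding the words themselves.

    Each feature keeps only the relation label and whether the head and
    dependent are proteins, so no keywords leak into the vector.
    """
    counts = Counter()
    for relation, head, dependent in triples:
        head_flag = "P" if head == "PROT" else "_"
        dep_flag = "P" if dependent == "PROT" else "_"
        counts[f"{relation}|{head_flag}|{dep_flag}"] += 1
    return counts


def build_vocab(samples):
    """Collect every feature key seen in the samples, in sorted order."""
    keys = set()
    for triples, _label in samples:
        keys.update(featurize(triples))
    return sorted(keys)


def to_vector(triples, vocab):
    """Map a sentence's relation counts onto a fixed-length vector."""
    counts = featurize(triples)
    return [counts.get(key, 0) for key in vocab]


def predict(weights, bias, vector):
    """Classify a vector with a linear threshold unit."""
    score = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1 if score > 0 else 0


def train_perceptron(vectors, labels, epochs=20):
    """Train a plain perceptron (a stand-in for any standard learner)."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for vector, label in zip(vectors, labels):
            error = label - predict(weights, bias, vector)
            if error:
                weights = [w + error * x for w, x in zip(weights, vector)]
                bias += error
    return weights, bias


VOCAB = build_vocab(SENTENCES)
X = [to_vector(triples, VOCAB) for triples, _ in SENTENCES]
Y = [label for _, label in SENTENCES]
WEIGHTS, BIAS = train_perceptron(X, Y)
PREDICTIONS = [predict(WEIGHTS, BIAS, vector) for vector in X]
```

In this sketch the predicate words (`binds`, `expressed`, and so on) never enter the feature keys, which mirrors the stated goal of abstracting away from keywords. In practice the triples would come from a dependency parser, and the perceptron could be swapped for any off-the-shelf classifier.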
