Linguistically-based sub-sentential alignment for terminology extraction from a bilingual automotive corpus

Publication type
C1
Publication status
Published
Authors
Macken, L., Lefever, E., & Hoste, V.
Series
Proceedings of the 22nd International Conference on Computational Linguistics
Pagination
529-536
Publisher
Association for Computational Linguistics (ACL) (Stroudsburg, PA, USA)
Conference
22nd International Conference on Computational Linguistics (COLING 2008) (Manchester, UK)

Abstract

We present a sub-sentential alignment system that links linguistically motivated phrases in parallel texts based on lexical correspondences and syntactic similarity.
We compare the performance of our sub-sentential alignment system with different symmetrization heuristics that combine the GIZA++ alignments of both translation directions.
We demonstrate that the aligned linguistically motivated phrases are a useful means to extract bilingual terminology and, more specifically, complex multiword terms.
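
To illustrate what combining the alignments of both translation directions can look like, the following is a minimal sketch of two common symmetrization heuristics, intersection and union. It is not the authors' code, and this record does not state which heuristics the paper evaluates. The function name `symmetrize` and the toy link sets are hypothetical; in practice the links would be read from the GIZA++ output of each direction.

```python
from typing import Dict, Set, Tuple

# A word alignment is a set of (source_index, target_index) links.
Alignment = Set[Tuple[int, int]]


def symmetrize(src2tgt: Alignment, tgt2src: Alignment) -> Dict[str, Alignment]:
    """Combine two directional alignments (hypothetical helper).

    src2tgt holds (source_index, target_index) links.
    tgt2src holds (target_index, source_index) links, so each link is
    flipped first to make both sets comparable.
    """
    flipped = {(s, t) for (t, s) in tgt2src}
    return {
        # Links proposed by both directions: tends to favour precision.
        "intersection": src2tgt & flipped,
        # Links proposed by either direction: tends to favour recall.
        "union": src2tgt | flipped,
    }


# Toy data (invented for illustration only).
source_to_target = {(0, 0), (1, 1), (2, 1)}
target_to_source = {(0, 0), (1, 1), (2, 3)}

combined = symmetrize(source_to_target, target_to_source)
print(sorted(combined["intersection"]))
print(sorted(combined["union"]))
```

The intersection keeps only links that both directions agree on, while the union keeps every link proposed by either direction; more elaborate heuristics typically start from the intersection and add selected links from the union.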