Dutch Parallel Corpus: a multifunctional and multilingual corpus

Publication type
A3
Publication status
Published
Authors
Paulussen, H., Macken, L., Trushkina, J., Desmet, P., & Vandeweghe, W.
Journal
Cahiers de l'Institut de Linguistique de Louvain
Volume
32
Issue
1
Pagination
269-285
Publisher
Peeters (Louvain-la-Neuve, Belgium)
Project
DPC

Abstract

Text corpora now play an important role in language research and in all fields involving language study, including theoretical and applied linguistics, language technology, translation studies and CALL (Computer Assisted Language Learning). Multilingual corpora, especially translated corpora, are not always readily available for Dutch. Much depends on the private initiative of individuals, and the data are often available only under restrictions. The DPC (Dutch Parallel Corpus) project, which is carried out within the STEVIN program (Odijk et al. 2004), intends to fill this gap for Dutch. This paper gives an overview of the DPC project. First, the main parallel corpora containing Dutch are surveyed and discussed. Then the DPC project is described, focusing on those aspects that distinguish the DPC from existing parallel corpora. Finally, the choice of an XML-based format is explained.