Methodological advances in lexical pattern extraction : examples from Spanish adventure tourism

Publication type: B2
Publication status: Published
Authors: Goethals, P., & Degraeuwe, JRD
Editor: Isabel Durán-Muñoz and Eva Lucía Jiménez-Navarro
Series: Exploring the language of adventure tourism : a corpus-assisted approach
Volume: 24
Pagination: 85-110
Publisher: Peter Lang (Berlin)

Abstract

This chapter aims to contribute to methodological innovation in the description of language use for specific purposes, and of the language of adventure tourism in particular. We describe techniques such as dependency parsing and semantic similarity calculation based on non-contextual word embeddings and transformer-based language models, in order to show how these recent developments can lead to a more fruitful use of small and mid-size corpora, which are typical in the study of languages for specific purposes. In these corpora, purely quantitative filters impose overly strict limits on less frequently recurring constructions. At the same time, we discuss the challenges posed by the new techniques, since they require adequate fine-tuning. Semantic similarity calculation in particular seems to offer important opportunities to explore the full richness of small and mid-size corpora such as the ADVENCOR corpus: it takes recurrent constructions as a starting point and enriches these data by identifying semantically similar constructions that do not pass the frequency thresholds by themselves. We apply these methodological insights to verb constructions that are typical of adventure tourism discourse.
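
The enrichment step described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the thresholds, the construction counts and the three-dimensional vectors below are all invented for illustration. In practice the vectors would come from word embeddings or a transformer-based language model.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_by_similarity(counts, vectors, min_freq, min_sim):
    """Map each frequent construction to its similar low-frequency ones.

    Constructions with a count of at least min_freq act as seeds. The
    remaining ones are attached to a seed when their cosine similarity
    to that seed is at least min_sim, most similar first.
    """
    seeds = [c for c, n in counts.items() if n >= min_freq]
    rare = [c for c, n in counts.items() if n < min_freq]
    expanded = {}
    for seed in seeds:
        scored = [(r, cosine(vectors[seed], vectors[r])) for r in rare]
        kept = [(r, s) for r, s in scored if s >= min_sim]
        kept.sort(key=lambda pair: pair[1], reverse=True)
        expanded[seed] = [r for r, _ in kept]
    return expanded


# Hypothetical toy data: counts and vectors are invented, not taken
# from the ADVENCOR corpus or from any trained model.
counts = {
    "practicar senderismo": 12,
    "hacer trekking": 2,
    "realizar caminatas": 1,
    "reservar alojamiento": 3,
}
vectors = {
    "practicar senderismo": [0.9, 0.1, 0.0],
    "hacer trekking": [0.8, 0.2, 0.1],
    "realizar caminatas": [0.85, 0.15, 0.05],
    "reservar alojamiento": [0.0, 0.2, 0.9],
}

result = expand_by_similarity(counts, vectors, min_freq=5, min_sim=0.9)
print(result)
```

With these toy values, only "practicar senderismo" passes the frequency threshold, and the two rare hiking constructions are attached to it while "reservar alojamiento" is left out. Both thresholds are assumptions that would need tuning on real data.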
