Posted on April 5, 2018
On the occasion of Suzanne Kleijn's PhD defense, a Readability Workshop takes place in Utrecht on 5 April 2018.
The speakers were asked to reflect on three issues in particular, using their earlier empirical work as background.
- What kind of readability data do we use? Traditional readability work used cloze testing data. Such data are scarce for many languages. Recently we have seen other data sources, most notably readability assessments of text by either experts or lay readers. Do these data provide a solution for the data scarcity problem?
- What kind of predictors do we use? Classic readability work used shallow linguistic features such as word and sentence length. From the eighties on, linguists and discourse analysts have proposed more sophisticated features regarding lexis, syntax and cohesion. Later on, computational work has extended the set of potential predictors by, for instance, n-gram features and probability features such as entropy. What do we want from our predictors, besides being powerful enough to do the job?
- Some researchers confine themselves to readability prediction; others are interested in readability improvement as well. In fact, much of the popularity of readability formulas is probably due to the expectation that they help writers communicate more clearly. However, improvement-oriented readability researchers face additional constraints regarding research design and predictor selection. For instance, they need to estimate both text effects and text version effects, which complicates matters considerably.
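To make the predictor types named in the second issue a bit more concrete, here is a minimal Python sketch of two shallow features (average word length and average sentence length) and one probability feature (the Shannon entropy of the text's own unigram distribution). The function name `shallow_features`, the regex-based sentence splitting and the tokenisation are illustrative assumptions, not a method used by any of the workshop speakers; real readability research would rely on proper tokenisers and reference corpora.

```python
import math
import re
from collections import Counter


def shallow_features(text):
    """Return a few illustrative readability predictors (hypothetical helper)."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Assumption: a word is a run of ASCII letters or apostrophes, lowercased.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    counts = Counter(words)
    total = len(words)
    # Shannon entropy (in bits) of the unigram distribution within this text.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    return {
        "avg_word_length": sum(len(w) for w in words) / total,
        "avg_sentence_length": total / len(sentences),
        "unigram_entropy": entropy,
    }


if __name__ == "__main__":
    print(shallow_features("The cat sat. The dog ran fast."))
```

Features like these are cheap to compute, which is part of why classic formulas relied on them; whether they are also useful for improving a text, rather than only predicting its difficulty, is exactly the kind of question the third issue raises.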
Orphée is one of the speakers and will present her research on generic readability prediction.