A Posteriori Agreement as a Quality Measure for Readability Prediction Systems
- Publication type
- P1
- Publication status
- Published
- Authors
- van Oosten, P., Hoste, V., & Tanghe, D.
- Editor
- Alexander Gelbukh
- Journal
- Computational Linguistics and Intelligent Text Processing
- Series
- Lecture Notes in Computer Science
- Volume
- 6609
- Pagination
- 424-435
- Publisher
Springer (Berlin, Germany)
- External link
- http://dx.doi.org/10.1007/978-3-642-19437-5_35
- Project
- Hendi
Abstract
All readability research is ultimately concerned with the question of whether a prediction system can automatically determine the readability level of an unseen text. A significant problem for such a system is that readability may depend in part on the reader. If different readers assess the readability of texts in fundamentally different ways, there is insufficient a priori agreement to justify the correctness of a readability prediction system based on the texts assessed by those readers.
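To make the notion of a priori agreement concrete, the sketch below computes average pairwise Cohen's kappa over a small matrix of expert assessments. The choice of kappa as the agreement measure, the toy `assessments` data, and the use of scikit-learn are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: pairwise Cohen's kappa as an a priori agreement
# measure between expert readers. The data and the choice of kappa are
# illustrative assumptions, not taken from the paper.
from itertools import combinations

import numpy as np
from sklearn.metrics import cohen_kappa_score

# Rows: texts; columns: experts; cells: readability class labels (0-4).
assessments = np.array([
    [0, 0, 1, 0],
    [2, 1, 2, 2],
    [4, 4, 3, 4],
    [1, 2, 1, 1],
    [3, 3, 3, 2],
])

# Average kappa over all expert pairs: low values signal that readers
# assess readability in fundamentally different ways.
pairs = combinations(range(assessments.shape[1]), 2)
kappas = [cohen_kappa_score(assessments[:, i], assessments[:, j])
          for i, j in pairs]
print(f"mean pairwise kappa: {np.mean(kappas):.2f}")
```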
We built a data set of readability assessments produced by expert readers. We clustered the experts into groups with greater a priori agreement and then measured, for each group, whether classifiers trained only on that group's data exhibited a classification bias. Since this turned out to be the case, the classification mechanism cannot simply be generalized to a different user group. A minimal sketch of this experimental idea follows below.
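The following sketch illustrates the a posteriori side of the argument: train one classifier per expert group and compare their predictions on the same unseen texts. The random features, the logistic regression learner, and the two-group split are placeholder assumptions; the abstract does not specify the feature set, the learning algorithm, or the clustering method.

```python
# Hypothetical sketch of a posteriori agreement: train one classifier per
# expert group and compare their predictions on the same unseen texts.
# Features, learner, and group labels are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)

X_train = rng.normal(size=(200, 10))   # text features (e.g. length, lexicon)
X_unseen = rng.normal(size=(50, 10))   # texts never labelled by any expert

# Each expert group supplies its own readability labels for the training texts.
group_labels = {
    "group_a": rng.integers(0, 3, size=200),
    "group_b": rng.integers(0, 3, size=200),
}

predictions = {}
for group, y in group_labels.items():
    clf = LogisticRegression(max_iter=1000).fit(X_train, y)
    predictions[group] = clf.predict(X_unseen)

# A posteriori agreement: do classifiers trained on different groups still
# agree once applied to new texts? A low kappa here is the kind of
# classification bias the abstract refers to.
kappa = cohen_kappa_score(predictions["group_a"], predictions["group_b"])
print(f"a posteriori kappa between groups: {kappa:.2f}")
```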