Raw text extracted from 466 early modern Dutch comedies and farces, split into sentences with NLTK and labelled with author indications. The gold data comes from DBNL and CENETON; the OCR data was post-corrected with an mBART model fine-tuned on floriandebaene/EmDComF_OCR_post-correction (available on Hugging Face).
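As a rough illustration of the sentence-splitting step, the sketch below turns one play's raw text into per-sentence records that carry the author. It is not the pipeline used to build the dataset. The `split_play` helper, the `author`/`sentence` field names, and the sample line are all hypothetical, and it uses an untrained `PunktSentenceTokenizer` so that it runs without downloading NLTK's pretrained models; the original processing may well have used `nltk.sent_tokenize` with a Dutch model instead.

```python
from nltk.tokenize.punkt import PunktSentenceTokenizer

# Assumption: an untrained Punkt tokenizer (no abbreviation list, no download).
# The dataset itself may have been built with a pretrained Dutch model.
tokenizer = PunktSentenceTokenizer()


def split_play(author, raw_text):
    """Split raw play text into sentences, keeping the author with each one.

    The record layout ("author", "sentence") is hypothetical and only
    illustrates the idea of sentence-level rows with author indications.
    """
    return [
        {"author": author, "sentence": sentence}
        for sentence in tokenizer.tokenize(raw_text)
    ]


# Made-up sample text, not taken from the corpus.
sample = "Wel Jan, wat zegt gy daar van? Ik zeg niet met al. Kom, laat ons gaan."
records = split_play("Anonymous", sample)

for record in records:
    print(record["author"], "|", record["sentence"])
```

An untrained tokenizer splits at every sentence-final punctuation mark, so abbreviations and initials in real play texts would need a trained model or extra handling.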