Although speech recognition involves a substantial amount of natural language processing, there has been very little collaboration between linguists and speech recognition engineers.
In addition to the ambitious aim of significantly reducing the gap between human speech recognition and automatic speech recognition, the specific objectives of this project are: (i) a flexible, layered speech recognition architecture that allows one to plug in and evaluate new information sources (linguistic or otherwise); (ii) a robust decoding strategy combining left-to-right, right-to-left, bottom-up, and top-down inference that can cope with rich information; (iii) a set of linguistic modules that are compatible with the new architecture and that significantly improve the accuracy of speech recognition.
ELIS will focus on the search strategy and acoustic modelling, while LT3 will focus on improved language modelling.