Speech and Language Technology is concerned with the processing of natural language, in both spoken and textual form. The aim is to develop methods and algorithms by which language can be either analysed or generated. The required level of understanding (for analysis) or well-formedness (for generation) may vary, and a large number of disciplines are relevant. Computational linguistics is the discipline that contributes models of language and language use, while software engineering expertise is an important supporting discipline, as the processing tools may be complex and the data sets huge.

Among the applications of language technology studied at HMI are
  • multimodal dialogue systems that support natural-language interaction between humans and conversational agents. The latter may or may not be embodied in a virtual environment.
  • systems for information extraction from documents that support textual, multilingual and/or multimodal information retrieval.

Statistical language modelling is being researched in the context of information retrieval. Increasingly, machine learning techniques are also being investigated for tasks such as information extraction, labelling expressions in terms of speech and dialogue acts, question answering, and determining the polarity of subjective texts.
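As an illustration of how a statistical language model can be used for retrieval, the following is a minimal sketch of query-likelihood ranking with a unigram model. It is not taken from HMI's systems: the function names, the whitespace tokenizer and the add-alpha smoothing are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # use a proper tokenizer for the language at hand.
    return text.lower().split()


def query_log_likelihood(query, document, vocabulary_size, alpha=1.0):
    """Log-probability of the query under a unigram model of the document.

    Uses add-alpha smoothing (an illustrative choice) so that query terms
    absent from the document do not yield a probability of zero.
    """
    counts = Counter(tokenize(document))
    total = sum(counts.values())
    score = 0.0
    for term in tokenize(query):
        p = (counts[term] + alpha) / (total + alpha * vocabulary_size)
        score += math.log(p)
    return score


def rank(query, documents):
    """Return document indices ordered from most to least likely."""
    vocabulary = {t for d in documents for t in tokenize(d)}
    vocabulary |= set(tokenize(query))
    size = len(vocabulary)
    scored = [
        (query_log_likelihood(query, d, size), i)
        for i, d in enumerate(documents)
    ]
    return [i for _, i in sorted(scored, reverse=True)]
```

Each document is treated as its own small language model, and documents are ordered by how probable they make the query. Production systems typically smooth against collection-wide statistics rather than with a fixed constant.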

In the domain of speech technology, too, both analysis and synthesis play a role. Speech recognition is an analysis task, while dialogue systems have to both recognise and generate spoken language. Research themes at HMI are speech transcription for Dutch, language modelling, speaker identification, emotion detection, dialogue analysis and corpus handling. Speech-driven dialogue systems can be deployed in, for example, automated directory services and interactive learning. Speech transcription can support dictation tools, audio retrieval and voice-controlled user interfaces. Text-to-speech tools can be beneficial for the visually impaired.

A large text corpus of more than 400 million words is maintained in Twente to support the development and testing of statistical modules.


