Multimodal dialogue means that user and computer interact with each other using a choice of different modalities (such as speech, pictures, and gestures). Providing natural multimodal dialogue requires dialogue management that can handle both the user's natural language and natural communication in other modalities. We summarise several HMI projects involving multimodal dialogue management in several different application domains.
Some themes present in this work are: multimodal interaction, dialogue management, question answering, route navigation.

This showcase lists information about Trung Bui's work on the ICIS project, the Virtual Music Centre Guide by Dennis Hofs, and the Vidiam project.
Affective dialogue modeling for affective multimodal dialogue systems

This work aims to develop a dialogue model which is able to take into account aspects of the user's emotional state and act appropriately. The Partially Observable Markov Decision Process (POMDP) is being explored for this purpose. A prototype is under development for analyzing the influence of the user's emotional state on action selection, and how the system should respond in crisis situations.
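The POMDP idea can be illustrated with a minimal belief update over the user's emotional state. Everything below (the two states, the observation cues, and all probabilities) is an illustrative assumption, not taken from the project itself:

```python
# Minimal POMDP-style belief update over the user's emotional state.
# States, observations, and all probabilities are illustrative only.

STATES = ["calm", "stressed"]

# P(next_state | state): stress tends to persist in a crisis.
TRANSITION = {
    "calm":     {"calm": 0.8, "stressed": 0.2},
    "stressed": {"calm": 0.3, "stressed": 0.7},
}

# P(observation | state): noisy cues, e.g. from speech prosody.
OBSERVATION = {
    "calm":     {"calm_voice": 0.7, "tense_voice": 0.3},
    "stressed": {"calm_voice": 0.2, "tense_voice": 0.8},
}

def belief_update(belief, observation):
    """One step of Bayesian filtering: predict, correct, normalise."""
    predicted = {
        s2: sum(belief[s1] * TRANSITION[s1][s2] for s1 in STATES)
        for s2 in STATES
    }
    unnormalised = {s: predicted[s] * OBSERVATION[s][observation] for s in STATES}
    total = sum(unnormalised.values())
    return {s: p / total for s, p in unnormalised.items()}

def choose_action(belief, threshold=0.6):
    """Map the belief state to a simplified system action."""
    return "reassure_user" if belief["stressed"] > threshold else "give_information"
```

The point of the POMDP framing is exactly this separation: the system never observes the emotional state directly, only noisy cues, and actions are chosen from the belief distribution rather than from a single guessed state.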
Generic dialogue modeling for multi-application, multimodal dialogue systems

This work aims to create a unified multimodal dialogue model for a number of applications. The applications (for example, car route navigation, air route navigation, traffic lanes, map and fire management, tunnel management, weather forecast, virtual control room, road surface monitoring, patient information search, and medical worker) are first constructed using the rapid dialogue prototyping methodology and then integrated into a hierarchy using vector-space model techniques. The system uses this hierarchy to switch between applications based on the user's application of interest.
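The vector-space switching step can be sketched as plain cosine similarity between the user's utterance and a keyword profile per application. The keyword profiles below are invented for illustration; the actual hierarchy and models are described in the cited publication:

```python
# Sketch of vector-space application selection: score each application's
# keyword profile against the user's utterance by cosine similarity.
# The keyword profiles are invented for illustration.
import math
from collections import Counter

APP_KEYWORDS = {
    "car_route_navigation": "car road route drive navigate street",
    "weather_forecast": "weather rain sun temperature forecast wind",
    "tunnel_management": "tunnel lane closure ventilation incident",
}

def bow(text):
    """Bag-of-words term counts for a whitespace-tokenised string."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_application(utterance):
    """Return the application whose keyword profile best matches the utterance."""
    u = bow(utterance)
    return max(APP_KEYWORDS, key=lambda app: cosine(u, bow(APP_KEYWORDS[app])))
```

For example, an utterance mentioning rain would score highest against the weather profile and the dialogue manager would switch to that application.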
Twente route navigation demo: an example of the multimodal dialogue management prototype with three input/output modalities, including text and speech.
ICIS Home page: http://hmi.ewi.utwente.nl/project/icis
Trung's personal home page: http://wwwhome.cs.utwente.nl/~buith/
The virtual guide is an agent in the Virtual Music Centre that can help visitors find their way in the building. It includes a multimodal Dutch dialogue system that accepts speech or text input as well as pointing gestures (mouse clicks) from the user. Output comes in the form of speech from the virtual guide.
The dialogue system is mainly suited to be used in a virtual world with objects and agents that can be talked about or pointed at. It consists of several components, including:
- a natural language parser;
- a fusion agent that merges parsed text or speech input with pointing gestures;
- a dialogue act recogniser that acts on a parsed Dutch phrase or sentence using the dialogue history;
- a reference resolver based on salience factors that links references to objects in the virtual world;
- an action stack that the dialogue manager fills by matching the user's dialogue acts to action templates.
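A salience-based resolver of the kind described can be sketched as scoring candidate objects on a few weighted factors such as recency of mention, visibility, and being pointed at. The factor names, weights, and world objects here are illustrative assumptions, not the project's actual salience model:

```python
# Sketch of salience-based reference resolution: each object in the virtual
# world carries boolean salience factors; a referring expression is linked
# to the highest-scoring object of the right type. Weights are illustrative.

WEIGHTS = {"recently_mentioned": 2.0, "visible": 1.0, "pointed_at": 3.0}

WORLD = [
    {"name": "painting_1", "type": "painting",
     "recently_mentioned": True, "visible": True, "pointed_at": False},
    {"name": "painting_2", "type": "painting",
     "recently_mentioned": False, "visible": True, "pointed_at": True},
    {"name": "door_1", "type": "door",
     "recently_mentioned": False, "visible": True, "pointed_at": False},
]

def salience(obj):
    """Sum the weights of all salience factors that hold for the object."""
    return sum(w for factor, w in WEIGHTS.items() if obj[factor])

def resolve(ref_type):
    """Link a referring expression like 'that painting' to a world object."""
    candidates = [o for o in WORLD if o["type"] == ref_type]
    return max(candidates, key=salience)["name"] if candidates else None
```

With these weights, a pointing gesture outranks a recent mention, so "that painting" combined with a click resolves to the object under the pointer rather than the one last talked about.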
Screenshot of the Virtual Guide.
Virtual Guide homepage:
Vidiam is part of the IMIX (Interactive Multimodal Information
eXtraction) project, which concerns a multimodal interactive Question
(QA) system. Unlike most QA systems, IMIX can give answers with pictures in them, and enables the user's information need to be satisfied through natural dialogue.
natural dialogue. The Vidiam dialogue manager recognises several types of
follow-up questions, and several kinds of feedback on the quality of the
answer. In addition to supporting dialogue, Vidiam enables the user to
communicate multimodally. The user can point to or encircle words or
elements on the screen.
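Mapping an encircling gesture to on-screen elements can be sketched as a simple overlap test between the gesture stroke's bounding box and each element's layout box. The element names and coordinates below are made up for illustration:

```python
# Sketch: which on-screen elements does an encircling gesture select?
# Approximate the stroke by its bounding box and test overlap with each
# element's box. Coordinates and element names are illustrative.

ELEMENTS = {
    "answer_word_1": (10, 10, 40, 20),   # boxes as (x1, y1, x2, y2)
    "answer_word_2": (50, 10, 90, 20),
    "answer_picture": (10, 30, 90, 80),
}

def stroke_bbox(points):
    """Bounding box of a gesture stroke given as (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def overlaps(a, b):
    """Axis-aligned rectangle intersection test."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def selected_elements(points):
    """All screen elements whose box intersects the stroke's bounding box."""
    box = stroke_bbox(points)
    return [name for name, ebox in ELEMENTS.items() if overlaps(box, ebox)]
```

A real system would also have to fuse the selected elements with the concurrent spoken or typed follow-up question; this sketch only covers the hit-testing step.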
Extending QA with dialogue and multimodality is still a relatively unexplored area. Basic questions still have to be answered, such as: how do users naturally react to QA answers? And: how are multimodal follow-up questions to be handled? We try to answer these questions with the help of several utterance corpora that we designed in the Vidiam project.
Screen grab of the IMIX system. Top left: animated system architecture. Top right: Ruth-based talking face. Bottom: interaction window.
An interaction from the multimodal interaction experiment. Top: system answer. Bottom: follow-up question. The green sketch line is a user gesture.
Vidiam homepage: http://wwwhome.cs.utwente.nl/~schooten/vidiam/
D.H.W. Hofs, H.J.A. op den Akker and A. Nijholt. A generic architecture and dialogue model for multimodal interaction. In Proceedings of the 1st Nordic Symposium on Multimodal Communication, P. Paggio, K. Jokinen and A. Jönsson (eds), volume 1, CST Publication, Center for Sprogteknologi, Copenhagen, ISSN 1600-339X, pp. 79-91, 2003.

T.H. Bui, J. Zwiers, A. Nijholt and M. Poel. Generic dialogue modeling for multi-application dialogue systems. In Proceedings of the 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, S. Renals and S. Bengio (eds), Lecture Notes in Computer Science, volume 3869, Springer-Verlag, Berlin, ISBN 3-540-32549-2, ISSN 0302-9743, pp. 174-185, 2006.
- DEMON [IMIX/DEMON]
- ICIS [Interactive Collaborative Information Systems]
- Vidiam [IMIX/VIDIAM]