Multimodal dialogue means that user and computer interact with each other using a choice of different modalities (such as speech, pictures, gestures). Providing natural multimodal dialogue requires dialogue management that can handle both the user's natural language and natural communication in other modalities. We summarise several HMI projects involving multimodal dialogue management in a range of application domains.
Some themes present in this work are: multimodal interaction, dialogue management, question answering, route navigation.
This showcase lists information about Trung Bui's work on the ICIS project, the Virtual Music Centre Guide by Dennis Hofs, and the Vidiam project.
Affective dialogue modeling for affective multimodal dialogue systems
This work aims to develop a dialogue model that can take into account some aspects of the user's emotional state and act appropriately. The Partially Observable Markov Decision Process (POMDP) is being explored for this approach. A prototype is under development for analyzing how the user's stress influences their actions and how the system should respond in crisis situations.
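To make the POMDP idea concrete, the following is a minimal sketch of a belief update over a hidden user stress state. It is not the ICIS prototype: the state names, actions, observations, and all probabilities are hypothetical values chosen for illustration, and the action choice is a simple threshold rule rather than a solved POMDP policy.

```python
from typing import Dict

# Hypothetical hidden states of the user (assumption, not from the project).
STATES = ("calm", "stressed")

# Hypothetical transition model: TRANSITION[action][state][next_state].
TRANSITION = {
    "ask": {
        "calm": {"calm": 0.9, "stressed": 0.1},
        "stressed": {"calm": 0.2, "stressed": 0.8},
    },
    "reassure": {
        "calm": {"calm": 0.95, "stressed": 0.05},
        "stressed": {"calm": 0.5, "stressed": 0.5},
    },
}

# Hypothetical observation model: OBSERVATION[state][observation].
OBSERVATION = {
    "calm": {"steady_voice": 0.8, "shaky_voice": 0.2},
    "stressed": {"steady_voice": 0.3, "shaky_voice": 0.7},
}


def update_belief(belief: Dict[str, float], action: str, observation: str) -> Dict[str, float]:
    """Standard POMDP belief update: predict with the transition model,
    weight by the observation likelihood, then normalise."""
    unnormalised = {}
    for next_state in STATES:
        predicted = sum(TRANSITION[action][s][next_state] * belief[s] for s in STATES)
        unnormalised[next_state] = OBSERVATION[next_state][observation] * predicted
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalised.items()}


def choose_action(belief: Dict[str, float], threshold: float = 0.5) -> str:
    """Illustrative threshold rule (not an optimal POMDP policy)."""
    return "reassure" if belief["stressed"] > threshold else "ask"


belief = {"calm": 0.5, "stressed": 0.5}
belief = update_belief(belief, action="ask", observation="shaky_voice")
next_action = choose_action(belief)
```

The point of the sketch is that the system never observes the user's stress directly; it keeps a probability distribution over it and lets that distribution drive its next dialogue move.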
Generic dialogue modeling for multi-application, multimodal dialogue systems
This work aims to create a unified multimodal dialogue model for a large number of applications. The applications (for example, car route navigation, air route navigation, traffic lanes, map and fire management, tunnel sensors management, weather forecast, virtual control room, road surface temperature monitoring, patient information search, and medical worker verification) are first constructed using the rapid dialogue prototyping methodology and then integrated into a hierarchy using vector-space model techniques. The system uses this hierarchy to switch between applications based on the user's application of interest.
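As an illustration of how vector-space techniques can route an utterance to an application, here is a minimal sketch that compares a bag-of-words vector of the user's utterance with a keyword vector per application using cosine similarity. The application descriptions and function names are hypothetical, and the sketch uses a flat list of applications rather than the hierarchy described above.

```python
import math
from collections import Counter

# Hypothetical keyword descriptions for three of the applications.
APPLICATIONS = {
    "car_route_navigation": "car route drive road navigate from to city street",
    "weather_forecast": "weather forecast rain temperature tomorrow wind",
    "patient_information_search": "patient information record search medical name",
}


def vectorise(text: str) -> Counter:
    """Turn text into a term-frequency vector (bag of words)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_application(utterance: str) -> str:
    """Return the application whose description is closest to the utterance."""
    query = vectorise(utterance)
    scores = {name: cosine(query, vectorise(desc)) for name, desc in APPLICATIONS.items()}
    return max(scores, key=scores.get)


chosen = select_application("what is the weather forecast for tomorrow")
```

In a hierarchical variant, the same comparison could be applied first to groups of applications and then within the chosen group; that extension is an assumption and is not shown here.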
Twente route navigation demo. An example of the multimodal dialogue management prototype with three input/output modalities (text, speech, pointing gesture).
ICIS Home page: http://hmi.ewi.utwente.nl/project/icis
Trung's personal home page: http://wwwhome.cs.utwente.nl/~buith/
The virtual guide is an agent in the Virtual Music Centre that can help users find their way in the building. It includes a multimodal Dutch dialogue system that accepts speech or text input as well as pointing gestures (mouse clicks) from the user. Output comes in the form of the virtual guide speaking and making gestures.
The dialogue system is best suited to a virtual world with objects and agents that can be talked about or pointed at. It consists of several components, including a natural language parser, a fusion agent that merges parsed text or speech input with pointing gestures, a dialogue act recogniser that acts on a parsed Dutch phrase or sentence using the dialogue history, a reference resolver based on salience factors that links noun phrases to objects in the virtual world, and an action stack that the dialogue manager fills by matching the user's dialogue acts to action templates.
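The interplay between pointing gestures and salience-based reference resolution can be sketched as follows. This is a simplified illustration, not the virtual guide's actual implementation: the class, the boost and decay constants, and the scoring rule are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorldObject:
    name: str            # unique identifier in the virtual world
    category: str        # noun that can refer to it, e.g. "hall"
    salience: float = 0.0


# Hypothetical salience factors (illustrative values only).
POINTING_BOOST = 3.0   # object was just pointed at
MENTION_BOOST = 1.0    # object was just referred to
DECAY = 0.5            # salience fades with every dialogue turn


def decay_salience(objects: List[WorldObject]) -> None:
    """Call once per turn so that older referents become less salient."""
    for obj in objects:
        obj.salience *= DECAY


def resolve(noun: str, objects: List[WorldObject],
            pointed_at: Optional[str] = None) -> Optional[WorldObject]:
    """Link a noun phrase to the most salient matching object.
    A simultaneous pointing gesture adds a temporary boost."""
    candidates = [o for o in objects if o.category == noun]
    if not candidates:
        return None

    def score(obj: WorldObject) -> float:
        return obj.salience + (POINTING_BOOST if obj.name == pointed_at else 0.0)

    best = max(candidates, key=score)
    best.salience += MENTION_BOOST  # the chosen referent stays salient
    return best


world = [WorldObject("hall_a", "hall"), WorldObject("hall_b", "hall")]
first = resolve("hall", world, pointed_at="hall_b")   # "this hall" + mouse click
decay_salience(world)
second = resolve("hall", world)                       # "the hall", no gesture
```

Here the second, gesture-free reference resolves to the same hall because it was made salient by the earlier pointing gesture.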
Screenshot of the virtual guide
Virtual Guide homepage: http://wwwhome.ewi.utwente.nl/~hofs/dialogue/
Vidiam is part of the IMIX (Interactive Multimodal Information eXtraction) project, which concerns a multimodal interactive Question Answering (QA) system. Unlike most QA systems, IMIX can give answers with pictures in them, and enables the user's information need to be satisfied through a natural dialogue. The Vidiam dialogue manager recognises several types of follow-up questions, and several kinds of feedback on the quality of the answer. In addition to supporting dialogue, Vidiam enables the user to communicate multimodally. The user can point to or encircle words or visual elements on the screen.
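One simple way to map a pointing or encircling gesture to on-screen elements is a bounding-box test, sketched below. This is an assumed approach for illustration, not Vidiam's documented method; the class, function name, and `tap_size` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class ScreenElement:
    label: str      # a word or a picture part shown in the answer
    left: float
    top: float
    right: float
    bottom: float

    def centre(self) -> Point:
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)

    def contains(self, p: Point) -> bool:
        return self.left <= p[0] <= self.right and self.top <= p[1] <= self.bottom


def elements_for_gesture(stroke: List[Point], elements: List[ScreenElement],
                         tap_size: float = 5.0) -> List[ScreenElement]:
    """Return the elements a gesture refers to.
    A very small stroke counts as pointing; a larger one as encircling."""
    if not stroke:
        return []
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)

    is_pointing = (max_x - min_x) <= tap_size and (max_y - min_y) <= tap_size
    if is_pointing:
        return [e for e in elements if e.contains(stroke[0])]

    # Encircling: approximate the stroke by its bounding box and select
    # every element whose centre falls inside it.
    selected = []
    for e in elements:
        cx, cy = e.centre()
        if min_x <= cx <= max_x and min_y <= cy <= max_y:
            selected.append(e)
    return selected


screen = [
    ScreenElement("heart", 10, 10, 50, 30),
    ScreenElement("diagram", 100, 100, 200, 200),
]
tapped = elements_for_gesture([(20, 20)], screen)
circled = elements_for_gesture([(90, 90), (210, 90), (210, 210), (90, 210)], screen)
```

The selected elements could then be passed to the dialogue manager together with the follow-up question, so that an utterance such as "what is this?" has something to refer to.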
Extending QA with dialogue and multimodality is still a relatively unexplored area. Basic questions still have to be answered, such as: how do users naturally react to QA answers? And how should multimodal follow-up questions be handled? We try to answer these questions with the help of several follow-up utterance corpora that we designed in the Vidiam project.
Screen grab of the IMIX system. Top left: animated system architecture. Top right: Ruth-based talking face. Bottom: interaction window.
An interaction from a multimodal interaction experiment. Top: question and answer. Bottom: follow-up question. The green sketch line is a user gesture.
Vidiam homepage: http://wwwhome.cs.utwente.nl/~schooten/vidiam/
Former HMI-members:
- DEMON [IMIX/DEMON]
- ICIS [Interactive Collaborative Information Systems]
- Vidiam [IMIX/VIDIAM]