Activities
The project activities address multimodal dialogue systems from
several perspectives:
- Investigations into the design of multimodal systems where spoken
interaction is one important modality. This involves the design and
implementation of multimodal interfaces, which have been evaluated
with a number of users and settings (Ibrahim & Johansson 2002, 2002b,
Ibrahim et al. 2001, Qvarfordt & Santamarta 2000, Bäckvall et al.
2000, Qvarfordt 2003, Berglund & Qvarfordt 2003, Qvarfordt, Jönsson &
Dahlbäck 2003).
- The use of eye gaze in multimodal dialogue systems. We have
conducted experiments to see how eye gaze can be used in multimodal
dialogue systems. Based on the results of these investigations, a
multimodal dialogue system using eye gaze as an input modality has
been developed and evaluated. All users were able to carry out a set
of tasks using only eye gaze, and they also liked using the system
(Qvarfordt 2004).
- Development of techniques to improve speech recognition. By
combining results from a grammar-based recognizer and a
statistical language model, targeted help can be presented
to the user in the case of unreliable recognition (Gorrell
et al. 2002). The techniques have also been used to identify new
words (Gorrell 2003) and to give users a suggested interpretation,
which the user can accept to get a response or reject to form a new
request (Gorrell 2004).
- Multimodal dialogue system framework development. Our framework
has been extended with a separate module called the Domain Knowledge
Manager (DKM) (Flycht-Eriksson 2001, 2000), the result of graduate
student Annika Flycht-Eriksson's licentiate thesis (2001). The DKM is
responsible for domain reasoning and retrieval of information from
various domain knowledge sources. The DKM cooperates with the
Dialogue Manager (DM) to answer requests for information posed by the
user (Flycht-Eriksson & Jönsson 2000). User utterances are
transformed into information requests by the DM, possibly involving
clarification sub-dialogues with the user. The fully specified request
is then sent to the DKM. The DKM consults one or several information
or domain knowledge sources in order to retrieve the requested
information and produces an answer to the request. This requires that
the DKM knows where and how different types of information should be
retrieved, a task that becomes more difficult when the domain is open
and the information unstructured. Separating domain knowledge from
dialogue knowledge allowed us to integrate ontological knowledge into
the system (Flycht-Eriksson 2003, 2004).
- Synergistic integration of multimodal speech and pen information
(Johansson, H. 2001a, 2001b, 2000). The model consists of an
algorithm for matching and integrating interpretations of inputs from
different modalities, and a grammar that constrains integration.
Integration proper is achieved by unifying feature structures. The
integrator is part of the general framework for multimodal
information systems with dialogue capabilities.
- Implementation of dialogue systems for new applications can be
viewed as a process of customising a generic framework to fit the
needs of a more specific application. To be useful, the framework
must be well documented and modularised. Current work involves
developing language technology resources, such as the dialogue
systems development platform MALIN, into an open source code
repository. We have also presented a method for iteratively
developing a dialogue system from a generic framework (Degerstedt &
Jönsson 2001a, 2001b, Johansson, Degerstedt & Jönsson 2002). As part
of developing techniques for dialogue systems implementation, we have
extended the Phase Process pattern to handle dialogue systems
phenomena, resulting in the Phase Graph Process pattern (Degerstedt &
Johansson 2003).
- Prototype development using the framework. The framework has been
utilised in the implementation of the Nokia TV Programme Guide
(Johansson, Degerstedt & Jönsson 2002, Johansson, P. 2001), and a
limited version is available for download at nlpFarm. Another
prototype system is the birdQuest system, a multimodal dialogue system
based on a bird encyclopaedia (Andén et al. 2003, Jönsson &
Merkel 2003, Flycht-Eriksson & Jönsson 2004).
- Design and development of adaptive multimodal dialogue
systems. Adaptivity is a broad concept, and we have initially studied
how natural language can be used in recommender systems to handle the
cold start problem, i.e. when a new user has to provide information on
his/her preferences in order to create a model that can be used to
give new recommendations based on the user's interests. This is
implemented in the MADFilm system (Johansson, P. 2003, 2004).
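
The speech and pen integration described above relies on unifying
feature structures. As a rough illustration only, the sketch below
shows plain recursive unification over nested Python dictionaries. It
is not the project's integrator: the feature names (act, destination,
type, name) are hypothetical, the grammar that constrains integration
is not modelled, and structure sharing between features is left out.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on a clash.

    Feature structures are modelled as nested dicts with atomic leaves
    (an assumption made for this sketch).
    """
    # Two atoms, or an atom and a structure, unify only if they are equal.
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None

    result = dict(a)  # copy, so that the input structure is not modified
    for feature, b_value in b.items():
        if feature in result:
            merged = unify(result[feature], b_value)
            if merged is None:
                return None  # incompatible values for the same feature
            result[feature] = merged
        else:
            result[feature] = b_value
    return result


# Hypothetical interpretations: speech leaves the destination
# underspecified, and a pen gesture on a map supplies its name.
speech = {"act": "request", "destination": {"type": "location"}}
pen = {"destination": {"type": "location", "name": "Linköping"}}

combined = unify(speech, pen)
clash = unify(speech, {"destination": {"type": "time"}})
```

Here combined holds the merged request with the destination name
filled in, while clash is None because the two inputs disagree on the
destination type.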
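
The division of labour between the DM and the DKM described above can
be sketched roughly as follows. This is a minimal sketch under stated
assumptions, not the framework's actual interface: all class, method
and slot names are hypothetical, the transformation of user
utterances into requests is not modelled, and a toy programme listing
stands in for a real domain knowledge source.

```python
from dataclasses import dataclass, field


@dataclass
class InformationRequest:
    """Hypothetical request form passed from the DM to the DKM."""
    kind: str
    slots: dict = field(default_factory=dict)


class DomainKnowledgeManager:
    """Routes a fully specified request to the sources registered for its kind."""

    def __init__(self):
        self._sources = {}  # request kind -> list of lookup functions

    def register(self, kind, source):
        self._sources.setdefault(kind, []).append(source)

    def answer(self, request):
        # Consult every registered source and pool the results.
        results = []
        for source in self._sources.get(request.kind, []):
            results.extend(source(request.slots))
        return results


class DialogueManager:
    """Asks for missing slots first, then forwards the request to the DKM."""

    def __init__(self, dkm, required_slots):
        self._dkm = dkm
        self._required = required_slots  # request kind -> required slot names

    def handle(self, request):
        missing = [slot for slot in self._required.get(request.kind, ())
                   if slot not in request.slots]
        if missing:
            return ("clarify", missing)  # would start a clarification sub-dialogue
        return ("answer", self._dkm.answer(request))


# Toy programme listing standing in for a real domain knowledge source.
LISTINGS = [
    {"channel": "SVT1", "day": "monday", "title": "News"},
    {"channel": "SVT2", "day": "monday", "title": "Nature"},
]


def listing_source(slots):
    """Return the titles of all listings that match every given slot."""
    return [row["title"] for row in LISTINGS
            if all(row.get(key) == value for key, value in slots.items())]


dkm = DomainKnowledgeManager()
dkm.register("tv_listing", listing_source)
dm = DialogueManager(dkm, {"tv_listing": ("channel", "day")})

partial = dm.handle(InformationRequest("tv_listing", {"channel": "SVT1"}))
full = dm.handle(
    InformationRequest("tv_listing", {"channel": "SVT1", "day": "monday"}))
```

Keeping the source registry inside the DKM means that the DM never
needs to know where a given type of information is retrieved from,
which mirrors the separation of domain knowledge from dialogue
knowledge.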