project 00.05

Multimodal Dialogue Systems for Industrial Applications

Activities

The project activities center around multimodal dialogue systems from various perspectives:

Investigations into the design of multimodal systems in which spoken interaction is an important modality. This involves the design and implementation of multimodal interfaces, which have been evaluated with a number of users and settings (Ibrahim & Johansson 2002, 2002b, Ibrahim et al. 2001, Qvarfordt & Santamarta 2000, Bäckvall et al. 2000, Qvarfordt 2003, Berglund & Qvarfordt 2003, Qvarfordt, Jönsson & Dahlbäck 2003).
The use of eye gaze in multimodal dialogue systems. We have conducted experiments to see how eye gaze can be used in multimodal dialogue systems. Based on the results of these investigations, a multimodal dialogue system using eye gaze as an input modality has been developed and evaluated. All users were able to carry out a set of tasks using only eye gaze, and they also liked using the system (Qvarfordt 2004).
Development of techniques to improve speech recognition. By combining results from a grammar-based recognizer and a statistical language model, targeted help can be presented to the user when recognition is unreliable (Gorrell et al. 2002). The techniques have also been used to identify new words (Gorrell 2003) and to give users a suggested interpretation, which the user can accept to get a response or reject to form a new request (Gorrell 2004).
Multimodal dialogue system framework development. Our framework has been extended with a separate module called the Domain Knowledge Manager (DKM) (Flycht-Eriksson 2001, 2000) (the result of graduate student Annika Flycht-Eriksson's Licentiate Thesis, 2001). The DKM is responsible for domain reasoning and for retrieving information from various domain knowledge sources. The DKM cooperates with the Dialogue Manager (DM) to answer requests for information posed by the user (Flycht-Eriksson & Jönsson 2000). User utterances are transformed into information requests by the DM, possibly involving clarification sub-dialogues with the user. The fully specified request is then sent to the DKM. The DKM consults one or several information or domain knowledge sources to retrieve the requested information and produce an answer to the request. This requires that the DKM knows where and how different types of information should be retrieved, a task that becomes more difficult when the domain is open and the information unstructured. Separating domain knowledge from dialogue knowledge allowed us to integrate ontological knowledge into the system (Flycht-Eriksson 2003, 2004).
Synergistic integration of multimodal speech and pen information (Johansson, H. 2001a, 2001b, 2000). The model consists of an algorithm for matching and integrating interpretations of inputs from different modalities, as well as a grammar that constrains integration. Integration proper is achieved by unifying feature structures. The integrator is part of the general framework for multimodal information systems with dialogue capabilities.
Implementation of dialogue systems for new applications can be viewed as a process of customising a generic framework to fit the needs of a more specific application. To be useful, the framework must be well documented and modularised. Current work involves developing language technology resources, such as the dialogue systems development platform MALIN, into an open source code repository. We have also presented a method for iteratively developing a dialogue system from a generic framework (Degerstedt & Jönsson 2001a, 2001b, Johansson, Degerstedt & Jönsson 2002). As part of developing techniques for dialogue systems implementation, we have extended the Phase Process pattern to handle dialogue system phenomena, resulting in the Phase Graph Process pattern (Degerstedt & Johansson 2003).
Prototype development using the framework. The framework has been utilised in the implementation of the Nokia TV Programme Guide (Johansson, Degerstedt & Jönsson 2002, Pontus Johansson 2001), and a limited version is available for download at nlpFarm. Another prototype system is the birdQuest system, a multimodal dialogue system based on a bird encyclopaedia (Andén et al. 2003, Jönsson & Merkel 2003, Flycht-Eriksson & Jönsson 2004).
Design and development of adaptive multimodal dialogue systems. Adaptivity is a broad concept, and we have initially studied how natural language can be used in a recommender system to handle the cold start problem, i.e. when a new user has to provide information on his/her preferences in order to create a model that can be used to give new recommendations based on the user's interests. This is implemented in the MADFilm system (Johansson, P. 2003, 2004).
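
The request flow between the Dialogue Manager and the Domain Knowledge Manager described above can be illustrated with a minimal sketch. It is not the project's implementation: the class names, the `REQUIRED` table, the `tv_guide` source and all data values are hypothetical, and the step that turns a user utterance into a request is left out. The sketch only shows the division of labour, where the DM asks a clarification question until the request is fully specified and the DKM then consults every knowledge source registered for that kind of request.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A knowledge source maps request parameters to a list of answers (assumption).
Source = Callable[[Dict[str, str]], List[str]]


@dataclass
class InformationRequest:
    kind: str                                    # hypothetical request type
    params: Dict[str, str] = field(default_factory=dict)


# Hypothetical table of parameters a request needs before it is fully specified.
REQUIRED = {"programme_listing": ["channel", "day"]}


class DomainKnowledgeManager:
    """Knows which sources can answer which kind of request."""

    def __init__(self) -> None:
        self._sources: Dict[str, List[Source]] = {}

    def register(self, kind: str, source: Source) -> None:
        self._sources.setdefault(kind, []).append(source)

    def answer(self, request: InformationRequest) -> List[str]:
        results: List[str] = []
        for source in self._sources.get(request.kind, []):
            results.extend(source(request.params))
        return results


class DialogueManager:
    """Clarifies under-specified requests, then delegates to the DKM."""

    def __init__(self, dkm: DomainKnowledgeManager) -> None:
        self.dkm = dkm

    def handle(self, request: InformationRequest) -> str:
        missing = [p for p in REQUIRED.get(request.kind, [])
                   if p not in request.params]
        if missing:
            # Start a clarification sub-dialogue for the first missing parameter.
            return f"Which {missing[0]} do you mean?"
        answers = self.dkm.answer(request)
        return "; ".join(answers) if answers else "No information found."


def tv_guide(params: Dict[str, str]) -> List[str]:
    # Toy knowledge source with made-up data.
    table = {("SVT1", "today"): ["18:00 News", "19:00 Film"]}
    return table.get((params["channel"], params["day"]), [])


dkm = DomainKnowledgeManager()
dkm.register("programme_listing", tv_guide)
dm = DialogueManager(dkm)

print(dm.handle(InformationRequest("programme_listing", {"channel": "SVT1"})))
print(dm.handle(InformationRequest("programme_listing",
                                   {"channel": "SVT1", "day": "today"})))
```

Keeping the source registry inside the DKM means that a new knowledge source can be added without touching the dialogue logic, which is one way to read the separation of domain knowledge from dialogue knowledge.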
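
The unification step mentioned for speech and pen integration can be sketched in a few lines. This is a simplified illustration, not the integrator from the cited work: feature structures are assumed to be plain nested dictionaries, there is no structure sharing (reentrancy), the constraining grammar is omitted, `None` is reserved to signal failure, and the speech and pen values are invented for the example.

```python
from typing import Any, Dict, Optional

FeatureStructure = Dict[str, Any]


def unify(a: Any, b: Any) -> Optional[Any]:
    """Unify two feature structures; return None if they clash.

    Assumption: None is never used as a feature value.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so the inputs are not modified
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is None:
                    return None  # conflicting values under the same feature
                result[key] = merged
            else:
                result[key] = b_value
        return result
    # Atomic values unify only if they are equal.
    return a if a == b else None


# Hypothetical interpretation of a spoken request containing a deictic
# reference ("this channel").
speech: FeatureStructure = {
    "act": "request",
    "object": {"type": "programme"},
    "channel": {"reference": "deictic"},
}

# Hypothetical interpretation of a pen gesture pointing at a channel.
pen: FeatureStructure = {"channel": {"name": "SVT1"}}

combined = unify(speech, pen)
clash = unify({"object": {"type": "programme"}},
              {"object": {"type": "channel"}})
```

Here `combined` carries the features of both inputs, with the channel filled in from the pen gesture, while `clash` is `None` because the two inputs disagree on the value of `type`.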

References

Page responsible: Webmaster
Last updated: 2004-04-22

IDA - Department of Computer and Information Science
