Multimodal Dialogue Systems for Industrial Applications


Rapid developments in speech technology have enabled a number of commercial applications. Currently, most of these are spoken dialogue systems for English, but there is also growing interest in Swedish systems. The best-known application today is perhaps SJ's system for timetable information. At the Natural Language Processing Laboratory we also see growing interest from companies, as more and more students write their master's theses on the use of speech technology.

One of the shortcomings of most dialogue systems, and an area where more research is needed, is their limited ability to understand and correctly handle the dialogue, e.g. what to do when the user deviates from the pre-defined sequence of actions, when and how to ask for clarification, and how to handle focus information. Most computer applications that involve interaction with a human user could benefit from knowledge of dialogue systems. A dialogue system can be viewed as a system through which users interact with a background system in an efficient and intuitive manner. The background system can be any information system, such as an Internet information retrieval system, a business administration system, a decision support system or a tutoring system. There are also examples of dialogue systems being used for interaction in aircraft and cars.
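As an illustration of the dialogue handling issues mentioned above, the following is a minimal sketch of a frame-based dialogue manager for a timetable query. It is not taken from any existing system: the slot names, the DialogueState class and both functions are hypothetical, and the user input is assumed to be already parsed into slot/value pairs. The sketch lets the user fill slots in any order, uses the focus (the slot the system last asked about) to interpret a bare answer, and asks for clarification when a bare answer has no focus to attach to.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical slots for a timetable query; chosen only for illustration.
SLOT_ORDER = ["origin", "destination", "time"]


@dataclass
class DialogueState:
    slots: dict = field(default_factory=dict)
    focus: Optional[str] = None  # slot the system most recently asked about


def next_action(state):
    """Ask for the first missing slot, or query the background system."""
    for slot in SLOT_ORDER:
        if slot not in state.slots:
            state.focus = slot
            return ("ask", slot)
    state.focus = None
    return ("query_background_system", dict(state.slots))


def handle_user_input(state, parsed):
    """Update the state from one parsed user turn and pick the next action.

    `parsed` maps slot names to values. The key None holds a bare value
    with no slot marker, e.g. when the user only says "Stockholm".
    """
    # Named slots are accepted in any order, so the user may deviate
    # from the sequence in which the system would have asked.
    for slot, value in parsed.items():
        if slot is not None:
            state.slots[slot] = value

    bare = parsed.get(None)
    if bare is not None:
        if state.focus is None:
            # Nothing to attach the value to: ask the user what was meant.
            return ("clarify", bare)
        state.slots[state.focus] = bare

    return next_action(state)
```

In this sketch, a user who answers the question about the origin by giving the destination and time instead is simply asked for the origin again, and a subsequent one-word answer is then interpreted as the origin because that slot is in focus.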

Another important area of research is multi-modal interaction, together with knowledge representation and reasoning for the efficient use of different modalities. As we see it, efficient and intuitive interaction involves natural language as an important modality. Natural language interaction can be speech only, e.g. over a telephone, or it can be combined with other modalities, such as gestures, graphics, text, menus, icons or a simulated talking head. Combinations of these give rise to a variety of dialogue system realisations, from simple telephone-based timetable information systems to multi-modal tutoring systems, used by individuals or groups, that combine several modalities with complex domain reasoning.

