The Linköping Dialogue Corpus

The corpus consists of 60 dialogues collected in Wizard of Oz experiments. In half of the dialogues the subjects believed that they were communicating with a computer; in the other half the subjects had been told that it was a simulation.
 

The Cars scenario

The scenario in the Cars experiments presented a situation where the subject, and his/her accompanying person, had just received the message that their old favourite Mercedes had broken down beyond repair and that they would have to consider buying a new car. They had a certain amount of money available and were asked to select three cars using the computerized Cars consultant. They were also asked to give a short motivation for their choice.

The Cars system is a database with information from a consumer guide on properties of used cars. The subjects specify a car, or a set of cars, and request data on various properties such as fuel consumption, price, top speed, etc. They also ask what type of information is available in the database and how to interpret it, e.g. Cars:9:13> what does rust 5 mean. The Cars database is implemented in INGRES and runs on a Sun SPARCstation. The database contains 79 different cars, and for each car there is information on 28 different properties. There is also a set of canned texts explaining each property. After a successful database retrieval, the system answers with a table of information on properties of used cars.
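The request pattern described above (name one or more cars, name one or more properties, get a table back, or ask for the canned text explaining a property) can be sketched in Python. This is only an illustration and not the INGRES implementation: the `CARS` and `EXPLANATIONS` dictionaries, the `lookup` and `explain` functions, and every car name and value are hypothetical placeholders.

```python
# Hypothetical stand-in for the Cars database. The real system held
# 79 cars with 28 properties each; names and values here are placeholders.
CARS = {
    "car_a": {"price": 10, "fuel_consumption": 0.8, "rust": 5},
    "car_b": {"price": 12, "fuel_consumption": 0.9, "rust": 3},
}

# One canned explanatory text per property (placeholder wording).
EXPLANATIONS = {
    "rust": "placeholder explanation of the rust rating",
}


def lookup(cars, properties):
    """Return one table row per requested car, limited to the requested properties."""
    return [
        {"car": name, **{prop: CARS[name][prop] for prop in properties}}
        for name in cars
    ]


def explain(prop):
    """Return the canned text for a property, or a fallback if none exists."""
    return EXPLANATIONS.get(prop, "no explanation available")


# A request for two properties of a set of cars yields a small table.
table = lookup(["car_a", "car_b"], ["price", "rust"])
```

A question such as "what does rust 5 mean" would, in this sketch, map to `explain("rust")` rather than to a table lookup.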

Dialogue 1 from the Cars corpus (in Swedish). English translation. 

The Travel scenario 

The Travel dialogues were collected using two scenarios: one where the subjects were asked to gather information on charter trips to the Greek Archipelago, and another where they had a certain amount of money available and were asked to use the Travel system to order such a charter trip. It was possible to provide graphical information to the subjects, i.e. maps of the various islands.

The underlying domain model in the Travel dialogues is hierarchically organized and thus more complex than the one used in the Cars dialogues. At the top level is Greece. Below that are the various resorts, and at the bottom are the hotels at each resort. The subjects can request information from any level in the hierarchy, e.g. Travel2:4:1> how is the weather in greece in july; Travel2:17:7> information about ikaria; Travel2:20:43> distance to the beach from villa ioli and koubis.

There are 6 resorts. For each resort there is information on 17 different properties, such as climate, entertainment, sights, etc. Each resort has between 1 and 7 hotels. Each hotel has 11 different properties, for example price, whether breakfast is served, whether there is a pool, distance to the beach, etc. There are also 10 properties related to ordering a particular holiday trip, such as cancellation insurance, departure date, airport, etc., and some information on Greece.
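The three-level hierarchy (country, resorts, hotels) and the ability to ask about any level can be sketched as a small tree in Python. This is only an illustration, not the Travel system's implementation: the `Node` class, the `query` helper, and all node names, property names, and values are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of the domain model: the country, a resort, or a hotel."""
    name: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def find(self, name):
        """Depth-first search for a node by name; returns None if absent."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None


def query(root, name, prop):
    """Look up a property on whichever level of the hierarchy holds `name`."""
    node = root.find(name)
    if node is None:
        return None
    return node.properties.get(prop)


# Placeholder data: one resort with one hotel under the top-level country.
greece = Node(
    "greece",
    {"weather_july": "<placeholder>"},
    [
        Node(
            "resort_x",
            {"climate": "<placeholder>"},
            [Node("hotel_y", {"distance_to_beach": "<placeholder>"})],
        )
    ],
)
```

In this sketch the same `query` call serves all three levels, for example `query(greece, "greece", "weather_july")` or `query(greece, "hotel_y", "distance_to_beach")`.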

Dialogue 4 from the Travel corpus (in Swedish). English translation.

Dialogue 3 from the Travel and order corpus (in Swedish). English translation. 

Also available are all Cars dialogues (gzipped) and all Travel dialogues. The dialogues are in Swedish.


Page responsible: Webmaster
Last updated: 2012-05-07