
Development of Generic Resources for Language Technology




System development and design





System development

A code repository is one important part of successful dissemination of language technology software resources. To be successful, we also need a method for systems development. Such a method should conform with general software development methods, but be tailored to fit the needs of language technology. Our current experience of methods for software development mainly involves dialogue systems. In dialogue systems, the processes that carry out the system's various tasks, such as parsing, dialogue control and domain knowledge management, are fairly complex but small, i.e. they involve little code. Instead, like many AI systems, dialogue systems are knowledge intensive. Furthermore, much of the knowledge is acquired during the development of the system. Thus, the method should support evolutionary development based on running prototypes capable of handling more and more dialogue phenomena. However, prototype refinement often involves re-design of various aspects, so design and coding are carried out together.

We have presented such a method for the implementation of dialogue management modules (postscript). The method ties issues of conceptual design to the components involved in customising a generic framework, with a clear correspondence between the two. It advocates that coding and design go together and that a dialogue system is implemented iteratively, with capabilities added cumulatively. Coding should start as soon as possible, before all details of the system's design are ready; code takes the place of chart diagrams. A prototype is developed from the start and is gradually refined based on evaluations of its behaviour. The method has grown gradually out of our work on implementing dialogue systems. We do not claim that it is ready yet, but we believe that it is a step towards a software engineering method for dialogue systems development.
We have not yet been able to verify the method, but once the interactive news reader is complete, or near completion, we will have a chance to carry out a more systematic evaluation.

On System Design for Language Technology

Contributions from work associated with the research prototype framework MALIN largely discuss different aspects of dialogue system architecture. The results range from the roles of modules to formats of data representation. The work is mainly empirically motivated and has been developed towards an increasingly domain-independent model by examining different application areas. Turning these architectural models of MALIN into a more complete system design has been a second important part of the implementation work, in accordance with our development method. The design work has mainly followed the principles of object-oriented thinking, in particular object-oriented frameworks, design patterns and ideas from the Java world. The models include design decisions on several levels: modules, procedures, data formats and communication channels suitable for the MALIN architecture. At this point these models are only described briefly in documents related to the system code. We suggest that the system design focus on the representation and flow of information. The finished design should normally include discussions on:
  • modularisation: identification of central sub-units and definition of their responsibilities. We suggest that submodules be identified on three levels: control, handlers and methods.
  • knowledge representation: identification and abstract formulation of data items. The formulation is preferably kept in formal or semi-formal terms and based on the selected use cases.
  • interfaces: formulation of the interface functionality and (sub)module dependencies that define the central data flows of a module, both internally and towards other modules.
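As a minimal sketch of how the three suggested submodule levels could look in Java, the example below separates a control unit, handlers behind a common interface, and small helper methods. All names (`Utterance`, `Handler`, `Control`, `TextMethods` and the two example handlers) are hypothetical and chosen for illustration only; they are not taken from the MALIN code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Knowledge representation: a hypothetical data item for one user utterance.
final class Utterance {
    final String text;
    Utterance(String text) { this.text = text; }
}

// Interface: the only dependency between the control level and the handlers.
interface Handler {
    boolean accepts(Utterance u);
    String handle(Utterance u);
}

// Method level: small helper operations shared by the handlers.
final class TextMethods {
    private TextMethods() {}
    static String normalise(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }
}

// Handler level: each handler has a single, clearly defined responsibility.
final class GreetingHandler implements Handler {
    public boolean accepts(Utterance u) {
        return TextMethods.normalise(u.text).startsWith("hello");
    }
    public String handle(Utterance u) { return "greeting"; }
}

final class QuestionHandler implements Handler {
    public boolean accepts(Utterance u) {
        return TextMethods.normalise(u.text).endsWith("?");
    }
    public String handle(Utterance u) { return "question"; }
}

// Control level: routes each data item to the first handler that accepts it.
final class Control {
    private final List<Handler> handlers = new ArrayList<>();
    void register(Handler h) { handlers.add(h); }
    String dispatch(Utterance u) {
        for (Handler h : handlers) {
            if (h.accepts(u)) {
                return h.handle(u);
            }
        }
        return "unhandled";
    }
}

class ModularisationSketch {
    public static void main(String[] args) {
        Control control = new Control();
        control.register(new GreetingHandler());
        control.register(new QuestionHandler());
        System.out.println(control.dispatch(new Utterance("  Hello there")));
        System.out.println(control.dispatch(new Utterance("What is new?")));
        System.out.println(control.dispatch(new Utterance("Goodbye")));
    }
}
```

In this sketch the data flow is visible in the interface alone: the control unit only knows `Handler` and `Utterance`, so a handler can be added or replaced during prototype refinement without touching the control code.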
Moreover, to be able to re-use already developed code in a flexible way, we can distinguish between different forms of re-use. We suggest a combination of the following complementary forms of code modules:
  • tools: modules that introduce their own data format on a higher level, such as a grammatical parser.
  • object-oriented framework templates: domain-independent pieces of code, such as a generic Java class taxonomy.
  • separate library code: smaller, useful pieces of software that can be used in a large number of language technology systems.
  • code patterns: well-documented sample code.
  • code patterns: well-documented sample code.
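
To illustrate how two of these forms might combine, the following sketch pairs a small piece of library code with an object-oriented framework template that is customised for one domain. It is an assumption-laden example, not the project's actual class taxonomy: `Tokens`, `DialogueManagerTemplate` and `WordCountManager` are hypothetical names, and the "domain" is deliberately trivial.

```java
import java.util.Arrays;
import java.util.List;

// Separate library code: a small helper usable in many systems.
final class Tokens {
    private Tokens() {}
    static List<String> split(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }
}

// Framework template: domain-independent control flow, generic over the
// type R of the domain-specific interpretation result.
abstract class DialogueManagerTemplate<R> {
    // The overall flow is fixed by the framework.
    final String respond(String input) {
        List<String> tokens = Tokens.split(input);
        R result = interpret(tokens);
        return generate(result);
    }

    // Customisation points filled in for each application domain.
    protected abstract R interpret(List<String> tokens);
    protected abstract String generate(R result);
}

// Customisation for a hypothetical toy domain: report the number of words.
final class WordCountManager extends DialogueManagerTemplate<Integer> {
    @Override
    protected Integer interpret(List<String> tokens) {
        return tokens.size();
    }

    @Override
    protected String generate(Integer count) {
        return "You said " + count + " word(s).";
    }
}

class ReuseSketch {
    public static void main(String[] args) {
        DialogueManagerTemplate<Integer> manager = new WordCountManager();
        System.out.println(manager.respond("show me the news"));
    }
}
```

The template keeps the processing order in one place, while each new application only supplies the two customisation methods. A tool such as a parser would differ from both forms here in that it brings its own data format rather than plugging into an existing one.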

Page responsible: Webmaster
Last updated: 2012-05-07