
Development of Generic Resources for Language Technology



Project Work Plan

The project focuses on developing generically designed software for language technology, with an open design, well-defined protocols, standards and formats, and documentation. This involves understanding how to break current prototypes into usable pieces of code, which will then be refined and debugged into robust code with simple, generic interfaces. These open source modules will also be documented, and application-level protocols will be defined. The modules will be used in practice in various projects. One source of such usage feedback is the co-operation with Nokia Home Communications on the development of a dialogue system, which will be iterated further and will include some of the modules developed during this phase. Similarly, the text extraction modules will be used in the project with Ida Infront. The project is divided into the following three phases:

Start up (year 1)

Initially, the main effort will go into making existing research prototypes available as the cornerstones of an open source software library. In particular, this includes developing open source versions of the research prototype JavaChart and of submodules from the MALIN framework. This part of the project addresses only that task; further research on the language technology aspects of these modules will be carried out in Mifis and the Ceniit project. At first, the resources will be hosted on public project services such as SourceForge. We subsequently intend to move the language technology software library to a future local infrastructure for open source activities. Initial work on a common infrastructure and design patterns will also be done during the start-up phase.

Extension (year 2)

In the second phase, we expect feedback and modifications from users to take the work in new directions. We also plan to incrementally add further research prototypes to the library as they become more complete through work in other projects. The second phase is also when more extensive work on, and evaluation of, the infrastructure and development methodology will be carried out. As a parallel activity during year two, we propose that a course on object-oriented application development using open source techniques be held for the national research school on language technology.

Completion (year 3)

By the final phase, an open source community should gradually have formed. We hope that contacts established earlier with industrial partners will have deepened into closer cooperation, in which the parties take an active part in each other's feedback and work. In parallel, in educational activities, students will be engaged in a more formalised way in the open source activities on language technology. Finally, for our own research on dialogue systems, we aim for stable and generic open source modules for all major parts of a spoken dialogue system: parsing, dialogue, domain handling, generation, and an animated user interface. The modules should be well organised in a suitable infrastructure, have an associated development method, and be verified as useful in at least one industrial application.

Page responsible: Webmaster
Last updated: 2012-05-07