A Hybrid System for Knowledge-Based Information Retrieval

Svend Jacobsen


This thesis attempts to address the limitations of traditional information retrieval systems, in which information in a document collection is searched for by keyword matching. Since keyword matching is often not expressive enough to formulate queries that correspond to the desired information, we propose a model that also uses a knowledge-representation system based on description logics. This makes it possible to formulate complex queries using document structure as well as domain knowledge and semantic information associated with the documents. Such a system could, for instance, be used to search selected parts of the Internet.

We have implemented a prototype system that allows the user to formulate queries combining traditional keyword searches with the capabilities of a description logic system. The system can also be queried via a WWW browser using HTML forms. It has been tested on a small document base, with promising but inconclusive results. Possible future research includes designing a more user-friendly interface, finding efficient ways of representing the knowledge base in both secondary and primary memory, and overcoming some of the limitations on document structure imposed by the description logic.