The CAISOR Research Agenda for Information Analysis, Software Systems, Open-Access Repositories and Knowledge Representation

The CAISOR Research Agenda

Erik Sandewall

Linköping University and KTH - Royal Institute of Technology, Sweden

These web pages define my research agenda, the Combined Agenda for Information Analysis, Software Systems, Open-Access Repositories, and Knowledge Representation (CAISOR). The agenda is carried out in cooperation with colleagues and students at Linköping University (Linköping, Sweden) and at KTH - Royal Institute of Technology in Stockholm, Sweden.

Agenda Components

The full name of the CAISOR agenda indicates its four main aspects, which can be explained as follows.

Knowledge Representation: We use a basic formalism, Knowledge Representation Expressions (KRE), which is designed so that it can be used effectively both for publication purposes in textbooks, articles and webpages, and for communicating structured information between computer and user. The Knowledge Representation Framework (KRF) adds further layers on top of the KRE formalism and uses it for the representation of actions and change and for defeasible inheritance as used, e.g., in ontologies.
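As an illustration of defeasible inheritance as it is commonly used in ontologies, the following minimal Python sketch looks up a property along a chain of parent concepts, so that a more specific concept can override a default inherited from a more general one. It does not use KRE or KRF notation; the `Concept` type, the `lookup` function and the example concepts are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a toy ontology (hypothetical structure, not KRE/KRF)."""
    name: str
    parent: Optional["Concept"] = None
    properties: dict = field(default_factory=dict)


def lookup(concept, prop):
    """Return the most specific value of prop, or None if no ancestor defines it."""
    node = concept
    while node is not None:
        if prop in node.properties:
            # A value stated on a more specific concept defeats inherited defaults.
            return node.properties[prop]
        node = node.parent
    return None


# Illustrative data only: a general default and one exception to it.
bird = Concept("bird", properties={"can_fly": True, "has_feathers": True})
penguin = Concept("penguin", parent=bird, properties={"can_fly": False})
sparrow = Concept("sparrow", parent=bird)

print(lookup(sparrow, "can_fly"))
print(lookup(penguin, "can_fly"))
print(lookup(penguin, "has_feathers"))
```

Here `sparrow` inherits the default from `bird`, while `penguin` overrides `can_fly` but still inherits `has_feathers`.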
Analysis and Development of Electronic Publishing Techniques (ADEPT): This includes the following timely topics:
- Authoring and management of research articles and reports
- Management of institutional repositories
- Principles and software for peer review and other editorial processes
- Principles and technologies for repositories of research data, facts and knowledge.
With respect to the fourth sub-item, we address in particular the development of repositories for common knowledge, that is, things that are known by or easily learnt by anyone with a general education. This includes, e.g., knowledge about entities in physical and political geography (rivers, countries, etc.), knowledge about animals, plants and foodstuffs, knowledge about appliances and other generally used technical devices, and so forth. Common knowledge is therefore complementary to discipline-specific knowledge that arises in particular sciences or professions. In our approach, repositories for common knowledge are organized as libraries of knowledge modules with limited and well-documented interdependencies between the modules, and they are made available under an open-access license.
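The idea of a library of knowledge modules with limited and well-documented interdependencies can be sketched as follows. This is only an illustration under stated assumptions: the module names, the `LIBRARY` mapping and the `load_order` function are hypothetical and do not describe the actual repository. Each module declares the modules it depends on, and the check rejects undeclared or circular dependencies before computing an order in which modules could be loaded.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical module names; each maps to the modules it explicitly depends on.
LIBRARY = {
    "countries": [],
    "rivers": ["countries"],
    "plants": [],
    "foodstuffs": ["plants"],
}


def load_order(library):
    """Return an order in which modules can be loaded, dependencies first."""
    for module, deps in library.items():
        # Every dependency must itself be a documented module in the library.
        missing = [d for d in deps if d not in library]
        if missing:
            raise ValueError(f"{module} depends on unknown modules: {missing}")
    try:
        # TopologicalSorter treats each mapped list as the node's predecessors.
        return list(TopologicalSorter(library).static_order())
    except CycleError as err:
        raise ValueError(f"circular dependency: {err.args[1]}") from err


order = load_order(LIBRARY)
print(order)
```

Keeping the dependency declarations explicit is what makes such a check possible: a module that silently relied on another would not appear in the mapping and could not be validated.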
Information Analysis and Knowledge Acquisition (INKA): In order to be fully usable, repositories for common knowledge must satisfy strict quality requirements of several kinds, including factual correctness and formal consistency. There are also a number of important flexibility requirements on them, including the capability for graceful extension, and the capability to accommodate information that originates in different natural languages. The development of such repositories will therefore require a considerable amount of work, which means that the organization of that work becomes an issue in itself. We propose to recognize information analysis as the major activity in this respect, namely, an activity where information that is publicly available on the Internet (or otherwise) is checked and corrected by a combination of automatic and manual means, resulting in validated knowledge modules that can be included in a repository. We also propose that several aspects of traditional scientific publication ought to be carried over to the publication of knowledge modules, in particular, the use of peer review and the requirement of proper citation of earlier work.
Software Systems; LEONARDO: The Knowledge Representation Framework is also being used for an experiment with a new way of organizing the overall software system in a computer. We observe that conventional computer software consists of a large number of separate systems, most of which are associated with their particular languages. There are operating systems, many programming languages, database systems, markup languages, and many others. We also observe that the same concepts recur in different systems and languages with fairly trivial variations. This state of the art is costly and wasteful in terms of both human and computer resources. It is our hypothesis that it should be possible to organize the overall system in a more coherent and comprehensive way. An experimental software system, called Leonardo, is being developed in order to test this hypothesis. The Leonardo system is based on the Knowledge Representation Framework, and some of the application systems that have been built on the existing Leonardo platform are in actual use as tools for information analysis and for supporting our repository of common knowledge.

Additional Information -- The CAISOR Website Cluster

A more detailed explanation of this agenda, the results so far, and other related information can be found using the following sources:

The report defining the agenda.
The CAISOR agenda website, which contains explanations of activities defined by the agenda and links to their respective project websites.
The website of the Knowledge Representation Framework which provides the systematic basis for activities within the agenda.
The CAISOR archive containing links to earlier work that led up to the present agenda, thereby providing a historical perspective.
The CAISOR hyperbook (under development) which will complement and integrate the previous two resources.

Latest update 2009-10-10. Contact information here.