Electronic Transactions on
Artificial Intelligence

Organized and published under the auspices of the European Coordinating Committee for Artificial Intelligence (ECCAI)


ETAI Software Support: Distributed Architecture

Background

Our group has developed a bundle of support software which I use for developing and maintaining the ETAI on-line materials, including both the web pages and the documents. This software, which is called AIMS (for Academic Information Management System), is also used for a number of related purposes, namely:

Generating web pages pertaining to the Linköping E-Press.
Maintaining the ACRES bibliographic database. The purpose of ACRES is to support the development of bibliographies, including both the bibliographies that belong to research articles and notes, and annotated bibliographies of the literature in a particular area.
Maintaining the Publication Register in our Computer Science department. (We have stopped issuing technical reports; instead, there is a register of all articles published by department members, with links to wherever the article may be held on-line.)
Knowledge management within the WITAS project (intelligent UAVs).

The main part of AIMS is in CommonLisp, and is used under the XLISP implementation of (approximate) CommonLisp, which has the advantage of being quite fast. (Debugging support is mediocre, though.) Minor parts of AIMS are written in other languages, and can be invoked from the Lisp part. Finally, a Java version of the database management parts is near completion. That work is done by John Olsson in our lab.

In addition, there is a parallel effort at the DFKI, done by Gerd Herzog, where they have built a quite large bibliographic database (LIDOS) and software services around it. There is a commonality of goals and interests between LIDOS and ACRES, and transfer of database contents between the two has been implemented.

Remote availability using communication directories

It is clear that several of the AIMS services could be of good use to ETAI editors. (I use this term to denote both ETAI area editors, and others who may be involved with preparing and maintaining ETAI publications). This is not to say that everyone has to use it, since each area editor is quite free to set up her or his structures with whatever tool s/he chooses. However, several area editors have expressed an interest in having this kind of software support.

At the same time, we are all familiar with the headaches of exporting software to other institutions and maintaining it there. Also, the AIMS system is being extended continuously, which means that the sending out and reception of new versions causes work for everyone involved, and that some parts may sometimes be a bit shaky.

Reimplementing everything in Java has been discussed, but we concluded that it is not a viable idea.

The following technical solution has been found to this problem.

From the present, monolithic AIMS system, we factor out specific services as self-contained Lisp modules. This means that one doesn't have to load the whole system in order to perform one particular operation.
Each AIMS module operates in "file(s) in, file(s) out" mode. Even the operation of adding one more document description to the database (to take an extreme case) is done by preparing a small text file with the bibliographic information, and giving it to the appropriate module.
The following transfer method is defined for each editor, using the term "server" for our computer system and "client" for the computer system that the editor is working on. For each editor, a specific communication directory is defined on the server side, and also one on the client side. These directories should be WWW accessible, so that whatever is put there can be fetched using the HTTP protocol.
Each editor has a symbolic name (identifier) for use in all communication (e.g. "smith"), and the server side software maintains a mapping from editor symbols to communication directories on both sides.
For each AIMS module, we construct a CGI script, and a web page that's able to invoke that CGI script in the server. The web page normally contains only two fields, to be filled in with the editor identifier and the name of the file to be processed. Modules requiring more than one input file will of course have more fields in their web page.
To invoke a module, the user constructs the input file(s), puts them in her client-side communication directory, and invokes the module's web page with her own identifier and the name of the file (relative to the communication directory) in the two fields.
The CGI script receives the client identifier, looks up the name of the client communication directory and concatenates the input file name to it, fetches that file over the net, runs the computation, and puts the result in the server-side communication directory of this editor. The file name of the result is a well defined function of the file name of the input (often the identity function).
Alternative methods of returning the results are also possible, for example, generating an E-mail message that is sent to the editor. Also, of course, the web page that's returned from invoking the CGI script contains the results of the computation, or a link to them.
Once the result has been placed in the server-side communication directory, it is up to the user to decide whether to leave it there or to fetch it to her own computer system.
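
The steps above can be sketched roughly as follows. This is a minimal Python illustration only, not the AIMS code itself (which is in Lisp, invoked through CGI scripts). The names `EditorDirs`, `handle_request` and `result_name`, the layout of the registry, and the idea of passing the module as a function from bytes to bytes are all assumptions made for the sketch.

```python
from pathlib import Path
from typing import Callable, Dict, NamedTuple
from urllib.request import urlopen


class EditorDirs(NamedTuple):
    """Hypothetical registry entry: both communication directories of one editor."""
    client_url: str   # WWW-accessible client-side communication directory
    server_dir: Path  # server-side communication directory


def fetch_over_http(url: str) -> bytes:
    """Fetch one input file from a client-side communication directory."""
    with urlopen(url) as response:
        return response.read()


def result_name(input_name: str) -> str:
    """Map the input file name to the result file name (here: identity)."""
    return input_name


def handle_request(
    editor_id: str,
    file_name: str,
    registry: Dict[str, EditorDirs],
    module: Callable[[bytes], bytes],
    fetch: Callable[[str], bytes] = fetch_over_http,
) -> Path:
    """Look up the editor, fetch the input, run the module, store the result."""
    dirs = registry[editor_id]  # an unknown identifier raises KeyError
    # Concatenate the input file name to the client communication directory.
    source_url = dirs.client_url.rstrip("/") + "/" + file_name
    output = module(fetch(source_url))
    # Put the result in the server-side communication directory of this editor.
    dirs.server_dir.mkdir(parents=True, exist_ok=True)
    target = dirs.server_dir / result_name(file_name)
    target.write_bytes(output)
    return target
```

Because input is only ever fetched from a directory found in the registry, a request with an unknown identifier fails before any file transfer takes place. The `fetch` parameter is there only so that the sketch can be tried out without network access.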

One advantage of this arrangement is that it is inherently fairly secure, since input will only be taken from known communication directories. However, we may also impose additional security measures, such as a check against excessive computational or data storage loads (in the event that some outsider would think it's funny to overload the system by invoking the same module very often), and a check that the server-side request comes from the correct computer system.
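
As an illustration of the two additional checks, here is a small Python sketch. It is one possible design and not something that exists in AIMS: the class `RequestGuard`, the per-editor table of allowed host names, and the sliding-window limit on the number of requests are all assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class RequestGuard:
    """Hypothetical guard: checks the requesting host and the request rate."""

    def __init__(self, allowed_hosts: Dict[str, str],
                 max_requests: int, window_seconds: float) -> None:
        self.allowed_hosts = allowed_hosts    # editor identifier -> expected host
        self.max_requests = max_requests      # requests permitted per window
        self.window_seconds = window_seconds  # length of the sliding window
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def permits(self, editor_id: str, remote_host: str,
                now: Optional[float] = None) -> bool:
        # Check that the request comes from the correct computer system.
        if self.allowed_hosts.get(editor_id) != remote_host:
            return False
        now = time.monotonic() if now is None else now
        recent = self._history[editor_id]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        # Refuse when the same editor has invoked modules too often.
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A CGI script could consult such a guard before fetching anything, for example `guard.permits("smith", remote_host)`, where `remote_host` would be whatever host name the web server reports for the request.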

Anyway, the main advantage is that we can make the existing software available to editors anywhere, without having to go through the pains of distributing it to other sites. (By the way, you are of course welcome to have the source code if you want it; that's not the issue.)

Restrictions, extensions, and alternatives for the communication directory scheme

For those operations which involve assembling the contents of a number of files into a larger structure, or analyzing a set of files possibly at several sites, one may wish to use an input file that in turn contains URLs for other files. Technically there is no problem with this, except for the occasional unreliability of WWW file transfer.
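
A sketch of how such an input file might be handled is given below, in Python. The file format (one URL per line, with blank lines and lines starting with `#` ignored) and the retry count are assumptions, since no format is fixed here. The function names are likewise hypothetical.

```python
from typing import Callable, Dict, List
from urllib.request import urlopen


def parse_manifest(text: str) -> List[str]:
    """Return the URLs listed in an input file, one per non-comment line."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def fetch_over_http(url: str) -> bytes:
    """Fetch one referenced file, giving up after 30 seconds."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_all(urls: List[str],
              fetch: Callable[[str], bytes] = fetch_over_http,
              attempts: int = 3) -> Dict[str, bytes]:
    """Fetch every URL, retrying each one to cope with unreliable transfer."""
    contents: Dict[str, bytes] = {}
    for url in urls:
        for attempt in range(1, attempts + 1):
            try:
                contents[url] = fetch(url)
                break
            except OSError:  # urllib's URLError and timeouts are OSError subclasses
                if attempt == attempts:
                    raise  # still failing after the last attempt
    return contents
```

Retrying a few times before giving up is only one way of dealing with unreliable transfer; a module could equally well report the missing files in its result and process the rest.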

An obvious restriction of the communication directory scheme is that it does not offer the convenience of forms-based data entry and editing or of a graphical user interface for controlling the details of the computation. However, it is still possible to use the web page invoking the module for setting additional parameters for it.

For data entry, for example for contributing the list of contents of a conference or workshop, I am not sure that it is such a big problem to have to use text file input. As a matter of personal taste, I think it is more convenient to prepare a text file using Emacs than to have to navigate between forms and form fields using a mouse. On the other hand, text-file input means that the user has to be more careful with syntax, since incorrect data are not recognized until the whole file is processed.

The only major problem, I think, is data editing: modifying objects in the database after they have first been entered. For this purpose, we should plan to use the remote database editor being developed in Java by John Olsson. It sets up an editor process on the client side, communicating with a corresponding server-side process, and allows navigation in the object database and making changes along the way. User authorization can then be done when the process is initiated, and is in force for as long as the process is running.

How to get started

In order to get started with these services under the communication directory scheme, just send me an E-mail message indicating what identifier you wish to use, and the name (URL style) of your client-side communication directory. As soon as you receive the return message confirming that the server-side directory has been set up, you are ready to use the modules described in the on-line documentation.