The Laboratory for Library and Information Science
Department of Computer and Information Science
Linköping University
Sweden
We are in the process of writing a new research program. The first - "A Program for LIBLAB, the library research laboratory at Linköping University" was written in 1982 and the second - "Research Program for LIBLAB" was written in 1988. A summary of the second program was included in the "Activity Report of the Department of Computer and Information Science for 1993-94", parts of which have been used for the text below.
Descriptions of current research activities can for the moment be found in the home pages of Liblab's members. An overview is being prepared.
The text below, written in 1995, describes the activities in 1993-94.
Research at the Laboratory for Library and Information Science (LIBLAB) is focused on long-term studies of the interactions, positive and negative, between information technology and the generation of, access to, and use of documents and document collections. The information technologies par excellence today are of course computers and telecommunications.
Within this very broad area the main objects of study are the issues of designing and using catalogues - tools for access to large collections of documents. The application domain within which this research so far has been carried out has been libraries.
One of the effects of early information technology, writing, and later printing, was a proliferation of texts. Libraries have for a very long time been one of the two main social responses to the cumulation of writings. Libraries can be regarded as a social device for providing access to publications - experiences, ideas and knowledge documented in text and made available to the public. Libraries are hence quickly and deeply influenced by any change in information technology that impinges on the creation, distribution and use of publications. New media, new forms of publications, and new methods of scientific communication and knowledge organization, and the interactions of these with library functions, are therefore of primary interest to LIBLAB.
The other main social response to the proliferation of texts is archives. One of the main differences between libraries and archives is in the type of texts they collect and organize. Whereas libraries are mainly concerned with publications - texts that are intended for the public, and which consequently have usually been produced in multiple copies - archives are mainly concerned with records - texts that provide evidence of actions, e.g. administrative or commercial or justiciable, and that mostly exist in one or a few copies. In the archives environment specific classes of old collections of records are progressively being digitized. The development of computerized access tools for the collections of traditional records is also growing. The computer-based creation, use and storage of records is furthermore increasing. All of these developments raise issues similar to those in libraries when access is considered.
The collections of libraries and archives are an important part of the total cultural heritage of mankind but societies can survive without them. The concept of a document can be broadened to encompass "that which serves to show or prove something" or "something written, inscribed, etc., which furnishes evidence or information upon any subject, as a manuscript, title-deed, coin etc." (The Shorter Oxford English Dictionary on Historical Principles, 3rd ed.). Artifacts, small and large, as well as processes (e.g. customs and procedures) and structures (e.g. of organizations) are also important carriers of culture and can therefore be regarded as documents that carry a "text" that, however, is not as easily "read" as a traditional writing. The information that can be deciphered by a trained "reader" is usually transcribed and documented in "ordinary" documents, cf. the notes, photos, sketches etc. produced by an ethnographic researcher during a field study.
Museums are the social institutions that have the same functions for artifacts (and objets trouvés) as libraries and archives have for texts. For some artifacts (e.g. buildings or environments) that because of their nature (size etc.) cannot be collected and organized in one place, there are usually also special national heritage institutions.
Libraries, archives, museums, and national heritage institutions are the prime examples of institutionalized memory institutions. They can also be regarded as cultural repositories in that they collect, describe, conserve, and organize for access items of material culture.
In addition there are institutions and persons performing the same tasks for natural objects. Among these a distinction can be made between those that focus on dead items: rocks, shells, stuffed animals etc., and those that are concerned with living matter: genes, seeds, plants and animals, environments. In the latter case there are specific problems in that living matter has life-cycles and thus has both strict needs with regard to the immediate environment and requirements for continued renewal of the stock.
The first research program for LIBLAB was written in 1982. A second, revised, program, the main contents of which are included below, was formulated in 1988.
Within the broad area of study - catalogues as tools for access to large collections - there are two main themes:
The second theme is concerned with descriptions and representations of documents (of all kinds as indicated in the introduction) and collections and their relations at different levels. The context of these descriptions and their representations is computerized catalogues as tools for access to documents. Within this theme we have four subthemes:
A third version of the program, focusing on the next five to six years and taking into consideration the changes in the environments and the reorientations of interests at LIBLAB discussed below, is in the process of being defined.
The trend towards the convergence of communications and computers, and the use by knowledge workers of networked information resources, viz. the Internet and the World Wide Web, through powerful workstations as well as PDAs (Personal Digital Assistants), imply that catalogues are one kind of tool (albeit one with wide applicability) among many other kinds. They should therefore be designed considering both the personal information management situations of the individual user and the characteristics of the collections and their items.
The scope of the research at LIBLAB has in recent years been broadened to cover implications for archives, museums and cultural heritage institutions as well as libraries. Technical documentation is another area that also is receiving attention, mainly because it seems to be the domain in which formalisms for document description and architecture, and their applications - e.g. SGML and HyTime, are having an impact.
Publications seem to be the only class of documents for which attributes related to spatio-temporal coordinates are only incidentally an important characteristic. For most of the other types of documents discussed above it is important to specify from where and when they originate.
Issues relating to access to media forms that traditionally have been broadcast (recreation/entertainment, news, opinion etc.) have in the last year been recognized by LIBLAB as an emerging area meriting close attention. One of the reasons is the materialization of access through cable to hundreds of TV-channels, video-on-demand etc. Two other reasons are the multimedia description issues arising, and the long neglect in library and information science of most aspects of retrieval of recreation/entertainment.
The activities since 1992 are described briefly, to provide a background to present activities as well as continuity for readers of previous reports.
The activities during 1993 and 1994 have been characterized by the work on the TemaKat/IdaKat-project, further described below, and by the continued development of an undergraduate education program in Informatics.
Research activities at LIBLAB during this period have been grouped into six areas.
TemaKat/IdaKat has been the major research undertaking of LIBLAB during 1993 - 94. The original goal was to provide two of the themes, Communication, and Technology and Social Change, of the department of Themes with a catalogue that provides those of the functionalities envisaged for the HYPERCATalog that are feasible, given the constraints of equipment available at the themes, and the resources available at LIBLAB. In addition TemaKat is used as a platform for development of methodology, both with regard to systems analysis and design, and specific parts such as qualitative studies of users. TemaKat was planned to consist of traditional phases:
In the first phase most of the prospective users of TemaKat were interviewed extensively with regard to their reading and writing habits, their use of catalogues, libraries and literature. The resulting transcribed documents are being used as databases in the compilation of the results from the user studies for a qualitative analysis and synthesis of requirements.
In the second phase the findings from the first phase and the ideas and results from earlier HYPERCATalog experiments were merged and a resulting specification was used to produce a database, built on top of the RAM-DBMS WSIris by a group of students, and to initiate work on an interface.
In the third phase, when TemaKat was to be introduced and its uses studied as part of an evaluation, the supporting infrastructure, a LAN, was not in place at the Themes department. To save the work done it was decided to do an implementation at LIBLAB's home department, IDA, instead. (The presence of the necessary LAN was not assured at the time of original planning for the project and IDA was thus proposed as a testing ground in the first proposal to the funding agency, which, however, decided that the Themes department should be the user group.)
The reorientation (with the concomitant name change to IdaKat) had deep consequences. The user interface that had been designed for a Mac + (which was the least common denominator at the themes) had to be completely redesigned for Open Windows on SUN workstations, taking into account the larger screen and the capabilities for multiprocessing etc. The database content, which was to be taken from the catalogue of the library of the Themes department (available in MARC-format), had to be replaced by the file, kept at IDA, of reports and other publications acquired over the years, a file produced by many people with no training in cataloguing. The first two attempts at designing and implementing an interface were failures for a number of reasons, and at present work is going on at extending WILLOW (the Washington Information Looker Upper Layered on Windows), a Z39.50 client, to handle links and input of bibliographic information, and to provide a two-layered access restriction. The actual use of IdaKat is expected to begin in the fall of 1995.
One of the few benefits of this not too uncommon delay in a long-term project is that the Internet has meanwhile become a household word, and that we have recognized that in our HyperCatalogs vision we cannot isolate the local catalogue, which TemaKat and IdaKat were designed to be, from the rest of the world's information resources.
The research activities in Geographical Information Systems (GIS) are focused both on two distinct but related areas concerning future GIS - generalization and architectures for spatio-temporal structuring - and on applications of traditional GIS to environmental issues, especially waste management.
The main investigator in this activity is Jonas Persson, who intends to base his dissertation on the ideas presented below.
Cartographic generalization has long been used to obtain simpler maps and maps at smaller scales. As spatial information systems have evolved from being map presentation systems to true information systems with reasoning capabilities, it has turned out that generalization can also work as a powerful heuristic for solving some problems.
One such problem is finding the shortest path in a complex environment. The more complex the map is, the greater the problem. In fact the size of the problem increases exponentially with the map complexity. To get around this it is possible to generalize the map by filtering, straightening, simplification etc. and then solve the problem on the simpler map. After that the path is projected onto the original map (if necessary) to give a hint of a possible solution. The preliminary solution is then refined until a reasonable one is achieved.
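The generalize, solve, project and refine steps can be illustrated with a minimal Python sketch. It is not the method used in the project: it assumes a map given as a grid of free (0) and blocked (1) cells, takes "generalization" to mean merging k-by-k blocks of cells, and takes "refinement" to mean a search restricted to the corridor hinted at by the coarse path, with a full search as fallback. All function names are hypothetical.

```python
from collections import deque


def shortest_path(grid, start, goal, allowed=None):
    """Breadth-first shortest 4-connected path on a 0/1 grid, or None.

    If `allowed` is given, only cells in that set may be visited.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0
                    and nxt not in prev
                    and (allowed is None or nxt in allowed)):
                prev[nxt] = cell
                queue.append(nxt)
    return None


def generalize(grid, k):
    """Assumed generalization: a k-by-k block is free if any of its cells is free."""
    rows, cols = len(grid), len(grid[0])
    return [[0 if any(grid[r][c] == 0
                      for r in range(R, min(R + k, rows))
                      for c in range(C, min(C + k, cols))) else 1
             for C in range(0, cols, k)]
            for R in range(0, rows, k)]


def coarse_to_fine_path(grid, start, goal, k=2):
    """Solve on the generalized map, project the result back, then refine."""
    rows, cols = len(grid), len(grid[0])
    coarse = generalize(grid, k)
    hint = shortest_path(coarse,
                         (start[0] // k, start[1] // k),
                         (goal[0] // k, goal[1] // k))
    if hint is not None:
        # Projection: the coarse path marks a corridor of cells on the original map.
        corridor = {(r, c)
                    for R, C in hint
                    for r in range(R * k, min((R + 1) * k, rows))
                    for c in range(C * k, min((C + 1) * k, cols))}
        refined = shortest_path(grid, start, goal, allowed=corridor)
        if refined is not None:
            return refined
    # The hint did not lead to a path: fall back to a search of the full map.
    return shortest_path(grid, start, goal)
```

The path found inside the corridor is not guaranteed to be the shortest on the full map. That is the trade-off of using generalization as a heuristic: a smaller search in exchange for a reasonable rather than an optimal solution.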
In management and command and control systems there are several other knowledge structures besides the map to keep in mind, e.g. personnel, supply, time and plans. If a similar kind of generalized reasoning could be used for them much work might be saved.
A feature of generalizations is that they reduce the amount of information. This should not be a disadvantage since it is always possible to recreate the previous states from the original data.
However, a problem is how the generalizations should be performed and on what assumptions they should be based. In traditional data handling these approximations are performed on quantitative (exact) information, but in many cases these apparently precise data are not at all that precise. Qualitative reasoning is a recent trend in artificial intelligence and spatial information handling. Instead of using quantitative, numeric data, relative and approximate terms are used, e.g. north of, close to, straight and small. This way of reasoning is more like the one performed by humans when trying to solve spatial problems. By using qualitative reasoning the hope is to achieve more appropriate generalizations in a natural way.
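As an illustration of the step from quantitative to qualitative terms, the following Python sketch maps pairs of coordinates to relations such as "north of" and "close to". It is only a sketch under stated assumptions: the coordinate convention (x grows eastwards, y northwards), the eight 45-degree direction sectors and the distance thresholds are all chosen here for illustration and are not taken from the project.

```python
import math

# Assumed sector names, counter-clockwise from east, each covering 45 degrees.
DIRECTIONS = ["east of", "north-east of", "north of", "north-west of",
              "west of", "south-west of", "south of", "south-east of"]


def qualitative_direction(a, b):
    """Direction of point a relative to point b (x grows east, y grows north)."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    if dx == 0 and dy == 0:
        return "at the same place as"
    angle = math.degrees(math.atan2(dy, dx)) % 360
    return DIRECTIONS[int((angle + 22.5) // 45) % 8]


def qualitative_distance(a, b, close=1.0, far=10.0):
    """Distance class of a relative to b; both thresholds are illustrative assumptions."""
    d = math.hypot(a[0] - b[0], a[1] - b[1])
    if d <= close:
        return "close to"
    if d >= far:
        return "far from"
    return "at medium distance from"
```

With these definitions, `qualitative_direction((0, 5), (0, 0))` yields "north of" and `qualitative_distance((0, 0), (0.5, 0))` yields "close to". Reasoning can then proceed on these terms rather than on the apparently precise numbers behind them.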
The main investigator in this activity is Andreas Björklind, who intends to base his dissertation on the ideas presented below.
Today's information systems are stretched to their limits in situations of continuous, large quantity flows of spatio-temporal data. Such situations are encountered in, for example, assessment and control of crises and emergencies. Meta-databases for the management of data about data have become a necessity. There are four main differences between the meta-databases for assessment and control systems and traditional information retrieval systems (e.g. bibliographic information systems). The first is the need for speed in generating and updating the meta-database. The second is the spatio-temporal aspects of the handled information. The third is the large volume and variety of information to be managed and to be transformed to symbolic form. The fourth is the collaborative aspect of the interaction.
In this project an architecture for organizing heterogeneous spatio-temporal multimedia data is proposed. In large scale assessment and control operations the scale of information flow is vast. One of the most important problems is not just to allow for the flow itself, but rather to automatically interpret the structure and contents of documents, and relate them to each other.
Common to all documents in this project is that they are coordinate based, i.e. linked to one or more points or regions in space and time. For this reason all information flowing into the system can be indicated on a map together with a time stamp. As a consequence a geographical information system (GIS) will be required, and a GIS that has been developed at FOA in Linköping (FOA is the Defence Research Establishment in Sweden) will be used.
The international standard Hypermedia/Time-Based Structuring Language - HyTime (ISO 10744) - an extension of SGML (Standard Generalized Mark-up Language, ISO 8879), provides a formal framework for this project. It is suitable for the descriptive tasks arising in the context of assessment and control systems, both for the planning tasks and for the meta-database design. One of the tasks in the project is to analyze the various kinds of incoming information with regard to their various characteristics. The resulting analytic information will be used to design methods of symbolic description for the meta-database. It will also be used to design appropriate indexing and classification schemes, and to create thesauri. The transformation of the information to symbolic form, using HyTime, will enable it to be used both in reasoning and for subsequent presentation in the GIS.
The meta-database in this project is concerned with the logical structure and access methods of documents in the database. The main problem is to design a meta-database that can handle a large flow of documents in which spatio-temporal data are of great concern. The meta-database cannot be fixed in its structure, but must be dynamic, depending on the actual crisis. The meta-database must also include means for decision making based on various methods for spatio-temporal reasoning that can be used in other subsystems or applications that will be using the meta-database. It will be necessary to have means for selecting the information that is available in the database with respect both to time and space.
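Selection with respect to both time and space can be sketched in Python as follows. This is a deliberately simplified illustration, not the project's design: each meta-database entry is assumed to describe one document by a single point and a single time interval (the text allows several points or regions), and the record layout and all names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MetaRecord:
    """Hypothetical meta-database entry: where and when a document applies."""
    doc_id: str
    x: float
    y: float
    start: datetime
    end: datetime


def select(records, x_min, y_min, x_max, y_max, t_from, t_to):
    """Return records located inside the box whose interval overlaps [t_from, t_to]."""
    return [r for r in records
            if x_min <= r.x <= x_max
            and y_min <= r.y <= y_max
            and r.start <= t_to
            and r.end >= t_from]
```

A linear scan like this would not meet the speed requirement for large flows of documents. A real meta-database would need spatial and temporal indexing, which is outside the scope of this sketch.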
The main investigator in this activity is Åke Sivertun.
The main investigator in this activity is Roland Hjerppe, together with Erland Jungert, FOA.
Work in this area is reported e.g. in a report from the COST-14 Project CO-TECH, Working Group 3. (see "other publications").
The main investigator in this activity is Roland Hjerppe.
The work on generalization of the concept of document, and the consequences for descriptions, has been reported e.g. at the Third ISKO Conference (see under "External refereed publications").
The main investigator in this activity is Lisbeth Björklund, who is writing her dissertation based on the ideas presented below.
One area which has a long tradition in designing large public systems is Library and Information Science. Here we have two kinds of systems used by many people who have not been involved in their development and design: Information Retrieval systems for bibliographical databases, and OPACs (On-line Public Access Catalogues) offered for access to information on library holdings. These systems were originally designed for other user groups, librarians and intermediaries, and later adapted for use by other people. These adaptations have been guided by studies of the use of the systems, which has led to a long tradition of user studies in LIS.
User studies in this area have focused on studying the overall information behavior of groups of people, on studying the actual use of different systems in order to improve the usability, or on studying the interaction of an individual and a system in order to improve the interface or the retrieval technique. The techniques used have mainly been elicitation, trying to draw out the information from the user, or discovery, finding out by intensive study of the information seeking behavior. Real user participation in design situations, e.g. the development of new OPACs, is rare, although there are some recent approaches in this direction.
User studies in library and information science have changed over the years. In the 1986 ARIST chapter on Information Needs and Uses, Dervin and Nilan described a forthcoming paradigm shift. They described the traditional paradigm, where the information is objective, processed by the user, and external observation is used for making propositions on the use of systems. Answers to "what" questions are sought by the use of quantitative methods. In the alternative paradigm, exemplified by three different approaches, the user is seen as the one who is constructing information, using systems as one tool for understanding. The focus is on the user, and "how" questions are studied preferably with qualitative methods, since these parameters are not measurable. In 1990 Hewins validated this paradigm shift, and pointed at the cognitive approach as the next paradigm to come. And, in 1991, Allen wrote an ARIST chapter on cognitive research in information science.
If we look at other disciplines, such as Human-Computer Interaction, which has evolved in a similar way, going from system orientation to cognitive psychology, there is a shift under way, turning from the individual and the machine towards a "social paradigm": the user as she behaves as an individual in a certain situation, in relation to other people. This has emerged in the field of Computer Supported Cooperative Work (CSCW), where the surroundings and the relations to other people in different working situations are acknowledged as important factors in system development. In this research area we find many examples of very profound studies of people in their everyday work, performed in order to guide the design of computerized support systems. Approaches in this direction in Library and Information Science are rare, but Ellis acknowledges this as he writes on the problems of choosing observation methods in the studies of information-seeking patterns of academic researchers - "as information seeking is integrated with the rest of their activities in a way that makes observation almost totally impracticable."
The overall trend in these different research areas is that the user is in focus, as part of a social setting, and the methods used to study her must, at least partly, be of a qualitative nature, in order to capture her needs and preferences in design processes.
The aim of the dissertation is to define and describe a formal method for the management of qualitative data, i.e. from user interviews. A comparative study of this formal approach, and some traditional tools for computer supported qualitative analysis, in the light of systems design, is also part of the work. The method is developed for design of bibliographic information systems, but might be generalized to support user-oriented design of large, public systems in general.
The whole LIBLAB group has been involved in investigating this area, since it is very tightly connected to the undergraduate courses in Informatics. This is an area which will have great impact on our next research program. Activities during the years covered by this report have mainly been devoted to initial studies and extensive use of the Internet. Some results and discussions are presented in e.g. (Björklund et al. 1995). Much effort has been put into the presentation of possibilities and limitations of the Net to a broad audience. This has taken the form of a number of open seminars, giving people a chance to get a first real contact with the Internet.
Roland Hjerppe, rolhj@ida.liu.se 1996-03-19