Papers at the workshop will be refereed for a special issue of the Machine Intelligence area of the ETAI (Electronic Transactions on Artificial Intelligence) Journal. Referee comments for inclusion of papers will be posted on the MI18 ETAI web page.
We would like to acknowledge generous support of the Machine Intelligence 18 workshop by EPSRC Grant GR/R69655/01.
Programme for Machine
Intelligence 18 (PDF - includes information about breaks and session
chairs)
Contributors
Accommodation Information - Please contact Jane White (jane@cs.york.ac.uk) concerning booking of
accommodation at the Dean
Court and Minster
Hotels.
Speaker: Kiyoshi Asai
Title: Finding Genes from DNA Sequences by Using Multi-stream Hidden
Markov Models
Abstract: It is necessary to integrate various
types of information in order to recognize the genes in DNA sequences. The
information might be the four letters of the DNA sequence, the stochastic frequencies of the corresponding codons, homology search scores, and splice site signal strengths.
We have developed a software system based on multi-stream Hidden Markov Models (HMMs) with which this information can be easily integrated under a consistent measure of probability. The output symbols of HMMs are the signals that we can observe.
In the field of bioinformatics, the output symbols of the HMMs are mostly the
four letters of nucleic acids or the twenty letters of amino acids. However, the
output symbols can be anything, as long as we can attach probability distributions to them. They can be discrete symbols, real values, real-valued vectors,
and multiple streams of those values. We propose gene annotation using HMMs with
multiple streams, which combine the sequence and other pre-processed
information. The important feature of multi-stream HMMs is that the weights of
the streams can be optimized for each model. Multi-stream HMMs with adjustable weights permit very flexible design of gene-finding systems.
(URL: http://www.genedecoder.org/)
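As an illustration of the stream-weighting idea above, the following minimal sketch (Python; it is not the GeneDecoder implementation, and the state, stream names and weights are invented) shows how one state's emission score can combine a discrete nucleotide stream with a real-valued splice-score stream through weighted log-probabilities.

```python
# Illustrative sketch (not the GeneDecoder code): a multi-stream HMM state
# combines per-stream emission probabilities with adjustable stream weights.
import math

def emission_logprob(state, observation, weights):
    """Weighted sum of per-stream log-probabilities for one time step.

    `state` maps stream names to probability functions/densities,
    `observation` maps the same stream names to observed values,
    `weights` maps stream names to non-negative stream weights.
    """
    return sum(weights[s] * math.log(state[s](observation[s]))
               for s in observation)

# Example with two streams: nucleotide letters and a real-valued splice score.
coding_state = {
    "dna":    lambda x: {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}[x],
    "splice": lambda x: math.exp(-0.5 * (x - 1.0) ** 2) / math.sqrt(2 * math.pi),
}
obs = {"dna": "G", "splice": 0.8}
print(emission_logprob(coding_state, obs, {"dna": 1.0, "splice": 0.5}))
```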
TBA
Speaker: Michael Ashburner
Title: On the representation of "gene function and process" in genome
databases: The Gene Ontology Consortium
Abstract:
Genes are
expressed in temporally and spatially characteristic patterns. Their products
are (often) located in specific cellular compartments and may be components of
complexes. Gene products possess one or more functions which may play a role in
one or more biological processes. We are developing a method to describe these
attributes in a rigorous way that will allow researchers to consistently
annotate and explore the universe of genomes. This is a collaborative project
between several model organism databases: FlyBase for the fruit fly, the
Saccharomyces Genome Database for yeast, Mouse Genome Database for mouse, TAIR
for Arabidopsis and WormBase for C. elegans. GO is now being used quite
extensively in industry. The three organizing principles of GO are function,
process and cellular component. A gene product can have one or more functions
and can be used in one or more processes; a gene product may be a component of
one or more supra-molecular complexes. Function is what something does. It
describes only what it can do without specifying where or when this usage
actually occurs. Process is a biological objective. A process is accomplished
via one or more ordered assemblies of functions. The model organism database
curators assign controlled vocabulary terms to gene products based on current
knowledge. This information describes the specific function(s) of a gene
product, the role(s) it plays in cellular processes and its localization and
associations. The structure of the "Gene Ontology" Database allows both
attribution and querying at different levels of granularity. The Gene Ontology
database provides annotation, definitions, and unique identifiers for all terms.
The cooperation between the model organism databases ensures the use of a common
ontology and, therefore, facilitates queries across all cooperating model
organism databases. Gene Ontology, therefore, allows users to make queries
exemplified by: "What mouse genes encode products playing a role in RAS signal transduction?"; "What fly genes encode products in this biological process?"; "What genes in flies and yeast have products that are components of the mitochondrial inner membrane and function as enzymes?". GO is modelled as a
directed acyclic graph, with two classes of relationship between parent and
child - instance (isa) and part-of. The semantics of these relationships are not
as well defined as we would wish. In principle, all GO terms have formal
definitions (in practice only 11% as I write). GO is also being instantiated in
DAML+OIL by the Manchester group. The GO database is open to all without
restraint. It, together with a GO browser and editor, is available from http://www.geneontology.org/. We thank AstraZeneca & the NIH for financial
support.
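The following small sketch (Python; the terms, edge labels and gene names are invented for illustration and do not come from the GO database) shows the kind of structure the abstract describes: terms connected by isa and part_of edges in a directed acyclic graph, with annotation queries answered at any level of granularity by walking up to ancestor terms.

```python
# Illustrative sketch of GO-style annotation: terms form a directed acyclic
# graph with "isa" and "part_of" edges, and queries can be asked at any level
# of granularity by climbing to ancestor terms. All names here are made up.
go_edges = {
    "mitochondrial inner membrane": [("part_of", "mitochondrial membrane")],
    "mitochondrial membrane":       [("part_of", "mitochondrion")],
    "mitochondrion":                [("isa", "intracellular organelle")],
}
annotations = {"geneX": ["mitochondrial inner membrane"]}

def ancestors(term):
    """All terms reachable from `term` via isa/part_of edges."""
    found, stack = set(), [term]
    while stack:
        for _, parent in go_edges.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def genes_annotated_to(term):
    """Genes annotated to `term` itself or to any more specific descendant."""
    return [g for g, terms in annotations.items()
            if any(term == t or term in ancestors(t) for t in terms)]

print(genes_annotated_to("mitochondrion"))   # -> ['geneX']
```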
Speaker: Christopher H. Bryant
Title: Combining Inductive Logic Programming, Active Learning and
Robotics to Discover the Function of Genes
Abstract:
We aim
to partially automate some aspects of scientific work, namely the processes of
forming hypotheses, devising trials to discriminate between these competing
hypotheses, physically performing these trials and then using the results of
these trials to converge upon an accurate hypothesis. We have developed
ASE-Progol, an Active Learning system which uses Inductive Logic Programming to
construct hypothesised first-order theories and uses a CART-like algorithm to
select trials for eliminating ILP derived hypotheses. We have developed a novel
form of learning curve, which in contrast to the form of learning curve normally
used in Active Learning, allows one to compare the costs incurred by different
learning strategies. We plan to combine ASE-Progol with a standard laboratory
robot to create a general automated approach to Functional Genomics. As a first
step towards this goal, we are using ASE-Progol to rediscover how genes
participate in the aromatic amino acid pathway of Saccharomyces cerevisiae. Our
approach involves auxotrophic mutant trials. To date, ASE-Progol has conducted
such trials in silico. However we describe how they will be performed
automatically in vitro by a standard laboratory robot designed for these sorts
of liquid handling tasks, namely the Beckman/Coulter Biomek 2000.
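A schematic version of the closed loop described above is sketched below (Python; this is an illustration of active hypothesis elimination, not the ASE-Progol code, and the trial-selection heuristic is a simple stand-in for the CART-like procedure).

```python
# Sketch of an active-learning loop: keep a set of competing hypotheses,
# pick the trial that best discriminates between them, run it, and discard
# the hypotheses the outcome refutes, while tracking the cost incurred.
def active_learning(hypotheses, candidate_trials, run_trial, cost):
    total_cost = 0.0
    while len(hypotheses) > 1 and candidate_trials:
        def split_score(trial):
            # Prefer trials whose predicted outcomes split the hypotheses evenly.
            predicted = [h(trial) for h in hypotheses]
            return abs(predicted.count(True) - predicted.count(False))
        trial = min(candidate_trials, key=split_score)
        candidate_trials.remove(trial)

        outcome = run_trial(trial)          # in silico, or later by a robot
        total_cost += cost(trial)
        hypotheses = [h for h in hypotheses if h(trial) == outcome]
    return hypotheses, total_cost

# Toy usage: two hypotheses about which gene knockout prevents growth.
hyps = [lambda t: t != "knockout_g1", lambda t: t != "knockout_g2"]
trials = ["knockout_g1", "knockout_g2"]
truth = lambda t: t != "knockout_g1"        # pretend g1 is the essential gene
remaining, spent = active_learning(hyps, trials, truth, cost=lambda t: 1.0)
print(len(remaining), "hypothesis left after spending", spent)
```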
Speaker: Jacques Cohen
Title:
Classification of approaches used to study cell regulation: Search for a unified
view using constraints and machine learning
Abstract:
Historically, the first attempts to study cell regulation were based on
establishing systems of nonlinear differential equations that "simulate" the
behavior of the variables involved as a function of time. In that context the
variables correspond to genes capable of producing compounds (mRNA and proteins)
that in turn influence the production capacity of other genes. This problem has
been thoroughly studied in the mathematical theory of dynamic systems. Graphs
expressing the interactions among genes are often used as convenient
abstractions enabling the formulation of the appropriate systems of differential
equations. The nodes of those graphs correspond to the variables (genes) and the
directed edges -- labeled by plus or minus signs -- express the positive or
negative interactions among the variables. In the case of cell regulation
thousands of genes may interact with each other. The advent of micro-arrays
urgently motivates the solution of the inverse problem: given the curves
expressing the capacity of each gene of producing known compounds as time
progresses, attempt to infer the original system of equations or equivalently
the graph expressing the interactions among genes. Presently biologists are
capable of generating huge sets of data expressing the amount of production of
proteins or mRNA as a function of time. One then wishes to establish the
corresponding gene interaction graph. The inverse problem may be called
"modelization" in contrast with "simulation", the term used to describe the
direct problem of determining the behavior of the variables in a system of
differential equations. There have been many attempts to discretize (render
discrete) the cell regulation problem. By considering Boolean variables instead
of continuous variables one can greatly reduce the complexity of the
original problem. Several papers are now available that describe solutions of
both the direct and inverse problem using the Boolean approach. An important
aspect of the inverse problem is that the data obtained by biologists is not
reliable in the sense that many experimental errors occur when dealing with
thousands of genes. It then becomes imperative to adopt a probabilistic approach to the inverse problem. In other words, given imperfect
data expressing the behavior of variables with time, can one infer the
corresponding gene interaction graph? More recently this last problem has been
simplified by having biologists conjecture the interaction graph and propose the
simpler question: how likely is it that the hypothesized graph generated the given data set? The approach used in this paper is to first attempt to classify
the existing methods proposed in the literature to solve the direct and inverse
problems. An initial classification involves three parameters and thus eight
subclasses: probabilistic versus nonprobabilistic; direct versus inverse; continuous versus discrete. In a recent article, de Jong and Page propose a
Direct-Discrete-Nonprobabilistic type of approach; that amounts to a qualitative
(a la Kuipers) simulation of the systems of differential equations representing
a given influence graph. Even though their algorithms are described in a
standard programming language, the authors make extensive use of constraints in
formulating the solution to that problem. In this paper I first propose the use
of constraint logic programming as a paradigm for solving the discrete versions
of the cell regulation problem. Basically one considers regions in an
n-dimensional space, and transitions that specify possible paths between the
regions. Both regions and transitions are specified by sets of constraints that
restrict the values of the variables being studied. A path linking an initially
specified region, to subsequent regions via the transitions, expresses the
behavior of the system. (That description is inherent to the approach taken by
de Jong and Page.) The inverse problem amounts to determining regions and
transitions when given the paths that may be derived from the data expressing
the behavior of the variables. From those regions and transitions one should be
able to determine the desired gene interaction graph. In the particular case of
Boolean constraints -- a special case among the discrete approaches -- the
determination of the regions and transitions amounts to the problem of
synthesizing Boolean circuits from input and output data. The regions correspond
to subsets of the input data and the transitions correspond to Boolean formulas
that allow transforming the input data into output. Therefore the problem using
Boolean variables can be expressed in terms of program synthesis or machine
learning. Similarly, it is hoped that the synthesis of constraints from data
should play a role in the discrete solution of the inverse problem. An
interesting payoff of the use of constraints is that they may well pave the way
for probabilistic solutions to both the direct and inverse versions of the cell
regulation problem. The approach reinforces the importance of developing
probabilistic machine learning algorithms expressed in logic programming and in
constraint logic programming.
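To make the direct/inverse distinction concrete, the sketch below (Python; the three-gene repression cycle and the synchronous update rule are illustrative choices, not taken from the paper) simulates the Boolean, discrete version of the direct problem; the inverse problem would be to recover the signed interaction graph from such trajectories.

```python
# A minimal sketch of the discrete (Boolean) direct problem: given a signed
# gene interaction graph, simulate each gene's on/off state over time under
# a synchronous update rule. The three-gene repression cycle is made up.
interactions = {            # target gene -> (regulator, sign)
    "g1": ("g3", "-"),      # g3 represses g1
    "g2": ("g1", "-"),      # g1 represses g2
    "g3": ("g2", "-"),      # g2 represses g3
}

def step(state):
    """One synchronous update: '+' copies the regulator, '-' negates it."""
    return {g: state[r] if s == "+" else not state[r]
            for g, (r, s) in interactions.items()}

state = {"g1": True, "g2": False, "g3": False}
for t in range(6):          # the direct problem: a state trajectory over time
    print(t, state)
    state = step(state)
# The inverse problem asks the opposite: recover `interactions` from such
# trajectories, possibly corrupted by experimental noise.
```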
Speaker: Adrian Cootes
Title: Automatic
determination of protein fold signatures from structural superpositions
Abstract:
It remains unclear what principles underlie a protein
sequence/structure adopting a given fold. Local properties such as the
arrangement of secondary structure elements adjacent in sequence or global
properties such as the total number of secondary structure elements may act as a
constraint on the type of fold that a protein can adopt. Such constraints might
be considered "signatures" of a given fold and their identification would be
useful for the classification of protein structure. Inductive Logic Programming
(ILP) has been applied to the problem of automatic identification of structural
signatures. The signatures generated by ILP can then be both readily interpreted
by a protein structure expert and tested for their accuracy. A previous
application of ILP to this problem indicated that large insertions/deletions in
proteins are an obstacle to learning rules that effectively discriminate between
positive and negative examples of a given fold. Here, we apply an ILP learning
scheme that reduces this problem by employing the structural superposition of
protein domains with similar folds. This was done in three basic steps. Firstly,
a multiple alignment of domains was generated for each type of fold studied.
Secondly, the alignment was used to determine the secondary structure elements
in each of those domains that can be considered equivalent to one another (the
"core" elements of that fold). Thirdly, an ILP learning experiment was conducted
to learn rules defining a fold in terms of those core elements.
Speaker: Luc De Raedt
Title: Molecular Feature Mining in HIV Data
Abstract:
We present the application of Feature Mining techniques to the Developmental
Therapeutics Program's AIDS antiviral screen database. The database consists of
43576 compounds, which were measured for their capability to protect human cells
from HIV-1 infection. According to these measurements, the compounds were
classified as either active, moderately active or inactive. The distribution of
classes is extremely skewed: only 1.3% of the molecules are known to be active, and 2.7% are known to be moderately active. Given this database, we were
interested in molecular substructures (i.e., features) that are frequent in the
active molecules, and infrequent in the inactives. In data mining terms, we
focused on features with a minimum support in active compounds and a maximum
support in inactive compounds. We analyzed the database using the level-wise
version space algorithm that forms the basis of the inductive query and database
system MolFea (Molecular Feature Miner). Within this framework, it is possible
to declaratively specify the features of interest through constraints on the frequency of features in (possibly different) datasets, as well as on their generality and syntax. Assuming that the detected substructures are causally related to
biochemical mechanisms, it should in principle be possible to facilitate the
development of new pharmaceuticals with improved activities.
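The frequency constraint at the heart of the query can be stated very simply; the sketch below (Python; not the MolFea system, and the substructure labels and compound identifiers are made up) keeps exactly those features with sufficient support among the actives and at most a small support among the inactives.

```python
# Sketch of the frequency constraint described above: keep substructure
# features that reach a minimum support among active compounds and stay below
# a maximum support among inactive ones. A feature is represented here simply
# as the set of compound identifiers that contain it.
def interesting_features(feature_occurrences, actives, inactives,
                         min_active_support=0.05, max_inactive_support=0.01):
    selected = []
    for feature, compounds in feature_occurrences.items():
        active_freq = len(compounds & actives) / len(actives)
        inactive_freq = len(compounds & inactives) / len(inactives)
        if active_freq >= min_active_support and inactive_freq <= max_inactive_support:
            selected.append(feature)
    return selected

# Toy usage with made-up substructure labels and compound ids.
occ = {"C-N=O fragment": {"m1", "m2"}, "benzene ring": {"m1", "m5", "m7", "m9"}}
print(interesting_features(occ, actives={"m1", "m2", "m3"},
                           inactives={"m5", "m6", "m7", "m8", "m9"}))
# -> ['C-N=O fragment']
```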
Speaker: Koichi Furukawa
Title:
A Computational Model for Children's Language Acquisition using Inductive Logic
Programming
Abstract:
This paper describes our research
activity on developing a computational model for children's word acquisition
using inductive logic programming. Modeling human language acquisition is a very
hard problem. The reason why we focused on children's word acquisition is that it
is an initial phase of the entire human language acquisition and therefore the
problem is rather easy compared to the entire learning activity. We tried to
incorporate cognitive biases developed recently to explain the efficiency of
children's language acquisition. We also designed a co-evolution mechanism of
acquiring concept definitions for words and developing concept hierarchy.
The concept hierarchy plays an important role in defining contexts for later word learning processes. A context-switching mechanism is used to select a relevant set of attributes for learning a word, depending on the category to which it belongs. Conversely, while definitions for words are being acquired, the concept hierarchy is developed. This mechanism is supposed to solve the selection problem of relevant
sensory inputs for each word learning task. We developed an experimental
language acquisition system called WISDOM (Word Induction System for Deriving
Object Model) and conducted virtual experiments or simulations on acquisition of
words in two different categories. The experiments showed the feasibility of our
approach.
Speaker: Jean
Hayes-Michie
Title: Real Time Skill: the role of semantic
information in learning, performance and report
Abstract:
The
literature on human skills has seen the rise of two successive models described
below in simplified form: (1) BEHAVIOURIST: skill is formed by pure practice.
Trial-and-error experience builds a black box of stimulus-response bonds,
impervious to both instruction and introspection. Subjects' post hoc reports are
confabulations. (2) COGNITIVIST: skill is first acquired in 'declarative' form,
though either or both of (a) construction of a conscious model from the task's
perceived behavioural semantics; (b) tutorial exposition of the task's structure
and of applicable heuristics. Declarative forms are then internally remoulded
into procedural black boxes, but also retained for purposes of post hoc report.
EXPERIMENTAL FINDINGS The learning by 51 male and female subjects of a computer
simulation of a well-known real-time control problem supplied some useful
observations. Under certain conditions of 'cognitive overload' an otherwise
unlearnable task became learnable only when the screen presentation was
descriptive of the task's behavioural semantics. The cognitive overload
condition was the combination of a substantially visual task with female gender.
It is well-attested that females are in general at a disadvantage when dealing
with highly visual information, especially if it involves motion. When a
meaningful animated cartoon was substituted for an abstract representation,
female subjects showed a capacity to learn, a capacity quite absent when they
were asked to use the simple 'instrument panel' variant. This corroborates
elements of the cognitivist model, and is not explicable in terms of pure
behaviourism. But discovery by male subjects of key heuristics by trial and
error, regardless of representation, contradicts certain elements of the
cognitivist position. A need is apparent for a cognitivist-behaviourist
synthesis. A new model is suggested.
Speaker: Ross King
Title: Representing
molecules for induction using logic and quantum mechanics
Abstract:
The key to applying machine inference to scientific
problems is to have an appropriate representation of the problem. The most
successful applications of inductive logic programming (ILP) have been in the
field of learning molecular quantitative structure activity relationships
(QSARs). The basic representation used in these applications has been based on
atoms linked together by bonds. This representation, although successful and
intuitive, ignores the most basic discovery of 20th century chemistry, that
molecules are quantum mechanical objects. Here we present a new representation
using quantum topology (StruQT) for machine inference in chemistry which is
firmly based on quantum mechanics. The new representation is based on Richard
F.W. Bader's quantum topological atoms in molecules (AIM) theory. Central to
this representation is the use of critical points in the electron density
distribution of the molecule. Other gradient fields such as the Laplacian of the
electron density can also be used. We have demonstrated the utility of the use
of AIM theory (in a propositional form) in the regression problem of predicting
the lowest UV transition for a system of 18 anthocyanidins. The critical point
representation of fields can be easily mapped to a logic programming
representation, and is therefore well suited for ILP. We have developed an
ILP-StruQT method and are currently evaluating it for QSAR problems.
Speaker: Jan Komorowski
Title:
On Discovering Gene Function with Approximate Methods
Abstract:
Functional Genomics is the study of gene function on a large scale that is done
by conducting parallel analysis of gene expression for thousands of genes. This
research is still in an early phase but has been recently sped up by the advent
of high-throughput technologies such as, for instance, microarrays. With the
completion of several genomes, the attention is shifting towards discovering
function of new genes. Critical to any scientific discovery is the ability to
re-use the existing knowledge. Somewhat surprisingly, most if not all work
currently reporting on the analysis of gene expression applies unsupervised
approaches that do not really use the huge repositories of biological knowledge.
In our recent work, we have taken a supervised approach to predicting gene
function from gene expression time-series profiles and from background
biological knowledge. To this end we annotated known genes with terms from Gene
Ontology. Using rough sets, as implemented in the publicly available ROSETTA
toolkit, we then generated a predictive model of high quality (AUC value 0.83)
that provided, among other things, function hypotheses for uncharacterized genes. The
method is robust in the following sense. Despite missing annotations of some
known genes, the model appeared to be able to classify and re-classify several
of the known genes in a way that was new to biologists. In other words, the
model discovered biological knowledge that was not present in the learning
examples. These discoveries, represented as false positive classifications, were
initially dismissed. After a literature-based re-examination by biologists, a
significant portion of the false positives appeared to be correct. A
considerable number of the literature confirmations of false positive
classifications could be visualized by using PubGene (http://www.pubgene.org/),
a tool developed by us for effective retrieval of literature-based knowledge
about relations between genes. Our recent publication in Nature Genetics 28,
21-28 (2001) gives details about that work. Our work demonstrates that Gene
Ontology annotations emulate deep biological knowledge that may be functionally
linked to gene expression profiles thereby aiding the discovery of new gene
functions. We conclude that methods of approximate reasoning that are able to
deal with large amounts of conflicting and missing knowledge will be instrumental
to the further development of Functional Genomics.
Speaker: Donald Michie
Title:
Return of the "imitation game"
Abstract:
Very recently there
has been an unexpected rebirth of Turing's imitation game in the context of
commercial demand. To meet the new requirements the following properties of
human conversation constitute a minimal list of what must be simulated. Real
chat utterances are concerned with associative exchange of mental images. They
are constrained by contextual relevance rather than by logical or linguistic
laws. Time-bounds do not allow real-time construction of reasoned arguments, but
only the retrieval of stock lines and rebuttals, assembled Lego-like on the fly.
A human agent has a place of birth, age, sex, nationality, job, family, friends,
partners, hobbies etc. On meeting again with the same conversational partner, a
human agent is expected to recall the gist of previous chats, as well as what
has passed in the present conversation so far. A consistent personality emerges
from a human's expression of likes, dislikes, pet theories, humour, stock
arguments, superstitions, hopes, fears, aspirations, etc. A human agent typically
has a main goal, of fact-provision, fact elicitation, wooing, selling, "conning"
etc. at each stage. A human agent remains ever-ready to maintain or re-establish
rapport by switching from goal mode to chat mode. Some mechanisms and tactics
for simulating these things will be illustrated from work in progress.
Speaker: Satoru
Miyano (represented by Hideo Bannai)
Title: HypothesisCreator: Concepts for Accelerating the Computational
Knowledge Discovery Process
Abstract:
We summarize and
discuss the work accomplished through the HypothesisCreator (HC) project, an
ongoing effort whose aim is to develop systematic methods to accelerate the
computational knowledge discovery process. A novel approach to describe the
knowledge discovery process, focusing on a generalized form of attribute called
view, is presented. It is observed that the process of knowledge discovery can,
in fact, be modeled as the design, generation, use, and evaluation of views,
asserting that views are the fundamental building blocks in the discovery
process. Also, we note that the trial-and-error cycle inherent in any knowledge discovery process can be formulated as the composing and decomposing of views.
To assist this cycle, a programming language called VML (View Modeling Language)
was designed, providing facilities for this purpose. VML is designed as an
extension to the functional language ML with the extensions `view' and `vmatch',
which are used for defining and decomposing views. We describe, as VML programs,
successful knowledge discovery tasks which we have actually experienced, and
show that computational knowledge discovery experiments can be efficiently
developed and conducted using this language. The original idea for VML, however,
was only informally described and therefore contained subtle problems both in
theory and practice. This was settled by Sumii and Bannai (2001) who gave the
formal foundation of VML by defining VMlambda, an extension to the standard
simply typed call-by-value lambda-calculus.
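As an informal illustration of views as composable, generalized attributes (this is a Python sketch, not VML, and the record format and threshold are invented), a discovery experiment can be phrased as composing small view functions into larger ones:

```python
# Informal sketch of the 'view' idea: a view is a generalized attribute,
# i.e. a function from raw objects to feature values, and experiments proceed
# by composing and decomposing such views.
def compose(*views):
    """Compose views left-to-right into a single view."""
    def composed(x):
        for v in views:
            x = v(x)
        return x
    return composed

# Made-up example: from a raw sequence record to a Boolean view.
get_sequence = lambda record: record["seq"]
gc_content   = lambda seq: (seq.count("G") + seq.count("C")) / len(seq)
is_gc_rich   = lambda gc: gc > 0.6

gc_rich_view = compose(get_sequence, gc_content, is_gc_rich)
print(gc_rich_view({"seq": "GGCCGCAT"}))   # -> True
```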
Speaker: Shinichi Morishita
Title: Enumerating Alternative Spliced Transcripts for Understanding
Complexity of the Human Genome
Abstract:
According to several recent reports, the number of human genes is estimated to be fewer than 40,000, which, for example, is only about double the 19,000 of Caenorhabditis elegans. This fact indicates
that the correlation between complexity of a species and its gene number appears
to be loose, motivating studies on other sources of complexity such as
alternative splicing and cis-regulatory machinery. Machine learning would be
useful to uncover relevance between those mechanisms and biological
consequences, but the first step towards this research direction would be to
list candidates of alternatively spliced transcripts and promoters. Because of
the release and updates of the human genome, this enumeration could be achieved
by aligning millions of expressed sequence tags (ESTs, for short) to a draft
human genome and then organizing alignments into related groups, which is
however computationally costly. We therefore have developed an efficient
algorithm for this purpose, taking only one day to align millions of ESTs to a
newly revised draft genome from scratch. Analysis of 2.2 million alignments
identifies about 8,000 groups of alternatively spliced transcripts. A typical
group contains tens of alignments, demanding a method of selecting
representatives. All the results are accessible through the WWW at our Gene
Resource Locator web site "http://grl.gi.k.u-tokyo.ac.jp/."
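The grouping step can be pictured with the toy sketch below (Python; this is not the Gene Resource Locator algorithm, and the coordinates are invented): alignments that overlap on the same chromosome are clustered into one group, and groups whose members imply different exon boundaries become candidates for alternative splicing.

```python
# Sketch of grouping EST-to-genome alignments: alignments that overlap on the
# same chromosome are merged into the same group.
def group_alignments(alignments):
    """alignments: list of (chromosome, start, end); returns merged groups."""
    groups = []
    for chrom, start, end in sorted(alignments):
        if groups and groups[-1][0] == chrom and start <= groups[-1][2]:
            members = groups[-1][3] + [(start, end)]
            groups[-1] = (chrom, groups[-1][1], max(groups[-1][2], end), members)
        else:
            groups.append((chrom, start, end, [(start, end)]))
    return groups

ests = [("chr1", 100, 500), ("chr1", 450, 900), ("chr1", 5000, 5600)]
print(len(group_alignments(ests)))   # -> 2 groups on this toy input
```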
Speaker: Stephen Muggleton
Title:
Are grammatical representations useful for learning from biological sequence
data?- a case study
Abstract:
This paper investigates whether
Chomsky-like grammar representations are useful for learning cost-effective,
comprehensible predictors of members of biological sequence families. The
Inductive Logic Programming (ILP) Bayesian approach to learning from positive
examples is used to generate a grammar for recognising a class of proteins known
as human neuropeptide precursors (NPPs). Collectively, five of the co-authors of this paper have extensive expertise on NPPs and general bioinformatics methods. Their motivation for generating an NPP grammar was that none of the existing
bioinformatics methods could provide sufficient cost-savings during the search
for new NPPs. Prior to this project experienced specialists at SmithKline
Beecham had tried for many months to hand-code such a grammar but without
success. Our best predictor makes the search for novel NPPs more than 100 times
more efficient than randomly selecting proteins for synthesis and testing them
for biological activity. As far as these authors are aware, this is both the
first biological grammar learnt using ILP and the first real-world scientific
application of the ILP Bayesian approach to learning from positive examples. A
group of features is derived from this grammar. Other groups of features of NPPs
are derived using other learning strategies. Amalgams of these groups are
formed. A recognition model is generated for each amalgam using C4.5 and
C4.5rules and its performance is measured using both predictive accuracy and a
new cost function, Relative Advantage (RA). The highest RA was achieved by a
model which includes grammar-derived features. This RA is significantly higher
than the best RA achieved without the use of the grammar-derived features.
Predictive accuracy is not a good measure of performance for this domain because
it does not discriminate well between NPP recognition models: despite covering
varying numbers of (the rare) positives, all the models are awarded a similar
(high) score by predictive accuracy because they all exclude most of the
abundant negatives.
Speaker: Nils Nilsson
Title: Teleo-Reactive Programs and the Triple-Tower Architecture
Abstract:
I describe how a new programming format is used in a
proposed architecture for linking perception and action in a robot. Along the
way, I mention some analogies with control mechanisms in animals. The
programming format is called 'teleo-reactive' because it allows programs to
react dynamically to changing situations in ways that lead inexorably toward
their goals. The proposed architecture consists of three "towers" of layered
components. The "perception tower" consists of rules that create ever-higher
descriptions of the environmental situation starting with the primitive
predicates produced by the robot's sensory apparatus. These descriptions,
themselves high-level predicates, are deposited in a "model tower" which is
continuously kept faithful to the current environmental situation by a
"truth-maintenance" system. The predicates in the model tower, in turn, evoke
appropriate action-producing programs in the "action tower." The action tower
consists of teleo-reactive programs that are organized more-or-less
hierarchically. Execution of the lowest-level action programs causes the robot
to take primitive actions in its environment. The effects of the actions are
sensed by the robot's sensory mechanism, completing a sense-model-act cycle that
is quiet only at those times when the robot's goal is satisfied. I demonstrate
the operation of the architecture using a simple block-stacking task.
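The teleo-reactive format itself is simple to convey; the sketch below (Python; an illustration in the spirit of the abstract rather than Nilsson's own formulation, with an invented model and rule set) scans an ordered list of condition-action rules on every sense-model-act cycle and fires the first rule whose condition holds.

```python
# A small teleo-reactive program sketch: an ordered list of condition -> action
# rules, scanned on every sense-model-act cycle; the first rule whose condition
# is satisfied by the current model fires.
def tr_step(rules, model):
    """Return the action of the first rule whose condition is true."""
    for condition, action in rules:
        if condition(model):
            return action
    return "do_nothing"

# Toy block-stacking goal: get block A onto block B.
rules = [
    (lambda m: m["A_on_B"],                      "goal_achieved"),
    (lambda m: m["holding_A"],                   "put_A_on_B"),
    (lambda m: m["A_clear"] and m["hand_empty"], "pick_up_A"),
    (lambda m: True,                             "clear_A"),
]
model = {"A_on_B": False, "holding_A": False, "A_clear": True, "hand_empty": True}
print(tr_step(rules, model))   # -> 'pick_up_A'
```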
Speaker: Philip
Reiser
Title: Developing a Logical Model of Yeast Metabolism
Abstract: With the completion of the sequencing of genomes of an
increasing number of organisms, the focus of biology is moving to determining
the role of these genes (functional genomics). To this end it is useful to view
the cell as a biochemical machine: it consumes simple molecules to manufacture
more complex ones by chaining together biochemical reactions into long sequences
referred to as metabolic pathways. Such metabolic pathways are not linear but
often intersect to form a complex network. Genes play a fundamental role in this
network by synthesising the enzymes that catalyse biochemical reactions.
Although developing a complete model of metabolism is of fundamental importance
to biology and medicine, the size and complexity of the network has proven
beyond the capacity of human reasoning. This paper presents intermediate results
in the Robot Scientist research programme that aims to discover the function of
genes in the metabolism of the yeast Saccharomyces cerevisiae. Results include:
(1) the first logical model of metabolism; (2) a method to predict phenotype by
deductive inference; and (3) a method to infer reactions and gene function by
abductive inference. We describe the in vivo experimental set-up which will
allow these in silico inferences to be automatically tested by a laboratory
robot.
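A toy version of phenotype prediction by deduction is sketched below (Python; this is not the logical model of the paper, and the gene and metabolite names are placeholders): a strain is predicted to grow if the required metabolite is reachable from the growth medium through reactions whose catalysing genes have not been knocked out.

```python
# Toy phenotype prediction by deduction over a metabolic network: a metabolite
# is producible if all substrates of some reaction catalysed by an intact gene
# are already available. All names below are made up for illustration.
reactions = [           # (gene, substrates, product)
    ("ARO4", {"E4P", "PEP"}, "DAHP"),
    ("ARO1", {"DAHP"},       "chorismate"),
    ("TRP2", {"chorismate"}, "tryptophan"),
]

def reachable(medium, knocked_out=frozenset()):
    """Deduce all metabolites producible from the medium by active genes."""
    have = set(medium)
    changed = True
    while changed:
        changed = False
        for gene, subs, prod in reactions:
            if gene not in knocked_out and subs <= have and prod not in have:
                have.add(prod)
                changed = True
    return have

medium = {"E4P", "PEP"}
print("tryptophan" in reachable(medium))                        # True: growth
print("tryptophan" in reachable(medium, knocked_out={"ARO1"}))  # False: auxotroph
```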
Speaker: Zhanna Reznikova
Title: How to study ants' numerical competence using their own
communicative means and ideas of information theory
Abstract:
The main point of the proposed approach to studying ants' cognitive abilities is
that our experiments provide a situation in which insects have to transmit
information quantitatively known to the experimentalist in order to obtain food.
One may estimate some properties of ant intelligence by measuring the complexity of the tasks they solve in order to pass definite pieces of information from scouts to foragers. Our previous experiments, based on ideas of Information Theory, have shown that ants are able to memorize and transmit messages concerning a sequence of turns toward a trough of syrup and to use the simplest regularities to compress the information. To reveal counting and number-related skills, we had red wood ants Formica polyctena transmit information on the number and
coordinates of objects. One of the experimental set-ups consisted of a «tree
trunk» with branches that ended in empty troughs, except for one which was
filled with syrup. Another set-up consisted of a lattice which simulated
Cartesian coordinates. The foragers of F. polyctena were separated into teams of
5-8 individuals, each with one scout. All laboratory ants were marked with
coloured labels. To start the experiment, an ant scout was placed at the
randomly numbered trough containing food and then returned to the nest on its
own. The duration of the contact between foragers and the scout was measured.
Then we removed the scout and the foragers had to search for the food by
themselves. The experiments were so devised as to eliminate all possible ways
that may help to find food, except for distant homing. It turns out that the
ants are able to count within several tens, and transmit this information to
their nestmates. The analysis of the duration of the ants' contacts enables us to
create a hypothesis of how they use numbers and coordinates in their
communication. We suppose that only a few highly social ant species use such a
complex communication system based on cognitive processes. At the same time, we
believe that the experimental schemes described can be used to study the
communication systems and numerical competence of other animals.
Speaker: Boris Ryabko
Title: The
nonprobabilistic approach to learning the best prediction
Abstract:
The problem of predicting a sequence x_1, x_2,....
where each x_i belongs to a finite alphabet A is considered. Each letter x_{t+1}
is predicted using information on the word x_1, x_2, ...., x_t only. We use the
game-theoretical interpretation, which can be traced back to Laplace, in which a gambler tries to estimate probabilities for the letter x_{t+1} in order to maximize his capital. The optimal method of prediction is described
for the case when the sequence x_1, x_2,.... is generated by a stationary and
ergodic source. It turns out that the optimal method is based only on
estimations of conditional probabilities. In particular, it means that if we
work in the framework of the ergodic and stationary source model, we cannot
consider pattern recognition and other complex and interesting tools, because
even the optimal method does not need them. That is why we suggest a so-called
nonprobabilistic approach which is not based on the stationary and ergodic
source model and show that complex algorithms of prediction can be considered in
the framework of this approach. The new approach is to consider a set of all
infinite sequences (over a given finite alphabet) and estimate the size of sets
of predictable sequences with the help of the Hausdorff dimension. This approach
enables us, first, to show that there exist large sets of well-predictable
sequences which have zero measure for each stationary and ergodic measure. (In
fact, it means that such sets are invisible in the framework of the ergodic and
stationary source model and shows the necessity of the new approach.) Second, it
is shown that there exist quite large sets of such sequences that can be
predicted well by complex algorithms which are not based only on estimations of conditional probabilities.
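For concreteness, the sketch below (Python; an illustration of prediction from estimated conditional probabilities with add-one smoothing, not the optimal method analysed in the paper) plays the Laplace-style gambler over a fixed-length context.

```python
# Sketch of prediction from estimated conditional probabilities: estimate
# P(next letter | last k letters) from the counts observed so far, with
# add-one (Laplace) smoothing.
from collections import defaultdict

def predict(history, alphabet, k=2):
    """Return estimated probabilities for the next letter given `history`."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(k, len(history)):
        counts[history[i - k:i]][history[i]] += 1
    context = history[-k:]
    total = sum(counts[context].values())
    return {a: (counts[context][a] + 1) / (total + len(alphabet))
            for a in alphabet}

print(predict("abababababab", alphabet="ab"))   # 'a' gets high probability
```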
Speaker: Claude Sammut
Title:
Managing Context in a Conversational Agent
Abstract:
Traditional methods for Natural Language Processing have failed to deliver
the expected performance required in a Conversational Agent. This is mainly due
to the fact that speakers rarely use grammatically complete and correct sentences
in conversation. Furthermore, speakers make significant assumptions about the
background knowledge of the listener. Thus, pragmatic knowledge about the
context of the conversation turns out to be a much more important factor in
understanding an utterance than traditional linguistic analysis. For this
reason, the majority of conversational systems being implemented favour
shallow parsing techniques. The challenge for such systems is in managing
information about the context of the conversation. This paper describes a
conversational agent, called "ProBot", that uses a novel structure for handling
context. The ProBot is implemented as a rule-based system embedded in a Prolog
interpreter. The rules consist of patterns and responses, where each pattern
matches a user's input sentence and the response is an output sentence. Both
patterns and responses may have attached Prolog expressions that act as
constraints in the patterns and can invoke some action when used in the
response. Rules are grouped into sets, called contexts. There is always a
"current" context. This is the set of rules that is considered most applicable
to the current state of the conversation. A user's input is processed by
attempting to match the sentence with the pattern in each rule. The first match
causes that corresponding rule's response to be invoked. In addition to
outputting an appropriate response, the rule can also invoke a Prolog program.
One such program switches the current context. We can represent the course of a
conversation as the traversal of a graph in which each node is a context. The
rules of the current context are used to carry on the conversation for some
period; a transition to a new context corresponds to the traversal of an arc
from one node in the graph to another. While this scheme works well for a highly
structured conversation, it fails to take into account sentences that either
change the topic of the conversation or those that contain expressions that the
rules of the current context are not able to handle. To avoid these problems, we
introduce the "filters" and "backups". A filter is a set of rules that is
prepended to the current context. That is, the filter rules are examined for a
match before any of the rules in the current context. Typically, filter rules
contain keywords that assist the system in guessing that a user is changing the
topic of the conversation. When a filter rule matches, the usual response is to
switch the current context. Backup rules are appended to the current context.
Their function is to catch user inputs that do not match any rules in the
current context. Usually backup rules produce responses that temporise,
provoking the user into providing further input that may be recognised by the
system. In this paper, we give a detailed discussion of the architecture of the
ProBot and explain the origins of this design. We also give some indication of
future work. In particular, writing scripts for conversational agents is
extremely laborious, but automating the construction of scripts through Machine
Learning is very challenging. We discuss some of these challenges.
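The context mechanism can be illustrated with the stripped-down sketch below (Python; this is not the ProBot implementation, and the contexts, patterns and responses are invented): filter rules are tried before the current context's rules, backup rules after them, and a matching filter can switch the current context.

```python
# Stripped-down illustration of rules grouped into contexts, with filter rules
# checked first and backup rules last; a rule is (pattern, (response, context)).
import re

contexts = {
    "weather": {
        "filters": [(r"\b(football|soccer)\b", ("Shall we talk sport instead?", "sport"))],
        "rules":   [(r"\brain\b", ("Take an umbrella.", None))],
        "backups": [(r".*",       ("Tell me more about the weather.", None))],
    },
    "sport": {
        "filters": [],
        "rules":   [(r"\bscore\b", ("It ended two-nil.", None))],
        "backups": [(r".*",        ("Which team do you follow?", None))],
    },
}

def respond(sentence, current):
    """Match the sentence against filters, then rules, then backups."""
    c = contexts[current]
    for pattern, (response, new_context) in c["filters"] + c["rules"] + c["backups"]:
        if re.search(pattern, sentence, re.IGNORECASE):
            return response, new_context or current
    return "Pardon?", current

print(respond("Will it rain today?", "weather"))        # rule in current context
print(respond("Did you see the football?", "weather"))  # filter switches context
```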
Speaker: Erik Sandewall
Title: On the Design of Software Individuals
Abstract:
We use the term 'software individuals' to designate aggregates of programs
and data that have the following properties: - They exist in a population of
similar, but not identical individuals. - Individuals are able to interact with
their surrounding environment, with each other, and/or with people. While doing
so they may modify their internal state. - Each individual contains the
safeguards that may be required in order to select which influences to
accommodate and which ones to ignore. - The aggregate of programs and data that
define an individual, and that in particular define its behavior, is a part of
its internal state and can therefore be modified as the result of the
interactions where the individual is involved. - Individuals or small groups of
individuals are able to create new individuals that inherit the features of
their parent(s). - The program/data aggregate that defines an individual is
symbolic in character. The ability for knowledge representation is designed into
individuals from the start. The last item distinguishes software individuals
from the software that is evolved by genetic programming. In this article we
address the question of design principles for software individuals, and approach
it as a software design issue. This problem is of course an unconventional one
from the point of view of ordinary software engineering. The emphasis on
software self-modification runs counter to conventional software design
principles, where the inherent ability of the von Neumann machine to modify its
own programs is strongly restricted in particular through the design of
conventional programming languages. In our approach, program self-modification
is viewed as an important possibility, and not as an aberration. Our proposed
Software Individual Architecture is characterized by the following principles: -
The use of a Software Individual Kernel (SIK) which is a minimal individual
having the basic properties of self-replication, self-modification, interaction,
and ability to administrate its own state. - The use of a Knowledge Object
Repository that is accessible by, but external to software individuals, and that
can be shared by several individuals in a population. This repository is also
the basis for several of the user services that are built using the
architecture. - Each Software Individual Kernel is set up as a directory
structure (directory and all its sub-directories) in the file system of a
particular computer host. A population may 'live' within one host, or in a
network of several hosts. Each individual is self-contained in this structure
and each carries with itself (i.e., contains) a full set of the required
programs and data defining it. - An individual can acquire additional features,
besides those present in its Kernel, by adopting additional program and data
modules, which either are sent to it by other individuals, or by importing them
from the Repository. A number of previously developed knowledge management
services have been ported so that they can be acquired by the new Software
Individual Kernel. - We have reworked the Software Individual Kernel so as to
achieve maximal simplicity and generality. The full paper will describe the
design of the Software Individual Kernel. It will also discuss how the design
and use of this kind of software differs from the design of conventional
software. Finally it will discuss the relevance of Software Individuals as the
basis for implementing A.I. systems.
Speaker: Steffen Schulze-Kremer
Title: Ontologies for Molecular Biology
Abstract:
Molecular biology has a communication problem. There are many databases
using their own labels and categories for storing data objects and others using
identical labels and categories but with a different meaning. One prominent
example is the concept "gene" which is used with different semantics by major
international genomic databases. Ontologies are one means to provide a semantic
repository to systematically order relevant concepts in molecular biology and to
bridge the different notions in various databases by explicitly specifying the
meaning of and relation between the fundamental concepts in an application
domain. Here, design principles for ontology building are presented, together with the upper level and a database branch of a prospective ontology for molecular biology, which is compared to other ontologies with respect to their suitability for molecular biology
(http://igd.rz-berlin.mpg.de/~www/oe/mbo.html).