ETAI, Reasoning about Actions and Change

NRAC Panel Discussion on Theory Evaluation

This on-line session is a continuation of a panel discussion at the NRAC, chaired by Leora Morgenstern. We begin with the written notes of the panelists, and continue with comments from the participants, which may take the same form as in a conference panel discussion: comments between the panelists, questions to the panelists from the "floor" (that is, all of you who see this page). Then we'll see where the discussion takes us.

The present page will contain the full text of the contributions whenever possible, and will be updated as the discussion proceeds. Exception: if a discussion contribution is heavy with formulas, it may be posted only in postscript format, and a link to it will be given here.

The headline fields contain the date of the Newsletter issue in which the contribution was sent out, the name of the author, and the title (if any) that was used for the contribution. Headlines with a reddish background color are used for position statements.

The postscript version of the monthly News Journal contains the debate contributions that were received during the month in question.

  Leora Morgenstern Panel on Theory Evaluation: Issues and Questions.

Why have a panel on theory evaluation? Nonmonotonic reasoning, action, and change have been studied by the AI community for the past two to three decades. There has been much churning out of new theories, but little attempt at analysing these theories or at introspection. We tend to have little perspective on our work. There has been very little discussion of what makes a theory good, what makes a theory last, how much progress we've really made, and what good ways there are to encourage progress in the future. This panel is intended to jump-start a discussion of these issues.

Questions and issues to be discussed are divided into 2 broad categories:

  1. By which criteria do we evaluate theories?
  2. Can we understand the history of research on nonmon, action, and change in broader historical terms, as suggested by Kuhn, Lakatos, and Laudan?

Criteria for evaluation of theories

What makes a theory of nonmonotonic reasoning, action, and/or change a good theory? (These may be the same things that make any AI theory good.) Do we judge a theory by

What gives a theory staying power? What are some examples of theories with staying power? Are these always the good ones? Specifically, are there examples of good theories which didn't last very long in the AI community? Examples of bad theories which did last long? (And who will be brave enough to identify these ;-))

Understanding research in a broader, historical perspective

Thirty-five years ago, Thomas Kuhn suggested that the history of science is best understood as a cycle of periods of "normal science" followed by "revolutionary science." It works as follows: A theory is developed which solves some problems. The theory is associated with a "paradigm," which is, to quote Kuhn, "the entire constellation of beliefs, values, techniques, and so on shared by the members of a given community." As time goes on, new problems are discovered which the theory doesn't solve; the theory is modified slightly, and the process continues, until all of a sudden, it becomes apparent to some that the old paradigm just doesn't work. Then comes the "revolutionary" phase, in which a new paradigm is suggested and refined, and the "normal" phase starts again. (The classic example of this is the geocentric theory of the universe, which explained certain phenomena; as new phenomena were discovered, this theory had to be modified (epicycles and deferents), until it became clear (to Copernicus, Galileo, Kepler, etc.) that the geocentric theory just wouldn't work. The revolutionary phase supplanted the geocentric paradigm with the heliocentric paradigm, which then became normal science.)

Questions: can we understand the history of our field in this way? If so, are we in a "normal" phase or a "revolutionary" phase? Can we identify any such phases? Or are we still in one of the prehistoric phases?

Or -- perhaps we are better off viewing our history from another perspective. Lakatos suggests that there's no one "normal" paradigm at any one time, but a number of competing research programmes. What unites these programmes is a core set of assumptions; however, there are different auxiliary assumptions. What research programmes can we identify? Do we subscribe to a core set of beliefs? Which programmes, to use Lakatos's terms, are progressive? Which are degenerative? Have any become degenerative and then popped back to being progressive?

Or should we subscribe to Laudan's description of "research traditions" which deny a core set of beliefs, but assert a common set of ontological assumptions and a common methodology for revising old theories and developing new ones?

Any other suggestions?

Is it worthwhile going through this exercise at all? It could be argued that the major developments of physics, astronomy, and biology occurred without much introspection at all, and that this exercise is therefore perhaps valueless. On the other hand, we could argue that given the miserable state of research in nonmonotonic reasoning and action today, we need all the analysis and introspection we can get.

Any more ideas?

Finally, if you want to get into the swing of theory evaluation, you may want to look at: Erik's book (Features and Fluents), and my article "The Problems with Solutions to the Frame Problem", available at http://www-formal.stanford.edu/leora (also available in the collection of papers "The Robot's Dilemma Revisited", Ablex, 1996, but the web is more accessible).
  David Poole Modelling Language vs. Repository of Common-Sense Facts.

My guess is that we are in a phase of normal science. The revolution is coming. When we have to explicitly consider uncertainty much of what we think we understand now will have to be thrown out.

In order to go about evaluation, we have to make our goals clear. (If it doesn't matter where you want to get to, it doesn't matter much which way you go, to paraphrase Lewis Carroll.) There are two quite different goals people have in building KR systems; there is much confusion generated by not making it clear what you are doing (so much so that the researchers who take one view often don't understand what the others are doing and why). These are:

1. A knowledge representation as a modelling language. If you have a domain in your head you can use the KR to represent that domain. The builder of a KR is expected to give a user manual on how to axiomatize the domain. There are right ways of saying something and there may be wrong ways of saying it. Missing knowledge may mean something. Prolog and Bayesian networks are examples of such knowledge representations.

2. A knowledge representation as a repository of facts for commonsense reasoning. Under this scenario, you assume you are given a knowledge base and you are to make as much sense out of it as possible. It isn't OK for the designer of the KR to prescribe how a domain should be axiomatized. The KR should be able to get by with whatever knowledge it has. Much of the nonmon work assumes this (as far as I can see).

If your goal is the first, you probably want a very lean language which doesn't provide multiple ways of doing the same thing. You want to provide a recipe book about how to go about modelling a domain. It should be judged by whether someone can go from an informal problem (not a representation of a problem) to a solution efficiently. Does it provide a good way to think about the world? Can it exploit any structure of the domain for efficiency?
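
An aside on the "missing knowledge may mean something" remark above: the following toy sketch (in Python rather than Prolog, with an invented parent fact, so purely illustrative) contrasts a closed-world reading, where a fact absent from the knowledge base counts as false, with an open-world reading, where it merely counts as unknown.

    # Closed world vs. open world, illustrated with a single invented fact.
    facts = {("parent", "ann", "bob")}            # the only fact we were told

    def holds_closed_world(fact):
        # Closed world: absence of a fact licenses concluding that it is false.
        return fact in facts

    def holds_open_world(fact):
        # Open world: absence of a fact tells us nothing either way.
        return True if fact in facts else None    # None = unknown

    query = ("parent", "ann", "carol")
    print(holds_closed_world(query))   # False - missing knowledge means something
    print(holds_open_world(query))     # None  - missing knowledge means nothing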

If your goal is the second, you probably want a rich language that lets you state as much as possible. It should be some free-form language that doesn't constrain you very much. Here we need to go from a representation of a problem to a solution. Does it provide reasonable answers? Can the user debug the knowledge base if an answer is wrong?

I have two ways of judging a representation:

  1. Can I teach it to cynical undergraduates without squirming? Can I make a case that this is the obvious answer?
  2. How well does it work in practice? What is the range of practical problems for which it provides a solution?
  Erik Sandewall What Should Count as a Research Result in our Area?.

I want to focus on Leora's first issue - criteria for the evaluation of theories, and I think the first thing to discuss is what could or should reasonably count as a research result in our area, that is, what things ought to be citable in the literature. "Citable" means that they are crisp, have a lasting value, that later researchers can build on earlier results, etc. Then, presumably, some of the respectable research results are those which tell us something about the qualities of a particular theory/approach/formalization/...

Research results presumably come in several colors and shapes; I am thinking of categories such as the following:

To the extent that we have solid results of this kind, we can evaluate proposed theories (= formalism + semantics + entailment method ??) with respect to their range of applicability and their computational properties.

With respect to David's distinction between knowledge representations that are modelling languages and those that are intended for repositories of common-sense facts, my heart is with the former kind. Among the above categories of results, those that concern or make use of a formal semantics probably only make sense in the context of a modelling language, since the notion of common sense is so vague and inherently difficult to capture.

  Pat Hayes Contribution to the Panel Debate on Theory Evaluation.

First, let me urge caution about getting too tied up in the Kuhnian vocabulary which Leora has introduced us to, for several reasons. First, Kuhn was talking about rather larger changes in scientific view than our entire field can realistically aspire to: things like the Newtonian revolution in physics. Second, Kuhn's story is easily distorted by being too quickly summarised, as Kuhn himself complained on several occasions. And third, Kuhn himself later rejected it as overly simple and potentially misleading. The last thing we need is broad historical discussion by amateur historians using an over-simplified theoretical vocabulary which is already out of date.

So, to turn to more practical matters:

Leora writes:

What makes a theory of nonmonotonic reasoning, action, and/or change a good theory? (These may be the same things that make any AI theory good.) Do we judge a theory by

Well surely we need to first focus on what problems we are expecting it to solve. Suppose someone in this field were to announce that they had the whole thing finished, all the problems solved, etc. What tests would we ask them to pass before we believed them? What do we expect these nonmonotonic logics to be able to DO, exactly?

It's not enough to just say, 'to reason properly'. We need some characterisation of what that proper reasoning is, or at least some examples of where it can be found. For 25 years we have been appealing to a vague sense of intuitive reasonableness, but this is a very weak basis to test theories on. Even linguists, whose empirical methods are treated with ridicule by experimental psychologists, have some statistical data to back up their 'seems-grammatical-to-native-speaker' criteria, but we don't have any hard data about 'common sense' at all, and the intuitions we appeal to often confuse linguistic, psychological and pragmatic issues.

One worry I have is that it seems to be impossible to test a logic or formalism as such, since the intuitiveness or otherwise of any example depends as much on the way the intuition is encoded in that logic as on the logic itself. Logics seem to require a kind of two-stage evaluation. Knowledge-hackers try to formalise an intuition using logic A and find it hard to match formal inference against intuition no matter how ingenious they are with their ontologies and axioms; so they turn to logic B, which enables them to hack the examples to fit intuition rather better. But the intuitive test is always of the axioms/ontologies, not of the logics themselves: there is always the possibility that a more ingenious hacker could have gotten things right with logic A, if she had only thought of the right ontological framework. For example, it has become almost accepted as revealed truth in this field that common sense reasoning isn't compatible with monotonic logic, because of examples such as: if you are told that an automobile exists then you infer that it's in working order, but if you later hear it's out of gas you change your mind. (Or if you hear it's only a toy bear, or a penguin, etc.) All of these examples assume that the new knowledge is simply conjoined onto the previous knowledge: you know some stuff, new stuff arrives, and you just chuck it into the set of mental clauses and go on running the mental inference engine. But maybe the updating process is more complicated than that. Maybe when you hear that the car tank is empty, you don't just add some new information, but also remove some previous assumptions; and maybe this is not part of the reasoning process but of the linguistic comprehension process. If so, then the representation may, perhaps, be able to use monotonic logic perfectly happily. Maybe not; but my point is only that the argument that it must be nonmonotonic makes assumptions about other mental processes - specifically, those involving the integration of new information - which have not been examined critically.
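
To make the two readings concrete, here is a minimal schematic contrast (the predicate names are invented for illustration; this is a sketch of the argument, not anyone's proposed formalisation):

    Nonmonotonic reading - the knowledge base only grows, the logic retracts:
       KB1 = { Car(c),  Car(x) & ~Ab(x) => Working(x) }
             KB1 nonmonotonically yields Working(c)
       KB2 = KB1 + { OutOfGas(c),  OutOfGas(x) => Ab(x) }
             KB2 no longer yields Working(c)

    Revision reading - the update retracts, the logic stays monotonic:
       KB1 = { Car(c), Working(c) }      (Working(c) added during comprehension)
       on hearing "out of gas":
       KB2 = ( KB1 - { Working(c) } ) + { OutOfGas(c),  OutOfGas(x) => ~Working(x) }
             each KB is used with ordinary monotonic deduction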

What gives a theory staying power? What are some examples of theories with staying power? Are these always the good ones? Specifically, are there examples of good theories which didn't last very long in the AI community? Examples of bad theories which did last long? (And who will be brave enough to identify these ;-))

I'll take on that onerous task. At the risk of treading on almost everyone's toes, let me propose the situation calculus; or more properly the idea behind it, of describing change in terms of functions on world-states. "Bad theory" isn't really right: it was a really neat theory for a while, and better than anything going, and it's still useful. But it has some pretty dreadful properties; and yet not only has it lasted a long time, but it's almost considered to be inviolable by many people in the field. And even its critics - for example, Wolfgang Bibel's recent IJCAI survey gives an alternative approach based on limited-resource logics - seem to me to miss the essential things that are wrong with it.

This deserves a much longer treatment, but here are a few of the things that are wrong with sitcalc. First, it's based on an overly simplistic view of the way things happen in the everyday world, one obviously inspired by reasoning about what happens inside computers. The everyday world just doesn't consist of static states and functions between them: it's not organised like a series of snapshots. Sitcalc belongs with SHAKEY, in a world where only the robot can move and nothing else is happening.

Second, sitcalc only works properly if we are careful only to mention processes which can be acted upon; that is, it confuses change with action. Consider how to describe the growth of a plant in sitcalc. It seems easy enough: something like this might be a beginning:


    (Alive(p, s) & Height(p,s) = h & Watered(p,s)) =>
       (Alive(p, grow(s)) & Height(p, grow(s)) > h)
But in the sitcalc this would mean that there was an action called 'grow'. (All gardeners would find this action very useful, no doubt.)

Third, it confuses action with inference. The way that actions are described in the sitcalc involves asserting conditions on the past and inferring conclusions about the future: axioms have the general form ...(s) => ...(action(s)). But common-sense reasoning often involves reasoning from the present to the past (as when we infer an explanation of something we see) or, more generally, can move around in time quite freely, or may have nothing particularly to do with time or action. We are able not just to say that if the trigger is pulled then the target will be dead, but also, given the corpse, that someone must have pulled the trigger. In the sitcalc this would require giving necessary and sufficient conditions for every action description, and Reiter's recent attempt to rejuvenate it does exactly that. (This conception of intuitive thought as being a progressive inferential process in a past-to-future direction has been responsible for many other blind alleys, such as much of the work on principles for 'maintaining' truth as long as possible.)
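
For concreteness, a successor-state-style biconditional of the kind Reiter uses might look roughly like the following (the fluent and action names are invented for illustration). Because it is a biconditional, it supports postdiction as well as prediction: from Dead(x, do(a,s)) and ~Dead(x,s) one can infer a = shoot(x) and Loaded(s).

    Dead(x, do(a, s))  <=>  ( a = shoot(x) & Loaded(s) )  v  Dead(x, s)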

Most intuitive reasoning done by humans lies entirely outside the purview of the situation calculus. Yet so firm has been the grip of the sitcalc ontology on people's thinking that examples which do not immediately fit into it are routinely ignored, while entire libraries are devoted to overcoming artificial problems, such as the frame problem and the YSP, which only arise in the sitcalc framework. Which brings us to the fourth thing wrong with sitcalc: it has many fatal, or at any rate very intractable, technical problems. Why is it that the only people who feel at all bothered by the frame/ramification/qualification problems are philosophers (who mostly don't even understand what they are) and people working in this rather isolated part of KR? Why hasn't the FP become a central difficulty in, say, natural language work, or qualitative physics, or planning (as used in industrial applications)? Because those fields typically don't use this clumsy ontology, that's why. These problems are all artifacts of the sitcalc; they are all concerned with how to keep track of what is true in what 'state'.

Then, Erik writes:

What Should Count as a Research Result in our Area?.

I want to focus on Leora's first issue - criteria for the evaluation of theories, and I think the first thing to discuss is what could or should reasonably count as a research result in our area, that is, what things ought to be citable in the literature. "Citable" means that they are crisp, have a lasting value, that later researchers can build on earlier results, etc. Then, presumably, some of the respectable research results are those which tell us something about the qualities of a particular theory/approach/formalization/...

Yes, but 'theory' is crucially ambiguous here. One of the biggest failures of the KR community generally is that it is virtually impossible to actually publish a knowledge representation itself! One can talk about formalisms and semantics and equivalences etc. etc. (the stuff in Erik's list), but this is all part of the metatheory of knowledge representation. But when it comes to actually getting any representing done, we hardly hear about that at all. Examples of actual formalizing are usually given as counterexamples to some conjectured technique rather than as things to be studied and compared in their own right.

There's nothing wrong with metatheory, provided there is something there for it to be the metatheory of. Right now, the chief problem with this field is that we've run out of subject matter. McCarthy set out in a pioneering direction, but instead of continuing his movement, we've set up camp and are arguing interminably about what kind of compass to use. Let's get some actual knowledge represented, and only then study how it works and fit our theories to the things we find.

For example, here's an issue which might have some meat on it. Erik mentions Allen's time-interval algebra. Now, timepoints and intervals are a pretty simple structure, mathematically speaking, but nevertheless Allen's algebra has its problems. In particular, it's not really compatible with the usual view of intervals as sets of points on a line -- for details, see


    http://www.coginst.uwf.edu/~phayes/TimeCatalog1.ps
    http://www.coginst.uwf.edu/~phayes/TimeCatalog2.ps
I used to be convinced, all the same, that having intervals as a basic ontological category was fundamentally important, and spent a lot of time finding ways to show that certain interval theories were reducible to others and that points were definable in terms of intervals, etc. But when I try to actually use these concepts to write axioms about clocks, time durations, calendars and dates, I find that in fact the concept of interval is almost useless. One can think of an interval as just a fancy way to talk about a pair of points; and when one does so, the entire Allen apparatus just dissolves away into a simple theory of total linear order, all the axioms become simpler (for example, instead of writing 'duration(between(p,q))' one simply writes 'duration(p,q)'; there is no need to refer to the interval defined by the endpoints) and everything becomes clearer and more intuitive (for example, many quite natural relations on intervals become awkward disjunctions in the Allen framework, such as {before, meet, overlap, start, equal}, which is just p1 =< p2). So maybe there isn't much use to the concept of 'interval' at all: or, more exactly, since Allen intervals can't be thought of as sets of points but are uniquely specified by their endpoints, maybe that's really all they are, and the elaborate Allen theory is like the Wizard of Oz.
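
To illustrate the 'pair of points' reading: if an interval i is identified with its endpoints start(i) and end(i), with start(i) < end(i), then the Allen relations reduce to ordinary order statements over a linear order, for example (a sketch of the reduction, not Allen's own axiomatisation):

    Before(i, j)    <=>  end(i) < start(j)
    Meets(i, j)     <=>  end(i) = start(j)
    Overlaps(i, j)  <=>  start(i) < start(j) & start(j) < end(i) & end(i) < end(j)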

So, two points. First, in response again to Erik, when do we decide that something warrants the title of "theory/ approach/ formalisation.."? The sit. calc. is just a style of writing axioms, and the Allen algebra is just a complicated way to arrange order relationships. These seem to be little more than what Apple tried to sue IBM for, ie something like a 'look-and-feel'.

Second, more substantially: this is all because time is one-dimensional. I bet the story for spatial reasoning will be quite different, as there is no way there to encode the topology into an ordering. Now, what kinds of action and change make essential reference to two- or three-dimensional things, and how can we formalise these? For example, consider the verbs 'spread', 'cover', 'surround', 'embed', 'emerge', 'penetrate' and similar actions that refer to a change in some spatially extended relation. Any ideas on this? Has anyone in this area even considered such actions/changes?

 
  23.10 Murray Shanahan    

Pat wrote:
  Suppose someone in this field were to announce that they had the whole thing finished, all the problems solved, etc. What tests would we ask them to pass before we believed them?

We need some characterisation of what ... proper reasoning is, or at least some examples of where it can be found. ... we don't have any hard data about 'common sense' at all, and the intuitions we appeal to often confuse linguistic, psychological and pragmatic issues.

This is where building robots based on logic-based KR formalisms comes into its own. When we construct a logical representation of the effects of a robot's actions and use that theory to decide the actions the robot then actually performs, we have one clear criterion for judging the formalisation. Does the robot do what it's supposed to? There are other criteria for judging the formalisation too, of course, such as its mathematical elegance. But when our formalisations are used to build something that actually does something, we're given an acid test. Furthermore, when the "something that actually does something" is a robot, we're forced to tackle issues to do with action, space, shape, and so on, which I think are crucial to common sense.

  One of the biggest failures of the KR community generally is that it is virtually impossible to actually publish a knowledge representation itself! One can talk about formalisms and semantics and equivalences etc. etc. (the stuff in Erik's list), but this is all part of the *metatheory* of knowledge representation. But when it comes to actually getting any representing done, we hardly hear about that at all.

Absolutely! More papers in the Naive Physics Manifesto vein, please. However, I did manage to "actually publish a knowledge representation itself" in ECAI-96, and won the best paper prize for it. The paper supplies axioms describing the relationship between a mobile robot and the world, specifically the effect of the robot's actions on the world and the impact of the world on the robot's sensors. Two papers on the same theme appear in AAAI-96 and AAAI-97. (See

    http://www.dcs.qmw.ac.uk/~mps/pubs.html
under the Robotics heading.)

 
  23.10 Erik Sandewall    

Pat,

I am puzzled by your remarks, because while I agree with most of your points, I think they have already been answered by research especially during the last five years. With respect to your second point, concerning the situation calculus as an example of a theory with staying power but considerable weaknesses, exactly those observations have led to the work on reasoning about actions using first-order logic with explicit metric time (integers and reals, in particular). This approach was introduced in systematic fashion by Yoav Shoham. It has been continued under the banners of "features and fluents" (in my own group) and "event calculus" (Shanahan, Miller, and others). To check off your points, we do model the world with successive and (if applicable) continuous change, we are able to reason about exogenous events, and of course we can combine prediction, postdiction, planning, and so on in the same formal system. Also, we do use pairs of numbers to characterize intervals. It is true that the classical Kowalski-Sergot paper from 1986 about the event calculus is formulated in terms of intervals and does not mention metric properties, but the more recent event-calculus literature uses timepoints and defines intervals as pairs of timepoints.
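
For readers who have not seen it spelled out, the core of the simplified event calculus used in this line of work can be sketched roughly as follows (two clauses of one standard formulation with explicit timepoints; the details vary from paper to paper):

    Happens(e, t1) & Initiates(e, f, t1) & t1 < t2 & ~Clipped(t1, f, t2)
       =>  HoldsAt(f, t2)

    Clipped(t1, f, t2)  <=>  exists e, t [ Happens(e, t) & Terminates(e, f, t) &
                                           t1 < t & t < t2 ]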

With respect to worlds where there is change that's not being caused by actions, see my KR 1989 paper which proposes how to embed differential calculus into a nonmonotonic logic, and to generalize minimization of change to minimization of discontinuities for dealing with mode changes in a hybrid world. See also the IJCAI 1989 paper which shows how to reason about actions in the presence of such external events, under uncertainty about their exact timing. The same general approach has been pursued by Dean, Shanahan, Miller, and others, and Murray Shanahan's award paper last year shows that this is now a very productive line of research.

We can certainly discuss whether the shortcomings in the basic sitcalc can be fixed by add-ons, or whether a metric-time approach is more fruitful, and this discussion is likely to go on for a while (see also Ray Reiter's comments, next contribution). However, if we agree about the shortcomings of sitcalc, it might also be interesting to discuss why it has been able to maintain its dominance for so long. What kind of inertia is at work here?

Also, with respect to your first observation:

Knowledge-hackers try to formalise an intuition using logic A and find it hard to match formal inference against intuition no matter how ingenious they are with their ontologies and axioms; so they turn to logic B, which enables them to hack the examples to fit intuition rather better...

this is true, of course, but the remedy exists and has been published: it is the systematic methodology which I introduced in (the book) "Features and Fluents". In brief, the systematic methodology program proposes to work in the following steps:

In this way, we don't have to validate the logics against the ill-defined notion of common sense. An additional step may also be appropriate, namely, to compare the intended conclusions (as specified by the underlying semantics) with the conclusions that people would actually tend to make by common sense. However, that would be a task for psychologists, and not for computer scientists.

With respect to your final point:

... when do we decide that something warrants the title of "theory/ approach/ formalisation.."? The sit. calc. is just a style of writing axioms, and the Allen algebra is just a complicated way to arrange order relationships. These seem to be little more than what Apple tried to sue IBM for, ie something like a 'look-and-feel'.

it seems to me that what really counts in the long run is things like proven range of applicability results, proven methods for transforming logic formalizations to effectively computable forms, etc. However, we can't avoid the fact that whoever writes a paper using formalization F is well advised to include the standard references to where the formalization F was first introduced and defended. Again, Leora's question about staying power becomes significant: if introducing a new formalism can give you a high Citation Index rating for very little work, what are the factors that dictate success and failure for formalizations? Does a formalization win because it solves problems that previously proposed formalizations didn't - or is it more like in the world of commercial software, where people tend to go for the de facto standard?

 
  31.10 Pat Hayes    

I wrote and Murray answered as follows:

  We need some characterisation of what ... proper reasoning is, or at least some examples of where it can be found. ... we don't have any hard data about 'common sense' at all, and the intuitions we appeal to often confuse linguistic, psychological and pragmatic issues.

  This is where building robots based on logic-based KR formalisms comes into its own. When we construct a logical representation of the effects of a robot's actions and use that theory to decide the actions the robot then actually performs, we have one clear criterion for judging the formalisation. Does the robot do what it's supposed to? There are other criteria for judging the formalisation too, of course, such as its mathematical elegance. But when our formalisations are used to build something that actually does something, we're given an acid test. Furthermore, when the "something that actually does something" is a robot, we're forced to tackle issues to do with action, space, shape, and so on, which I think are crucial to common sense.

I'm sympathetic to the fact that robot-testing forces one into the gritty realities of the actual world, and admire Murray's work in this direction. However, I think that to use this as a paradigm for testing formalizations gets us even deeper into the other problem I worry about, which is how to separate the formalism itself from all the rest of the machine it is embedded in. With robots there are even more things that stand between the formalization and the test: all the architectural details of the robot itself, the way its sensors work, etc., are likely to influence the success or otherwise of the robot's performance; and perhaps a better performance can be achieved by altering these aspects rather than doing anything to the logic it uses or the ontology expressed in that logic.

The same kind of problem comes up in cognitive psychology. It is very hard to design experiments to test any theories of cognitive functioning in humans. Noun meanings in psycholinguistics are about as far into the mind as any empirical tests have been able to penetrate; other, non-cognitive, factors interfere so much with anything measurable that hard data is virtually unobtainable.

(On the other hand, maybe this is something to be celebrated rather than worried about! On this view, influenced by 'situatedness', one shouldn't expect to be able to divorce an abstract level of logical representation from the computational architecture it is supposed to be implemented on. I expect this view is not acceptable to most subscribers to this newsletter, however, on general methodological grounds. :-)

Erik wrote:
  Pat,

I am puzzled by your remarks, because while I agree with most of your points, I think they have already been answered by research especially during the last five years.....

Even if I were to agree, I would just cast my remarks entirely in the past tense and only point to the fact that sitcalc exercised a remarkably strong hold on everyone's imaginations for a very long time in spite of its shortcomings. It still provides an example for Leora's query. As you say:

  ... since we agree about the shortcomings of sitcalc, it might also be interesting to discuss why it has such remarkable inertia. Does the frame assumption apply to theories, and what actions affect the research community's choice of theoretical framework?

Yes, I think that there was (and still is) a tendency for the field to go through the following loop. We start with a genuine research problem; make some initial progress by inventing a formalism; the formalism fails to fit the original goals, but itself becomes the subject of investigation, and its failings themselves the subject of research; and then this research effort detaches itself completely from the original goal and becomes an end in itself. You provide a very elegant example of this with the methodology you suggest for evaluating formalisations:

  .... the remedy exists and has been published: it is the systematic methodology which I introduced in (the book) "Features and Fluents". In brief, the systematic methodology program proposes to work in the following steps:

  • Define an underlying semantics for a suitable range of problems. The definition must be strictly formal, and should as far as possible capture our intuitions wrt inertia, ramification, etc. ...

  • Define a taxonomy of scenario descriptions using the underlying semantics. ...

  • Analyse the range of applicability of proposed entailment methods (for example involving chronological minimization, or occlusion, and/or filtering). ...

In this way, we don't have to validate the logics against the ill-defined notion of common sense; validation is performed and range of applicability is defined from perfectly precise concepts.

Yes, but to what end? The things you characterise as 'ill-defined' are the very subject-matter which defines our field. There is no objective account of 'action', 'state', etc. to be found in physics, or indeed any other science; our intuitions about these things are the only ultimate test we have for the correctness or appropriateness of our formalisms. There's no way for us to escape from philosophical logic into the clean, pure halls of JSL. For example, your first step requires a formal semantics which captures our intuitions regarding "inertia, ramification, etc.". But these are technical terms arising within the theory whose validity we are trying to test. People don't have intuitions about such things: they have intuitions about space and time, tables and chairs, liquids and solids, truth and lies; about the stuff their worlds are made of. Even if people did have intuitions about inertia and ramification, those intuitions wouldn't be worth a damn, because they would be intuitions about their own reasoning, and one thing that psychology can demonstrate very clearly is that our intuitions about ourselves are often wildly mistaken.

  And how do these formal structures relate to real common sense? Well, an additional step may also be appropriate, namely, that of comparing the intended conclusions (as specified by the underlying semantics) with the conclusions that people would actually tend to make by common sense. However, that would be a task for psychologists, and not for computer scientists.

Surely this must be done first (if we are to pretend to be still pursuing the original research goals which gave rise to this field). Until the 'psychologists', or somebody, has told us what it is that our formalisms are supposed to be doing, speculation about their properties is just an exercise in pure mathematics.

 
  3.11 Michael Gelfond    

I found parts of this discussion difficult to follow, since I am not sure what the participants mean by the word "theory". Are you referring to a theory as an organized body of knowledge about some subject matter, to a theory in the mathematical sense (like the theory of probability), or to a logical theory - a collection of formulae in some language with a precisely defined entailment relation? (This is of course a very incomplete list of possibilities.)

It is important to be somewhat more precise here, because in the AI community "theory" is sometimes identified with an "idea", and I am not sure that it is very useful to publicly judge ideas until they develop into theories. Sometimes this process takes much more than 25 years, especially if the idea is prevented from its natural development by premature judgments, or if the development of a theory requires more than one basic idea.

 
  5.11 Hector Geffner    

  1. I think the goal in KR/Non-Mon is modeling, not logic. A formalism may be interesting from a logical point of view, and yet useless as a modeling language.

    A "solution" is thus a good modeling language:

    declarative, general, meaningful, concise, that non-experts can understand and use, etc. (I agree with David's remark on teaching the stuff to "cynical" undergrads)

    The analogy to Bayesian Networks and Logic Programs that David makes is very good. We want to develop modeling languages that are like Bayesian networks, but that, on the one hand, are more qualitative (assumptions in place of probabilities), and on the other, more expressive (domain constraints, time, first-order extensions, etc).

  2. For many years, it was believed that the problem was mathematical (which device to add to FOL to make it non-monotonic). That, however, turned out to be only part of the problem; a part that has actually been solved: we have a number of formal devices that yield non-mon behavior (model preference, kappa functions, fixed points, etc.); the question is how to use them to define good modeling languages.

  3. The remaining problem, which we can call the semantic problem, involves things like the frame problem, causality, etc.

    To a large extent, I think the most basic of these problems have also been solved:

    Basically, thanks to Michael and Vladimir, Erik, Ray, and others we know that a rule like:

    if A, then B

    where A is a formula that refers to time  i  or situation  s , and B is a literal that refers to the next time point or situation, is just a constraint on the possible transitions from the states at  i  or  s  to the following states.

    Or, put another way, temporal rules are nothing but a convenient way of specifying a dynamic system (or transition function); see the small sketch at the end of this contribution.

    Actually, for causal rules, the solution (due to Moises, Judea, and others) is very similar: causal default rules are just a convenient way of specifying (qualitative) Bayesian Networks.

  4. These solutions (which appear in different guises) are limited (e.g., in neither case can B be an arbitrary formula) but are meaningful: not only do they work, we can also understand why.

    We also understand now a number of things we didn't understand before.

    e.g., 1. a formula can have different "meanings" according to whether it represents a causal rule, an observation or a domain constraint.

    (this is not surprising from a Bayesian Net or Dynamic systems point of view, but is somewhat surprising from a logical point of view)

    2. reasoning forward (causally or in time) is often but not always sound and/or complete; i.e., in many cases, forward chaining and sleeping dog strategies will be ok, in other cases, they won't.

  5. It's not difficult to change the basic solutions to accommodate additional features (e.g., non-deterministic transition functions, unlikely initial conditions, concurrent actions, etc.) in a principled way.

    So, I think, quite a few problems have been solved and default languages, in many cases, are ripe for use by non non-mon people.

  6. We have to do a better job of packaging the available theory for the outside world, and of delineating the solved problems, the unsolved problems and the non-problems, for the inner community and students.

    Actually, I have been doing some of this myself, giving a number of tutorials over the last couple of years at a number of places (I invite you to look at the slides at http://wwww.ldc.usb.ve/~hector).
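
The sketch referred to under point 3 above: a minimal illustration (an invented load/wait/shoot example, written in Python only because some executable notation is needed) of the point that a rule of the form "if A at time i then B at time i+1" is just a constraint on the transition function of a dynamic system.

    # States are dicts of fluent values; rules constrain the successor state.
    def transition(state, action):
        # One legal transition function satisfying two rules:
        #   if the action is load, then loaded holds next;
        #   if loaded holds and the action is shoot, then alive does not hold next.
        # Fluents not touched by any rule persist (a built-in frame assumption).
        next_state = dict(state)
        if action == "load":
            next_state["loaded"] = True
        if action == "shoot" and state["loaded"]:
            next_state["alive"] = False
            next_state["loaded"] = False
        return next_state

    s0 = {"alive": True, "loaded": False}
    s1 = transition(s0, "load")
    s2 = transition(s1, "wait")      # nothing changes
    s3 = transition(s2, "shoot")
    print(s3)                        # {'alive': False, 'loaded': False}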

 
  5.11 Pat Hayes    

I think this meta discussion, though at times confused (mea culpa, of course), has been useful in revealing a clear divergence between two methodologies, giving different answers to the original question about how we should evaluate work in the field. ("NRAC panel on theory evaluation", ENRAC 21.10).

One view appeals to our human intuitions, one way or another. In this it is reminiscent of linguistics, where the basic data against which a theory is tested are human judgements of grammaticality. We might call this a 'cognitive' approach to theory testing. Talk of 'common sense' is rife in this methodology. Based on the views expressed in these messages, I would place myself, Erik Sandewall, Michael Gelfond in this category. The other, exemplified by the responses of Ray Reiter, Mikhail Soutchanski and Murray Shanahan, emphasises instead the ability of the formalism to produce successful behavior in a robot; let me call this the 'behavioral' approach.

This distinction lies orthogonal to the extent to which people find formality more or less congenial. Both Ray and Erik dislike 'vague claims', and Erik's suggested methodology (Newsletter 23.10) meticulously avoids all contact with psychology, as he emphasises; yet he ultimately appeals to capturing our intuition, rather than any successful application in a robot, to tell us which kinds of model-theoretic structures are more acceptable than others. It also lies orthogonal to the extent to which people see their ultimate goal as that of creating a full-blown artificial intelligence (as both Wolfgang Bibel and Mikhail Soutchanski seem to, for example, along with our founder, John McCarthy), or might be satisfied with something less ambitious. This distinction in approaches - start with insects and work 'up', or start with human common sense and work 'down' - is also a methodological split within AI in general, and seems to be largely independent of whether one feels oneself to be really working towards a kind of ultimate HAL.

Do people find this distinction seriously incomplete or oversimplified? (Why?) Or, on the other hand, if they find it useful, on which side of the division would they place themselves? In a nutshell, is the immediate goal of the field to understand and accurately model human intuitions about actions, or is it to help produce artifacts which behave in useful or plausible ways? I think this is worth getting clear, not to see which 'side' wins, but to acknowledge that this difference is real, and likely to produce divergent pressures on research.

 
  6.11 Erik Sandewall    

Pat,

I agree with you that it's time to sort out the different perspectives, goals, and methods for reaching the goals that have confronted each other here. You write of two dimensions; in the first one you make the following distinction:

  One view appeals to our human intuitions, one way or another. In this it is reminiscent of linguistics, where the basic data against which a theory is tested are human judgements of grammaticality. We might call this a 'cognitive' approach to theory testing. Talk of 'common sense' is rife in this methodology. Based on the views expressed in these messages, I would place myself, Erik Sandewall, Michael Gelfond in this category. The other, exemplified by the responses of Ray Reiter, Mikhail Soutchanski and Murray Shanahan, emphasises instead the ability of the formalism to produce successful behavior in a robot; let me call this the 'behavioral' approach.

I agree with this, except that the term `behavioral' is maybe not the best one, and also you put me in the wrong category; more about that later. Anyway, the distinction you make here seems to coincide with the one that David Poole made in his position statement:

  There are two quite different goals people have in building KR system; --- These are:

1. A knowledge representation as a modelling language. If you have a domain in your head you can use the KR to represent that domain. ---

2. A knowledge representation as a repository of facts for commonsense reasoning. Under this scenario, you assume you are given a knowledge base and you are to make as much sense out of it as possible. ---

If you are going to design a robot in a good engineering sense, you are going to need to model both the robot itself and its environment. That's why what you call the `behavioral' approach coincides with the use of KR for modelling physical systems. Since `modelling' can mean many things, I'll further qualify it with the term `design goal'.

As for the other dimension, you propose

  --- the extent to which people find formality more or less congenial. Both Ray and Erik dislike 'vague claims' ---

This distinction I find less informative, since all the work in this area is formal in one way or another. Even the kludgiest of programs exhibits `formality'. However, different researchers do take different stands wrt how we choose and motivate our theories. One approach is what you described in your first response to the panel (ENRAC Newsletter on 22.10):

  Knowledge-hackers try to formalise an intuition using logic A and find it hard to match formal inference against intuition no matter how ingenious they are with their ontologies and axioms; so they turn to logic B, which enables them to hack the examples to fit intuition rather better.

The key word here is examples. In this example-based methodology, proposed logics are treated like hypotheses in a pure empirical paradigm: they are accepted until a counterexample is found; then one has to find another logic that deals correctly at least with that example. Ernie Davis characterized this approach in his book, Representation of Commonsense Knowledge [mb-Davis-90]. See also the discussion of this approach in my book, Features and Fluents [mb-Sandewall-94], p. 63.

The example-based methodology has several problems:

The choice of methodology is indeed orthogonal to your first distinction, since the example-based methodology can be used both in the pursuit of theories of common sense, and in the development of intelligent robots by design iteration (try a design, see how it works, revise the design).

The alternative to this is to use a systematic methodology where, instead of searching for the "right" theory of actions and change, we identify a few plausible theories and investigate their properties. For this, we need to use an underlying semantics and a taxonomy of scenario descriptions; we can then proceed to analyse the range of applicability of proposed theories (entailment methods).

Your answer to this was (31.10):

  Yes, but to what end? The things you characterize as `ill-defined' are the very subject-matter which defines our field. There is no objective account of `action', `state', etc. to be found in physics, or indeed in any other science; intuitions about these things are the only ultimate test we have for the correctness or appropriateness of our formalisms.---

This would be true if the `cognitive' (in your terms) goal were the only one. From the point of view of modelling and design, on the other hand, these are perfectly valid concepts. The concept of state is used extensively in control engineering (yes, control theory does deal with discrete states, not only with differential equations!), and I am sure our colleagues in that area would be most surprised to hear that our intuitions are "the only ultimate test we have" for the correctness or appropriateness of the formalisms that they share with us.

Now, when you placed me in the cognitive category, you got me wrong. As I wrote in my position statement for this panel, my heart is with the use of knowledge representations as modelling languages. The present major project in our group is concerned with intelligent UAVs (unmanned aircraft), and in this enterprise we need a lot of modelling for design purposes; we currently have no plans to pursue the `cognitive' goal.

However, just as the example-driven methodology can serve both the cognitive goal and the design goal, I do believe that the systematic methodology can also be relevant as one part of a strategy to achieve the `cognitive' goal. More precisely, for the reasons that both you and I have expressed, it's not easy to find any credible methodology for research on understanding the principles of common sense, and in fact I did not see any concrete proposal for such a methodology in your contributions. However, to the extent that people continue to pursue that goal, my suggestion was to divide the problem into two parts: one where our discipline can say something substantial, and one which is clearly in the domain of the psychologists.

Therefore, the contradiction that you believed you had seen when writing

  ... and Erik's suggested methodology (Newsletter 23.10) meticulously avoids all contact with psychology, as he emphasises; yet he ultimately appeals to capturing our intuition, rather than any successful application in a robot, to tell us which kinds of model-theoretic structures are more acceptable than others.

is not a real one; it only arises from your perception that

  ... this distinction in approaches - start with insects and work 'up', or start with human common sense and work 'down' - is also a methodological split within AI in general, and seems to be largely independent of whether one feels oneself to be really working towards a kind of ultimate HAL.

which I do not share. After all, the behavioral/commonsense view and the modelling/design view represent goals, not methodologies, and both choices of methodology (the example-based and the systematic one) can be applied towards both goals.

References:

[mb-Davis-90] Ernest Davis. Representation of Commonsense Knowledge. Morgan Kaufmann Publishers, Inc., 1990.

[mb-Sandewall-94] Erik Sandewall. Features and Fluents: The Representation of Knowledge about Dynamical Systems. Oxford University Press, 1994.

 
  20.1 Pat Hayes    

Ray Reiter wrote (21.10, position statement for panel):

  1. Erik's notion of an ontology seems odd to me, mainly because it requires "that the "frame assumption" or assumption of persistence must be built into the ontology".

Yes, I agree. Why choose just that assumption to build in, in any case? It clearly isn't always true (for example, if we are considering temperatures, cooking, drying paint or leaky containers, or indeed any kind of process which all by itself will produce a significant change as time goes by; or when we know that our information is imperfect; or when we have reason to suppose that there may be other agents trying to frustrate us, or even just working in the same area with their own goals which might interfere with ours; or if we know that we are dealing with an oscillating or unstable system, or one that requires constant effort to maintain its stability). There are many other equally plausible assumptions. For example, the assumption that things have been pretty much as they are now in the recent past (back-persistence), or that nothing will make any significant difference to anything a long way away (distance-security) or on the other side of a significant barrier (the privacy assumption). All of these, and others, are equally correct and about as useful as the assumption of persistence in limiting the spread of causal contagion.

But I think I may have understood what Erik means. (Erik, can you confirm or deny?) Let me reiterate some old observations. If the world were really a very turbulent place where any action or event might have any kind of consequences, producing all kinds of changes, then there would be no 'frame problem'. While it would of course then be difficult to describe the effects and non-effects of actions, this wouldn't be surprising, and we wouldn't call it a "problem". The FP only seems to be a problem to us because we feel that we should be able to somehow take better advantage of the world's causal inertia. So, perhaps this is what Erik means by saying that the 'frame assumption' must be built in: he wants a model-theoretic characterisation of this causal inertia, a way to exclude models which, when thought of as physical worlds, would be worlds where things happen for no reason. He wants us to somehow delineate what the 'physically normal' interpretations are.

If this is more or less right, then there seems to me to be a central question. How can we specify the relationship of the logical 'possible world' (which is just a mathematical structure of sets of subsets of ordered pairs, etc.) to the physically possible worlds about which we have intuitions to guide us? This difficulty is illustrated by the recent discussions here. For example, my bitchin' over the distinction between gof-sitcalc and R-sitcalc comes from such a difference. Both of these use a similar notation, considered as first-order axioms: they both have things which have the role of state-to-state functions but which are reified into first-order objects, and a special function which takes a state and such an object and produces a new state. In gof-sitcalc, these are supposed to represent actions taken by agents. In R-sitcalc, they correspond rather to beginnings and endings of processes which are extended through time. The difference lies not in the axiomatic signature or even in the model theory itself, but rather in the intended mapping between the (set-theoretic) model theory and the actual physical world being modelled. We have intuitions about the physical worlds, but we don't have any physical intuitions about formal models.

Here's another illustration. I've never been very impressed by the famous Yale shooting problem, simply because it doesn't seem to me to be a problem. All the usual axioms say is that something is done, something else is done and then a third thing is done, and the outcome is unusual; and the 'problem' is supposed to be that the logic allows that it might have been the second thing that was unusual instead of the third one. When the vocabulary suggests that the third thing is a shooting and the second a mere waiting, this is deemed to be a mistake, and much research has been devoted to finding principles to exclude such models. But if the vocabulary is interpreted differently (for example, if the second event had been a shooting and the third one a waiting) this wouldn't be unintuitive. But the usual situation-calculus formalisations of this problem provide no way to distinguish these cases! They say virtually nothing about the actual physical actions involved; certainly not enough about these actions to enable a sensible choice to be made between one formal interpretation and the other. To say that a shooting is 'abnormal' in that it kills someone, in the same sense that, say, a grasping which fails to grasp or which accidentally knocks over something unexpectedly is 'abnormal', is just a misuse of a formal device. There's nothing abnormal about a shooting resulting in bodily harm, and to invoke circumscription to overcome a 'normal' case of actions being harmless seems obviously too crude a mechanism to be generally useful. (For a general strategy for defusing YSP-type examples, consider a gunfighter who always blows the smoke from his barrel after a successful fight, and ask what persistence or inertial principles are going to ensure that it's his bullets that kill people, and not that final flourish.)
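
For reference, the scenario under discussion can be sketched as follows (a schematic rendering for readers who have not seen it spelled out; details differ between presentations):

    Holds(alive, s0)
    Holds(loaded, do(load, s))                         -- loading loads the gun
    Holds(loaded, s) => ~Holds(alive, do(shoot, s))    -- shooting a loaded gun kills
    plus a persistence default: fluents normally survive an action unchanged

    Intended model:   loaded persists through the wait; the shooting is the
                      "abnormal" step, clipping alive.
    Anomalous model:  loaded is "abnormally" clipped during the wait; alive then
                      persists through the shooting.

The observation of Hanks and McDermott was that simply minimising abnormality does not prefer the first model to the second; the point above is that nothing in the axioms themselves, as opposed to the suggestive vocabulary, says which of the two should be preferred.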

Pat Hayes

 
  21.1 Hector Geffner    

Pat says:

  ...Here's another illustration. I've never been very impressed by the famous Yale shooting problem, simply because it doesn't seem to me to be a problem ....

I'm not sure I understand Pat's point well, but I think I understand the YSP. Here is the way I see it.

In system/control theory there is a principle normally called the "causality principle" that basically says that "actions cannot affect the past". If a model of a dynamic system does not comply with this principle, it's considered "faulty".

In AI the same principle makes perfect sense when actions are exogenous; such actions, I think we can agree, should never affect your beliefs about the past (indeed, as long as you cannot predict exogenous actions from your past beliefs, you shouldn't change your past beliefs when such actions occur).

What Hanks and McDermott show is that certain models of action in AI (like simple minimization of abnormality) violate the causality principle. In particular they show that

your beliefs at time 2, say, after LOAD AND WAIT (where you believe the gun is loaded)

are different from your beliefs at time 2, after LOAD, WAIT and SHOOT.

Namely, SHOOT at t=3 had an effect on your past beliefs (LOADED at t=2).
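For concreteness, here is a small propositional reconstruction of the anomaly (a deliberately simplified sketch, not Hanks and McDermott's original circumscriptive axiomatisation): two fluents over four states, effect axioms for LOAD and SHOOT, nothing at all said about WAIT, and "minimising abnormality" rendered as preferring models with inclusion-minimal sets of fluent changes. Among the minimal models is one in which the gun becomes unloaded during the wait and the victim survives, alongside the intended one in which the gun stays loaded and the shooting is fatal.

```python
from itertools import product

# Simplified Yale shooting scenario: states s0..s3, with transitions
# load (s0->s1), wait (s1->s2), shoot (s2->s3).
FLUENTS = ("alive", "loaded")
STATES = 4

def satisfies_axioms(h):
    """h is a tuple of 4 dicts mapping fluent -> bool."""
    return (h[0]["alive"] and not h[0]["loaded"]          # initial state
            and h[1]["loaded"]                            # effect of load
            and not (h[2]["loaded"] and h[3]["alive"]))   # effect of shoot (nothing about wait)

def changes(h):
    """The set of (fluent, t) pairs at which the fluent's value changes."""
    return frozenset((f, t) for t in range(STATES - 1)
                     for f in FLUENTS if h[t][f] != h[t + 1][f])

def histories():
    for bits in product((False, True), repeat=STATES * len(FLUENTS)):
        yield tuple({f: bits[len(FLUENTS) * t + i] for i, f in enumerate(FLUENTS)}
                    for t in range(STATES))

models = [h for h in histories() if satisfies_axioms(h)]
# "Minimising abnormality" here: keep only models whose change set is
# minimal under set inclusion.
minimal = [h for h in models
           if not any(changes(g) < changes(h) for g in models)]
for h in minimal:
    print(sorted(changes(h)), "alive at the end:", h[3]["alive"])
# Among the minimal models is one where 'loaded' flips during the wait and
# the victim is alive at the end -- the Hanks-McDermott anomaly.  (This
# simplified encoding also admits a few other minimal models, e.g. where
# 'alive' flips earlier, since the axioms underconstrain the scenario.)
```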

Most recent models of action comply with the causality principle. In some it comes for free (e.g., the language  A ), due to the semantic structures used (transition functions); in others (Reiter, Sandewall, etc.), I'm sure it can be proved.

Regards.

- Hector Geffner

 
  21.1 Erik Sandewall    

Pat, citing Ray Reiter's earlier contribution, you wrote:

  1. Erik's notion of an ontology seems odd to me, mainly because it requires "that the "frame assumption" or assumption of persistence must be built into the ontology".
  Yes, I agree. Why choose just that assumption to build in, in any case? ...

Well, Ray and you are bringing up two different issues here. Ray's objection was with respect to classification: he argued that the frame assumption (when one uses it) ought to be considered as epistemological rather than ontological. (In the position statement that he referred to, I had proposed a definition of ontology and suggested that the situation calculus does not represent one, since the frame assumption is represented by separate axioms rather than being built into the underlying ontology). On the other hand, the question that you bring up is what kind or kinds of persistence we ought to prefer: forward in time, backward in time, geometrical, etc.

Let me address your letter first. I certainly agree with the analysis in the second paragraph of your message: the world is not entirely chaotic, some of its regularities can be characterized in terms of persistence (= restrictions on change, or on discontinuities in the case of piecewise continuous change) and all those exceptions to persistence that are now well-known: ramifications, interactions due to concurrency, causality with delays, surprises, and so on.

For quite some time now, research in our field has used a direct method in trying to find a logic that is capable of dealing correctly with all these phenomena, that is, by considering a number of "typical" examples of common-sense reasoning and looking for a logic that does those examples right. My concern is that this is a misguided approach, for two reasons:

What I proposed, therefore (in particular in the book "Features and Fluents"), was to subdivide this complex problem into the following loop of manageable parts (the "systematic methodology"); a small schematic sketch of items 2-4 follows the list:

  1. Define an ontology, that is, a "model" of what the world is like. States and assignment-like state transitions constitute a very simple such ontology. Ramifications, concurrency, and so on are phenomena that call for more complex ontologies. Make up your mind about which of them you want to allow, and leave the others aside for the time being.

  2. Define an appropriate logical language for describing phenomena in the ontology, including actions. Each combination of ontology and language defines a mapping from a set of formulae to the set of intended models for those formulae.

  3. Define entailment methods, that is, mappings from the set of classical models to a modified set called the selected models. Usually, the selected models are a subset of the classical models.

  4. Identify the range of applicability for each entailment method, that is, the conditions which guarantee that the selected models are exactly the intended ones.

  5. Define "implementations" of entailment methods by expressing them e.g. in circumscription, or using modified tableau techniques. If the implementation is done right, then the range of applicability for the entailment method is also the range of applicability of its implementations.

  6. When one has obtained sufficient understanding of points 2-5 for a given ontology, then define a richer one (allowing for additional phenomena of interest), and go back to item 2.
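As a schematic illustration of items 2-4 (one possible reading, with each model identified simply with the set of unexplained changes it postulates; the particular names below are only illustrative), an entailment method can be viewed as a function that selects a subset of the classical models, and the range-of-applicability question of item 4 as the question of whether the selected models coincide with the intended ones:

```python
def entailment_method(classical_models, prefer):
    """Keep only the models that are minimal under the strict preference 'prefer'."""
    return {m for m in classical_models
            if not any(prefer(other, m) for other in classical_models if other != m)}

def within_range(classical_models, intended_models, prefer):
    """Item 4: does the entailment method select exactly the intended models here?"""
    return entailment_method(classical_models, prefer) == intended_models

# Toy instance: identify each model with the set of unexplained changes it
# postulates, and prefer models that postulate strictly fewer such changes.
m_intended = frozenset({"alive changes at shoot"})
m_anomalous = frozenset({"loaded changes at wait"})
classical = {m_intended, m_anomalous}
prefer = lambda a, b: a < b   # proper subset: a postulates strictly less than b

print(entailment_method(classical, prefer))           # both models: they are incomparable
print(within_range(classical, {m_intended}, prefer))  # False: selected != intended
```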

This agenda certainly aims to address all the intricacies that you mention in the first paragraph of your message, but only in due time. We cannot do everything at once; if we try, we'll just run around in circles.

In the Features and Fluents approach we have iterated this loop a few times, starting with strict inertia and then adding concurrency and ramification, doing assessments in each case. What about the other major current approaches? Early action languages, in particular  A , fit nicely into this paradigm, except that whereas above we use one single language and two semantics (classical models and intended models),  A  uses two different languages, each with its own semantics. However, later action languages, such as  AR , do not qualify, since they define the models of the action language (the intended models, in the above terms) using a minimization rule. To me, minimization techniques belong among the entailment methods which are to be assessed according to the paradigm, but the gold standard that we assess them against should not use such an obscure concept as minimization.

On similar grounds, I argued that a situation-calculus approach where a frame assumption is realized by a recipe for adding more axioms to a given axiomatization does not really define an ontology. It can be measured against an ontology, of course, but it does not constitute one.

Ray's argument against that was that the frame assumption is inherently epistemological, or maybe metaphysical. Since most people would probably interpret "metaphysical" as "unreal" rather than in the technical sense used by philosophers, we couldn't really use that term. With respect to the term epistemological, I just note that some entailment methods have been observed to have problems e.g. with postdiction: prediction works fine but postdiction doesn't. This means that when we specify the range of applicability of an entailment method, we cannot restrict ourselves to ontological restrictions, such as "does this method work if the world behaves nondeterministically?"; we must also take into account those restrictions that refer to the properties of what is known and what is asked, and to their relationship. The restriction to work only for prediction is then, for me, an epistemological restriction.

Against this background, Ray then questioned whether the frame assumption itself is ontological or epistemological in nature. I'd say that in a systematic methodology (as in items 1-6 above), the ontology that is defined in step 1 and revised in step 6 must specify the persistence properties of the world; otherwise there isn't much one can say with respect to assessments. This technical argument is, I think, more useful than the purely philosophical question of what it "really is".

You then address the following question:

  How can we specify the relationship of the logical 'possible world' (which is just a mathematical structure of sets of subsets of ordered pairs, etc.) to the physically possible worlds about which we have intuitions to guide us? This difficulty is illustrated by the recent discussions here. For example, my bitchin' over the distinction between gof-sitcalc and R-sitcalc comes from such a difference. Both of these use a similar notation, considered as first-order axioms: they both have things which have the role of state-to-state functions but which are reified into first-order objects, and a special function which takes a state and such an object and produces a new state. In gof-sitcalc, these are supposed to represent actions taken by agents. In R-sitcalc, they correspond rather to beginnings and endings of processes which are extended through time. The difference lies not in the axiomatic signature or even in the model theory itself, but rather in the intended mapping between the (set-theoretic) model theory and the actual physical world being modelled...

Yes, exactly! There are two different ontologies at work here; my argument would be that each of them should be articulated in terms that are not only precise but also concise, and which facilitate comparison with other approaches both within and outside KR.

But your question at the beginning of this quotation is a fundamental one: how do we choose the additional ontological structures as we iterate over the systematic methodology loop, and how do we motivate our choices?

In some cases the choice is fairly obvious, at least if you have decided to base the ontology on a combination of structured states and linear metric time (integers or reals). Concurrency, chains of transitions, immediate (delay-free) dependencies, and surprise changes can then be formalized in a straightforward manner. Also, we can and should borrow structures from neighboring fields, such as automata theory, the theory of real-time systems, and Markov chain theory.

However, there are also cases where the choice is less than obvious. What about the representation of actions by an invocation event and a termination event, which is what R-sitcalc is about? What about the recent proposal by Karlsson and Gustafsson [f-cis.linep.se-97-014] to use a concept of "influences" (vaguely similar to what is used in qualitative reasoning), so that if you try to light a fire and I drip water on the firewood at the same time, then your action has a light-fire-influence and my action has an extinguish-fire-influence, where the latter dominates? (If there is only a light-fire-influence for a sufficient period of time, then a fire results). These are nontrivial choices of ontology; how can we motivate them, assess them, and put them to use?
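As a toy rendering of the fire example (only an illustration of the general idea of dominating influences, not Karlsson and Gustafsson's actual formalism; the required duration below is an arbitrary assumption), one can let concurrent actions contribute influences on a feature, let an extinguish influence dominate a light influence at the same time point, and let the fire start only when an undominated light influence has persisted long enough:

```python
# Toy illustration of dominating "influences" (not the Karlsson-Gustafsson
# formalism): concurrent actions contribute influences on the feature 'fire';
# an extinguish influence dominates a light influence at the same step, and
# the fire starts only after an undominated light influence has been present
# for a sufficient number of steps (the threshold is an assumed parameter).

REQUIRED_DURATION = 3

def fire_results(influences_per_step):
    """influences_per_step: list of sets drawn from {'light', 'extinguish'}."""
    streak = 0
    for influences in influences_per_step:
        if "light" in influences and "extinguish" not in influences:
            streak += 1
            if streak >= REQUIRED_DURATION:
                return True
        else:
            streak = 0  # a dominating extinguish influence (or no influence) resets the count
    return False

# You light the fire for four steps while I drip water at the second step:
print(fire_results([{"light"}, {"light", "extinguish"}, {"light"}, {"light"}]))  # False
# The same attempt without interference succeeds:
print(fire_results([{"light"}] * 4))  # True
```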

To my mind, this ties in with what Bob Kowalski said in the panel discussion at the recent workshop on Formalization of Common Sense: these are pre-logical issues. It is not meaningful to begin writing formulae in logic at once and to ask what variant of circumscription is going to be needed. Instead, one ought to work out an application area of non-trivial size with the proposed ontology, probably also using a tentative syntax that matches the ontology, but without committing to anything else. Only then, once one knows what ontology is needed, is it meaningful to look for entailment methods, and implementations of them, that are appropriate for that ontology.

The bottom line is: let's use the ontology, or the underlying semantics, as an intermediate step on the way from application to implemented system. Going from application to ontology requires one kind of activity; going from ontology to implementation requires another kind. Such a decomposition has all the obvious advantages: it allows one to address simpler problems before proceeding to more difficult ones, it provides a way of characterizing and comparing results, and it facilitates reuse of earlier results.

References:

f-cis.linep.se-97-014  Lars Karlsson and Joakim Gustafsson.
Reasoning about actions in a multi-agent environment.
Linköping University Electronic Press, 1997: [original publication]

 
  22.1 Murray Shanahan    

Pat Hayes wrote,

  I've never been very impressed by the famous Yale shooting problem, simply because it doesn't seem to me to be a problem.

First I think we should distinguish between the Yale shooting scenario and the Hanks-McDermott problem. The Yale shooting scenario is the one in which someone loads, waits and shoots, and in which those actions have certain prescribed effects. The Hanks-McDermott problem is a difficulty that arises when we take certain approaches to solving the frame problem, and is exemplified in the Yale shooting scenario. The frame problem is the problem of describing the effects of actions in logic without recourse to an excess of axioms describing their non-effects.

If you want to solve the frame problem, your solution had better be able to deal with the Yale shooting scenario. A number of early attempts at the frame problem couldn't, which is why the scenario is of interest.

Isn't that all pretty straightforward?

Murray

 
  22.1 Pat Hayes    

Hector Geffner:

  ... I think I understand the YSP. Here is the way I see it.

  In system/control theory there is a principle normally called the "causality principle" that basically says that "actions cannot affect the past". If a model of a dynamic system does not comply with this principle, it's considered "faulty".

  In AI, the same principle makes perfect sense when actions are exogenous; such actions, I think we can agree, should never affect your beliefs about the past (indeed, as long as you cannot predict exogenous actions from your past beliefs, you shouldn't change your past beliefs when such actions occur).

But actions - or external events - do change one's beliefs about the past. They do not change the past itself, of course: that is the causality principle. But consider for example coming into a room in an empty house and finding a hot cup of coffee resting on a table. One immediately infers that someone else has been present there recently. We constantly make inferences about the past on the basis of present knowledge, even in such technical areas as military planning.

I think there is a confusion here between (1) drawing conclusions about an external world and (2) simulating an external world by inferring a state from its previous state. The causality principle applies to the latter, but not the former; and even then, it is reasonable only when accompanied by a presumption of a certain kind of completeness in one's knowledge of the state. We often make predictions of the future by a kind of mental 'simulation', inferring what is going to happen next from what is true now (as in the conventional situation calculus axiomatic approach); but in practice, such simulations are often unreliable precisely because we don't have sufficiently complete knowledge; and when this is so, we cannot cleave to the strict causality principle, but are obliged to use techniques such as nonmonotonic reasoning which allow us to recover gracefully from observed facts that contradict our predictions, and which would otherwise enmesh us in contradictory beliefs. Nonmonotonicity is in fact a good example of the need to revise one's beliefs about the past in the light of unexpected outcomes in the present, which gets us back to the YSP:

  What Hanks and McDermott show is that certain models of action in AI (like simple minimization of abnormality) violate the causality principle. In particular they show that

  your beliefs at time 2, say, after LOAD AND WAIT (where you believe the gun is loaded)

But why should you believe the gun is loaded at this time? Why is this considered so obvious? Remember, all the axioms say about WAIT is ...well, nothing at all. That's the point of the example: if you say nothing about an action, the logic is supposed to assume that nothing much happened. But if what we are talking about is a description of the world, saying nothing doesn't assert blankness: it just fails to give any information. If one has no substantial information about this action, the right conclusion should be that anything could happen. Maybe WAIT is one of those actions that routinely unloads guns, for all I know about it from an axiomatic description that fails to say anything about it. So the 'problem' interpretation about which all the fuss is made seems to me to be a perfectly reasonable one. If I see a gun loaded, then taken behind a curtain for a while, and then the trigger pulled and nothing happens, I would conclude that the gun had been unloaded behind the curtain. So would you, I suspect. If I am told that a gun is loaded and then something unspecified happens to it, I would be suspicious that maybe the 'something' had interfered with the gun; at the very least, that seems to be a possibility one should consider. This is a more accurate intuitive rendering of the YSS axioms than talking about 'waiting'.

We all know that waiting definitely does not alter loadedness, as a matter of fact: but this isn't dependent on some kind of universal background default 'normality' assumption, but follows from what we know about what 'waiting' means. It is about as secure a piece of positive commonsense knowledge as one could wish to find. Just imagine it: there's the gun, sitting on the table, in full view, and you can see that nothing happens to it. Of course it's still loaded. How could the bullet have gotten out all by itself? But this follows from knowledge that we have about the way things work - that solid objects can't just evaporate or pass through solid boundaries, that things don't move or change their physical constitution unless acted on somehow, that guns are made of metal, and so on. And the firmness of our intuition about the gun still being loaded depends on that knowledge. (To see why, imagine the gun is a cup and the loading is filling it with solid carbon dioxide, or that the gun is made of paper and the bullet is made of ice, and ask what the effects would be of 'waiting'.) So if we want to appeal to those intuitions, we ought to be prepared to try to represent that knowledge and use it in our reasoners, instead of looking for simplistic 'principles' of minimising changes or temporal extension, etc., which will magically solve our problems for us without needing to get down to the actual facts of the matter. (Part of my frustration with the sitcalc is that it seems to provide no way to express or use such knowledge.)

I know how the usual story goes, as Murray Shanahan deftly outlines it. There's a proposed solution to the frame problem - minimising abnormality - which has the nice side effect that when you say nothing about an action, the default conclusion is that nothing happened. The Yale-shooting-scenario / Hanks-McDermott problem is that this gives an 'unintuitive' consequence, when we insert gratuitous 'waitings': that these blank actions might be the abnormal ones. My point is that this is not a problem: this is exactly what one would expect such a logic to say, given the semantic insights which motivated it in the first place; and moreover, it is a perfectly reasonable conclusion, one which a human thinker might also come up with, given that amount of information.

Murray says:

  If you want to solve the frame problem, your solution had better be able to deal with the Yale shooting scenario.

This is crucially ambiguous. The conclusion I drew from this example when it first appeared was that it showed very vividly that this style of axiomatisation simply couldn't be made to work properly. So if "the Yale shooting scenario" refers to some typical set of axioms, I disagree. If it refers to something involving guns, bullets and time, then I agree, but think that a lot more needs to be said about solidity, containment, velocity, impact, etc., before one can even begin to ask whether a formalisation is adequate to describing this business of slow murder at Yale. Certainly your solution had better be able to describe what it means to just wait, doing nothing, for a while, and maybe (at least here in the USA) it had better be capable of describing guns and the effects of loading and shooting them. But that's not the same as saying that it has to be able to deal with the way this is conventionally axiomatised in the situation calculus.

Imagine a gun which requires a wick to be freshly soaked in acetone, so that just waiting too long can cause it to become unready to shoot. This satisfies the usual YSS axioms perfectly: when loaded, it is (normally) ready to fire; when fired, it (normally) kills; etc. But if you wait a while, this gun (normally) unloads itself. Now, what is missing from the usual axiomatisation which would rule out such a gun? Notice, one doesn't want to make such a device logically impossible, since it obviously could be constructed, and indeed some mechanisms are time-critical in this way (hand grenades, for example). So one wants to be able to write an axiom which would say that the gun in question isn't time-critical: it has what one might call non-evaporative loading. Maybe it's something to do with the fact that the bullets are securely located inside the gun, and that they don't change their state until fired... or whatever. My point is only that there is no way to avoid getting into this level of detail; formalisations which try to get intuitive results with very sketchy information cannot hope to succeed except in domains which are severely restricted.

(Response to Erik Sandewall in later message.)

Pat Hayes