ETAI NJ Actions and Change

News Journal on Reasoning about Actions and Change


Vol. 2, Nr. 1 Editor: Erik Sandewall 31.1.1998

The ETAI is organized and published under the auspices of the
European Coordinating Committee for Artificial Intelligence (ECCAI).

Contents of this issue

News items during the month

New ENRAC Facilities
Forthcoming book
Top Lines of the Newsletter Issues

ETAI Publications

Discussion about received articles

Articles Published Elsewhere

Discussion with Wolfgang Bibel about his IJCAI lecture

Topic-Oriented Discussions

Methodology of Research in Actions and Change
Ontologies for actions and change


News items during the month

New ENRAC Facilities

Dated: 5.1.1998

The present Newsletter issue is entirely dedicated to a description of the new facilities that have been implemented during the last month, and which are now operational in their main parts.

Formulas in HTML

One change concerns the representation of formulas in the on-line text and, in particular, in HTML. This is important for a wider use of an interactive, on-line communication service in a scientific domain such as ours. There seem to be two problems: graphics and ease of reading. The graphics problem is that present versions of HTML do not provide good support for mathematical characters or for the precise layout that is needed in order to make formulas look good. While waiting for implementations of the recently announced HTML 4.0 standard (which will at least provide the character sets), we have implemented a reasonably workable ad hoc solution within current HTML.

As for ease of reading, it is generally recognized that reading the same mathematical text is more tiring on a screen than on paper. However, we have now made an interesting observation: the use of color in the on-line medium may compensate for the disadvantages that it otherwise has. Basically, the trick is to use several colors that are distinct but not too distinct, such as black and two shades of blue, in order to make the text easier to read. The main text is in black, formulas are in blue, and what would be different fonts on paper is instead rendered as different shades of blue on the screen. This serves a dual purpose: it adds liveliness to the textual surface, and it also helps to remedy the character-set problem. Additional colors and hues are used in the context of citations.

We have developed some simple software that is able to generate both LaTeX and this kind of formula HTML from a single underlying representation, and the body of Newsletter contributions from the past few months is being converted to this representation. This job has been finished for the review discussions about received articles; please take a look at them. The conversion of the article summaries will follow, and then the panel-debate contributions during 1997. Forthcoming contributions will be put into this format as they arrive.
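The dual-output generation from a single underlying representation can be sketched roughly as follows; the tuple-based formula representation, the function names, and the color values are illustrative assumptions, not the Newsletter's actual software.

```python
# A minimal sketch of generating both LaTeX and color-coded HTML from one
# underlying formula representation, in the spirit described above. The
# tuple-based representation and the colors are illustrative assumptions.

def to_latex(f):
    """Render a formula term to LaTeX source."""
    if isinstance(f, str):
        return f                      # atomic symbol
    op, args = f[0], f[1:]
    if op == "holds":                 # e.g. Holds(f, t)
        return r"\mathrm{Holds}(%s)" % ", ".join(to_latex(a) for a in args)
    if op == "and":
        return r" \wedge ".join(to_latex(a) for a in args)
    raise ValueError("unknown operator: %r" % op)

def to_html(f):
    """Render the same term to HTML, formulas in blue as in the on-line text."""
    if isinstance(f, str):
        return '<i><font color="#000080">%s</font></i>' % f
    op, args = f[0], f[1:]
    if op == "holds":
        return ('<font color="#0000C0">Holds(</font>%s<font color="#0000C0">)</font>'
                % ", ".join(to_html(a) for a in args))
    if op == "and":
        return ' <font color="#0000C0">&amp;</font> '.join(to_html(a) for a in args)
    raise ValueError("unknown operator: %r" % op)

formula = ("and", ("holds", "f", "t"), ("holds", "g", "t"))
latex_version = to_latex(formula)   # for the printable PostScript edition
html_version = to_html(formula)     # for the on-line edition
```

The point of the design is that a single traversal of the underlying representation drives both outputs, which is what keeps the on-line and printable editions consistent.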

On-line citations

A second change concerns the representation of citations and other cross-links between discussion contributions, and from discussion contributions to previously published articles. An earlier implementation of an on-line "footnote" system (where footnote contents are shown in a small pane of the browser display, using HTML frames) has now been connected to the ACRES bibliographic database. The bottom line is that the author of a contribution can now indicate citations using a global coding scheme, such as

    c-ecai-94-401
for the paper beginning on page 401 of the ECAI 1994 proceedings, and the rest happens automatically: the article code appears clickable in the on-line text; clicking it causes the bibliographic entry to appear in the footnote windowpane; and from there one can click again to see the author's homepage or the full text of the paper, to the extent that these are known to the database.
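The resolution of such a code can be sketched as follows; the code format is taken from the example above, but the parsing details, the two-digit-year convention, the sample database entry, and all names are illustrative assumptions.

```python
# A sketch of resolving a global citation code such as "c-ecai-94-401"
# against a bibliographic database. The code format follows the example
# above; the parsing details, the two-digit-year convention, and the
# sample database entry are illustrative assumptions.

def parse_citation_code(code):
    """Split 'c-ecai-94-401' into (venue, year, first page)."""
    kind, venue, year, page = code.split("-")
    if kind != "c":
        raise ValueError("not a citation code: %r" % code)
    return venue, 1900 + int(year), int(page)   # assumes 20th-century years

# Hypothetical stand-in for the ACRES bibliographic database.
ACRES = {
    ("ecai", 1994, 401): {
        "title": "A hypothetical ECAI-94 paper",
        "url": "http://example.org/ecai94-401.ps",
    },
}

def citation_link(code):
    """Return an HTML anchor that opens the entry in the footnote pane."""
    entry = ACRES[parse_citation_code(code)]
    return '<a href="%s" target="footnote">[%s]</a>' % (entry["url"], code)
```

With a frame named "footnote" in the page layout, such an anchor makes the bibliographic entry appear in the small pane without leaving the contribution being read.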

There is of course also a facility for defining footnotes that are local to a contribution, for example for articles that appeared outside the AI literature and which are therefore not listed in ACRES. The noframe version of the system presents citations at the end of the contribution, since on-line footnotes are not available there.

These facilities, too, have so far been realized only for the article review discussions; support for the other materials will follow.

Authors of contributions during 1997 are kindly asked to check that nothing has been disrupted in their contributions as a result of this editing process. At the same time, please also check the wording in the copyright statement, which has been slightly modified in order to specify this possibility of posterior editing. (The copyright of all contributions stays with the author.)

Cross-linking frame mechanism

The discussions during the past months contained several examples where an author referred to earlier discussion, not only for a direct answer or retort to what was said before, but also as a more general reference. This suggests the need for hot (clickable) links within and between the discussions. Such hot links may be particularly useful in HTML frame-based contexts, but some technical problems had to be solved first. In particular, it must be possible to change the contents of several windowpanes with a single click, but without redrawing the whole window. This in turn means that different kinds of discussions must have equal or compatible window layouts. The existing information structure has now been reorganized accordingly, and discussion crosslinks are operational.

Improved support for retrospection

The News Journal on Reasoning about Actions and Change appears at the end of each month, and contains the contributions that have arrived and have been exchanged by mail during that month. It serves two main purposes: providing an easy overview of the month's activity, and serving as a publication vehicle for research notes.

Our original idea was that the News Journal would contain essentially all the month's contributions, and that the first purpose would be served both by the HTML edition (readable on line, containing hot links) and the PostScript version (convenient to read off-line on paper). However, due to the considerable volume of contributions, it may be somewhat difficult to orient oneself in the HTML version of the News Journal.

In fact, the contributions to the discussions (both article discussions and panel debates) account for the major part of the volume. This information is in any case also entered into separate, diachronic structures where each discussion is represented as one trace from start to end. Therefore, in order to facilitate overview, there is now a minimal variant of the News Journal called the Digest, which only lists the headings that have been active during the past month: what articles have been received, what articles have been discussed, what panel discussions have been active, and so on. Some simple measures of activity are also given, such as the number of contributions. In addition, of course, the Digest contains hot links into the respective discussion tracks. The hope is that for on-line use, the Digest will be a convenient point of entry.
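Compiling such a Digest amounts to aggregating the month's contributions by heading; the following sketch, with invented sample data and an assumed data layout, shows the idea.

```python
# A minimal sketch of compiling the Digest: list each heading that has been
# active during the month together with a simple activity measure (the
# number of contributions). The data layout and sample entries are
# illustrative assumptions.

from collections import Counter

def make_digest(contributions):
    """contributions: list of (heading, author) pairs for one month."""
    counts = Counter(heading for heading, _author in contributions)
    lines = []
    for heading, n in counts.most_common():    # most active headings first
        plural = "contribution" if n == 1 else "contributions"
        lines.append("%s (%d %s)" % (heading, n, plural))
    return "\n".join(lines)

month = [
    ("Ontologies for actions and change", "Pat Hayes"),
    ("Ontologies for actions and change", "Judea Pearl"),
    ("Methodology of Research in Actions and Change", "Erik Sandewall"),
]
print(make_digest(month))
```

In the actual Digest, each such heading would additionally carry a hot link into the corresponding discussion track.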

For off-line reading, the PostScript version is presumably the most convenient to use, but it takes some time to prepare. The HTML version of the full News Journal is available instantly, and can of course also be printed out and read off-line.

Publication vehicle for Research Notes

Besides providing easy overview, the News Journals have a second purpose, namely to serve as a publication vehicle for (research) notes. To this end, the Electronic News Journal on Reasoning about Actions and Change (ENRAC) is being formally registered as a periodic publication with monthly appearance. The contributions during 1997 will appear retroactively as Volume 1; 1998 will be Volume 2. More information about this will follow when the formalities have been completed.

A note on methodology

The method that is being used here is to develop this new information service gradually, based on feedback from the application: we "feel our way" to a design that will fit the users' needs as well as possible. In fact, it may be understood as an example of action research, where new results are developed concurrently with their use in a selected application, although here it is of course done for the sake of the results, and the research is not an end in itself. This methodology has its pros and cons: not all facilities are ready at once, but what is ready should then be good.

Forthcoming book

From: Ray Reiter on 6.1.1998

I have decided to make available a draft of my book on the sitcalc to ETAI subscribers. It can be downloaded from our group's web site:

http://www.cs.toronto.edu/~cogrobo/

Here is a brief descriptor:

Being a draft of the first eight chapters of a book on the situation calculus and its relevance to artificial intelligence, cognitive robotics, databases, programming languages, simulation, and control theory. Permission is granted to download copies for personal, non-commercial use. Oversights, corrections and advice will be gratefully (and gracefully) received.

Best wishes for a happy new year.

--Ray

Top Lines of the Newsletter Issues

Dated: 20.1.1998

The basic ENRAC panel discussions are intended to be standing ones rather than temporary, so a lull of one or two months does not mean that the discussion is over. Their topics are not likely to be completely resolved very soon. In the present newsletter, Pat Hayes returns to the topic of the methodology panel.

Dated: 21.1.1998

Today's issue contains answers by Hector Geffner and by Erik Sandewall to Pat Hayes's contribution yesterday.

Dated: 22.1.1998

The renewed debate about methodology (previously also called theory evaluation) continues with a contribution by Murray Shanahan, and an answer by Pat Hayes both to Murray and to Hector Geffner's contribution yesterday.

Dated: 23.1.1998

Hector Geffner, Judea Pearl, and David Poole have reacted independently but in similar ways against one of Pat Hayes's statements yesterday; their answers follow. Then, Erik Sandewall answers Hector Geffner concerning nonmonotonicity with respect to observations.

Today's contributions are added to the continuing panel debate on "ontologies for actions and change", and can be found under that heading in the on-line panel-debate structure.

Dated: 25.1.1998

We have two discussion items today:

  1. Pat Hayes answers to the joint critique by Hector Geffner, David Poole, and Judea Pearl in the previous Newsletter issue
  2. Erik Sandewall resumes the discussion with Wolfgang Bibel about his IJCAI-97 invited paper.

Dated: 26.1.1998

The research methodology discussion that resumed a week ago has now shifted to ontological issues concerning actions: "what is the relation between actions and observations?" and "what is an action anyway?" Today's issue contains answers to Pat Hayes by Judea Pearl and Erik Sandewall.

Also, Michael Thielscher communicates a reference to his existing work that addresses the case of nonmonotonicity wrt observations.

Dear Reader, if you feel that some of your own earlier work has a bearing on the topic(s) of the present discussion, then why not send us a note for inclusion in the debate? One of the problems in contemporary research is that a lot of good work tends to be overlooked, maybe because so much is being published and because it is difficult for anyone to have a full overview of everything that is contributed. You can help both your colleagues and yourself by sending us a note that clarifies where you stand on the present issues: what is an action in your system; how do you relate actions to observations; do you agree or disagree with those who have participated in this discussion up until now?

Please feel free to advertise your own published works by showing how they are relevant for the topics discussed here. (If you are asked hard questions as a result of appearing on this stage, just remember that almost all advertising is good advertising.) References containing a URL where the paper can be picked up are of course particularly useful.

Dated: 27.1.1998

Today we have Luis Pereira's answers to the questions about the ETAI article by him and several co-authors that we received in December. Also, Wolfgang Bibel's answers to the additional set of questions regarding his invited article at IJCAI 1997.

Dated: 28.1.1998

Today, Michael Thielscher fills in additional details about what cases are correctly handled in his approach. The discussion between Wolfgang Bibel and Erik Sandewall proceeds on the topic of citation principles. Although this is a meta-issue which is important within all research areas, it has a particular relevance for the present Newsletter because our on-line discussions offer novel ways of dealing with citation updates. See the discussion contribution below.

Dated: 29.1.1998

Today, the ontologies discussion continues, with questions by Michael Gelfond and Luis Pereira.

Dated: 30.1.1998

Should actions be first-class citizens in logics of actions and change, so that it is possible to have action variables, to quantify over them, to minimize models with respect to the set of actions in them, etc? We have already touched on this topic before, namely in the discussion about the Kakas/Miller article (question number 2, initiated by Tom Costello and with a number of follow-up comments). Now the same question comes up again, but in another perspective, in a reply by Hector Geffner to a question by Michael Gelfond.


ETAI Publications

Discussion about received articles

Michael Thielscher
A Theory of Dynamic Diagnosis

José Júlio Alferes, João Alexandre Leite, Luís Moniz Pereira, Halina Przymusinska, and Teodor Przymusinski
Dynamic Logic Programming


Articles Published Elsewhere

Discussion with Wolfgang Bibel about his IJCAI lecture

From: Erik Sandewall on 25.1.1998

Dear Wolfgang,

With respect to your IJCAI article, I have a number of questions several of which relate to the research methodology or paradigm being used. I observe that when Marc Friedman ended one of his questions with ``why prefer one solution to the other'', your answer was

  To the best of my knowledge Michael Thielscher (...) was the first who gave a solution to the ramification problem which overcomes deficiencies in any previous solution (...) no previous solution would model reality in a correct way. A better solution in this sense must be preferred to a deficient one. (...)

In this way, you appeal to the traditional method of validating approaches to common-sense reasoning by way of counterexamples: a method is accepted until disproved by an example. Unfortunately, this research methodology cannot provide any reliable conclusions. As has long been recognized in core computer science, the lack of a known counterexample does not prove that a proposed solution (or a program) is correct.

Against this background, I have the following questions and observations:

1. What is your perspective on current research in reasoning about actions and change where we provide assessments or validation results for new theories, instead of merely proposing new logics based on a combination of intuition and toy examples? The new approach is being used by several of us, including Shoham (who started this trend), Lifschitz, and myself. Although your article is a survey paper, it does not mention this development at all.

2. In the section on qualification, you describe a solution using transition logic: "TL opens a new way to deal with this problem". The solution is exemplified using McCarthy's old "potato in the tailpipe" problem, but not motivated in any other way. In particular, there is no statement of when the method is correct or is not correct, nor any proof or reference to a proof of its correctness under some precisely stated assumptions.

Unfortunately, however, it is well known that simple solutions to the qualification problem easily fail if the naive potato-and-tailpipe example is modified ever so slightly. For example, suppose there are two cars, A and B, and it is known that a potato is put into the tailpipe of one of them, and one asks whether car B will start properly. In such a case, the absence of positive knowledge that the tailpipe of car B has been plugged does not allow one to draw the default conclusion. Or, suppose one does not know whether the plugging of the pipe preceded or succeeded the attempt to start the car; the same difficulty arises again.
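The failure of the naive default in the two-car variant can be made concrete by enumerating the models that the disjunctive premise admits; this sketch illustrates the argument only, not TL or any other formalism under discussion.

```python
# An illustration of the two-car variant: given only the disjunctive
# knowledge "the potato is in the tailpipe of A or of B", the default
# conclusion "car B starts" is not warranted, because it fails in one
# admissible model. This models the argument only, not any particular
# formalism discussed here.

def admissible_models():
    """Models of: exactly one of the two tailpipes is plugged."""
    return [{"A": True, "B": False}, {"A": False, "B": True}]

def entails(query):
    """Skeptical entailment: the query must hold in every admissible model."""
    return all(query(m) for m in admissible_models())

b_starts = lambda m: not m["B"]   # B starts iff its tailpipe is not plugged

print(entails(b_starts))                       # prints False
print(entails(lambda m: m["A"] or m["B"]))     # prints True: some pipe is plugged
```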

Given that previously proposed solutions have failed so easily, one would like to have some evidence that the method proposed in your article is not going to encounter those problems, or other ones which may come up. In other words, here is a case where the shortcomings of the older, example-driven methodology are evident.

My question is what can be said in general about the method you propose for qualification: when is it known to work, and when not?

3. In section 4, you write "LCM has been the first method which actually solved these aspects of the frame problem and did so in the optimally possible way". Since there is no obvious definition of optimality in this respect, I wonder which quality measure you use, and what the proof is that the LCM method is optimal with respect to it.

4. Another example of an incompletely substantiated claim occurs in your section on ramification. You write "Lifschitz' categorization of fluents does not work in this example. We need to categorize the actions into primary and secondary ones (rather than the fluents) as done in the solution presented in this section". However, the second sentence does not follow from the first one. The fact that the particular variety of fluent categorization that was proposed by Lifschitz does not work for the example does not prove that all fluent categorization methods fail for it, or for some reasonable class of examples. So in what sense do we "need to" categorize the actions?

5. Also with respect to your formalization of Thielscher's example with three switches and a relay, it is remarkable that the electric circuit in question can easily be understood in terms of dependencies and persistence, but the proposed formalization requires the axioms to represent the propagation of changes: "if this fluent changes in such-and-such a way, then that fluent also changes in such a way". This seems clumsy and counterintuitive. Do you claim that it is the best possible representation in the present state of the art?

6. As you correctly observe, a rational agent must be capable of reasoning about the timing of actions and about changes within the duration of an action. In section 6.4 of the paper you describe in outline how to introduce timing of actions into TL. However, the most obvious way of arranging this is by approaches that use explicit, metric time, as in Shoham's work in the 1980's, the Features and fluents approach, and the modern event calculus. The PMON(RCs) logic presented by Doherty and Gustavsson at KR-96 is an example of such a logic: it does in concrete detail what you only describe in gross outline. My questions are:

  1. What are the advantages of your approach over the one I just referred to? (This is a valid question in view of your statement "A better solution in this sense must be preferred to a deficient one").

  2. In PMON(RCs) and other approaches that use explicit metric time, it is straightforward to make statements about durations, to compare durations of actions, and so on, basically because each interpretation represents an entire history of the world. How can this be done in a logic like TL, where the => operator takes one from one state to the next, updating the current time in the process?

7. You write "section 6 shows how the various aspects involved in reasoning about actions and causality can be taken into account within TL". However, nothing in section 6 or elsewhere in the article presents any concrete results about how spontaneous change in the world can be represented - it is as though the world were entirely static when no actions are taken. The resulting concept of causality is quite meagre.

Here again, there exists in fact a body of results in this area, ranging from the Situated Action Theory of Morgenstern and Stein and my own early work on integrating differential equations into logic (presented at KR-89 and IJCAI-89) to Murray Shanahan's earlier and recent work which uses a very similar approach, and which has also been used for implementing a simulated robot. When you write "another issue concerns the integration of differential equations and their computation within a logic such as TL", the uninformed reader will not easily guess that the integration occurred nine years ago, although not for TL.

8. In section 4, on the topic of how to obtain a specific plan as a solution to a planning problem, you write "[Bibel 1986a] introduced state literals, S(x), which keep track of the states passed through while executing a plan. (...) By unification the variable denoting the goal situation will then along with a successful proof always provide a term that expresses the linearized sequence of actions." Yes, but how is this different from the use of the Answer predicate which was proposed and used by Cordell Green in the late 1960's?

9. In subsection 6.3 you refer to an "easy solution" for allowing action laws for special cases to take precedence over a more general law for the same action. However, in the example you quote, the specialized case only contains an additional effect besides the effect in the general case. Does the same solution also apply if some of the effects of the general case do not arise in the special case?

10. If the answer to the previous question is positive, does the planning method that you propose using unification against the state literal (answer literal) still work when such specificity is allowed? The question arises since unification against answer literals results in planning backwards from goals, whereas the solution for specificity is defined in terms of forward simulation of the plan.

11. A final observation: the title and the introduction of your article focus on planning, as shown when you write: "In this paper I review the state of the art in deductive planning...", but the major part of the contents (after introducing the TL logic as such) deals with problems in reasoning about actions and change: the frame problem, ramification, and so on. May I suggest that some additional coverage of modern results in the latter area would be appropriate, in particular since this is claimed to be a survey article.

Sincerely

Erik Sandewall

From: Wolfgang Bibel on 27.1.1998

Dear Erik,

Thank you for your continuing interest in the paper underlying my invited IJCAI-97 talk. Since the main focus of your questions concerns ``the research methodology or paradigm being used'' I need first to clarify the nature of this paper.

As you correctly state, the paper is meant to be ``a survey paper''; however, you failed to add (as clearly stated in the paper in the same sentence) that it is meant to be one ``with an emphasis on the contributions from research groups influenced by the author's own work'' (p. 2). For an invited talk this is an absolutely appropriate focus. For instance, your own invited talk at the ECAI conference in Budapest was of exactly the same nature. In fact, you may recall that I then complained, for fun, about YOUR ignorance of all of OUR work (although NOT in public, as you have now decided to do). Reiter's Chambery IJCAI address did the same again, representing the Toronto school. And so forth.

In consequence of this purposely chosen and clearly stated nature of the paper, there are of course substantial omissions wrt planning, action and causality approaches in general. In fact, I would not even consider myself capable of giving a fair survey of the current state of the art in these areas (perhaps not even in deduction any more, where I still feel more ``at home''). And of course, in such a paper there is no need to justify each of the contributions from scratch (even for invited papers there are at least unwritten space limits). If there were such a need, one should not have invited me to give this talk in the first place. Nine out of eleven of your questions are essentially answered by these statements, as I will briefly show shortly, and have been disappointing in this regard.

Apart from its (in more than one respect incomplete) survey character, there is one novel contribution in this paper, which is the extension of reasoning about change and causality within the LCM (the action part of TL) framework so as to include (or be embedded into) classical reasoning as well. In other words, the paper marries two complementary areas which belong together. Given this fact, the focus of any questions wrt this particular paper should actually be directed to this particular contribution, rather than to contributions which were made many years ago (as far back as 1985), and to a large extent by students and close colleagues of mine rather than by me. Now to your questions in (boring but time-consuming) detail. As to the paper's section numbering, I refer to the version on the web (not to the one in the proceedings).

ad 1.

  provide assessments or validation results for new theories

Since there is no new theory, I do not think this particular paper is lacking such an addition, nor is this the case for the old theories surveyed in the paper. Classical deduction is so well established that I am sure you agree wrt this part. As to the action part, I know from you personally that you (as well as others) appreciate Thielscher's work especially for his contributions in this respect (the relation of the fluent calculus to Lifschitz'  A , etc.). In my Section 6.3 I survey the equivalence results between the fluent calculus and the action part of TL. On the basis of these theorems, his results apply to TL in the same way.

ad 2.

  My question is what can be said in general about the method you propose for qualification: when is it known to work, and when not?

In line with the paper's survey character, I duly state in Section 7.2, to which you refer here, the source of this approach: ``The solution is again adapted to TL from [Thi96]'s FC using the example discussed there in great detail.'' In that paper, Thielscher indeed gives theorems and proofs addressing your question; he is the right person to answer specific questions in this regard. But I can add a general comment on issues like

  For example, suppose there are two cars, A and B, it is known that a potato is put into the tailpipe of one of them, and one asks whether car B will start properly. In such a case, the absence of positive knowledge that the tailpipe of car B has been plugged, does not allow one to draw the default conclusion.

The advantage of staying within the framework of logic is that you have long experience on your side (and this is my research methodology, which I have followed for more than three decades). TL can of course handle your example (and also the other one you mention) easily, since it features a (resource-sensitive) disjunction.

ad 3.

  Since there is no obvious definition of optimality in this respect, I wonder which quality measure you use, and what the proof is that the LCM method is optimal with respect to it.

Assuming the method is semantically correct (discussed under 1), there is of course an ``obvious definition of optimality'', namely the one familiar from automated deduction: any explicit frame axiom increases the search space for the deductive mechanism (and even the proof lengths). Since LCM, TL, FC, and LL have no frame axioms at all, they are optimal in this respect.

ad 4.

  So in what sense do we "need to" categorize the actions?

Your question refers to the paper's Section 7.1, in which I again clearly stated: ``The discussion in this section closely follows [Thi97b] ...''. So, given the equivalence results mentioned above, your question would again be better addressed to Thielscher directly, who in his paper and his habilitation thesis (also cited in this part) provides excellent answers to it.

ad 5.

  Also with respect to your formalization of Thielscher's example with three switches and a relay, it is remarkable that the electric circuit in question can easily be understood in terms of dependencies and persistence, but the proposed formalization requires the axioms to represent the propagation of changes: "if this fluent changes in such-and-such a way, then that fluent also changes in such a way". This seems clumsy and counterintuitive. Do you claim that it is the best possible representation in the present state of the art?

If you know a less ``clumsy and counterintuitive'' formalization of the example, then use it. TL, like any logic, is a neutral formalism which is not concerned with the way particular scenarios are represented in it (as Bob Kowalski convincingly argued at the memorable Crete workshop in 1985 where, in your presence and bombarded by your criticisms, I first presented LCM - as you can still read in the proceedings transcript of the discussions).

ad 6.

  My questions are:

  • What are the advantages of your approach over the one I just referred to? (This is a valid question in view of your statement "A better solution in this sense must be preferred to a deficient one").

Again, this four-paragraph Section 7.4 summarizes the work reported in [BT97] and [Gro96], so the answers to your question are better looked up in the original sources rather than in my paper, and addressed to the authors rather than to me.

 
  • In PMON(RCs) and other approaches that use explicit metric time, it is straightforward to make statements about durations, to compare durations of actions, and so on, basically because each interpretation represents an entire history of the world. How can this be done in a logic like TL, where the => operator takes one from one state to the next, updating the current time in the process?

It is obvious how to state metric time durations in TL (and it is in fact illustrated in the lifting example of that section) by having  T(t)  before => and  T(t+d) , with  d  being the duration, after it (the example specializes to  d = 1 ). I am not sure at this point whether Bornscheuer and the other people at Dresden working on time have already considered this in more detail. My survey is definitely not a complete one.

ad 7.

  You write "section 6 shows how the various aspects involved in reasoning about actions and causality can be taken into account within TL". However, nothing in section 6 or elsewhere in the article presents any concrete results about how spontaneous change in the world can be represented - it is as though the world were entirely static when no actions are taken. The resulting concept of causality is quite meagre.

Again, my survey is definitely not (meant to be) a complete one, even within the LCM family of approaches, let alone all the rest. Spontaneous change has again been formalized by Thielscher in the references given in Section 7. I will include a pointer to that issue to avoid the (false) impression of ``meagreness''. Your other point concerning differential equations does not tell me anything new, and my remark in the paper remains correct anyway.

ad 8.

  Yes, but how is this different from the use of the Answer predicate which was proposed and used by Cordell Green in the late 1960's?

The technique is of course the same and well established in logic. So what is the point?

ad 9.

  Does the same solution also apply if some of the effects of the general case do not arise in the special case?

No, it does not. It only applies in ``occasions of a similar nature'' as stated there. Again Thielscher is the expert on these issues.

ad 10.

  If the answer to the previous question is positive ...

Since it is negative I skip this point.

ad 11.

  I believe that some additional coverage of modern results in the latter area would be appropriate

I disagree as explained at the outset.

Best regards, Wolfgang

From: Erik Sandewall on 28.1.1998

Dear Wolfgang,

A question can serve as a means of obtaining information, but it can also be the syntactic form that's used for questioning a proposition. For those of my questions that were of the former kind, you have provided valuable answers - thank you. However, for those that are of the latter kind, I want to pursue our discussion a bit further on two specific issues: methodology of research in our area, and citation principles. Both of them have a general interest, I believe, and for simplicity let's address them one at a time.

I'll start with the citation principles. The following are two cases containing a quotation from your article, my question or comment, and your answer:

1. Concerning the answer literals or state predicate:

  For that purpose, [Bibel, 1986a] introduced state literals, S(x), which keep trace of the states passed through while executing a plan. ... By unification the variable denoting the goal situation will then ... always provide a term which expresses the linearized sequence of actions.

  Yes, but how is this different from the use of the Answer predicate which was proposed by Cordell Green in the late 1960's?

  The technique is of course the same and well established in logic. So what is the point?

2. Concerning the combination of differential equations and logic:

  Modelling continuous processes within a logic has become an active area of research though. ... [Herrmann and Thielscher, 1996]. Another issue concerns the integration of differential equations and their computation within a logic such as TL.

  When you write "another issue concerns the integration of differential equations and their computation within a logic such as TL", the uninformed reader will not easily guess that the integration occurred nine years ago, although not for TL.

  Your other point concerning differential equations does not tell me anything new and my remark in the paper remains anyway correct.

I must say that I find your reactions exceptional. It is true that it is always difficult to produce correct and complete references to earlier work; it is difficult, for example, to give adequate coverage of, and comparisons with, approaches different from one's own. At the beginning of your answer, you gave an eloquent description of the problem that we all encounter: you are frustrated by a lack of reference to your own work (even in a case where it was not relevant, actually), and you don't have the space to give an adequate presentation of the work of others.

There is no easy solution to this problem, but would it be possible to agree on a few elementary principles, such as:

These points are fairly obvious, I would think, although I'm afraid they're not always practiced. There may be a number of other principles that ought to be added as well; the present ones resulted from reflecting on the two concrete cases that I quoted above.

Besides a clarification of policy, I think we need additional mechanisms for feedback and correction of citations. The traditional mechanism, confidential peer review, is clearly not sufficient. One obvious way of proceeding is to use ETAI's review mechanism of public discussion, which allows anyone to contribute questions or suggestions to the author in a transparent but undramatic way. This may be used for pointing out additional references that are relevant for the article.

Using a public forum for this feedback to the author has the additional advantage that it can help to resolve disagreements. If an author is unwilling to accept a suggestion for additional citations, it becomes possible for others to listen to the arguments and to form their own opinion. Then at least, if there is a citation fault it's less likely to propagate to later articles written by others.

I am sure that you have also thought about these issues, in particular as the Research Notes Editor of the Artificial Intelligence Journal, and I look forward to your comments on this important topic.

- Erik


Topic-Oriented Discussions

Methodology of Research in Actions and Change

From: Pat Hayes on 20.1.1998

Ray Reiter wrote (21.10, position statement for panel):

  1. Erik's notion of an ontology seems odd to me, mainly because it requires "that the "frame assumption" or assumption of persistence must be built into the ontology".

Yes, I agree. Why choose just that assumption to build in, in any case? It clearly isn't always true (for example, if we are considering temperatures, cooking, drying paint or leaky containers, or indeed any kind of process which all by itself will produce a significant change as time goes by; or when we know that our information is imperfect; or when we have reason to suppose that there may be other agents trying to frustrate us, or even just working in the same area with their own goals which might interfere with ours; or if we know that we are dealing with an oscillating or unstable system, or one that requires constant effort to maintain its stability). There are many other equally plausible assumptions. For example, the assumption that things have been pretty much as they are now in the recent past (back-persistence), or that nothing will make any significant difference to anything a long way away (distance-security) or on the other side of a significant barrier (the privacy assumption). All of these, and others, are equally correct and about as useful as the assumption of persistence in limiting the spread of causal contagion.

But I think I may have understood what Erik means. (Erik, can you confirm or deny?) Let me reiterate some old observations. If the world were really a very turbulent place where any action or event might have any kind of consequences, producing all kinds of changes, then there would be no 'frame problem'. While it would of course then be difficult to describe the effects and non-effects of actions, this wouldn't be surprising, and we wouldn't call it a "problem". The FP only seems to be a problem to us because we feel that we should be able to somehow take better advantage of the world's causal inertia. So, perhaps this is what Erik means by saying that the 'frame assumption' must be built in: he wants a model-theoretic characterisation of this causal inertia, a way to exclude models which, when thought of as physical worlds, would be worlds where things happen for no reason. He wants us to somehow delineate what the 'physically normal' interpretations are.

If this is more or less right, then there seems to me to be a central question. How can we specify the relationship of the logical 'possible world' (which is just a mathematical structure of sets of subsets of ordered pairs, etc.) to the physically possible worlds about which we have intuitions to guide us? This difficulty is illustrated by the recent discussions here. For example, my bitchin' over the distinction between gof-sitcalc and R-sitcalc comes from such a difference. Both of these use a similar notation, considered as first-order axioms: they both have things which have the role of state-to-state functions but which are reified into first-order objects, and a special function which takes a state and such an object and produces a new state. In gof-sitcalc, these are supposed to represent actions taken by agents. In R-sitcalc, they correspond rather to beginnings and endings of processes which are extended through time. The difference lies not in the axiomatic signature or even in the model theory itself, but rather in the intended mapping between the (set-theoretic) model theory and the actual physical world being modelled. We have intuitions about the physical worlds, but we don't have any physical intuitions about formal models.

Here's another illustration. I've never been very impressed by the famous Yale shooting problem, simply because it doesn't seem to me to be a problem. All the usual axioms say is that something is done, something else is done and then a third thing is done, and the outcome is unusual; and the 'problem' is supposed to be that the logic allows that it might have been the second thing that was unusual instead of the third one. When the vocabulary suggests that the third thing is a shooting and the second a mere waiting this is deemed to be a mistake, and much research has been devoted to finding principles to exclude such models. But if the vocabulary is interpreted differently (for example, if the second event had been a shooting and the third one a waiting) this wouldn't be unintuitive. But the usual situation-calculus formalisations of this problem provide no way to distinguish these cases! They say virtually nothing about the actual physical actions involved; certainly not enough about these actions to enable a sensible choice to be made between one formal interpretation and the other. To say that a shooting is 'abnormal' in that it kills someone, in the same sense that, say, a grasping which fails to grasp or which accidentally knocks over something unexpectedly is 'abnormal', is just a misuse of a formal device. There's nothing abnormal about a shooting resulting in bodily harm, and to invoke circumscription to overcome a 'normal' case of actions being harmless seems obviously too crude a mechanism to be generally useful. (For a general strategy for defusing YSP-type examples, consider a gunfighter who always blows the smoke from his barrel after a successful fight, and ask what persistence or inertial principles are going to ensure that it's his bullets that kill people, and not that final flourish.)

Pat Hayes

From: Hector Geffner on 21.1.1998

Pat says:

  ...Here's another illustration. I've never been very impressed by the famous Yale shooting problem, simply because it doesn't seem to me to be a problem ....

I'm not sure I understand Pat's point well, but I think I understand the YSP. Here is the way I see it.

In system/control theory there is a principle normally called the "causality principle" that basically says that "actions cannot affect the past". If a model of a dynamic system does not comply with this principle, it's considered "faulty".

In any AI the same principle makes perfect sense when actions are exogenous; such actions, I think, we can agree, should never affect your beliefs about the past (indeed, as long as you cannot predict exogenous actions from your past beliefs, you shouldn't change your past beliefs when such actions occur).

What Hanks and McDermott show is that certain models of action in AI (like simple minimization of abnormality) violate the causality principle. In particular they show that

your beliefs at time 2, say, after LOAD AND WAIT (where you believe the gun is loaded)

are different from your beliefs at time 2, after LOAD, WAIT and SHOOT.

Namely, SHOOT at t=3 had an effect on your past beliefs (LOADED at t=2).

Most recent models of action comply with the causality principle. In some it comes for free (e.g., the language  A , due to the semantic structures used: transition functions); in others (Reiter, Sandewall, etc.), I'm sure it can be proved.
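
The anomaly can be checked mechanically. The sketch below is a hypothetical propositional mini-encoding of the scenario (not taken from any of the papers under discussion): it enumerates every interpretation of a tiny Yale-shooting theory and then circumscribes the Ab atoms. Exactly two subset-minimal Ab-sets survive: the intended one, where the shooting abnormally terminates Alive, and the anomalous one, where the gun abnormally unloads during the Wait, so that Shoot at t=2 changes what is believed about Loaded at the earlier time.

```python
from itertools import product

FLUENTS = ("loaded", "alive")

def yss_models():
    """Enumerate all interpretations of a tiny propositional Yale-shooting
    theory.  States at times 0..3; Load, Wait, Shoot occur at 0, 1, 2.
    Returns (set of true Ab atoms, Alive at time 3) for each model."""
    found = []
    for bits in product((False, True), repeat=14):
        loaded, alive = bits[0:4], bits[4:8]
        ab = dict(zip([(f, t) for f in FLUENTS for t in range(3)], bits[8:14]))
        val = {"loaded": loaded, "alive": alive}
        if not alive[0] or loaded[0]:
            continue                      # initially: alive, gun unloaded
        if not (loaded[1] and ab[("loaded", 0)]):
            continue                      # direct effect of Load at time 0
        if loaded[2] and not (ab[("alive", 2)] and not alive[3]):
            continue                      # direct effect of Shoot at time 2
        if any(not ab[(f, t)] and val[f][t] != val[f][t + 1]
               for f in FLUENTS for t in range(3)):
            continue                      # inertia: fluents persist unless abnormal
        found.append((frozenset(a for a, v in ab.items() if v), alive[3]))
    return found

def circumscribe(models):
    """Keep only models whose Ab-set is minimal under set inclusion."""
    ab_sets = {s for s, _ in models}
    minimal = {s for s in ab_sets if not any(o < s for o in ab_sets)}
    return [(s, a) for s, a in models if s in minimal]
```

Minimization alone cannot choose between the two surviving models, which is exactly the failure Hanks and McDermott exhibited for simple minimization of abnormality.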

Regards.

- Hector Geffner

From: Erik Sandewall on 21.1.1998

Pat, citing Ray Reiter's earlier contribution, you wrote:

  1. Erik's notion of an ontology seems odd to me, mainly because it requires "that the "frame assumption" or assumption of persistence must be built into the ontology".
  Yes, I agree. Why choose just that assumption to build in, in any case? ...

Well, Ray and you are bringing up two different issues here. Ray's objection was with respect to classification: he argued that the frame assumption (when one uses it) ought to be considered as epistemological rather than ontological. (In the position statement that he referred to, I had proposed a definition of ontology and suggested that the situation calculus does not represent one, since the frame assumption is represented by separate axioms rather than being built into the underlying ontology). On the other hand, the question that you bring up is what kind or kinds of persistence we ought to prefer: temporal forward in time, temporal backwards, geometrical, etc.

Let me address your letter first. I certainly agree with the analysis in the second paragraph of your message: the world is not entirely chaotic, some of its regularities can be characterized in terms of persistence (= restrictions on change, or on discontinuities in the case of piecewise continuous change) and all those exceptions to persistence that are now well-known: ramifications, interactions due to concurrency, causality with delays, surprises, and so on.

For quite some time now, research in our field has used a direct method in trying to find a logic that is capable of dealing correctly with all these phenomena, that is, by considering a number of "typical" examples of common-sense reasoning and looking for a logic that does those examples right. My concern is that this is a misguided approach, for two reasons:

What I proposed, therefore (in particular in the book "Features and Fluents"), was to subdivide this complex problem into the following loop consisting of manageable parts (the "systematic methodology"):

  1. Define an ontology, that is, a "model" of what the world is like. States and assignment-like state transitions constitute a very simple such ontology. Ramifications, concurrency, and so on are phenomena that call for more complex ontologies. Make up your mind about which of them you want to allow, and leave the others aside for the time being.

  2. Define an appropriate logical language for describing phenomena in the ontology, including actions. Each combination of ontology and language defines a mapping from a set of formulae to the set of intended models for those formulae.

  3. Define entailment methods, that is, mappings from the set of classical models to a modified set called the selected models. Usually, the selected models are a subset of the classical models.

  4. Identify the range of applicability for each entailment method, that is, the conditions which guarantee that the selected models are exactly the intended ones.

  5. Define "implementations" of entailment methods by expressing them e.g. in circumscription, or using modified tableau techniques. If the implementation is done right, then the range of applicability for the entailment method is also the range of applicability of its implementations.

  6. When one has obtained sufficient understanding of points 2-5 for a given ontology, then define a richer one (allowing for additional phenomena of interest), and go back to item 2.
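
As a toy illustration of steps 2-4 (a hypothetical mini-example, not one taken from Features and Fluents): let the classical models of a one-fluent scenario be all trajectories over times 0..3 that satisfy the scenario statements, and let the entailment method select the trajectories with the fewest changes. Step 4 would then ask for which class of scenarios the selected models coincide with the intended ones.

```python
from itertools import product

def classical_models():
    """All trajectories of one boolean fluent over times 0..3 satisfying
    the scenario statements: the fluent holds at time 0 and not at time 3."""
    return [traj for traj in product((0, 1), repeat=4)
            if traj[0] == 1 and traj[3] == 0]

def select_minimal_change(models):
    """A simple entailment method: map the classical models to the
    selected subset, here the trajectories with the fewest changes."""
    def changes(traj):
        return sum(a != b for a, b in zip(traj, traj[1:]))
    k = min(map(changes, models))
    return [m for m in models if changes(m) == k]
```

Here the four classical models shrink to the three that change exactly once; whether that selection matches the intended models is precisely the assessment question of step 4.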

This agenda certainly aims to address all the intricacies that you mention in the first paragraph of your message, but only in due time. We cannot do everything at once; if we try doing that then we'll just run around in circles.

In the Features and Fluents approach we have iterated this loop a few times, starting with strict inertia and then adding concurrency and ramification, doing assessments in each case. What about the other major current approaches? Early action languages, in particular  A , fit nicely into this paradigm, except that whereas above we use one single language and two semantics (classical models and intended models),  A  uses two different languages, each with its own semantics. However, later action languages, such as  AR , do not qualify since they define the models of the action language (intended models, in the above terms) using a minimization rule. To me, minimization techniques belong in the entailment methods which are to be assessed according to the paradigm, but the gold standard that we assess them against should not use such an obscure concept as minimization.

On similar grounds, I argued that a situation-calculus approach where a frame assumption is realized by a recipe for adding more axioms to a given axiomatization does not really define an ontology. It can be measured against an ontology, of course, but it does not constitute one.

Ray's argument against that was that the frame assumption is inherently epistemological, or maybe metaphysical. Since most people would probably interpret "metaphysical" as "unreal" rather than in the technical sense used by philosophers, we couldn't really use that term. With respect to the term epistemological, I just notice that some entailment methods have been observed to have problems e.g. with postdiction: prediction works fine but postdiction doesn't. This means that when we specify the range of applicability of an entailment method, we cannot restrict ourselves to ontological restrictions, such as "does this method work if the world behaves nondeterministically?"; we must also take into account those restrictions that refer to the properties of what is known and what is asked, and to their relationship. The restriction to only work for prediction is then for me an epistemological restriction.

Against this background, Ray then questioned whether the frame assumption itself is ontological or epistemological in nature. I'd say that in a systematic methodology (as in items 1-6 above), the ontology that is defined in step 1 and revised in step 6 must specify the persistence properties of the world, otherwise there isn't much one can say with respect to assessments. This technical argument I think is more useful than the purely philosophical question of what it "really is".

You then address the following question:

  How can we specify the relationship of the logical 'possible world' (which is just a mathematical structure of sets of subsets of ordered pairs, etc.) to the physically possible worlds about which we have intuitions to guide us? This difficulty is illustrated by the recent discussions here. For example, my bitchin' over the distinction between gof-sitcalc and R-sitcalc comes from such a difference. Both of these use a similar notation, considered as first-order axioms: they both have things which have the role of state-to-state functions but which are reified into first-order objects, and a special function which takes a state and such an object and produces a new state. In gof-sitcalc, these are supposed to represent actions taken by agents. In R-sitcalc, they correspond rather to beginnings and endings of processes which are extended through time. The difference lies not in the axiomatic signature or even in the model theory itself, but rather in the intended mapping between the (set-theoretic) model theory and the actual physical world being modelled...

Yes, exactly! There are two different ontologies at work here; my argument would be that each of them should be articulated in terms that are not only precise but also concise, and which facilitate comparison with other approaches both within and outside KR.

But your question at the beginning of this quotation is a fundamental one: how do we choose the additional ontological structures as we iterate over the systematic methodology loop, and how do we motivate our choices?

In some cases the choice is fairly obvious, at least if you have decided to base the ontology on a combination of structured states and linear metric time (integers or reals). Concurrency, chains of transitions, immediate (delay-free) dependencies, and surprise changes can then be formalized in a straightforward manner. Also, we can and should borrow structures from neighboring fields, such as automata theory, theory of real-time systems, and Markov chain theory.

However, there are also cases where the choice is less than obvious. What about the representation of actions by an invocation event and a termination event, which is what R-sitcalc is about? What about the recent proposal by Karlsson and Gustafsson [f-cis.linep.se-97-014] to use a concept of "influences" (vaguely similar to what is used in qualitative reasoning), so that if you try to light a fire and I drip water on the firewood at the same time, then your action has a light-fire-influence and my action has an extinguish-fire-influence, where the latter dominates? (If there is only a light-fire-influence for a sufficient period of time, then a fire results). These are nontrivial choices of ontology; how can we motivate them, assess them, and put them to use?

To my mind, this ties in with what Bob Kowalski said in the panel discussion at the recent workshop on Formalization of Common Sense: these are pre-logical issues. It is not meaningful to begin writing formulae in logic at once and to ask what variant of circumscription is going to be needed. Instead, one ought to work out an application area of non-trivial size with the proposed ontology, probably also using a tentative syntax that matches the ontology, but without committing to anything else. Only then, as one knows what ontology is needed, is it meaningful to look for entailment methods and their implementations which may be appropriate for the ontology one needs.

The bottom line is: let's use the ontology, or the underlying semantics, as an intermediate step on the way from application to implemented system. Going from application to ontology requires one kind of activity; going from ontology to implementation requires another kind. Such a decomposition has all the obvious advantages: it allows one to address simpler problems before proceeding to more difficult ones, it provides a way of characterizing and comparing results, and it facilitates reuse of earlier results.

References:

f-cis.linep.se-97-014
Lars Karlsson and Joakim Gustafsson.
Reasoning about actions in a multi-agent environment.
Linköping University Electronic Press, 1997: [original publication]

From: Murray Shanahan on 22.1.1998

Pat Hayes wrote,

  I've never been very impressed by the famous Yale shooting problem, simply because it doesn't seem to me to be a problem.

First I think we should distinguish between the Yale shooting scenario and the Hanks-McDermott problem. The Yale shooting scenario is the one in which someone loads, waits and shoots, and in which those actions have certain prescribed effects. The Hanks-McDermott problem is a difficulty that arises when we take certain approaches to solving the frame problem, and is exemplified in the Yale shooting scenario. The frame problem is the problem of describing the effects of actions in logic without recourse to an excess of axioms describing their non-effects.

If you want to solve the frame problem, your solution had better be able to deal with the Yale shooting scenario. A number of early attempts at the frame problem couldn't, which is why the scenario is of interest.

Isn't that all pretty straightforward?

Murray

From: Pat Hayes on 22.1.1998

Hector Geffner:

  ... I think I understand the YSP. Here is the way I see it.

  In system/control theory there is a principle normally called the "causality principle" that basically says that "actions cannot affect the past". If a model of a dynamic system does not comply with this principle, it's considered "faulty".

  In any AI the same principle makes perfect sense when actions are exogenous; such actions, I think, we can agree, should never affect your beliefs about the past (indeed, as long as you cannot predict exogenous actions from your past beliefs, you shouldn't change your past beliefs when such actions occur).

But actions - or external events - do change one's beliefs about the past. They do not change the past itself, of course: that is the causality principle. But consider for example coming into a room in an empty house and finding a hot cup of coffee resting on a table. One immediately infers that someone else has been present there recently. We constantly make inferences about the past on the basis of present knowledge, even in such technical areas as military planning.

I think there is a confusion here between (1) drawing conclusions about an external world and (2) simulating an external world by inferring a state from its previous state. The causality principle applies to the latter, but not the former; and even then, it is reasonable only when accompanied by a presumption of a certain kind of completeness in one's knowledge of the state. We often make predictions of the future by a kind of mental 'simulation', inferring what is going to happen next from what is true now (as in the conventional situation calculus axiomatic approach); but in practice, such simulations are often unreliable precisely because we don't have sufficiently complete knowledge; and when this is so, we cannot cleave to the strict causality principle, but are obliged to use techniques such as nonmonotonic reasoning which allow us to recover gracefully from observed facts which contradict our predictions, which would otherwise enmesh us in contradictory beliefs. Nonmonotonicity is in fact a good example of the need to revise one's beliefs about the past in the light of unexpected outcomes in the present, which gets us back to the YSP:

  What Hanks and McDermott show is that certain models of action in AI (like simple minimization of abnormality) violate the causality principle. In particular they show that

  your beliefs at time 2, say, after LOAD AND WAIT (where you believe the gun is loaded)

But why should you believe the gun is loaded at this time? Why is this considered so obvious? Remember, all the axioms say about WAIT is ...well, nothing at all. That's the point of the example: if you say nothing about an action, the logic is supposed to assume that nothing much happened. But if what we are talking about is a description of the world, saying nothing doesn't assert blankness: it just fails to give any information. If one has no substantial information about this action, the right conclusion should be that anything could happen. Maybe WAIT is one of those actions that routinely unloads guns, for all I know about it from an axiomatic description that fails to say anything about it. So the 'problem' interpretation about which all the fuss is made seems to me to be a perfectly reasonable one. If I see a gun loaded, then taken behind a curtain for a while, and then the trigger pulled and nothing happened, I would conclude that the gun had been unloaded behind the curtain. So would you, I suspect. If I am told that a gun is loaded, and then something unspecified happens to it, I would be suspicious that maybe the 'something' had interfered with the gun; at the very least, that seems to be a possibility one should consider. This is a more accurate intuitive rendering of the YSS axioms than talking about 'waiting'.

We all know that waiting definitely does not alter loadedness, as a matter of fact: but this isn't dependent on some kind of universal background default 'normality' assumption, but follows from what we know about what 'waiting' means. It is about as secure a piece of positive commonsense knowledge as one could wish to find. Just imagine it: there's the gun, sitting on the table, in full view, and you can see that nothing happens to it. Of course it's still loaded. How could the bullet have gotten out all by itself? But this follows from knowledge that we have about the way things work - that solid objects can't just evaporate or pass through solid boundaries, that things don't move or change their physical constitution unless acted on somehow, that guns are made of metal, and so on. And the firmness of our intuition about the gun still being loaded depends on that knowledge. (To see why, imagine the gun is a cup and the loading is filling it with solid carbon dioxide, or that the gun is made of paper and the bullet is made of ice, and ask what the effects would be of 'waiting'.) So if we want to appeal to those intuitions, we ought to be prepared to try to represent that knowledge and use it in our reasoners, instead of looking for simplistic 'principles' of minimising changes or temporal extension, etc., which will magically solve our problems for us without needing to get down to the actual facts of the matter. (Part of my frustration with the sitcalc is that it seems to provide no way to express or use such knowledge.)

I know how the usual story goes, as Murray Shanahan deftly outlines it. There's a proposed solution to the frame problem - minimising abnormality - which has the nice side-effect that when you say nothing about an action, the default conclusion is that nothing happened. The Yale-shooting-scenario-Hanks-McDermott problem is that this gives the 'unintuitive' consequence, when we insert gratuitous 'waitings', that these blank actions might be the abnormal ones. My point is that this is not a problem: this is exactly what one would expect such a logic to say, given the semantic insights which motivated it in the first place; and moreover, it is a perfectly reasonable conclusion, one which a human thinker might also come up with, given that amount of information.

Murray says:

  If you want to solve the frame problem, your solution had better be able to deal with the Yale shooting scenario.

This is crucially ambiguous. The conclusion I drew from this example when it first appeared was that it showed very vividly that this style of axiomatisation simply couldn't be made to work properly. So if "the Yale shooting scenario" refers to some typical set of axioms, I disagree. If it refers to something involving guns, bullets and time, then I agree, but think that a lot more needs to be said about solidity, containment, velocity, impact, etc., before one can even begin to ask whether a formalisation is adequate to describing this business of slow murder at Yale. Certainly your solution had better be able to describe what it means to just wait, doing nothing, for a while, and maybe (at least here in the USA) it had better be capable of describing guns and the effects of loading and shooting them. But that's not the same as saying that it has to be able to deal with the way this is conventionally axiomatised in the situation calculus.

Imagine a gun which requires a wick to be freshly soaked in acetone, so that just waiting too long can cause it to become unready to shoot. This satisfies the usual YSS axioms perfectly: when loaded, it is (normally) ready to fire; when fired, it (normally) kills, etc. But if you wait a while, this gun (normally) unloads itself. Now, what is missing from the usual axiomatisation which would rule out such a gun? Notice, one doesn't want to make such a device logically impossible, since it obviously could be constructed, and indeed some mechanisms are time-critical in this way (hand grenades, for example). So one wants to be able to write an axiom which would say that the gun in question isn't time-critical: it has what one might call non-evaporative loading. Maybe it's something to do with the fact that the bullets are securely located inside the gun, and that they don't change their state until fired...or whatever. My point is only that there is no way to avoid getting into this level of detail; formalisations which try to get intuitive results with very sketchy information cannot hope to succeed except in domains which are severely restricted.

(Response to Erik Sandewall in a later message.)

Pat Hayes

Ontologies for actions and change

From: Hector Geffner on 23.1.1998

Some brief comments about Pat's last comments.

  In any AI the same (causality) principle makes perfect sense when actions are exogenous; such actions, I think, we can agree, should never affect your beliefs about the past ..

  But actions - or external events - do change one's beliefs about the past. They do not change the past itself, of course: that is the causality principle. But consider for example coming into a room in an empty house and finding a hot cup of coffee resting on a table. One immediately infers that someone else has been present there recently.

It's important to distinguish observations from actions. In dynamic systems the former are usually expressed as "initial conditions", e.g.,  x(0) = 5 ,  loaded(0) = false , etc., while the latter are the inputs to the system.

An observation at time  i  ("cup on the table") of course should have an effect on your beliefs at times  j < i  (the past). Basically the effect of an observation is to prune state trajectories (this is explicit in Sandewall's filtered entailment, in  A , etc.).

On the other hand, what I'm saying (and of course, many other people) is that exogenous actions at time  i , unlike observations, should not have an effect on your beliefs at times  j < i .
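This contrast can be rendered in a small executable sketch (a toy of my own construction, not filtered entailment or any other published formalism; the "loaded" fluent and the three-step domain are invented for illustration): beliefs are a set of candidate state trajectories, an observation prunes the set and can thereby sharpen beliefs about earlier times, while an exogenous action rewrites only the suffix of each trajectory.

```python
from itertools import product

def legal(t):
    # Domain law (invented): a loaded gun may spontaneously become
    # unloaded, but an unloaded one never spontaneously loads.
    return all(not (earlier is False and later is True)
               for earlier, later in zip(t, t[1:]))

# Beliefs = the set of trajectories (values of "loaded" at times 0, 1, 2)
# consistent with the domain law.
beliefs = [list(t) for t in product([True, False], repeat=3) if legal(t)]
past = {t[0] for t in beliefs}          # beliefs about time 0: undetermined

# Observation at time 2: "loaded".  Pruning trajectories sharpens beliefs
# about the PAST, since only the all-loaded history survives.
beliefs = [t for t in beliefs if t[2]]
assert {t[0] for t in beliefs} == {True}   # the past is now determined

# Exogenous action at time 2: "unload".  It rewrites the trajectory from
# time 2 onwards and leaves every earlier time untouched, so beliefs
# about the past do not move.
beliefs = [t[:2] + [False] for t in beliefs]
assert {t[0] for t in beliefs} == {True}   # unchanged by the action
```

The asymmetry is visible in the code itself: the observation is a filter over whole histories, while the action only overwrites the part of each history at or after the time of acting.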

You may say that if you drop the cup to the floor and it breaks, then you should change your beliefs about the past, inferring things such as that the cup was made of glass, etc.

Yet, it's not the action that is having an influence on your beliefs about the past; it is the observation that it breaks. (You may say that "breaks" is not an observation but an action; yet in that case, it's definitely not an exogenous action, as it depends on variables in the model.)

If you remove that observation, and simply drop the cup, you will learn nothing about the past.

BTW, the causality principle is not about physics; I believe it's about models of physics. Whether such (dynamic) models are specified by means of mathematical equations or rules in a logical language, I don't think it's relevant (for compliance with the principle).

A final point before this gets too long. Pat says

  .... such simulations are often unreliable precisely because we don't have sufficiently complete knowledge; and when this is so, we cannot cleave to the strict causality principle, but are obliged to use techniques such as nonmonotonic reasoning which allow us to recover gracefully from observed facts which contradict our predictions, which would otherwise enmesh us in contradictory beliefs. Nonmonotonicity is a good example of the need to revise one's beliefs about the past in the light of unexpected outcomes in the present, in fact, which gets us back to the YSP:

Actually, I think none of the received models of actions in AI (say  A , Toronto Sit Calc, Basic Features and Fluents, etc) does that.

I believe they are all monotonic in the set of observations. In other words, if they predict  F  at time  i , nothing that they observe is going to affect that prediction. At most they can make the theories inconsistent. (One of the few exceptions that I'm aware of is a proposal in a paper of mine [c-aaai-94-245], in which state trajectories are ordered by a plausibility measure.)

They are nonmonotonic, however, in the set of actions. That is, you can affect the prediction " F  at time  i " by performing an action before  i .

It's basically like in standard dynamic models, whether deterministic or probabilistic.

So I think our models are not so different from more standard models of action. Of course, they are very different in the description languages; but that's the type of difference that you have between Strips and transition functions. The first is much more convenient, but it's basically a "front-end". It's not a new model of action; it's a new language for describing the most basic one (deterministic models). Of course, I'm completely convinced that this is very important, and I think that's precisely where KR/action fits in.

- Hector Geffner

References:

c-aaai-94-245 H. Geffner.
Causal Default Reasoning: Principles and Algorithms.
Proc. AAAI National Conference on Artificial Intelligence, 1994, pp. 245-250.

From: Judea Pearl on 23.1.1998

Correcting a statement by Pat Hayes.

Hector Geffner said, about the principle of causality:

  In any AI the same principle makes perfect sense when actions are exogenous; such actions, I think, we can agree, should never affect your beliefs about the past (indeed, as long as you cannot predict exogenous actions from your past beliefs, you shouldn't change your past beliefs when such actions occur).

To which Pat Hayes replied:

  But actions - or external events - do change one's beliefs about the past. They do not change the past itself, of course: that is the causality principle. But consider for example coming into a room in an empty house and finding a hot cup of coffee resting on a table. One immediately infers that someone else has been present there recently. We constantly make inferences about the past on the basis of present knowledge, even in such technical areas as military planning.

Correction

The principle of causality is in fact stronger than Pat's interpretation of it. Neither the past nor our beliefs about the past change as a result of actions, unless the acting agent is part of our model. Moreover, if the agent is part of our model, then actions cease to be interesting and problematic as they are today (at least in some AI approaches to actions).

To explain: The action of putting a cup of coffee on the table does not change the state of the coffee or the table before the action, and does not change our beliefs about the state of the coffee before the action. But, Pat will argue: seeing the coffee on the table allows us to infer that "someone else has been present there recently." True, but only if we are concerned about the actor's whereabouts and if the limitations or motivations of the action-producing agents are in the model (e.g., that someone must be in a house to do it, and will probably do it if he/she is thirsty, etc.). Once this action is perceived as produced by a modeled agent, it is no different from any other event, say the agent tripping or being hit by a car, because then it is licensed to trigger the usual inferences of prediction and abduction that observations trigger.

Hector said all this by using the term "exogenous". Indeed, the problematic aspects of actions surface when we try to treat actions as exogenous, namely produced by external agents not modeled in our system. Only by considering an action exogenous can we talk about the world "changing" (has anyone ever seen a world truly changing?).

Put another way, the peculiarities of actions vis a vis observations stem from the boundaries we put around our models and our audacity to call our models "worlds".

B. Russell (1910) noted (my words, not his) that if we model the entire universe, there is no meaning to causality or to actions, because the manipulator and the manipulated lose their distinction.

One problematic aspect of dealing with external actions is that ordinary abduction, from the immediate consequences of those actions, must be suppressed. For example, we do not want to abduce that it rained when we decide to pour water on the driveway and get it wet. This suppression is easily enforced in systems such as logic programming (Lifschitz et al.) in which inference has a preferred directionality, and where abduction is not a built-in feature (e.g., contraposition is not sanctioned) but must be crafted as a special feature. However, the suppression is problematic in standard logic and probability, where inference is symmetric and no differentiation is made between abduction and prediction. Such differentiation is one role played by causally ordered relations.

Well, enough said for one tiny correction.

From: David Poole on 23.1.1998

Pat Hayes wrote:

  But actions - or external events - do change one's beliefs about the past. They do not change the past itself, of course: that is the causality principle. But consider for example coming into a room in an empty house and finding a hot cup of coffee resting on a table. One immediately infers that someone else has been present there recently. We constantly make inferences about the past on the basis of present knowledge, even in such technical areas as military planning.

No. No. No. Observations can change one's beliefs about the past. Doing an action doesn't change beliefs about the past. Observing the hot coffee made you update your beliefs about the past. The action of coming into the room didn't change your beliefs about the past (unless you include the observation that the action succeeded).

To make everything clearer it is crucial to distinguish sensing (making observations) from motor control (doing/attempting actions). The only role of sensing is to update your beliefs. Sensing in this sense doesn't change the world (of course, acting in order that you can sense can, and often does, change the world). Acting (i.e., sending a command to your motor controllers) doesn't change your beliefs about your present or your past, but only about the future. Acting can change the world.

If we fail to distinguish these, confusion will reign supreme. Surely in order to make some progress we could agree on such a distinction, then get back to the argument with at least one confusion cleared up!

David Poole

P.S. I am wondering why such confusion between observing and acting may have arisen in the first place.

I conjecture that it has to do with the preconditions of actions. For example, the  pickup(x)  action has the precondition that there is nothing on  x . Then by carrying out the action, can you infer that there was nothing on  x ? But this doesn't make sense. What happens if I had attempted to pickup  x  when there was something on it? What if I didn't know whether there was something on  x  when I tried to pick it up? It seems that the only sensible interpretation of the precondition is that if there was nothing on  x  and I carried out  pickup(x) , then the expected thing would happen. If something was on  x  and I carried out  pickup(x) , then who knows what may happen. The role of the precondition is that it is only sensible to attempt to carry out the action when the preconditions hold.

An alternative explanation of the confusion may be that the action  pickup(x)  is an observation of the effect of my motor control on a particular state of the world. If I carry out a particular motor control when there is nothing on  x , then a  pickup(x)  action arises. When I do the same motor control when there is something on  x , then some other action arises. Then I do not choose the action, but I only choose the motor control (consider the case when I don't know whether there is something on  x  or not, and I try to pick it up). Is this what people mean by an action? Surely then it is imperative to distinguish that motor control (for want of a better description) that I get to choose, from the observation of the effect of that motor control.

From: Erik Sandewall on 23.1.1998

Do all current approaches behave monotonically with respect to observations? On 23.1, Hector Geffner wrote:

  Actually, I think none of the received models of actions in AI (say A, Toronto Sit Calc, Basic Features and Fluents, etc) does that.

I believe they are all monotonic in the set of observations. In other words, if they predict F at time i, nothing that they observe is going to affect that prediction. At most they can make the theories inconsistent.

With respect to features and fluents, this is true for those cases where proven assessments exist at present, but not in general. The full catalogue of ontological characteristics includes such phenomena as "surprises" and "normality", both of which are nonmonotonic with respect to observations. A simple example is the stolen car scenario: the car is left in the parking lot, three nights pass, it is certain that the car is not removed during the day, and the default is for it to stay where it is at night as well. The default conclusion is that it's still there after the three nights. Then add the observation that it's gone at the end of the period. Without admitting surprises, this scenario is inconsistent, as Hector writes. If surprises are allowed for, then you may conclude that it was removed during one of those three nights.

If you add, in addition, the fact that the parking lot was supervised the first and the third night, making removal impossible, then one is entitled to infer that it was stolen during the second night. Thus, the initial default conclusion regarding the presence of the car in the parking lot during the day after the second night goes from "yes" via "don't know" to "no" as these successive observations are added.

From the point of view of diagnostic reasoning these are familiar problems, but I can't think of any work in mainstream actions and change that has addressed nonmonotonicity with respect to observations in a serious way. Except, without knowing the details, I would imagine that the people who do probabilistic or possibilistic approaches might have something to say about this. Judea, or the folks in Toulouse, any input?

Do all current approaches comply with the causality principle? On 21.1, Hector Geffner wrote:

  Most recent models of action comply with the causality principle. In some it comes for free (e.g., language  A ) due to the semantic structures used (transition functions); in others (Reiter, Sandewall, etc.), I'm sure it can be proved.

Yes, with respect to the features and fluents approach, it also "comes for free". The underlying semantics that's used there is essentially a simulation of the world, using non-deterministic transition from state to state or from state to state sequence (the latter in order to account for actions with extended duration). In either case the simulation proceeds forward in time, so it complies with the causality principle. The same applies for the extensions of the approach to deal with concurrency, ramification, and continuous change.

Erik Sandewall

From: Pat Hayes on 25.1.1998

I said:
 

... actions - or external events - do change one's beliefs about the past.... consider for example coming into a room in an empty house and finding a hot cup of coffee resting on a table. One immediately infers that someone else has been present there recently.

This seems quite obvious and uncontroversial to me. One can be surprised by things that happen in one's world. Observations can, in fact, violate one's model of the world and show that it is faulty and needs to be updated; and these updates can involve changing one's beliefs about the past. However, many seem to disagree:

Hector Geffner:

  It's important to distinguish observations from actions.

David Poole:

  No. No. No. Observations can change one's beliefs about the past. Doing an action doesn't change beliefs about the past. Observing the hot coffee made you update your beliefs about the past.

Well, OK. I said 'actions or external events'.

I don't think this sharp distinction between sensing and acting is either necessary or even ultimately coherent, in fact. We are constantly monitoring our own actions in this kind of way, at various levels, so almost every action has observation involved in it. And similarly, it is hard to observe without somehow acting, and a good deal of our planning and acting is motivated by a perceived need to find out more about the world. So observation is intimately bound up with action. Our peripheral systems often blend motor action and low-level perception in tight feedback control loops, so that our bodies seem to 'move by themselves', but these low-level controls are the result of more cognitive decision-making (deciding to hit a tennis ball, say).

But even if I agree, for the sake of argument; David's point about observation applies to beliefs about anything, past, present or future. You might predict what is going to happen when you act, but the only way to be sure it will happen is to do it and then take a look. So again, I see no reason why this distinction supports the original claim that we cannot draw new conclusions about the past from observations in the present.

Hector Geffner continues:

  In dynamic systems the former are usually expressed as "initial conditions", e.g.,  x(0) = 5 ,  loaded(0) = false , etc., while the latter are the inputs to the system. An observation at time  i  ("cup on the table") of course should have an effect on your beliefs at times  j < i .

This seems so peculiar that I wonder if we are talking about the same thing. The whole point of making an observation, surely, is to gain information about the present state, not the initial conditions (which, once one has taken an action, are in the past). Similarly, I wasn't saying that an observation could change one's past beliefs - nothing can change the past - but rather that it could change one's (present) beliefs about the past.

[Regarding Hector's later remarks on nonmonotonicity, I agree with Erik Sandewall's reply in 23.1(98007).]

Judea Pearl:

  Correction

  The principle of causality is in fact stronger than Pat's interpretation of it. Neither the past nor our beliefs about the past change as a result of actions, unless the acting agent is part of our model. Moreover, if the agent is part of our model, then actions cease to be interesting and problematic as they are today (at least in some AI approaches to actions). To explain: The action of putting a cup of coffee on the table does not change the state of the coffee or the table before the action, and does not change our beliefs about the state of the coffee before the action.

Well, I was talking about a later observation of the coffee put there by someone else, not the act of putting it there oneself. But in any case, this second statement again seems to me to be simply false. Suppose that one is walking along in NYC minding one's own business, when suddenly a mugger grabs one from behind and holds a knife to one's throat. Isn't this an example of an "exogenous" act (by an agent not in one's model) causing one to change one's beliefs about a whole lot of things, some of them involving the past, such as that you were alone in the street? (Examples like this are so blindingly obvious that I wonder if we are talking about the same thing??)

I fail to see why having beliefs about other agents (I presume this is what is meant by an agent being part of a model) means that "actions cease to be interesting". Most of our daily lives are taken up with worrying about what other agents are doing, especially the ones we know a lot about. (Ever had kids to look after, or even a dog?)

  BUT, Pat will argue: seeing the coffee on the table allows us to infer that "someone else has been present there recently." True, but only if we are concerned about the actor's whereabouts and if the limitations or motivations of the action-producing agents are in the model (e.g., that someone must be in a house to do it, and will probably do it if he/she is thirsty, etc.)

Well of course. If one knew nothing of people and coffee, one wouldn't come to the conclusion I mentioned. It follows from one's knowledge of such things. That is simply a non sequitur. Notice however that the inference is that an agent exists, whose existence one had not previously considered. I'm not sure what your doctrine allows, but this doesn't seem quite the same as having that agent in one's model before seeing the cup. Not to mention the mugger.

  Once this action is perceived as produced by a modeled agent, it is no different from any other event, say the agent tripping or being hit by a car, because then it is licensed to trigger the usual inferences of prediction and abduction that observations trigger.

Well, all this talk of 'licence' seems like the reiteration of a doctrine of some kind (one I am not familiar with, being an atheist); but leaving that aside, if I can make sense of this at all, then Judea seems to simply be agreeing with me. The point is that (within one's "licence") it is possible to come to new conclusions about the past on the basis of new information about the present. The direction of inference can be opposed to time's arrow. Is that not what we were talking about?

  Hector said all this by using the term "exogenous". Indeed, the problematic aspects of actions surface when we try to treat actions as exogenous, namely produced by external agents not modeled in our system. Only by considering an action exogenous can we talk about the world "changing" (has anyone ever seen a world truly changing?).

Of course I have seen a world changing, in the relevant sense. It happens even if the actions are entirely within my own conceptual scope, eg as when I flick the light switch, confident that the light will in fact come on, and it does indeed come on. That's an observation of a changing world. (Sometimes the light doesn't come on, and I am forced to update my beliefs, often about the past history of the light bulb or the electrical system.) ....

  One problematic aspect of dealing with external actions is that ordinary abduction, from the immediate consequences of those actions, must be suppressed. For example, we do not want to abduce that it rained when we decide to pour water on the driveway and get it wet.

I confess to failing completely to follow this point. Why is my putting water on my driveway considered an external action? External to what?

Pat Hayes

P.S. back to David Poole:

  P.S. I am wondering why such confusion between observing and acting may have arisen in the first place.

  I conjecture that it has to do with the preconditions of actions. For example, the  pickup(x)  action has the precondition that there is nothing on  x . Then by carrying out the action, can you infer that there was nothing on  x ?

No, of course not. The implication runs from precondition to result, not in reverse. (You might also have a reverse implication if the precondition was necessary as well as sufficient; but then this would be a valid inference to make. Consider for example that the only way to stay alive is to not hit the oncoming truck; you make a wild swerve, survive, and say with a sigh of relief, Thank God I didn't hit the truck.)

  But this doesn't make sense. What happens if I had attempted to pickup x when there was something on it?

You would have failed, and maybe (if the axiom had been an iff ) concluded that there must have been something on the block after all, or at any rate that something had prevented it being picked up. It would have been an abnormal state, in McCarthy's middle-period sitcalc using ab-minimisation.

  What if I didn't know whether there was something on  x  when I tried to pick it up? It seems that the only sensible interpretation of the precondition is that if there was nothing on  x  and I carried out  pickup(x) , then the expected thing would happen. If something was on  x  and I carried out  pickup(x) , then who knows what may happen. The role of the precondition is that it is only sensible to attempt to carry out the action when the preconditions hold.

No no. This is all a description of an action. What actually happens when you do the actual action may be much more complicated than your description (your beliefs about the action) is able to predict. Maybe something was there that you didn't know about; maybe your idea of lifting is defective in some way. We can never guarantee that our beliefs are accurate, and still less that they are complete. But whatever actually happens, if you are able to deduce from your beliefs that X should happen at time  t , and then when you actually do it, (you observe that) X doesn't happen at time  t , you want to be able to recover from this situation without dissolving into inconsistency. Hence the (original) need for nonmonotonic logics.

  An alternative explanation of the confusion may be that the action  pickup(x)  is an observation of the effect of my motor control on a particular state of the world. If I carry out a particular motor control when there is nothing on  x , then a  pickup(x)  action arises. When I do the same motor control when there is something on  x , then some other action arises. Then I do not choose the action, but I only choose the motor control (consider the case when I don't know whether there is something on  x  or not, and I try to pick it up). Is this what people mean by an action? Surely then it is imperative to distinguish that motor control (for want of a better description) that I get to choose, from the observation of the effect of that motor control.

In the sitcalc (any variety), actions are changes in the world, not motor commands. One plans by thinking about the changes, not by thinking about the muscles one is going to use. Putting such a system into a robot requires one to somehow connect these actions with motor controls, no doubt, but they shouldn't be identified. (Murray and Ray, do y'all agree??)

From: Michael Thielscher on 26.1.1998

On 23.1., Erik wrote

  ... but I can't think of any work in mainstream actions and change that has addressed nonmonotonicity with respect to observations in a serious way.

Well, I at least intended to be serious in my KR'96 paper on "Qualification and Causality" [c-kr-96-51], where I address the Qualification Problem. The latter is inherently concerned with nonmonotonicity wrt. observations--if we view it the way John introduced it, and not oversimplify it to the marginal issue of deriving implicit preconditions from state constraints. The classical example goes as follows. By default, we conclude that the car's engine is running after turning the ignition key. Adding the observation that initially the tail pipe houses a potato, the previous conclusion gets withdrawn. Part of my paper was concerned with a Fluent Calculus axiomatization capable of dealing with (a well-defined aspect of) the Qualification Problem. A nice feature about the Fluent Calculus is that it provides monotonic solutions to both the Frame and the Ramification Problem. But when it comes to the Qualification Problem, nonmonotonicity is inevitable per definitionem, which is why my Fluent Calculus axiomatization in the KR paper comes with a nonmonotonic feature.
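The nonmonotonic behaviour of the classical example can be caricatured in a few lines (all names invented; this is emphatically not the Fluent Calculus axiomatization of the KR'96 paper, only an illustration of withdrawing a conclusion when an observation is added):

```python
def engine_runs(observations):
    """Default conclusion: turning the key starts the engine, unless a
    known disqualifying observation is present."""
    disqualifications = {"potato_in_tailpipe", "battery_dead"}
    turned_key = "turn_key" in observations
    abnormal = bool(disqualifications & observations)
    return turned_key and not abnormal

# By default we conclude the engine is running after turning the key...
assert engine_runs({"turn_key"}) is True
# ...but adding an observation withdraws that conclusion: the inference
# is nonmonotonic with respect to observations.
assert engine_runs({"turn_key", "potato_in_tailpipe"}) is False
```

Adding an element to the observation set removes a previously derivable conclusion, which is exactly the monotonicity failure at issue.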

References:

c-kr-96-51 Michael Thielscher.
Causality and the Qualification Problem.
Proc. International Conf on Knowledge Representation and Reasoning, 1996, pp. 51-62.

From: Erik Sandewall on 26.1.1998

Michael,

Sorry about that, I expressed myself imprecisely. What I had in mind was in the context of Pat Hayes's contribution:
  Nonmonotonicity is a good example of the need to revise one's beliefs about the past in the light of unexpected outcomes in the present,...
which is why I quoted this elaboration of the stolen car scenario. You are of course quite right that for the topic of observation nonmonotonicity as such (not restricted to "backwards in time"), approaches to the qualification problem and in particular your contribution are highly relevant.

What happens in the approach of your KR-96 paper for the case of "backwards in time", such as the stolen car scenario?

Re the potato in tailpipe scenario, see also my question number 2 to Wolfgang Bibel in yesterday's newsletter.

Erik

From: Judea Pearl on 26.1.1998

On Actions vs Observations, or on Pat Hayes' reply to Geffner, Poole and me.

Well, well, and I thought my tiny correction would end with Pat just replying "Of course, I did not mean it ...". Instead, it now seems that the cleavage between the culture that Hector, David and I represent and the one represented by Pat has gotten so deep that we are not even sure we are talking about the same thing.

Pat does not think the "distinction between sensing and acting is either necessary or even ultimately coherent". For him, observing a surprising fact evokes the same chain of reasoning as establishing that fact by an external act. In both cases, so claims Pat, the world is changing, because a world is none other than one's beliefs about the world, and these do indeed change in both cases.

I tend to suspect that Pat's position is singular, and that most readers of this newsletter share the understanding that it is useful to think about observations as changing beliefs about a static world, and about actions as changing the world itself. I am under the impression that this distinction has become generally accepted among AI researchers (as it has among philosophers and database people, e.g., counterfactual vs indicative conditionals, imaging vs. conditioning, belief updating vs. belief revision, etc. etc.) Otherwise, one would find it hard to understand why so much energy has been devoted in AI to actions vis a vis observations, why the title of this Newsletter is what it is, why the Frame Problem does not exist relative to observations (e.g., need frame axioms for "I saw only x"?), why there is no ramification problem relative to observations (surely every observation has some ramifications), and why people write so many papers on concurrent actions and not on concurrent observations. Or are these observational problems just waiting around the corner to hit us once we solve the corresponding problems with actions?

If Pat's position is singular, I would not wish to bore the rest of the readers with this issue, and I will pursue it with Pat in the privacy of our screens. I am curious, though, whether my suspicion is correct, and I will be waiting for readers' feedback.

----------------

Those who read my paper in TARK 1996 need not read further. Others may wish to ask when a change is considered a "world-change" and when it is merely a "belief change".

Let us use Pat's example:
  Of course I have seen a world changing, in the relevant sense. It happens even if the actions are entirely within my own conceptual scope, eg as when I flick the light switch, confident that the light will in fact come on, and it does indeed come on. That's an observation of a changing world. (Sometimes the light doesn't come on, and I am forced to update my beliefs, often about the past history of the light bulb or the electrical system.)

Question: Is flicking the switch and seeing the light come on an example of a changing world?

But we are not dealing here with physics or psychology. We are dealing with various formal systems for modeling the world and our beliefs about the world. For every such system, if we are to accomplish anything more useful than trying to solve Schrödinger's equation for all the particles in the universe, we must define some limits, or boundaries, thus distinguishing the things our system will explain from those that it won't. Once the boundaries are defined, they also define a set of relationships as invariant, namely, holding true permanently for all practical purposes, unless they are violated by exceptions that come from outside those boundaries. We say that the world "changed" if any of these invariant relationships is violated.

These concepts were developed for systems with static boundaries. Humans on the other hand, show remarkable flexibility in expanding and shrinking these boundaries as the needs arise, and most of Pat's examples draw on this flexibility. This may account for his statement:
  I don't think this sharp distinction .... is either necessary or even ultimately coherent, in fact. We are constantly....
Yes. We are constantly expanding (shrinking) our model of the world as we bring in more (less) background knowledge into the working space of our inference engine. But most formal systems today are not us: they work with a fixed and finite number of invariants, and the laws of astrophysics are not among them.

Flicking the switch would be a case of a changing world if our formal system is ignorant of the causal connection between the switch and the light. But for a formal system that includes causal rules saying

	 "switch flicked causes light on"
and (this is important)
	 "switch unflicked causes light off"

seeing "light on" (even without flicking the switch) need not be considered a change of world, because "light on" can be explained within the system (in terms of the switch being flicked by some agent), and there is no need to invoke such dramatic phrases as "changing worlds" for things that we know how to handle by standard inference methods (e.g., adding a proposition "light on" to the system and letting a classical theorem prover draw the consequences [and if our current beliefs contain "light off", then the contradiction can be handled by either temporal precedence or minimal-change belief revision]).

What cannot be explained within such a system is the flicking of the switch (unless we model the agents' motivations too). Thus, we ask: should the flicking of the switch be considered a "change of world" in such a system? Yes, it can be considered such. But if our model is confined to the objects and the lighting in the room, and does not contain knowledge about switch-flicking agents, then it won't make any difference if we consider it on an equal footing with the observation "the switch is reported to be flicked". All inferences triggered by a changing-world semantics will also be triggered by an observational semantics, according to which the world was static and we just opened our eyes and learned a new fact: "the switch is flicked".

So when must we resort to the changing-world semantics? Pat answered this question:
  Sometimes the light doesnt come on, and I am forced to update my beliefs, often about the past history of the light bulb or the electrical system.)

Here we have it! A violation of one of the system's invariants -- the causal rules (stated above), which specify the influence of the switch on the light. Why is it special? Consider now the action "take the bulb out" with its immediate consequence "light out". Had the causal rule (above) been qualified with "flick causes light unless someone takes the bulb out", there would be no violation, and again we could absorb the new information as before, without invoking world-change semantics. But without such qualification, a violation occurs in a rule that was meant to be permanent (Vladimir even uses the predicate always to distinguish causal rules from less permanent relationships). Well, now we have no choice BUT to invoke changing-world semantics. The new semantics is similar to the one for changing observations, with one important difference: the suppression of certain explanations. Before the action "take the bulb out" was enacted, we had the license (excuse the religious overtones, Pat) to use our causal rule and infer an explanation: "light out, therefore switch must be unflicked". But after that action, the license is revoked, and this explanation should not be inferred (even though the causal rule still resides somewhere in the database, ready to be used again in a new situation).

And this is where the difference between actions and observations comes in. An action must be equipped with a pointer marking some causal rule as "violated", to suppress its use in abduction (or explanation). Such marking is needed only when some of the predicates in the action's add-list are consequences of some causal rule. Predicates that are not consequences of any causal rule are called "exogenous", and it does not matter whether we treat them as actions or observations -- we will never attempt to explain them anyway.
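The marking scheme just described might be sketched as follows. The rule table and function names are illustrative assumptions, not part of any published formalism; the point is only that an action revokes the license to explain certain effects, while exogenous predicates are never explained at all.

```python
# Sketch of the marking scheme: an action points at the causal rules it
# violates, suppressing their use in explanation; predicates covered by
# no rule are "exogenous". (All names are hypothetical.)

CAUSAL_RULES = {"light_on": "switch_flicked",
                "light_off": "switch_unflicked"}
violated = set()  # effects whose explaining rule is currently revoked

def do_action(action):
    """Enact an action, marking the causal rules it violates."""
    if action == "take_bulb_out":
        # "light off" may now hold regardless of the switch:
        violated.add("light_off")

def explain(observation):
    """Abduce a cause, unless the rule is revoked or none exists."""
    if observation not in CAUSAL_RULES:
        return None  # exogenous: we never attempt to explain it
    if observation in violated:
        return None  # the license to explain via this rule is revoked
    return CAUSAL_RULES[observation]

assert explain("light_off") == "switch_unflicked"  # before the action
do_action("take_bulb_out")
assert explain("light_off") is None      # explanation now suppressed
assert explain("switch_flicked") is None # exogenous: never explained
```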

Looking back, it appears that I did manage to bore you after all, despite my determination to do it to Pat only.

Time to stop.

Judea

From: Erik Sandewall on 26.1.1998

The topic of "what's in an action?" is important and quite worth some attention. In the previous Newsletter issues, David Poole wrote and Pat Hayes answered as follows:

  If I carry out a particular motor control when there is nothing on  x , then a  pickup(x)  action arises. When I do the same motor control when there is something on  x , then some other action arises. Then I do not choose the action, but I only choose the motor control (consider the case when I don't know whether there is something on  x  or not, and I try to pick it up). Is this what people mean by an action? Surely then it is imperative to distinguish that motor control (for want of a better description) that I get to choose, from the observation of the effect of that motor control.
  In the sitcalc (any variety), actions are changes in the world, not motor commands. One plans by thinking about the changes, not by thinking about the muscles one is going to use. Putting such a system into a robot requires one to somehow connect these actions with motor controls, no doubt, but they shouldnt be identified. (Murray and Ray, do y'all agree??)

However, earlier in the same contribution Pat had written:

  ... Our peripheral systems often blend motor action and lowlevel perception in tight feedback control loops, so that our bodies seem to 'move by themselves', but these lowlevel controls are the result of more cognitive decision-making (deciding to hit a tennis ball, say.)

Pat, I can't make sense out of your position: at one point you seem to argue that low-level and high-level descriptions of actions can't ever be separated; at another point you seem to say that they are best treated in complete separation.

My own preference is to take both into account, but to be precise about having two distinct levels of description with distinct roles. In particular, this allows for dealing with an action both as a prescription for motor controls and as the expectation of what state changes will be obtained as a result. It also allows one to relate those two levels to each other.

Regardless of level, I understand an action as an instantaneous invocation that starts a process, just as in the ontology of Reiter-sitcalc. The invocation is treated similarly across levels. The two levels of description apply to the process; I've called them the material level and the deliberative level (previously in [mb-Sandewall-94], p. 3, but the deliberative level was called "image level"). An illustrative example of an action description on the material level, for the action of going from the current location A to a new location B, is as follows: turn around until you're looking towards B, then accelerate by 0.3 m/s/s until you reach a velocity of 1 m/s, and from then on keep a constant velocity until you are at B. Leaving a lot of details aside, such as defining what it means to be "at B", the point I want to illustrate is that we use a continuous-level description of the action -- more precisely, at least piecewise continuous for all the relevant state variables, but with the possibility of changing the control mode due to interventions by some higher control level.

From a software engineering point of view, this is more like a specification than like a program. It is also "above" the level of control engineering in the sense that it provides no details about the feedback control loop between velocity sensor and motor controls. At the same time, it is much closer to the control level than a conventional description in some of our favorite logics in KR.
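To make the material-level reading concrete, here is a rough one-dimensional simulation of the go-from-A-to-B description above. The numbers (0.3 m/s/s, 1 m/s) come from the text; the loop structure and names are assumptions. The point is the piecewise-continuous evolution of the state variables with an explicit, switchable control mode.

```python
# Rough simulation of the material-level action description: accelerate
# at 0.3 m/s/s up to 1 m/s, then hold constant velocity until "at B".
# Turning toward B is omitted; structure and names are hypothetical.

def go_to(distance_to_b, dt=0.01):
    """Simulate the go-from-A-to-B action; returns elapsed time (s)."""
    pos, vel, t = 0.0, 0.0, 0.0
    mode = "accelerate"  # control mode, switchable by a higher level
    while pos < distance_to_b:
        if mode == "accelerate":
            vel = min(vel + 0.3 * dt, 1.0)
            if vel >= 1.0:
                mode = "cruise"  # change of control mode
        pos += vel * dt
        t += dt
    return t
```

For a 10 m trip this yields a bit under 12 s: roughly 3.3 s of acceleration followed by cruising at 1 m/s for the remaining distance.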

The deliberative level is the one that Pat alludes to, where actions are characterized by discrete properties at a small number of timepoints: possibly only the beginning and the end of the action, possibly a few more, possibly a sequence of partial world states at integer timepoints (as in basic Features and Fluents). From the point of view of the deliberative level, it may be convenient to think of the material level as a "program" for performing the action, but then in a very general sense of "program".

Here are some of the benefits from the explicit treatment of the action's process on both the material level and the deliberative level:

Furthermore, David Poole wrote and Pat Hayes answered:

  What if I didn't know whether there was something on  x  when I tried to pick it up? It seems that the only sensible interpretation of the precondition is that if there was nothing on  x  and I carried out  pickup(x) , then the expected thing would happen. If something was on  x  and I carried out  pickup(x) , then who knows what may happen. The role of the precondition is that it is only sensible to attempt to carry out the action when the preconditions hold.

  No no. This is all a description of an action. What actually happens when you do the actual action may be much more complicated than your description (your beliefs about the action) are able to predict. ... But whatever actually happens, if you are able to deduce from your beliefs that X should happen at time  t , and then when you actually do it, (you observe that) X doesn't happen at time  t , you want to be able to recover from this situation without dissolving into inconsistency. Hence the (original) need for nonmonotonic logics.

I'll agree with David if the last sentence is changed to read "...it is only sensible to attempt to carry out the action when the preconditions are believed to hold". This belief may be based on correct observations, on defaults, or even on incorrect observations, but if one does have a bona fide belief that the preconditions hold, with a high degree of assurance (and one has no particular reason to believe that a great disaster will ensue if one happens to be mistaken, etc.), then of course it is sensible to attempt the action.

Therefore, Pat, I don't understand what you are objecting to in this respect. (And what does it have to do with the original need for nonmonotonic logic?)

Erik

References:

[f-cis.linep.se-97-019] Erik Sandewall.
Logic-Based Modelling of Goal-Directed Behavior.
Linköping University Electronic Press, 1997: [original publication]

[j-aicom-9-214] Erik Sandewall.
Towards the validation of high-level action descriptions from their low-level definitions. [E-press]
AI Communications, vol. 9 (1996), pp. 214-224.

[mb-Sandewall-94] Erik Sandewall.
Features and Fluents: The Representation of Knowledge about Dynamical Systems.
Oxford University Press, 1994.

From: Michael Thielscher on 28.1.1998

Dear Erik,

On 26.1., you wrote:

  What happens in the approach of your KR-96 paper for the case of "backwards in time", such as the stolen car scenario?

Re the potato in tailpipe scenario, see also my question number 2 to Wolfgang Bibel in yesterday's newsletter.

The approach is not restricted to projection, so observations may very well give cause to revising one's beliefs about the qualifications of an action in the past. As for the Stolen Car scenario: the only abnormality I consider in the paper is that of being unable to perform an action, in which case none of its effects materializes. Your Stolen Car scenario requires considering abnormalities as to the surprising production of a single effect (or the failure to produce an expected effect). However, I can give you a straightforward formalization of your example in the Fluent Calculus, including default rules, along the lines of my KR paper. The resulting axiomatization supports precisely the intended conclusions which you mentioned. My approach also works with non-deterministic actions, so if an action has the (nondeterministic) effect that the tail pipe of either of two cars A and B gets clogged, then two preferred models result, one of which denies that we can start car B -- as intended.

Michael

From: Michael Gelfond on 29.1.1998

I would like to better understand the following comment by Hector Geffner:

  I believe they (models of actions in AI) are all monotonic in the set of observations. In other words, if they predict  F  at time  i , nothing that they observe is going to affect that prediction.

If I understood Hector correctly, the following may be a counterexample. Consider the following domain description D0 in the language  L  from [j-jlp-31-201].

The language of D0 contains names for two actions,  A  and  B , and two fluents,  F  and  P . D0 consists of two causal laws and two statements describing the initial situation  S0 :
    A causes F if P.   
    B causes neg(P).   
    true_at(P, S0).   
    true_at(neg(F), S0).   
The first statement says that  F  will be true after execution of  A  in any situation in which  P  is true. The third one means that  P  is true in the initial situation  S0 .  neg(P)  stands for the negation of  P .

(Domain descriptions in  L  allow two other types of statements:  occurs(A, S)  - action  A  occurred at situation  S  - and  S1 < S2 . We use them later.)

Here we are interested in queries of the type
    holds(F,  [A1, ..., An] )   
which can be read as ``If sequence  A1, ..., An  were executed starting in the current situation, then fluent  F  would be true afterwards''. This seems to correspond to Hector's prediction of  F  at time  i . (We can also ask about occurrences of actions, truth of fluents in actual situations, etc.)

The entailment relation on  L  between domain descriptions and queries formalizes the following informal assumptions:

As expected, we have that
    D0 entails holds(F,  [A] )   
Now assume that the agent observed (or performed)  B . This will be recorded in his description of the domain. The new domain description D1 is D0 plus the statement
    occurs(B, S0).   
Now we have that D1 entails  neg(holds(F,  [A] )) . It seems to me that the observation changed the prediction.
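For readers who want to trace the counterexample mechanically, here is a small sketch in which the causal laws of D0 are compiled into a transition function and the recorded occurrences are replayed before evaluating the hypothetical query. The representation and names are hypothetical, not the actual semantics of  L .

```python
# Hypothetical rendering of the D0/D1 example. States are sets of the
# fluents that hold; causal laws become a transition function, and the
# recorded past (occurs statements) is replayed before the query.

def step(state, action):
    """Apply the causal laws of D0 to one state."""
    state = set(state)
    if action == "A" and "P" in state:  # A causes F if P
        state.add("F")
    if action == "B":                   # B causes neg(P)
        state.discard("P")
    return frozenset(state)

def holds(fluent, hypothetical, initial, past=()):
    """Evaluate holds(fluent, hypothetical) after replaying the past."""
    current = frozenset(initial)
    for act in past:          # recorded occurrences, e.g. occurs(B, S0)
        current = step(current, act)
    for act in hypothetical:  # the query's action sequence
        current = step(current, act)
    return fluent in current

S0 = {"P"}  # P true and F false in the initial situation
assert holds("F", ["A"], S0)                  # D0 entails holds(F, [A])
assert not holds("F", ["A"], S0, past=["B"])  # D1 entails its negation
```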

The second example shows how observations can change beliefs about the past. Consider a domain description D3
    A causes neg(F).   
    F at S0.   
This description entails  neg(occurs(A, S0)) . Now the reasoner observed that in some situation  S1 ,  F  is false. This is recorded by adding to D3
    S0 < S1   
    neg(F) at S1.   
The new description entails  occurs(A, S0) . Again, observations changed the belief (this time about the past).
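The second example can likewise be traced by abduction over possible histories. This is a deliberately simplified rendering, not the semantics of  L : minimization of occurrences is replaced here by plain consistency checking against the observation at  S1 , and all names are hypothetical.

```python
# Hypothetical rendering of the D3 example: enumerate the possible
# histories (did A occur at S0?) and keep those consistent with the
# observation of F at the later situation S1, under "A causes neg(F)"
# plus inertia for F.

def abduce_occurrences(initial_F, observed_F_at_S1):
    """Return the list of histories (did A occur at S0?) consistent
    with observing F's truth value at the later situation S1."""
    consistent = []
    for a_occurred in (False, True):
        # A causes neg(F); otherwise F persists by inertia.
        f_at_s1 = False if a_occurred else initial_F
        if f_at_s1 == observed_F_at_S1:
            consistent.append(a_occurred)
    return consistent

# If F is still observed true at S1, A cannot have occurred.
assert abduce_occurrences(initial_F=True, observed_F_at_S1=True) == [False]
# Observing neg(F) at S1 changes the belief about the past: A occurred.
assert abduce_occurrences(initial_F=True, observed_F_at_S1=False) == [True]
```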

Hector, is this really a counterexample, or did you mean something else?

References:

[j-jlp-31-201] Chitta Baral, Michael Gelfond, and Alessandro Provetti.
Representing Actions: Laws, Observations and Hypotheses.
Journal of Logic Programming, vol. 31 (1997), pp. 201-244.

From: Luís Moniz Pereira on 29.1.1998

Dear Erik,

I noticed in the discussion that you said:
  From the point of view of diagnostic reasoning these are familiar problems, but I can't think of any work in mainstream actions and change that has addressed nonmonotonicity with respect to observations in a serious way.

I have tackled the issue of nonmonotonicity with respect to observations. Cf. my home page and the AAAI-96, ECAI-96, AIMSA-96, LPKR97, JANCL97, and AI&MATH98 papers. Using an LP approach, I perform abduction to explain observations. The abductive explanations may be: non-inertiality of some fluent with respect to some action; occurrence of some erstwhile unsuspected foreign concurrent action along with some action of mine; or opting for a definite initial state of the world, up till then given only by a disjunction of possibilities.

You're right, the techniques that I and my co-author, Renwei Li, use were first developed by me and others in the context of diagnosis using LP! In fact, we haven't yet used them all in actions. For a view of LP and diagnosis, as well as of representing actions in LP, see our book [mb-Alferes-96].

Best, Luís

References:

[mb-Alferes-96] José Júlio Alferes and Luís Moniz Pereira.
Reasoning with Logic Programs.
Springer Verlag, 1996.

From: Hector Geffner on 30.1.1998

This is in reply to Michael's message.

As I see it, in the "basic" approaches to the frame problem (Reiter's completion, Sandewall's form of chronological minimization, the language  L , etc.), in one way or another, action rules are compiled into transition functions of the form  f(a, s)  - where  s  is a state and  a  is an action - that describe the set of possible state trajectories. Observations in such models just prune some of the trajectories and hence have a "monotonic" effect (i.e., predictions are not retracted; at most they are made inconsistent).
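This notion of monotonicity in the observations can be illustrated concretely: once the laws are compiled into a transition function, observations only filter the set of trajectories, so anything predicted before an observation is added remains predicted afterwards. All names and the toy transition function below are illustrative assumptions.

```python
# Illustrative sketch: laws compiled into a transition function f(a, s);
# observations only prune trajectories, so predictions are monotone in
# the set of observations. (All names are hypothetical.)

def f(a, s):
    """Deterministic transition: action "A" makes fluent F true,
    any other action makes F false."""
    return s | {"F"} if a == "A" else s - {"F"}

def trajectories(initial_states, actions):
    """Generate all state trajectories from the possible initial states."""
    for s0 in initial_states:
        traj = [frozenset(s0)]
        for a in actions:
            traj.append(frozenset(f(a, traj[-1])))
        yield tuple(traj)

def predicts(fluent, i, initial_states, actions, observations=()):
    """True iff fluent holds at step i in every trajectory consistent
    with the observations, given as (step, fluent-that-holds) pairs."""
    consistent = [t for t in trajectories(initial_states, actions)
                  if all(fl in t[step] for step, fl in observations)]
    return all(fluent in t[i] for t in consistent)

inits = [set(), {"G"}]
# F is predicted at step 1 after doing A ...
assert predicts("F", 1, inits, ["A"])
# ... and adding an observation never retracts that prediction.
assert predicts("F", 1, inits, ["A"], observations=[(0, "G")])
```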

In other models, Michael's and Luís's included, actions are represented in the state (in one way or another), and hence abduction to both fluents and actions is supported. Such models are non-monotonic in the set of observations. Actually, the only apparent difference in such models between actions and fluents is that the former are assumed to be "false by default" while the latter are assumed "to persist by default".

Interestingly, there seems to be another difference between actions and fluents in those models, namely that actions, unlike fluents, are not allowed to appear in the heads of rules. Actually, I'm not sure whether this is the case in all such models, but it is true at least in my AAAI-96 paper. The reason I excluded actions from rule heads was precisely "to prevent actions from having an effect on beliefs about the past" (just think of a rule like "hungry then eat"). Again, "the causality principle".

I wonder now why the same constraint is in force in most other models of this type (if that is indeed so).

Any ideas?

Hector Geffner


Edited by Erik Sandewall, Linköping University, Sweden. E-mail ejs@ida.liu.se.