******************************************************************** ELECTRONIC NEWSLETTER ON REASONING ABOUT ACTIONS AND CHANGE Issue 98068 Editor: Erik Sandewall 14.9.1998 Back issues available at http://www.ida.liu.se/ext/etai/actions/njl/ ******************************************************************** ********* TODAY ********* Today's issue contains David Poole's answer to Uffe Kjaerulff's questions re his ETAI submitted article. --- ETAI JUNCTION - NEW WEBPAGE STRUCTURE FOR ETAI --- A new structure and layout for the ETAI web pages has been developed during the summer months. The new design is now sufficiently complete to be released, and it has just replaced its predecessor at http://www.ida.liu.se/ext/etai/ The main features of the new design are: - More systematic presentation of auxiliary information, such as information for authors, information for guests - A uniform structure, common for all the ETAI areas - More professional layout and graphics We hope you will appreciate this innovation, and as usual we welcome your comments about it. We wish to reserve the acronym ETAI for the **Transactions**, that is, for our journal publication. It is therefore not very appropriate to talk about "the ETAI webpage" since it suggests that ETAI would be first and foremost "just" a webpage. We will therefore use the term **the ETAI Junction** for the webpage structure that contains all the information pertaining to the ETAI as a journal as well as the review dicussions, colloquia, and other surrounding information. ********* ETAI PUBLICATIONS ********* --- DISCUSSION ABOUT RECEIVED ARTICLES --- The following debate contributions (questions, answers, or comments) have been received for articles that have been submitted to the ETAI and which are presently subject of discussion. To see the full context, for example, to see the question that a given answer refers to, or to see the article itself or its summary, please use the web-page version of this Newsletter. ======================================================== | AUTHOR: David Poole | TITLE: Decision Theory, the Situation Calculus and Conditional | Plans | PAPER: http://www.ida.liu.se/ext/epa/cis/1998/008/paper.ps | REVIEW: http://www.ida.liu.se/ext/etai/received/actions/008/aip.html ======================================================== -------------------------------------------------------- | FROM: David Poole -------------------------------------------------------- Uffe Kjaerulff writes: > The focus of the paper seems to be on representational issues, which > is fine. But, being a Bayesian-network person, I would find it > interesting to see how computations of expected utilities of plans are > performed in the ICL framework. This issue is only touched upon very > briefly in the paper, mentioning that it involves enumerating all > possible cases. In my understanding that amounts to constructing the > corresponding decision tree (which may be huge), and then computing > the probability and utility for each leaf of the tree. > > So, my question boils down to: Is there an efficient computational > scheme for ICL? Thanks for the question! First, ICL theories can be seen as representations of Bayesian networks; there is a local translation between groundings of the ICL theories and Bayesian networks. ICL theories (with only choices by nature) can be seen as first-order rule-based representations of Bayesian networks. Thus any Bayesian network algorithms can, in principle, be used for the ICL. There are two main computational advantages of the ICL approach. The first is that they can exploit logic-based reasoning techniques for probabilistic reasoning (although it turns out that once the link is realized, it isn't restricted to ICL theories). Secondly they give us an opportunity to exploit, not only the graphical independence of a Bayesian network, but also the contextual independence that i natural in the rule based representations. I will expand briefly on each of these here. The ICL (with only choices by nature) can be seen as a way to write first-order Bayesian networks as rules. There is a local translation between ICL theories and Bayesian networks. Note that a Bayesian network has nothing to say about how a node depends on its parents. The ICL lets us write the conditional probabilities as a set of rules. (This is often a more compact representation than conditional probability tables when there are contextual independencies). To translate an ICL theory into a Bayesian network: ground the theory (substituting ground terms for the free variables); the ground atoms are the nodes of the Bayesian network; the atoms in the bodes of the rules that imply a become the parents of a. There is a lot more structure that can be expressed by the rules than in the network (in particular the contextual independencies). Note that the acyclicity of the logic program ensures that the resulting network is acyclic and (even when the resulting network is infinite) there are only finitely many ancestors for any node (which means you only need a finite subset of the graph to answer any conditional probability statement). To translate a Bayesian network to an ICL theory, you write as rules how the parents of a influence a. The atomic choices are introduced as extra conditions on the rules to let us interpret the rules logically. For more details see (Poole 1993). The first way that the ICL can be used computationally is adapting the logic-based algorithms for probabilistic reasoning. In particular, the algorithms for consistency-based diagnosis (such as the focusing algorithms of de Kleer), based on the notions of conflicts, can be adapted to compute posterior probabilities in the ICL (and so for Bayesian networks). This works well when there are skewed probability distributions. This has been explored in (Poole 1996). Another approach is to exploit the structure that can be expressed in a Bayesian network as well as the rule structure of the ICL. This is ongoing work with Nevin Zhang. I'll first sketch how the efficient structured Bayesian network algorithms (such as clique-tree propagation and more query-oriented algorithms such as D'Ambrosio's SPI, Zhang and Poole's VE or Dechter's bucket elimination) can be seen as ways to exploit the factorization of the joint probability distribution. The basic operation is to sum out a (non-observed, non-query) variable. Algorithms such as clique-tree propagation are efficient because they can compile this operation into an secondary structure (the clique tree or join tree) that works by message passing in the join tree so that the posterior distribution for all variables can be obtained in twice the time it takes to compute the posterior on one. The basic operation of the message passing from one clique to another is to sum out the variable that appears in one clique that doesn't appear in the other. It turns out that you can carry out a similar operation on the rule base directly. We have been working on a rule-based variation of the query-oriented algorithms I call probabilistic partial evaluation (Poole 1997). Eliminating a variable turns out to be like resolution. When the rules give no structure beyond that of the Bayesian network (i.e., there is no contextual independence, and there is a rule for each assignment of variables to a parent), the algorithm reduces to the standard query-oriented Bayesian network algorithms. But when there is contextual independence (as there is in many ICL theories) the algorithm lets us exploit that. This also seems to be very promising for approximation, where different approximations can be used in different contexts (Poole 1998). Aside: One main difference between clique tree propagation and the query-oriented algorithms is that clique tree propagation acts on marginal distributions (each clique represents the marginal distribution of the variables in the clique), whereas the query-oriented algorithms act on conditional probabilities. Probabilistic partial evaluation relies on manipulating conditional probabilities. The contextual independencies get lost quickly when you reason with marginal distributions. Finally, there is also a promise of structured dynamic programming using the rules. The most comment methods for solving sequential decision problems is to do dynamic programming. Traditional dynamic programming explicitly reasons over state space.s and thus suffers from what Bellman calls the "curse of dimensionality". We should be able to do much better by reasoning in terms of the variables (or propositions) that define the state space and exploiting the rule-structure in dynamic programming. In the ICL, dynamic programming looks much like regression planning. We can either do this in POMDPs where we regress values through conditional plans (Boutilier and Poole 1996) or where we are building plans conditional on history (as in influence diagrams), and we can use the rule structure to determine states that all have the same utility (Poole 1995). Thus in answer to your question, this is an active area of research by myself and others who are looking at exploiting contextual independence for probabilistic inference and sequential decision making. One of the reasons I am optimistic that this will work is that it promises nice methods of approximation, where we can approximate differently in different contexts. We will have to see if we can deliver on this promise. References The following are references to some of my papers that give more details (the HTML version gives hyperlinks to the actual papers): C. Boutilier and D. Poole, ``Computing Optimal Policies for Partially Observable Decision Processes using Compact Representations'', Proc. Thirteenth National Conference on Artificial Intelligence (AAAI-96), Portland, Oregon, August 1996. D. Poole, ``Probabilistic Horn abduction and Bayesian networks'', Artificial Intelligence, 64(1), 81-129, 1993. D. Poole, ``Exploiting the Rule Structure for Decision Making within the Independent Choice Logic'', Proceedings of the Eleventh Conference on Uncertainty in AI, Montreal, August 1995, 454-463. D. Poole, ``Probabilistic conflicts in a search algorithm for estimating posterior probabilities in Bayesian networks'', Artificial Intelligence, 88, 69-100, 1996. D. Poole, ``Probabilistic Partial Evaluation: Exploiting rule structure in probabilistic inference'', Proc. Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), Nagoya, Japan, August 1997, pp. 1284-1291. D. Poole, ``Context-specific approximation in probabilistic inference'', Proc. Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Madison, Wisconsin, July 1998. N.L. Zhang and D. Poole, ``Exploiting Causal Independence in Bayesian Network Inference'', Journal of Artificial Intelligence Research, 5, 301-328, 1996. ******************************************************************** This Newsletter is issued whenever there is new news, and is sent by automatic E-mail and without charge to a list of subscribers. To obtain or change a subscription, please send mail to the editor, erisa@ida.liu.se. Contributions are welcomed to the same address. Instructions for contributors and other additional information is found at: http://www.ida.liu.se/ext/etai/actions/njl/ ********************************************************************