
Lab 1: Modeling classical planning domains

Read the lab introduction page before you start!
The introduction contains important information that is common to all labs.

Please use the troubleshooting page if you encounter any problems.

Updates

Any updates or changes we make in the instructions for this lab during the course will be announced here.

Introduction

The aim of the first two labs is:

  • To practice modelling planning problems using a formal language that can be understood by an automated planner. This is useful in itself, but the intention is also to provide a feeling for how problems are structured, which in turn helps you understand planning algorithms and heuristics.

  • To give a feeling for the capabilities and limitations of today's domain-independent planning technology. How fast might a planner be? What problem sizes can we solve in a reasonable amount of time?

You will do this by using a classical fragment of PDDL to model a logistics domain for UAVs (unmanned aerial vehicles, typically helicopters). This domain will initially be quite simple and will later be gradually extended, in this lab as well as future labs. You will test your PDDL domain specifications on a number of problem instances using a set of well-known classical planners developed by various research groups around the world.

Because we want to be able to test using a wide variety of planners, we will focus on the STRIPS level of expressivity in PDDL. In other words, we will not yet begin to use negative preconditions, negative goals, conditional effects, or similar extensions above the baseline level of expressivity. Many planners do support some extensions, but few planners support exactly the same ones, and they often have limitations or bugs.

To avoid trouble, use the STRIPS level of expressivity at least in the beginning. This will be sufficient for all the modeling tasks in labs 1-3.
Note that we are talking about avoiding negative preconditions and goals. Negative effects are always allowed and are in fact required for almost all ordinary planning problems. Also, using :typing should be OK and will eventually be required; it's changes to the permitted formula types that we want to avoid.

Most likely, you will start working with this lab before we go through the different planning techniques that are used in current classical planners. If this is the case, detailed descriptions of certain planners may not make much sense. But don't worry: The main point of this lab is to understand how planning domains are modeled and how current planners perform.

Writing a Report

After you finish all parts of this lab, you will hand in the domain and problem files you generate, as well as a short(ish) report on your results as discussed in each exercise below. There is no minimum (or maximum) page count for this report, and you do not need to write a long introduction or be extremely formal. Simply write a clear and comprehensible report on the results we ask for. You can include (all or part of) the output from running certain planners if this is helpful in explaining your results. Any suggestions you may have for improving the lab or the instructions are also very welcome – we want your feedback!

The report can be handed in via your Git repository as a plain text file, a PDF document, or a Markdown document.

Throughout the instructions, we will show what you need to include in the report.

Getting started

Before you start, make sure you have received an email with a link to a Gitlab project for your Webreg group. Navigate to the repository, clone it, and do all your development within that repository. Don't forget to commit your changes and push them to Gitlab regularly.

Do not create your own Git repository for this course. You need to do your development in the repository we provide for you.

If you are unfamiliar with Git and Gitlab, there are introductions in Swedish (TDDE23) as well as plenty of tutorials in English on the web. You can also ask the lab assistant if you are stuck.

Lab 1.1: Emergency Services Logistics, Initial Version

In the emergency services logistics domain, a number of persons at known locations have been injured. Your task is to generate plans for delivering crates containing emergency supplies to each person using UAVs.

Modeling the initial Emergency Services Domain

The first step will be to model an initial version of the Emergency Services Domain, as described in the bullet list below. You may want to test the domain as you are developing it; see the next subsection below for information about this.

Reminder: Avoid using negative preconditions (and goals), which some planners do not handle correctly, or you may run into trouble later (but do use negative effects). Avoid other extensions as well. Further motivation: Earlier on this page.

We strongly recommend the use of Visual Studio Code with a PDDL plugin; see instructions in the introduction.

  • In the domain you model, each injured person (type person) should always be at a specific location (type location). Please use these exact names for your types, as they will also be used by a problem instance generator that we will introduce later in the lab.

  • Each crate (type crate) is at a specific location and has a specific type of content such as food or medicine.

    It might be tempting to introduce predicates such as (is-food-crate ?c) and (is-medicine-crate ?c) to model the contents of crates. But then the domain would have to be changed whenever a new crate type is introduced (water, different types of medicine, ...). Instead you must deal with crate contents in a generic way, so that new contents can easily be introduced in the problem instance. That is, you should be able to ask the question "does crate X have contents of type Y?" – using objects of type contents.

  • Each person either has or does not have a crate with a specific kind of content. That is, you must keep track of whether they have food or not, whether they have medicine or not, and so on.

  • There can be more than one person at any given location, and therefore it isn't sufficient to keep track of where a crate is in order to know which people have been given crates! You must instead associate the crate with the actual person.

  • Initially all crates are located at a single location that we may call the depot. There are no injured people at the depot.

  • The goal should be that certain people have crates with certain contents. Some people might not need crates, some people might need both food and medicine, and so on. This means that you have to deliver some or all of the crates at the depot. Note that people don't care exactly which crate they get, so the goal should not be (for example) that person1 has crate5, merely that person1 has a crate of food.

  • A single UAV (type uav), initially located at the depot, is available to deliver crates. It can fly directly between arbitrary locations (there is no "roadmap" with specific connections that must be used). Since we want to be able to expand this domain for multiple UAVs in the future, we use a separate type for UAVs, which currently happens to have a single member in all your problem instances.

  • The UAV can pick up a single crate, if it is at the same location as the crate. It can then fly to another location with the crate. Finally, it can deliver a crate to a specific person who is at the same location. Doing this will result in three distinct actions in a plan.

Testing the initial Emergency Services Domain

You need to test your domain to make sure that it can be used to generate plans in the expected manner. For this purpose, you should also manually construct two small problem instances: One with only a single person and a single crate, and one with two people and three crates. Results should be handed in, so please follow the instructions as given here.

Make sure you have read the instructions about running planners.

Then, run the domain and small problem instances through the FF planner and preferably also the "parser" tool to ensure they are correctly specified. Make sure you get a reasonable plan from FF. (The reason we use FF here is that its parser is generally quite robust and gives reasonable error messages. It doesn't catch everything, though.)

Some planners are very picky about getting the syntax right and will crash in strange ways, or misinterpret the problem specification, if you get it wrong. Please use the troubleshooting page if you encounter any problems.

Reporting your results

Begin writing your report. For lab 1.1, please tell us whether you had any specific difficulties in modeling the problem domain or getting FF to generate a plan.

When you have done this and verified that the domain works properly, you may want to ask a lab assistant to take a look at your domain to verify that there isn't anything obviously wrong or inefficient in the model you developed. This can also be done by submitting a Gitlab issue in the question project.

If you don't discuss your domain with your lab assistant, you may later find that you spend twice as much time as necessary on future labs because your model has a flaw or is unnecessarily complicated! (But of course you should continue working with the remaining tasks until you get an answer from the assistant.)

Lab 1.2: Early State Of The Art Planners

Now, as part of a small demonstration of how planning performance has changed over the years, we'll move a little bit backwards in time and use the planner IPP, Interference Progression Planner. This planner won the more expressive ADL track (but not the less expressive STRIPS track) of the 1998 International Planning Competition, and while it is thoroughly obsolete by now, some of the underlying ideas live on in Relaxed Planning Graph heuristics. These heuristics are still very useful and will be covered to some extent during the lectures.

You should do the following, and discuss your findings in the report. Make sure that you clearly show exactly what you did to get the results that you are reporting, preferably by specifying exact command lines and file names used. Fuzzy specifications such as "the one with 8 crates" or "the first two problems" are less useful.

Step 1: Testing using small problems

Run the two small problems you created earlier through IPP according to the IPP instructions. (Note that IPP has some special requirements for type specifications, as noted in the IPP instructions linked above.)

In your report, tell us approximately how much time IPP requires for these small problems.

Step 2: Generating larger problem instances

You generated the first couple of problem instances manually, in order to better understand what is needed. We will now need to generate much larger problem instances. To this end, we strongly recommend that you complete and extend a problem generator skeleton written in Python, available in your Git repository. This generator will become more and more useful in later labs, when it can for example help you generate correct and consistent distances between all pairs of locations.

The skeleton is deliberately incomplete. This is necessary because providing a full problem generator would also lead you directly towards a particular formulation of the problem domain. The intention is instead that you come up with your own problem domain and then adapt the problem generation to your own model.

To use the generator, you finish the implementation and then run:

python3 .../generate.py --uavs 1 --carriers 0 --locations 50 --persons 3 --crates 101 --goals 4 

...if you want (for example) 1 UAV, no carriers, 50 locations, 3 people, 101 crates and 4 actual goals (deliveries to be made using the crates). This is just an example; begin with smaller problems!

(Note that carriers will be used in lab 2, but are not used here.)
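To give a feeling for the overall shape of such a generator (without dictating your formulation), here is a minimal sketch of its core. The predicate names used here (at, contains, has) and the domain name are illustrative assumptions only – rename everything to match your own domain, and adapt the idea to the skeleton you were given rather than starting from this.

```python
def generate_problem(uavs=1, locations=5, persons=2, crates=3, goals=2,
                     contents=("food", "medicine")):
    """Emit a small PDDL problem instance as a string.

    The predicate names below (at, contains, has) are placeholders:
    rename them to match your own domain formulation.
    """
    objects = (
        [f"uav{i} - uav" for i in range(uavs)]
        + [f"loc{i} - location" for i in range(locations)]
        + [f"person{i} - person" for i in range(persons)]
        + [f"crate{i} - crate" for i in range(crates)]
        + [f"{c} - contents" for c in contents]
    )
    # All crates and UAVs start at the depot (loc0); people are elsewhere.
    init = [f"(at uav{i} loc0)" for i in range(uavs)]
    init += [f"(at crate{i} loc0)" for i in range(crates)]
    init += [f"(contains crate{i} {contents[i % len(contents)]})"
             for i in range(crates)]
    init += [f"(at person{i} loc{1 + i % (locations - 1)})"
             for i in range(persons)]
    # Goals: certain people should have crates with certain contents.
    goal = [f"(has person{i % persons} {contents[(i // persons) % len(contents)]})"
            for i in range(goals)]
    return ("(define (problem generated) (:domain emergency)\n"
            f"  (:objects {' '.join(objects)})\n"
            f"  (:init {' '.join(init)})\n"
            f"  (:goal (and {' '.join(goal)})))\n")
```

Printing the result of generate_problem() and redirecting it to a file gives you something you can immediately run through FF to check that the generated shape matches your domain.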

Test the generated problem instances using FF as before, as well as using IPP. (FF might provide better error messages than IPP.) Again, please use the troubleshooting page if you encounter any problems.

Step 3: Continue performance tests using IPP

Now, continue creating larger problems and running them through IPP.

How large problems can IPP solve within approximately one minute? What happens when you increase the number of crates that are not necessary for the problem? What happens when you increase the number of people that need crates? Discuss your findings briefly in your report for lab 1.

Note that this is an informal analysis that will mostly be useful in order to compare the approximate performance of older planners, such as IPP, with the performance of newer planners that we will test in the next exercise. We can gain some insights from this kind of analysis, but it will never go very deeply – so don't spend a lot of time on it. For example, you don't have to try to hit the exact 1 minute time limit; it's OK if planning requires 40 seconds or 2 minutes.

Lab 1.3: Newer Planners

At this point we'll return to "modern" days and consider a number of sequential satisficing (non-optimal) planners from the International Planning Competitions in 2011, 2014, and 2018. (The next full competition is taking place in 2023, and the results will be out in July. Right now we don't have access to those planners, unfortunately.)

Planners, time limits and plan quality

Planning problems often have a very large number of solutions. Which solution will a planner return, and when, and why? You need to understand this in order to properly compare the solutions generated by different planners.

  • Some planners (such as FF and IPP) will search for one solution and then immediately return it.

    This is particularly common for older planners, where simply finding any solution was considered hard enough – though heuristics and other techniques are typically used to try to find a reasonably good plan as well. In any case, once a solution is found, it is considered good enough and there will be no attempt to find a better plan.

  • Some planners are anytime algorithms in the sense that they will iteratively search for better and better solutions. Each solution is written to a solution file as it is found, and may then be superseded by a better solution.

    Some of these planners are directly adapted to the competition rules that have been used for sequential satisficing planning for over a decade, where planners are rewarded for the highest plan quality generated within a time limit of 30 minutes. For example, they might be preconfigured to continue searching for 30 minutes before giving up. They might also be tuned with the assumption that it doesn't matter if they quickly find an initial solution, since they will be running for 30 minutes in any case.

    One consequence is that a planner might take a long time to generate a plan even though it could easily be tuned to generate it quickly, simply because it is now tuned for the competition environment.

    Another consequence is that a planner might continue working for a long time even though a good solution was generated in the first 5 seconds, because it might be able to find an even better solution and there is no penalty for spending more time.

    The output of an anytime planner may end with "No solution found" even though it found some solutions earlier; this really means "no better solution found within the time limit". The safest way of finding the solutions is therefore to look at the generated output files.

  • Some planners are tuned for the "agile" track of the competition, where plan quality is ignored entirely and the reward depends only on how quickly a plan is generated. Such planners show the opposite behavior: they try to find some solution as fast as possible, even if that solution turns out to be of poor quality.

  • Some planners (explored later) compete in the "optimal" part of the competition, where returning a non-optimal plan can lead to disqualification. These planners might still iteratively generate solutions, but will never return any solutions unless they have been conclusively proven to be optimal.

Note that some of the planners also generate a great deal of output between plans, which means you may want to pipe the output through a tool such as "less" to have a chance to see the plans that are generated. This is discussed in more detail in the planner instructions.

You can also run commands such as "watch -n 1 ls -l" in another terminal, to view the contents of the current directory every second. This can help you keep track of new files being generated.

Comparing planners in your emergency services domain

You should now do the following, in order to get a general idea of planner performance:

  1. Run the largest problem you created earlier, for use in IPP, through a number of different planners as discussed below.

  2. Determine approximately how much time it takes for these planners to generate the first solution. Remember that many planners generate multiple solutions, so to avoid wasting time, try to keep track of when the first one is created.

  3. Continue creating larger problem instances and running them through the planners, until you reach problem instances that require approximately one minute to solve. It's fine if they only require 45 seconds or if they require 1.5 minutes.

  4. Report your results in terms of both time taken for the old problem instances and how large problem instances the new planners can solve within the time limit.

    Like before, the intention is for you to get a general idea of how much time a planner might take and how large problems it can solve. It is NOT important to have very precise measurements or to do a very deep analysis.

    Note that there is a reason why the section header says "in your emergency services domain". Planners and heuristics are often somewhat dependent on exactly how your domain is formulated, and the fastest planner in your domain might not be the same as the fastest planner in another course participant's domain!
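To make step 2 above less tedious, you can let a small script report the moment each new solution file appears instead of watching the directory by hand. This is only a convenience sketch; the "myoutput*" pattern matches the output prefix used in the example commands on this page, so adjust it to whatever your planner actually writes.

```python
import glob
import time

def new_plan_files(pattern, seen):
    """Return files matching pattern that were not seen before (updates seen)."""
    found = [p for p in sorted(glob.glob(pattern)) if p not in seen]
    seen.update(found)
    return found

def watch(pattern="myoutput*", duration=120, interval=1.0):
    """Poll the current directory, printing elapsed time as plan files appear."""
    seen, start = set(), time.time()
    while time.time() - start < duration:
        for path in new_plan_files(pattern, seen):
            print(f"{time.time() - start:6.1f}s  new plan file: {path}")
        time.sleep(interval)
```

Run watch() in a Python session in the planner's working directory while the planner runs in another terminal; the printed elapsed times tell you when the first (and each subsequent) solution was written.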

For this exercise we want you to use the following planners. While we briefly describe how the planners work, you are not expected to understand the details – that generally requires longer description and more background knowledge than you have at this point. We provide links to scientific papers in case you are particularly interested; reading them is not part of the course.

  • /courses/TDDD48/planners/ipc2011/seq-sat/seq-sat-lama-2011-quick/plan: Based on lama-2011, a heuristic search planner that uses forward search in the state space together with a heuristic based on landmarks.

    The original planner was the winner of the sequential satisficing competition track in 2011. This variation has been modified to focus more on finding a plan quickly, stopping after a certain stage rather than continuing to find even better plans.

  • /courses/TDDD48/planners/ipc2011/seq-sat/seq-sat-madagascar-p/plan: madagascar-p (ipc-2011).

    This planner transforms a planning problem into a boolean satisfiability problem (consisting of a set of variables and boolean formulas), uses a satisfiability checker to generate a solution to that transformed problem, and transforms the solution back into a plan.

    The transformation needs to be done repeatedly to check for solutions with at most a certain number of actions; each time, the planner prints a new "Horizon".

  • One of the following planners. If you want to know more about these planners, you can read parts of the papers that describe them (links can be found in the planner instructions).

    • /courses/TDDD48/planners/ipc2014/seq-sat/seq-sat-yahsp3-first/plan: YAHSP3 (ipc-2014, modified to only give you the first plan found; seq-sat-yahsp3-anytime continues to try to find better plans).

      YAHSP3 (Yet Another Heuristic Search Planner 3) uses forward search in the state space, with a heuristic based on the use of relaxed plans (in a way that we cannot easily describe in a few sentences).

    • /courses/TDDD48/planners/ipc2014/seq-sat/seq-sat-jasper/plan:
      Jasper from IPC-2014. Can sometimes crash after finding the first plan, or might continue to find better plans.

      Jasper builds on LAMA-2011, but uses an alternative greedy search algorithm and implements a technique for post-processing plans to improve plan quality once a solution has been found.

    • /courses/TDDD48/planners/ipc2011/seq-sat/seq-sat-arvand/plan:
      Arvand. This planner uses heuristic search together with Monte Carlo random walks. Essentially, using a heuristic function for guidance can be very beneficial but can also lead to problems in parts of the search space where the heuristic function is not sufficiently informative, which occurs in many planning domains. Rather than exhaustively searching for a way out of such parts of the search space, one can use random walks for exploration.

    • /courses/TDDD48/planners/ipc2011/seq-sat/seq-sat-forkuniform/plan:
      Forkuniform.

  • At least one planner from ipc-2018. These planners were installed in 2022 (somewhat delayed due to their reliance on the Singularity containerization system). They are therefore less well tested than the other planners, but still contribute to this overview of planner performance!

    To run one of the newer ipc-2018 planners:

    cd [folder where your pddl files are]
    singularity run -C -H "$(pwd)" /courses/TDDD48/planners/ipc2018/seq-sat/[name of planner image] domain.pddl problem.pddl myoutput

    Use ls /courses/TDDD48/planners/ipc2018/seq-sat to see which planners exist. Test one or more planners to see if they work for you; if not, choose another planner.

    For example, if you choose the freelunch-madagascar planner:

    cd /home/liuid123/tddd48/repo/lab1/lab1.1
    singularity run -C -H "$(pwd)" /courses/TDDD48/planners/ipc2018/seq-sat/freelunch-madagascar.img domain.pddl problem.pddl myoutput

    The "$(pwd)" is there because Singularity needs to mount a folder where it can read your domain and problem files, and this needs to be specified as an absolute path, so "." won't do.

    Note that planners can create temporary files in that directory. Please try to avoid adding those to your Git repo.

Lab 1.4: Theory – State Space and Branching Factor

In this part you will familiarise yourselves with how large planning problems can become and why they grow so quickly. You will do this by calculating the size of a state space, by calculating the size of a restricted state space where only reachable states are considered, and by finding the highest possible branching factor.

Solve all problems and include the answers in your report.

The intention behind this part of the lab is to make you think a bit more deeply about some aspects of planning problems, to gain a level of understanding that will help you later – during the exam. Don't worry if you are uncertain about the answer; then you can hand in your best attempt and show your reasoning, and we'll have a discussion / provide some tips.

Related issues are discussed in the lecture discussing the state space and forward state space search. If you have not yet attended that lecture, you may choose to start working with Lab 2 first.
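Before tackling the questions, it may help to recall the basic counting principle: a state assigns a truth value to every ground atom, so n ground atoms give 2^n states. The toy blocks-world check below illustrates the principle on numbers that are not part of any question here:

```python
from itertools import product

# Toy example (not one of the questions below): a single untyped predicate
# (on ?a ?b) with 2 blocks gives 2 * 2 = 4 ground atoms. Each atom is
# independently true or false in a state, so the unrestricted state
# space contains 2**4 = 16 states.
blocks = ["b1", "b2"]
ground_atoms = [("on", a, b) for a, b in product(blocks, blocks)]

num_states = 2 ** len(ground_atoms)
print(len(ground_atoms), num_states)  # 4 16
```

The same principle carries over to the questions below: first count the ground atoms, then exponentiate.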

  1. Consider a planning problem where the domain consists of the following untyped predicates:

    ;; "Ordinary" predicates
    (at ?uav ?loc), (has-photo-of ?t), (photo-possible ?loc ?t),
    ;; Type predicates
    (uav ?uav), (location ?loc), (target ?t)

    Assume the problem instance contains the objects uav1, loc1, loc2, target1 and target2. How many states exist in the corresponding state space, including both "sensible" states and "meaningless" ones, and including all possible and "impossible" instantiations of type predicates? Note that we are not talking about states that are reachable from some other states; we are talking about all states in the state space.

  2. Generalize your answer to the case where there are U UAVs, T targets and L locations (O=U+T+L objects in total).

  3. Assume instead that the domain and problem instance are typed, with the obvious interpretation of names (?t can only have the values target1 and target2). For example, this means that (photo-possible loc1 uav1) cannot be true, because uav1 is not a target. We also remove the type predicates, retaining the following ones:

    (at ?uav ?loc), (has-photo-of ?t), (photo-possible ?loc ?t)

    How many states will exist in this case, first for the small problem instance and then in the general case with U UAVs, T targets and L locations? Motivate your answer!

  4. Assume that the following operators exist:

    (:action move
      :parameters (?uav ?from ?to)
      :precondition (and (uav ?uav) (location ?from) (location ?to)
                         (at ?uav ?from))
      :effect (and (not (at ?uav ?from)) (at ?uav ?to)))

    (:action take-photo
      :parameters (?uav ?loc ?t)
      :precondition (and (uav ?uav) (location ?loc) (target ?t) (at ?uav ?loc)
                         (photo-possible ?loc ?t))
      :effect (has-photo-of ?t))

    The branching factor of a node n is the number of successors of n.

    When we do forward search in the state space, each successor to a node is created by applying an applicable action to the node (which is a state). Strictly speaking, more than one action could lead to the same successor – that could be the case for the take-photo action above, where the ?uav parameter is irrelevant when we compute the successor state. However, that will not be considered in this question. Instead, to simplify your task, we will consider how many unique actions might be applicable. This is then an upper bound on how many unique successors this might lead to, a value that we will not compute.

    Now suppose that we start in a physically meaningful state – for example, not a state where objects are in multiple locations at once or where (uav target1) is true. What is the highest possible number of unique applicable actions (action instances) in any node/state reachable from such a meaningful state, given the above-mentioned operators and predicates, assuming that there are U UAVs, T targets and L locations (still O=U+T+L objects in total)? You may make "optimistic" assumptions about photo-possible(...) facts in order to maximize the total number of possible instances of take-photo. Motivate your answer.

    Example: In the blocks world you could say that "if all N blocks are on the table, then I have N different actions I could choose to perform". This would be the maximum for any reasonable state. If you could have a state where all blocks are on all blocks at the same time (which is a state, but not a "reasonable" state), then you could have N*N alternative unstack actions that would all be executable in the same state... but we're not interested in those "strange" states right now.
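The blocks-world reasoning in the example above can also be checked mechanically. The sketch below uses a made-up set-of-facts state representation (it is not part of the exercise) to count applicable action instances in a "reasonable" state versus a "strange" one:

```python
from itertools import product

def applicable_pickups(state, blocks):
    """Count applicable pickup(?b) instances: ?b is clear and on the table."""
    return sum(1 for b in blocks
               if ("on-table", b) in state and ("clear", b) in state)

def applicable_unstacks(state, blocks):
    """Count applicable unstack(?a ?b) instances: ?a is on ?b and ?a is clear."""
    return sum(1 for a, b in product(blocks, blocks)
               if ("on", a, b) in state and ("clear", a) in state)

blocks = [f"b{i}" for i in range(4)]

# Reasonable state: all N = 4 blocks on the table and clear -> 4 pickups.
reasonable = ({("on-table", b) for b in blocks}
              | {("clear", b) for b in blocks})
print(applicable_pickups(reasonable, blocks))  # 4

# "Strange" state where every block is on every block (and clear):
# N * N = 16 unstack instances are applicable at once.
strange = ({("on", a, b) for a, b in product(blocks, blocks)}
           | {("clear", b) for b in blocks})
print(applicable_unstacks(strange, blocks))  # 16
```

Counting applicable ground instances parameter by parameter, as these functions do, is exactly the style of argument we expect in your motivation.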

Finishing and handing in the results

When you have finished the lab, you need to hand in your results. For this lab, no demonstration is necessary. Instead:

  • Make sure that your domain files are sufficiently commented, in order to explain how you have chosen to model the domain.

  • Make sure that your report covers the issues discussed above.

  • Make sure everything is checked in and pushed to the Gitlab project you were assigned: All domain files, problem files and problem generators you have created or modified, clearly named, as well as the report as a simple text file, Markdown file or PDF document. Do not use your "own" Gitlab projects for this.

  • Then add a submission in the hand-in issue tracker. There is no need to generate zip files; simply provide a link to your repo.

Your assistant may ask you questions in case additional information is required. Alternatively, your hand-in may be approved without a demonstration.


Page responsible: Jonas Kvarnström
Last updated: 2023-03-29