The WITAS System

The Image Processing Module

The hardware/software module dedicated to the execution of image processing operations is collectively known as the Image Processing Module (IPM). It consists of a general framework, in particular the IPAPI, in which a number of specific operations have been implemented. The IPM communicates with other parts of the WITAS system, such as the Dynamic Object Repository (DOR) and the Task Procedure Executor Module (TPEM).

Due to the constraints associated with scenarios of the type described, an important operational requirement of the IPM is that it is highly flexible and reconfigurable. The idea is that the UAV should be able to switch between different modes of operation, where each mode may require a different configuration of the IPM. For example, in one mode the UAV may hover and observe a particular road section. Triggered by some event, e.g., the observation of a particular vehicle, the UAV switches to a tracking mode where the observed vehicle is followed by moving both the helicopter and the camera. Additional flexibility is required in terms of using different implementations of the same type of operation for different purposes. For example, a relatively simple and fast method may be used for detecting moving ground objects. However, to estimate their true velocity, a more complex and time-consuming operation has to be used if this feature is vital to achieving a particular task.

Examples of operations used in the current applications are find vehicle and track vehicle. Finding a vehicle can be done in many different ways, using simple, fast, and unreliable methods such as detecting colored blobs on roads, or using complex, computationally demanding, and robust methods based on, e.g., estimation of motion and shape. Tracking, likewise, can be done using more or less sophisticated methods. Depending on the time available for processing and the requirements for robustness, the deliberative component in the architecture can choose different algorithms dynamically at runtime, or even modify existing algorithms by substituting different image-processing operators within them.
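The runtime selection described above can be pictured as a registry that the deliberative component queries with a time budget. The following Java sketch illustrates the idea; the class name, the example algorithms, and the cost/robustness figures are all invented for illustration and are not part of the actual IPM.

```java
import java.util.*;

// Illustrative sketch of runtime algorithm selection: the deliberative layer
// picks the most robust find-vehicle implementation that fits its time budget.
// Names and numbers below are assumptions, not the project's actual API.
public class AlgorithmRegistry {
    record Entry(String name, int costMs, int robustness) {}
    private final List<Entry> entries = new ArrayList<>();

    void register(String name, int costMs, int robustness) {
        entries.add(new Entry(name, costMs, robustness));
    }

    // Choose the most robust algorithm whose cost fits within the budget.
    String choose(int budgetMs) {
        return entries.stream()
                .filter(e -> e.costMs() <= budgetMs)
                .max(Comparator.comparingInt(Entry::robustness))
                .map(Entry::name)
                .orElseThrow(() -> new IllegalStateException("no algorithm fits budget"));
    }
}
```

With, say, a fast blob detector and a slower motion-and-shape method registered, a small budget selects the former and a generous budget the latter.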

The services provided by the IPM include the execution of image processing algorithms used for achieving tasks specified at the mission level. Since the goals and requirements at this level may change over time, one important aspect of the IPM is that it should allow the algorithms to be dynamically modified at runtime. Internally, the algorithms are represented using a Data Flow Graph (DFG) based model. Nodes in the graphs represent operations, and the arcs represent the data on which the operations are performed. In general, the graphs may be cyclic, since feedback loops are required for certain types of operations.
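The node-and-arc representation can be sketched in a few lines of Java. The toy evaluator below handles only acyclic graphs by pulling one token from each input arc; cyclic graphs with feedback loops would additionally need explicit token queues and a firing schedule. The class and method names are illustrative, not the actual IPAPI types.

```java
import java.util.*;

// Minimal sketch of a DFG node: an operation plus its input arcs.
// Homogeneous-DFG semantics: one token consumed per input, one produced.
public class DfgNode {
    interface Op { double apply(double[] in); }

    final String name;
    final Op op;
    final List<DfgNode> inputs = new ArrayList<>();

    DfgNode(String name, Op op) { this.name = name; this.op = op; }

    // Fire the node: recursively pull one token from each input, then
    // apply this node's operation. A source node simply has no inputs.
    double fire() {
        double[] in = new double[inputs.size()];
        for (int i = 0; i < inputs.size(); i++) in[i] = inputs.get(i).fire();
        return op.apply(in);
    }
}
```

A constant source is a node with no inputs whose operation ignores its (empty) input array; an addition node with two arcs then produces the sum of its sources' tokens.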

The design of the IPM is based on two variants of data flow graphs. The initial design uses homogeneous DFGs, where each node consumes one token at each of its inputs, and produces one token at each of its outputs, each time it is executed. Homogeneous DFGs are limited to static behavior, i.e., no data-dependent decisions can be made during execution. A later design is based on a newly developed computational model called the Image Processing Data Flow Graph (IP-DFG), which allows dynamic behavior and simplifies the implementation of certain image processing algorithms (typically those with iterative computations). An IP-DFG is based on a hierarchy of Boolean DFGs, which explicitly supports iteration and tail recursion, in contrast to other DFG variants where feedback loops must be used instead. The use of IP-DFGs results in graphs that are simpler to understand, and the problems of token accumulation and deadlock are easier to manage. In an IP-DFG, token consumption is made more flexible; for example, it is possible to specify that an input should contain the three latest images from an image sequence and that the node should execute every time a new image arrives.
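The flexible token consumption mentioned last, a sliding window over the most recent tokens, can be sketched as an input port that buffers the N latest arrivals. This is only an illustration of the concept under assumed names; it is not the IP-DFG implementation itself.

```java
import java.util.*;

// Sketch of windowed token consumption: the port keeps the N latest tokens
// (e.g., the three latest images), and the owning node can fire each time a
// new token arrives once the window is full. Illustrative names only.
public class WindowedPort<T> {
    private final Deque<T> window = new ArrayDeque<>();
    private final int size;

    WindowedPort(int size) { this.size = size; }

    // Add a new token; once the window is full, the oldest token is dropped.
    void put(T token) {
        if (window.size() == size) window.removeFirst();
        window.addLast(token);
    }

    boolean ready() { return window.size() == size; }     // enough history yet?
    List<T> tokens() { return new ArrayList<>(window); }  // oldest .. newest
}
```

A motion-estimation node, for instance, would declare a window of size three and execute whenever `ready()` holds after a new image is put on the port.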

Conceptually, the Image Processing Module (IPM) consists of two parts: the IPAPI, the framework in which image processing algorithms are defined, and the IPAPI-Runtime, which manages their execution.

IPAPI-Runtime is implemented in the Java programming language. The use of Java offers a number of advantages: rapid prototyping; support on a number of different hardware/software configurations; and access to an ever-growing number of APIs for various purposes, e.g., both CORBA support and a powerful toolkit for building graphical applications are available. Thanks to the recent JIT technology for Java, it is also reasonable to implement the actual data processing, i.e., the execution inside the nodes of the graphs, in Java. Certain nodes for which the execution time is critical, e.g., convolution, have also been implemented at the native level using JNI. In these cases the optimization has targeted processor-specific instruction sets for accelerating floating-point operations, e.g., AltiVec for PowerPC or SSE for Pentium. A result of this work is a growing library of nodes which implement various image processing operations, some of them in Java and some optimized at the native level.
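A common pattern for such dual Java/native nodes is to declare the native method via JNI and fall back to a portable Java implementation when the native library is absent. The sketch below illustrates this for a 1-D horizontal convolution; the library name, class name, and method signatures are assumptions, not the project's actual code.

```java
// Illustrative sketch of a convolution node with an optional JNI-accelerated
// path and a pure-Java fallback. "ipapi_native" is an assumed library name.
public class ConvolutionNode {
    private static boolean nativeAvailable;
    static {
        try {
            System.loadLibrary("ipapi_native"); // would hold SIMD-optimized code
            nativeAvailable = true;
        } catch (UnsatisfiedLinkError e) {
            nativeAvailable = false;            // fall back to the Java version
        }
    }

    // Native version; on the C side this could use AltiVec/SSE intrinsics.
    private static native float[] convolveNative(float[] img, int w, int h,
                                                 float[] kernel, int kw);

    // Portable Java version: 1-D horizontal convolution, zero-padded borders.
    static float[] convolveJava(float[] img, int w, int h, float[] kernel, int kw) {
        float[] out = new float[img.length];
        int r = kw / 2;
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                float acc = 0f;
                for (int k = 0; k < kw; k++) {
                    int xx = x + k - r;
                    if (xx >= 0 && xx < w) acc += img[y * w + xx] * kernel[k];
                }
                out[y * w + x] = acc;
            }
        return out;
    }

    // Dispatch to the fastest available implementation.
    public static float[] convolve(float[] img, int w, int h, float[] kernel, int kw) {
        return nativeAvailable ? convolveNative(img, w, h, kernel, kw)
                               : convolveJava(img, w, h, kernel, kw);
    }
}
```

Callers always use `convolve`, so the choice between the native and Java path stays invisible to the rest of the graph.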

In IPAPI, graphs can be defined as new node classes and instantiated as nodes in other graphs. This implies that relatively complex graphs, containing a large number of nodes, can be implemented without too much work. Furthermore, memory management, scheduling of execution, and data flow can be customized for various purposes. This flexibility in scheduling is lacking in existing systems of a similar type, such as AVS and Khoros.
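The graph-as-node idea boils down to a composite implementing the same interface as a primitive operation, so that a whole subgraph can be dropped into a larger graph. A minimal sketch, using `java.util.function.DoubleUnaryOperator` as a stand-in for the node interface (the actual IPAPI types are not shown here):

```java
import java.util.function.DoubleUnaryOperator;

// Sketch of hierarchical composition: a chain of operators wrapped behind the
// same interface as a single node, so it can itself be used inside another
// graph. Illustrative only; not the actual IPAPI classes.
public class CompositeNode implements DoubleUnaryOperator {
    private final DoubleUnaryOperator[] stages;

    CompositeNode(DoubleUnaryOperator... stages) { this.stages = stages; }

    // Feed the value through each stage in order.
    @Override
    public double applyAsDouble(double x) {
        for (DoubleUnaryOperator s : stages) x = s.applyAsDouble(x);
        return x;
    }
}
```

Because `CompositeNode` is itself a `DoubleUnaryOperator`, one composite can appear as a single stage inside another, mirroring the nesting of graphs described above.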

The IPM is responsible for all low-level image processing, much of which has to be done in real time. This real-time requirement implies both that the response times from this module have to be short enough to allow it to be included in control loops, e.g., for tracking of ground vehicles, and that sequences at normal video rate have to be managed. It should be emphasized that the system is not designed to manage the processing of a continuous video sequence, but instead the processing of short image bursts at varying rates, e.g., for motion estimation.

The two modules which the IPM communicates most closely with are the Task Procedure Executor Module (TPEM) and the Dynamic Object Repository (DOR). The TPEM plays a central role in the architecture and provides the means for smooth integration between the deliberative capabilities in the architecture and the reactive and control capabilities. It is also responsible for setting up and monitoring the execution of various low-level sub-tasks, e.g., the flight of the helicopter, control of the camera, and the processing of images. This is done by specifying task procedures. A task is a behavior intended to achieve a goal in a limited set of circumstances. A task procedure is the computational mechanism that generates this behavior. A task procedure is essentially event-driven and may often be viewed as executing an augmented automaton. It can open its own event channels to listen for events of interest such as those from the helicopter or camera controllers, and can call various services such as the IPM, the path planner, or the helicopter controller. A task procedure is implemented as a CORBA object and has its own interface definition.
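The "augmented automaton" view of a task procedure can be made concrete with a small state machine that carries some extra data beyond its pure state and reacts to incoming events. The states, event names, and fields below are invented for illustration; an actual task procedure would be a CORBA object with its own interface definition, as stated above.

```java
// Illustrative sketch of a task procedure as an augmented automaton:
// two states plus augmented data (the identity of the tracked vehicle),
// driven by events. Event and state names are assumptions.
public class TrackTaskProcedure {
    enum State { HOVER, TRACK }

    State state = State.HOVER;
    String trackedVehicle;   // "augmented" data beyond the pure automaton state

    // React to an event; transitions depend on the current state.
    void onEvent(String event, String arg) {
        switch (state) {
            case HOVER -> {
                if (event.equals("vehicleSighted")) {
                    trackedVehicle = arg;       // remember which vehicle to follow
                    state = State.TRACK;        // e.g., start camera/helicopter control
                }
            }
            case TRACK -> {
                if (event.equals("vehicleLost")) {
                    trackedVehicle = null;
                    state = State.HOVER;        // resume observing the road section
                }
            }
        }
    }
}
```

In the real system the transitions would additionally call services such as the IPM or the helicopter controller, and the events would arrive on the procedure's own event channels.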

The DOR is essentially a soft real-time database which keeps records of information about various objects, both static and dynamic, that the system needs to know about when achieving various tasks. This may include sighted ground vehicles during a traffic surveillance scenario, but may also include information about the helicopter itself and the camera. Information in the DOR is normally subject to variation over time, and one of the main tasks of the IPM is to provide the DOR with up-to-date information about various sensed objects. Information about objects rated as interesting is normally provided by task procedures communicating with the IPM and interfacing with deliberative capabilities. The interaction between the TPEM and the IPM may be viewed as a form of higher-level active vision where the context which determines image processing policy is represented implicitly in the DOR and interpreted by the TPEM via the use of other deliberative services.
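A DOR-style soft real-time store can be sketched as a map of time-stamped object records where readers ask for data only if it is still fresh. The record layout, freshness policy, and names below are assumptions made for illustration, not the DOR's actual schema.

```java
import java.util.*;

// Sketch of a soft real-time object store: records carry a timestamp, and a
// read succeeds only if the record was updated recently enough. Illustrative.
public class ObjectRepository {
    record Record(String objectId, double x, double y, long timestampMs) {}

    private final Map<String, Record> records = new HashMap<>();

    // The IPM (via task procedures) pushes up-to-date position estimates.
    void update(String id, double x, double y, long nowMs) {
        records.put(id, new Record(id, x, y, nowMs));
    }

    // Return the record only if it was updated within maxAgeMs of 'now';
    // stale entries are treated as unavailable rather than returned as truth.
    Optional<Record> getFresh(String id, long nowMs, long maxAgeMs) {
        Record r = records.get(id);
        return (r != null && nowMs - r.timestampMs() <= maxAgeMs)
                ? Optional.of(r) : Optional.empty();
    }
}
```

Treating staleness as absence is one simple way to capture the "soft real-time" character: consumers degrade gracefully instead of acting on outdated positions.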

The CORBA-based solution makes the system very flexible regarding the choice of hardware/software platforms for the various parts of the system, at the expense of only a relatively small overhead in communication latency. For example, during development of the software architecture, and even during runtime use, each of the three modules mentioned here can be run on the same or separate hardware platforms, using CORBA to manage the interface issues. In the case of hard real-time constraints, the IPM or DOR could be executed on a processor with a real-time OS, using CORBA real-time channels to ensure quality of service relative to event arrival and scheduling. Practical tests have shown that entire images can be communicated using CORBA, but this relies on using particular configurations of operating system and CORBA implementation (which support shared memory) and has not been used in the project.


The original idea and initial conceptualisation for a dynamically reconfigurable on-line image processing module in the UAV system architecture (the IPM) was proposed by Patrick Doherty. The technical specification and implementation of the IPM, and its integration with the UAV system architecture, are joint work by Patrick Doherty, Klas Nordberg, and other project members in the IDA and ISY (i.e. CS and EE) departments.

Specific image processing operations that have been implemented in the IPM are described on a separate page. This work was done mainly by Gunnar Farnebäck, Per-Erik Forssén, and Johan Wiklund.

The DOR and the TPEM, and the part of the architecture they represent, are described in the WITAS system architecture (pdf) document. They were designed jointly by Patrick Doherty, Patrik Haslum, and Fredrik Heintz.