Programming Environments Laboratory, IDA
Revisiting Register Allocation: Why and How? -
or, what does the NP-completeness proof of Chaitin et al. really prove?
Abstract: Register allocation is one of the most studied problems in compilation. It has been considered NP-complete since 1981, when Chaitin et al. modeled the problem of assigning temporary variables to k machine registers as the problem of coloring, with k colors, the interference graph associated with the variables. The fact that the interference graph can be arbitrary proves the NP-completeness of this formulation. However, this original proof does not really show where the complexity of register allocation comes from. Recently, the rediscovery that interference graphs of SSA programs can be colored in polynomial time raised the question: can we exploit SSA form to perform register allocation in polynomial time without contradicting Chaitin et al.'s NP-completeness result? To address this question and, more generally, the complexity of register allocation, we revisit Chaitin et al.'s proof to better identify the interactions between spilling (load/store insertion), coalescing/splitting (removal/insertion of moves between registers), critical edges (a property of the control-flow graph), and coloring (assignment to registers). In particular, we show that, in general (we will make clear when), it is easy to decide whether temporary variables can be assigned to k registers or whether some spilling is necessary. In other words, the real complexity does not come from the coloring itself (as a wrong interpretation of the proof of Chaitin et al. may suggest) but from the presence of critical edges and from the optimization of spilling and coalescing.
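As a small illustration of why coloring alone can be easy, consider the simplest special case: straight-line code, where each temporary is live over a single interval. The sketch below is not taken from the talk; the function names and the half-open (start, end) representation of live ranges are assumptions made for this example. For such interval-shaped live ranges, assigning registers greedily in order of start point succeeds exactly when k is at least the maximum number of simultaneously live temporaries, so deciding whether spilling is needed takes polynomial time.

```python
def max_live(ranges):
    """Largest number of live ranges that overlap at any program point.

    `ranges` maps a temporary's name to a half-open interval (start, end).
    """
    events = []
    for start, end in ranges.values():
        events.append((start, 1))   # range becomes live
        events.append((end, -1))    # range dies
    # Tuples sort so that, at the same position, deaths (-1) come before
    # births (+1), which matches the half-open interval convention.
    events.sort()
    live = best = 0
    for _, delta in events:
        live += delta
        best = max(best, live)
    return best


def color_intervals(ranges, k):
    """Greedily assign registers 0..k-1 in order of start point.

    Returns a dict name -> register, or None if k registers do not suffice
    (i.e. some spilling would be necessary).
    """
    assignment = {}
    active = []              # (end, name) for ranges currently in a register
    free = list(range(k))    # registers not currently in use
    for name, (start, end) in sorted(ranges.items(), key=lambda kv: kv[1][0]):
        # Release the registers of ranges that ended at or before `start`.
        still_active = []
        for other_end, other in active:
            if other_end <= start:
                free.append(assignment[other])
            else:
                still_active.append((other_end, other))
        active = still_active
        if not free:
            return None
        assignment[name] = free.pop()
        active.append((end, name))
    return assignment
```

For example, with the hypothetical live ranges `{"a": (0, 4), "b": (1, 3), "c": (3, 6), "d": (4, 7)}`, at most two temporaries are live at once, so `color_intervals` finds an assignment for k = 2 and returns `None` for k = 1. General control flow is where the complications discussed in the abstract (critical edges, spilling, coalescing) come in.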
Fast and Flexible Instruction Selection with On-Demand Tree-Parsing Automata
Abstract: Tree parsing as supported by code generator generators like BEG, burg, iburg, lburg and ml-burg is a popular instruction selection method. There are two existing approaches for implementing tree parsing: dynamic programming, and tree-parsing automata; each approach has its advantages and disadvantages. We propose a new implementation approach that combines the advantages of both existing approaches: we start out with dynamic programming at compile time, but at every step we generate a state for a tree-parsing automaton, which is used the next time a tree matching the state is found, turning the instruction selector into a fast tree-parsing automaton. We have implemented this approach in the Gforth code generator. The implementation required little effort and reduced the startup time of Gforth by up to a factor of 2.5.
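The idea of building automaton states on demand can be sketched as follows. This is only one reading of the abstract, not the Gforth implementation: the toy grammar, the rule costs, and all names (`BASE_RULES`, `CHAIN_RULES`, `TABLE`, `compute_state`, `label`) are hypothetical. A node is labelled by dynamic programming the first time a given combination of operator and child states is seen; the result is normalized into a state and stored, so later nodes with the same combination need only a table lookup.

```python
# Hypothetical toy grammar in normal form.
# Base rules: (lhs nonterminal, operator, child nonterminals, cost)
BASE_RULES = [
    ("reg", "const", (), 1),
    ("imm", "const", (), 0),
    ("reg", "add", ("reg", "reg"), 1),
    ("reg", "add", ("reg", "imm"), 1),
    ("reg", "load", ("reg",), 1),
]
# Chain rules: (lhs nonterminal, rhs nonterminal, cost)
CHAIN_RULES = [("reg", "imm", 1)]

# Transition table of the automaton, filled lazily:
# (operator, tuple of child states) -> state
TABLE = {}


def compute_state(op, child_states):
    """Dynamic-programming step: cheapest rule per nonterminal for one node."""
    best = {}  # nonterminal -> (cost, rule id)
    for i, (lhs, rule_op, kids, cost) in enumerate(BASE_RULES):
        if rule_op != op or len(kids) != len(child_states):
            continue
        total = cost
        for nt, state in zip(kids, child_states):
            entry = dict(state).get(nt)
            if entry is None:
                break  # child cannot be derived from this nonterminal
            total += entry[0]
        else:
            if lhs not in best or total < best[lhs][0]:
                best[lhs] = (total, ("base", i))
    # Apply chain rules until no cost improves any more.
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs, cost) in enumerate(CHAIN_RULES):
            if rhs in best:
                total = best[rhs][0] + cost
                if lhs not in best or total < best[lhs][0]:
                    best[lhs] = (total, ("chain", i))
                    changed = True
    if not best:
        raise ValueError(f"no rule matches operator {op!r}")
    # Normalize costs relative to the cheapest entry so that the number of
    # distinct states stays finite.
    low = min(cost for cost, _ in best.values())
    return tuple(sorted((nt, (c - low, r)) for nt, (c, r) in best.items()))


def label(tree):
    """Label a tree given as (operator, child, child, ...); return its state."""
    op, *kids = tree
    key = (op, tuple(label(kid) for kid in kids))
    if key not in TABLE:          # miss: fall back to dynamic programming
        TABLE[key] = compute_state(*key)
    return TABLE[key]             # hit: plain automaton transition
```

Labelling `("add", ("add", ("const",), ("const",)), ("add", ("const",), ("const",)))` visits seven nodes but runs the dynamic-programming step only three times; the remaining nodes are handled by lookups in `TABLE`.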
(This work will also be presented at PLDI in the following week.)
Abstract: I will present our Java-based solver and some of its applications, such as scheduling filters.
Integrated optimal code generation for digital signal processors (PhD defense)
Opposition by Dr. Alain Darte, ENS Lyon, France
Superinstructions and Replication in the Cacao JVM interpreter
Abstract: Dynamic superinstructions and replication can provide large speedups over plain interpretation. In a JVM implementation we have to overcome two problems to realize the full potential of these optimizations: the conflict between superinstructions and the quickening optimization; and the non-relocatability of JVM instructions that can throw exceptions. In this paper, we present solutions for these problems. We also present empirical results: We see speedups of up to a factor of 4 from superinstructions with all these problems solved. The contribution of making potentially throwing JVM instructions relocatable is up to a factor of 2. Replication has a small, but usually positive effect on performance.
(This work will also be presented in the week before at .NET technologies 2006.)