Linear Scan Register Allocation

Mikael Holmén, mikael.holmen@softlab.ericsson.se

Ericsson AB – UAB

Core Network Development - Compiler Technology

Abstract

This paper is done as part of the examination of the course Advanced Compiler Construction. It contains a summary of the paper “Linear Scan Register Allocation” written by Massimiliano Poletto and Vivek Sarkar [1]. The main topic is a new algorithm for fast register allocation, usable when compile time, as well as run time performance of generated code, is of importance.

Introduction

Register allocation is an important part of a compiler, heavily affecting the performance of the generated code. Unfortunately, most aggressive, optimizing register allocators are computationally expensive due to their use of a graph coloring framework [2]. Therefore Poletto and Sarkar presents an alternative algorithm, called linear scan, which is not based on graph coloring. Rather, given the live ranges of the input program, it allocates registers in a single pass of the live ranges. The results presented in [1] show that the compile time when using this algorithm is considerably smaller than when using a graph coloring allocator, while the quality of the generated code is still fairly good.

A number of improvements of the algorithm is also discussed and some are implemented and tested.

The Linear Scan Algorithm

This section first describes two concepts used in the linear scan algorithm. Then the algorithm itself is presented.

Definitions

The algorithm assumes the the intermediate representation (IR) is numbered in some way. Which numbering that is used might affect the quality of the produced code, but it does not affect the correctness of the algorithm. Throughout the implementation a depth first ordering is used.

Also, crucial to the algorithm is the notion of a “live interval”. Given a numbering of the IR, [i,j] is a live interval of a variable v iff there is no instruction with number i’ such that i’<i and v is live at i’, and there is no instruction with number j’ such that j<j’ and v is live at j’.

A live interval is a conservative approximization of a variables live ranges; there might be “holes” in a live interval where the variable is dead. Interference between two variables is captured by whether the associated live intervals overlap or not. An important observation is that the number of overlapping intervals change only at the start and end points of live intervals. Other program points are not interesting.

Given liveness information and a numbering of the IR, live intervals can be computed easily in a single pass over the IR.

Algorithm Details

The linear scan algorithm takes as input a list of live intervals, sorted by increasing start point. Thus the algorithm can quickly scan forward through the live intervals by iterating through this list.

At each step, the algorithm maintains a list, active, which contains the live intervals that overlap the current program point and are stored in a register. The active list is sorted by increasing end point.

For each new interval from the input list the algorithm scans active to see if any recently active interval expired at the new interval’s start point, that is, if a member of active has an end point that is smaller (according to the numbering) than the new interval’s start point, then the interval is removed from active and the allocated register is freed. Since active is sorted by increasing end point the algorithm can stop searching as soon as an interval that is still active is encountered.

If the number of intervals in active equals the number of registers (R), then an interval needs to be spilled, and in the implemented version this is done using a simple heuristic, simply spilling the interval with the largest end point. This is either the last interval in active or the new interval.

The algorithm ends when all intervals in the input list have been examined. See [1] for the complete algorithm.

Results

In the original paper they do a number of tests where they compare an allocator based on linear scan with other allocators based on usage count, graph coloring and bin-packing. Some of the tests seems to be rather useless since they get virtually the same results no matter which algorithm used, so a more extensive test suite (with relevant benchmarks) should really be run, but nevertheless it seems like linear scan is a good algorithm when both compile time and run time performance of generated code is if importance.

Linear scan is considerably faster than graph coloring while the produced code is often only 10-20% slower. Comparing linear scan to binpacking is interesting, but unfortunately not all tests are run for binpacking so it’s hard to draw any conclusions.

A usage count allocator is alot faster than linear scan, but the produced code is significantly slower.

Improvements

In the paper some suggestions for improvements to the algorithm, both aiming to speed up compilation and to increse quality of the generated code, are presented, but when evaluationg these improvements it shows that they are really not wortwhile doing. When speeding up compilation the quality of the generated code decreased too much, and when trying to make further optimizations to increase quality to compilation time grew so much a binpacking or graph coloring algorithm would be better.

Conclusions

Linear scan register allocation is a simple algorithm, still producing reasonably good code. It seems to be a good alternative when both compile time and run time performance of generated code is of importance. However if run time performance is of uttermost importance you should probably use a graph coloring allocator instead of trying to improve a linear scan allocator.

References

[1] Poletto, Sarkar, Linear Scan Register Allocation, TOPLAS 1999

[2] Chaitin, Auslander, Chandra, Cocke, Hopkins, Markstein, Register Allocation via coloring, Computer Languages 6, 47-57.