DF00100 Advanced Compiler Construction TDDC86 Compiler Optimizations and Code Generation



## **Register Allocation**

Christoph Kessler, IDA, Linköpings universitet, 2014

#### **Register Allocation**



- **Register Allocation**: Determines values (variables, temporaries, constants) to be kept when in registers
- **Register Assignment**: Determine in which physical register such a value should reside.
- Essential for Load-Store Architectures
- Reduce memory traffic (→ memory / cache latency, energy)
- Limited resource
- Values that are alive simultaneously cannot be kept in the same register
- Strong interdependence with instruction scheduling
  - · scheduling determines live ranges
  - spill code needs to be scheduled
- **Local register allocation** (for a single basic block) can be done in linear time (see function getreg() in the Dragon Book).
- Global register allocation (with minimal spill code) is NP-complete. Can be modeled as a graph coloring problem [Ershov'62] [Cocke'71

### Live range



(Here, variable = program variable or temporary)

- A variable is being defined at a program point if it is written (given a value) there.
- A variable is **used** at a program point if it is read (referenced in an expression) there.
- A variable is alive at a point if it is referenced there or at some following point that has not (may not have) been preceded by any definition.
- A variable is **reaching** a point if an (arbitrary) definition of it, or usage (because a variable can be used before it is defined) reaches the point.
- A variable's live range is the area of code (set of instructions) where the variable is both alive and reaching.
  - does not need to be consecutive in program text.

## **Local Register Allocation**



For variable  $\nu$  and basic block  $B_i$ :

 $netsave(v,i) = \#uses_i \cdot usesave + \#defs_i \cdot defsave$ -  $l \cdot ldcost$  (l = 1 if Load(v) needed at beg. of  $B_i$ , 0 otherw.)

-  $s \cdot stcost$  (s = 1 iff Store(v) needed at end of  $B_i$ , 0 otherw.)

For loop L estimate benefit of keeping v in a register:

$$\textit{benefit}(v, L) = 10^{\textit{depth}(L)} \cdot \sum_{i \in \textit{blocks}(L)} \textit{netsave}(v, i)$$

with R registers available:

allocate the R objects  $\nu$  with greatest benefit in L

moves may be necessary instead of Load( $\nu$ ) / Store( $\nu$ ) if v could reside in (different) registers in  $Pred(B_i)$ ,  $B_i$ ,  $Succ(B_i)$ 

add worst-case terms  $|Pred(v)| \cdot mvcost$ ,  $|Succ(v)| \cdot mvcost$ 

## **Register Allocation for Loops** Control flow graph



## Global Register Assignment by Graph Coloring



[Ershov'62] [Cocke'71] [Chaitin et al.'81] [Chaitin'82] [Chow/Hennessy'84,'90] [Briggs et al.'89] [Briggs'92] ...

- 1. allocate objects that can be assigned to registers to distinct symbolic registers s1, s2, ...
- 2. determine candidates for allocation to registers (si / webs)
- 3. build interference graph

nodes: allocatable objects, target machine registers

edges: (undir.)  $\{a_i, a_j\}$  iff allocatable objects  $a_i, a_j$  simultaneously live  $\{a_i, r_i\}$  iff  $a_i$  should not reside in register  $r_i$ 

4. color nodes with R colors (R = #available registers) such that any two adjacent nodes have different colors

5. allocate each object to a register that has the same color.













### **Conservative Coalescing**



Coalescing two live ranges u and v can increase the node degree:  $d(u\&v) > \max(d(u), d(v))$  is possible

 $\rightarrow$  may make u&v harder to color

Conservative coalescing:

coalesce only if d(u&v) < R

 $\rightarrow$  can color u&v by the naive degree < R heuristic

### Spilling (1)



- **Spilling** a (physical) register *r* 
  - = spilling the live range w contained in r
  - uses some memory location w.tmp (on stack, scratchpad memory, or w's home memory location)
  - insert a Store r, w.tmp immediately after each definition of w.var
  - insert a Load r, w.tmp immediately before each use of w.var
- Some interferences disappear, the interference graph must be updated.

## Spilling (2)



Heuristic choice of the best spill candidate [Bernstein et al.'89]

minimize ratio  $spillcost(w) / degree(w)^2$  etc.

$$spillcost(w) = c_{store} \cdot \sum_{def \in w} 10^{depth(def)} + c_{load} \cdot \sum_{nse \in w} 10^{depth(nse)} - c_{mov} \cdot \sum_{conv \in w} 10^{depth(copy)}$$

A copy instruction whose source or target is spilled can be removed.

- spill value once per block if possible
  - → avoids redundant loads and stores
- consider rematerialization as alternative to spilling

## Rematerialization



Recomputing a value to a register (rematerialization) may be cheaper than storing and reloading it, e.g. for loading constants to a register.

Modify spillcost(w) accordingly.



If a spilled value is used several times

and the restored value remains live for several adjacent uses,

a Load/Rematerialize is necessary only before the first of them.

(= live range splitting)

[Chow/Hennessy'84,'90]

#### Spilling (3)



- Total spilling eliminates a live range completely
  - store after each definition, reload before each use
- Partial spilling splits a live range into several ones
  - Some reduction in interference, some spill code





# **Live Range Splitting**



- Long live ranges tend to interfere with many others → harder to color.
- Idea: Split up long live ranges to avoid some spilling © reg-to-reg copy is much cheaper than memory spill
- Live range splitting = the reverse of coalescing













### **Optimal Spilling?**

- ATHON ON THE
- Select those live ranges for spilling whose accumulated spill cost is minimal
- Optimal (pre-)spilling and a-posteriori insertion of spill code for given instruction schedule is NP-complete even for basic blocks
  - Dynamic programming
    - ▶e.g. Horwitz et al. 1966
  - Integer Linear Programming
    - ▶e.g. Appel, George PLDI 2001
  - Most compilers use (greedy) heuristics (see above)

Kessler IDA Linkönings universitet

25

### **SSA-Based Register Allocation**



- For SSA programs, the register interference graph is **chordal** 
  - Can be K-colored in quadratic time!
    - Hack, Goos 2006
    - ▶ Bouchez et al. 2006
    - → Brisk et al. 2009: Optimistic chordal coloring
  - Optimal coalescing in spill-free SSA programs
    - Brisk et al. 2009: heuristic
    - Grund, Hack 2007: Integer Linear Programming
  - Optimal pre-spilling in SSA programs
  - ▶Ebner 2009: heuristic

## **Fast Register Allocation**



- For JIT compilers:
  - Compilation time critical (trade-off with code quality)
  - Linear-Scan Register allocators
    - Poletto, Sarkar TOPLAS 1999
    - → Traub, Holloway, Smith PLDI 1998

Kessler, IDA, Linköpings universitet.

# Interdependences Register Allocation ←→ Instruction Scheduling

- Determining live ranges requires a linear sequence of instructions (pre-scheduled MIR, LIR, or target code with symbolic registers)
- Spill code must be scheduled as well
  - $\rightarrow$  may destroy quality of a beforehand good schedule
- ⇒ Integration of register allocation and instruction scheduling
  - quantitative evaluation [Bradlee et al.'91]
  - integrated approaches, space-aware scheduling [Goodman/Hsu'88], [Freudenberger/Ruttenberg'92], [Pinter'93] [Brasier et al.'95], [Motwani et al.'95], [Kästner'97,'00],..., [K./Bednarski'01]

essler, IDA, Linköpings universitet.

5