Iris · Hubert Gruniaux

Table of Contents

A x86 codegen library implemented in OCaml.

Introduction
#

A knight facing a long road with references to other drawings. — Drawings from my mother for the project report.

A knight is fighting a hydra with the letters CFG written on it. — Drawings from my mother for the project report.

Iris is a code generator for x86-64 library written in OCaml (like LLVM, but much more simpler). It use an intermerdiary SSA-based representation to which it applies multiple optimization passes.

This project was originally written as part of the compilation course at the Ecole Normale Superieure de Paris. It was intended to be the backend for a MiniPureScript compiler (a subset of PureScript) also written in OCaml. As a result, the project lacks many of the features and tests needed for professional use.

The original report for this project can be found here (in French).

Usage example
#

The following OCaml snippet creates a add function that takes two parameters and returns their sum. Then, it optimizes the generated code and dumps to stdout the output x86-64 assembly code.

open LibIris

(* Let's create a function that adds its two parameters. *)
let ctx = Ir.mk_ctx () in
let ib = IrBuilder.create ctx in
let func = IrBuilder.mk_func ib "add" (* count of params: *) 2 in

let x = List.nth func.fn_params 0 in (* first param *)
let y = List.nth func.fn_params 1 in (* second param *)
let r = IrBuilder.mk_add ib x y in (* x + y *)

(* return r *)
IrBuilder.set_term ib (Ir.Iterm_retv (Iop_reg r))

(* Finish the function code generation! *)
IrBuilder.finish ib;

(* Optimize and generate assembly code: *)
let pm = PassManager.create Backend.Arch_x64 in
PassManager.run_on_ctx pm Stdlib.stdout ctx

The following intermediary code (IR) will be generated:

fn add(int %1, int %2) -> int {
L1:
  %3 = add %1 %2
  ret %3
}

Which gives the following x86-64 assembly code:

.globl add
.type add, @function
add:
.L1:
  push rbp
  mov rbp, rsp
  mov rcx, rsi
  mov rax, rdi
  add rax, rcx
  mov rsp, rbp
  pop rbp
  ret

The generated code is not perfect. The optimization pipeline of Iris is quite small, and its optimizations are higher level than x86-64. In particular, Iris introduces too many temporary registers.

Implemented optimization passes
#

The code of all passes (written in OCaml) can be found in the subdirectories lib/ir/pass (high-level passes) and lib/mir/pass (machine-level passes).

SimplifyCFG
#

Iris’ intermediate representation is based on the notion of a basic block and takes the form of a control-flow graph (CFG). The aim of this pass is to simplify the CFG by deleting unreachable basic blocks or by merging basic blocks that have only a single predecessor or have no instruction except the terminator instruction.

This is mostly intended to clean up other optimization passes and the initial naive CFG built by the IR builder API.

CopyPropagation
#

The aim of this pass is to propagate as many copies and constants as possible in the intermediate representation. For example, the following code

%0 = 5
%1 = %x
%2 = add %0 %1

will be transformed after this pass as follows:

%0 = 5
%1 = %x
%2 = add 5 %x

As such, it is of little interest, but it does enable other optimizations such as the elimination of dead code and algebraic simplifications.

InstCombine
#

This pass is the most general on the list. It performs many optimizations at the same time, like:

constant folding (e.g. 3 + 5 is transformed into 8),
algebraic simplification (e.g. 0 * x is transformed into 0 if x has no side effect),
strength reduction (e.g. x * 2 is transformed into x << 1).

Dead Code Elimination (DCE)
#

DCE or dead code elimination removes instructions whose result is not used by any other instruction and which have no side effects. The process is iterative, so deleting one instruction can make other instructions dead.

For example, the following code

%0 = 5
%1 = %x
%2 = add 5 %x

will be transformed into:

%2 = add 5 %x

We can see that the instructions %0 and %1 (not used) were removed.

Register allocation
#

Iris implements a graph-coloration based register allocation algorithm. In particular, the one described by Chaitin et al. with improvements from Briggs et al.

The implementation of the algorithm can be found at lib/mr/RegAlloc.ml .

Introduction #

Usage example #

Implemented optimization passes #

SimplifyCFG #

CopyPropagation #

InstCombine #

Dead Code Elimination (DCE) #

Register allocation #