Using LLVM for code generation: Part 1

Low-Level Virtual Machine (LLVM) is a tool for generating code. It is used by compilers such as clang. It defines its own low-level language and provides an API for generating and printing in-memory representations of that language, as well as backends that convert it to target-specific assembly and object files.

The term IR, Intermediate Representation, refers to LLVM’s low-level language. It is used to mean both the in-memory representation of LLVM code and its textual representation.

Consider this multiply-and-accumulate function:

double mac(double a, double b, double c) {
    return a+b*c;
}

The LLVM IR might look like this:

; Function Attrs: norecurse nounwind readnone
define double @mac(double %a, double %b, double %c) #0 {
entry:
  %mul = fmul double %b, %c
  %add = fadd double %mul, %a
  ret double %add
}

As can be seen, LLVM uses three-address code, popular with compiler writers everywhere. All global identifiers are prefixed by the @ sign, and all local identifiers are prefixed by the % sign. The IR is strongly typed, so type names are present almost everywhere.
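
One way to read three-address code is as C in which every statement performs a single operation and names its result. The sketch below is a hand-written C rendering of the IR above, not something clang or LLVM produces; the name mac_three_address is made up for illustration.

```c
/* Hand-written C mirror of the IR above: one operation per statement,
   each result named and typed. mac_three_address is a hypothetical name. */
double mac_three_address(double a, double b, double c)
{
    double mul = b * c;    /* %mul = fmul double %b, %c   */
    double add = mul + a;  /* %add = fadd double %mul, %a */
    return add;            /* ret double %add             */
}
```

Each C local corresponds to one % identifier, and the repeated double mirrors the type annotations that the IR carries on every instruction.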

To be honest, I cheated. Instead of writing it myself, I used clang to generate it:

clang -S -emit-llvm -O1 test1.c

The output file is test1.ll.

We can call this code from a test program:

#include <stdio.h>

double mac(double a, double b, double c);
int main(int argc, char *argv[])
{
    printf("Result is %.10g\n", mac(1.345, 3.5, 4.4));
    return 0;
}

Compile and run:

$ clang test1.ll main.c
$ ./a.out
Result is 16.745

I deliberately used “C” instead of “C++”, to avoid name mangling.
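
If the test program were C++ instead, the declaration of mac would need C linkage so that the linker looks for the unmangled symbol. A minimal sketch of the usual guard follows. The definition of mac in it is only a stand-in so that the fragment compiles on its own; in this article the real definition comes from test1.ll.

```c
/* Declaration usable from both C and C++: under a C++ compiler,
   extern "C" asks for the unmangled symbol name mac. */
#ifdef __cplusplus
extern "C" {
#endif

double mac(double a, double b, double c);

#ifdef __cplusplus
}
#endif

/* Stand-in definition so this fragment is self-contained;
   in the article, mac is defined in test1.ll instead. */
double mac(double a, double b, double c)
{
    return a + b * c;
}
```

Compiled as C, the guard disappears and only the plain declaration remains.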

So, what is LLVM, and what is clang? Clang is a frontend that knows about C, C++ and several other languages. It uses the LLVM API to generate LLVM IR, instead of generating target-specific machine code itself. In other words, LLVM is not a compiler, and clang is not a code generator.

Optimization of the LLVM IR is done by LLVM. There are a number of optimization passes that the front end can choose from. In addition, target-specific backends can implement additional optimizations. The compiler can generate rather naïve code, and leave the hard work to LLVM. Here is what clang generates with -O0:

define double @mac(double %a, double %b, double %c) #0 {
entry:
  %a.addr = alloca double, align 8
  %b.addr = alloca double, align 8
  %c.addr = alloca double, align 8
  store double %a, double* %a.addr, align 8
  store double %b, double* %b.addr, align 8
  store double %c, double* %c.addr, align 8
  %0 = load double, double* %a.addr, align 8
  %1 = load double, double* %b.addr, align 8
  %2 = load double, double* %c.addr, align 8
  %mul = fmul double %1, %2
  %add = fadd double %0, %mul
  ret double %add
}

This code first allocates three local variables, referred to as %a.addr and so on. The allocation itself might not generate any code on the target; it is just a way of reserving space on the stack. The next three instructions store the incoming parameters’ values in these local variables. These values are then loaded into three temporary variables, which we can think of as registers. Finally, the multiplication and addition are performed, and the result is returned.
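
The same sequence can be spelled out in C, one step per IR instruction. This is a hand-written illustration of the listing above, not clang output, and mac_unoptimized is a made-up name.

```c
/* Hand-written C mirror of the -O0 IR. mac_unoptimized is a hypothetical name. */
double mac_unoptimized(double a, double b, double c)
{
    /* alloca: reserve a stack slot for each parameter */
    double a_addr;
    double b_addr;
    double c_addr;

    /* store: copy the incoming parameters into the slots */
    a_addr = a;
    b_addr = b;
    c_addr = c;

    /* load: read the slots back into temporaries (%0, %1, %2) */
    double t0 = a_addr;
    double t1 = b_addr;
    double t2 = c_addr;

    double mul = t1 * t2;   /* %mul = fmul double %1, %2   */
    double add = t0 + mul;  /* %add = fadd double %0, %mul */
    return add;             /* ret double %add             */
}
```

The stores and loads cancel each other out, which is why an optimizer can reduce the whole function to the multiply and the add.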

I have not looked at the clang source code, but my guess is that this is what the clang frontend itself emits. The much shorter -O1 version is then the result of LLVM’s frontend-independent optimization.

The LLVM Compiler, llc, is the LLVM backend. It generates target-specific assembler source code, to be assembled with an existing assembler.

llc test1.ll
as -a=test1.lst -c test1.s -o test1.o

Here the assembly step is left to the system assembler, although llc can also write an object file directly with -filetype=obj. For the code generated by ‘clang -O1’, shown above, the x86 assembly source looks like this, with directives removed:

mac:                                     #   define double @mac(double %a, double %b, double %c) #0
    subl    $12, %esp
    movsd    24(%esp), %xmm0         #
    mulsd    32(%esp), %xmm0         #   %mul = fmul double %b, %c
    addsd    16(%esp), %xmm0         #   %add = fadd double %mul, %a
    movsd    %xmm0, (%esp)           #   ret double %add
    fldl    (%esp)
    addl    $12, %esp
    retl

The comments are mine; they are not generated by llc.

The more verbose ‘clang -O0’ IR produces a lot more code:

mac:                             # define double @mac(double %a, double %b, double %c) #0 {
#-- These LL instructions do not generate any code
                                 #      %a.addr = alloca double, align 8
                                 #      %b.addr = alloca double, align 8
                                 #      %c.addr = alloca double, align 8
#-- Generic x86 function entry code
        pushl   %ebp             #
        movl    %esp, %ebp       #
        subl    $32, %esp        #
#-- Copy arguments to local variables
        movsd   24(%ebp), %xmm0  #      store double %a, double* %a.addr, align 8
        movsd   16(%ebp), %xmm1  #      store double %b, double* %b.addr, align 8
        movsd   8(%ebp), %xmm2   #      store double %c, double* %c.addr, align 8
#-- Copy local variables to registers
        movsd   %xmm2, -8(%ebp)  #      %0 = load double, double* %a.addr, align 8
        movsd   %xmm1, -16(%ebp) #      %1 = load double, double* %b.addr, align 8
        movsd   %xmm0, -24(%ebp) #      %2 = load double, double* %c.addr, align 8
#-- Multiply and add
        mulsd   -16(%ebp), %xmm0 #      %mul = fmul double %1, %2
        addsd   -8(%ebp), %xmm0  #      %add = fadd double %0, %mul
#-- Return value
        movsd   %xmm0, -32(%ebp) #      ret double %add
#-- Generic x86 function exit code
        fldl    -32(%ebp)        #
        addl    $32, %esp        #
        popl    %ebp             #
        retl                     #
                                 # }

This is all very well, but we have not seen how to generate code from our own program. That is the subject of the next article in this series.

You can reach me by email at “lars dash 7 dot sdu dot se” or by telephone +46 705 189090
