For my SPO600 class's second lab, I need to compile a C program for the x86-64 architecture with various compiler options and compare the output. Up until now, the only compiler flag I used was "-std=c++0x". This should be interesting!

Here's the program:

#include <stdio.h>

int main() {
    printf("Hello World!\n");
}

Control

Before I investigate the various compiler options, I need to compile and inspect the original program, the "control". After compiling it, running the "file" command on the result gives the following:

ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=7eb60e0bc26655a7f8932e8d90753e6e887ee679, not stripped

At this point in the lab, I don't know much about the program. I have nothing else to compare it to. However, our prof noted that "dynamically linked" has to do with the library functions the program uses. The header "stdio.h" itself is only read at compile time, where it supplies declarations such as the one for "printf". The code for "printf" lives in the shared C library, and the program looks for that library at run time, aka dynamically.

-static

Now I'll compile the program with the following options:

-static

When I run the "file" command, I get the following:

ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.32, BuildID[sha1]=7d1dd2f480fe0053f621075b1ca0e91dbf7f7a25, not stripped

Ahh, I see a difference here. It says "statically linked" this time. My suspicion is that this option tells the linker to bring in any library code the program references at build time, linking together one monolithic executable that does not rely on any libraries existing on the system at run time. I suspect the file size is going to be larger. Let's find out with "ls -l [0-1].o":

-rwxr-xr-x 1 mwelke mwelke 10896 Sep 17 15:33 0.o
-rwxr-xr-x 1 mwelke mwelke 918224 Sep 17 15:33 1.o

Yep! The static one is ~918 KB. The dynamic one is only ~11 KB.

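When I inspect the assembly code in the static file, there is much more of it than in the dynamic one. These must be the instructions for "printf" and everything it depends on, copied in from the static C library at link time. The header "stdio.h" only declares those functions; it does not contain their code.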

-fno-builtin

For this compiler option, I'm to look for a change in the way the function call (to "printf") is performed in the assembly code.

Let's examine the function call in the control assembly code with "objdump -fsd --source 0.o":

560: ff 25 6a 0a 20 00    jmpq   *0x200a6a(%rip)        # 200fd0 <printf@GLIBC_2.2.5>

Now, let's examine the function call in the assembly code for this set of compiler options with "objdump -fsd --source 2.o".

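There actually wasn't a corresponding line in this one! And this makes sense once you know what a "builtin" is. GCC knows what certain standard functions such as "printf" do, and when builtins are enabled it is allowed to substitute something cheaper. A call that only prints a fixed string ending in a newline can be replaced with a call to "puts", so there is no jump to "printf" because nothing calls "printf" anymore. I would expect to find a "puts" entry in its place; the code of the function itself is not copied inline.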
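To see the effect of builtins in isolation, here is a minimal sketch. It assumes the commonly observed GCC behavior where a "printf" call with a fixed string ending in a newline, and an ignored return value, may be compiled as a call to "puts". The function names are made up for illustration, and the file has no "main" on purpose, so it can be compiled with "gcc -c" and inspected with "objdump -d".

```c
#include <stdio.h>

/* Return value ignored: with builtins enabled, GCC is free to
   compile this call as puts("Hello World!"). */
void hello_ignored(void) {
    printf("Hello World!\n");
}

/* Return value used: printf reports the number of characters written,
   which puts does not, so this call should stay a real printf. */
int hello_counted(void) {
    return printf("Hello World!\n");
}
```

Compiling this once with "-fno-builtin" and once without, then comparing which library function each version calls, should show the difference. The exact result may vary with the GCC version.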

-g

When compiling the program without this compiler option, the output seems to include less "comment" type data. It lacks a big section that was at the top in the control assembly code and that looked like it described things that would help a debugger operate: metadata such as which source lines the instructions came from.

10 Additional Arguments

This time, there were 10 arguments in the "printf" call (the numbers 1 through 10), not just one (the string "Hello, World!"). I am asked to observe where the arguments are placed in the registers.

We end with the following assembly code (which presumably describes the argument values being moved into registers and then eventually the function call):

6a4: 48 83 ec 08 sub $0x8,%rsp
6a8: 6a 00 pushq $0x0
6aa: 6a 09 pushq $0x9
6ac: 6a 08 pushq $0x8
6ae: 6a 07 pushq $0x7
6b0: 6a 06 pushq $0x6
6b2: 41 b9 05 00 00 00 mov $0x5,%r9d
6b8: 41 b8 04 00 00 00 mov $0x4,%r8d
6be: b9 03 00 00 00 mov $0x3,%ecx
6c3: ba 02 00 00 00 mov $0x2,%edx
6c8: be 01 00 00 00 mov $0x1,%esi
6cd: 48 8d 3d a4 00 00 00 lea 0xa4(%rip),%rdi # 778 <_IO_stdin_used+0x8>
6d4: b8 00 00 00 00 mov $0x0,%eax
6d9: e8 82 fe ff ff callq 560 <.plt.got>

Reading this, I had trouble at first figuring out what exactly is going on. I notice that there are two instructions being used, "pushq" and "mov". I googled around for a bit to find out what the difference is between them. "mov" copies a value into a register (or a memory location), while "pushq" places a 64-bit value on top of the stack. The letters at the end of an instruction name describe a sort of "type size". For example, you can use "movzbl" to move something that is one byte big into a register that is 4 bytes big.

The order is not random, even though it looks that way. On x86-64 Linux, the calling convention (the System V AMD64 ABI) passes the first six integer or pointer arguments in the registers %rdi, %rsi, %rdx, %rcx, %r8 and %r9, in that order, and anything beyond that goes on the stack, starting from the rightmost argument. Here the format string takes %rdi, the numbers 1 through 5 take the next five registers (the listing shows their 32-bit names, such as %esi and %r9d), and the remaining arguments are pushed. So the register names don't correspond to their order alphabetically, but the order is fixed, and the choice between "pushq" and "mov" depends only on whether an argument lands on the stack or in a register. The "mov $0x0,%eax" just before the call tells a variadic function like "printf" that no vector registers carry arguments. I'll need to study this further in the future.
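The placement of the arguments follows the System V AMD64 calling convention used on x86-64 Linux. As a sketch, here is a hypothetical stand-in for the "printf" call (the function name and parameters are invented for illustration), annotated with where each argument is passed under that convention.

```c
/* Hypothetical stand-in for printf: one pointer followed by ten ints.
   The comments give the location each argument is passed in under the
   System V AMD64 calling convention used on x86-64 Linux. */
int sum_after_label(const char *label, /* %rdi */
                    int a1,  /* %esi */
                    int a2,  /* %edx */
                    int a3,  /* %ecx */
                    int a4,  /* %r8d */
                    int a5,  /* %r9d */
                    int a6,  /* stack */
                    int a7,  /* stack */
                    int a8,  /* stack */
                    int a9,  /* stack */
                    int a10) /* stack, rightmost argument at the highest address */
{
    (void)label; /* only present to occupy %rdi, like printf's format string */
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9 + a10;
}
```

A caller passing the numbers 1 through 10 should produce register moves and stack stores in the same pattern as the listing above, though the exact instructions chosen depend on the GCC version and optimization level.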

Additional Function Call

For this, I need to add another layer of function calls to the program. The source code looks like this:

#include <stdio.h>

void output();
int main() {
    output();
}

void output() {
    printf("Hello World!\n");
}

And now we'll look at the changes in the assembly code, particularly looking for function calls. Here's the assembly code for both functions:

    void output() {
    6a0: 55 push %rbp
    6a1: 48 89 e5 mov %rsp,%rbp
    printf("Hello World!\n");
    6a4: 48 8d 3d a9 00 00 00 lea 0xa9(%rip),%rdi # 754 <_IO_stdin_used+0x4>
    6ab: b8 00 00 00 00 mov $0x0,%eax
    6b0: e8 ab fe ff ff callq 560 <.plt.got>
    }
    6b5: 90 nop
    6b6: 5d pop %rbp
    6b7: c3 retq

    00000000000006b8 <main>:

    int main() {
    6b8: 55 push %rbp
    6b9: 48 89 e5 mov %rsp,%rbp
    output();
    6bc: b8 00 00 00 00 mov $0x0,%eax
    6c1: e8 da ff ff ff callq 6a0 <output>
    6c6: b8 00 00 00 00 mov $0x0,%eax
    }
    6cb: 5d pop %rbp
    6cc: c3 retq
    6cd: 0f 1f 00 nopl (%rax)

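What we end up seeing here is another section created for the "output" function, and that function is called from the "main" section with "callq". Each function starts with "push %rbp" and "mov %rsp,%rbp", which set up its stack frame, a way for the processor to keep track of where it is when running the code. I also notice another "mov" instruction, "mov $0x0,%eax", before control is moved to the "output" function. Because "output" is declared with empty parentheses, the compiler doesn't know what arguments it takes, so it most likely zeroes %eax the same way it does before the variadic "printf" call.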
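The "mov $0x0,%eax" in front of "callq 6a0 <output>" is most likely a side effect of the declaration "void output();". Before C23, empty parentheses mean "arguments unspecified", so the compiler prepares %eax as it would for a variadic call. A minimal sketch of the same program with a full prototype follows. The name "run_output" and the int return types are my changes for illustration, so the result can be checked, and the file has no "main" so it can be compiled with "gcc -c".

```c
#include <stdio.h>

/* "(void)" is a real prototype: the compiler knows output takes no
   arguments, so it has no reason to set up %eax before the call. */
int output(void);

/* Plays the role of main in the original program. */
int run_output(void) {
    return output();
}

/* Returns the number of characters printf wrote. */
int output(void) {
    return printf("Hello World!\n");
}
```

Disassembling this at "-O0" and comparing it with the listing above should show whether the extra "mov" before the call to "output" goes away. Compilers that default to C23 treat both declarations the same way.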

-O0 and -O3

For this, I remove the "-O0" (capital 'oh' and zero) option and add the "-O3" (capital 'oh' and three) option and recompile.

After studying the GCC documentation, I learned that the options "O0" through "O3" deal with code optimization. They progressively add a suite of additional options that tell the GCC compiler to do things to increase code execution speed, at the cost of increasing the file size, making debugging harder, and increasing compilation time. In my opinion, it makes sense to use the "O3" option for a production build, so long as you make sure the optimizations didn't break any of your tests etc.

One thing I notice is that the file size for the optimized build (6.o) is a bit larger:

-rwxr-xr-x 1 mwelke mwelke 10896 Sep 17 15:33 0.o
-rwxr-xr-x 1 mwelke mwelke 11504 Sep 17 15:33 6.o

However, when I examine the assembly code, I can't understand it enough to know exactly what the compiler did for these optimizations. One optimization I've heard of is "function inlining", which is when the compiler will take every instance of a function being called and replace the call with the code of the function itself. In theory, this would increase speed, especially when you've got many small functions being called frequently. I used grep to look for that "printf" call, to see if it was removed and instead inlined:

0010 5f6f6666 73657400 5f5f7072 696e7466  _offset.__printf
0010 5f6f6666 73657400 5f5f7072 696e7466  _offset.__printf
00a0 6e740070 72696e74 66005f63 75725f63  nt.printf._cur_c
580: ff 25 5a 0a 20 00    jmpq   *0x200a5a(%rip)        # 200fe0 <__printf_chk@GLIBC_2.3.4>
printf (const char *__restrict __fmt, ...)
  return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
    printf("Hello World!\n");

But I did get results from the grep. It looks like even with optimizations, there is still a reference to "printf" in the assembly code, and right now I'm not sure why. I'm sure that as the course progresses and I get more familiar with assembly syntax, I'll be able to understand these variations in compiling code with GCC more.
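One likely reason is that the code for "printf" lives in the C library, so the compiler never sees the function body and has nothing to inline. Inlining applies to functions whose definitions are visible in the file being compiled. A minimal sketch, with function names invented for illustration:

```c
/* The body of square is visible in this file, which makes it a
   candidate for inlining when optimization is enabled. */
static int square(int x) {
    return x * x;
}

/* Adds up 1*1 + 2*2 + ... + n*n by calling square in a loop. */
int sum_of_squares(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += square(i);
    }
    return total;
}
```

Compiling this with "-O0" and then "-O3" and looking for a "call" to "square" in each disassembly should show whether the call was replaced by the function's code. Whether GCC actually inlines it depends on the version and flags.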


This was originally posted on the blog I used for my SPO600 class while studying at Seneca College.