What happens when I build code in C?

What is building code?

Building is transforming code from the input format to the final format.

This can mean different things in different contexts. For example:

We sometimes say that compiling takes code from source to executable, but this process is actually multiple stages and compiling is one of those steps.

We will focus on what has to happen more than how it all happens.

CSC301, 402, 501, 502 go into greater detail on how languages work.

Our goal here is to:

Using SSH Keys

We are going to work on seawulf so that we all have the same compiler.

ssh seawulf
Last login: Tue Oct 28 12:58:26 2025 from 172.20.24.214

Using an interactive session

Last class we worked on the login node, but that is not best practice.

Today we will use an interactive session using the salloc program.

salloc
salloc: Granted job allocation 28290
salloc: Waiting for resource configuration
salloc: Nodes n005 are ready for job

We will make an empty directory to work in for today.

mkdir compilec
cd compilec/
ls

an empty folder!

Overall Build process

A simple program

We will use nano to write a very small program:

nano hello.c
hello.c
#include <stdio.h>
void main () {

 printf("Hello world\n");

}

and again, see what is in our folder

ls
hello.c

Preprocessing with gcc

First we handle the preprocessing, which pulls in the headers that are included. We will use the compiler gcc.

We will use gcc for many steps, and use its options to have it do subsets of what it can possibly do:

gcc -E hello.c -o hello.i

If it succeeds, we see no output, but we can check the folder

ls
hello.c  hello.i

now we have a new file

We can inspect what it does using wc

wc -l hello.*
    6 hello.c
  842 hello.i
  848 total

we started with just 6 lines of code and the preprocessing added a lot of lines

Since it is long, we will first look at the top

head hello.i
# 1 "hello.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "hello.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 27 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 375 "/usr/include/features.h" 3 4

and the end

tail hello.i
extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
# 943 "/usr/include/stdio.h" 3 4

# 2 "hello.c" 2
void main () {

 printf("Hello world\n");

}
cat hello.c
#include <stdio.h>
void main () {

 printf("Hello world\n");

}

we see that our original program is at the end of the file, and the beginning is where the include line has been expanded.

Compiling

Next we take our preprocessed file and compile it to get assembly code.

Again, we use gcc:

gcc -S hello.i

Again there is no output in the terminal, but we can see what it wrote:

ls
hello.c  hello.i  hello.s

we have a new file as well with the .s extension.

Again, let's inspect

wc -l hello.*
    6 hello.c
  842 hello.i
   25 hello.s
  873 total

this is longer than the source, but not as long as the preprocessed file with the header expanded. The header contains lots of information that we might need, but the assembly contains only what our program actually does.

And it’s manageable, so we inspect it directly:

cat hello.s
	.file	"hello.c"
	.section	.rodata
.LC0:
	.string	"Hello world"
	.text
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	$.LC0, %edi
	call	puts
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-44)"
	.section	.note.GNU-stack,"",@progbits

There are many more steps and they are lower-level operations, but it is still human-readable text stored in the file.

Assembling

Assembling takes the assembly code and produces object code. Assembly is a relatively broad term: there are families of assembly languages, and all of them are still written so that humans can read them. Assembly is more complex than source code because it is closer to the hardware. The object code, however, consists of instructions specific to your machine and is not human readable.

Again, with gcc:

gcc -c hello.s -o hello.o

Again, check what it does by looking at files

ls
hello.c  hello.i  hello.o  hello.s

now we see a new file, the .o

and again check its length

wc -l hello.*
    6 hello.c
  842 hello.i
    5 hello.o
   25 hello.s
  878 total

this is very short, though wc -l only counts newline characters, which says little about a binary file

wc  hello.o
   5   17 1496 hello.o

it is not many bytes either: only 1496

cat hello.o
ELF>?@@
UH???]?Hello worldGCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-44)zRx
K                                                               A?C
??      hello.cmainputs


???????? .symtab.strtab.shstrtab.rela.text.data.bss.rodata.comment.note.GNU-stack.rela.eh_frame @?0
&PP1P
     90\.B?W?R@
?
	?0a

This is not human readable, though

Linking

Now we can link it all together. In this program there are not a lot of other dependencies, but this step fills in anything from libraries and outputs an executable

once again with gcc:

gcc -o hello hello.o -lm
ls
hello  hello.c  hello.i  hello.o  hello.s

If we look at the permissions

ls -l
total 44
-rwxr-xr-x. 1 brownsarahm spring2022-csc392  8360 Oct 30 12:59 hello
-rw-r--r--. 1 brownsarahm spring2022-csc392    64 Oct 30 12:40 hello.c
-rw-r--r--. 1 brownsarahm spring2022-csc392 16865 Oct 30 12:45 hello.i
-rw-r--r--. 1 brownsarahm spring2022-csc392  1496 Oct 30 12:55 hello.o
-rw-r--r--. 1 brownsarahm spring2022-csc392   433 Oct 30 12:49 hello.s

we can see that the executable file was automatically given x permissions for everyone.

the executable is not readable though

cat hello
ELF>P@@(@8	@@@@@@?88@8@@@ ``48 ``?TT@T@DDP?td??@?@44Q?tdR?td``/lib64/ld-linux-x86-64.so.2GNU GNU?hm|Y??!w\??د5P?S$)
                                        libm.so.6__gmon_start__libc.so.6puts__libc_start_mainGLIBC_2.2.5ui	;?`` `(`H?H?
                                             H??t?CH???5?
                                                          ?%?
                                                              @?%?
                                                                   h??????%?
                                                                             h??????%?
      h?????1?I??^H??H???PTI???@H??P@H??=@?????fD??`UH-8`H??H??w]øH??t?]?8`????8`UH-8`H??H??H??H???H?H??u]úH??t?]H?ƿ8`????==
                                            uUH???~???]?*
                                                          ??@H?= t?H??tU?`H????]?{????s???UH???@?????]?AWA??AVI??AUI??ATL?% UH?- SL)?1?H??H??e???H??t?L??L??D??A??H??H9?u?H?[]A\A]A^A_Ðf.???H?H??Hello world0$???|d???LQ????d????????
                                                                   zRx
                                                                     ???*zRx
                                                                           $????@FJ
K ??;*3$"D????A?C
Dd????eB?E?E ?E(?H0?H8?M@l8A0A(B BB?????@?@
???o?@@?@                                  ?@
G
 `H?@?@ ???oh@???o???o`@`&@6@F@GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-44)8@T@t@?@?@@`h@	?@
@8#T@T 1t@t$D???o?@N
  ?@                ?@?V@G^???o`@k???oh@hz?@?B?@???@??@@?P@Pr??@?	??@??@?4?`???0`04`4?04-h?.`4`??  P

Now we can run the program

./hello
Hello world

success!!

Putting it all together

We can repeat with a different name for the executable and work directly from source to executable:

gcc -o demohello hello.c -lm

check what it looks like

ls -l
total 56
-rwxr-xr-x. 1 brownsarahm spring2022-csc392  8360 Oct 30 13:03 demohello
-rwxr-xr-x. 1 brownsarahm spring2022-csc392  8360 Oct 30 12:59 hello
-rw-r--r--. 1 brownsarahm spring2022-csc392    64 Oct 30 12:40 hello.c
-rw-r--r--. 1 brownsarahm spring2022-csc392 16865 Oct 30 12:45 hello.i
-rw-r--r--. 1 brownsarahm spring2022-csc392  1496 Oct 30 12:55 hello.o
-rw-r--r--. 1 brownsarahm spring2022-csc392   433 Oct 30 12:49 hello.s

There is only a new executable, with no new intermediate files. It still did all of those processes, but it didn’t write files for them.

./demohello
Hello world

If we edit the source:

nano hello.c
hello.c
#include <stdio.h>
void main () {

 printf("Hello world!\n");

}

the executable does not change

./hello
Hello world

until we build it again, which we can do from source

gcc -o demohello hello.c -lm

and then run

./demohello
Hello world!

Now it’s changed.

Working with multiple files

This all looks a bit different if we have our code split across files.

we will make a new file main.c

nano main.c
main.c
/* Used to illustrate separate compilation.
Created: Joe Zachary, October 22, 1992
Modified:
*/

#include <stdio.h>

void main () {
 int n;
 printf("Please enter a small positive integer: ");
 scanf("%d", &n);
 printf("The sum of the first n integers is %d\n", sum(n));
 printf("The product of the first n integers is %d\n", product(n));
}

Then help.c

nano help.c
help.c
/* Used to illustrate separate compilation

Created: Joe Zachary, October 22, 1992
Modified:

*/

/* Requires that "n" be positive. Returns the sum of the
  first "n" integers. */

int sum (int n) {
 int i;
 int total = 0;
 for (i = 1; i <= n; i++)
  total += i;
 return(total);
}


/* Requires that "n" be positive. Returns the product of the
  first "n" integers. */

int product (int n) {
 int i;
 int total = 1;
 for (i = 1; i <= n; i++)
  total *= i;
 return(total);
}

First we will compile and assemble the main.c

gcc -Wall -g -c main.c
main.c:8:6: warning: return type of ‘main’ is not ‘int’ [-Wmain]
 void main () {
      ^
main.c: In function ‘main’:
main.c:12:2: warning: implicit declaration of function ‘sum’ [-Wimplicit-function-declaration]
  printf("The sum of the first n integers is %d\n", sum(n));
  ^
main.c:13:2: warning: implicit declaration of function ‘product’ [-Wimplicit-function-declaration]
  printf("The product of the first n integers is %d\n", product(n));
  ^

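we get some warnings, but that is okay: the object file is still produced. The implicit declaration warnings mean that main.c calls sum and product without declaring them; their definitions are in help.c and will be found at link time.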

next we do the same for the helpers

gcc -Wall -g -c help.c

finally we link them together

gcc -o demo -lm main.o help.o

now it runs:

./demo
Please enter a small positive integer: 5
The sum of the first n integers is 15
The product of the first n integers is 120

we can modify one part

nano main.c
main.c
/* Used to illustrate separate compilation.
Created: Joe Zachary, October 22, 1992
Modified:
*/

#include <stdio.h>

void main () {
 int n;
 printf("Enter a small positive integer: ");
 scanf("%d", &n);
 printf("The sum of the first n integers is %d\n", sum(n));
 printf("The product of the first n integers is %d\n", product(n));
}

We need to recompile and reassemble that part.

gcc -Wall -g -c main.c
main.c:8:6: warning: return type of ‘main’ is not ‘int’ [-Wmain]
 void main () {
      ^
main.c: In function ‘main’:
main.c:12:2: warning: implicit declaration of function ‘sum’ [-Wimplicit-function-declaration]
  printf("The sum of the first n integers is %d\n", sum(n));
  ^
main.c:13:2: warning: implicit declaration of function ‘product’ [-Wimplicit-function-declaration]
  printf("The product of the first n integers is %d\n", product(n));
  ^

and re-link, but we do not have to recompile or reassemble the help.c file; the original object file still works.

gcc -o demo -lm main.o help.o

and we can run the code

./demo
Enter a small positive integer: 7
The sum of the first n integers is 28
The product of the first n integers is 5040

Why this is important

The build process includes different steps, so errors at different steps tell you to look in different places for the source of the problem.

Consider the following:

Having a modular process means that for large, complex code bases, the parts can be split up. It also means that if you change only one part of the code, you only need to recompile that part. For complex code, the compilation and the optimizations that happen at compile time can take time, so you don’t have to do all of that every time.

Efficient code development means not only less waiting for you, but a smaller environmental impact while you work and when your code is distributed.

Prepare for Next Class

  1. Review the notes about floats to prepare for lab.

  2. Think about what you know about how computers execute code to prepare for class.

Badges

Review
Practice
  1. Review the notes from today

  2. Create some variations of the hello.c we made in class. Make hello2.c print twice with 2 print commands. Make hello5.c print 5 times with a for loop and hello7.c print 7 times with a for loop. Build them all on the command line and make sure they run correctly.

  3. Write a bash script, assembly.sh to compile each program to assembly and print the number of lines in each file.

  4. Put the output of your script in hello_assembly_compare.md. Add to the file some notes on how they are similar or different based on your own reading of them.

Experience Report Evidence

Questions After Today’s Class

Is it possible to reverse engineer source code from compiled code?

From the assembly, which is the output of the compiler, it is definitely possible, but it is lossy. Many different source programs can produce equivalent assembly code.

From the executable it is a lot harder, but not completely impossible to get something close.

What happens when you give gcc a “preprocessed” file that isn’t actually preprocessed during the compilation stage?

This is an explore badge, try this out and see how it works.

What’s actually inside the object file before linking?

The object code contains machine instructions, but it cannot run yet because it is missing some dependencies from modular code or precompiled libraries. We will learn more about these instruction sets next!

What higher level courses focus on assembly code?

There are no full courses exclusively on assembly here, but some sections of CSC411 cover it (the Alvarez/Estevs version does; the Daniels version focuses on optimization using Rust instead of authoring in assembly).

301 is the prereq, but then 402 has more about how programming languages are developed and how to execute code with interpreters and compilers.