
Program Encoding: gcc -Og -o p p1.c p2.c

Computer Organization

Uploaded by

Arfan Ghani

Program encoding

• Suppose we write a C program as two files p1.c and p2.c. We can then compile
this code using a Unix command line:
• linux> gcc -Og -o p p1.c p2.c

• The command gcc invokes the GCC C compiler.


• The command-line option -Og instructs the compiler to apply a level of
optimization that yields machine code that follows the overall structure of the
original C code.
• We use -Og optimization as a learning tool and then see what happens as we
increase the level of optimization.
• In practice, higher levels of optimization (e.g., specified with the option -O1 or
-O2) are considered a better choice in terms of the resulting program
performance.
• The gcc command runs a sequence of programs to turn the source code into
executable code.
• First, the C preprocessor expands the source code to include any files
specified with #include commands and to expand any macros, specified
with #define declarations.
• Second, the compiler generates assembly code versions of the two source
files having names p1.s and p2.s.
• Next, the assembler converts the assembly code into binary object-code
files p1.o and p2.o.
• Object code is one form of machine code—it contains binary
representations of all of the instructions.
• Finally, the linker merges these two object-code files along with code
implementing library functions (e.g., printf) and generates the final
executable code file p (as specified by the command-line directive -o p).
Machine Level Code
• The machine code for x86-64 differs greatly from the original C code.
• Parts of the processor state are visible that normally are hidden from the C programmer:
• The program counter (commonly referred to as the PC, and called %rip in x86-64)
indicates the address in memory of the next instruction to be executed.
• The integer register file contains 16 named locations storing 64-bit values.
• These registers can hold addresses (corresponding to C pointers) or integer data.
• Some registers are used to keep track of critical parts of the program state, while others
are used to hold temporary data, such as the arguments and local variables of a
procedure, as well as the value to be returned by a function.
• The condition code registers hold status information about the most recently executed
arithmetic or logical instruction. These are used to implement conditional changes in the
control or data flow, such as is required to implement if and while statements.
• A set of vector registers can each hold one or more integer or floating-point values.
Example
• Suppose we write a C code file mstore.c containing the following function definition:
    long mult2(long x, long y);
    void multstore(long x, long y, long *dest) {
        long t = mult2(x, y);
        *dest = t;
    }
• The C code shows a function that makes a procedure call to mult2, passing
arguments x and y, and then stores the return value at the location pointed to by dest.
• On x86-64 Linux, the C type long is 64 bits wide.
• To see the assembly code generated by the C compiler, we can use the -S option on
the command line:
• linux> gcc -Og -S mstore.c
• This will cause gcc to run the compiler, generating an assembly file mstore.s, and go
no further.
• The assembly code generated for multstore is:
    multstore:
        pushq %rbx
        movq %rdx, %rbx
        call mult2
        movq %rax, (%rbx)
        popq %rbx
        ret
• The code follows the calling conventions of the System V AMD64 ABI (Application
Binary Interface).
• Up to six integer or pointer arguments are passed in registers, in the order
%rdi, %rsi, %rdx, %rcx, %r8, %r9.
• Here x is in %rdi, y is in %rsi, and dest is in %rdx.
• %rax holds a return value of up to 64 bits. For a return value of up to 128 bits,
%rdx holds the upper half.
• %rbx is a callee-saved register, so multstore pushes it before using it and pops it
before returning.
• %rdx is caller-saved, so mult2 may overwrite it. The instruction movq %rdx, %rbx
therefore copies dest into %rbx before the call.
• The pushq instruction indicates that the contents of register %rbx
should be pushed onto the program stack. All information about local
variable names or data types has been stripped away.
• If we use the -c command-line option, gcc will both compile and
assemble the code:
• linux> gcc -Og -c mstore.c
• This will generate an object-code file mstore.o that is in binary format
and hence cannot be viewed directly.
• Embedded within the 1,368 bytes of the file mstore.o is a 14-byte
sequence with the hexadecimal representation
• 53 48 89 d3 e8 00 00 00 00 48 89 03 5b c3
• This is the object code corresponding to the assembly instructions
listed previously.
• A key lesson to learn from this is that the program executed by the
machine is simply a sequence of bytes encoding a series of
instructions.
• The machine has very little information about the source code from
which these instructions were generated.
Disassembler
• To inspect the contents of machine-code files, a class of programs
known as disassemblers can be invaluable.
• These programs generate a format similar to assembly code from the
machine code.
• With Linux systems, the program objdump (for “object dump”) can
serve this role given the -d command-line flag:
• linux> objdump -d mstore.o
Data Formats
