CSE 410: Systems Programming

Process Anatomy

Ethan Blanton
Department of Computer Science and Engineering
University at Buffalo
Introduction Executable Format Memory Layout Summary Coming Up References

Last Time

Structures and structure pointers


Dynamic memory allocation
Integer representation
Floating point numbers

© 2018 Ethan Blanton / CSE 410: Systems Programming



Processes
What is a process?
From the text:

[A] process is an instance of a program in execution.

If a program is a set of machine instructions, a process is:


Those instructions
The memory they use
The system resources they access


Programs

The program that a process runs is loaded from an executable.


An executable is an object file intended to be loaded into a
process.
Once loaded, the system provides an execution environment.


UNIX Processes

A UNIX process is protected from other processes:


It has its own memory.
It appears to execute on a dedicated CPU.
The system services it uses are dedicated to it.

Hardware assistance is required to maintain this environment.


This Lecture

In this lecture, we will look at:


Program (executable) structure
Memory layout


Executable Formats
Each executable is stored in an executable format.
An executable format provides:
Information about the environment the program requires
The program code itself
Other metadata

Early executables were essentially raw memory dumps.


Such dumps are simply copied into memory and executed.
Modern executables are somewhat more complicated.


ELF

Many modern systems use ELF: Executable and Linking Format.


(Windows uses PE; macOS uses Mach-O.)

ELF executables and libraries contain two types of information:


Information required to load and execute the object
Information required to link the object

ELF objects correspond to translation units.


They provide only linking information


Linking
Linking is the process of creating an executable or shared library
from multiple object files.
The linker in the C compiler toolchain performs this task.
It involves:
Cataloging symbols provided by various object files
Cataloging symbols provided by external libraries
Identifying symbols required by objects and libraries
Binding provided symbols to required symbols


Loading
Loading is the process of moving an executable or shared
library into memory for execution.
On Linux, the kernel begins loading, and ld-linux.so finishes it.
The kernel moves various portions of the ELF executable
into place
The kernel moves the loader into place
The kernel invokes the loader, which performs various
changes to the in-memory program data
The loader jumps to the start of the program


ELF Structure

The data in an ELF file is mapped into sections and segments.


Sections describe the file for the linker.
Segments describe the file for the loader.
The two views typically have significant correspondence.
We will think about an ELF executable in terms of sections.


ELF Sections
Each part of a process is represented in some section.
There are many possible sections, but we will consider:
Text: The actual program code, as executed by the
processor. ELF calls this .text.
Data: Non-code data that has some value defined at
compile time; for example: strings, constants, some global
variables, etc. ELF calls this .data.
BSS: The “block started by symbol”. This is non-code data
that has no value defined at compile time; for example,
declared global variables with no initializer. ELF calls this
.bss.
Note: Text, data, and BSS are all historical names.

ELF Object Layout


The ELF header describes the type of object (platform,
endianness, etc.).
The segment table tells the loader where the parts of the file
should be placed in memory.
The section table describes the sections for the linker.

File layout (diagram, top to bottom):
    ELF Header
    Segment Table
    Sections
    Section Table


From Program to Process

The executable file is loaded into memory to become a process.


The memory layout of the process mimics the ELF sections.
The system ascribes additional semantics to the loaded layout.
Most POSIX systems will use the same (or similar) layout.


Basic Layout
The lowest addresses are not used, specifically so that NULL
remains invalid!
The text and data sections come directly from the ELF file.
The BSS doesn’t actually appear in the file!
The stack and heap are set up by the loader.

Address space (diagram, highest address first):
    0xffffffffffffffff  Kernel
                        Process Stack
                        Unmapped
                        ---- brk ----
                        Heap
                        BSS
                        Data
                        Text (program code)
    0x0 (NULL)          Unmapped


Layout Caveats

As your text points out, modern systems also:


map shared libraries, which are code used by a program
that does not appear in its ELF file
randomize the location of the mapped sections
However, the logical layout remains the same.
In particular, the order of the sections is maintained.


The Text Section

The text section contains the actual program instructions.


The assembler emits binary machine instructions that are placed
in the ELF .text section by the linker.
The kernel copies the text into the process’s memory, and the
loader prepares it for execution by modifying various memory
locations.


Data & BSS


The data section contains variables and constants that have
known initial values at compile time.
The linker inserts this data into the ELF .data section and the
kernel loads it into the process’s memory.
The BSS contains variables that have no value at compile time.
The compiler identifies variables in the BSS and records their
locations, but does not store them in the ELF image.
The kernel makes space for the BSS when it loads the program.


The Stack and the Heap

The stack contains local variables for function calls.


The heap contains explicitly allocated memory.
Both the stack and the heap can grow.
This is why there is unmapped space between the heap and the stack.


The Stack
The stack grows downward as functions are called and shrinks
when they return.
The kernel manages the size of the stack automatically.
Each function called has a stack frame that contains:
The arguments to the function
Local variables

Stack (diagram, base of stack first):
    Base of Stack               System Info
    main                        argv
                                argc
                                Saved PC (main)
    function called by main     arguments
                                local variables
    Top of Stack (Stack Pointer)
    (Toward Heap)


The Heap

The heap does not grow automatically.


The kernel maintains a program break: the boundary between the
process’s data (including the heap) and the unmapped memory
between the heap and the stack.
The program can request that the program break be moved.
Moving the break toward the stack makes more heap space.


Summary
A program is code that can be executed; a process is that
code running on a system.
The linker joins multiple objects into an executable.
A loader prepares a program that has been copied into
memory for execution.
Program code (text), initialized data (data), and uninitialized
data (bss) are present in both a program and a process.
The heap and stack can both grow, the former “upward”
toward higher addresses and the latter “downward” toward
lower addresses.


Next Time …

Process creation
Kernel services
More about the execution environment


References I
Required Readings
[1] Randal E. Bryant and David R. O’Hallaron. Computer Systems: A Programmer’s
Perspective. Third Edition. Chapter 7: Intro, 7.1-7.5. Pearson, 2016.

Optional Readings
[2] John R. Levine. Linkers & Loaders. Chapter 3: 3.1, 3.7. Morgan Kaufmann Publishers,
2000.


License

Copyright 2018 Ethan Blanton, All Rights Reserved.


Reproduction of this material without written consent of the
author is prohibited.
To retrieve a copy of this material, or related materials, see
https://www.cse.buffalo.edu/~eblanton/.
