09 Memory
Chapter 9
Background on Main Memory
• Memory is a large(ish) linearly addressable array of bytes
– Typically, a byte is an 8-bit octet, but memory could be organized in different units.
– Think of main memory as being big, but smaller than you’d
like it to be
• Registers and main memory are the only storage the
CPU can directly access
– So, a program must be brought into main memory
(typically from disk) before it can be run.
– Registers are fast, main memory is slower, cache helps to
make up the difference
Memory Layout
• Pretend main memory is divided into two
regions
– Space for a resident operating system
• Maybe in low memory addresses
• With structures like the interrupt vector table and
memory-mapped I/O
– Space for multiple user processes in the rest of
memory
• Want to fit in as many user processes as we
can
A Simple System for Memory
Protection
• The OS allocates a region of memory to each process.
• We use a base and a limit register for memory protection (for now)
– Any access outside the range [base, base + limit) is trapped
– Memory must be allocated as one big, contiguous block for each process.
Pretend (tiny)
memory addresses.
A Simple System for Memory
Protection
• Within each process, memory is organized into
different regions.
– A text section for executable code
– A data section for global variables
– A stack for local variables
– A heap for dynamic allocation
– Example program:
MemoryRegions.c
Address Binding
• Address Binding: how we choose the physical memory addresses that program code uses.
– Really, the addresses for each program symbol
– Compile time: compiler chooses an address when it
builds the program
• This builds absolute code, must be run in a particular
location and rebuilt if it’s going to run elsewhere
– Load time: Choose memory address when the
program starts running
– Execution time: Can move after it’s started running
We’re probably going to need more help from the
hardware.
Building a User Program
Static Linking
• Static Linking
– Do all linking at build
time
– Application code and
libraries linked into a
single load image
– Ready to execute, just
copy into memory
and run.
Static Linking
• Executable includes a copy
of needed libraries, but
just the parts it needs
• So, somewhat reduced
memory footprint
$ ls -l libc.a
3153906 bytes
$ gcc -static Hello.c
$ ls -l a.out
615997 bytes
Linking at Program Start-Up
Dynamic Linking
• Dynamic Linking
– Link with library code
at load time
– Get the latest
(compatible) version of
every library
– Smaller executable
image
$ gcc Hello.c
$ ls -l a.out
7096 bytes
Dynamic Linking
• Can be done with no special help from the OS
– Initially, a small piece of code, a stub, gets called
instead of the subroutine
– On the first call, the stub finds and replaces itself
with the actual subroutine
• Particularly useful for libraries
Dynamic Linking
• Programs share
common code on
disk,
• But, not
necessarily in
memory.
Shared Libraries
• Shared Dynamically
Linked Libraries
– Just one copy of the
library on disk,
– … and in memory.
– Significantly smaller
memory footprint
– Definitely requires help
from the OS.
– And, we’re going to
need more than
base/limit registers
Dynamic Loading
Dynamic Loading
• We want the most out of a limited physical memory
• Dynamic Loading : Load parts of the program when
they are first needed
– e.g., load a subroutine when it's first called
– Overlays: same region of memory used for several
subroutines, one after another
• Useful if large amounts of code are needed later in
execution (or maybe not at all)
– Faster start-up times
– Reduced process memory footprint (maybe)
• No help from the OS required … but it might be nice.
Swapping
• When running lots of processes, memory may fill
up
– For example, my laptop is running 238 processes right now (and another 152 inside a virtual machine)
– But, most of them are idle most of the time
• Solution: temporarily remove an entire process from memory
– Backing store: fast, reasonably large storage (usually disk) used to hold the memory contents of processes that don't reside in main memory right now
– Bring process’ memory image back into physical
memory when there’s more room
What’s Swapping Look Like?
Swapping
• This is called swapping
– Process PCB stays in the process table
– Really, we are preempting a process’ memory
– So, this is a different way a process can block
• Major cost of swapping is transfer time
– Total transfer time is proportional to process
memory size
Making Room for Processes
• Address binding matters
• With compile-time address
binding
– We have to put a process in the
memory it was compiled for
We ended up with a
small gap between
processes.
Fragmentation
• Some memory is going to be wasted
• External Fragmentation
– There may be some wasted memory between
allocated regions
– Lots of holes, but all too small to use
• 50-percent rule
– With first-fit allocation, if a total of n bytes of memory have been allocated, another 0.5 n bytes will be lost to fragmentation
– That is, about one-third of memory may be unusable
Memory Partitioning for User
Processes
• If a hole is very close to the size of a memory
request
– It may not be practical to keep track of the left-over memory.
"You tried to access logical address 25." "I know; you really mean physical address 325."
Address Translation with Base / Limit
• We can create a simple address translation
scheme using just base & limit registers
– All logical addresses are relative to the base.
"Same logical address 25 from the base register?" "Now it translates to physical address 125." "That's physical address 325."
Making Programs Relocatable
• We’re using the base register as a relocation
register
– Adding it to every logical address to get the
corresponding physical address
The Memory-Management Unit
(MMU)
• A hardware device that handles mapping
logical to physical addresses
• Typically, part of the CPU
• For example, we can use base as a relocation
register
– The MMU will add the base register to every
logical address
– Automatically, whenever the CPU is in user mode
– Fast, simple, easy to implement this kind of MMU
The Memory-Management Unit
(MMU)
• The user program only sees logical addresses
– It never sees the real physical addresses
– Really, why would you want to?
• Of course, the OS still has to think about
physical addresses (and each process’ logical
address space). The process gives us a logical
address.
– Consider a call like:
read( fd, buffer, 100 );
But, we’ll need to give the device
controller a physical address.
Address Translation Requirements
• However, just a base and limit register won’t
be enough to let us
implement
– Shared memory
– Shared libraries
– Other things we’ll
talk about later
Paged Memory
• Memory allocation would be easier if process memory
didn’t need to be contiguous
– Can we do this? Let processes use a little bit of physical
memory here, and a little bit there?
– Process still needs to be able to see its logical address space as
contiguous (why?)
– We certainly need this for shared libraries
• OK, we’ll allocate process memory as multiple fixed-sized
blocks, called pages
• Process memory = a sequence of pages
– Divide the process’ logical address space into fixed-sized pages
• Physical memory = a sequence of page-sized frames
– Divide physical memory into frames , each able to hold a page
– Size typically a power of 2, between 512 bytes and 8,192 bytes
Paged Memory
• To run a program of
size n pages, we just
need n free frames
(anywhere in memory)
– We’ll tolerate some
internal fragmentation
– But, we’ll virtually
eliminate external
fragmentation
Paged Memory
• But, if process pages
could be scattered all
over physical memory
…
– How do we know how
to access a particular
byte of a process’
memory?
– We need a table that
says where process
pages are.
– We need a page table
Page Table Organization
• Line 0 says what
frame holds page 0
• Line 1 says what
frame holds page 1
• And so on
• It’s an array of frame
numbers (indexed by
page)
Logical and Physical views of Memory
Multiple Processes and Page Tables
• Each process has its
own page table
– Byte 3 of process A
– Byte 8 of process B
– Byte 14 of process A
TLB
• Each TLB entry maps a 20-bit page number to a frame number, plus status bits (Rd, M, Re):
20-bit page number | Rd | M | Re | Frame
453282             |  1 |  1 |  1 | 31245
132054             |  0 |  0 |  1 | 725418
• With two-level paging, a logical address splits into p1 (10 bits), p2 (10 bits), and offset d (12 bits).
A Multi-Level Page Table
• Consider a system with:
– 64-bit logical/physical addresses
– 64 KB page size
1. How many bits for the offset in a logical address?
2. How many bits for the page number?
3. How many lines in the (inner, level-1) page table?
4. Total size of the (inner) page table?
5. How many lines of the page table fit on a page?
6. So, how many lines of a 2nd-level page table?
7. How many lines for a 3rd-level page table?
8. How many levels?
A Multi-Level Page Table
1. How many bits for the offset in a logical address?
That’s the low-order 16 bits.
2. How many bits for the page number?
That’s the high-order 48 bits.
3. How many lines in the (inner, level-1) page table?
One for each page, so 2^48
4. Total size of the (inner) page table?
At 8 bytes per entry, that's 2^48 * 8 = 2^51 bytes
5. How many lines of the page table fit on a page?
2^16-byte page size / 8 bytes per entry = 2^13
6. So, how many lines of a 2nd-level page table?
1/2^13 times as many as the (inner) page table, so 2^35
7. How many lines for a 3rd-level page table?
1/2^13 times as many as the level-2 page table, so 2^22
8. How many levels?
The level-4 page table has 2^9 entries. At 8 bytes each, it fits in a page, so four levels suffice.
Paging on the AMD64
• The AMD64 architecture uses:
– 4 KB page size (or 2 MB or 1 GB, your choice)
– A sneaky trick: just use the low-order 48 bits of the logical address
– That explains something we saw earlier: 8 bytes per page-table entry
– We can figure out the structure of this paging system … let's do that
AMD64 Illustrated
• Here’s what this paged
memory hierarchy
looks like: