SP 16


Dynamic Shared Libraries

Different Types of Libraries


Non-shared library
Like a normal library that an ordinary user creates. Images of used library function calls will be copied into the executable file.

Static shared library


Images of used library function calls will NOT be copied into the executable file. Library function calls will be mapped to fixed addresses in a process's address space.

Dynamic shared library


Images of used library function calls will NOT be copied into the executable file. Library function calls can be mapped to any address in a process's address space.

Dynamic Linking
Dynamic linking defers much of the linking process until a program starts running or even later. It provides several benefits:
Easier to create than a statically linked shared library.
Easier to update than a statically linked shared library.
Semantics are closer to those of unshared libraries.
Permits a program to load and unload routines at run time.

Of course, its performance cost is higher because the linking process needs to be redone every time a program runs. However, most people are willing to pay this overhead given its offered advantages.

ELF Position-Independent Code


ELF shared libraries can be loaded at any address, so they use position-independent code (PIC). The advantage of using PIC is that the text pages of the file need not be relocated and thus can be shared among multiple processes. ELF linkers support PIC with a global offset table (GOT) in each shared library that contains pointers to all of the static data referenced in the program.

How Does the GOT Work?


We observe that an ELF executable consists of a group of code pages followed by a group of data pages, and regardless of where in the address space the program is loaded, the offset from the code to the data does not change. If the code can load its own address into a register, the data will be at a known distance from that address, and references to data in the program's own data segment can use efficient based addressing with fixed offsets. The linker thus creates a global offset table that contains pointers to all of the global data that the executable file addresses.

How Does a Program Use the GOT?


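If a procedure needs to refer to global or static data, it is up to the procedure itself to load the address of the GOT. Example code (the usual i386 sequence, with %ebx as the base register):
    call .L2                                    ;; push the PC onto the stack
.L2:
    popl %ebx                                   ;; pop the PC into %ebx
    addl $_GLOBAL_OFFSET_TABLE_+[.-.L2], %ebx   ;; adjust %ebx to the GOT address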

How Does a Program Use the GOT?


This piece of code stores the PC value into the base register and then adds the difference between the address of the GOT and the current address to the base register. (After this, the base register points to the GOT.) In an object file generated by a compiler, there is a special R_386_GOTPC relocation item for the operand of the addl instruction. This item tells the linker to substitute in the offset from the current instruction to the base address of the GOT, and it also serves as a flag to the linker to build a GOT in the output file.

GOT Operation

Reference Data
Once the GOT register is loaded, code can reference local static data using the GOT register as a base register, because the distance from a static datum in the program's data segment to the GOT is fixed at link time. Addresses of global data are not bound until the program is loaded. So, to reference global data, code has to load a pointer to the data from the GOT and then dereference that pointer. This extra memory reference makes programs somewhat slower, a cost most programmers are willing to pay. The pointer itself must be stored into the GOT when the program is dynamically linked.

Relocation Record Types


To support PIC, ELF defines some special relocation types for code that uses the GOT.
R_386_GOTPC
For letting the base register point to the GOT.

R_386_GOT32
This is the relative location of the slot in the GOT where the linker has placed a pointer to the given global symbol. The compiler only creates the R_386_GOT32 reference. It is up to the linker to collect all such references and make slots for them in the GOT.

Relocation Record Types (cont'd)


R_386_GOTOFF
This is the distance from the base of the GOT to the given symbol address. It is used to address static data relative to the GOT.

R_386_RELATIVE
This is used to mark data addresses in a PIC shared library that need to be relocated at load time. The run-time loader uses this information to do load-time relocation. Note that the code is usually PIC and sharable. The data, however, is usually neither sharable (each process has its own copy in physical memory) nor PIC, so it may need to be relocated. For example:
char buf[100];
char *datap = &buf[0];

Program Reference Example

Procedure Linkage Table (PLT)


To support dynamic linking, each ELF shared library and each executable that uses shared libraries has a procedure linkage table. The PLT adds a level of indirection for function calls analogous to that provided by the GOT for data. The PLT permits lazy evaluation, that is, not resolving procedure addresses until they are called for the first time. (dynamic loading and linking)

PLT Operation

Lazy Binding
Programs that use shared libraries generally contain calls to a lot of functions. In a single run of the program, many of the functions are never called. To speed program startup, dynamically linked ELF programs use lazy binding of procedure addresses. This is accomplished by means of a PLT. Each dynamically bound program has a PLT, with the PLT containing an entry for each nonlocal routine called from the program.

PLT and Lazy Binding


All calls within the program to a particular routine are adjusted to be calls to the routine's entry in the PLT. The first time the program calls a routine, the PLT entry calls the run-time linker to resolve the actual address of the routine. After that, the PLT entry jumps directly to the actual address. So, after the first call, the cost of using the PLT is a single indirect jump at a procedure call and nothing at return.

PLT Details
The first entry in the PLT, which is called PLT0, is special code to call the dynamic linker. At load time, the dynamic linker automatically places two values in the GOT.
At GOT+4, it puts a code that identifies the particular library.
At GOT+8, it puts the address of the dynamic linker's symbol resolution routine.

The rest of the PLT entries, which we call PLTn, each start with an indirect jump through a GOT entry that is initially set to point to the push instruction in the PLT entry that follows the jmp.

PLT Details
Following the jmp is a push instruction that pushes a relocation offset. The offset is the offset of a special relocation entry of type R_386_JMP_SLOT in the file's relocation table. The relocation entry's symbol reference points to the symbol in the file's symbol table and its address points to the GOT entry.

PLT Details
The first time the program calls a PLT entry, the first jump in the PLT entry in effect does nothing, because the GOT entry through which it jumps points back into the PLT entry. Then the push instruction pushes the offset value, which indirectly identifies both the symbol to resolve and the GOT entry into which to resolve it, and the entry jumps to PLT0. The instructions in PLT0 push another code that identifies which program it is, and then jump into stub code in the dynamic linker with the two identifying codes at the top of the stack. The return address back to the routine that called into the PLT is also on the stack, beneath the two codes.

PLT Details
Now the stub code saves all registers and calls an internal routine to do the resolution. The two identifying words suffice to find the library's symbol table and the routine's entry in that symbol table. The dynamic linker looks up the symbol value using the run-time symbol table and stores the routine's address into the GOT entry. (Dynamic loading is also possible here.)

PLT Details
Then the stub code restores the registers, pops the two words that the PLT pushed, and jumps off to the routine. With the GOT entry having been updated, subsequent calls to that PLT entry jump directly to the routine itself without entering the dynamic linker.

ELF Shared Library


An ELF shared library contains all of the linker information that the run-time linker will need to relocate the file and resolve any undefined symbols. The .dynsym section contains all of the file's imported and exported symbols. The .dynstr and .hash sections contain the name strings for the symbols and a hash table the run-time linker can use to quickly look up symbols.

ELF Shared Library


An ELF dynamically linked program looks much the same as an ELF shared library. The difference is that it has init and fini routines, and an .interp section near the front of the file to specify the name of the dynamic linker (ld.so).

Loading a Dynamically Linked Program


When the operating system runs the program, it maps in the file's pages as normal but notes that there is an interpreter section in the executable. The specified interpreter is the dynamic linker, ld.so, which is itself in ELF shared library format. Rather than starting the program, the system maps the dynamic linker into a convenient part of the address space as well and starts ld.so. ld.so is given some information such as the program's entry point.

Loading a Dynamically Linked Program


The linker then initializes a chain of symbol tables with pointers to the program's symbol table and the linker's own symbol table. Then the linker starts to find all libraries that are needed for this program. This information can be gathered because the program's program header has a pointer to the dynamic segment that contains dynamic linking information for the program. That segment contains a pointer DT_STRTAB to the file's string table and entries DT_NEEDED, each of which contains the offset in the string table of the name of a required library.

Finding Library
Once the linker has found the file containing the library, the dynamic linker opens the file and reads the ELF header to find the program header, which in turn points to the file's segments including the dynamic segment. The linker allocates space for the library's text and data segments and maps them in, along with zeroed pages for bss. From the library's dynamic segment, it adds the library's symbol table to the chain of symbol tables, and if the library requires further libraries that are not already loaded, it adds any new libraries to the list to be loaded.

Finding Library
When this process terminates, all of the libraries have been mapped in, and the loader has a logical global symbol table consisting of the union of all of the symbol tables of the program and the mapped libraries.

Shared Library Initialization


Now the linker revisits each library and handles the library's relocation entries, filling in the library's GOT and performing any relocation needed in the library's data segment. If a library has an .init section, the linker calls it to do library-specific initializations, such as C++ static constructors, and any .fini section is noted to be run at exit time. When this pass is done, all of the libraries are fully mapped and ready to execute, and the loader calls the program's entry point to start the program.

An Example C Program
#include <stdio.h>

int xx, yy;

int main(void)
{
    xx = 1;
    yy = 2;
    printf("xx %d yy %d\n", xx, yy);
    return 0;
}

ELF Header Information


shieyuan3# objdump -f a.out

a.out:     file format elf32-i386
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x080483dc

Dynamic Section
Dynamic Section:
  NEEDED      libc.so.4
  INIT        0x8048390
  FINI        0x8048550
  HASH        0x8048128
  STRTAB      0x80482c8
  SYMTAB      0x80481b8
  STRSZ       0xad
  SYMENT      0x10
  DEBUG       0x0
  PLTGOT      0x8049584
  PLTRELSZ    0x18
  PLTREL      0x11
  JMPREL      0x8048378

The NEEDED entry names libc.so.4, the shared library that must be linked in for printf().

Section Header
Sections:
Idx Name           Size      VMA       LMA       File off  Algn
  0 .interp        00000019  080480f4  080480f4  000000f4  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag  00000018  08048110  08048110  00000110  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .hash          00000090  08048128  08048128  00000128  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .dynsym        00000110  080481b8  080481b8  000001b8  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .dynstr        000000ad  080482c8  080482c8  000002c8  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .rel.plt       00000018  08048378  08048378  00000378  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .init          0000000b  08048390  08048390  00000390  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  7 .plt           00000040  0804839c  0804839c  0000039c  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  8 .text          00000174  080483dc  080483dc  000003dc  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE

Section Header (cont'd)


Idx Name           Size      VMA       LMA       File off  Algn
  9 .fini          00000006  08048550  08048550  00000550  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
 10 .rodata        0000000e  08048556  08048556  00000556  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
 11 .data          0000000c  08049564  08049564  00000564  2**2
                   CONTENTS, ALLOC, LOAD, DATA
 12 .eh_frame      00000004  08049570  08049570  00000570  2**2
                   CONTENTS, ALLOC, LOAD, DATA
 13 .ctors         00000008  08049574  08049574  00000574  2**2
                   CONTENTS, ALLOC, LOAD, DATA
 14 .dtors         00000008  0804957c  0804957c  0000057c  2**2
                   CONTENTS, ALLOC, LOAD, DATA
 15 .got           00000018  08049584  08049584  00000584  2**2
                   CONTENTS, ALLOC, LOAD, DATA
 16 .dynamic       00000070  0804959c  0804959c  0000059c  2**2
                   CONTENTS, ALLOC, LOAD, DATA
 17 .bss           00000024  0804960c  0804960c  0000060c  2**2
                   ALLOC
 18 .stab          000001bc  00000000  00000000  0000060c  2**2
                   CONTENTS, READONLY, DEBUGGING
 19 .stabstr       00000388  00000000  00000000  000007c8  2**0
                   CONTENTS, READONLY, DEBUGGING
 20 .comment       000000c8  00000000  00000000  00000b50  2**0
