Modern C++

Programming
12. Translation Units II
Include, Module, and Namespace

Federico Busato
2023-11-14
Table of Contents

1 #include Issues
Include Guard
Forward Declaration
Circular Dependencies
Common Linking Errors

2 C++20 Modules
Overview
Terminology
Visibility and Reachability
Module Unit Types
Keywords
Global Module Fragment
Private Module Fragment
Header Module Unit
Module Partitions

3 Namespace
Namespace Functions vs. Class + static Methods
Namespace Alias
Anonymous Namespace
inline Namespace
Attributes for Namespace

4 Compiling Multiple Translation Units
Fundamental Compiler Flags
Compile Methods
Deal with Libraries
Build Static/Dynamic Libraries
Find Dynamic Library Dependencies
Analyze Object/Executable Symbols

#include Issues
Include Guard 1/3

The include guard avoids the problem of multiple inclusions of a header file in a
translation unit
header.hpp:
# ifndef HEADER_HPP // include guard
# define HEADER_HPP

... many lines of code ...

# endif // HEADER_HPP

The #pragma once preprocessor directive is an alternative to the include guard that forces the
current file to be included only once in a translation unit
• #pragma once is non-standard (although supported by all major compilers), but it is less
verbose and may compile faster than the include guard

The include guard/#pragma once should be used in every header file


Include Guard 2/3

Common case:

Include Guard 3/3

header_A.hpp:
# pragma once // prevents the "redefinition of struct A" compile error

struct A {
};

header_B.hpp:
# include "header_A.hpp" // included here

struct B {
A a;
};

main.cpp:
# include "header_A.hpp" // .. and included here
# include "header_B.hpp"
int main() {
A a; // ok, here we need "header_A.hpp"
B b; // ok, here we need "header_B.hpp"
}
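
The effect of the guard can be sketched in a single file by pasting a guarded header body twice, which is roughly what the preprocessor sees when header_A.hpp is reached both directly and through header_B.hpp. The macro name HEADER_A_HPP, the data member, and the helper guarded_value() are assumptions made for this illustration only.

```cpp
// First "inclusion" of header_A.hpp: the macro is not defined yet
#ifndef HEADER_A_HPP
#define HEADER_A_HPP
struct A {
    int value = 1;
};
#endif // HEADER_A_HPP

// Second "inclusion": HEADER_A_HPP is already defined, so the preprocessor
// skips the body and 'struct A' is not redefined
#ifndef HEADER_A_HPP
#define HEADER_A_HPP
struct A {
    int value = 2; // never seen by the compiler
};
#endif // HEADER_A_HPP

struct B {
    A a;
};

// Hypothetical helper: reads the value from the only definition of A
inline int guarded_value() { return B{}.a.value; }
```

Without the guard lines, the second copy would be a redefinition of A and the translation unit would not compile.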
Forward Declaration

A forward declaration is a declaration of an identifier for which a complete definition
has not yet been given. "Forward" means that an entity is declared before it is defined
void f(); // function forward declaration

class A; // class forward declaration

int main() {
f(); // ok, f() is defined in the translation unit
// A a; // compiler error no definition (incomplete type)
// e.g. the compiler is not able to deduce the size of A
A* a; // ok
}

void f() {} // definition of f()


class A {}; // definition of A
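
As a further sketch of what an incomplete type allows, the snippet below uses a forward-declared class in function declarations and as a pointer member, and needs the definition only where members are accessed. The names Matrix, MatrixView, make_identity(), and rows_of() are hypothetical and not part of the slides.

```cpp
class Matrix;                        // forward declaration (incomplete type)

// Declarations may name an incomplete type in parameters and return types
Matrix* make_identity();
int     rows_of(const Matrix& m);

// A pointer member only needs the declaration, not the definition
struct MatrixView {
    const Matrix* target = nullptr;
};

// The definition must be visible before members are accessed
class Matrix {
public:
    int rows = 2;
};

int rows_of(const Matrix& m) { return m.rows; }

Matrix* make_identity() {
    static Matrix identity;          // creating an object needs the complete type
    return &identity;
}
```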
Forward Declaration vs. #include

Advantages:
• Forward declarations can save compile time as #include forces the compiler to open
more files and process more input

• Forward declarations can save on unnecessary recompilation. #include can force your
code to be recompiled more often, due to unrelated changes in the header

Disadvantages:
• Forward declarations can hide a dependency, allowing user code to skip necessary
recompilation when headers change

• A forward declaration may be broken by subsequent changes to the library

• Forward declaring multiple symbols from a header can be more verbose than simply
#including the header

google.github.io/styleguide/cppguide.html#Forward_Declarations
Circular Dependencies 1/3

A circular dependency is a relation between two or more modules which either
directly or indirectly depend on each other to function properly

Circular dependencies can be solved by using forward declarations or, better, by
rethinking the project organization
Circular Dependencies 2/3

header_A.hpp:
# pragma once // first include
# include "header_B.hpp"
class A {
    B* b;
};

header_B.hpp:
# pragma once // second include
# include "header_C.hpp"
class B {
    C* c;
};

header_C.hpp:
# pragma once // third include
# include "header_A.hpp"
class C { // compile error "header_A.hpp": already included by "main.cpp"
    A* a; // the compiler does not know the meaning of "A"
};
Circular Dependencies (fix) 3/3
header_A.hpp:
# pragma once
class B; // forward declaration
// note: does not include "header_B.hpp"
class A {
    B* b;
};

header_B.hpp:
# pragma once
class C; // forward declaration
class B {
    C* c;
};

header_C.hpp:
# pragma once
class A; // forward declaration
class C {
    A* a;
};
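
A forward declaration is enough only while the other class is used through pointers or references. The single-file sketch below shows the usual follow-up step: code that accesses members of the other class moves to a source file, where the full header can be included safely. The data members, the method b_id(), and the file name source_A.cpp are assumptions added for illustration.

```cpp
// "header_A.hpp": only a pointer to B is stored, so a declaration is enough
class B;
class A {
public:
    B* b = nullptr;
    int b_id() const;    // declared here, defined once B is complete
};

// "header_B.hpp"
class A;
class B {
public:
    A* a = nullptr;
    int id = 42;
};

// "source_A.cpp": this is where "header_B.hpp" would be included,
// because accessing b->id requires the complete type of B
int A::b_id() const { return b->id; }
```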
Common Linking Errors

Very common linking errors:

• undefined reference
Solutions:
- Check if the right headers and sources are included
- Break circular dependencies (could be hard to find)

• multiple definitions
Solutions:
- inline function, variable definition or extern declaration
- Add include guard/ #pragma once to header files
- Place template definition in header file and full specialization in source files
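
The remedies for multiple definitions can be sketched as the contents of one header plus one source file, shown here in a single listing. The file names config.hpp and config.cpp and all identifiers are hypothetical; the inline variable assumes C++17.

```cpp
// Contents of a hypothetical header "config.hpp" that is safe to include
// from several translation units

// inline function: identical definitions in different units are merged
inline int default_port() { return 8080; }

// inline variable (C++17): one shared object across all units
inline int retry_count = 3;

// extern declaration: the single definition lives in exactly one .cpp file
extern int verbosity;

// template: the definition stays in the header
template <typename T>
T twice(T value) { return value + value; }

// full specialization: it behaves like an ordinary function, so it needs
// 'inline' here or must be moved to a source file
template <>
inline int twice<int>(int value) { return 2 * value; }

// Contents of the single source file "config.cpp"
int verbosity = 1;
```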

C++20 Modules
C++20 Modules 1/2

The #include problem: duplication of work. The same header files may be
parsed/compiled multiple times, and most of the compiled output is later
thrown away by the linker
C++20 introduces modules as a robust replacement for plain #include
Module (C++20)
A module is a set of source code files that are compiled independently of the
translation units that import them

Modules allow defining clearer interfaces with fine-grained control over what to
import and export (similar to Java, Python, Rust, etc.)

• A Practical Introduction to C++20’s Modules
• Modules the beginner’s guide
• Understanding C++ Modules
• Overview of modules in C++
C++20 Modules 2/2

Less error-prone than #include :


• No effect on the compilation of the translation unit that imports the module
• Macros, preprocessor directives, and non-exported names declared in a module are
not visible outside the module
• Declarations in the importing translation unit do not participate in overload
resolution or name lookup in the imported module
Other benefits:
• (Much) Faster compile time. After a module is compiled once, the results are
stored in a binary file that describes all the exported types, functions, and
templates
• Smaller binary size. Modules allow incorporating only the imported code and not the
whole #include
Terminology

A module consists of one or more module units

A module unit is a translation unit that contains a module declaration


module my.module.example;

A module name is a concatenation of identifiers joined by dots (the dot carries no
meaning), e.g. my.module.example

A module unit purview is the content of the translation unit from the module declaration to the end of the file

A module purview is the set of the module unit purviews of a given module name
Visibility and Reachability

Visibility of names determines whether a name declared in one translation unit can be used by
another translation unit. Visible also means a candidate for name lookup

Reachability of declarations means that the semantic properties of an entity are
available

• Each visible declaration is also reachable
• Not all reachable declarations are also visible
Reachability Example

Common example: the members of a class are reachable (i.e. can be used) or the class
size is known, but not the class type itself
auto g() {
struct A {
void f() {}
};
return A{};
}
//---------------------------------------------------------------------------------

int main() {
    auto x = g();          // ok
    // A y = g();          // compile error, "A" is unknown at this point
    x.f();                 // ok
    sizeof(x);             // ok
    using T = decltype(x); // ok
}
Module Unit Types

• A module interface unit is a module unit that exports a symbol and/or module
name or module partition name

• A primary module interface unit is a module interface unit that exports the
module name. There must be one and only one primary module interface unit in
a module

• A module implementation unit is a module unit that does not export a module
name or module partition name

A module interface unit should contain only declarations if one or more module
implementation units are present. A module implementation unit
implements/defines the declarations of module interface units
Keywords

module specifies that the file is a named module


module my.module; // first code line

import makes a module and its symbols visible in the current file
import my.module; // after module declaration and #include

export makes symbols visible to the files that import the current module
• export module <module name> makes visible all the exported symbols of a
module. It must appear once per module in the primary module interface unit
• export namespace <namespace> makes visible all symbols in a namespace
• export <entity> makes visible a specific function, class, or variable
• export {<code>} makes visible all symbols in a block
import Example

# include <iostream>

int main() {
std::cout << "Hello World";
}
Preprocessing size -E : ∼1MB
import <iostream>;

int main() {
std::cout << "Hello World";
}
Preprocessing size: 236B (x500)
Compile time: 2x (up to 10x) less
g++-12 -std=c++20 -fmodules-ts main.cpp -x c++-system-header iostream
export Example - Single Primary Module Interface Unit

my_module.cpp
export module my.example; // make visible all module symbols

export int f1() { return 3; } // export function

export namespace my_ns { // export namespace and its content
int f2() { return 5; }
}

export { // export code block
int f3() { return 2; }
int f4() { return 8; }
}

void internal() {} // NOT exported. It can be used only internally
export Example - Two Module Interface Units

my_module1.cpp Primary Module Interface Unit

export module my.example; // This is the only file that exports the module name

export import :part2; // re-export the symbols of the second interface unit

export int f1() { return 3; } // export function

my_module2.cpp Module Interface Unit (partition; the name "part2" is illustrative)

export module my.example:part2; // an additional interface unit must be a module partition

export namespace my_ns { // export namespace
int f2() { return 5; }
}

export { // export code block
int f3() { return 2; }
int f4() { return 8; }
}
export Example - Module Interface and Implementation Units

my_module1.cpp Primary Module Interface Unit

export module my.example; // This is the only file that exports all module symbols

export int f1(); // export function

export { // export code block
int f3();
int f4();
}

my_module2.cpp Module Implementation Unit

module my.example; // Module declaration but symbols are not exported

int f1() { return 3; }

int f3() { return 2; }
int f4() { return 8; }
Keyword Notes

import
• A module implementation unit can import another module, but cannot
export any names. Symbols of the module interface unit are imported implicitly

• All import declarations must appear before any other declaration in that module unit and
after module; and export module (if present)

export
• Symbols with internal linkage or no linkage cannot be exported, e.g. entities in anonymous
namespaces and static entities

• The export keyword is used in module interface units only

• The semantic properties associated to exported symbols become reachable


export import Declaration

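Imported modules can be directly re-exported

export module main_module; // Top-level primary module interface unit

export import sub_module; // import and re-export "sub_module"
//---------------------------------------------------------------------------------
export module sub_module; // Primary module interface unit

export void f() {}
//---------------------------------------------------------------------------------
import main_module;

int main() {
    f(); // ok, f() is visible
}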
Global Module Fragment

A global module fragment (unnamed module) can be used to include header files in
a module interface when importing them is not possible or preprocessing directives are
needed
module; // start Global Module Fragment

# define ENABLE_FAST_MATH
# include "my_math.h"

export module my.module; // end Global Module Fragment

Macro definitions or other preprocessing directives are not visible outside the file itself

Private Module Fragment

A private module fragment allows a module to be represented as a single translation
unit without making all of the contents of the module reachable to importers
→ A modification of the private module fragment does not cause recompilation of the importers
If a module unit contains a private module fragment, it must be the only module unit of
its module
export module my.example;
export int f();

module :private; // start private module fragment

int f() { // definition not reachable from importers of f()
    return 42;
}
Header Module Unit

Legacy headers can be directly imported with import instead of #include

All declarations are implicitly exported and attached to the global module
(fragment)

• Macros from the header are available for the importer, but macros defined in the
importer have no effect on the imported header
• Importing compiled declarations is faster than #include

C++23 introduces modules for the standard library (import std;)
Module Partitions 1/2

A module can be organized in isolated module partitions

Syntax:
export module module_name : partition_name;

• Declarations in any of the partitions are visible within the entire module
• Like common modules, a module partition consists of one module partition
interface unit and zero or more module partition implementation units
• Module partitions are not visible outside of the module
• Module partitions do not implicitly import the module interface
• All names exported by partition interface files must be imported and
re-exported by the primary module interface file
Module Partitions 2/2

main_module.ixx
export module main_module;

export import :partition1; // re-export f() to importers of "main_module"
export import :partition2; // re-export g() to importers of "main_module"

export void h() { internal(); } // internal() can be directly used

partition1.ixx
export module main_module:partition1;

export void f() {}

partition2.ixx
export module main_module:partition2;

export void g() {}

void internal() {} // not exported
Namespace
Overview

The problem: Named entities, such as variables, functions, and compound types, declared
outside any block have global scope, meaning that their names are valid anywhere in the code

Namespaces allow grouping named entities that otherwise would have global
scope into narrower scopes, giving them namespace scope (e.g. std, which stands
for “standard”)

Namespaces provide a method for preventing name conflicts in large projects. Symbols
declared inside a namespace block are placed in a named scope that prevents them
from being mistaken for identically-named symbols in other scopes

Namespace Functions vs. Class + static Methods

Namespace functions:
• Namespace can be extended anywhere (without control)
• Namespace specifier can be avoided with the keyword using

Class + static methods:
• Can interact only with static data members
• struct/class cannot be extended outside their declarations

→ static methods should define operations strictly related to an object state (stateful)
→ otherwise namespace should be preferred (stateless)
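
The distinction can be sketched with two small helpers: stateless functions grouped in a namespace that is reopened later, and a class whose static method depends on a static data member. The names geometry and IdGenerator are hypothetical, and the inline static member assumes C++17.

```cpp
// Stateless helpers: a namespace is enough
namespace geometry {
inline int area(int width, int height) { return width * height; }
} // namespace geometry

// The same namespace can be reopened elsewhere to add more functions
namespace geometry {
inline int perimeter(int width, int height) { return 2 * (width + height); }
} // namespace geometry

// Stateful helper: the static method works on a static data member,
// and the class cannot be extended after its closing brace
class IdGenerator {
public:
    static int next() { return ++last_id_; }

private:
    inline static int last_id_ = 0; // C++17 inline static data member
};
```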
Namespace Example 1
# include <iostream>
namespace ns1 {
void f() {
std::cout << "ns1" << std::endl;
}
} // namespace ns1

namespace ns2 {
void f() {
std::cout << "ns2" << std::endl;
}
} // namespace ns2

int main () {
ns1::f(); // print "ns1"
ns2::f(); // print "ns2"
// f(); // compile error f() is not visible
}
Namespace Example 2

# include <iostream>

namespace ns1 {
void f() { std::cout << "ns1::f()" << std::endl; }
} // namespace ns1

namespace ns1 { // the same namespace can be declared multiple times
void g() { std::cout << "ns1::g()" << std::endl; }
} // namespace ns1

int main () {
ns1::f(); // print "ns1::f()"
ns1::g(); // print "ns1::g()"
}

‘using namespace’ Declaration

# include <iostream>
void f() { std::cout << "global" << std::endl; }

namespace ns1 {
void f() { std::cout << "ns1::f()" << std::endl; }
void g() { std::cout << "ns1::g()" << std::endl; }
} // namespace ns1

int main () {
f(); // ok, print "global"
using namespace ns1; // expand "ns1" in this scope (from this line)
g(); // ok, print "ns1::g()", only one choice
// f(); // compile error ambiguous function name
::f(); // ok, print "global"
ns1::f(); // ok, print "ns1::f()"
}
Nested Namespaces

# include <iostream>
namespace ns1 {
void f() { std::cout << "ns1::f()" << std::endl; }

namespace ns2 {
void f() { std::cout << "ns1::ns2::f()" << std::endl; }
} // namespace ns2

} // namespace ns1

C++17 allows nested namespace definitions with less verbose syntax:

namespace ns1::ns2 {
void h() {}
}
Namespace Alias

Namespace alias allows declaring an alternate name for an existing namespace


namespace very_very_long_namespace {
void g() {}
}

int main() {
namespace ns = very_very_long_namespace; // namespace alias
ns::g(); // available only in this scope
}

Anonymous Namespace

A namespace with no identifier is called unnamed/anonymous namespace

Entities within an anonymous namespace have internal linkage and, therefore, are used
for declaring unique identifiers, visible only in the same source file
Anonymous namespaces vs. static: Anonymous namespaces also allow type declarations
and class definitions, and they are less verbose
main.cpp:
# include <iostream>
namespace { // anonymous
void f() { std::cout << "main"; }
} // namespace internal linkage

int main() {
    f(); // print "main"
}

source.cpp:
# include <iostream>
namespace { // anonymous
void f() { std::cout << "source"; }
} // namespace internal linkage

void g() {
    f(); // print "source"
}
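
The point about type declarations can be sketched as follows: a class placed in an anonymous namespace is local to its translation unit, which the static keyword cannot express for types. Accumulator and add_and_get_total() are hypothetical names used only for this illustration.

```cpp
namespace {                      // everything here has internal linkage

struct Accumulator {             // a type: 'static' cannot be applied to it
    int total = 0;
    void add(int value) { total += value; }
};

Accumulator accumulator;         // variable local to this translation unit

} // namespace

// Externally visible function that relies on the file-local helpers
int add_and_get_total(int value) {
    accumulator.add(value);
    return accumulator.total;
}
```

Another source file could define its own Accumulator with a different layout without causing a conflict at link time.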
inline Namespace

An inline namespace is a mechanism commonly used for library versioning. It
makes a nested namespace look and act as if all its declarations were in the
surrounding namespace
namespace ns1 {

inline namespace V99 { void f(int) {} } // most recent version

namespace V98 { void f(int) {} }

} // namespace ns1
int main() {
    using namespace ns1;

    V98::f(1); // call V98
    V99::f(1); // call V99
    f(1);      // call default version (V99)
}
Attributes for Namespace

C++17 allows defining attributes on namespaces

namespace [[deprecated]] ns1 {

void f() {}

} // namespace ns1

int main() {
    ns1::f(); // compiler warning
}

Compiling Multiple Translation Units
Fundamental Compiler Flags

Include flag: g++ -I include/ main.cpp -o main.x

• -I : Specify the include path for the project headers
• -isystem : Specify the include path for system (external) headers (warnings
are not emitted)

They can be used multiple times

Important: include and library compiler flags, as well as multiple values in an
environment variable, are evaluated in order from left to right. The first match
suppresses the other ones

Compile to an object file: g++ -c source.cpp -o source.o

Compile Methods

Method 1
Compile all files together (naive):
g++ main.cpp source.cpp -o main.out

Method 2
Compile each translation unit to an object file:
g++ -c source.cpp -o source.o
g++ -c main.cpp -o main.o
Multiple object files can be compiled in parallel
Link all object files:
g++ main.o source.o -o main.out

C++ Libraries 1/2

A library is a package of code that is meant to be reused by many programs

A static library is a set of object files (just the concatenation) that are directly linked
into the final executable. If a program is compiled with a static library, all the
functionality of the static library becomes part of the final executable
– A static library cannot be modified without re-linking the final executable
– Increases the size of the final executable
+ The linker can optimize the final executable (link time optimization)

Given the static library my_lib, the corresponding file is:
• Linux: libmy_lib.a
• Windows: my_lib.lib
C++ Libraries 2/2

A dynamic library, also called a shared library, consists of routines that are loaded
into the application at run-time. If a program is compiled with a dynamic library, the
library does not become part of final executable. It remains as a separate unit
+ A dynamic library can be modified without re-linking
– Dynamic library functions are called outside the executable
– Neither the linker nor the compiler can optimize the code between shared libraries
and the final executable
• The environment variables must be set to the right shared library path, otherwise
the application fails at startup

Given the shared library my_lib, the corresponding file is:
• Linux: libmy_lib.so
• Windows: my_lib.dll + my_lib.lib
Deal with Libraries

Specify the library path (the path where to search for static/dynamic libraries) to the
compiler: g++ -L<library path> main.cpp -o main

-L can be used multiple times ( /LIBPATH on Windows)

Specify the library name (e.g. liblibrary.a) to the compiler:
g++ main.cpp -llibrary -o main
(libraries are listed after the files that use them, because the linker resolves symbols from left to right)
On Windows, specify the full path of the library instead

Deal with Libraries

Linux/Unix environment variables:
• LIBRARY_PATH Specify the directories where to search for static libraries .a at
compile-time
• LD_LIBRARY_PATH Specify the directories where to search for dynamic/shared
libraries .so at run-time

Windows environment variables:
• LIBPATH Specify the directories where to search for static libraries .lib at
compile-time
• PATH Specify the directories where to search for dynamic/shared libraries .dll at
run-time
Build Static/Dynamic Libraries

Static Library Creation
• Create object files for each translation unit (.cpp)
• Create the static library by using the archiver (ar) Linux utility
g++ -c source1.cpp -o source1.o
g++ -c source2.cpp -o source2.o
ar rvs libmystaticlib.a source1.o source2.o

Dynamic Library Creation
• Create object files for each translation unit (.cpp). Since a library cannot store code at
fixed addresses, the compiler must generate position independent code
• Create the dynamic library
g++ -c source1.cpp -o source1.o -fPIC
g++ -c source2.cpp -o source2.o -fPIC
g++ source1.o source2.o -shared -o libmydynamiclib.so
Demangling

Name mangling is a technique used to solve various problems caused by the need to
resolve unique names
Transforming C++ ABI (Application binary interface) identifiers into the original
source identifiers is called demangling
Example (linking error):
_ZNSt13basic_filebufIcSt11char_traitsIcEED1Ev

After demangling:
std::basic_filebuf<char, std::char_traits<char> >::~basic_filebuf()

How to demangle: c++filt


Online Demangler: https://demangler.com
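
Demangling can also be done from within a program. The sketch below assumes a GCC or Clang toolchain that follows the Itanium C++ ABI and provides abi::__cxa_demangle in <cxxabi.h>; it is not available with MSVC. The wrapper name demangle() is hypothetical.

```cpp
#include <cstdlib>   // std::free
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Hypothetical helper: returns the demangled name, or the input unchanged
// if the demangler reports an error
std::string demangle(const char* mangled_name) {
    int   status = 0;
    char* result = abi::__cxa_demangle(mangled_name, nullptr, nullptr, &status);
    if (status != 0 || result == nullptr)
        return mangled_name;
    std::string demangled(result);
    std::free(result);           // the returned buffer is allocated with malloc
    return demangled;
}
```

Calling demangle() on the symbol from the linking error above yields the readable destructor name; the exact spacing between closing angle brackets may differ between toolchains.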
Find Dynamic Library Dependencies

The ldd utility shows the shared objects (shared libraries) required by a program or
other shared objects
$ ldd /bin/ls
linux-vdso.so.1 (0x00007ffcc3563000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f87e5459000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f87e5254000)
libc.so.6 => /lib64/libc.so.6 (0x00007f87e4e92000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f87e4c22000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f87e4a1e000)
/lib64/ld-linux-x86-64.so.2 (0x00005574bf12e000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007f87e4817000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f87e45fa000)

Find Object/Executable Symbols 1/3

The nm utility provides information on the symbols being used in an object file or
executable file
$ nm -D -C something.so
w __gmon_start__
D __libc_start_main
D free
D malloc
D printf

# -C: Decode low-level symbol names
# -D: Display the dynamic symbols (e.g. of a shared library)

Find Object/Executable Symbols 2/3

readelf displays information about ELF format object files

$ readelf --symbols something.so | c++filt


... OBJECT LOCAL DEFAULT 17 __frame_dummy_init_array_
... FILE LOCAL DEFAULT ABS prog.cpp
... OBJECT LOCAL DEFAULT 14 CC1
... OBJECT LOCAL DEFAULT 14 CC2
... FUNC LOCAL DEFAULT 12 g()

# --symbols: display symbol table

Find Object/Executable Symbols 3/3

objdump displays information about object files


$ objdump -t -C something.so | c++filt
... df *ABS* ... prog.cpp
... O .rodata ... CC1
... O .rodata ... CC2
... F .text ... g()
... O .rodata ... (anonymous namespace)::CC3
... O .rodata ... (anonymous namespace)::CC4
... F .text ... (anonymous namespace)::h()
... F .text ... (anonymous namespace)::B::j1()
... F .text ... (anonymous namespace)::B::j2()

# -t: display symbols
# -C: Decode low-level symbol names

References and Additional Material

• 20 ABI (Application Binary Interface) breaking changes every C++ developer should know
• Policies/Binary Compatibility Issues With C++
• 10 differences between static and dynamic libraries every C++ developer should know