CPP For C Sharp Developers
Copyright © 2021 Jackson Dunstan
All rights reserved
Table of Contents
1. Introduction
2. Primitive Types and Literals
3. Variables and Initialization
4. Functions
5. Build Model
6. Control Flow
7. Pointers, Arrays, and Strings
8. References
9. Enumerations
10. Struct Basics
11. Struct Functions
12. Constructors and Destructors
13. Initialization
14. Inheritance
15. Struct and Class Permissions
16. Struct and Class Wrapup
17. Namespaces
18. Exceptions
19. Dynamic Allocation
20. Implicit Type Conversion
21. Casting and RTTI
22. Lambdas
23. Compile-Time Programming
24. Preprocessor
25. Intro to Templates
26. Template Parameters
27. Template Deduction and Specialization
28. Variadic Templates
29. Template Constraints
30. Type Aliases
31. Deconstructing and Attributes
32. Thread-Local Storage and Volatile
33. Alignment, Assembly, and Language Linkage
34. Fold Expressions and Elaborated Type Specifiers
35. Modules, The New Build Model
36. Coroutines
37. Missing Language Features
38. C Standard Library
39. Language Support Library
40. Utilities Library
41. System Integration Library
42. Numbers Library
43. Threading Library
44. Strings Library
45. Array Containers Library
46. Other Containers Library
47. Containers Library Wrapup
48. Algorithms Library
49. Ranges and Parallel Algorithms
50. I/O Library
51. Missing Library Features
52. Idioms and Best Practices
53. Conclusion
1. Introduction
History
C++’s predecessor is C, which debuted in 1972. It is still the most
used language with C++ in fourth place and C# in fifth.
C++ got started with the name “C with Classes” in 1979. The name
C++ came later in 1982. The original C++ compiler, Cfront, output C
source files which were then compiled to machine code. That
compiler has long since been replaced and modern compilers all
compile C++ directly to machine code.
Major additions to the language were made with “C++ 2.0” in 1989
and the language was then standardized by ISO in 1998.
Colloquially, this was called C++98 and began the convention where
the year is added to name a version of the language. It also
formalized the process of designing and standardizing the language
via a committee and various working groups.
The following table shows the major sections of the Standard Library
and their loose equivalents in .NET:
Standard Library Section    C++                    C#
Language support            numeric_limits::max    int.MaxValue
Regular expressions         regex                  Regex
Atomic operations           atomic++               Interlocked.Increment
Compiler                   Cost             Open Source    Platforms
Microsoft Visual Studio    Free and Paid    No             Windows
Clang                      Free             Yes            Windows, macOS, Linux
Intel C++                  Free             No             Windows, macOS, Linux
There are also many IDEs available with the usual combination of
features: a text editor, compiler execution, interactive debugger, etc.
Here are some popular options:
IDE                Cost    Open Source    Platforms
JetBrains CLion    Paid    No             Windows, macOS, Linux
Many guideline documents exist for C++. The C++ Core Guidelines,
Google C++ style guide, and engine-specific standards are all
commonly used. The C++ Core Guidelines, in particular, has a
companion Guidelines Support Library (GSL) to enforce and
facilitate the guidelines.
Community
There are many places where the community of developers
congregate. Here are a few:
C# Type    C++ Type    Bits (Windows)    Bits (Unix)
short      short       16                16
int        int         32                32
int        signed      32                32
N/A        long        32                64
The types named with char are due to their original usage for
characters in ASCII strings. There are also larger character types:
C# Type    C++ Type    Bits (Windows)    Bits (Unix)
N/A        char8_t     8                 8
N/A        char16_t    16                16
N/A        char32_t    32                32
N/A        wchar_t     16                32

float      float       32                32
double     double      64                64
Given the uncertainty of size across CPU and OS, it’s a best practice
to avoid many of these types and instead use types that have
specific sizes. These are found in the Standard Library or in game
engine APIs. Here’s how much simpler that makes everything:
Part Meaning
0xFFFFFFFFll    Hexadecimal (explicit)    Signed (default)    long long (explicit)
Next up are floating point literals, which are also written in four parts:
Part Meaning
Form Meaning
Characters can be anything in their set (e.g. UTF-8) except ', \, and the newline character.
To get those, and other special characters, use an escape sequence:
Meaning            Escape Sequence
Question mark      \?
Backslash          \\
Bell               \a
Backspace          \b
Form feed          \f
Line feed          \n
Carriage return    \r
Tab                \t
Vertical tab       \v
int x;
Just like in C#, we state the type of the variable, the variable’s name,
and end with a semicolon. We can also declare multiple variables in
one statement:
int x, y, z;
Also like C#, these variables do not yet have a value. Consider trying
to read the value of such a variable:
int x;
int y = x;
In C#, this would result in a compiler error on the second line. The
compiler knows that x doesn’t have a value, so it can’t be read and
assigned to y. In C++, this is known as “undefined behavior.” When
the compiler encounters undefined behavior, it is free to generate
arbitrary code for the entire executable. It may or may not produce a
warning or error to warn about this, meaning it may silently produce
an executable that doesn’t do what the author thinks it should do. It
is very important to never invoke undefined behavior and tools have
been written to help avoid it.
int localPlayerHealth;
foreach (Player p in players)
{
if (p.IsLocal)
{
localPlayerHealth = p.Health;
break;
}
}
Debug.Log(localPlayerHealth);
We know that one player has to be the local player because that’s
how our game was designed, so it’s safe to not initialize
localPlayerHealth before the loop. Initializing it to 0 would be
wasteful in this case, but the C# compiler doesn’t know about our
game design so it can’t prove that we’ll always find the local player
and it forces us to initialize.
In C++, we’re free to skip this initialization and assume the risk of
undefined behavior if it turns out there really wasn’t a local player in
the players array. Alternatively, we can replicate the C# approach
and just initialize the variable to be safe.
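A minimal C++ sketch of the "initialize to be safe" option might look like the following. The Player struct and the GetLocalPlayerHealth function are hypothetical stand-ins, not types from any engine, and the range-based loop and reference syntax it uses are covered later in the book:

```cpp
#include <vector>

// Hypothetical stand-in for the game's player type
struct Player
{
    bool IsLocal;
    int Health;
};

int GetLocalPlayerHealth(const std::vector<Player>& players)
{
    // Initialized up front, so reading it is never undefined behavior,
    // even if it turns out there is no local player
    int localPlayerHealth = 0;
    for (const Player& p : players)
    {
        if (p.IsLocal)
        {
            localPlayerHealth = p.Health;
            break;
        }
    }
    return localPlayerHealth;
}
```

The cost is one extra write of 0, which is usually a small price for ruling out undefined behavior.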
Initialization
C++ provides a lot of ways to initialize variables. We’ve already seen
one above where a value is copied:
int x = y;
Variable    Value
a           (Unknown)
b           123
c           0
d           456
e           789
Type Deduction
In C#, we can use var to avoid needing to specify the type of our
variables. Similarly, C++ has the auto keyword:
auto x = 123;
auto x{123};
auto x(123);
Also similar to C#, we can only use auto when there is an initializer.
The following isn’t allowed:
auto x;
auto x{};
int x;
decltype(x) y = 123; // y is an int
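As a rough check of what the compiler deduces, the sketch below uses static_assert with std::is_same_v from the Standard Library's type_traits header. The function name DeduceTypes is made up for illustration, and the auto b{123} case assumes a C++17 or later compiler:

```cpp
#include <type_traits>

int DeduceTypes()
{
    auto a = 123;   // deduced as int
    auto b{123};    // deduced as int (C++17 rules)
    auto c(123);    // deduced as int
    auto d = 1.5f;  // deduced as float

    decltype(a) e = 456; // e has the same type as a: int

    // Each of these fails to compile if the deduced type is different
    static_assert(std::is_same_v<decltype(a), int>);
    static_assert(std::is_same_v<decltype(b), int>);
    static_assert(std::is_same_v<decltype(c), int>);
    static_assert(std::is_same_v<decltype(d), float>);
    static_assert(std::is_same_v<decltype(e), int>);

    return a + b + c + e; // 123 + 123 + 123 + 456
}
```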
int* x;
int * x;
int *x;
int* x, y, z;
We’ll cover how to actually use pointers more in depth later in the
book.
References
C++ has two kinds of references: “lvalue” and “rvalue.” Just like with
pointers, these are an annotation on another type:
// lvalue references
int& x;
int & x;
int &x;
// rvalue references
int&& x;
int && x;
int &&x;
When declaring more than one variable per statement, the same rule
applies here: & or && only attaches to one variable:
Taken all together, this means we can declare several variables per
statement and each can have their own modifier on the stated type:
Variable    Type
a           int
b           int*
c           int&
d           int&&
We’ll dive into the details of how lvalue references and rvalue
references work later in the book. For now, it’s important to know that
they are like non-nullable pointers. This means we must initialize
them when they are declared. All of the above lines will fail to
compile since we didn’t. So let’s correct that:
int x = 123;
int& y = x;
int&& z = 456;
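Putting the per-variable modifiers and the initialization requirement together, a single statement can declare all four kinds of variable as long as the references are initialized. This is only a sketch: the function name and the values are arbitrary, and the static_assert checks assume C++17:

```cpp
#include <type_traits>

int MixedDeclarationDemo()
{
    int value = 123;

    // One statement, four variables, each with its own modifier
    int a = 1, *b = &value, &c = value, &&d = 456;

    static_assert(std::is_same_v<decltype(a), int>);
    static_assert(std::is_same_v<decltype(b), int*>);
    static_assert(std::is_same_v<decltype(c), int&>);
    static_assert(std::is_same_v<decltype(d), int&&>);

    c = 10; // writes through the reference, so value is now 10

    return a + *b + d; // 1 + 10 + 456
}
```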
Now let’s write the second part: the function’s definition. This also
contains the function’s signature but includes the body too:
One last quirk: it’s possible to declare more than one function in a
statement just like int a, b; declares two variables. This is very
rarely seen and should generally be avoided in favor of the single-
declaration form.
// Two functions:
// int Add(int a, int b)
// int Sub(int a, int b)
int Add(int a, int b), Sub(int a, int b);
After all, we’re just telling the compiler the signature of the function
and the parameter names are irrelevant to that. Perhaps more
strangely, we can omit the parameter names from function
definitions!
void OnPlayerSpawned(Vector3)
{
NumSpawns++;
}
This function doesn’t care where the player spawned because all it’s
doing is keeping track of a statistic. So we can omit the parameter
name for a couple reasons. First, it tells the reader that this
parameter isn’t important in the function so it isn’t even given a name
that needs to be memorized. Second, it tells the compiler not to
complain about an unused variable. After all, we can’t use a variable
without a name in the first place. Sometimes we see a middle ground
in C++ code where the name is stated inside a comment to gain only
the second benefit but not the first:
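A sketch of both forms side by side might look like this. Vector3 here is a hypothetical stand-in for an engine vector type, and OnPlayerDespawned is an invented companion function used only to show the commented-name style:

```cpp
// Hypothetical stand-in for an engine vector type
struct Vector3
{
    float x, y, z;
};

int NumSpawns = 0;

// Name omitted entirely: nothing to warn about, nothing to memorize
void OnPlayerSpawned(Vector3)
{
    NumSpawns++;
}

// Name kept inside a comment: the compiler still sees an unnamed
// parameter, but a reader can see what the argument represents
void OnPlayerDespawned(Vector3 /*position*/)
{
    NumSpawns--;
}
```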
uint64_t GetCurrentTime(void);
Just like with variables, the compiler figures out what the return type
should be. In this case, it’s just int since that’s what we get when
adding two int values together.
We can also specify the return type after the parameter list if we put
auto before the function name:
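Both forms can be sketched with two small functions. Add and Sub reuse names from earlier in the chapter purely for illustration, and the static_assert lines (which assume C++17) confirm the resulting return types:

```cpp
#include <type_traits>

// Return type deduced from the return statement: int + int is int
auto Add(int a, int b)
{
    return a + b;
}

// Trailing return type: `auto` before the name, the real type after `->`
auto Sub(int a, int b) -> int
{
    return a - b;
}

static_assert(std::is_same_v<decltype(Add(1, 2)), int>);
static_assert(std::is_same_v<decltype(Sub(1, 2)), int>);
```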
The function should then call va_start, va_arg, and va_end in order
to get the arguments. This is quite type-unsafe and a very clunky
interface, which is part of why the feature should generally not be
used. There are several alternatives that are preferred instead, but
many are more advanced features that will be discussed later on in
the book. For now, let’s discuss a simple one: overloading.
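Before moving on, here is a rough sketch of the va_start, va_arg, and va_end sequence, mostly to show why it is considered clunky. SumInts is a hypothetical function, and it simply trusts the caller to pass exactly count arguments of type int:

```cpp
#include <cstdarg>

// Hypothetical variadic function: the caller must pass `count` ints
int SumInts(int count, ...)
{
    va_list args;
    va_start(args, count);        // start reading after the last named parameter
    int sum = 0;
    for (int i = 0; i < count; ++i)
    {
        sum += va_arg(args, int); // the type is trusted, not checked
    }
    va_end(args);                 // always paired with va_start
    return sum;
}
```

Nothing stops a caller from writing SumInts(3, 1, 2), which reads an argument that was never passed and is undefined behavior.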
Overloading
As in C#, functions may be overloaded in the sense that more than
one function may have the same name. When the function is called,
the compiler figures out which of these identically-named functions
should actually be called.
score = GetPlayerScore(myPlayerId);
score = GetPlayerScore();
score = GetPlayerScore(myPosition);
In this case, the compiler will generate calls to the three functions we
declared in the same order.
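As an illustration of how the compiler picks between overloads, the sketch below declares three hypothetical versions of GetPlayerScore. The parameter types, the Vector3 struct, and the return values are all assumptions chosen so that each overload is easy to tell apart:

```cpp
// Hypothetical stand-in for an engine vector type
struct Vector3
{
    float x, y, z;
};

// Chosen when the argument is an int
int GetPlayerScore(int /*playerId*/)
{
    return 1;
}

// Chosen when there are no arguments
int GetPlayerScore()
{
    return 2;
}

// Chosen when the argument is a Vector3
int GetPlayerScore(Vector3 /*position*/)
{
    return 3;
}
```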
Ref, Out, and In parameters
In C#, parameters can be declared with the ref, in, and out
keywords. Each of these changes the parameter to be a pointer to the
passed value. In C++, these keywords don’t exist. Instead, we use
some conventions:
// Alternative to `ref`
// Use an lvalue reference, which is like a non-nullable pointer
void MovePlayer(Player& player, Vector3 offset)
{
    player.position += offset;
}
// Alternative to `in`
// Use a constant lvalue reference
// `const` means it can't be changed
void PrintPlayerName(const Player& player)
{
    DebugLog(player.name);
}
// Alternative to `out`
// Just use return values
ReallyBigMatrix ComputeMatrix()
{
    ReallyBigMatrix matrix;
    // ...math goes here...
    return matrix;
}
// Another alternative to `out`
// Use lvalue reference parameters
void ComputeMatrix(ReallyBigMatrix& mat1, ReallyBigMatrix& mat2)
{
    mat1 = /* math for mat1 */;
    mat2 = /* math for mat2 */;
}
// Another alternative to `out`
// Pack the outputs into a return value
tuple<ReallyBigMatrix, ReallyBigMatrix> ComputeMatrix()
{
    return make_tuple(/* math for mat1 */, /* math for mat2 */);
}
int GetNextId()
{
static int id = 0;
id++;
return id;
}
GetNextId(); // 1
GetNextId(); // 2
GetNextId(); // 3
DebugLog(GetSquareOfSumUpTo(5000));
// equivalent to...
DebugLog(1020530960);
This means that normal C++ can be reused for both compile time
and runtime work. There’s usually no need to run scripts in another
language in order to generate C++ files. The types and functionality
the program is already made up of are usable at compile time with
this mechanism.
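To make the idea concrete, here is a small sketch using the constexpr keyword. SumUpTo and SumAtRuntime are hypothetical helpers, not the function from the example above, and the loop inside a constexpr function assumes C++14 or later:

```cpp
// Hypothetical helper: sums 1..n, usable at compile time or at runtime
constexpr int SumUpTo(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; ++i)
    {
        sum += i;
    }
    return sum;
}

// Evaluated by the compiler: a wrong value here is a compile error
static_assert(SumUpTo(100) == 5050);

// The same function also works with values only known at runtime
int SumAtRuntime(int n)
{
    return SumUpTo(n);
}
```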
With C++, we compile all our translation units (source code files with
.cpp, .cxx, .cc, .C, or .c++) into object files (.obj or .o) and then link
them together into an executable (app.exe or app), static library (.lib
or .a), or dynamic library (.dll or .so).
If any of the source code files changed, we recompile them to
generate new object files and then run the linker with all the
unchanged object files too.
Critically for performance, all calls into functions in the static library
are just normal function calls. This means there’s no indirection
through a pointer that is set at runtime when a dynamic library is
loaded. It also means that the linker can perform “link time
optimizations” such as inlining these functions.
We won’t discuss the specifics of how to run the compiler and linker
in this book. This is heavily dependent on the specific compiler, OS,
and game engine being used. Usually game engines or console
vendors will provide documentation for this. Also typical is to use an
IDE like Microsoft Visual Studio or Xcode that provides a “project”
abstraction for managing source code files, compiler settings, and so
forth.
Header Files and the Preprocessor
In C#, we add using directives to reference code in other files. C++
has a similar “module” system added in C++20 which we’ll cover in a
future chapter in this book. For now, we’ll pretend like that doesn’t
exist and only discuss the way that C++ has traditionally been built.
Header files (.h, .hpp, .hxx, .hh, .H, .h++, or no extension) are by far
the most common way for code in one file to reference code in
another file. These are simply C++ source code files that are
intended to be copy-and-pasted into another C++ source code file.
The copy-and-paste operation is performed by the preprocessor.
Just like in C#, preprocessor directives like #if are evaluated before
the main phase of compilation. There is no separate preprocessor
executable that must be called to produce an intermediate file that
the compiler receives. Preprocessing is simply an earlier step for the
compiler.
// math.h
int Add(int a, int b);
// math.cpp
#include "math.h"
int Add(int a, int b)
{
return a + b;
}
Recall from the previous chapter that the first Add is a function
declaration and the second is a function definition. Since the
signatures match, the compiler knows we’re defining the earlier
declaration.
So far we’ve split the declaration and definition across two files, but
without much benefit. Now let’s make this pay off by adding another
translation unit:
// user.cpp
#include "math.h"
int AddThree(int a, int b, int c)
{
return Add(a, Add(b, c));
}
This shows how user.cpp can add the same #include "math.h" to
access the declaration of Add, resulting in this:
#include <math.h>
This version is meant to search just for the C++ Standard Library
and other header files that the compiler provides. For example,
Microsoft Visual Studio allows #include <windows.h> to make
Windows OS calls. This is useful to disambiguate file names that are
both in the application’s codebase and provided by the compiler.
Imagine this program:
#include "math.h"
bool IsNearlyZero(float val)
{
return fabsf(val) < 0.000001f;
}
#include <math.h>
bool IsNearlyZero(float val)
{
return fabsf(val) < 0.000001f;
}
Also note that we can specify paths in the #include that correspond
to a directory structure:
#include "utils/math.h"
#include <nlohmann/json.hpp>
Finally, while it’s esoteric and usually best avoided, there is nothing
stopping us from using #include to pull in non-header files. We can
#include any file as long as the result is legal C++. Sometimes
#include is even placed in the middle of a function to fill in part of its
body!
ODR and Include Guards
C++ has what it calls the “one definition rule,” commonly abbreviated
to ODR. This says that there may be only one definition of something
in a translation unit. This includes variables and functions, which
presents us some problems as our codebase grows. Imagine we’ve
expanded our math library and added a vector math library on top of
it:
// math.h
int Add(int a, int b);
float PI = 3.14f;
// vector.h
#include "math.h"
float Dot(float aX, float aY, float bX, float bY);
// user.cpp
#include "math.h"
#include "vector.h"
int AddThree(int a, int b, int c)
{
return Add(a, Add(b, c));
}
bool IsOrthogonal(float aX, float aY, float bX, float bY)
{
return Dot(aX, aY, bX, bY) == 0.0f;
}
This makes use of the #if, #define, and #endif directives, which are
similar to their C# counterparts. The only real difference in this case
is the use of !defined MATH_H in C++ instead of just !MATH_H in C#.
#ifndef MATH_H
#define MATH_H
int Add(int a, int b);
float PI = 3.14f;
#endif
math_h
MATH_H
MATH_H_
MYGAME_MATH_H
#pragma once
int Add(int a, int b);
float PI = 3.14f;
Regardless of the form chosen, let’s look at how this helps avoid the
ODR violation. Here’s how user.cpp looks after all the #include
directives are resolved: (indentation added for clarity)
#ifndef MATH_H
    #define MATH_H
    int Add(int a, int b);
    float PI = 3.14f;
#endif
#ifndef VECTOR_H
    #define VECTOR_H
    #ifndef MATH_H
        #define MATH_H
        int Add(int a, int b);
        float PI = 3.14f;
    #endif
    float Dot(float aX, float aY, float bX, float bY);
#endif
int AddThree(int a, int b, int c)
{
return Add(a, Add(b, c));
}
bool IsOrthogonal(float aX, float aY, float bX, float bY)
{
return Dot(aX, aY, bX, bY) == 0.0f;
}
On the first line (#ifndef MATH_H), the preprocessor finds that MATH_H
isn’t defined so it keeps all the code until the #endif. That includes a
#define MATH_H, so now it’s defined.
#include "vector.h"
float Dot(float aX, float aY, float bX, float bY)
{
    return Add(aX*bX, aY*bY);
}
#ifndef VECTOR_H
    #define VECTOR_H
    #ifndef MATH_H
        #define MATH_H
        int Add(int a, int b);
        float PI = 3.14f;
    #endif
    float Dot(float aX, float aY, float bX, float bY);
#endif
float Dot(float aX, float aY, float bX, float bY)
{
    return Add(aX*bX, aY*bY);
}
That’s great for the purposes of compiling this translation unit since
there are no duplicate definitions to violate the ODR. Compilation will
succeed, but linking will fail.
The reason for the linker error is that, by default, we can’t have
duplicate definitions of PI at link time either. If we want to do that, we
need to add the inline keyword to PI to tell the compiler that
multiple definitions should be allowed. That’ll result in these
translation units:
// user.cpp
int Add(int a, int b);
inline float PI = 3.14f;
float Dot(float aX, float aY, float bX, float bY);
int AddThree(int a, int b, int c)
{
return Add(a, Add(b, c));
}
bool IsOrthogonal(float aX, float aY, float bX, float bY)
{
return Dot(aX, aY, bX, bY) == 0.0f;
}
// vector.cpp
int Add(int a, int b);
inline float PI = 3.14f;
float Dot(float aX, float aY, float bX, float bY);
float Dot(float aX, float aY, float bX, float bY)
{
    return Add(aX*bX, aY*bY);
}
This is often avoided though because any change to the function will
require recompiling all of the translation units that include it, directly
or indirectly, which may take quite a while in a big codebase.
Linkage
Finally for this chapter, C++ has the concept of “linkage.” By default,
variables like PI have external linkage. This means it can be
referenced by other translation units. For example, say we added a
variable to math.cpp:
Later on, the linker runs and reads in user.obj as well as all the
other object files including math.obj. While processing user.obj, it
reads that note from the compiler saying that the definition of SQRT2
is missing and it goes looking through the other object files to find it.
Lo and behold, it finds a note in math.obj saying that there’s a float
named SQRT2 so the linker makes GetDiagonalOfSquare refer to that
variable.
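The declaration-versus-definition split described here can be sketched in one block. In a real project the two halves would live in user.cpp and math.cpp. They are shown together only so the example is self-contained, and the body of GetDiagonalOfSquare and the value given to SQRT2 are assumptions:

```cpp
// --- What user.cpp would contain ---
// Declaration only: a promise that a definition exists in some object file
extern float SQRT2;

float GetDiagonalOfSquare(float sideLength)
{
    return SQRT2 * sideLength;
}

// --- What math.cpp would contain ---
// The definition, which has external linkage by default
float SQRT2 = 1.4142135f;
```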
Quick note: the extern keyword can also be applied in math.cpp, but
this has no effect since external linkage is the default. Still, here’s
how it’d look:
Now if we try to link user.obj and math.obj, the linker can’t find any
available definition of SQRT2 in any of the object files so it produces
an error.
Both extern and static can be used with functions, too. For
example:
// math.cpp
int Sub(int a, int b)
{
return a - b;
}
static int Mul(int a, int b)
{
return a * b;
}
// user.cpp
extern int Sub(int a, int b);
int SubThree(int a, int b, int c)
{
return Sub(Sub(a, b), c);
}
extern int Mul(int a, int b); // linker error if called: Mul is `static` in math.cpp
Conclusion
In this chapter we’ve seen C++’s very different approach to building
source code. The “compile then link” approach combined with
header files has domino effects into the ODR, linkage, and include
guards. We’ll go into C++20’s module system that solves a lot of
these problems and results in a much more C#-like build model later
on in the book, but header files will still be very relevant even with
modules. There’s also a lot more detail to go into with respect to the
ODR and linkage, but we’ll cover that incrementally as we introduce
more language concepts like templates and thread-local variables.
6. Control Flow
If and Else
Let’s start with the lowly if statement, which is just like in C#:
if (someBool)
{
// ... execute this if someBool is true
}
if (someBool)
{
// ... execute this if someBool is true
}
else
{
// ... execute this if someBool is false
}
Goto and Labels
The goto statement is also similar to its C# counterpart. We create a
label and then name it in our goto statement:
void DoLotsOfThingsThatMightFail()
{
    if (!DoThingA())
    {
        goto handleFailure;
    }
    if (!DoThingB())
    {
        goto handleFailure;
    }
    if (!DoThingC())
    {
        goto handleFailure;
    }
    // Everything succeeded, so don't fall into the failure handler
    return;
handleFailure:
    DebugLog("Critical operation failed. Aborting program.");
    exit(1);
}
Like in C#, the label to goto must be in the same function. Unlike in
C#, the label can’t be inside of a try or catch block.
One subtle difference is that a C++ goto can be used to skip past the
declaration of variables, but not the initialization of them. For
example:
void Bad()
{
goto myLabel;
int x = 1; // Un-skippable initialization
myLabel:
DebugLog(x);
}
void Ok()
{
goto myLabel;
int x; // No initialization. Can be skipped.
myLabel:
DebugLog(x); // Using uninitialized variable
}
switch (someVal)
{
case 1:
DebugLog("someVal is one");
break;
case 2:
DebugLog("someVal is two");
break;
case 3:
DebugLog("someVal is three");
break;
default:
DebugLog("Unhandled value");
break;
}
One difference is that a case that’s not empty can omit the break and
“fall through” to the next case. This is sometimes considered error-
prone, but can also reduce duplication. These two are equivalent:
// C#
switch (someVal)
{
case 3:
DoAtLeast3();
DoAtLeast2();
DoAtLeast1();
break;
case 2:
DoAtLeast2();
DoAtLeast1();
break;
case 1:
DoAtLeast1();
break;
}
// C++
switch (someVal)
{
case 3:
DoAtLeast3();
case 2:
DoAtLeast2();
case 1:
DoAtLeast1();
}
if (player == localPlayer)
{
// .. handle the local player
}
else if (player == adminPlayer)
{
// .. handle the admin player
}
Also not supported is goto case X;. Instead, we need to create our
own label and goto it:
switch (someVal)
{
case DO_B:
doB:
DoB();
break;
case DO_A_AND_B:
DoA();
goto doB;
}
Ternary
The ternary operator in C++ is also similar to the C# version:
int damage;
if (hasQuadDamage)
damage = weapon.Damage * 4;
else
damage = weapon.Damage;
The C++ version is much looser with what we can put into the ? and
: parts. For example, we can throw an exception:
In this case, the type of the expression is whatever type the non-
throw part has: the return value of Unpause. We could even throw in
both parts:
There are many more rules to determine the type of the ternary
expression, but normally we just use the same type in both the ? and
the : parts like we did with the damage example. In this most typical
case, the type of the ternary expression is the same as either part.
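A short sketch of both the typical case and the throwing case follows. GetDamage, Unpause, and UnpauseIfPaused are hypothetical functions, and std::logic_error is just one convenient exception type from the Standard Library:

```cpp
#include <stdexcept>

// Typical case: both parts are int, so the expression is an int
int GetDamage(int weaponDamage, bool hasQuadDamage)
{
    return hasQuadDamage ? weaponDamage * 4 : weaponDamage;
}

// Hypothetical: pretend this returns how many frames the game was paused
int Unpause()
{
    return 42;
}

int UnpauseIfPaused(bool isPaused)
{
    // The throw part contributes no type, so the whole expression
    // takes the type of the other part: int
    return isPaused ? Unpause() : throw std::logic_error("Game is not paused");
}
```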
While, Do-While, Break, and Continue
while and do-while loops are essentially exactly the same as in C#:
while (NotAtTarget())
{
MoveTowardTarget();
}
do
{
    MoveTowardTarget();
} while (NotAtTarget());
int index = 0;
int winnerIndex = -1;
while (index < numPlayers)
{
    // Dead players can't be the winner
    // Skip the rest of the loop body by using `continue`
    if (GetPlayer(index).Health <= 0)
    {
        index++; // Advance first, otherwise the loop never ends
        continue;
    }
    // Found the winner if they have at least 100 points
    // No need to keep searching, so use `break` to end the loop
    if (GetPlayer(index).Points >= 100)
    {
        winnerIndex = index;
        break;
    }
    index++;
}
if (winnerIndex < 0)
{
    DebugLog("no winner yet");
}
else
{
    DebugLog("Player", winnerIndex, "won");
}
For
The regular three-part for loop is also basically the same as in C#:
C++ has a variant of for that takes the place of foreach in C#. It’s
called the “range-based for loop” and it’s denoted by a colon:
int totalScore = 0;
for (int score : scores)
{
totalScore += score;
}
int totalScore = 0;
for (int index = 0; int score : scores)
{
DebugLog("Score at index", index, "is", score);
totalScore += score;
index++;
}
int totalScore = 0;
{
int index = 0;
auto&& range = scores;
auto cur = begin(range); // or range.begin()
auto theEnd = end(range); // or range.end()
for ( ; cur != theEnd; ++cur)
{
int score = *cur;
DebugLog("Score at index", index, "is", score);
totalScore += score;
index++;
}
}
We’ll cover pointers and references soon, but for now auto&& range
= scores is essentially making a synonym for scores called range
and *cur is taking the value pointed at by the cur pointer.
There must be begin and end functions that take whatever type
scores is, otherwise scores must have methods called begin and end
that take no parameters. If the compiler can’t find either set of begin
and end functions, there will be a compiler error. Regardless of
where they are, these functions also need to return a type that can
be compared for inequality (cur != end), pre-incremented (++cur),
and dereferenced (*cur) or there will be a compiler error.
As we’ll see throughout the book, there are many types that fit these
criteria and many user-created types are designed to fit them too.
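As a sketch of what such a user-created type might look like, the hypothetical IntRange below provides member functions named begin and end, and its Iterator supports exactly the three required operations. It relies on structs and operator overloading, which are covered later in the book:

```cpp
// Hypothetical range of ints from first up to, but not including, last
struct IntRange
{
    int first;
    int last;

    // Minimal iterator: supports !=, pre-increment, and dereference
    struct Iterator
    {
        int value;
        bool operator!=(const Iterator& other) const { return value != other.value; }
        Iterator& operator++() { ++value; return *this; }
        int operator*() const { return value; }
    };

    // Member functions named begin and end that take no parameters
    Iterator begin() const { return Iterator{first}; }
    Iterator end() const { return Iterator{last}; }
};

int SumRange(const IntRange& range)
{
    int total = 0;
    for (int value : range) // works because IntRange fits the criteria
    {
        total += value;
    }
    return total;
}
```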
C#-Exclusive Operators
Some of C#’s control flow operators don’t exist in C++ at all. First,
there’s no ?? or ??= operator. The ternary operator or if is usually
used in its place:
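For pointers, the replacements might be sketched as follows. Player, ChoosePlayer, and AssignIfNull are hypothetical names, and the C# equivalents appear in comments for comparison:

```cpp
// Hypothetical stand-in for the game's player type
struct Player
{
    int Health;
};

Player* ChoosePlayer(Player* preferred, Player* fallback)
{
    // C#: return preferred ?? fallback;
    return preferred != nullptr ? preferred : fallback;
}

void AssignIfNull(Player*& target, Player* fallback)
{
    // C#: target ??= fallback;
    if (target == nullptr)
    {
        target = fallback;
    }
}
```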
We’ll go further into constructors later in the book. For now, there’s
an important guarantee in the C++ language about returned objects
like CircleStats: copy elision. This means that if the values in the
curly braces are “pure,” like these simple constants and primitives,
then the CircleStats object will be initialized at the call site. This
means CircleStats won’t be allocated on the stack within
GetCircleInfo and then copied to the call site when GetCircleInfo
returns. This helps us avoid expensive copies when copying the
return value involves copying a large amount of data such as a big
array.
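A function in the spirit of GetCircleInfo might be sketched like this. The fields of CircleStats and the math are assumptions made for illustration, since only the two names are given above:

```cpp
// Hypothetical set of values to return together
struct CircleStats
{
    float Radius;
    float Diameter;
    float Circumference;
    float Area;
};

CircleStats GetCircleInfo(float radius)
{
    const float pi = 3.14159265f;
    // Returning the braced values directly: per the guarantee described
    // above, the result is built at the call site rather than copied there
    return { radius, 2.0f * radius, 2.0f * pi * radius, pi * radius * radius };
}
```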
Conclusion
A lot of the control flow mechanisms are shared between C++ and
C#. We still have if, else, ?:, switch, goto, while, do, for,
foreach/“range-based for”, break, continue, and return.
C# additionally has ??, ??=, ?., and ?[], but C++ additionally has “init
expressions” on if, switch, and range-based for loops, return value
copy elision, and more flexibility with ?:, goto, and switch.
int x = 123;
// Declare a pointer type: int* is a "pointer to an int"
// Get the address of x with &x
int* p = &x;
// Dereference the pointer to get its value
DebugLog(*p); // 123
// Dereference and assign the pointer to set its value
*p = 456;
DebugLog(x); // 456
// x->y is a convenient shorthand for (*x).y
Player* p = &localPlayer;
p->Health = 100;
int x = 123;
int* p = &x;
int** pp = &p;
DebugLog(**pp); // 123
**pp = 456;
DebugLog(x); // 456
int y = 1000;
*pp = &y;
**pp = 2000;
DebugLog(x); // 456
DebugLog(y); // 2000
As in C#, pointers may be null. There are three main ways this is
written in C++:
// nullptr is compatible with all pointer types, but not integer arithmetic
// This is generally the preferred way since C++11
int* p1 = nullptr;
// NULL is commonly defined to be zero, but works with integer arithmetic
int* p2 = NULL;
// The zero integer
int* p3 = 0;
Arrays
It may seem strange to see arrays lumped into the same chapter as
pointers, but they’re very similar in C++. Unlike in C#, arrays are not
an object that’s “managed” and subject to garbage collection. They
are instead simply a fixed-size contiguous allocation of the same
type of data:
int a0;
int a1;
int a2;
This means that there is no overhead for an array. It is literally just its
elements. It doesn’t even have an integer keeping track of its length
like the Length field in C#. This means that the C# stackalloc
keyword is unnecessary as C++ arrays are already allocated on the
stack when declared as local variables. Likewise, the fixed keyword
to create a fixed-size buffer as a struct or class field is unnecessary
as a C++ array’s elements are already stored inside the struct or
class.
The lines blur even more because we can implicitly convert arrays
into pointers:
int a[3];
a[0] = 123;
// Implicitly convert the int[3] array to an int*
// We get a pointer to the first element
int* p = a;
DebugLog(*p); // 123
// Indexing into pointers works just like in C#
DebugLog(p[0]); // 123
The opposite does not work though: we can’t write int b[3] = p.
It’s common to omit the array size when using curly braces to
initialize the array. This tells the compiler to count the number of
elements in the curly braces and make the array that long.
int a[] = { 1, 2, 3 };
// Add a * to make this a pointer to an array instead of just an array
// This is similar to how int* is a pointer to an int
int (*p)[3] = &a;
// Dereference the pointer to get the array, which we can index into
DebugLog((*p)[0], (*p)[1], (*p)[2]); // 1, 2, 3
int x = 1;
int y = 2;
int z = 3;
// Add a * to int to get int*: a pointer to an int
int* a[] = { &x, &y, &z };
// Index into the array to get the pointer then dereference it to get the int
DebugLog(*a[0], *a[1], *a[2]); // 1, 2, 3
We’ll go into const more later, but for now it’s just important to know
that the characters of the array can’t be changed. For instance, this
would produce a compiler error:
p[0] = 'H';
As long as just one of the string literals has an encoding prefix, the
others will get it too:
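A small sketch of this rule, using the u prefix for a char16_t string, might look like the following. The variable name and the text are arbitrary, and the static_assert relies on the joined literal holding 12 characters plus a terminating null:

```cpp
// Only the first literal has the u prefix, but the joined result
// is a single char16_t string
const char16_t* greeting = u"Hello, " "world";

// 12 characters plus the terminating null, each one char16_t in size
static_assert(sizeof(u"Hello, " "world") == 13 * sizeof(char16_t));
```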
Raw strings like this are commonly used when literals suffice, such
as log message text. When more advanced functionality is desired,
and it very commonly is, wrapper classes such as the C++ Standard
Library’s string or Unreal’s FString are used instead. We’ll go into
string later in the book.
Pointer Arithmetic
Like in C#, arithmetic may be performed on pointers:
int a[3] = { 0, 0, 0 };
int* p = a; // Make p point to the first element of a
*p = 1;
p += 2; // Make p point to the third element of a
*p = 3;
--p; // Make p point to the second element of a
*p = 2;
DebugLog(a[0], a[1], a[2]); // 1, 2, 3
int a[3] = { 0, 0, 0 };
int* theStart = a;
int* theEnd = theStart + 3;
while (theStart < theEnd) // Compare pointers
{
*theStart = 1;
theStart++;
}
DebugLog(a[0], a[1], a[2]); // 1, 1, 1
Recall from chapter six that this satisfies the criteria for a range-
based for loop:
int a[3] = { 1, 2, 3 };
for (int val : a)
{
DebugLog(val); // 1, 2, 3
}
{
int*&& range = a;
int* cur = range;
int* theEnd = range + 3;
for ( ; cur != theEnd; ++cur)
{
int val = *cur;
DebugLog(val);
}
}
Note that the begin and end functions aren’t required in the special
case of arrays because the compiler knows the beginning and
ending pointers since the size of the array is fixed at compile time.
Function Pointers
Unlike C#, in C++ we are allowed to make pointers to functions:
int GetHealth(Player p)
{
return p.Health;
}
// Get a pointer to GetHealth. Syntax in three parts:
// 1) Return type: int
// 2) Pointer name: (*p)
// 3) Parameter types: (Player)
int (*p)(Player) = GetHealth;
// Calling the function pointer calls the function
int health = p(localPlayer);
DebugLog(health);
There are two variants of this syntax that make no difference to the
functionality:
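The two variants are presumably the explicit address-of operator when taking the pointer and the explicit dereference when calling through it. Under that assumption, and with a hypothetical Player struct and ReadHealthBothWays function, they can be sketched as:

```cpp
// Hypothetical stand-in for the game's player type
struct Player
{
    int Health;
};

int GetHealth(Player p)
{
    return p.Health;
}

int ReadHealthBothWays(Player player)
{
    int (*implicitPtr)(Player) = GetHealth;  // function name converts to a pointer
    int (*explicitPtr)(Player) = &GetHealth; // explicit address-of, same pointer

    int a = implicitPtr(player);    // call through the pointer directly
    int b = (*explicitPtr)(player); // dereference first, then call

    // Both pointers and both call styles agree, so this returns the health
    return (implicitPtr == explicitPtr && a == b) ? a : -1;
}
```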
Function pointers are commonly used like delegates in C#. They are
an object that can be passed around that, when called, invokes a
function. They are much more lightweight though as they are just a
pointer. Delegates have much more functionality, such as the ability
to add, remove, and invoke multiple functions and bind to functions
of various types such as instance methods and lambdas. We’ll cover
how to do that in C++ later on in the book.
int GetHealth(Player p)
{
return p.Health;
}
int GetLives(Player p)
{
return p.Lives;
}
// Array of pointers to functions that take a Player and
// return an int
int (*statFunctions[])(Player) = { GetHealth, GetLives };
// Index into the array like any other array
int health = statFunctions[0](localPlayer);
DebugLog(health);
int lives = statFunctions[1](localPlayer);
DebugLog(lives);
int GetTotalPoints(Player*);
This makes the reader ask themselves questions like “can the
Player pointer be null?” The reader might also wonder “is this a
single Player or an array of them?” and “if this is an array, how long
can it be?” The answers really depend on the implementation of
GetTotalPoints, but we don’t want readers to have to guess or
spend their time tracking down and reading the function definition.
The function definition might not even be available, such as with a
closed-source library.
Lvalue References
To address these issues, C++ introduces “references” as an
alternative to pointers. A reference is like an alias to something,
usually backed with a pointer in the compiled code. Here’s how one
looks:
int x = 123;
int& r = x; // <-- reference
DebugLog(x, r); // 123, 123
There are several critical aspects of this. First, the syntax for a
reference is similar to a pointer except that we add a & instead of a *
to the type we want to refer to: int in this case. We can read the
resulting int& r as “r is a reference to an int.”
int x = 123;
int y = 456;
int& r = x;
// This is equivalent to:
// x = y;
// y is read and written to x
// r remains an alias of x
r = y;
DebugLog(x, r); // 456, 456
int x = 123;
// Alias to x
int& r1 = x;
// This is equivalent to:
// int& r2 = x;
// So this is also an alias to x
int& r2 = r1;
DebugLog(r1, r2); // 123, 123
x = 456;
DebugLog(r1, r2); // 456, 456
int& r(x);
int& r = {x};
int& r{x};
int nextId = 0;
int& GetNextId()
{
nextId++;
return nextId;
}
int& id = GetNextId();
DebugLog(id); // 1
id = 0; // Reset
DebugLog(nextId); // 0
Now let’s see a reference to a function. These look just like pointers
to functions, except that there’s a & instead of a *:
That seems like a lot of lost flexibility and a lot more rules to live by,
but it turns out that satisfying all of these constraints is extremely
common. Aside from the last three, these are mostly the constraints
that C# references impose on us and they’ve turned out to be quite
practical. In practice, C++ references are very heavily used to
succinctly convey all of these constraints to readers. Let’s look once
more at the function we started with, now using a reference:
int GetTotalPoints(Player&);
It’s now clear that the Player can’t be null because that’s not
possible with references. It’s clear that this isn’t an array of
Player objects, because that’s also not possible. The & instead of *
means that it’s simply an alias for one non-null Player object.
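To make the contrast concrete, here is a small sketch of both signatures with bodies filled in. The Player fields and the scoring formula are assumptions invented for this example, as is the GetTotalPointsPtr name:

```cpp
#include <cassert>

// Hypothetical Player type for this sketch
struct Player
{
    int Kills;
    int Assists;
};

// Pointer version: the implementation has to decide what null means,
// and callers can't see that decision in the signature
int GetTotalPointsPtr(Player* player)
{
    if (player == nullptr)
    {
        return 0;
    }
    return player->Kills * 100 + player->Assists * 50;
}

// Reference version: always an alias for exactly one Player
int GetTotalPoints(Player& player)
{
    return player.Kills * 100 + player.Assists * 50;
}
```

The reference version needs no null check and uses . instead of -> to access the data members.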
Rvalue References
So far we’ve seen how references can make an alias for an “lvalue,”
which is something with a name. We can also make references to
things without a name. These references to “rvalues” were
introduced in C++11 and are used quite extensively now.
An rvalue reference has two & after the type it references and is
initialized with something that doesn’t have a name:
int&& r = 5;
The literal 5 doesn’t have a name like a variable does. Still, we can
reference it and its lifetime is extended to the lifetime of the
reference so that the reference never refers to something that no
longer exists. It works like this:
{
// 5 is the rvalue
// It's not just a temporary on this line
// Its lifetime is extended to match r
int&& r = 5;
// 123 is the rvalue, but it's just written to x
// 123 stops existing after the semicolon
int x = 123;
// Both the rvalue reference and the variable are
// still readable
DebugLog(r, x); // 5, 123
// The temporary that r refers to is still accessible
// via the alias
r = 6;
DebugLog(r, x); // 6, 123
// Don't worry, we didn't overwrite the fundamental
// concept of 5 :)
DebugLog(5); // 5
// The scope that r is in ends
// r and 5 end their lifetime
// They can no longer be used
}
int&& r(5);
int&& r = {5};
int&& r{5};
Return values can also initialize rvalue references, but these will
become “dangling” references when returning a temporary because
its lifetime is not extended past the end of the function call:
It’s important to keep this in mind and only return rvalue references
whose lifetime is already going to extend beyond the end of the
function call. We’ll see some techniques for doing this later on in the
book.
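As a hedged sketch of the difference, the following contrasts a function that returns by value, whose temporary can safely be bound to an rvalue reference, with a commented-out function that would return a dangling rvalue reference. The function names are made up for illustration:

```cpp
#include <cassert>

// Returns a copy: the caller receives a temporary (an rvalue)
int MakeValue()
{
    int local = 5;
    return local;
}

// Don't do this: the temporary 5 stops existing when the function
// returns, so the returned reference would dangle
// int&& MakeDangling()
// {
//     return 5;
// }

int UseValue()
{
    // Binding the returned temporary to an rvalue reference extends
    // the temporary's lifetime to match r
    int&& r = MakeValue();
    r += 1;
    return r;
}
```

UseValue returns 6 because the temporary returned by MakeValue stays alive for as long as r does.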
int from = 1;
int to = 3;
// Compiler error
// Can't pass int& when int&& is required
PrintRange(from, to);
No other kind of initialization of an rvalue reference is possible with
an lvalue, even something as simple as this:
int x = 123;
// Compiler error
// x is an lvalue when int&& requires an rvalue
int&& r = x;
// Compiler error
// 123 is an rvalue when int& requires an lvalue
int& error = 123;
int&& rr = 123;
int& lr = rr; // rr has a name, so it's an lvalue
DebugLog(rr, lr); // 123, 123
rr = 456;
DebugLog(rr, lr); // 456, 456
The opposite doesn’t work when the lvalue reference has a name,
because that makes it not an rvalue:
int x = 123;
int& lr = x;
// Compiler error
// lr is an lvalue when int&& requires an rvalue
int&& rr = lr;
C# References
C# has several types of references. Let’s compare them with C++
references.
First, there’s the ref keyword used to pass function arguments “by
reference.” This is pretty close to a C++ lvalue reference as the
argument must be an lvalue and acts like an alias for the variable
that was passed. There are some differences though. First, C++
uses & instead of ref in the function signature and doesn’t require
the ref keyword when calling the function. Second, C# ref
arguments can only be references to variables, not functions.
Second, there are ref return values and ref local variables. These
are also similar to C++ lvalue references since they create an alias
to an lvalue. C++ uses the same & syntax instead of ref in both the
function signature for ref returns and variable declaration for
local variables. C# also requires ref at the return statement, but
C++ doesn’t.
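A minimal sketch of the C++ side of that comparison follows. The function names are hypothetical, and the comments show the roughly equivalent C# signatures:

```cpp
#include <cassert>

// Like C#: void Swap(ref int a, ref int b)
void Swap(int& a, int& b)
{
    int temp = a;
    a = b;
    b = temp;
}

// Like C#: ref int Largest(ref int a, ref int b)
int& Largest(int& a, int& b)
{
    return a > b ? a : b; // no 'ref' needed at the return statement
}

int Demo()
{
    int x = 1;
    int y = 2;
    Swap(x, y);               // no 'ref' at the call site: x is 2, y is 1
    int& big = Largest(x, y); // like a C# 'ref int' local: aliases x
    big = 10;                 // writes through the alias to x
    return x + y;             // 10 + 1
}
```

Note that nothing at the call site of Swap reveals that the arguments may be modified, which is why C++ code often leans on naming or const to communicate intent.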
enum Color
{
Red = 0xff0000,
Green = 0x00ff00,
Blue = 0x0000ff
};
DebugLog(Red); // 0xff0000
Second, we see how the Red, Green, and Blue enumerators are put
into the surrounding scope rather than inside the Color enum as
would have been the case in C#. This means the DebugLog line has
Red in scope to read and print out.
enum
{
Red = 0xff0000,
Green = 0x00ff00,
Blue = 0x0000ff
};
DebugLog(Red); // 0xff0000
Like in C#, the enumerators’ values are optional. They even follow
the same rules for default values: the first enumerator defaults to
zero and subsequent enumerators default to the previous
enumerator’s value plus 1:
enum Prime
{
One = 1,
Two,
Three,
Five = 5
};
DebugLog(One, Two, Three, Five); // 1, 2, 3, 5
int
unsigned int
long
unsigned long
long long
unsigned long long
Like in C#, we can cast enumerators to integers. Just note that the
value isn’t preserved if that integer type is too small to hold the
enumerator’s value:
// OK cast to integer
int one = (int)One;
DebugLog(one); // 1
// Too big to fit in 1 byte: the value is truncated
// (the result is implementation-defined before C++20)
char red = Red;
DebugLog(red); // not 0xff0000
Prime prime{3};
Note that we’ve used the enum name like a type here, just like we
could in C#. That means we can write functions like this:
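A minimal sketch of such a function is below. IsOdd is a hypothetical name, and the Prime definition is repeated so that the block is self-contained:

```cpp
#include <cassert>

enum Prime
{
    One = 1,
    Two,
    Three,
    Five = 5
};

// The enum name is used as the parameter type
bool IsOdd(Prime prime)
{
    // Unscoped enumerators convert implicitly to integers
    return prime % 2 != 0;
}
```

Callers pass enumerators directly, as in IsOdd(Three).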
Both the declaration and the definition are a type just like int or
float, so they can be followed by identifiers in order to create
variables:
// Declaration
enum Color : unsigned int red, green, blue;
red = Red;
green = Green;
blue = Blue;
DebugLog(red, green, blue); // 0xff0000, 0x00ff00, 0x0000ff
// Definition
enum Color : unsigned int
{
Red = 0xff0000,
Green = 0x00ff00,
Blue = 0x0000ff
} red = Red, green = Green, blue = Blue;
DebugLog(red, green, blue); // 0xff0000, 0x00ff00, 0x0000ff
enum Channel
{
RedOffset = 16,
GreenOffset = 8,
BlueOffset = 0
};
unsigned char GetRed(unsigned int color)
{
return (color & Red) >> RedOffset;
}
DebugLog(GetRed(0x123456)); // 0x12
Scoped Enumerations
Fittingly, the other type of enumeration in C++ is called a “scoped”
enumeration. As expected, this introduces a new scope which
contains the enumerators. They do not spill out into the surrounding
scope, so the “scope resolution” operator is required to access them:
// OK
unsigned int red = (unsigned int)Color::Red;
The choice of underlying type when not explicitly stated is a lot
simpler, too: it’s always int:
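A minimal sketch that checks this at compile time using std::underlying_type from the C++ Standard Library. The Channel enumeration and the ToInt helper are additions made for this illustration:

```cpp
#include <cassert>
#include <type_traits>

// No underlying type stated: it's int
enum struct Color
{
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff
};

// An underlying type can still be stated explicitly
enum struct Channel : unsigned char
{
    Red,
    Green,
    Blue
};

static_assert(
    std::is_same<std::underlying_type<Color>::type, int>::value,
    "Scoped enumerations default to int");
static_assert(
    std::is_same<std::underlying_type<Channel>::type, unsigned char>::value,
    "An explicit underlying type overrides the default");

// Scoped enumerators need an explicit cast to become integers
unsigned int ToInt(Color color)
{
    return static_cast<unsigned int>(color);
}
```

If either static_assert were false, the code would fail to compile rather than fail at runtime.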
Those are actually all the differences between the two kinds of
enumerations. The following table compares and contrasts them with
each other and C# enumerations:
Aspect | Example | Unscoped | Scoped | C#
Initialization of enumerators | One = 1 | Optional | Optional | Optional
Casting enumerators to integers | int one = (int)One; | Yes | Yes | Yes
Casting integers to enumerators | Prime p = (Prime)4; | Yes | Yes | Yes
Name | enum Prime {}; | Optional | Required | Required
Implicit enumerators-to-integer conversion | C++: int one = Prime::One, C#: int one = Prime.One | Yes | No | No
Scope resolution operator | C++: Prime::One, C#: Prime.One | Optional | Required | Required
Implicit underlying type | enum E {}; | int or larger | int | int
Underlying type required for declaration | enum E; | Yes | No | N/A (no declarations)
Initialization from integer | C++: Prime p{4}, C#: Prime p = 4 | Yes | Yes | No
Immediate variables | enum Prime {} p; | Yes | Yes | No
Requirement to use bitwise operators | | None | Casting | None ([Flags] optional)
Conclusion
Of the two kinds of enumerations in C++, scoped enumerations are
definitely closest to C# enumerations. Still, C++ has unscoped
enumerations and they are commonly used. It’s important to know
the differences between them, scoped enumerations, and C#
enumerations as they have a number of subtle differences to keep in
mind.
10. Struct Basics
Declaration and Definition
Just like with functions and enumerations, structs may be declared
and defined separately:
// Declaration
struct Vec3;
// Definition
struct Vec3
{
float x;
float y;
float z;
};
Vec3 vec;
As with primitives and enumerations, this variable is uninitialized.
Initialization of structs is a surprisingly complex topic compared to C#
and we’ll cover it in depth later on in the book. For now, let’s just
initialize by individually setting each of the struct’s data members.
That’s the C++ term for the equivalent of fields in C#. They’re also
commonly called “member variables.” To do this, we use the .
operator just like in C#:
Vec3 vec;
vec.x = 1;
vec.y = 2;
vec.z = 3;
DebugLog(vec.x, vec.y, vec.z); // 1, 2, 3
We can also initialize the data members in the struct definition with
either =x or {x}:
struct Vec3
{
float x = 1;
float y{2};
float z = 3;
};
Vec3 vec;
DebugLog(vec.x, vec.y, vec.z); // 1, 2, 3
As with enumerations, we can also declare variables between the
closing curly brace and the semicolon of a definition:
struct Vec3
{
float x;
float y;
float z;
} v1, v2, v3;
This is sometimes used when omitting the name of the struct. This
anonymous struct has no name we can type out, but it can be used
all the same in a similar way to C# tuples ((string Name, int Year)
t = ("Apollo 11", 1969);):
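A minimal sketch of such an unnamed struct, mirroring the C# tuple above. The function name GetMissionYear is made up for this example:

```cpp
#include <cassert>

int GetMissionYear()
{
    // Unnamed struct: the type has no name, but the variable t does
    struct
    {
        const char* Name;
        int Year;
    } t;

    t.Name = "Apollo 11";
    t.Year = 1969;
    return t.Year;
}
```

Since the type has no name, the only way to create more variables of it is to list them after the closing curly brace or to use decltype(t).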
// Declaration
struct Vec3;
// Compiler error: can't create a variable before
// definition
Vec3 v;
// Compiler error: can't take a function parameter before
// definition
float GetMagnitudeSquared(Vec3 vec)
{
return 0;
}
// Compiler error: can't return a function return value
// before definition
Vec3 MakeVec(float x, float y, float z)
{
// Compiler error: can't create a variable before
// definition
Vec3 v;
// Compiler error: can't return a struct before
// definition
return v;
}
// Declaration
struct Vec3;
// Pointer
Vec3* p = nullptr;
// lvalue reference
float GetMagnitudeSquared(Vec3& vec)
{
return 0;
}
// rvalue reference
float GetMagnitudeSquared(Vec3&& vec)
{
return 0;
}
// Variable
Vec3 vec;
vec.x = 1;
vec.y = 2;
vec.z = 3;
// Pointer
Vec3* p = &vec;
p->x = 10;
p->y = 20;
(*p).z = 30; // Alternate version of p->z
// lvalue reference
float GetMagnitudeSquared(Vec3& vec)
{
return vec.x*vec.x + vec.y*vec.y + vec.z*vec.z;
}
// rvalue reference
float GetMagnitudeSquared(Vec3&& vec)
{
return vec.x*vec.x + vec.y*vec.y + vec.z*vec.z;
}
Layout
Like in C#, the data members of a struct are grouped together in
memory. Exactly how they’re laid out in memory isn’t defined by the
C++ Standard though. Each compiler will lay out the data members
as appropriate for factors such as the CPU architecture being
compiled for.
That said, compilers virtually always lay out the data members in a
predictable pattern. Each is placed sequentially in the same order as
written in the source code. Padding is placed between the data
members according to the alignment requirements of the data types,
which varies by CPU architecture. For example:
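A minimal sketch of such a layout, assuming a typical platform where float is 4 bytes with 4-byte alignment. The offsets in the comments are what common compilers produce, not something the C++ Standard guarantees, and the Padded struct and MembersAreInOrder helper are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Padded
{
    uint8_t Flag;  // offset 0
    // typically 3 bytes of padding so Value is 4-byte aligned
    float Value;   // typically offset 4
    uint8_t Count; // typically offset 8
    // typically 3 bytes of tail padding so arrays stay aligned
};

// Checks only what holds regardless of how much padding is added
bool MembersAreInOrder()
{
    return offsetof(Padded, Flag) == 0
        && offsetof(Padded, Flag) < offsetof(Padded, Value)
        && offsetof(Padded, Value) < offsetof(Padded, Count)
        && sizeof(Padded) >=
            sizeof(uint8_t) + sizeof(float) + sizeof(uint8_t);
}
```

On common 64-bit desktop compilers sizeof(Padded) is typically 12 even though the data members only add up to 6 bytes. Reordering the members so the two uint8_t values are adjacent usually shrinks it to 8.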
struct Player
{
bool IsAlive : 1;
uint8_t Lives : 3;
uint8_t Team : 2;
uint8_t WeaponID : 2;
};
This struct takes up just one byte of memory because the sum of its
bit fields’ sizes is 8. Normally it would have taken up 4 bytes since
each data member would take up a whole byte of its own.
Player p;
p.IsAlive = true;
p.Lives = 5;
p.Team = 2;
p.WeaponID = 1;
DebugLog(p.IsAlive, p.Lives, p.Team, p.WeaponID);
// true, 5, 2, 1
The compiler will, as always, generate CPU instructions specific to
the architecture being compiled for and depending on settings such
as optimization level. Generally though, the instructions will read one
or more bytes containing the desired bits, use a bit mask to remove
the other bits that were read, and shift the desired bits to the least-
significant part of the data member’s type. Writing to a bit field is a
similar process.
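The mask-and-shift process described above can be sketched by hand. The bit positions used here (IsAlive in bit 0, Lives in bits 1 to 3) are an assumption for illustration, since where a compiler actually places each bit field is up to the implementation. The function names are also hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for this sketch: bit 0 = IsAlive, bits 1-3 = Lives
const uint8_t LivesShift = 1;
const uint8_t LivesMask = 0x7 << LivesShift; // 0b00001110

uint8_t ReadLives(uint8_t packed)
{
    // Mask off the other bits, then shift the desired bits down to
    // the least-significant position
    return static_cast<uint8_t>((packed & LivesMask) >> LivesShift);
}

uint8_t WriteLives(uint8_t packed, uint8_t lives)
{
    // Clear the old bits, then OR in the new value shifted into place
    uint8_t cleared = static_cast<uint8_t>(packed & ~LivesMask);
    uint8_t shifted =
        static_cast<uint8_t>((lives << LivesShift) & LivesMask);
    return static_cast<uint8_t>(cleared | shifted);
}
```

Bit fields let the compiler generate this kind of code for us, at the cost of giving up control over the exact layout.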
struct Player
{
bool IsAlive : 1 = true;
uint8_t Lives : 3 {5};
uint8_t Team : 2 {2};
uint8_t WeaponID : 2 = 1;
};
Player p;
DebugLog(p.IsAlive, p.Lives, p.Team, p.WeaponID);
// true, 5, 2, 1
Note that the size of a bit field may be larger than the stated type:
struct SixtyFourKilobits
{
uint8_t Val : 64*1024;
};
The size of Val and the struct itself is 64 kilobits, but Val is still used
just like an 8-bit integer.
struct FirstLast
{
uint8_t First : 1; // First bit of the byte
uint8_t : 6; // Skip six bits
uint8_t Last : 1; // Last bit of the byte
};
Unnamed bit fields can also have zero size, which tells the compiler
to put the next data member on the next byte it aligns to:
struct FirstBitOfTwoBytes
{
uint8_t Byte1 : 1; // First bit of the first byte
uint8_t : 0; // Skip to the next byte
uint8_t Byte2 : 1; // First bit of the second byte
};
FirstBitOfTwoBytes x;
// Compiler error: can't take the address of a bit field
uint8_t* p = &x.Byte1;
Static Data Members
Like static fields in C#, data members may be static in C++:
struct Player
{
int32_t Score;
static int32_t HighScore;
};
The meaning is the same as in C#. Each Player object doesn’t have
a HighScore but rather there is one HighScore for all Player objects.
Because it’s bound to the struct type, not an instance of the struct,
we use the scope resolution operator (::) as we did with scoped
enumerations to access the data member:
Player::HighScore = 0;
struct Player
{
int32_t Score;
static int32_t HighScore; // Declaration
};
// Definition
int32_t Player::HighScore;
// Incorrect definition
// This just creates a new HighScore variable
// We need the "Player::" part to refer to the
// declaration
int32_t HighScore;
int32_t Player::HighScore = 0;
Because the static data member inside the struct definition is just a
declaration, it can use other types that haven’t yet been defined as
long as they’re defined by the time we define the static data member:
// Declaration
struct Vec3;
struct Player
{
int32_t Health;
// Declaration
static Vec3 Fastest;
};
// Definition
struct Vec3
{
float x;
float y;
float z;
};
// Definition
Vec3 Player::Fastest;
struct Player
{
int32_t Health;
const static int32_t MaxHealth = 100;
};
We’re still allowed to put the definition outside the struct, but it’s
optional to do so. If we do, we can only put the initialization in one of
the two places:
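A minimal sketch of the two options. The Enemy struct is a hypothetical second type added so that both options can appear in one block:

```cpp
#include <cassert>
#include <cstdint>

// Option 1: initialize inside the struct
struct Player
{
    int32_t Health;
    const static int32_t MaxHealth = 100;
};
// Optional definition: no initializer here since it's in the struct
const int32_t Player::MaxHealth;

// Option 2: initialize at the definition outside the struct
struct Enemy
{
    int32_t Health;
    const static int32_t MaxHealth;
};
const int32_t Enemy::MaxHealth = 50;
```

Writing an initializer in both places for the same data member is a compiler error.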
Static data members may also be inline, much like with global
variables:
struct Player
{
int32_t Health;
inline static int32_t MaxHealth = 100;
};
In this case, we can’t put a definition outside of the struct:
struct Player
{
int32_t Health;
inline static int32_t MaxHealth = 100;
};
// Compiler error: can't define outside the struct
int32_t Player::MaxHealth;
Lastly, static data members can’t be bit fields. This would make no
sense since they’re not part of instances of the struct and aren’t even
necessarily located together in memory with other static data
members of the struct:
struct Flags
{
// All of these are compiler errors
// Static data members can't be bit fields
static bool IsStarted : 1;
static bool WonGame : 1;
static bool GotHighScore : 1;
static bool FoundSecret : 1;
static bool PlayedMultiplayer : 1;
static bool IsLoggedIn : 1;
static bool RatedGame : 1;
static bool RanBenchmark : 1;
};
To work around this, make a struct with non-static bit fields and
another struct with a static instance of the first struct:
struct FlagBits
{
bool IsStarted : 1;
bool WonGame : 1;
bool GotHighScore : 1;
bool FoundSecret : 1;
bool PlayedMultiplayer : 1;
bool IsLoggedIn : 1;
bool RatedGame : 1;
bool RanBenchmark : 1;
};
struct Flags
{
static FlagBits Bits;
};
FlagBits Flags::Bits;
Flags::Bits.WonGame = true;
Disallowed Data Members
C++ forbids using some kinds of data members in structs. First, auto
is not allowed for the data type:
struct Bad
{
// Compiler error: auto isn't allowed even if we
// initialize it inline
auto Val = 123;
};
struct Good
{
// OK since data member is static and const
static const auto Val = 123;
};
struct Bad
{
// Compiler error: data members can't be register
// variables
register int Val = 123;
};
This is also true for other storage class specifiers like extern:
struct Bad
{
// Compiler error: data members can't be extern
// variables
extern int Val = 123;
};
A variable of the struct type can be declared with either storage
class specifier instead:
struct Good
{
uint8_t Val;
};
register Good r;
extern Good e;
While we saw above that declared types that aren’t yet defined can
be used for static data members, this is not the case for non-static
data members:
struct Vec3;
struct Bad
{
// Compiler error: Vec3 isn't defined yet
Vec3 Pos;
};
As with other variables of types that are declared but not yet defined,
we are allowed to have pointers and references:
struct Vec3;
struct Good
{
// OK to have a pointer to a type that's declared but
// not yet defined
Vec3* PosPointer;
// OK to have an lvalue reference to a type that's
// declared but not yet defined
Vec3& PosLvalueReference;
// OK to have an rvalue reference to a type that's
// declared but not yet defined
Vec3&& PosRvalueReference;
};
Nested Types
C++ allows us to nest types within structs just like we can in C#.
Let’s start with a scoped enumeration:
struct Character
{
enum struct Type
{
Player,
NonPlayer
};
Type Type;
};
Character c;
c.Type = Character::Type::Player;
Also note how we can have both a Type enumeration and a Type
data member. The two are disambiguated by the operator used to
access the content of the struct:
Character c;
Character* p = &c;
Character& r = c;
// . operator means "access data member"
auto t = c.Type;
t = r.Type;
// -> operator means "dereference pointer then access
// data member"
t = p->Type;
// :: operator means "get something scoped to the type"
Character::Type t2;
Ambiguity arises if the data member is static and has the same
name as a nested type:
struct Character
{
enum struct Type
{
Player,
NonPlayer
};
static Type Type;
};
// Compiler error: Character::Type is ambiguous
// It could be either the scoped enumeration or the
// static data member
Character::Type Character::Type =
Character::Type::Player;
struct Character
{
enum Type
{
Player,
NonPlayer
};
Type Type;
} c;
// Optionally specify the unscoped enumeration type name
c.Type = Character::Type::Player;
// Or don't specify it
// Enumerators are added to the surrounding scope: the
// struct
c.Type = Character::Player;
struct Flags
{
// Unnamed struct with bit fields
// The data member Bits is static
static struct
{
bool IsStarted : 1;
bool WonGame : 1;
bool GotHighScore : 1;
bool FoundSecret : 1;
bool PlayedMultiplayer : 1;
bool IsLoggedIn : 1;
bool RatedGame : 1;
bool RanBenchmark : 1;
} Bits;
};
// The unnamed struct has no name we can just type
// Use decltype to refer to its type
decltype(Flags::Bits) Flags::Bits;
Flags::Bits.WonGame = true;
struct S1
{
struct S2
{
struct S3
{
struct S4
{
struct S5
{
uint8_t Val;
};
};
};
};
};
S1::S2::S3::S4::S5 s;
s.Val = 123;
Conclusion
We’re only just scratching the surface of C++ structs and already
they have quite a few more advanced features than their C#
counterparts:
Feature Example
struct Vector2
{
float X;
float Y;
float SqrMagnitude()
{
return this->X*this->X + this->Y*this->Y;
}
};
float SqrMagnitude()
{
return X*X + Y*Y;
}
Unlike C#, but in keeping with other C++ functions and with data
member initialization, we can split the function’s declaration and
definition. If we do so, we need to place the definition outside the
class:
struct Vector2
{
float X;
float Y;
// Declaration
float SqrMagnitude();
};
// Definition
float Vector2::SqrMagnitude()
{
return X*X + Y*Y;
}
Vector2 v;
v.X = 2;
v.Y = 3;
float sqrMag = v.SqrMagnitude();
DebugLog(sqrMag); // 13
Calling the member function works just like in C# and lines up with
how we access data members. If we had a pointer, we’d use ->
instead of .:
Vector2* p = &v;
float sqrMag = p->SqrMagnitude();
All the rules that apply to the global functions we’ve seen before
apply to member functions. This includes support for overloading:
struct Weapon
{
int32_t Damage;
};
struct Potion
{
int32_t HealAmount;
};
struct Player
{
int32_t Health;
void Use(Weapon weapon, Player& target)
{
target.Health -= weapon.Damage;
}
void Use(Potion potion)
{
Health += potion.HealAmount;
}
};
Player player;
player.Health = 50;
Player target;
target.Health = 50;
Weapon weapon;
weapon.Damage = 10;
player.Use(weapon, target);
DebugLog(target.Health); // 40
Potion potion;
potion.HealAmount = 20;
player.Use(potion);
DebugLog(player.Health); // 70
struct Test
{
// Only allow calling this on lvalue objects
void Log() &
{
DebugLog("lvalue-only");
}
// Only allow calling this on rvalue objects
void Log() &&
{
DebugLog("rvalue-only");
}
// Allow calling this on lvalue or rvalue objects
// Note: not allowed if either of the above exist
void Log()
{
DebugLog("lvalue or rvalue");
}
};
// Pretend the "lvalue or rvalue" version isn't
// defined...
// 'test' has a name, so it's an lvalue
Test test;
test.Log(); // lvalue-only
// 'Test()' doesn't have a name, so it's an rvalue
Test().Log(); // rvalue-only
We’ll go more into initialization of structs soon, but for now Test() is
a way to create an instance of a Test struct.
struct Player
{
int32_t Health;
static int32_t ComputeNewHealth(int32_t oldHealth,
int32_t damage)
{
return damage >= oldHealth ? 0 : oldHealth - damage;
}
};
To call this, we refer to the member function using the struct type
rather than an instance of the type. This is just like in C#, except that
we use :: instead of . as is normal for referring to the contents of a
type in C++:
DebugLog(Player::ComputeNewHealth(100, 15)); // 85
DebugLog(Player::ComputeNewHealth(10, 15)); // 0
struct Vector2
{
float X;
float Y;
Vector2 operator+(Vector2 other)
{
Vector2 result;
result.X = X + other.X;
result.Y = Y + other.Y;
return result;
}
};
Vector2 a;
a.X = 2;
a.Y = 3;
Vector2 b;
b.X = 10;
b.Y = 20;
Vector2 c = a + b;
DebugLog(a.X, a.Y); // 2, 3
DebugLog(b.X, b.Y); // 10, 20
DebugLog(c.X, c.Y); // 12, 23
struct Vector2
{
float X;
float Y;
};
Vector2 operator+(Vector2 a, Vector2 b)
{
Vector2 result;
result.X = a.X + b.X;
result.Y = a.Y + b.Y;
return result;
}
// (usage is identical)
Vector2 d = a.operator+(b);
DebugLog(d.X, d.Y);
Operator C++ C#
+x Yes Yes
-x Yes Yes
!x Yes Yes
~x Yes Yes
x + y Yes Yes
x - y Yes Yes
x * y Yes Yes
x / y Yes Yes
x % y Yes Yes
x ^ y Yes Yes
x | y Yes Yes
x = y Yes No
x->y Yes No
x(x) Yes No
x ??= y N/A No
x..y N/A No
=> N/A No
as N/A No
await N/A No
checked N/A No
unchecked N/A No
default N/A No
delegate N/A No
is N/A No
nameof N/A No
new Yes No
sizeof No No
stackalloc N/A No
typeof N/A No
While it can be used directly like this, it's especially valuable for
operator overloading as it implies a canonical implementation of the
relational operators: <, <=, >, and >=. The equality operators == and
!= come from operator== instead, which the compiler only generates
automatically when operator<=> is defaulted. That allows us to write
code that either uses the three-way comparison operator directly or
indirectly:
struct Vector2
{
float X;
float Y;
float SqrMagnitude()
{
return this->X*this->X + this->Y*this->Y;
}
float operator<=>(Vector2 other)
{
return SqrMagnitude() - other.SqrMagnitude();
}
};
int main()
{
Vector2 a;
a.X = 2;
a.Y = 3;
Vector2 b;
b.X = 10;
b.Y = 20;
// Directly use <=>
float res = a <=> b;
if (res < 0)
{
DebugLog("a < b");
}
// Indirectly use <=>
if (a < b)
{
DebugLog("a < b");
}
}
Conclusion
In this chapter we've seen C++'s version of methods, called member
functions, and overloaded operators. Member functions are quite
similar to their C# counterparts, but do have differences such as an
optional declaration-definition split, overloading based on lvalue and
rvalue objects, and conversion to function pointers.
struct Vector2
{
float X;
float Y;
Vector2(float x, float y)
{
X = x;
Y = y;
}
};
struct Vector2
{
float X;
float Y;
Vector2(float x, float y);
};
Vector2::Vector2(float x, float y)
{
X = x;
Y = y;
}
struct Ray2
{
Vector2 Origin;
Vector2 Direction;
Ray2(float originX, float originY, float directionX,
float directionY)
: Origin(originX, originY), Direction{directionX,
directionY}
{
}
};
The initializer list starts with a : and then lists a comma-delimited list
of data members. Each has its initialization arguments in either
parentheses (Origin(originX, originY)) or curly braces
(Direction{directionX, directionY}). The order doesn’t matter
since the order the data members are declared in the struct is
always used.
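A small sketch can make the ordering rule visible. The Logged and Pair types and the initOrder string are hypothetical helpers that record the order in which data members are constructed:

```cpp
#include <cassert>
#include <string>

// Records the order in which data members are constructed
std::string initOrder;

struct Logged
{
    Logged(char name)
    {
        initOrder += name;
    }
};

struct Pair
{
    Logged First;
    Logged Second;

    // Second is listed first here, but First is still initialized
    // first because it's declared first in the struct
    Pair()
        : Second('2'), First('1')
    {
    }
};

std::string GetInitOrder()
{
    initOrder.clear();
    Pair pair;
    return initOrder; // "12"
}
```

Many compilers warn when the initializer list order doesn't match the declaration order, since the mismatch is easy to misread when one data member's initialization depends on another.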
struct Vector2
{
float X;
float Y;
Vector2(float x, float y)
: X(x), Y(y)
{
}
};
struct Ray2
{
Vector2 Origin;
Vector2 Direction;
Ray2(float originX, float originY, float directionX,
float directionY)
: Origin(originX, originY), Direction{directionX,
directionY}
{
}
Ray2(float directionX, float directionY)
: Ray2(0, 0, directionX, directionY)
{
}
};
struct Vector2
{
float X;
float Y;
Vector2()
{
X = 0;
Y = 0;
}
};
// C#
Vector2 vecA = new Vector2(); // 0, 0
Vector2 vecB = default(Vector2); // 0, 0
Vector2 vecC = default; // 0, 0
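C++ compilers also generate a default constructor for us. Unlike C#,
it leaves primitive data members such as float uninitialized unless
they have an initializer in the struct definition or the object is
value-initialized, as in Vector2(), which zeroes them.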
C++ structs also behave in the same way that C# classes behave: if
a struct defines a constructor then the compiler won’t generate a
default constructor. That means that this version of Vector2 doesn’t
get a compiler-generated default constructor:
struct Vector2
{
float X;
float Y;
Vector2(float x, float y)
: X(x), Y(y)
{
}
};
struct Vector2
{
float X;
float Y;
Vector2() = default;
Vector2(float x, float y)
: X(x), Y(y)
{
}
};
struct Vector2
{
float X;
float Y;
Vector2();
Vector2(float x, float y)
: X(x), Y(y)
{
}
};
Vector2::Vector2() = default;
struct Vector2
{
float X;
float Y;
Vector2();
};
// Compiler error
// Must be inside the struct
Vector2::Vector2() = delete;
struct Ray2
{
Vector2 Origin;
Vector2 Direction;
Ray2(float originX, float originY, float directionX,
float directionY)
// Compiler error
// Origin and Direction don't have a default
// constructor
// The (float, float) constructor needs to be
// called
// That needs to be done here in the initializer
// list
{
// Don't have Vector2 objects to initialize
// They needed to be initialized in the
// initializer list
Origin.X = originX;
Origin.Y = originY;
Direction.X = directionX;
Direction.Y = directionY;
}
}
};
struct Ray2
{
Vector2 Origin;
Vector2 Direction;
Ray2(float originX, float originY, float directionX,
float directionY)
: Origin(originX, originY), Direction{directionX,
directionY}
{
}
};
Copy and Move Constructors
A copy constructor is a constructor that takes an lvalue reference to
the same type of struct. This is typically a const reference. We’ll
cover const more later in the book, but for now it can be thought of
as “read only.”
struct Vector2
{
float X;
float Y;
// Default constructor
Vector2()
{
X = 0;
Y = 0;
}
// Copy constructor
Vector2(const Vector2& other)
{
X = other.X;
Y = other.Y;
}
// Copy constructor (parameter is not const)
Vector2(Vector2& other)
{
X = other.X;
Y = other.Y;
}
// Move constructor
Vector2(const Vector2&& other)
{
X = other.X;
Y = other.Y;
}
// Move constructor (parameter is not const)
Vector2(Vector2&& other)
{
X = other.X;
Y = other.Y;
}
};
struct Vector2
{
float X;
float Y;
// Inside struct
Vector2(const Vector2& other) = default;
};
struct Ray2
{
Vector2 Origin;
Vector2 Direction;
Ray2(Ray2&& other);
};
// Outside struct
// Explicitly defaulted move constructor can't take const
Ray2::Ray2(Ray2&& other) = default;
struct Vector2
{
float X;
float Y;
Vector2(const Vector2& other) = delete;
Vector2(const Vector2&& other) = delete;
};
Destructors
C# classes can have finalizers, often called destructors. C# structs
cannot, but C++ structs can.
struct File
{
FILE* handle;
// Constructor
File(const char* path)
{
// fopen() opens a file
handle = fopen(path, "r");
}
// Destructor
~File()
{
// fclose() closes the file
fclose(handle);
}
};
struct File
{
FILE* handle;
// Constructor
File(const char* path)
{
// fopen() opens a file
handle = fopen(path, "r");
}
// Destructor declaration
~File();
};
// Destructor definition
File::~File()
{
// fclose() closes the file
fclose(handle);
}
The destructor is usually called implicitly, but it can be called
explicitly:
File file("myfile.txt");
file.~File(); // Call destructor
void OpenCloseFile()
{
File file("myfile.txt");
DebugLog("file opened");
// Compiler generates: file.~File();
}
void OpenCloseFile()
{
File file("myfile.txt");
if (file.handle == nullptr)
{
DebugLog("file failed to open");
// Compiler generates: file.~File();
throw IOException();
}
DebugLog("file opened");
// Compiler generates: file.~File();
}
No matter how file goes out of scope, its destructor is called first.
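This can be observed with a small sketch that logs constructor and destructor calls. The Scoped type and the events string are hypothetical, and std::runtime_error from the C++ Standard Library stands in for IOException:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Log of constructor and destructor calls, for illustration
std::string events;

struct Scoped
{
    char Name;

    Scoped(char name)
        : Name(name)
    {
        events += Name;
    }

    ~Scoped()
    {
        events += '~';
        events += Name;
    }
};

void Leave(bool shouldThrow)
{
    Scoped a('a');
    Scoped b('b');
    if (shouldThrow)
    {
        // Destructors of b then a run before the catch block does
        throw std::runtime_error("leaving via exception");
    }
    // Destructors of b then a run on normal return, too
}

std::string Run(bool shouldThrow)
{
    events.clear();
    try
    {
        Leave(shouldThrow);
    }
    catch (const std::runtime_error&)
    {
        events += '!';
    }
    return events;
}
```

Both paths destroy b before a, the reverse of the order in which they were constructed. The exception path then appends '!' once the catch block runs.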
void Foo()
{
label:
File file("myfile.txt");
if (RollRandomNumber() == 3)
{
// Compiler generates: file.~File();
return;
}
// Compiler generates: file.~File();
goto label;
}
To briefly see how this impacts the design of C++ code, let’s add a
GetSize member function to File so it can do something useful. Let’s
also add some exception-based error handling:
struct File
{
FILE* handle;
File(const char* path)
{
handle = fopen(path, "r");
if (handle == nullptr)
{
throw IOException();
}
}
long GetSize()
{
long oldPos = ftell(handle);
if (oldPos == -1)
{
throw IOException();
}
int fseekRet = fseek(handle, 0, SEEK_END);
if (fseekRet != 0)
{
throw IOException();
}
long size = ftell(handle);
if (size == -1)
{
throw IOException();
}
fseekRet = fseek(handle, oldPos, SEEK_SET);
if (fseekRet != 0)
{
throw IOException();
}
return size;
}
~File()
{
fclose(handle);
}
};
We can use this to get the size of the file like so:
long GetTotalSize()
{
File fileA("myfileA.txt");
File fileB("myfileB.txt");
long sizeA = fileA.GetSize();
long sizeB = fileB.GetSize();
long totalSize = sizeA + sizeB;
return totalSize;
}
The compiler generates several destructor calls for this. To see them
all, let’s see a pseudo-code version of what the compiler
generates:
long GetTotalSize()
{
File fileA("myfileA.txt");
try
{
File fileB("myfileB.txt");
try
{
long sizeA = fileA.GetSize();
long sizeB = fileB.GetSize();
long totalSize = sizeA + sizeB;
fileB.~File();
fileA.~File();
return totalSize;
}
catch (...) // Catch all types of exceptions
{
fileB.~File();
// Re-throw the exception to the outer catch
throw;
}
}
catch (...) // Catch all types of exceptions
{
fileA.~File();
throw; // Re-throw the exception
}
}
struct TwoFiles
{
File FileA;
File FileB;
};
void Foo()
{
// If we write this code...
TwoFiles tf;
// The compiler generates constructor calls: A then B
// Pseudo-code: can't really call a constructor directly
tf.FileA();
tf.FileB();
// Then destructor calls: B then A
tf.FileB.~File();
tf.FileA.~File();
}
struct TwoFiles
{
File FileA;
File FileB;
// Compiler-generated destructor
~TwoFiles()
{
FileB.~File();
FileA.~File();
}
};
struct TwoFiles
{
File FileA;
File FileB;
~TwoFiles() = delete;
};
struct Vector2
{
float X;
float Y;
explicit Vector2(const Vector2& other)
{
X = other.X;
Y = other.Y;
}
};
struct Vector2
{
float X;
float Y;
explicit (2 > 1) Vector2(const Vector2& other)
{
X = other.X;
Y = other.Y;
}
};
struct Vector2
{
float X;
float Y;
operator bool()
{
return X != 0 || Y != 0;
}
};
struct Vector2
{
float X;
float Y;
explicit operator bool()
{
return X != 0 || Y != 0;
}
};
struct Vector2
{
float X;
float Y;
explicit (2 > 1) operator bool()
{
return X != 0 || Y != 0;
}
};
Vector2 v1;
v1.X = 2;
v1.Y = 4;
bool b = v1.operator bool();
Initialization Types
C++ classifies initialization into the following types:
Default
Aggregate
Constant
Copy
Direct
List
Reference
Value
Zero
The rules for how a type works frequently defer to the rules for how
another type works. This is similar to a function calling another
function. It creates a dependency of one type on another type. These
dependencies frequently form cycles in the graph, which looks
roughly like this:
This means that as we go through the initialization types we’re going
to refer to other initialization types that we haven’t seen yet. Feel free
to jump ahead to the referenced type or come back to revisit a type
after reading about its references later on in the chapter.
T object;
struct HasDataMember
{
T Object;
int X;
HasDataMember()
: X(123) // No mention of Object
{
}
};
struct ConstructorLogs
{
ConstructorLogs()
{
DebugLog("default");
}
};
ConstructorLogs single; // Prints "default"
ConstructorLogs array[3]; // Prints "default", "default", "default"
float f;
DebugLog(f); // Undefined behavior!
void Foo()
{
const float f; // Compiler error: default initializer does nothing
}
// Assignment style
T object = other;
// Function call
func(other)
// Return value
return other;
// Array assigned to curly braces
T array[N] = {other};
For the first three forms, only one object is involved. That object’s
copy constructor is called with other being passed in as the
argument:
struct Logs
{
Logs() = default;
Logs(const Logs& logs)
{
DebugLog("copy");
}
};
Logs Foo(Logs a)
{
Logs b = a; // "copy" for assignment style
return a; // "copy" for return value
}
Logs x;
Foo(x); // "copy" for function call
struct Logs
{
Logs() = default;
explicit Logs(const Logs& logs)
{
DebugLog("copy");
}
};
Logs Foo(Logs a)
{
Logs b = a; // Compiler error: copy constructor is explicit
return a; // Compiler error: copy constructor is explicit
}
Logs x;
Foo(x); // Compiler error: copy constructor is explicit
struct ConvertLogs
{
ConvertLogs() = default;
operator bool()
{
DebugLog("convert");
return true;
}
};
bool Foo(bool b)
{
ConvertLogs x;
return x; // "convert" for return value
}
ConvertLogs x;
bool b = x; // "convert" for assignment style
Foo(x); // "convert" for function call
For non-struct types like primitives, enums, and pointers, the value is
simply copied:
int x = y;
The last form deals with arrays. This happens during aggregate
initialization.
Aggregate Initialization
Aggregate initialization has the following forms:
The elements of these arrays and data members of these structs are
copy-initialized with the given values: val1, val2, etc. This is done in
index order starting at the first element for arrays. With structs, this is
done in the order that data members are declared, just like a
constructor’s initializer list does.
struct Vector2
{
float X;
float Y;
};
Vector2 v1 = { 2, 4 };
DebugLog(v1.X, v1.Y); // 2, 4
Vector2 v2{2, 4};
DebugLog(v2.X, v2.Y); // 2, 4
Vector2 v3 = { .X=2, .Y=4 };
DebugLog(v3.X, v3.Y); // 2, 4
Vector2 v4{ .X=2, .Y=4 };
DebugLog(v4.X, v4.Y); // 2, 4
Vector2 v5(2, 4);
DebugLog(v5.X, v5.Y); // 2, 4
It’s a compiler error to pass more values than there are data
members or array elements:
struct DefaultedVector2
{
float X = 1;
float Y;
};
DefaultedVector2 dv1 = {2};
DebugLog(dv1.X, dv1.Y); // 2, 0
float a2[2] = {2};
DebugLog(a2[0], a2[1]); // 2, 0
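The DefaultedVector2 example passes fewer values than there are data members. As a minimal sketch of what happens to the rest, the hypothetical Defaults struct below shows that a member left without a value takes its default member initializer if it has one and is otherwise zeroed. Array elements left without a value are zeroed too:

```cpp
// Hypothetical aggregate, for illustration only
struct Defaults
{
    int A = 1;
    int B = 2;
    int C;
};

int SumOfPartialStruct()
{
    // A takes the passed value: 10
    // B has no value, so its default member initializer is used: 2
    // C has no value and no default member initializer: 0
    Defaults d = {10};
    return d.A + d.B + d.C;
}

int SumOfPartialArray()
{
    // The last two elements have no value: 1, 2, 0, 0
    int values[4] = {1, 2};
    return values[0] + values[1] + values[2] + values[3];
}
```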
// Named variable
T object{val1, val2};
// Unnamed temporary variable
T{val1, val2}
struct MyStruct
{
// Data member
T member{val1, val2};
};
MyStruct::MyStruct()
// Initializer list entry
: member{val1, val2}
{
}
// Named variable
T object = {val1, val2};
// Function call
func({val1, val2})
// Return value
return {val1, val2};
// Overloaded subscript operator call
object[{val1, val2}]
// Assignment
object = {val1, val2}
struct MyStruct
{
// Data member
T member = {val1, val2};
};
Vector2 vec;
vec.X = 2;
vec.Y = 4;
// Direct list initialization direct-initializes vecA
with vec
Vector2 vecA{vec};
DebugLog(vecA.X, vecA.Y); // 2, 4
// Copy list initialization copy-initializes vecB with
vec
Vector2 vecB = {vec};
DebugLog(vecB.X, vecB.Y); // 2, 4
Fourth, if no values are passed in the curly braces and the variable
to initialize is a struct with a default constructor then it’s value-
initialized:
struct NonAggregateVec2
{
float X;
float Y;
NonAggregateVec2()
{
X = 2;
Y = 4;
}
};
NonAggregateVec2 vec = {}; // Value-initialized
DebugLog(vec.X, vec.Y); // 2, 4
Fifth, if the variable has a constructor that takes only the Standard
Library’s std::initializer_list type then that constructor is called.
We haven’t covered any of the Standard Library yet, but the details
of this type aren’t really important at this point. Suffice to say that this
is the C++ equivalent to initializing collections in C#: List<int> list
= new List<int> { 2, 4 };.
struct InitListVec2
{
float X;
float Y;
InitListVec2(std::initializer_list<float> vals)
{
X = *vals.begin();
Y = *(vals.begin() + 1);
}
};
InitListVec2 vec = {2, 4};
DebugLog(vec.X, vec.Y); // 2, 4
Sixth, if any constructor matches the passed values then the one
that matches best is called:
struct MultiConstructorVec2
{
float X;
float Y;
MultiConstructorVec2(float x, float y)
{
X = x;
Y = y;
}
MultiConstructorVec2(double x, double y)
{
X = x;
Y = y;
}
};
MultiConstructorVec2 vec1 = {2.0f, 4.0f}; // Call (float, float) version
DebugLog(vec1.X, vec1.Y); // 2, 4
MultiConstructorVec2 vec2 = {2.0, 4.0}; // Call (double, double) version
DebugLog(vec2.X, vec2.Y); // 2, 4
Eighth, if the variable isn’t a struct, only one value is passed, and
that value isn’t a reference, then the variable is direct-initialized:
float f = {3.14f};
DebugLog(f); // 3.14
Ninth, if the variable isn’t a struct, the curly braces have only one
value, and the variable isn’t a reference or is a reference to the type
of the single value, then the variable is direct-initialized for direct list
initialization or copy-initialized for copy list initialization with the
value:
float f = 3.14f;
float& r1{f}; // Direct list initialization direct-initializes
DebugLog(r1); // 3.14
float& r2 = {f}; // Copy list initialization copy-initializes
DebugLog(r2); // 3.14
float r3{f}; // Direct list initialization direct-initializes
DebugLog(r3); // 3.14
float r4 = {f}; // Copy list initialization copy-initializes
DebugLog(r4); // 3.14
float f = 3.14;
const int32_t& r1 = f;
DebugLog(r1); // 3
int32_t& r2 = f; // Compiler error: not const
DebugLog(r2);
float f = {};
DebugLog(f); // 0
One final detail to note is that the values passed in the curly braces
are evaluated in order. This is unlike the arguments passed to a
function which are evaluated in an order determined by the
compiler.
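To make the ordering guarantee concrete, here is a small sketch. The Record function, the callOrder log, and the Pair struct are hypothetical names introduced only for this example:

```cpp
#include <vector>

// Hypothetical log of the order in which Record is called
std::vector<int> callOrder;

int Record(int id)
{
    callOrder.push_back(id);
    return id;
}

struct Pair
{
    int First;
    int Second;
};

Pair MakePair()
{
    callOrder.clear();
    // Values in curly braces are evaluated left to right:
    // Record(1) always runs before Record(2)
    return Pair{Record(1), Record(2)};
}

// By contrast, the order of Record(1) and Record(2) in a function call
// such as SomeFunction(Record(1), Record(2)) is up to the compiler
```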
Reference Initialization
As mentioned above, references have their own type of initialization.
Here are the forms it takes:
float&& f = {3.14f};
DebugLog(f); // 3.14
First, for lvalue references of the same type the reference simply
binds to the passed object:
float f = 3.14f;
float& r = f;
DebugLog(r); // 3.14
float pi = 3.14f;
struct ConvertsToPi
{
operator float&()
{
return pi;
}
};
ConvertsToPi ctp;
float& r = ctp; // User-defined conversion operator called
DebugLog(r); // 3.14
float&& Dangling1()
{
return 3.14f; // Returned temporary ends its lifetime here
}
float& Dangling2(float x)
{
return x; // Returned argument ends its lifetime here
}
DebugLog(Dangling1()); // Undefined behavior
DebugLog(Dangling2(3.14f)); // Undefined behavior
struct HasRvalueRef
{
float&& Ref;
};
// Curly braces used. Lifetime of float with 3.14f value extended.
HasRvalueRef hrr1{3.14f};
DebugLog(hrr1.Ref); // 3.14
// Parentheses used. Lifetime of float with 3.14f value NOT extended.
HasRvalueRef hrr2(3.14f);
DebugLog(hrr2.Ref); // Undefined behavior. Ref has ended its lifetime.
Value Initialization
Value initialization can look like this:
// Variable
T object{};
// Temporary variable (i.e. it has no name)
T()
T{}
// Initialize a data member in an initializer list
MyStruct::MyStruct()
: member1() // Parentheses version
, member2{} // Curly braces version
{
}
struct InitListVec2
{
float X;
float Y;
InitListVec2(std::initializer_list<float> vals)
{
int index = 0;
float x = 0;
float y = 0;
for (float cur : vals)
{
switch (index)
{
case 0: x = cur; break;
case 1: y = cur; break;
}
++index; // Move on to the next value in the list
}
X = x;
Y = y;
}
};
InitListVec2 vec{}; // List initialization (passes empty list)
DebugLog(vec.X, vec.Y); // 0, 0
struct Vector2
{
float X;
float Y;
Vector2() = delete;
};
Vector2 vec{}; // Default-initialized
DebugLog(vec.X, vec.Y); // 0, 0
struct Vector2
{
float X = 2;
float Y = 4;
};
Vector2 vec{}; // Zero initialization then direct initialization
DebugLog(vec.X, vec.Y); // 2, 4
All of these look for a constructor matching the passed values. If one
is found, the one that matches best is called to initialize the variable.
struct MultiConstructorVec2
{
float X;
float Y;
MultiConstructorVec2(float x, float y)
{
X = x;
Y = y;
}
MultiConstructorVec2(double x, double y)
{
X = x;
Y = y;
}
};
MultiConstructorVec2 vec1{2.0f, 4.0f}; // Call (float, float) version
DebugLog(vec1.X, vec1.Y); // 2, 4
MultiConstructorVec2 vec2{2.0, 4.0}; // Call (double, double) version
DebugLog(vec2.X, vec2.Y); // 2, 4
struct Vector2
{
float X;
float Y;
};
// No constructor matches, but Vector2 is an aggregate
Vector2 vec{2, 4}; // Aggregate initialization
DebugLog(vec.X, vec.Y); // 2, 4
bool b{nullptr};
DebugLog(b); // false
struct Enemy
{
float X;
float Y;
};
struct Vector2
{
float X;
float Y;
Vector2() = default;
Vector2(Enemy enemy)
{
X = enemy.X;
Y = enemy.Y;
}
};
Vector2 defaultEnemySpawnPoint(Enemy());
The last line is ambiguous. The naming makes us think it’s a variable
with type Vector2 named defaultEnemySpawnPoint that’s being
direct-initialized with a value-initialized temporary Enemy variable.
Vector2 defaultEnemySpawnPoint(Enemy{});
DebugLog(defaultEnemySpawnPoint.X,
defaultEnemySpawnPoint.Y); // 0, 0
Constant Initialization
Constant initialization has just two forms:
Both of these only apply when the variable is both const and static,
such as for global variables and static struct data members.
Otherwise, the variable is zero-initialized.
struct Player
{
static const int32_t MaxHealth;
int32_t Health;
};
// Constant-initialize a data member
const int32_t Player::MaxHealth = 100;
// Constant-initialize a global reference
const int32_t& defaultHealth = Player::MaxHealth;
struct GameEntity
{
static const int32_t MaxHealth = 100;
int32_t Health = MaxHealth;
float GetHealthPercent()
{
return ((float)Health) / MaxHealth;
}
};
struct MovableGameEntity : GameEntity
{
float Speed = 0;
};
MovableGameEntity mge{};
mge.Health = 50;
DebugLog(mge.Health, mge.Speed, mge.GetHealthPercent());
// 50, 0, 0.5
Normally, any object must have a size of at least one byte. One
exception to this rule is when a base struct has no non-static data
members. In that case, it may add zero bytes to the size of the
structs that derive from it. An exception to this exception is if the first
non-static data member is also the base struct type.
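The following sketch illustrates these size rules with hypothetical Empty, DerivedFromEmpty, and BaseTypeFirst structs. The exact sizes are up to the compiler, so the comments describe what common compilers typically produce rather than a guarantee:

```cpp
#include <cstddef>
#include <cstdint>

// No non-static data members, but still at least one byte on its own
struct Empty
{
};

// The Empty base typically adds zero bytes here
struct DerivedFromEmpty : Empty
{
    std::int32_t Value;
};

// The first data member has the base struct's type. The base and the
// member are two distinct Empty objects, so they need distinct addresses
// and the base can no longer take up zero bytes.
struct BaseTypeFirst : Empty
{
    Empty First;
};

std::size_t SizeOfEmpty()
{
    return sizeof(Empty);
}

std::size_t SizeOfDerivedFromEmpty()
{
    return sizeof(DerivedFromEmpty);
}

std::size_t SizeOfBaseTypeFirst()
{
    return sizeof(BaseTypeFirst);
}
```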
MovableGameEntity mge{};
GameEntity* pge = &mge; // Pointers are compatible
GameEntity& rge = mge; // References are compatible
DebugLog(mge.Health, pge->Health, rge.Health); // 100, 100, 100
Derived structs can have members with the same names as their
base structs. For example, ArmoredGameEntity might get extra health
from its armor:
ArmoredGameEntity age{};
ArmoredGameEntity* page = &age;
// Refer to Health in GameEntity with age.GameEntity::Health
DebugLog(age.Health, age.GameEntity::Health); // 50, 100
age.Die();
// Refer to Health in GameEntity via a pointer with page->GameEntity::Health
DebugLog(age.Health, page->GameEntity::Health); // 0, 0
This is sort of like using base in C#, except that we can refer to any
struct in the hierarchy rather than just the immediate base class type.
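As a minimal sketch of referring to a struct further up the hierarchy, consider three hypothetical levels that each declare a data member named X:

```cpp
#include <cstdint>

struct Level1
{
    std::int32_t X = 1;
};

struct Level2 : Level1
{
    std::int32_t X = 2;
};

struct Level3 : Level2
{
    std::int32_t X = 3;

    std::int32_t SumAllLevels()
    {
        // Level1::X skips over Level2, which C#'s base can't do
        return Level1::X + Level2::X + X;
    }
};
```

The same qualification works from outside the struct, such as level3.Level1::X for a Level3 variable named level3.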
struct LogLifecycle
{
const char* str;
LogLifecycle(const char* str)
: str(str)
{
DebugLog(str, "constructor");
}
~LogLifecycle()
{
DebugLog(str, "destructor");
}
};
struct GameEntity
{
LogLifecycle a{"GameEntity::a"};
LogLifecycle b{"GameEntity::b"};
GameEntity()
{
DebugLog("GameEntity constructor");
}
~GameEntity()
{
DebugLog("GameEntity destructor");
}
};
struct ArmoredGameEntity : GameEntity
{
LogLifecycle a{"ArmoredGameEntity::a"};
LogLifecycle b{"ArmoredGameEntity::b"};
ArmoredGameEntity()
{
DebugLog("ArmoredGameEntity constructor");
}
~ArmoredGameEntity()
{
DebugLog("ArmoredGameEntity destructor");
}
};
void Foo()
{
ArmoredGameEntity age{};
DebugLog("--after variable declaration--");
} // Note: destructor of 'age' called here
// Logs printed:
// GameEntity::a, constructor
// GameEntity::b, constructor
// GameEntity constructor
// ArmoredGameEntity::a, constructor
// ArmoredGameEntity::b, constructor
// ArmoredGameEntity constructor
// --after variable declaration--
// ArmoredGameEntity destructor
// ArmoredGameEntity::b, destructor
// ArmoredGameEntity::a, destructor
// GameEntity destructor
// GameEntity::b, destructor
// GameEntity::a, destructor
struct GameEntity
{
static const int32_t MaxHealth = 100;
int32_t Health;
GameEntity(int32_t health)
{
Health = health;
}
};
struct ArmoredGameEntity : GameEntity
{
static const int32_t MaxArmor = 100;
int32_t Armor = 0;
ArmoredGameEntity()
: GameEntity(MaxHealth)
{
}
};
This is actually just part of the initializer list that we use to initialize
data members:
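A minimal sketch, using hypothetical Entity and Boss structs that aren't part of the examples above, shows the base struct's constructor call sharing the list with data member initializers:

```cpp
#include <cstdint>

struct Entity
{
    std::int32_t Health;

    explicit Entity(std::int32_t health)
        : Health(health)
    {
    }
};

struct Boss : Entity
{
    std::int32_t Armor;
    std::int32_t Phase;

    Boss(std::int32_t health, std::int32_t armor)
        : Entity(health) // Base struct constructor call
        , Armor(armor)   // Data member
        , Phase(1)       // Data member
    {
    }
};
```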
struct HasHealth
{
static const int32_t MaxHealth = 100;
int32_t Health = MaxHealth;
};
struct HasArmor
{
static const int32_t MaxArmor = 20;
int32_t Armor = MaxArmor;
};
// Player derives from both HasHealth and HasArmor
struct Player : HasHealth, HasArmor
{
static const int32_t MaxLives = 3;
int32_t NumLives = MaxLives;
};
Player p{};
// Access members in the HasHealth sub-object
DebugLog(p.HasHealth::Health); // 100
// Access members in the HasArmor sub-object
DebugLog(p.HasArmor::Armor); // 20
// Access members in the Player sub-object
DebugLog(p.Player::NumLives); // 3
Now let’s re-introduce the LogLifecycle utility struct and see how
multiple inheritance impacts the order of constructors and
destructors:
struct LogLifecycle
{
const char* str;
LogLifecycle(const char* str)
: str(str)
{
DebugLog(str, "constructor");
}
~LogLifecycle()
{
DebugLog(str, "destructor");
}
};
struct HasHealth
{
LogLifecycle a{"HasHealth::a"};
LogLifecycle b{"HasHealth::b"};
static const int32_t MaxHealth = 100;
int32_t Health = MaxHealth;
HasHealth()
{
DebugLog("HasHealth constructor");
}
~HasHealth()
{
DebugLog("HasHealth destructor");
}
};
struct HasArmor
{
LogLifecycle a{"HasArmor::a"};
LogLifecycle b{"HasArmor::b"};
static const int32_t MaxArmor = 20;
int32_t Armor = MaxArmor;
HasArmor()
{
DebugLog("HasArmor constructor");
}
~HasArmor()
{
DebugLog("HasArmor destructor");
}
};
struct Player : HasHealth, HasArmor
{
LogLifecycle a{"Player::a"};
LogLifecycle b{"Player::b"};
static const int32_t MaxLives = 3;
int32_t NumLives = MaxLives;
Player()
{
DebugLog("Player constructor");
}
~Player()
{
DebugLog("Player destructor");
}
};
void Foo()
{
Player p{};
DebugLog("--after variable declaration--");
} // Note: destructor of 'p' called here
// Logs printed:
// HasHealth::a, constructor
// HasHealth::b, constructor
// HasHealth constructor
// HasArmor::a, constructor
// HasArmor::b, constructor
// HasArmor constructor
// Player::a, constructor
// Player::b, constructor
// Player constructor
// --after variable declaration--
// Player destructor
// Player::b, destructor
// Player::a, destructor
// HasArmor destructor
// HasArmor::b, destructor
// HasArmor::a, destructor
// HasHealth destructor
// HasHealth::b, destructor
// HasHealth::a, destructor
We see here that base structs’ constructors are called in the order
that they’re derived from: HasHealth then HasArmor in this example.
Their destructors are called in the reverse order. This is analogous to
data members, which are constructed in declaration order and
destructed in the reverse order.
Given a Bottom, it’s easy to refer to its Id and the Id of the Left and
Right sub-objects:
Bottom b{};
DebugLog(b.Id); // Bottom
DebugLog(b.Left::Id); // Left
DebugLog(b.Right::Id); // Right
However, both the Left and Right sub-objects have a Top sub-
object. This is therefore ambiguous and causes a compiler error:
Bottom b{};
// Compiler error: ambiguous. Top sub-object of Left or Right?
DebugLog(b.Top::Id);
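To see that there really are two separate Top sub-objects without virtual inheritance, here is a self-contained sketch. It assumes a non-virtual diamond and uses a hypothetical Value data member in place of Id:

```cpp
struct Top
{
    int Value = 0;
};

struct Left : Top
{
};

struct Right : Top
{
};

struct Bottom : Left, Right
{
};

bool HasTwoTopSubObjects()
{
    Bottom b{};

    // Going through Left or Right first removes the ambiguity
    Left& left = b;
    Right& right = b;

    // Each write lands in a different Top sub-object
    left.Value = 1;
    right.Value = 2;

    Top* topOfLeft = &left;
    Top* topOfRight = &right;
    return left.Value == 1 && right.Value == 2 && topOfLeft != topOfRight;
}
```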
struct Top
{
const char* Id = "Top Default";
};
struct Left : virtual Top
{
const char* Id = "Left";
};
struct Right : virtual Top
{
const char* Id = "Right";
};
struct Bottom : virtual Left, virtual Right
{
const char* Id = "Bottom";
};
// Top refers to the same sub-object in Bottom, Left, and Right
Bottom b{};
Left& left = b;
Right& right = b;
DebugLog(b.Top::Id); // Top Default
DebugLog(left.Top::Id); // Top Default
DebugLog(right.Top::Id); // Top Default
// Changing Left's Top changes the one and only Top sub-object
left.Top::Id = "New Top of Left";
DebugLog(b.Top::Id); // New Top of Left
DebugLog(left.Top::Id); // New Top of Left
DebugLog(right.Top::Id); // New Top of Left
// Same with Right's Top
right.Top::Id = "New Top of Right";
DebugLog(b.Top::Id); // New Top of Right
DebugLog(left.Top::Id); // New Top of Right
DebugLog(right.Top::Id); // New Top of Right
struct Enemy
{
int32_t Health = 100;
};
struct Weapon
{
int32_t Damage = 0;
explicit Weapon(int32_t damage)
{
Damage = damage;
}
virtual void Attack(Enemy& enemy)
{
enemy.Health -= Damage;
}
};
struct Bow : Weapon
{
Bow(int32_t damage)
: Weapon(damage)
{
}
virtual void Attack(Enemy& enemy)
{
enemy.Health -= Damage;
}
};
Enemy enemy{};
DebugLog(enemy.Health); // 100
Weapon weapon{10};
weapon.Attack(enemy);
DebugLog(enemy.Health); // 90
Bow bow{20};
bow.Attack(enemy);
DebugLog(enemy.Health); // 70
Weapon& weaponRef = bow;
weaponRef.Attack(enemy);
DebugLog(enemy.Health); // 50
Weapon* weaponPointer = &bow;
weaponPointer->Attack(enemy);
DebugLog(enemy.Health); // 30
C++ also has a replacement for C#’s abstract keyword for when we
don’t want to provide an implementation of a member function at all.
These member functions are called “pure virtual” and have an = 0
after them instead of a body:
struct Weapon
{
int32_t Damage = 0;
explicit Weapon(int32_t damage)
{
Damage = damage;
}
virtual void Attack(Enemy& enemy) = 0;
};
struct Weapon
{
int32_t Damage = 0;
explicit Weapon(int32_t damage)
{
Damage = damage;
}
virtual void operator+=(int32_t numLevels)
{
Damage += numLevels;
}
};
struct Bow : Weapon
{
int32_t Range;
Bow(int32_t damage, int32_t range)
: Weapon(damage), Range(range)
{
}
virtual void operator+=(int32_t numLevels) override
{
// Explicitly call base struct's overloaded operator
Weapon::operator+=(numLevels);
Range += numLevels;
}
};
Bow bow{20, 10};
DebugLog(bow.Damage, bow.Range); // 20, 10
bow += 5;
DebugLog(bow.Damage, bow.Range); // 25, 15
Weapon& weaponRef = bow;
weaponRef += 5;
DebugLog(bow.Damage, bow.Range); // 30, 20
struct Weapon
{
int32_t Damage = 0;
explicit Weapon(int32_t damage)
{
Damage = damage;
}
virtual operator int32_t()
{
return Damage;
}
};
struct Bow : Weapon
{
int32_t Range;
Bow(int32_t damage, int32_t range)
: Weapon(damage), Range(range)
{
}
virtual operator int32_t() override
{
// Explicitly call base struct's user-defined conversion operator
return Weapon::operator int32_t() + Range;
}
};
Bow bow{20, 10};
Weapon& weaponRef = bow;
int32_t bowVal = bow;
int32_t weaponRefVal = weaponRef;
DebugLog(bowVal, weaponRefVal); // 30, 30
struct ReadOnlyFile
{
FILE* ReadFileHandle;
ReadOnlyFile(const char* path)
{
ReadFileHandle = fopen(path, "r");
}
virtual ~ReadOnlyFile()
{
fclose(ReadFileHandle);
ReadFileHandle = nullptr;
}
};
struct FileCopier : ReadOnlyFile
{
FILE* WriteFileHandle;
FileCopier(const char* path, const char*
writeFilePath)
: ReadOnlyFile(path)
{
WriteFileHandle = fopen(writeFilePath, "w");
}
virtual ~FileCopier()
{
fclose(WriteFileHandle);
WriteFileHandle = nullptr;
}
};
FileCopier copier("/path/to/input/file",
"/path/to/output/file");
// Calling a virtual destructor on a base struct
// Calls the derived struct's destructor
// If this was non-virtual, only the ReadOnlyFile destructor would be called
ReadOnlyFile& rof = copier;
rof.~ReadOnlyFile();
struct Vector1
{
float X;
// Allows overriding
virtual void DrawPixel(float r, float g, float b)
{
GraphicsLibrary::DrawPixel(X, 0, r, g, b);
}
};
struct Vector2 : Vector1
{
float Y;
// Overrides DrawPixel in Vector1
// Stops overriding in derived structs
virtual void DrawPixel(float r, float g, float b)
override final
{
GraphicsLibrary::DrawPixel(X, Y, r, g, b);
}
};
struct Vector3 : Vector2
{
float Z;
// Compiler error: DrawPixel in base struct (Vector2) is final
virtual void DrawPixel(float r, float g, float b)
override
{
GraphicsLibrary::DrawPixel(X/Z, Y/Z, r, g, b);
}
};
C# Equivalency
We’ve already seen the C++ equivalents for C# concepts like
abstract and sealed classes as well as abstract, virtual,
override, and sealed methods. C# has several other features that
have no explicit C++ equivalent. Instead, these are idiomatically
implemented with general struct features.
// Like an interface:
// * Has no data members
// * Has no non-abstract member functions
// * Is abstract (due to Log being pure virtual)
// * Enables multiple inheritance (always enabled in C++)
struct ILoggable
{
virtual void Log() = 0;
};
// To "implement" an "interface," just derive and override all member functions
struct Vector2 : ILoggable
{
float X;
float Y;
Vector2(float x, float y)
: X(x), Y(y)
{
}
virtual void Log() override
{
DebugLog(X, Y);
}
};
// Use an "interface," not a "concrete class"
void LogTwice(ILoggable& loggable)
{
loggable.Log();
loggable.Log();
}
Vector2 vec{2, 4};
LogTwice(vec); // 2, 4 then 2, 4
// In PlayerShared.h
struct PlayerShared
{
const static int32_t MaxHealth = 100;
const static int32_t MaxLives = 3;
int32_t Health = MaxHealth;
int32_t NumLives = MaxLives;
float PosX = 0;
float PosY = 0;
float DirX = 0;
float DirY = 0;
float Speed = 0;
};
// In PlayerCombat.h
#include "PlayerShared.h"
struct PlayerCombat : virtual PlayerShared
{
void TakeDamage(int32_t amount)
{
Health -= amount;
}
};
// In PlayerMovement.h
#include "PlayerShared.h"
struct PlayerMovement : virtual PlayerShared
{
void Move(float time)
{
float distance = Speed * time;
PosX += DirX * distance;
PosY += DirY * distance;
}
};
// In Player.h
#include "PlayerCombat.h"
#include "PlayerMovement.h"
struct Player : virtual PlayerCombat, virtual
PlayerMovement
{
};
// In Game.cpp
#include "Player.h"
Player player;
player.DirX = 1;
player.Speed = 1;
player.TakeDamage(10);
DebugLog(player.Health); // 90
player.Move(5);
DebugLog(player.PosX, player.PosY); // 5, 0
// In StationaryPlayer.h
#include "PlayerCombat.h"
struct StationaryPlayer : virtual PlayerCombat
{
};
// In Game.cpp
#include "StationaryPlayer.h"
StationaryPlayer stationary;
stationary.TakeDamage(10); // OK, Health now 90
stationary.Move(5); // Compiler error: Move isn't a member function
// In Object.h
// Ultimate base struct
struct Object
{
virtual int32_t GetHashCode()
{
return HashBytes((char*)this, sizeof(*this));
}
};
// In Player.h
#include "Object.h"
// Derives from ultimate base struct: Object
struct Player : Object
{
const static int32_t MaxHealth = 100;
const static int32_t MaxLives = 3;
int32_t Health = MaxHealth;
int32_t NumLives = MaxLives;
float PosX = 0;
float PosY = 0;
float DirX = 0;
float DirY = 0;
float Speed = 0;
void TakeDamage(int32_t amount);
void Move(float time);
// Can override if desired, like in C#
virtual int32_t GetHashCode() override
{
return 123;
}
};
// In Vector2.h
#include "Object.h"
// Derives from ultimate base struct: Object
struct Vector2 : Object
{
float X;
float Y;
// Can NOT override if desired, like in C#
// virtual int32_t GetHashCode() override
};
// Can pass any struct to this
// Because we made every struct derive from Object
void LogHashCode(Object& obj)
{
DebugLog(obj.GetHashCode());
}
// Can pass a Player because it derives from Object
Player player;
LogHashCode(player);
// Can pass a Vector2 because it derives from Object
Vector2 vector;
LogHashCode(vector);
Generic solutions for boxing are likewise possible and we’ll cover
those techniques later in the book.
Conclusion
C++ struct inheritance is in many ways a superset of C# class
inheritance. It goes above and beyond with support for multiple
inheritance, virtual inheritance, virtual overloaded operators, virtual
user-defined conversion functions, and skip-level sub-object
specifications like Level1::X from within Level3.
struct Player
{
// The default access specifier is public, so TakeDamage is public
void TakeDamage(int32_t amount)
{
Health -= amount;
}
// Change the access specifier to private
private:
// Health is private
int32_t Health;
// NumLives is private
int32_t NumLives;
// Change the access specifier back to public
public:
// Heal is public
void Heal(int32_t amount)
{
Health += amount;
}
// GetExtraLife is public
void GetExtraLife()
{
NumLives++;
}
};
struct Player
{
public: void TakeDamage(int32_t amount)
{
Health -= amount;
}
private: int32_t Health;
private: int32_t NumLives;
public: void Heal(int32_t amount)
{
Health += amount;
}
public: void GetExtraLife()
{
NumLives++;
}
};
public Anywhere
Unlike C#, access specifiers may also be applied when deriving from
structs:
PublicPlayer pub{};
pub.Heal(10); // OK: Heal is public
ProtectedPlayer prot{};
prot.Heal(10); // Compiler error: Heal is protected
PrivatePlayer priv{};
priv.Heal(10); // Compiler error: Heal is private
DefaultPlayer def{};
def.Heal(10); // OK: Heal is public
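The four player types used above could be defined along the lines of the following sketch. The Player base struct and the HealViaMember and GetHealth helpers are assumptions made for illustration:

```cpp
#include <cstdint>

struct Player
{
    std::int32_t Health = 0;

    void Heal(std::int32_t amount)
    {
        Health += amount;
    }
};

// Public members of Player stay public
struct PublicPlayer : public Player
{
};

// Public members of Player become protected
struct ProtectedPlayer : protected Player
{
    // Still callable from inside the struct
    void HealViaMember(std::int32_t amount)
    {
        Heal(amount);
    }

    std::int32_t GetHealth()
    {
        return Health;
    }
};

// Public members of Player become private
struct PrivatePlayer : private Player
{
    std::int32_t GetHealth()
    {
        return Health;
    }
};

// No access specifier: structs default to public inheritance
struct DefaultPlayer : Player
{
};
```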
struct Base
{
virtual void Foo()
{
DebugLog("Base Foo");
}
private:
virtual void Goo()
{
DebugLog("Base Goo");
}
};
struct Derived : Base
{
private:
virtual void Foo() override
{
DebugLog("Derived Foo");
}
public:
virtual void Goo() override
{
DebugLog("Derived Goo");
}
};
// These calls use the access specifiers in Base
Base b;
b.Foo(); // "Base Foo"
//b.Goo(); // Compiler error: Goo is private
// These calls use the access specifiers in Derived
Derived d;
//d.Foo(); // Compiler error: Foo is private
d.Goo(); // "Derived Goo"
// These calls use the access specifiers in Base, even though the
// runtime object is a Derived
Base& dRef = d;
dRef.Foo(); // "Derived Foo"
//dRef.Goo(); // Compiler error: Goo is private in Base
When using virtual inheritance, the most accessible path through the
derived classes is used to determine access level:
struct Top
{
int32_t X = 123;
};
// X is private due to private inheritance
struct Left : private virtual Top
{
};
// X is public due to public inheritance
struct Right : public virtual Top
{
};
// X is public due to public inheritance via Right
// This takes precedence over private inheritance via Left
struct Bottom : virtual Left, virtual Right
{
};
Top top{};
DebugLog(top.X); // 123
Left left{};
//DebugLog(left.X); // Compiler error: X is private
Right right{};
DebugLog(right.X); // 123
// Accessing X goes through Right
Bottom bottom{};
DebugLog(bottom.X); // 123
It's important to note that access levels may change the layout of the
struct's non-static data members in memory. While the data
members are guaranteed to be laid out sequentially, perhaps with
padding between them, this is only true of data members of the
same access level. For example, the compiler may choose to lay out
all the public data members then all the private data members or to
mix all the data members regardless of their access level:
struct Mixed
{
private: int32_t A = 1;
public: int32_t B = 2;
private: int32_t C = 3;
public: int32_t D = 4;
};
// Some possible layouts of Mixed:
// Ignore access level: A, B, C, D
// Private then public: A, C, B, D
// Public then private: B, D, A, C
class Player
{
int32_t Health = 0;
};
Player player{};
DebugLog(player.Health); // Compiler error: Health is private
That's right: classes are just structs with a different default access
level!
class Player
{
// Private due to the default for classes
int64_t Id = 123;
int32_t Points = 0;
// Make the PrintId function a friend
friend void PrintId(const Player& player);
// Make the Stats class a friend
friend class Stats;
};
void PrintId(const Player& player)
{
// Can access Id and Points members because PrintId is a friend of Player
DebugLog(player.Id, "has", player.Points, "points");
}
// It's OK that Stats is actually a struct, not a class
struct Stats
{
static int32_t GetTotalPoints(Player* players,
int32_t numPlayers)
{
int32_t totalPoints = 0;
for (int32_t i = 0; i < numPlayers; ++i)
{
// Can access Points because Stats is a friend of Player
totalPoints += players[i].Points;
}
return totalPoints;
}
};
Player p;
PrintId(p); // 123 has 0 points
int32_t totalPoints = Stats::GetTotalPoints(&p, 1);
DebugLog(totalPoints); // 0
class Player
{
int64_t Id = 123;
int32_t Points = 0;
// Make the PrintId inline function and make it a friend
friend void PrintId(const Player& player)
{
// Can access Id and Points members
// because PrintId is a friend of Player
DebugLog(player.Id, "has", player.Points,
"points");
}
};
// x is a const int
const int32_t x = 123;
// Compiler error: can't assign to a const variable
x = 456;
The const keyword can be placed on the left or right of the type. This
is known as "west const" and "east const" and both are in common
usage. The placement makes no difference in this case as both
result in a const type.
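As a quick check of that claim, this sketch declares one variable in each style and asks the compiler to confirm the types match. The variable and function names are hypothetical:

```cpp
#include <cstdint>
#include <type_traits>

// "West const": const to the left of the type
const std::int32_t west = 123;

// "East const": const to the right of the type
std::int32_t const east = 123;

// Both declarations produce exactly the same type
static_assert(
    std::is_same<decltype(west), decltype(east)>::value,
    "west const and east const name the same type");

bool WestEqualsEast()
{
    return west == east;
}
```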
For even slightly more complicated types, the ordering matters more.
Consider a pointer type:
const char* str = "hello";
The rule is that const modifies what's immediately to its left. If there's
nothing to its left, it modifies what's immediately to its right.
Because there's nothing left of const in const char* str, the const
modifies the char immediately to its right. That means the char is
const and the pointer is non-const:
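A short sketch of that rule follows. The function names are hypothetical, and the commented-out lines mark what the compiler would reject:

```cpp
const char* ReassignPointer()
{
    // Nothing is left of const, so it modifies char: the char is const
    const char* str = "hello";
    str = "world";      // OK: the pointer itself is non-const
    //str[0] = 'W';     // Compiler error: the char is const
    return str;
}

char ModifyThroughConstPointer()
{
    char buffer[] = "hello";

    // The * is left of const, so const modifies the pointer
    char* const fixed = buffer;
    fixed[0] = 'j';     // OK: the char is non-const
    //fixed = nullptr;  // Compiler error: the pointer is const
    return fixed[0];
}
```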
struct Player
{
const int64_t Id;
Player(int64_t id)
: Id(id)
{
}
};
Player player{123};
// Compiler error: can't modify a const data member
//player.Id = 1000;
class Player
{
int32_t Health = 100;
public:
int32_t GetHealth() const
{
// Compiler error: can't call non-const member function from const
// member function because 'this' is a 'const Player*'
TakeDamage(1);
return Health;
}
void TakeDamage(int32_t amount)
{
Health -= amount;
}
};
Player player{};
const Player& playerRef = player;
// Compiler error: can't call non-const TakeDamage on const reference
playerRef.TakeDamage(1);
// OK: GetHealth is const
DebugLog(playerRef.GetHealth()); // 100
The mutable keyword is commonly used with cached values like the
above caching of the file's size. For example, it means that the
relatively-expensive file I/O operations only need to be performed
one time regardless of how many times GetSize is called:
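A minimal sketch of this caching pattern is below. To stay self-contained it replaces the file I/O with a hypothetical ComputeSize function that returns a placeholder value and counts how often it runs:

```cpp
#include <cstdint>

class CachedSize
{
    // mutable: may be modified even inside const member functions
    mutable bool HasSize = false;
    mutable std::int64_t Size = 0;
    mutable std::int32_t NumComputations = 0;

    // Stand-in for the relatively-expensive file I/O
    std::int64_t ComputeSize() const
    {
        ++NumComputations;
        return 1024; // Placeholder value, not a real file size
    }

public:
    std::int64_t GetSize() const
    {
        if (!HasSize)
        {
            Size = ComputeSize();
            HasSize = true;
        }
        return Size;
    }

    std::int32_t GetNumComputations() const
    {
        return NumComputations;
    }
};
```

Even through a const variable or const reference, the first GetSize call fills the cache and later calls return the stored value.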
The following table contrasts C++'s const keyword with C#'s const
and readonly keywords:
Factor         C++ const    C# const                     C# readonly
Types          Any          Numbers, strings, booleans   Any
Applicability  Everywhere   Fields, local variables      Fields, references
internal
protected internal
private protected
Access Specifier     Member Accessible From
internal             Within same assembly
protected internal   Within same assembly or in derived classes
Name Example
Vector2 v1 = 2.0_v2;
DebugLog(v1.X, v1.Y); // 2, 2
The C++ Standard Library reserves all suffixes that don't start with
an _ for its own use:
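For instance, the Standard Library defines s for strings and suffixes such as ms and s for durations. Here's a short sketch that uses them; the function name is made up for illustration:

```cpp
#include <cassert>
#include <chrono>
#include <string>

bool StandardSuffixDemo()
{
    // Brings the Standard Library's literal operators into scope
    using namespace std::literals;

    // "s" after a string literal creates a std::string
    auto name = "hello"s;

    // "ms" and "s" after integers create std::chrono durations
    auto delay = 1500ms;
    auto timeout = 2s;

    return name.size() == 5
        && delay < timeout
        && std::chrono::duration_cast<std::chrono::milliseconds>(timeout).count() == 2000;
}
```

None of these suffixes start with an underscore, which is exactly why our own suffixes, like _v2, must.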
Still, there are situations where the brevity and expressiveness may
come in handy. This is especially the case for codebases that make
heavy use of auto:
void Foo()
{
struct Local
{
int32_t Val;
Local(int32_t val)
: Val(val)
{
}
};
Local ten{10};
DebugLog(ten.Val); // 10
}
Local classes are regular classes in most ways, but have a few
limitations. First, their member functions have to be defined within
the class definition: we can't split the declaration and the definition.
void Foo()
{
struct Local
{
int32_t Val;
Local(int32_t val);
};
// Compiler error
// Member function definition must be in the class definition
Local::Local(int32_t val)
: Val(val)
{
}
}
Second, they can't have static data members but they can have
static member functions.
void Foo()
{
struct Local
{
int32_t Val;
// Compiler error
// Local classes can't have static data members
static int32_t Max = 100;
// OK: local classes can have static member functions
static int32_t GetMax()
{
return 100;
}
};
DebugLog(Local::GetMax()); // 100
}
Third, and finally, they can have friends but they can't declare inline
friend functions:
class Classy
{
};
void Foo()
{
struct Local
{
// Compiler error
// Local classes can't define inline friend functions
friend void InlineFriend()
{
}
// OK: local classes can have normal friends
friend class Classy;
};
}
Like local functions in C#, local classes in C++ are typically used to
reduce duplication of code inside the function but are placed inside
the function because they wouldn't be useful to code outside the
function. It's even common to see local classes without a name
when only one instance of them is needed. For example, this local
class de-duplicates code that's run on players, enemies, and NPCs
without requiring polymorphism:
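One way this might look is sketched below. The Player, Enemy, and Npc types and the damager instance are assumptions made for illustration, not types from any real codebase:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, unrelated types: there is no common base class
struct Player { int32_t Health; };
struct Enemy { int32_t Health; };
struct Npc { int32_t Health; };

int32_t DamageEveryone(Player& player, Enemy& enemy, Npc& npc, int32_t amount)
{
    // Unnamed local class with a single instance named "damager"
    struct
    {
        int32_t Amount;
        int32_t Total;

        // The shared logic lives in one place
        void Apply(int32_t& health)
        {
            health -= Amount;
            Total += Amount;
        }
    } damager{amount, 0};

    damager.Apply(player.Health);
    damager.Apply(enemy.Health);
    damager.Apply(npc.Health);
    return damager.Total;
}
```

Because the class has no name, it can't have constructors, so the instance is initialized with aggregate initialization instead.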
struct Vector2
{
float X;
float Y;
// Compiler generates a copy assignment operator like this:
// Vector2& operator=(const Vector2& other)
// {
// X = other.X;
// Y = other.Y;
// return *this;
// }
// Compiler generates a move assignment operator like this:
// Vector2& operator=(Vector2&& other)
// {
// X = other.X;
// Y = other.Y;
// return *this;
// }
};
void Foo()
{
Vector2 a{2, 4};
Vector2 b{0, 0};
b = a; // Call the compiler-generated copy assignment operator
DebugLog(b.X, b.Y); // 2, 4
}
struct Vector2
{
float X;
float Y;
Vector2& operator=(const Vector2& other) = delete;
};
void Foo()
{
Vector2 a{2, 4};
Vector2 b{0, 0};
b = a; // Compiler error: copy assignment operator is deleted
DebugLog(b.X, b.Y); // 2, 4
}
Unions
We've seen how the class keyword can be used instead of struct to
change the default access level from public to private. Similarly,
C++ provides the union keyword to change the data layout of the
struct. Instead of making the struct big enough to fit all of the non-
static data members, a union is just big enough to fit the largest non-
static data member.
union FloatBytes
{
float Val;
uint8_t Bytes[4];
};
void Foo()
{
FloatBytes fb;
fb.Val = 3.14f;
DebugLog(sizeof(fb)); // 4 (not 8)
// 195, 245, 72, 64
DebugLog(fb.Bytes[0], fb.Bytes[1], fb.Bytes[2],
fb.Bytes[3]);
fb.Bytes[0] = 0;
fb.Bytes[1] = 0;
fb.Bytes[2] = 0;
fb.Bytes[3] = 0;
DebugLog(fb.Val); // 0
}
Like local classes, there are some restrictions put on unions. First,
unions can't participate in inheritance. That means they can't have
any base classes, be a base class themselves, or have any virtual
member functions.
struct IGetHashCode
{
virtual int32_t GetHashCode() = 0;
};
// Compiler error: unions can't derive
union Id : IGetHashCode
{
int32_t Val;
uint8_t Bytes[4];
// Compiler error: unions can't have virtual member functions
virtual int32_t GetHashCode() override
{
return Val;
}
};
// Compiler error: can't derive from a union
struct Vec2Bytes : Id
{
};
union IntRefs
{
// Compiler error: unions can't have lvalue references
int32_t& Lvalue;
// Compiler error: unions can't have rvalue references
int32_t&& Rvalue;
};
That's a lot of rules, but it's rather uncommon for unions to include
types with these kinds of non-trivial functions. Typically they're used
for simple primitives, structs, and arrays, like in the above examples.
For more advanced usage, we need to keep the rules in mind:
Anonymous unions are even more restricted than normal unions. They
can't have any member functions or static data members and all their
data members have to be public. Like unscoped enums, their members
are added to whatever scope the union is in: Foo in the following
example.
void Foo()
{
union
{
int32_t Int;
float Float;
};
// Int and Float are added to Foo, so they can be used directly
Float = 3.14f;
DebugLog(Int); // 1078523331
}
This feature is commonly used to create what's called a "tagged
union" by wrapping the union and an enum in a struct:
struct IntOrFloat
{
// The "tag" remembers the active member
enum { Int, Float } Type;
// Anonymous union
union
{
int32_t IntVal;
float FloatVal;
};
};
IntOrFloat iof;
iof.FloatVal = 3.14f; // Set value
iof.Type = IntOrFloat::Float; // Set type
// Read value and type
DebugLog(iof.IntVal, iof.Type); // 1078523331, Float
union Vector2
{
struct
{
float X;
float Y;
};
float Components[2];
};
Vector2 v;
// Named field access
v.X = 2;
v.Y = 4;
// Array access: same values due to union
DebugLog(v.Components[0], v.Components[1]); // 2, 4
// Array access
v.Components[0] = 20;
v.Components[1] = 40;
// Named field access: same values due to union
DebugLog(v.X, v.Y); // 20, 40
Pointers to Members
Finally, let's look at how we create pointers to members of structs. To
simply get a pointer to a specific struct instance's non-static data
member, we can use the normal pointer syntax:
struct Vector2
{
float X;
float Y;
};
Vector2 v{2, 4};
float* p = &v.X; // p points to the X data member of a
struct Vector2
{
float X;
float Y;
};
struct Vector3 : Vector2
{
float Z;
};
float Vector2::* p = &Vector2::X;
Vector2 v{2, 4};
float* p2 = p; // Compiler error: not compatible
float f = 3.14f;
float Vector2::* pf = &f; // Compiler error: not compatible
float Vector3::* p3 = p; // OK: Vector3 derives from Vector2
Vector3 v3{};
v3.X = 2;
DebugLog(v3.*p3); // 2
struct Float
{
float Val;
};
struct PtrToFloat
{
float Float::* Ptr;
};
// Pointer to Val in a Float pointed to by Ptr in a PtrToFloat
float Float::* PtrToFloat::* p1 = &PtrToFloat::Ptr;
Float f{3.14f};
PtrToFloat ptf{&Float::Val};
float Float::* pf = ptf.*p1; // Dereference first level of indirection
float floatVal = f.*pf; // Dereference second level of indirection
DebugLog(floatVal); // 3.14
// Dereference both levels of indirection at once
DebugLog(f.*(ptf.*p1)); // 3.14
struct Player
{
int32_t Health;
};
struct PlayerOps
{
Player& Target;
PlayerOps(Player& target)
: Target(target)
{
}
void Damage(int32_t amount)
{
Target.Health -= amount;
}
void Heal(int32_t amount)
{
Target.Health += amount;
}
};
// Pointer to a non-static member function of PlayerOps that
// takes an int32_t and returns void
void (PlayerOps::* op)(int32_t) = &PlayerOps::Damage;
Player player{100};
PlayerOps ops(player);
// Call the Damage function via the pointer
(ops.*op)(20);
DebugLog(player.Health); // 80
// Re-assign to another compatible function
op = &PlayerOps::Heal;
// Call the Heal function via the pointer
(ops.*op)(10);
DebugLog(player.Health); // 90
Conclusion
In this chapter we've seen a bunch of miscellaneous class functionality
that isn't available in C#. User-defined literals can make code both
more expressive and more terse at the same time. They're best used
sparingly for very stable, core types like the Standard Library's
string.
Local classes give a lot of the same benefits that local functions do
in C#, but go a step further and allow nearly full class functionality
including data members, constructors, destructors, and overloaded
operators.
namespace Math
{
struct Vector2
{
float X;
float Y;
};
}
namespace Math
{
struct Vector2
{
float X;
float Y;
};
}
namespace Math
{
struct Vector3
{
float X;
float Y;
float Z;
};
}
namespace Math
{
namespace LinearAlgebra
{
struct Vector2
{
float X;
float Y;
};
}
}
int32_t highScore = 0;
class Player
{
int32_t numPoints;
int32_t highScore;
void ScorePoints(int32_t num)
{
numPoints += num;
// highScore refers to the data member
if (numPoints > highScore)
{
highScore = numPoints;
}
// ::highScore refers to the global variable
if (numPoints > ::highScore)
{
::highScore = numPoints;
}
}
};
namespace Math::LinearAlgebra
{
struct Vector2
{
float X;
float Y;
};
}
Unlike C#, we’re not limited to only putting types like structs and
enums in a namespace. We can put anything we want there:
namespace Math
{
// Variable
const float PI = 3.14f;
// Function
bool IsNearlyZero(float val, float threshold=0.0001f)
{
return abs(val) < threshold;
}
}
namespace Math
{
// Declarations
struct Vector2;
bool IsNearlyZero(float val, float
threshold=0.0001f);
}
// Definitions
struct Math::Vector2
{
float X;
float Y;
};
bool Math::IsNearlyZero(float val, float threshold)
{
return abs(val) < threshold;
}
// Using directive
using namespace Math;
// No need for Math::
Vector2 vec{2, 4};
DebugLog(vec.X, vec.Y); // 2, 4
namespace MathUtils
{
// Using directive inside a namespace
using namespace Math;
bool IsNearlyZero(Vector2 vec, float
threshold=0.0001f)
{
return abs(vec.X) < threshold && abs(vec.Y) <
threshold;
}
}
void Foo()
{
// Using directive inside a function
using namespace Math;
// No need for Math::
Vector2 vec{2, 4};
DebugLog(vec.X, vec.Y); // 2, 4
}
enum struct Op
{
IS_NEARLY_ZERO
};
bool DoOp(Math::Vector2 vec, Op op)
{
if (op == Op::IS_NEARLY_ZERO)
{
// Using directive inside a block
using namespace MathUtils;
return IsNearlyZero(vec);
}
return false;
}
Also unlike C#, using directives are transitive. In the above, the
MathUtils namespace has using namespace Math. That means any
using namespace MathUtils implicitly includes a using namespace
Math:
Even members added to the namespace after the using directive are
included transitively:
namespace Math
{
struct Vector2
{
float X;
float Y;
};
}
namespace MathUtils
{
using namespace Math;
bool IsNearlyZero(Vector2 vec, float
threshold=0.0001f)
{
return abs(vec.X) < threshold && abs(vec.Y) <
threshold;
}
}
namespace Math
{
struct Vector3
{
float X;
float Y;
float Z;
};
}
void Foo()
{
// Implicitly includes MathUtils' "using namespace Math"
// Includes Vector3, even though it was after "using namespace Math"
using namespace MathUtils;
Vector3 vec{2, 4, 6};
DebugLog(vec.X, vec.Y, vec.Z); // 2, 4, 6
}
namespace Math
{
struct Vector2
{
float X;
float Y;
};
}
using namespace Math;
We can therefore use the members of the namespace without the
scope resolution operator or an explicit using directive:
namespace
{
struct Vector2
{
float X;
float Y;
};
}
// Implicitly added by the compiler
// UNNAMED is just a placeholder for the name the compiler gives the namespace
using namespace UNNAMED;
// Can use members of the unnamed namespace
Vector2 vec{2, 4};
DebugLog(vec.X, vec.Y); // 2, 4
namespace Math
{
struct Vector2
{
float X;
float Y;
};
struct Vector3
{
float X;
float Y;
float Z;
};
}
// Use just Vector2, not Vector3
using Math::Vector2;
Vector2 vec2{2, 4}; // OK
Vector3 vec3a{2, 4, 6}; // Compiler error
Math::Vector3 vec3b{2, 4, 6}; // OK
Like variable declarations, we can name multiple namespace
members in a single using declaration:
namespace Game
{
class Player;
class Enemy;
}
void Foo()
{
// Use Vector2 and Player, not Vector3 or Enemy
using Math::Vector2, Game::Player;
Vector2 vec2{2, 4}; // OK
Vector3 vec3{2, 4}; // Compiler error
Player* player; // OK
Enemy* enemy; // Compiler error
}
namespace Stats
{
int32_t score;
}
namespace Game
{
struct Player
{
int32_t Score;
};
}
bool HasHighScore(Game::Player* player)
{
using Stats::score;
int32_t score = player->Score; // Compiler error: score already declared
return score > score; // Ambiguous reference to score
}
There’s one case where it’s OK to have more than one identifier with
the same name: function overloads. With multiple using declarations
referring to functions with the same name, we can create an
overload set within the function:
namespace Game
{
struct Player
{
int32_t Health;
};
}
namespace Damage
{
struct Weapon
{
int32_t Damage;
};
void Use(Weapon& weapon, Game::Player& player)
{
player.Health -= weapon.Damage;
}
}
namespace Healing
{
struct Potion
{
int32_t HealAmount;
};
void Use(Potion& potion, Game::Player& player)
{
player.Health += potion.HealAmount;
}
}
void DamageThenHeal(
Game::Player& player, Damage::Weapon& weapon,
Healing::Potion& potion)
{
using Damage::Use; // Now have one Use function
using Healing::Use; // Now have two Use functions: an overload set
Use(weapon, player); // Call the Use(Weapon&, Player&) overload
Use(potion, player); // Call the Use(Potion&, Player&) overload
}
Game::Player player{100};
Damage::Weapon weapon{20};
Healing::Potion potion{10};
DamageThenHeal(player, weapon, potion);
DebugLog(player.Health); // 90
Namespace Aliases
The final namespace feature is a simple one: aliases. We can use
these to shorten long, usually nested namespace names:
namespace Math
{
namespace LinearAlgebra
{
struct Vector2
{
float X;
float Y;
};
}
}
// mla is an alias for Math::LinearAlgebra
namespace mla = Math::LinearAlgebra;
mla::Vector2 vec2{2, 4};
DebugLog(vec2.X, vec2.Y); // 2, 4
throw e;
IOException ex;
void Foo()
{
// Pointer to a class instance
throw &ex;
}
It’s typical to see throw all by itself in a statement but, like in C# 7.0,
it’s actually an expression that can be part of more complex
statements. Here it is as commonly seen with the ternary/conditional
operator:
class InvalidId{};
const int32_t MAX_PLAYERS = 4;
int32_t highScores[MAX_PLAYERS]{};
int32_t GetHighScore(int32_t playerId)
{
return playerId < 0 || playerId >= MAX_PLAYERS ?
throw InvalidId{} :
highScores[playerId];
}
Catching Exceptions
Exceptions are caught with try and catch blocks, just like in C#:
void Foo()
{
const int32_t id = 4;
try
{
GetHighScore(id);
}
catch (InvalidId)
{
DebugLog("Invalid ID", id);
}
}
struct InvalidId
{
int32_t Id;
};
const int32_t MAX_PLAYERS = 4;
int32_t highScores[MAX_PLAYERS]{};
int32_t GetHighScore(int32_t playerId)
{
return playerId < 0 || playerId >= MAX_PLAYERS ?
throw InvalidId{playerId} :
highScores[playerId];
}
void Foo()
{
try
{
GetHighScore(4);
}
catch (InvalidId ex)
{
DebugLog("Invalid ID", ex.Id);
}
}
In this version, InvalidId has the ID that was invalid so we give the
catch block’s exception object a name in order to access it.
struct InvalidId
{
int32_t Id;
};
struct NoHighScore
{
int32_t PlayerId;
};
const int32_t MAX_PLAYERS = 4;
int32_t highScores[MAX_PLAYERS]{-1, -1, -1, -1};
int32_t GetHighScore(int32_t playerId)
{
if (playerId < 0 || playerId >= MAX_PLAYERS)
{
throw InvalidId{playerId};
}
const int32_t highScore = highScores[playerId];
return highScore < 0 ? throw NoHighScore{playerId} :
highScore;
}
void Foo()
{
try
{
GetHighScore(2);
}
catch (InvalidId ex)
{
DebugLog("Invalid ID", ex.Id);
}
catch (NoHighScore ex)
{
DebugLog("No high score for player with ID",
ex.PlayerId);
}
}
The catch blocks are checked in the order they’re listed and the first
matching type’s catch block gets executed.
void Foo()
{
try
{
GetHighScore(2);
}
catch (...)
{
DebugLog("Couldn't get high score");
}
}
Lastly, C++ has an alternate form of try–catch blocks that are placed
at the function level:
void Foo() try
{
GetHighScore(2);
}
catch (...)
{
DebugLog("Couldn't get high score");
}
These are similar to a try that encompasses the whole function. The
main reason to use one is to be able to catch exceptions in
constructor initializer lists. Since these don’t appear in the function
body, there’s no other way to write a try block that includes them.
struct HighScore
{
int32_t Value;
HighScore(int32_t playerId) try
: Value(GetHighScore(playerId))
{
}
catch (...)
{
DebugLog("Couldn't get high score");
}
};
void Foo()
{
try
{
HighScore hs{2};
}
catch (NoHighScore ex)
{
DebugLog("No high score for player",
ex.PlayerId);
}
}
// This prints:
// * Couldn't get high score
// * No high score for player 2
// Force non-throwing
// Deprecated in C++11 and removed in C++20
void Foo() throw()
{
throw 1; // Compiler warning: this function is non-throwing
}
// Can throw an int or a float
// Deprecated in C++11 and removed in C++17
void Goo(int a) throw(int, float)
{
if (a == 1)
{
throw 123; // Throw an int
}
else if (a == 2)
{
throw 3.14f; // Throw a float
}
}
Stack Unwinding
Just like when we throw exceptions in C#, exceptions thrown in C++
unwind the call stack looking for a try block that can handle the
exception. This triggers finally blocks in C#, but C++ doesn’t have
finally blocks. Instead, destructors of local variables are called
without the need for any explicit syntax such as finally:
struct File
{
FILE* handle;
File(const char* path)
{
handle = fopen(path, "w");
}
~File()
{
fclose(handle);
}
void Write(int32_t val)
{
fwrite(&val, sizeof(val), 1, handle);
}
};
void Foo()
{
File file{"/path/to/file"};
int32_t highScore = GetHighScore(123);
file.Write(highScore);
}
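To see the order in which destructors run during unwinding, here's a self-contained sketch. Tracer, Inner, Outer, and RunUnwindDemo are hypothetical names, and the log string simply records what happened:

```cpp
#include <cassert>
#include <string>

// Hypothetical type that records when it is destroyed
struct Tracer
{
    std::string& Log;
    char Name;

    Tracer(std::string& log, char name)
        : Log(log), Name(name)
    {
    }

    ~Tracer()
    {
        Log += Name;
    }
};

void Inner(std::string& log)
{
    Tracer b{log, 'b'};
    throw 123; // Unwinding starts here: b is destroyed first
}

void Outer(std::string& log)
{
    Tracer a{log, 'a'};
    Inner(log); // The exception passes through: a is destroyed next
    log += 'x'; // Never executed
}

std::string RunUnwindDemo()
{
    std::string log;
    try
    {
        Outer(log);
    }
    catch (int)
    {
        log += '!'; // Runs after both destructors
    }
    return log; // "ba!"
}
```

The innermost local variable is destroyed first, then the callers' local variables, and only then does the catch block run.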
void SaveCrashReport()
{
// ...
}
void OnTerminate()
{
SaveCrashReport();
std::abort();
}
std::set_terminate(OnTerminate);
// ... anywhere else in the program ...
throw 123; // calls OnTerminate if not caught
struct Boom
{
~Boom() noexcept(false) // Force potentially-throwing
{
DebugLog("boom!");
// If called during stack unwinding, this calls std::terminate
// Otherwise, it just throws like normal
throw 123;
}
};
void Foo()
{
try
{
Boom boom{};
throw 456; // Calls boom's destructor
}
catch (...)
{
DebugLog("never printed");
}
}
struct Boom
{
~Boom() // Non-throwing
{
throw 123; // Compiler warning: throwing in non-throwing function
}
};
void Foo()
{
try
{
Boom boom{};
}
catch (...)
{
DebugLog("never printed");
}
}
struct Boom
{
Boom()
{
throw 123;
}
};
static Boom boom{};
struct Boom
{
Boom()
{
throw 123;
}
};
void Goo()
{
static Boom boom{}; // Static local variable whose constructor throws
}
void Foo()
{
for (int i = 0; i < 3; ++i)
{
try
{
Goo();
}
catch (...)
{
DebugLog("caught"); // Prints three times
}
}
}
Slicing
One common mistake is to catch by value class instances that are part
of an inheritance hierarchy. We typically want to catch the base class
(IOError) to implicitly catch all the derived classes (FileNotFound,
PermissionDenied). Catching by value copies only the base class
subobject, “slicing” it off from the derived class. Since the subobject
is really designed to be used as a part of the derived class object,
this may cause errors.
struct Exception
{
const char* Message;
virtual void Print()
{
DebugLog(Message);
}
};
struct IOException : Exception
{
const char* Path;
IOException(const char* path, const char* message)
{
Path = path;
Message = message;
}
virtual void Print() override
{
DebugLog(Message, Path);
}
};
FILE* OpenLogFile(const char* path)
{
FILE* handle = fopen(path, "r");
return handle == nullptr ? throw IOException{path,
"Can't open"} : handle;
}
void Foo()
{
try
{
FILE* handle = OpenLogFile("/path/to/log/file");
// ... use handle
}
// Catching the base class slices it off from the whole IOException
catch (Exception ex)
{
// Calls Exception's Print, not IOException's Print
ex.Print();
}
}
Either way, the appropriate virtual function will now be called and
we’ll get the right error message:
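As a minimal sketch of catching by reference, consider the following. The Describe function and the two catching functions are made up for this illustration; they return a string instead of printing so that the result is easy to check:

```cpp
#include <cassert>
#include <string>

struct Exception
{
    virtual ~Exception() = default;

    virtual std::string Describe() const
    {
        return "Exception";
    }
};

struct IOException : Exception
{
    std::string Describe() const override
    {
        return "IOException";
    }
};

// Catches by non-const reference: no copy is made, so nothing is sliced
std::string CatchByReference()
{
    std::string description;
    try
    {
        throw IOException{};
    }
    catch (Exception& ex)
    {
        description = ex.Describe();
    }
    return description;
}

// Catches by const reference: also no copy, and the object can't be modified
std::string CatchByConstReference()
{
    std::string description;
    try
    {
        throw IOException{};
    }
    catch (const Exception& ex)
    {
        description = ex.Describe();
    }
    return description;
}
```

In both cases the reference refers to the whole IOException, so the virtual call reaches the derived class's override.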
C++ also gains the ability to throw objects, not just references.
Those objects don’t have to be class instances as primitives, enums,
and pointers are also allowed. We can also gain a compiler safety
net and better optimization by using noexcept specifications. When
exceptions go uncaught, we can hook into std::terminate_handler
to add crash reporting or take any other actions before the program
exits.
19. Dynamic Allocation
History and Strategy
Let’s start by looking at a bit of history that is still very relevant to
C++ programming today. In C, not C++, memory is dynamically
allocated using a family of functions in the C Standard Library whose
names end in alloc:
“Raw” use of malloc and free like this is still common in C++
codebases. It’s a pretty low-level way of working, though, and is
generally discouraged because it’s quite easy to accidentally trigger
undefined behavior. The three mistakes in the above code are very
common bugs.
struct Vector2
{
float X;
float Y;
Vector2()
: X(0), Y(0)
{
}
Vector2(float x, float y)
: X(x), Y(y)
{
}
};
// 1) Allocate enough memory for a Vector2: sizeof(Vector2)
// 2) Call the constructor
// * "this" is the allocated memory
// * Pass 2 and 4 as arguments
// 3) Evaluate to a Vector2*
Vector2* pVec = new Vector2{2, 4};
DebugLog(pVec->X, pVec->Y); // 2, 4
The new operator combines several of the manual steps from the C
code so we can’t forget to do them or accidentally do them wrong.
As a result, safety is increased in numerous ways:
C# allows us to use new with classes and structs. In C++, we can use
new with any type:
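Here's a brief sketch of new used with a primitive, an enum, a pointer, and an array. The Color enum and the function name are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstdint>

enum class Color : int32_t { Red, Green, Blue };

int32_t NewWithAnyTypeDemo()
{
    int32_t* pInt = new int32_t{123};        // Primitive
    Color* pColor = new Color{Color::Blue};  // Enum
    int32_t** ppInt = new int32_t*{pInt};    // Pointer
    float* pFloats = new float[3]{1, 2, 3};  // Array of primitives

    // 123 + 2 + 123 + 3
    int32_t sum = *pInt + static_cast<int32_t>(*pColor) + **ppInt
        + static_cast<int32_t>(pFloats[2]);

    // Release everything: new pairs with delete and new[] pairs with delete[]
    delete[] pFloats;
    delete ppInt;
    delete pColor;
    delete pInt;

    return sum;
}
```

Unlike the shorter snippets in this chapter, this sketch releases its allocations with delete, which is covered in the Deallocation section.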
try
{
// Attempt a 1 TB allocation
// Throws an exception if the allocation fails
char* big = new char[1024ull*1024*1024*1024];
// Never executed if the allocation fails
big[0] = 123;
}
catch (std::bad_alloc)
{
// This gets printed if the allocation fails
DebugLog("Failed to allocate big array");
}
// Attempt a 1 TB allocation
// Calls abort() if the allocation fails
char* big = new char[1024ull*1024*1024*1024];
// Never executed if the allocation fails
big[0] = 123;
Deallocation
All of the above examples create memory leaks. That’s because C++
has no garbage collector to automatically release memory that’s no
longer referenced. Instead, we must release the memory when we’re
done with it. We do that with the delete operator:
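A small sketch of both forms of delete follows. Tracked is a hypothetical type whose static counter lets us observe that the destructors really run:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type that counts how many instances are alive
struct Tracked
{
    static int32_t NumAlive;

    Tracked()
    {
        ++NumAlive;
    }

    ~Tracked()
    {
        --NumAlive;
    }
};

int32_t Tracked::NumAlive = 0;

int32_t DeleteDemo()
{
    Tracked* single = new Tracked{};  // NumAlive is now 1
    Tracked* many = new Tracked[3];   // NumAlive is now 4

    // delete calls the destructor and then releases the memory
    delete single;                    // NumAlive is now 3

    // Arrays allocated with new[] must be released with delete[]
    delete[] many;                    // NumAlive is now 0

    return Tracked::NumAlive;
}
```

Mixing the forms, such as using delete on memory from new[], is undefined behavior.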
One way to address these issues is to set all pointers to the memory
to null after releasing them:
delete pVec;
pVec = nullptr;
// Undefined behavior: dereferencing null
DebugLog(pVec->X, pVec->Y);
delete pVec; // OK
Most of the time, such as when using the null pointer in some far-
flung part of the codebase, the compiler can’t determine that it’s null
and will assume a non-null pointer. In that case, dereferencing null
will crash the program. So this is only a moderate improvement as
we may only potentially get a crash instead of data corruption from
reading or writing the released memory.
struct HasId
{
int32_t Id;
// Non-virtual destructor
~HasId()
{
}
};
struct Combatant
{
// Non-virtual destructor
~Combatant()
{
}
};
struct Enemy : HasId, Combatant
{
// Non-virtual destructor
~Enemy()
{
}
};
// Allocate an Enemy
Enemy* pEnemy = new Enemy();
// Polymorphism is allowed because Enemy "is a" Combatant due to inheritance
Combatant* pCombatant = pEnemy;
// Deallocate a Combatant
// 1) Call the Combatant, not Enemy, destructor
// 2) Release the allocated memory pointed to by pCombatant
delete pCombatant;
There are several forms the overloaded operators can take, but they
should always be overloaded in pairs. Here’s the simplest form:
struct Vector2
{
float X;
float Y;
void* operator new[](std::size_t size)
{
// size is the total number of bytes to allocate
return malloc(size);
}
void operator delete[](void* ptr)
{
free(ptr);
}
};
Vector2* pVecs = new Vector2[1];
delete [] pVecs;
struct Vector2
{
float X;
float Y;
// Overload the new operator that takes (float, float) arguments
// Note: size is the number of bytes to allocate
void* operator new(std::size_t size, float x, float y)
{
// Note: for demonstration purposes only
// Normal code would just use a constructor
Vector2* pVec = (Vector2*)malloc(size);
pVec->X = x;
pVec->Y = y;
return pVec;
}
// Overload the normal delete operator
void operator delete(void* memory, std::size_t size)
{
free(memory);
}
// Overload a delete operator corresponding with the new operator
// that takes (float, float) arguments
void operator delete(void* memory, float x, float y)
{
// Forward the call to the normal delete operator
Vector2::operator delete(memory, sizeof(Vector2));
}
};
// Call the overloaded (float, float) new operator
Vector2* pVec = new (2, 4) Vector2;
DebugLog(pVec->X, pVec->Y); // 2, 4
// Call the normal delete operator
delete pVec;
struct Vector2
{
float X;
float Y;
};
void* operator new(std::size_t count, void* place)
noexcept
{
return place;
}
alignas(Vector2) char buf[sizeof(Vector2)];
Vector2* pVec = new (buf) Vector2{2, 4};
DebugLog(pVec->X, pVec->Y); // 2, 4
float* pFloat = new (buf) float{3.14f};
DebugLog(*pFloat); // 3.14
Owning Types
So far we’ve overcome a lot of possible mistakes that could have
been made with low-level dynamic allocation functions like malloc
and free. Even so, “naked” use of new and delete is often frowned
upon in “Modern C++” (i.e. C++11 and newer) codebases. This is
because we are still susceptible to common bugs:
class FloatArray
{
int32_t length;
float* floats;
public:
FloatArray(int32_t length)
: length{length}
, floats{new float[length]{0}}
{
}
float& operator[](int32_t index)
{
if (index < 0 || index >= length)
{
throw IndexOutOfBounds{};
}
return floats[index];
}
virtual ~FloatArray()
{
delete [] floats;
floats = nullptr;
}
struct IndexOutOfBounds {};
};
try
{
FloatArray floats{3};
floats[0] = 3.14f;
// Index out of bounds
// Throws exception
// FloatArray destructor called
DebugLog(floats[-1]);
}
catch (FloatArray::IndexOutOfBounds)
{
DebugLog("whoops"); // Gets printed
}
void Foo()
{
FloatArray f1{3};
FloatArray f2{f1}; // Copies floats and length
// 1) Call f2's destructor which deletes the allocated memory
// 2) Call f1's destructor which deletes the allocated memory again: crash
}
First up, and just like in C#, it happens when calling a function with a
type other than the type of the function’s parameter:
void Foo(float x)
{
}
Foo(1); // int -> float
float Bar()
{
return 1; // int -> float
}
All the boolean logic operators require bool operands, so any non-
bool needs conversion. This can be implicit in C++, but not C#:
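For example, integers and pointers can be used directly as operands of && and !. This is a minimal sketch with made-up function names:

```cpp
#include <cassert>
#include <cstdint>

bool BothSet(int32_t count, const char* name)
{
    // count: int32_t -> bool (non-zero is true)
    // name: const char* -> bool (non-null is true)
    return count && name;
}

bool NeitherSet(int32_t count, const char* name)
{
    // The operand of ! is converted to bool in the same way
    return !count && !name;
}
```

The equivalent C# would need explicit comparisons such as count != 0 and name != null.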
The C++ delete operator only deletes typed pointers. Here we have
a user-defined conversion operator that converts a struct to an int*:
struct ConvertsToIntPointer
{
operator int*() { return nullptr; }
};
delete ConvertsToIntPointer{}; // ConvertsToIntPointer -> int*
Usually these change the type itself, but in a few cases they just
change its classification:
int x = 123;
const int y = x; // int -> const int
// also, lvalue -> rvalue
With those out of the way, all the rest of the standard conversions
will change the type. First up we have function-to-pointer
conversions. The previous example “took the address” of DoStuff to
get a pointer to it, as we’ve seen before, but the & is optional because
there’s a standard conversion from functions to function pointers:
void DoStuff()
{
}
void (*pFunc)() = DoStuff; // function -> function pointer
struct Vector2
{
float X;
float Y;
float SqrMagnitude() const noexcept
{
return X*X + Y*Y;
}
};
Vector2 vec{2, 4};
// All of these are compiler errors:
float (*sqrMagnitude1)() = Vector2::SqrMagnitude;
float (*sqrMagnitude2)() = vec.SqrMagnitude;
float (*sqrMagnitude3)(Vector2*) = Vector2::SqrMagnitude;
float (*sqrMagnitude4)(Vector2*) = vec.SqrMagnitude;
The sizes of char, unsigned char, unsigned short, and int depend
on factors such as the compiler and CPU architecture. If int can
hold the full range of values for char, unsigned char, unsigned
short, and char8_t, which is usually the case, they’re promoted to
int. If it can’t, they’re promoted to unsigned int.
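We can check the promotion with decltype. This sketch assumes a typical platform where uint8_t is unsigned char and int is wider than 8 bits; AddBytes is a made-up name:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Adds two 8-bit values: both operands are promoted before the addition
int AddBytes(uint8_t a, uint8_t b)
{
    auto sum = a + b;

    // The result type is int, not uint8_t
    static_assert(std::is_same<decltype(sum), int>::value,
        "uint8_t operands are promoted to int");

    return sum;
}
```

Because the addition happens on int values, 200 + 100 yields 300 rather than wrapping around at 256.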
1. int
2. unsigned int
3. long
4. unsigned long
5. long long
6. unsigned long long
wchar_t c = 'A';
auto i = c + 1; // c is promoted from 'wchar_t' to at least an int
DebugLog(i); // 66 (ASCII for 'B')
If it does have a fixed underlying type, it’s promoted to that type and
then that type can be promoted:
enum Color : int // Has an underlying type
{
Red,
Green,
Blue
};
Color c = Red;
long i = c + 1L; // c is promoted from 'Color' to int and then to long
DebugLog(i); // 1
Bit fields will be promoted to the smallest size that can hold the full
value range of the bit field, but it’s a short list:
1. int
2. unsigned int
struct ByteBits
{
bool Bit0 : 1;
bool Bit1 : 1;
bool Bit2 : 1;
bool Bit3 : 1;
bool Bit4 : 1;
bool Bit5 : 1;
bool Bit6 : 1;
bool Bit7 : 1;
};
ByteBits bb{0};
int i = bb.Bit0 + 1; // bit field is promoted from 1 bit to int
DebugLog(i); // 1
The bool type is promoted to int with false becoming 0 and true
becoming 1 (not just non-zero). This isn’t allowed in C#:
bool b = true;
int i = b + 1; // b is promoted from bool to int with value 1
DebugLog(i); // 2
float f = 3.14f;
double d = f + 1.0; // f is promoted from float to double
DebugLog(d); // 4.14
int32_t si = 257;
uint8_t ui = si; // si is converted from int32_t to uint8_t
// ui = 257 % 2^8 = 257 % 256 = 1
DebugLog(ui); // 1
We saw above that when bool must become an int, it’s promoted
from false to 0 and true to 1. For all other integer types, this is
technically a conversion but it generates the same result. Despite not
losing any precision like the above conversions, C# also forbids this:
bool b = true;
long i = b + 1; // b is converted from bool to long with value 1
DebugLog(i); // 2
double d = 3.14f;
float f = d; // d is converted from double to float
DebugLog(f); // 3.14
int8_t i = 123;
float f1 = i; // i is converted from int8_t to float
DebugLog(f1); // 123
bool b = true;
float f2 = b; // b is converted from bool to float
DebugLog(f2); // 1
A “null pointer constant” in C++ is an integer literal with the value 0
or nullptr. (Older versions of the language also accepted any integer
constant expression with the value 0.) These can all be converted to
any pointer type. Only null is allowed in C#.
int x = 123;
int* pi = &x;
void* pv = pi; // int* is converted to void*
DebugLog(pv == pi); // true
struct Vector2
{
float X;
float Y;
};
struct Vector3 : Vector2
{
float Z;
};
Vector3 vec{};
vec.X = 1;
vec.Y = 2;
vec.Z = 3;
Vector3* pVec3 = &vec;
Vector2* pVec2 = pVec3; // Vector3* is converted to Vector2*
DebugLog(pVec2->X, pVec2->Y); // 1, 2
struct Vector2
{
float X;
float Y;
};
struct Vector3 : virtual Vector2 // Virtual inheritance
{
float Z;
};
float Vector2::* pVec2X = &Vector2::X;
float Vector3::* pVec3X = pVec2X; // Compiler error
int i = 123;
bool b1 = i; // int is converted to bool
DebugLog(b1); // true
float f = 3.14f;
bool b2 = f; // float is converted to bool
DebugLog(b2); // true
Color c = Red;
bool b3 = c; // Color is converted to bool
DebugLog(b3); // false
int* p = nullptr;
bool b4 = p; // int* is converted to bool
DebugLog(b4); // false
float Vector2::* pVec2X = &Vector2::X;
bool b5 = pVec2X; // Pointer to member is converted to bool
DebugLog(b5); // true
Conversion Sequences
Now that we know about promotions and conversions, let’s see how
they’re sequenced in order to change types. First, C++ has a
“standard conversion sequence” that consists of the following steps
which mostly don’t apply to C#:
struct MyClass
{
MyClass(const int32_t)
{
}
};
uint8_t i1{123};
MyClass mc{i1}; // 1) lvalue to rvalue
// 2) Promotion from uint8_t to int32_t
// 3) N/A
// 4) int32_t to 'const int32_t'
struct C
{
};
struct B
{
operator C()
{
return C{};
}
};
struct A
{
operator B()
{
return B{};
}
};
// Compiler error: only one user-defined conversion is allowed implicitly
C c = A{};
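If we really do want to go from A to C, one option is to spell out the intermediate step so that each initialization needs only a single user-defined conversion. This is a sketch with the same A, B, and C names; the Value member is added here just so there's something to check:

```cpp
#include <cassert>

struct C
{
    int Value;
};

struct B
{
    operator C() const
    {
        return C{42};
    }
};

struct A
{
    operator B() const
    {
        return B{};
    }
};

int ConvertInTwoSteps()
{
    // C c = A{}; // Compiler error: would need A -> B -> C implicitly

    B b = A{}; // One user-defined conversion: A -> B
    C c = b;   // One user-defined conversion: B -> C
    return c.Value;
}
```

Each line now involves exactly one user-defined conversion, which the compiler is willing to apply implicitly.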
class File
{
FILE* handle;
public:
File(const char* path, const char* mode)
{
handle = fopen(path, mode);
}
~File()
{
fclose(handle);
}
operator FILE*()
{
return handle;
}
};
void Foo()
{
File writer{"/path/to/file", "w"};
char msg[] = "hello";
// fwrite looks like this:
// std::size_t fwrite(
// const void* buffer,
// std::size_t size,
// std::size_t count,
// std::FILE* stream);
// Last argument is implicitly converted from File to FILE*
fwrite(msg, sizeof(msg), 1, writer);
// Note: File destructor called here to close the file
}
Overflows
Integer math may result in an “overflow” where the result doesn’t fit
into the integer type. C++ doesn’t have C#’s checked and unchecked
contexts. Instead, it handles overflow differently depending on
whether the math is signed or unsigned.
int32_t a = 0x7fffffff;
int32_t b = a + 1; // Overflow. Undefined behavior!
DebugLog(b); // Could be anything!
uint8_t a = 255;
uint8_t b = a + 1; // Overflow. b = (255 + 1) % 256 = 0.
DebugLog(b); // 0
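Since signed overflow is undefined behavior, a common approach is to check before doing the math. Below is a sketch of that alongside well-defined unsigned wraparound. WrapAdd and SafeAdd are hypothetical helper names, not Standard Library functions:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Unsigned arithmetic is well-defined: the result wraps around modulo 2^32
uint32_t WrapAdd(uint32_t a, uint32_t b)
{
    return a + b;
}

// Checks for signed overflow before adding so the undefined behavior never occurs
// Returns false and leaves result untouched if the sum wouldn't fit
bool SafeAdd(int32_t a, int32_t b, int32_t& result)
{
    const int32_t max = std::numeric_limits<int32_t>::max();
    const int32_t min = std::numeric_limits<int32_t>::min();
    if ((b > 0 && a > max - b) || (b < 0 && a < min - b))
    {
        return false; // Would overflow
    }
    result = a + b;
    return true;
}
```

The check itself uses only subtractions that can't overflow, so it's safe to evaluate for any pair of inputs.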
Arithmetic
We’ve seen a lot of promotion and conversion due to arithmetic
already, but only covered simple cases so far. There are quite a few
more rules for determining which operands are promoted or
converted and what the “common type” arithmetic is performed on
should be.
First of all, C++20 deprecates mixing floating point and enum types
or enum types with other enum types. These were never allowed in
C#.
enum Color
{
Red,
Green,
Blue
};
// Deprecated: mixed enum and float
auto a = Red + 3.14f;
enum RangeType
{
Melee,
Distance
};
// Deprecated: mixed enum types
auto b = Red + Melee;
Integers get promoted first. Then, for all the binary operators except
shifts, a series of specific type changes occurs. First, if either
operand is a long double then the other operand is converted to a long
double. The same happens for double and float. C# has essentially
the same behavior.
int i = 123;
long double ld = 3.14;
long double sum1 = ld + i; // i is converted from int to 'long double'
DebugLog(sum1); // 126.14
double d = 3.14;
double sum2 = d + i; // i is converted from int to double
DebugLog(sum2); // 126.14
float f = 3.14f;
double sum3 = f + i; // i is converted from int to float
DebugLog(sum3); // 126.14
1. bool
2. signed char, unsigned char, and char
3. short and unsigned short
4. int and unsigned int
5. long and unsigned long
6. long long and unsigned long long
If that’s not the case but the signed type can represent all the values
of the unsigned type, the unsigned operand is converted to the type of
the signed operand:
And if that’s not the case either, both operands are converted to the
unsigned counterpart of the signed type:
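The signed and unsigned rules above can be checked at compile time. This is a minimal sketch rather than code from the original text: the alias names are made up for illustration, and the expected types are computed with sizeof as a stand-in for "can represent all the values" because type sizes vary by platform:

```cpp
#include <type_traits>

// Same rank, one signed and one unsigned operand:
// the signed operand is converted to the unsigned type
using SameRank = decltype(1 + 1u);
static_assert(std::is_same_v<SameRank, unsigned int>);

// Signed operand with greater rank: it wins only if it can represent every
// value of the unsigned type. Otherwise both operands are converted to the
// unsigned counterpart of the signed type.
using IntMix = decltype(1LL + 1u);
using ExpectedIntMix = std::conditional_t<
    (sizeof(long long) > sizeof(unsigned int)),
    long long,
    unsigned long long>;
static_assert(std::is_same_v<IntMix, ExpectedIntMix>);

using LongMix = decltype(1LL + 1ul);
using ExpectedLongMix = std::conditional_t<
    (sizeof(long long) > sizeof(unsigned long)),
    long long,
    unsigned long long>;
static_assert(std::is_same_v<LongMix, ExpectedLongMix>);

// A common surprise: -1 is converted to unsigned int before the comparison,
// so it becomes a very large value
constexpr bool MinusOneIsLess = -1 < 1u;
static_assert(!MinusOneIsLess);
```

Many compilers warn about comparisons like the last one, which is a good reason to keep those warnings enabled.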
float f = 3.14f;
int i = f;
DebugLog(i); // 3
uint64_t i = 0xffffffffffffffff;
float f = i; // uint64_t -> float
DebugLog(f); // 1.84467e+19
uint64_t i2 = f;
DebugLog(i == i2); // false
The first cast in this suite is one of the simplest: const_cast. We use
this when we simply want to treat a const pointer or reference as
non-const:
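As a minimal sketch of that idea, assuming a hypothetical ReadValue function that takes a non-const pointer, const_cast might be used like this. Note that writing through the result is only safe because the underlying object was never declared const:

```cpp
// Hypothetical legacy function: takes a non-const pointer but only reads
int ReadValue(int* p)
{
    return *p;
}

int ConstCastDemo()
{
    int x = 123;         // The object itself is not const
    const int* cp = &x;  // Pointer to const int
    const int& cr = x;   // Reference to const int

    // Remove const from the pointer and from the reference
    int* p = const_cast<int*>(cp);
    int& r = const_cast<int&>(cr);

    // Writing is OK only because x was not declared const.
    // Writing to an object that was declared const is undefined behavior.
    *p = 1;
    r += 1;

    // Pass a pointer to const to a function that wants a non-const pointer
    return ReadValue(const_cast<int*>(cp)); // 2
}
```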
uint64_t i = reinterpret_cast<uint64_t>(nullptr);
DebugLog(i); // 0
If the types aren’t “similar” then we have two more chances to avoid
undefined behavior. First, if one type is the signed or unsigned
version of the same type:
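Here is a small sketch of that signed and unsigned case. The function name is made up for illustration:

```cpp
#include <cstdint>

uint32_t ViewAsUnsigned()
{
    int32_t i = -1;
    // OK: uint32_t is the unsigned version of int32_t, so reading i
    // through this reference is allowed
    uint32_t& u = reinterpret_cast<uint32_t&>(i);
    return u; // 0xffffffff
}
```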
struct File
{
FILE* handle;
File(const char* path, const char* mode)
{
handle = fopen(path, mode);
}
~File()
{
fclose(handle);
}
operator FILE*()
{
return handle;
}
};
File reader{"/path/to/file", "r"};
FILE* handle = static_cast<FILE*>(reader); // Implicit conversion
struct Vector2
{
float X;
float Y;
};
struct Vector3 : Vector2
{
float Z;
};
Vector3 vec;
vec.X = 2;
vec.Y = 4;
vec.Z = 6;
Vector2& refVec2 = vec; // Implicit conversion from Vector3& to Vector2&
Vector3& refVec3 = static_cast<Vector3&>(refVec2); // Downcast
DebugLog(refVec3.X, refVec3.Y, refVec3.Z); // 2, 4, 6
We can also static_cast to void to explicitly discard a value. This is
sometimes used to silence an “unused variable” compiler warning:
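A minimal sketch of discarding a value, using a hypothetical function and variable name, might look like this:

```cpp
int ComputeSum(int a, int b)
{
    // Hypothetical variable that's only inspected in a debugger
    int unusedScratch = a * b;

    // Explicitly discard the value so the compiler doesn't warn about it
    static_cast<void>(unusedScratch);

    return a + b;
}
```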
int i = 123;
float f = i; // Standard conversion: int -> float
int i2 = static_cast<int>(f); // Undo standard conversion: int -> float
DebugLog(i2); // 123
void SayHello()
{
DebugLog("hello");
}
// lvalue to rvalue conversion
int i = 123;
int i2 = static_cast<int&&>(i);
DebugLog(i2); // 123
// Array to pointer decay
int a[3]{1, 2, 3};
int* p = static_cast<int*>(a);
DebugLog(p[0], p[1], p[2]); // 1, 2, 3
// Function to pointer conversion
void (*pFunc)() = static_cast<void(*)()>(SayHello);
pFunc(); // hello
We can go the other way, too: integers and floating point types can
be static_cast to scoped or unscoped enumerations. We can also
cast between enumeration types:
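The following sketch shows all three directions. It reuses the Color and RangeType names from earlier examples, but gives Color a fixed underlying type as an assumption to keep every cast well-defined:

```cpp
enum class Color : int { Red, Green, Blue };
enum RangeType { Melee, Distance };

// Integer to scoped enumeration
constexpr Color FromInt = static_cast<Color>(2); // Color::Blue

// Floating point to enumeration: 1.9f is first truncated to 1
constexpr Color FromFloat = static_cast<Color>(1.9f); // Color::Green

// One enumeration type to another, by value
constexpr RangeType FromColor = static_cast<RangeType>(Color::Green); // Distance

static_assert(FromInt == Color::Blue);
static_assert(FromFloat == Color::Green);
static_assert(FromColor == Distance);
```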
It’s undefined behavior if the underlying type of the enum isn’t fixed
and the value being cast to the enum is out of its range. If it is fixed,
the result is just like converting to the underlying type. Floating point
values are first converted to the underlying type.
We can also use static_cast to upcast from a pointer to a member
in a derived class to a pointer to a member in the base class:
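As a sketch of that upcast, reusing the Vector2 and Vector3 structs from above with a made-up function name:

```cpp
struct Vector2
{
    float X;
    float Y;
};

struct Vector3 : Vector2
{
    float Z;
};

float ReadViaBasePointer()
{
    // Implicit conversion: pointer to member of the base class ->
    // pointer to member of the derived class
    float Vector3::* pDerived = &Vector3::X;

    // static_cast goes the other way. This is only valid because X
    // really is a member of Vector2.
    float Vector2::* pBase = static_cast<float Vector2::*>(pDerived);

    Vector2 vec{2, 4};
    return vec.*pBase; // 2
}
```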
int i = 123;
void* pv = &i;
int* pi = static_cast<int*>(pv);
DebugLog(*pi); // 123
C-Style Cast and Function-Style Cast
A “C-style” cast looks like a cast in C as well as C#:
(DestinationType)sourceType. It behaves quite differently in C++
compared to C#. In C++, it’s mostly a shorthand for the first “named”
cast whose prerequisites are met in this order:
1. const_cast<DestinationType>(sourceType)
2. static_cast<DestinationType>(sourceType) with more
leniency: pointers and references to or from derived classes or
members of derived classes can be cast to pointers or
references to base classes or members of base classes
3. static_cast (with more leniency) then const_cast
4. reinterpret_cast<DestinationType>(sourceType)
5. reinterpret_cast then const_cast
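To make the list concrete, here is a short sketch with made-up function names. The same syntax quietly picks a different named cast depending on the types involved:

```cpp
float CStyleToFloat(int i)
{
    // No const_cast applies, so this behaves like static_cast<float>(i)
    return (float)i;
}

char* CStyleDropConst(const char* text)
{
    // Behaves like const_cast<char*>(text)
    // Writing through the result is undefined behavior if the
    // characters really are const, such as in a string literal
    return (char*)text;
}
```

Because nothing in the syntax says which cast was chosen, many C++ programmers prefer the named casts.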
int i = 123;
float f = float(i);
DebugLog(f); // 123
dynamic_cast
All of the casts we’ve seen so far are “static.” That means the way
they operate is determined at compile time and doesn’t depend on the
run-time value of the expression being cast. For example, consider
this downcast:
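A sketch of such a static downcast, assuming the Vector2 and Vector3 structs from earlier and a hypothetical ReadZ function:

```cpp
struct Vector2
{
    float X;
    float Y;
};

struct Vector3 : Vector2
{
    float Z;
};

float ReadZ(Vector2& v)
{
    // This compiles no matter what v really refers to.
    // No check is made at run time, so the caller must guarantee
    // that v refers to the Vector2 part of a Vector3.
    Vector3& v3 = static_cast<Vector3&>(v);
    return v3.Z;
}
```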
Vector2* p1 = nullptr;
Vector2* p2 = dynamic_cast<Vector2*>(p1);
DebugLog(p2); // 0
Vector3 vec;
vec.X = 2;
vec.Y = 4;
vec.Z = 6;
Vector3& r3 = vec;
Vector2& r2 = dynamic_cast<Vector2&>(r3);
DebugLog(r2.X, r2.Y); // 2, 4
struct Combatant
{
virtual ~Combatant()
{
}
};
struct Player : Combatant
{
int32_t Id;
};
Player player;
player.Id = 123;
Combatant* p = &player;
void* pv = dynamic_cast<void*>(p); // Downcast to most-derived class: Player*
Player* p2 = reinterpret_cast<Player*>(pv);
DebugLog(p2->Id); // 123
Finally, we have the primary use case of dynamic_cast: a downcast
from a pointer or reference to a base class to a pointer or reference
to a derived class. This generates CPU instructions that examine the
object being pointed to or referenced by the expression to cast. If
that object really is an instance of the destination type, or of a
class derived from it, and it contains only one sub-object of the base
class, which may not be the case with non-virtual inheritance, then
the cast succeeds with a pointer or reference to the derived class:
Player player;
player.Id = 123;
Combatant* p = &player;
Player* p2 = dynamic_cast<Player*>(p); // Downcast
DebugLog(p2->Id); // 123
This can also be used to perform a “sidecast” from one base class to
another base class:
struct RangedWeapon
{
float Range;
virtual ~RangedWeapon()
{
}
};
struct MagicWeapon
{
enum { FireType, WaterType, ArcaneType } Type;
};
struct Staff : RangedWeapon, MagicWeapon
{
const char* Name;
};
Staff staff;
staff.Name = "Staff of Freezing";
staff.Range = 10.0f;
staff.Type = MagicWeapon::WaterType;
Staff& staffRef = staff;
RangedWeapon& rangedRef = staffRef; // Implicit conversion upcasts
MagicWeapon& magicRef = dynamic_cast<MagicWeapon&>(rangedRef); // Sidecast
DebugLog(magicRef.Type); // 1
If neither the downcast nor the sidecast succeeds, the cast fails.
When pointers are being cast, the cast evaluates to a null pointer of
the destination type. If references are being cast, a std::bad_cast
exception is thrown:
struct Combatant
{
virtual ~Combatant()
{
}
};
struct Player : Combatant
{
int32_t Id;
};
struct Enemy : Combatant
{
int32_t Id;
};
// Make a Combatant: the base class
Combatant combatant;
Combatant* pc = &combatant;
Combatant& rc = combatant;
// Cast fails. Combatant object isn't a Player. Null returned.
Player* pp = dynamic_cast<Player*>(pc);
DebugLog(pp); // 0
try
{
// Cast fails. Combatant object isn't a Player. std::bad_cast thrown.
Player& rp = dynamic_cast<Player&>(rc);
DebugLog(rp.Id); // Never called
}
catch (std::bad_cast const &)
{
DebugLog("cast failed"); // Gets printed
}
This virtual table pointer can therefore also be used to identify the
class of an object since there is one virtual function table per class.
The inheritance hierarchy is then conceptually expressed as a tree of
virtual table pointers with implementation details varying by compiler.
Because all this RTTI data adds to the executable size, many
compilers allow it to be disabled. That also disables dynamic_cast as
it depends on RTTI.
typeid
There is one other use of RTTI: the typeid operator. It’s used to get
information about a type, similar to typeof or GetType in C#. The
operand can either be named statically like typeof in C# or
dynamically like GetType in C# to look up the type based on an
object’s value. The C++ Standard Library’s <typeinfo> header is
required to use this.
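As a minimal sketch of the two forms, using cut-down versions of the Combatant and Enemy structs and a made-up function name:

```cpp
#include <typeinfo>

struct Combatant
{
    virtual ~Combatant()
    {
    }
};

struct Enemy : Combatant
{
};

bool TypeidDemo()
{
    Enemy enemy;
    Combatant& ref = enemy;

    // Static: names a type, like typeof(Combatant) in C#
    const std::type_info& staticInfo = typeid(Combatant);

    // Dynamic: examines the object that ref refers to, like ref.GetType() in C#
    // This only works for classes with at least one virtual function
    const std::type_info& dynamicInfo = typeid(ref);

    // The object is really an Enemy, not just a Combatant
    return staticInfo != dynamicInfo && dynamicInfo == typeid(Enemy);
}
```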
Enemy* pe = nullptr;
try
{
// Doesn't dereference null
// Instead, attempts to get the type_info for what pe points to
std::type_info const & ti{typeid(*pe)};
// Not printed
DebugLog(ti.name());
}
catch (std::bad_typeid const &)
{
DebugLog("bad typeid call"); // Is printed
}
Another is that the name member function doesn’t return any specific
string. That string is also usually some compiler-specific code that
may or may not have the name of the type from the source code:
One more is that the std::type_info for one call might not be the
same object as the std::type_info for another call, even if they’re
the same type. The hash_code member function should be used
instead:
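A small sketch of that comparison, with a made-up function name:

```cpp
#include <typeinfo>

bool SameTypeByHash()
{
    const std::type_info& a = typeid(int);
    const std::type_info& b = typeid(int);

    // Don't compare addresses: &a == &b isn't guaranteed to be true
    // hash_code returns the same value for every type_info of the same type
    return a.hash_code() == b.hash_code();
}
```

Two different types could in theory share a hash code, so std::type_info also provides an == operator for an exact comparison.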
[]{ DebugLog("hi"); }
The first part ([]) is the list of captures, which we’ll go into deeply in
a bit. The second part ({ ... }) is the list of statements to execute
when the lambda is invoked.
Besides the capture list ([]) and the omission of an => after the
parameters list, this now looks just like a C# lambda. In the first form
that omitted the parameters list, the lambda simply takes no
parameters.
Note that, unlike all the named functions we’ve seen so far, there’s
no return type stated here. The return type is implicitly deduced by
the compiler by looking at the type of our return statements. That’s
just like we’ve seen before when declaring functions with an auto
return type or what we get in C#.
If we’d rather explicitly state the return type, we can do so with the
“trailing” return type syntax:
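For example, this sketch forces an int return type on a lambda that adds two float values. The function name is made up for illustration:

```cpp
int TrailingReturnDemo()
{
    // Without "-> int" the return type would be deduced as float
    auto add = [](float x, float y) -> int { return x + y; };

    // 1.5f + 2.25f is 3.75f, which is converted to the int 3
    return add(1.5f, 2.25f);
}
```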
Since it’s just a normal class, we can use it like a normal class. The
only difference is that we don’t know its name, so we have to use
auto for its type:
void Foo()
{
// Instantiate the lambda class. Equivalent to:
// LambdaClass lc;
auto lc = [](int x, int y){ return x + y; };
// Invoke the overloaded function call operator
DebugLog(lc(200, 300)); // 500
// Invoke the user-defined conversion operator to get a function pointer
int (*p)(int, int) = lc;
DebugLog(p(20, 30)); // 50
// Call the copy constructor
auto lc2{lc};
DebugLog(lc2(2, 3)); // 5
// Destructor of lc and lc2 called here
}
Default Captures
So far, our lambdas have always had an empty list of captures: []. In
C#, captures are always implicit. In C++, we have much more control
over what we capture and how we capture it.
To start, let’s look at the most C#-like kind of capture: [&]. This is a
“capture default” that says to the compiler “capture everything the
lambda uses as a reference.” Here’s how it looks:
int x = 123;
// Capture reference to x, not a copy of x
auto addX = [&](int val) { return x + val; };
// Modify x after the capture
x = 0;
// Invoke the lambda
// Lambda uses the reference to x, which is 0
DebugLog(addX(1)); // 1
int x = 123;
// Capture a copy of x, not a reference to x
auto addX = [=](int val) { return x + val; };
// Modify x after the capture
// Does not modify the lambda's copy
x = 0;
// Invoke the lambda
// Lambda uses the copy of x, which is 123
DebugLog(addX(1)); // 124
While it’s deprecated starting with C++20, it’s important to note that
[=] can implicitly capture a reference to the current object: *this.
Here’s one way that happens:
struct CaptureThis
{
int Val = 123;
auto GetLambda()
{
// Default capture mode is "copy"
// Lambda uses "this" which is outside the lambda
// "this" is copied to a CaptureThis*
return [=]{ DebugLog(this->Val); };
}
};
auto GetCaptureThisLambda()
{
// Instantiate the class on the stack
CaptureThis ct{};
// Get a lambda that's captured a pointer to "ct"
auto lambda = ct.GetLambda();
// Return the lambda. Calls the destructor for "ct".
return lambda;
}
void Foo()
{
// Get a lambda that's captured a pointer to "ct" which has had its
// destructor called and been popped off the stack
auto lambda = GetCaptureThisLambda();
// Dereference that captured pointer to "ct"
lambda(); // Undefined behavior: could do anything!
}
There are a few forms of individual capture. First up, we can simply
put a name:
int x = 123;
// Individually capture "x" by copy
auto addX = [x](int val)
{
// Use the copy of "x"
return x + val;
};
// Modify "x" after the capture
x = 0;
DebugLog(addX(1)); // 124
int x = 123;
// Individually capture "x" by copying it to a variable named "a"
auto addX = [a = x](int val)
{
// Use the copy of "x" via the "a" variable
return a + val;
};
// Modify "x" after the capture
x = 0;
DebugLog(addX(1)); // 124
The captured variable can even have the same name as what it
captures, similar to when we used just [x]:
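A sketch of that same-name form, with a made-up function name:

```cpp
int SameNameCapture()
{
    int x = 123;

    // The capture named "x" is initialized with a copy of the outer "x"
    auto addX = [x = x](int val)
    {
        // Uses the lambda's copy, not the outer variable
        return x + val;
    };

    // Modify the outer "x" after the capture
    x = 0;

    return addX(1); // 124
}
```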
int x = 123;
// Individually capture "x" by reference
auto addX = [&x](int val)
{
// Use the reference to "x"
return x + val;
};
// Modify "x" after the capture
x = 0;
DebugLog(addX(1)); // 1
int x = 123;
// Individually capture "x" by reference as a reference named "a"
auto addX = [&a = x](int val)
{
// Use the reference to "x" via "a"
return a + val;
};
// Modify "x" after the capture
x = 0;
DebugLog(addX(1)); // 1
Regardless of whether we capture by reference or by copy, we can
initialize using arbitrary expressions rather than simply the name of a
variable:
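Here is a sketch that initializes one capture by copy and one by reference from expressions. All of the names are made up for illustration:

```cpp
int ExpressionCapture()
{
    int x = 123;
    int values[3]{10, 20, 30};

    // By copy: "doubled" is initialized from an arithmetic expression
    // By reference: "second" refers to one element of the array
    auto sum = [doubled = x * 2, &second = values[1]](int val)
    {
        return doubled + second + val;
    };

    x = 0;         // No effect: "doubled" was computed at capture time
    values[1] = 5; // Visible through the "second" reference

    return sum(1); // 246 + 5 + 1 = 252
}
```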
We also have two ways to individually capture this. The first is just
[this] which captures this by reference:
struct CaptureThis
{
int Val = 123;
int Foo()
{
// Capture "this" by reference
auto lambda = [this]
{
// Use captured "this" reference
return this->Val;
};
// Modify "Val" after the capture
this->Val = 0;
// Invoke the lambda
// Uses reference to "this" which has a modified Val
return lambda();
}
};
CaptureThis ct{};
DebugLog(ct.Foo()); // 0
struct CaptureThis
{
int Val = 123;
int Foo()
{
// Capture "this" by copy
auto lambda = [*this]
{
// Use captured "this" copy
return this->Val;
};
// Modify "Val" after the capture
this->Val = 0;
// Invoke the lambda
// Uses copy of "*this" which has the original Val
return lambda();
}
};
CaptureThis ct{};
DebugLog(ct.Foo()); // 123
Captured Data Members
So what does it mean when a lambda “captures” something? Mostly,
it just means that data members are added to the lambda’s class
and initialized via its constructor. Say we have this lambda:
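As a rough sketch, here is a lambda next to a hand-written class that behaves similarly. The real class has no usable name, so LambdaClass is only a stand-in, and aggregate initialization is used in place of a constructor to keep the sketch short:

```cpp
int CapturedMembersDemo()
{
    int x = 123;

    // The lambda captures "x" by copy
    auto addX = [x](int val) { return x + val; };

    // Roughly what the compiler generates for the lambda
    struct LambdaClass
    {
        int x; // Data member added for the capture

        int operator()(int val) const
        {
            return x + val;
        }
    };

    // The data member is initialized when the lambda object is created
    LambdaClass lc{x};

    return addX(1) + lc(1); // 124 + 124 = 248
}
```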
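Since we may need control over the modifiers placed on the lambda
class’ function call operator, we can add keywords like mutable and
noexcept to the lambda. By default operator() is const, so captured
copies can’t be modified. Adding mutable removes that const, which has
the same effect as making the data members mutable: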
int x = 1;
// Compiler error
// LambdaClass::operator() is const and LambdaClass::x isn't mutable
auto lambda1 = [x](){ x = 2; };
// OK: LambdaClass::x is mutable
auto lambda2 = [x]() mutable { x = 2; };
int x = 123;
// Compiler error: can't individually capture by reference when the default
// capture mode is by reference
auto lambda = [&, &x]{ DebugLog(x); };
// Global scope...
// Compiler error: can't use default captures here
auto lambda1 = [=]{ DebugLog("hi"); };
auto lambda2 = [&]{ DebugLog("hi"); };
// Compiler error: can't use uninitialized captures here
auto lambda3 = [x]{ DebugLog(x); };
auto lambda4 = [&x]{ DebugLog(x); };
class Test
{
int Val = 123;
void Foo()
{
// Compiler error: member must be captured with an initializer
auto lambda1 = [Val]{ DebugLog(Val); };
auto lambda2 = [Val=Val]{ DebugLog(Val); }; // OK
auto lambda3 = [&Val=Val]{ DebugLog(Val); }; // OK
}
};
union
{
int32_t intVal;
float floatVal;
};
intVal = 123;
// Compiler error: can't capture an anonymous union member
auto lambda = [intVal]{ DebugLog(intVal); };
void Foo()
{
int x = 1;
auto outerLambda = [x]() mutable
{
DebugLog("outer", x);
x = 2;
auto innerLambda = [x]
{
DebugLog("inner", x);
};
innerLambda();
};
x = 3;
outerLambda(); // outer 1 inner 2
}
void Foo()
{
int x = 1;
auto outerLambda = [&x]() mutable
{
DebugLog("outer", x);
x = 2;
auto innerLambda = [&x]
{
DebugLog("inner", x);
};
innerLambda();
};
x = 3;
outerLambda(); // outer 3 inner 2
}
IILE
A common idiom in C++, seen above in the default function
argument example, is known as an Immediately-Invoked Lambda
Expression. We can use these in a variety of situations to work
around various language rules. For example, many C++
programmers strive to keep everything const that can be const. If,
however, the value to initialize a const variable to requires multiple
statements then it may be necessary to remove const. For example:
Command command;
switch (byteVal)
{
case 0:
command = Command::Clear;
break;
case 1:
command = Command::Restart;
break;
case 2:
command = Command::Enable;
break;
default:
DebugLog("Unknown command: ", byteVal);
command = Command::NoOp;
}
To get around this, we can use an IILE to wrap the switch. To do so,
we put parentheses around the lambda and then parentheses
afterward to immediately invoke it:
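A sketch of the same switch wrapped in an IILE follows. The Command enumeration and the ParseCommand function are assumptions made to keep the example self-contained, and the logging in the default case is left out:

```cpp
#include <cstdint>

// Assumed definition: only the enumerator names appear in the example above
enum class Command { Clear, Restart, Enable, NoOp };

Command ParseCommand(uint8_t byteVal)
{
    // The lambda is defined and immediately invoked,
    // so "command" can be const
    const Command command = ([byteVal]
    {
        switch (byteVal)
        {
            case 0: return Command::Clear;
            case 1: return Command::Restart;
            case 2: return Command::Enable;
            default: return Command::NoOp;
        }
    })();

    return command;
}
```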
The compiler will then create an instance of the lambda class that’s
destroyed at the end of the statement. The overhead of the
constructor and destructor will be optimized away, effectively making
the IILE and the const it enables “free.”
C# Equivalency
We’ve compared C++ lambdas to C# lambdas a little so far, but let’s
take a closer look. First, we’ve seen that only “statement lambdas”
are supported in C++. We can’t write a C# “expression lambda” like
this:
// C#
(int x, int y) => x + y
Similarly, C++ return types may be auto and that is in fact the default
when a trailing return type like -> float isn’t used. C# lambdas must
always have an implicit return type. To force it, a cast is typically
used within the body of the lambda:
// C#
(float x, float y) => { return (int)(x + y); };
On the other hand, C# is more explicit than C++ when storing the
lambda in a variable as var cannot be used:
// C#
Func<int, int, int> f1 = (int x, int y) => { return x + y;}; // OK
var f2 = (int x, int y) => { return x + y;}; // Compiler error
// C#
Func<int, int, int> f = (int x, int _) => { return x; }; // Discard y
int x = 123;
// Capture nothing
// Compiler error: can't access x
auto lambda1 = []{ DebugLog(x); };
// Capture implicitly by copy
auto lambda2 = [=]{ DebugLog(x); };
// Capture implicitly by reference
auto lambda3 = [&]{ DebugLog(x); };
// Capture explicitly by copy
auto lambda4 = [x]{ DebugLog(x); };
// Capture explicitly by reference
auto lambda5 = [&x]{ DebugLog(x); };
C# forbids capturing in, ref, and out variables. C++ references and
pointers, the closest match to C#, can be freely captured in a variety
of ways.
Instead, C++ lambdas are just regular C++ classes. They have
constructors, assignment operators, destructors, overloaded
operators, and user-defined conversion operators. As such, they
behave like other C++ class objects rather than as managed,
garbage-collected C# classes.
Conclusion
Lambdas in both languages fulfill a similar role: to provide unnamed
functions. Aside from async lambdas in C#, the C++ version of
lambdas offers a much broader feature set. The two languages’
approaches diverge as C# makes the trade-off in favor of safety by
making lambdas be managed delegates. C++ takes the low, or often
zero, overhead approach of using regular classes at the cost of
possible bugs such as dangling pointers and references.
23. Compile-Time Programming
Constant Variables
C# has const fields of classes and structs. They must be
immediately initialized by a constant expression, which is one that’s
evaluated at compile time. Their type must be a primitive like int and
float or string. A const is implicitly static and readonly.
struct MathConstants
{
// Constant member variable
// Implicitly `const`
// Not implicitly `static`. Need to add the keyword.
static constexpr float PI = 3.14f;
};
// Constant global variable
constexpr int32_t numGamesPlayed = 0;
void Foo()
{
// Constant local variable
constexpr int32_t expansionMultiplier = 3;
// Constant reference
// Need to add `const` because `numGamesPlayed` is implicitly `const`
constexpr int32_t const& ngp = numGamesPlayed;
// Constant array
// Implicitly `const` with `const` elements
constexpr int32_t exponentialBackoffDelays[4] = { 100, 200, 400, 800 };
}
struct NonLiteralType
{
// Delete the copy and move constructors
NonLiteralType(const NonLiteralType&) = delete;
NonLiteralType(const NonLiteralType&&) = delete;
private:
// Add a private non-static data member so it's not an aggregate
int32_t Val;
};
// Compiler error: not a "literal type"
constexpr NonLiteralType nlt{};
struct NonLiteralType
{
// Destructor isn't constexpr
~NonLiteralType()
{
}
};
// Compiler error: NonLiteralType doesn't have a constexpr destructor
constexpr NonLiteralType nlt{};
struct LiteralTypeA
{
// Explicitly compiler-generated destructor is constexpr
~LiteralTypeA() = default;
};
struct LiteralTypeB
{
// Implicitly compiler-generated destructor is constexpr
};
// OK
constexpr LiteralTypeA lta{};
constexpr LiteralTypeB ltb{};
Third, if it’s a union then at least one of its non-static data members
must be a “literal type.” If it’s not a union, all of its non-static data
members and all the data members of its base classes must be
“literal types.”
union NonLiteralUnion
{
NonLiteralType nlt1;
NonLiteralType nlt2;
};
// Compiler error: all of the union's non-static data members are non-literal
constexpr NonLiteralUnion nlu{};
struct NonLiteralStruct
{
NonLiteralType nlt1;
int32_t Val; // Primitives are literal types
};
// Compiler error: not all of the struct's non-static data members are literal
constexpr NonLiteralStruct nls{};
struct Vector2
{
float X;
float Y;
};
// Constant class instance
constexpr Vector2 ORIGIN{0, 0};
DebugLog(6); // 6
First, all of their parameters and their return type must be “literal
types.”
struct NonLiteralBase
{
};
struct NonLiteralCtor : virtual NonLiteralBase
{
// Compiler error: constructor can't be constexpr with virtual base classes
constexpr NonLiteralCtor()
{
}
};
struct NonLiteralDtor : virtual NonLiteralBase
{
// Compiler error: destructor can't be constexpr with virtual base classes
constexpr ~NonLiteralDtor()
{
}
};
Third, the body of the function can’t have any goto statements or
labels except case and default in switch statements:
constexpr int32_t GotoLoop(int32_t n)
{
int32_t sum = 0;
int32_t i = 1;
// Compiler error: constexpr function can't have non-case, non-default label
beginLoop:
if (i > n)
{
// Compiler error: constexpr function can't have goto
goto endLoop;
}
sum += i;
++i;
// Compiler error: constexpr function can't have goto
goto beginLoop;
// Compiler error: constexpr function can't have non-case, non-default label
endLoop:
return sum;
}
For example, say we want to set the minimum severity level that is
logged at compile time and not have to check it every time a log
message is written. We could use a compiler-defined preprocessor
symbol (LOG_LEVEL), a constexpr string equality function
(IsStrEqual), and if constexpr to either log or not log:
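One way this could look is sketched below. The LOG_LEVEL and IsStrEqual names come from the text, but their definitions here are assumptions, as are the LogDebug and LogWarn functions. A fallback value for LOG_LEVEL is defined so the sketch compiles without any compiler flags:

```cpp
#include <cstdio>

// Assumption: normally set by the build system, e.g. -DLOG_LEVEL="\"WARN\""
#ifndef LOG_LEVEL
#define LOG_LEVEL "WARN"
#endif

// Compile-time string equality for null-terminated strings
constexpr bool IsStrEqual(const char* a, const char* b)
{
    while (*a != '\0' && *a == *b)
    {
        ++a;
        ++b;
    }
    return *a == *b;
}

// Returns whether the message was logged
bool LogDebug(const char* message)
{
    // The condition is evaluated at compile time, not on every call
    if constexpr (IsStrEqual(LOG_LEVEL, "DEBUG"))
    {
        std::puts(message);
        return true;
    }
    else
    {
        return false;
    }
}

bool LogWarn(const char* message)
{
    if constexpr (IsStrEqual(LOG_LEVEL, "DEBUG") ||
                  IsStrEqual(LOG_LEVEL, "WARN"))
    {
        std::puts(message);
        return true;
    }
    else
    {
        return false;
    }
}
```

With the fallback level of "WARN", LogDebug does nothing and LogWarn prints its message.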
static_assert(IsStrEqual(LOG_LEVEL, "DEBUG") ||
IsStrEqual(LOG_LEVEL, "WARN") ||
IsStrEqual(LOG_LEVEL, "ERROR"),
"Invalid log level: " LOG_LEVEL);
struct PlayerUpdatePacket
{
int32_t PlayerId;
float PositionX;
float PositionY;
float PositionZ;
float VelocityX;
float VelocityY;
float VelocityZ;
int32_t NumLives;
};
static_assert(
sizeof(PlayerUpdatePacket) == 32,
"PlayerUpdatePacket serializes to wrong size");
Since C++17, we can omit the error message if we don’t think it’s
helpful:
static_assert(sizeof(PlayerUpdatePacket) == 32);
Constant Expressions
Now that we have a wealth of compile-time features in constexpr
variables, constexpr functions, if constexpr, and static_assert,
we need to know when we’re allowed to use these and when we’re
not. Clearly literals like 32 are constant expressions and calls to
rand() to get a random number are not, but many cases are not so
clear-cut.
struct Test
{
int32_t Val{123};
int32_t GetVal()
{
// Explicitly use the this pointer
// Compiler error: can't use `this` outside of constexpr member function
constexpr int32_t val = this->Val;
return val;
}
};
The same goes for lambdas referencing captured objects, since they
are effectively accessed via the this pointer to the lambda class:
struct Test
{
int32_t Val{123};
// User-defined conversion operator to int32_t
operator int32_t()
{
return Val;
}
};
// Try to make a Test and convert it to an int32_t
// Compiler error: call to Test's conversion operator, which isn't constexpr
constexpr int32_t val = Test{};
struct Base
{
constexpr virtual int32_t GetVal()
{
return 1;
}
};
struct Derived : Base
{
constexpr virtual int32_t GetVal() override
{
return 123;
}
};
// Class begins lifetime before the constant expression
Derived d{};
// Compiler error: can't call constexpr virtual function on class that began its
// lifetime before the constant expression
constexpr int32_t val = d.GetVal();
// OK: Derived object begins lifetime during constant expression
constexpr int32_t val2 = Derived{}.GetVal();
struct HasVal
{
int32_t Val{123};
};
HasVal hv{};
constexpr HasVal const& rhv{hv};
constexpr int32_t ri{rhv.Val};
constexpr HasVal hv2{};
constexpr HasVal const& rhv2{hv2};
constexpr int32_t ri2{rhv2.Val};
Seventh, only the active member of a union can be used. Accessing
non-active members is a form of undefined behavior that’s
commonly allowed by compilers in run-time code but disallowed in
compile-time code.
union IntFloat
{
int32_t Int;
float Float;
constexpr IntFloat()
{
// Make Int the active member
Int = 123;
}
};
// Call constexpr constructor which makes Int active
constexpr IntFloat u{};
// OK: can use Int because it's the active member
constexpr int32_t i = u.Int;
// Compiler error: can't use Float because it's not the active member
constexpr float f = u.Float;
The compiler-generated copy or move constructor or assignment
operator of a union whose active member is mutable is also
disallowed:
union IntFloat
{
mutable int32_t Int;
float Float;
constexpr IntFloat()
{
Int = 123;
}
};
constexpr IntFloat u{};
// Compiler error: can't copy a union whose active member is mutable
constexpr IntFloat u2{u};
int32_t i = 123;
int32_t* pi1 = &i;
int32_t* pi2 = &i;
// Compiler error: can't compare pointers in a constant expression
constexpr bool b = pi1 < pi2;
Thirteenth, while we can catch exceptions, we can’t throw them:
struct Combatant
{
virtual int32_t GetMaxHealth() = 0;
};
struct Enemy : Combatant
{
virtual int32_t GetMaxHealth() override
{
return 100;
}
};
struct TutorialEnemy : Combatant
{
virtual int32_t GetMaxHealth() override
{
return 10;
}
};
Enemy e;
Combatant& c{e};
// Compiler error: can't call dynamic_cast in a constant expression if it
// will throw an exception
constexpr TutorialEnemy& te{dynamic_cast<TutorialEnemy&>(c)};
Enemy* p = nullptr;
// Compiler error: can't call typeid in a constant expression if it will
// throw an exception
constexpr auto name = typeid(*p).name();
// C#
static void Assert(bool condition)
{
#if DEBUG && (ASSERTIONS_ENABLED == true)
if (!condition)
{
throw new Exception("Assertion failed");
}
#endif
}
If the #if expression evaluates to false then the code between the
#if and #endif is removed:
// C#
static void Assert(bool condition)
{
}
// C#
void Foo()
{
#if Foo
DebugLog("Foo exists");
#else
DebugLog("Foo does not exist"); // Gets printed
#endif
}
There are a couple ways to avoid this. First, we can use the
preprocessor defined operator to check whether the symbol is
defined instead of checking its value:
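A sketch of the difference, using a hypothetical ENABLE_FOO symbol that is defined but has the value 0:

```cpp
// Hypothetical symbol: defined, but with a value of 0
#define ENABLE_FOO 0

bool IsFooSymbolDefined()
{
// Only asks whether ENABLE_FOO has been defined at all
#if defined(ENABLE_FOO)
    return true;
#else
    return false;
#endif
}

bool IsFooValueTrue()
{
// Asks whether the value of ENABLE_FOO is non-zero
// An undefined symbol would also be treated as 0 here
#if ENABLE_FOO
    return true;
#else
    return false;
#endif
}
```

The shorter #ifdef ENABLE_FOO directive is equivalent to #if defined(ENABLE_FOO).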
C# requires #define and #undef to appear only at the top of the file,
but C++ allows them anywhere.
To see more clearly how bugs arise, consider this macro call:
void Foo()
{
int32_t i = 1;
int32_t result = square(++i);
// After preprocessing, the previous line becomes:
// int32_t result = ++i*++i;
DebugLog(result, i); // Possibly 6, 3
}
Again, the argument (++i) isn’t evaluated before the macro call but
rather just repeated every time the macro refers to the parameter.
This means i is incremented twice, from 1 to 2 and then to 3, so the
multiplication (*) may produce 2*3=6 with i ending up as 3. Worse,
modifying i twice in one expression without sequencing is undefined
behavior, so another compiler may just as well produce 3*3=9. If this
were a function call, we’d expect 2*2=4 and for the value of i to be 2
afterward. These potential bugs are one reason why macros are
discouraged.
Name                               Value           Meaning
__STDC_HOSTED__                    0 or 1          1 if there is an OS, 0 if not
__FILE__                           "mycode.cpp"    Name of the current file
__LINE__                           38              Current line number
__DATE__                           "Oct 26 2020"   Date the code was compiled
__TIME__                           "02:00:00"      Time the code was compiled
__STDCPP_DEFAULT_NEW_ALIGNMENT__   8               Default alignment of new. Only in C++17 and up.
Since C++20, there are a ton of "feature test" macros available in the
<version> header file. These are all object-like and their values are
the date that the language or Standard Library feature was added to
C++. The intention is to compare them to __cplusplus to determine
whether the feature is supported or not. Keep in mind that a feature
test macro is only defined when its feature is available, so it's
often checked with #ifdef or #if first. There are way too many to
list here, but the following shows a couple in action:
void Foo()
{
if (__cplusplus >= __cpp_char8_t)
{
DebugLog("char8_t is supported in the language");
}
else
{
DebugLog("char8_t is NOT supported in the language");
}
if (__cplusplus >= __cpp_lib_byte)
{
DebugLog("std::byte is supported in the Standard Library");
}
else
{
DebugLog("std::byte is NOT supported in the Standard Library");
}
}
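Since an unsupported feature's macro isn't defined at all, a preprocessor check is a common alternative to a run-time if. The following is only a sketch: the Utf8Char alias, the HasChar8 constant, and the unsigned char fallback are made up for illustration:

```cpp
// <version> only exists in newer Standard Library implementations
#if __has_include(<version>)
#include <version>
#endif

// Test that the macro is defined before relying on the feature
// 201811L is the date char8_t was added to the language
#if defined(__cpp_char8_t) && __cpp_char8_t >= 201811L
using Utf8Char = char8_t;
constexpr bool HasChar8 = true;
#else
using Utf8Char = unsigned char; // Fallback chosen for this sketch
constexpr bool HasChar8 = false;
#endif

static_assert(sizeof(Utf8Char) == 1);
```

This way the code still compiles when the feature is missing, because the unsupported branch is removed before the compiler ever sees it.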
void Foo()
{
DebugLog(__FILE__, __LINE__); // main.cpp, 38
#line 100
DebugLog(__FILE__, __LINE__); // main.cpp, 100
#line 200 "custom.cpp"
DebugLog(__FILE__, __LINE__); // custom.cpp, 200
}
#ifndef _MSC_VER
#error Only Visual Studio is supported
#endif
// mathutils.h
// Compiler-specific alternative to header guards
#pragma once
float SqrMagnitude(const Vector2& vec)
{
return vec.X*vec.X + vec.Y*vec.Y;
}
_Pragma("once")
// Before C++11
#define PI 3.14f
// After C++11
constexpr float PI = 3.14f;
// Before C++11
// This isn't usable in many contexts like Foo(EXPONENTIAL_BACKOFF_TIMES)
#define EXPONENTIAL_BACKOFF_TIMES { 1000, 2000, 4000, 8000, 16000 }
// After C++11
// This works like any array object:
constexpr int32_t ExponentialBackoffTimes[] = { 1000, 2000, 4000, 8000, 16000 };
Likewise, constexpr and consteval functions have removed a lot of
the need for function-like macros:
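As a sketch of that replacement, here is a squaring macro next to a constexpr function. The names SQUARE_MACRO, Square, and SquareOfIncrement are made up for illustration:

```cpp
#include <cstdint>

// Before C++11: text substitution, so the argument may be evaluated twice
#define SQUARE_MACRO(x) ((x) * (x))

// After C++11: a real function. The argument is evaluated exactly once
// and the call can still happen at compile time.
constexpr int32_t Square(int32_t x)
{
    return x * x;
}

// Evaluated by the compiler
static_assert(Square(4) == 16);

int32_t SquareOfIncrement()
{
    int32_t i = 1;
    // i is incremented once to 2, then 2 is squared
    return Square(++i); // 4
}
```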
Still, macros provide a sort of escape hatch for when we simply can't
express something without raw textual substitution. The PROP macro
example above generates members with access specifiers. There's
no way to do that otherwise. That example might not be the best
idea, but others really are. A classic example is an assertion macro:
// When assertions are enabled, define ASSERT as a macro that tests a boolean
// and logs and terminates the program when it's false.
#ifdef ENABLE_ASSERTS
#define ASSERT(x) \
if (!(x)) \
{ \
DebugLog("assertion failed"); \
std::terminate(); \
}
// When assertions are disabled, assert does nothing
#else
#define ASSERT(x)
#endif
bool IsSorted(const float* vals, int32_t length)
{
for (int32_t i = 1; i < length; ++i)
{
if (vals[i] < vals[i-1])
{
return false;
}
}
return true;
}
float GetMedian(const float* vals, int32_t length)
{
ASSERT(vals != nullptr);
ASSERT(length > 0);
ASSERT(IsSorted(vals, length));
if ((length & 1) == 1)
{
return vals[length / 2]; // odd
}
float a = vals[length / 2 - 1];
float b = vals[length / 2];
return (a + b) / 2;
}
void Foo()
{
float oddVals[] = { 1, 3, 3, 6, 7, 8, 9 };
DebugLog(GetMedian(oddVals, 7));
float evenVals[] = { 1, 2, 3, 4, 5, 6, 8, 9 };
DebugLog(GetMedian(evenVals, 8));
DebugLog(GetMedian(nullptr, 1));
float emptyVals[1] = {}; // Standard C++ has no zero-length arrays
DebugLog(GetMedian(emptyVals, 0));
float notSortedVals[] = { 3, 2, 1 };
DebugLog(GetMedian(notSortedVals, 3));
}
ASSERT(IsSorted(vals, length));
// Becomes:
if (!(IsSorted(vals, length)))
{
DebugLog("assertion failed");
std::terminate();
}
ASSERT(IsSorted(vals, length));
// Becomes:
#ifdef ENABLE_ASSERTS
constexpr void ASSERT(bool x)
{
if (!x)
{
DebugLog("assertion failed");
std::terminate();
}
}
#else
constexpr void ASSERT(bool x)
{
}
#endif
ASSERT(IsSorted(vals, length));
// Is equivalent to:
bool x = IsSorted(vals, length);
ASSERT(x); // does nothing
The compiler might be able to determine that the call to IsSorted
has no side effects and can be safely removed. In many cases, it
won't be able to make this determination and an expensive call to
IsSorted will still take place. We don't want this to happen, so we
use a macro.
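The difference can be sketched with a call counter. This is only an illustration: ASSERT_DISABLED, AssertFunction, and CountedCheck are hypothetical names standing in for the disabled ASSERT macro, the disabled ASSERT function, and an expensive check such as IsSorted.

```cpp
// Hypothetical disabled macro: its argument is discarded by the preprocessor
#define ASSERT_DISABLED(x) ((void)0)

// Hypothetical disabled function: the caller still evaluates the argument
constexpr void AssertFunction(bool)
{
}

// Stand-in for an expensive check; counts how many times it runs
constexpr bool CountedCheck(int& calls)
{
    ++calls;
    return true;
}

constexpr int CallsWithMacro()
{
    int calls = 0;
    ASSERT_DISABLED(CountedCheck(calls)); // expands to ((void)0)
    return calls;
}

constexpr int CallsWithFunction()
{
    int calls = 0;
    AssertFunction(CountedCheck(calls)); // CountedCheck runs first
    return calls;
}

static_assert(CallsWithMacro() == 0);    // the check never ran
static_assert(CallsWithFunction() == 1); // the check ran once
```

With the macro, the expression never reaches the compiler, so there is nothing left to optimize away.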
There are many more reasons that we'll get into when we cover
templates later in the book.
Conclusion
The two languages have a lot of overlap in their use of the
preprocessor. It runs at the same stage of compilation and features
many identically-named directives with the same functionality.
With this in mind, let’s start looking at the kinds of templates we can
make.
Variables
Perhaps the simplest form of template is a template for a variable.
Consider the case of defining π:
template<typename T>
constexpr T PI = 3.14;
After the template we have the thing that the template creates when
instantiated. In this case we have a variable named PI. It has access
to the template parameters, which in this case is just T. Here we use
T as the type of the variable: T PI. Just like other variables, we’re
free to make it constexpr and initialize it: = 3.14. We could also
make it a pointer, a reference, const, or use other forms of
initialization like {3.14}.
float pi = PI<float>;
DebugLog(pi); // 3.14
When the compiler sees this, it looks at the template (PI) and
matches up our template arguments (float) to the template
parameters (T). It then substitutes the arguments wherever they’re
used in the templated entity. In this case T is replaced with float, so
we get this:
float pi = PI;
DebugLog(pi); // 3.14
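As a further sketch, a variable template can be instantiated with several types and used from a function template. The names Pi and CircleArea are hypothetical, and the extra digits are an assumption made for the example rather than part of the PI template above.

```cpp
// Hypothetical variable template with more digits than 3.14
template<typename T>
constexpr T Pi = T(3.14159265358979323846L);

// Function template that uses the matching instantiation of Pi
template<typename T>
constexpr T CircleArea(T radius)
{
    return Pi<T> * radius * radius;
}

// Pi<int> truncates the fractional part
static_assert(Pi<int> == 3);

// 3 * 2 * 2 with T=int
static_assert(CircleArea<int>(2) == 12);

// Roughly 3.14159 with T=double
static_assert(CircleArea<double>(1.0) > 3.14);
static_assert(CircleArea<double>(1.0) < 3.15);
```

Each distinct template argument produces its own variable, so Pi<int> and Pi<double> hold different values.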
Looking back at the original example where we had double and
int32_t versions of π, we can now replace those with uses of the PI
template:
// Function template
template<typename T>
T Max(T a, T b)
{
return a > b ? a : b;
}
// int version
int maxi = Max<int>(2, 4);
DebugLog(maxi); // 4
// float version
float maxf = Max<float>(2.2f, 4.4f);
DebugLog(maxf); // 4.4
// double version
double maxd = Max<double>(2.2, 4.4);
DebugLog(maxd); // 4.4
Then the three function calls that caused this instantiation are
replaced by calls to the instantiated functions:
// int version
int maxi = MaxInt(2, 4);
DebugLog(maxi); // 4
// float version
float maxf = MaxFloat(2.2f, 4.4f);
DebugLog(maxf); // 4.4
// double version
double maxd = MaxDouble(2.2, 4.4);
DebugLog(maxd); // 4.4
Also, as with any template, we’re not limited to just primitive types.
We can use any type:
struct Vector2
{
float X;
float Y;
bool operator>(const Vector2& other) const
{
return X > other.X && Y > other.Y;
}
};
// Vector2 version
Vector2 maxv = Max<Vector2>(Vector2{4, 6}, Vector2{2,
4});
DebugLog(maxv.X, maxv.Y); // 4, 6
The implication here is that a template places prerequisites on its
parameters. The Max template requires that there’s a T > T operator
available. That’s definitely satisfied by int, float, and double, but
we needed to write an overloaded Vector2 > Vector2 operator in
order for it to work with Vector2. Without this operator, we’d get a
compiler error:
struct Vector2
{
float X;
float Y;
};
template <typename T>
T Max(T a, T b)
{
// Compiler error:
// "Invalid operands to binary expression (Vector2 and Vector2)"
return a > b ? a : b;
}
Vector2 maxv = Max<Vector2>(Vector2{4, 6}, Vector2{2,
4});
DebugLog(maxv.X, maxv.Y); // 4, 6
template<typename T>
struct Vector2
{
T X;
T Y;
T Dot(const Vector2<T>& other) const
{
return X*other.X + Y*other.Y;
}
};
struct Vector2Float
{
float X;
float Y;
float Dot(const Vector2Float& other) const
{
return X*other.X + Y*other.Y;
}
};
struct Vector2Double
{
double X;
double Y;
double Dot(const Vector2Double& other) const
{
return X*other.X + Y*other.Y;
}
};
struct Vector2Int32
{
int32_t X;
int32_t Y;
int32_t Dot(const Vector2Int32& other) const
{
return X*other.X + Y*other.Y;
}
};
Then the usages of the class template are replaced with usages of
these instantiated classes:
Note that the Dot calls are particularly compact and valid only
because of several rules we’ve seen so far. Take the case of v2f
which is a Vector2<float>. When we call v2f.Dot({1, 0}), the
compiler looks at the Dot it instantiated as part of the Vector2
template. That Dot takes a const Vector2<float>& parameter, so the
compiler interprets {1, 0} as aggregate initialization of a
Vector2<float>. Because {1, 0} doesn’t have a name, that
Vector2<float> is an rvalue. It can be passed to a const lvalue
reference and its lifetime is extended until after Dot returns.
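A minimal sketch of that call, made constexpr so the result can be checked at compile time. DotWithUnitX is a hypothetical helper added for the example.

```cpp
template<typename T>
struct Vector2
{
    T X;
    T Y;

    constexpr T Dot(const Vector2<T>& other) const
    {
        return X*other.X + Y*other.Y;
    }
};

// {1, 0} aggregate-initializes a temporary Vector2<float> that binds to
// the const lvalue reference parameter of Dot
constexpr float DotWithUnitX(const Vector2<float>& vec)
{
    return vec.Dot({1, 0});
}

// 2*1 + 4*0
static_assert(DotWithUnitX(Vector2<float>{2, 4}) == 2.0f);
```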
struct Vector2
{
float X;
float Y;
template<typename T>
bool IsNearlyZero(T threshold) const
{
return X < threshold && Y < threshold;
}
};
struct Vector2
{
float X;
float Y;
bool IsNearlyZeroFloat(float threshold) const
{
return X < threshold && Y < threshold;
}
bool IsNearlyZeroDouble(double threshold) const
{
return X < threshold && Y < threshold;
}
bool IsNearlyZeroInt(int threshold) const
{
return X < threshold && Y < threshold;
}
};
Then the calls to the member function template are replaced with
calls to the instantiated functions:
Vector2 vec{0.5f, 0.5f};
// Float
DebugLog(vec.IsNearlyZeroFloat(0.6f)); // true
// Double
DebugLog(vec.IsNearlyZeroDouble(0.1)); // false
// Int
DebugLog(vec.IsNearlyZeroInt(1)); // true
struct HealthRange
{
template<typename T>
constexpr static T Min = 0;
template<typename T>
constexpr static T Max = 100;
};
They’re used the same way other static member variables are:
struct HealthRange
{
constexpr static float MinFloat = 0;
constexpr static int32_t MaxInt = 100;
};
And then replaces the member variable template usage with these
instantiated member variables:
struct Math
{
template<typename T>
struct Vector2
{
T X;
T Y;
};
};
These can then be used like normal member classes:
struct Math
{
struct Vector2Float
{
float X;
float Y;
};
struct Vector2Double
{
double X;
double Y;
};
};
Math::Vector2Float v2f{2, 4};
DebugLog(v2f.X, v2f.Y); // 2, 4
Math::Vector2Double v2d{2, 4};
DebugLog(v2d.X, v2d.Y); // 2, 4
struct Madd
{
// Abbreviated function template
auto operator()(auto x, auto y, auto z) const
{
return x*y + z;
}
};
Then the two calls to its operator() cause the abbreviated function
template to be instantiated into an overload set:
struct Madd
{
float operator()(float x, float y, float z) const
{
return x*y + z;
}
int operator()(int x, int y, int z) const
{
return x*y + z;
}
};
The compiler will then replace the lambda syntax with instantiation of
these classes and calls to their operator():
Madd madd{};
// Call (float, float, float) overload of operator()
DebugLog(madd(2.0f, 3.0f, 4.0f)); // 10
// Call (int, int, int) overload of operator()
DebugLog(madd(2, 3, 4)); // 10
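For comparison, here is a sketch of a generic lambda next to a hand-written class with a templated operator(). MaddLambda and MaddClass are hypothetical names; real closure types are unnamed and their exact shape is up to the compiler. This assumes C++17, where lambdas can be used in constant expressions.

```cpp
// Generic lambda: 'auto' parameters make its operator() a template
constexpr auto MaddLambda = [](auto x, auto y, auto z)
{
    return x*y + z;
};

// Roughly the kind of class a compiler could generate for that lambda
struct MaddClass
{
    template<typename TX, typename TY, typename TZ>
    constexpr auto operator()(TX x, TY y, TZ z) const
    {
        return x*y + z;
    }
};

// (int, int, int)
static_assert(MaddLambda(2, 3, 4) == 10);

// (float, float, float)
static_assert(MaddClass{}(2.0f, 3.0f, 4.0f) == 10.0f);

// Each 'auto' is deduced independently: (int, double, int)
static_assert(MaddLambda(2, 3.5, 1) == 8.0);
```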
// C#
public static class Wrapper<T>
{
// Variable
public static readonly T Default = default(T);
// Function
public static string SafeToString(T obj)
{
return object.ReferenceEquals(obj, null) ? "" :
obj.ToString();
}
}
// Variable
DebugLog(Wrapper<float>.Default); // 0
// Function
DebugLog(Wrapper<Player>.SafeToString(new Player())); // "Player 1"
DebugLog(Wrapper<Player>.SafeToString(null)); // ""
Let’s try porting one of the above Vector2 examples from C++ to C#:
template<typename T>
struct Vector2
{
T X;
T Y;
T Dot(const Vector2<T>& other) const
{
return X*other.X + Y*other.Y;
}
};
First, a literal translation of the syntax looks like this:
// C#
public struct Vector2<T>
{
public T X;
public T Y;
public T Dot(in Vector2<T> other)
{
return X*other.X + Y*other.Y;
}
}
// C#
public interface IArithmetic<T>
{
T Multiply(T a, T b);
T Add(T a, T b);
}
// C#
public struct FloatArithmetic : IArithmetic<float>
{
public float Multiply(float a, float b)
{
return a * b;
}
public float Add(float a, float b)
{
return a + b;
}
}
Now we can pass a FloatArithmetic to Vector2 for it to call
IArithmetic.Multiply and IArithmetic.Add on instead of the built-
in * and + operators:
// C#
public struct Vector2<T, TArithmetic>
where TArithmetic : IArithmetic<T>
{
public T X;
public T Y;
private TArithmetic Arithmetic;
public Vector2(T x, T y, TArithmetic arithmetic)
{
X = x;
Y = y;
Arithmetic = arithmetic;
}
public T Dot(Vector2<T, TArithmetic> other)
{
T xProduct = Arithmetic.Multiply(X, other.X);
T yProduct = Arithmetic.Multiply(Y, other.Y);
return Arithmetic.Add(xProduct, yProduct);
}
}
Here’s how we’d use this:
// C#
var vecA = new Vector2<float, FloatArithmetic>(1, 0,
default);
var vecB = new Vector2<float, FloatArithmetic>(0, 1,
default);
DebugLog(vecA.Dot(vecB)); // 0
While this design works, it’s created several problems. First, we have
a lot of boilerplate in IArithmetic, FloatArithmetic, extra type
arguments to the generics (<T, TArithmetic> instead of just <T>),
and an extra arithmetic parameter to the constructor. That’s a hit to
productivity and readability, but at least not a concern that translates
much to the executable the compiler generates.
The second issue is that our Vector2 has increased in size since it
includes an Arithmetic field. That’s a managed reference to an
IArithmetic. On a 64-bit CPU, it’ll take up at least 8 bytes. Since X
and Y both take up 4 bytes, the size of Vector2 has doubled. This will
impact memory usage and, perhaps more importantly, cache
utilization as only half as many vectors can now fit in a cache line.
// C#
public interface INumeric<T>
{
INumeric<T> Create(T val);
T Value { get; set; }
INumeric<T> Multiply(T val);
INumeric<T> Add(T val);
}
public class FloatNumeric : INumeric<float>
{
public float Value { get; set; }
public FloatNumeric(float val)
{
Value = val;
}
public INumeric<float> Create(float val)
{
return new FloatNumeric(val);
}
public INumeric<float> Multiply(float val)
{
return Create(Value * val);
}
public INumeric<float> Add(float val)
{
return Create(Value + val);
}
}
Vector2 can now hold INumeric fields and call its virtual functions:
// C#
public struct Vector2<T, TNumeric>
where TNumeric : INumeric<T>
{
public TNumeric X;
public TNumeric Y;
public Vector2(TNumeric x, TNumeric y)
{
X = x;
Y = y;
}
public T Dot(Vector2<T, TNumeric> other)
{
INumeric<T> xProduct = X.Multiply(other.X.Value);
INumeric<T> yProduct = Y.Multiply(other.Y.Value);
INumeric<T> sum = xProduct.Add(yProduct.Value);
return sum.Value;
}
}
var vecA = new Vector2<float, FloatNumeric>(
new FloatNumeric(1),
new FloatNumeric(0)
);
var vecB = new Vector2<float, FloatNumeric>(
new FloatNumeric(0),
new FloatNumeric(1)
);
DebugLog(vecA.Dot(vecB)); // 0
// C#
public struct Vector2Float
{
public float X;
public float Y;
public float Dot(in Vector2Float other)
{
return X*other.X + Y*other.Y;
}
}
public struct Vector2Double
{
public double X;
public double Y;
public double Dot(in Vector2Double other)
{
return X*other.X + Y*other.Y;
}
}
public struct Vector2Int
{
public int X;
public int Y;
public int Dot(in Vector2Int other)
{
return X*other.X + Y*other.Y;
}
}
var vecA = new Vector2Float{X=1, Y=0};
var vecB = new Vector2Float{X=0, Y=1};
DebugLog(vecA.Dot(vecB)); // 0
This is efficient, but it now suffers from all the usual issues with
code duplication: the need to change many copies, bugs when the copies
get out of sync, etc. To address this, we may turn to a code
generation tool that uses some form of templates to generate .cs
files. This may be run in an earlier build step, but it won’t be
integrated into the main codebase, may require additional
languages, still requires unique naming, and brings a variety of other
issues.
C++ avoids the code duplication, the external tools, the virtual
function calls, the cold memory reads, the boxing and garbage
collection, the need for an interface and boilerplate implementation
of it, the extra type parameters, and the where constraints. Instead, it
simply produces a compiler error when T*T or T+T isn’t a valid expression.
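To make that concrete, here is a sketch in which the same Dot works for a built-in type and for a user-defined type that only supplies the two operators. Score and Name are hypothetical types invented for the example.

```cpp
template<typename T>
struct Vector2
{
    T X;
    T Y;

    constexpr T Dot(const Vector2<T>& other) const
    {
        return X*other.X + Y*other.Y;
    }
};

// Hypothetical user-defined type: no interface, just the operators Dot needs
struct Score
{
    int Raw;

    constexpr Score operator*(Score other) const { return Score{Raw * other.Raw}; }
    constexpr Score operator+(Score other) const { return Score{Raw + other.Raw}; }
};

// Hypothetical type with no * or + operators
struct Name
{
    const char* Text;
};

// Built-in type: 2*1 + 4*1
static_assert(Vector2<int>{2, 4}.Dot({1, 1}) == 6);

// User-defined type: 2*3 + 4*5
static_assert(Vector2<Score>{{2}, {4}}.Dot({{3}, {5}}).Raw == 26);

// Vector2<Name> can be instantiated because Dot's body is only compiled
// when Dot is actually called. Calling it would be a compiler error.
static_assert(sizeof(Vector2<Name>) == 2 * sizeof(Name));
```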
Conclusion
We’ve barely scratched the surface of templates and already we’ve
seen that they’re far more powerful than C# generics. We can easily
write code like Dot that’s simultaneously efficient, readable, and
generic. C# struggles with even simple examples like this and often
requires us to sacrifice one or more of these qualities.
26. Template Parameters
Type Template Parameters
All of the examples of templates in the intro chapter took one
parameter:
template<typename T>
Unlike C#, the names of the parameters are optional, even when
they have default values:
struct Vector
{
float X = 0;
float Y = 0;
};
// Template parameter has the same name as a class outside the template: Vector
template<typename Vector>
Vector Make()
{
return Vector{};
}
// "int" used as the type named "Vector"
auto val = Make<int>();
DebugLog(val); // 0
// "Vector" doesn't refer to the type parameter
// The template isn't referenced here
auto vec = Vector{};
DebugLog(vec.X, vec.Y); // 0, 0
Template Template Parameters
Consider a Map class template that holds keys and values via a
List<T> class template:
template<typename T>
struct List
{
// ... implementation similar to C#
};
template<typename TKey, typename TValue>
struct Map
{
List<TKey> Keys;
List<TValue> Values;
};
Map<int, float> map;
template<typename T>
struct List
{
// ... implementation similar to C#
};
template<typename T>
struct FixedList
{
// ... implementation similar to C# except that it's a fixed size
};
// The third parameter is a template, not a type
// That template needs to take one type parameter
template<typename TKey, typename TValue,
template<typename> typename TContainer>
struct Map
{
// Use the template parameter instead of directly using List
TContainer<TKey> Keys;
TContainer<TValue> Values;
};
// Pass List, which is a template taking one type parameter, as the parameter
// Do not pass an instantiation of the template like List<int>
Map<int, float, List> listMap;
// Pass FixedList as the parameter
// It also takes one type parameter
Map<int, float, FixedList> fixedListMap;
struct MapList
{
List<int> Keys;
List<float> Values;
};
struct MapFixedList
{
FixedList<int> Keys;
FixedList<float> Values;
};
template<
typename TKey,
typename TValue,
template<typename> typename TKeysContainer=List,
template<typename> typename TValuesContainer=List>
struct Map
{
TKeysContainer<TKey> Keys;
TValuesContainer<TValue> Values;
};
// TKeysContainer=List, TValuesContainer=List
Map<int, float> map1;
// TKeysContainer=FixedList, TValuesContainer=List
Map<int, float, FixedList> map2;
// TKeysContainer=FixedList, TValuesContainer=FixedList
Map<int, float, FixedList, FixedList> map3;
Non-Type Template Parameters
The third kind of template parameter is known as a “non-type
template parameter.” These are compile-time constant values, not
the names of types or templates. For example, we can use this to
write the FixedList type backed by an array data member:
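One way such a template might look is sketched below. It is a reconstruction under assumptions: the member names follow the instantiated structs shown in this section, and the const overload of operator[] and the SumOfThree helper are additions for the example.

```cpp
// T is a type parameter, N is a non-type parameter
template<typename T, int N>
struct FixedList
{
    T Elements[N];

    constexpr T& operator[](int index)
    {
        return Elements[index];
    }

    constexpr const T& operator[](int index) const
    {
        return Elements[index];
    }

    constexpr int GetLength() const noexcept
    {
        return N;
    }
};

// Hypothetical helper that fills and sums a three-element list
constexpr int SumOfThree()
{
    FixedList<int, 3> list{};
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;
    int sum = 0;
    for (int i = 0; i < list.GetLength(); ++i)
    {
        sum += list[i];
    }
    return sum;
}

static_assert(SumOfThree() == 6);
static_assert(FixedList<float, 2>{}.GetLength() == 2);
```

Because N is part of the type, FixedList<int, 3> and FixedList<int, 4> are unrelated types and the length costs no storage.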
struct FixedListInt3
{
int Elements[3];
int& operator[](int index)
{
return Elements[index];
}
int GetLength() const noexcept
{
return 3;
}
};
struct FixedListFloat2
{
float Elements[2];
float& operator[](int index)
{
return Elements[index];
}
int GetLength() const noexcept
{
return 2;
}
};
We can now use these to tune the performance of our List classes
based on expected usage:
// Defaults are acceptable
List<int> list1;
// Start off with a lot of capacity
List<int, 1024> list2;
// Start with a little capacity, but grow fast
List<int, 4, 10> list3;
// Start empty and grow by doubling
List<int, 0, 2> list4;
Starting in C++20, there are two more kinds. First, floating point
types like float and double:
template<decltype(typeid(char)) tid>
void PrintTypeName()
{
DebugLog(tid.name());
}
// Compiler error: can't pass what typeid evaluates to
PrintTypeName<typeid(char)>();
While they’re not strictly prohibited, it’s important to know that arrays
in template parameters are implicitly converted to pointers. This can
have some important consequences:
namespace MyNamespace
{
// Class member of the namespace
class Thing
{
};
// Class template with one type parameter: T
template<class T>
struct MyClass
{
// Member function declaration, not definition
int GetSizeOfThing(T thing);
};
}
// Member function definition outside the class
// Uses 'Thing' instead of 'T' as the class' type parameter name
// 'Thing' is the same name as the namespace member class 'Thing'
// 'Thing' is used as the type of a parameter to the function
template<class Thing>
int MyNamespace::MyClass<Thing>::GetSizeOfThing(Thing
thing)
{
// 'Thing' refers to the type parameter, not the namespace member
return sizeof(Thing);
}
// Instantiate the class template with T=double
MyNamespace::MyClass<double> mc{};
// Call the member function on a MyClass<double>
// Returns the size of the type parameter: 8 for double
DebugLog(mc.GetSizeOfThing({})); // 8, not 1
The third case is when a class template’s parameter has the same
name as a member of one of its base classes. In this case, the
ambiguity goes to the base class’ member:
struct BaseClass
{
struct Thing
{
};
};
// Class template with one type parameter: Thing
// 'Thing' is the same name as the base class' member class 'Thing'
template<class Thing>
struct DerivedClass : BaseClass
{
// 'Thing' refers to the base class' member class, not the type parameter
int Size = sizeof(Thing);
};
// Instantiate the class template with Thing=double
DerivedClass<double> dc;
// See how big 'Thing' was when initializing 'Size'
// It's the size of BaseClass::Thing: 1 for an empty struct
DebugLog(dc.Size); // 1, not 8
Unlike the first two cases, this case is possible in C# as well. Unlike
C++, Thing refers to the type parameter, not the base class member:
// C#
public class BaseClass
{
public struct Thing
{
};
};
// Generic class with one type parameter: Thing
// 'Thing' is the same name as the base class' member class 'Thing'
public class DerivedClass<Thing> : BaseClass
{
// 'Thing' refers to the type parameter, not base class' member class
public Type ThingType = typeof(Thing);
};
// Instantiate the generic class with Thing=double
DerivedClass<double> dc = new DerivedClass<double>();
// See what type 'Thing' was when initializing 'ThingType'
// It's the type parameter 'double', not BaseClass.Thing
DebugLog(dc.ThingType); // System.Double
Conclusion
C# generics provide support for type parameters, but not the non-
type parameters and template parameters that C++ templates
provide support for. Even so, C++ type parameters include additional
functionality such as support for default arguments and omitting the
name of the parameter.
// C#
static class TypeUtils
{
// Generic method
public static void PrintType<T>(T x)
{
DebugLog(typeof(T));
}
}
// Type arguments explicitly specified
TypeUtils.PrintType<int>(123); // System.Int32
TypeUtils.PrintType<bool>(true); // System.Boolean
// Type arguments deduced by the compiler
TypeUtils.PrintType(123); // System.Int32
TypeUtils.PrintType(true); // System.Boolean
The same works in C++, as we see in this literal translation of the
C#:
template<class T>
void ArrayOrPointer(T)
{
DebugLog("is array?", typeid(T) == typeid(int[3]));
DebugLog("is pointer?", typeid(T) == typeid(int*));
}
int arr[3];
ArrayOrPointer(arr); // is array? false, is pointer? true
template<class T>
void ConstOrNonConst(T x)
{
// If T was 'const int' then this would be a compiler error
x = {};
}
const int c = 123;
ConstOrNonConst(c); // Compiles, meaning T is non-const int
Fourth, references to T become just T:
template<class T>
void RefDetector(T x)
{
// If T is a reference, this assigns to the caller's value
// If T is not a reference, this assigns to the local copy
x = 123;
}
int i = 42;
int& ri = i;
RefDetector(ri);
DebugLog(i); // 42
template<class T>
void RefDetector(T& x) // <-- Added &
{
x = 123;
}
int i = 42;
int& ri = i;
RefDetector(ri);
DebugLog(i); // 123
template<class T>
void Foo(T&&)
{
}
int i = 123; // lvalue, not lvalue reference
Foo(i); // T is int&, so the parameter type is int&
Foo(123); // T is int, so the parameter type is int&&
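The deduced type can be checked at compile time with std::is_same_v. In this sketch, DeducedAs and CheckForwardingDeduction are hypothetical helpers: an lvalue argument makes T an lvalue reference, while an rvalue argument makes T the plain type.

```cpp
#include <type_traits>

// Reports whether the T&& parameter caused T to be deduced as Expected
template<class Expected, class T>
constexpr bool DeducedAs(T&&)
{
    return std::is_same_v<T, Expected>;
}

constexpr bool CheckForwardingDeduction()
{
    int i = 123;
    return DeducedAs<int&>(i)    // lvalue argument: T is int&
        && DeducedAs<int>(123);  // rvalue argument: T is int
}

static_assert(CheckForwardingDeduction());
```

In both calls only Expected is given explicitly; T is still deduced from the function argument.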
template<typename T>
void TakeConstRef(const T& x)
{
}
template<typename T>
void TakeNonConstRef(T& x)
{
x = 42;
}
// Compiler deduces T='int', so 'x' is a 'const int&' even though 'i1' is non-const
int i1 = 123;
TakeConstRef(i1);
// Compiler deduces T='const int', so 'x' is a 'const int&'
const int i2 = 123;
TakeNonConstRef(i2); // Compiler error: can't assign to x
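A compile-time check of what T becomes in each of these cases is sketched below. The helper names are hypothetical. Note that T itself is never a reference here; the reference and any const written in the parameter are added around T.

```cpp
#include <type_traits>

// Parameter pattern 'const T&': const is part of the pattern, not of T
template<class Expected, class T>
constexpr bool ConstRefDeducedAs(const T&)
{
    return std::is_same_v<T, Expected>;
}

// Parameter pattern 'T&': const from the argument ends up inside T
template<class Expected, class T>
constexpr bool RefDeducedAs(T&)
{
    return std::is_same_v<T, Expected>;
}

constexpr bool CheckRefDeduction()
{
    int i1 = 123;
    const int i2 = 123;
    return ConstRefDeducedAs<int>(i1)    // parameter type is const int&
        && ConstRefDeducedAs<int>(i2)    // parameter type is const int&
        && RefDeducedAs<int>(i1)         // parameter type is int&
        && RefDeducedAs<const int>(i2);  // parameter type is const int&
}

static_assert(CheckRefDeduction());
```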
template<typename T>
void TakeConstRef(const T* p)
{
}
template<typename T>
void TakeNonConstRef(T* p)
{
*p = 42;
}
// Compiler deduces T='int', so 'p' is a 'const int*' even though 'i1' is non-const
int i1 = 123;
TakeConstRef(&i1);
// Compiler deduces T='const int', so 'p' is a 'const int*'
const int i2 = 123;
TakeNonConstRef(&i2); // Compiler error: can't assign to *p
template<class T>
struct Base
{
};
template<class T>
struct Derived : public Base<T>
{
};
template<class T>
void TakeBaseRef(Base<T>&)
{
}
Derived<int> derived;
// Compiler accepts Derived<T> for Base<T> and deduces that T is 'int'
TakeBaseRef(derived);
Class Template Argument Deduction
Since C++17, the arguments to a class template can also be
deduced:
// Class template
template<class T>
struct Vector2
{
T X;
T Y;
Vector2(T x, T y)
: X{x}, Y{y}
{
}
};
// Explicit class template argument: float
Vector2<float> v1{2.0f, 4.0f};
// Compiler deduces the class template argument: float
Vector2 v2{2.0f, 4.0f};
// Also works with 'new'
// 'v3' is a Vector2<float>*
auto v3 = new Vector2{2.0f, 4.0f};
To help the compiler deduce these arguments, we can write a
“deduction guide” to tell it what to do:
// Class template
template<class T>
struct Range
{
// Constructor template
template<class Pointer>
Range(Pointer beg, Pointer end)
{
}
};
double arr[] = { 123, 456 };
// Compiler error: can't deduce T (class template argument) from constructor
Range range1{&arr[0], &arr[1]};
// Deduction guide tells the compiler how to deduce the class template argument
template<class T>
Range(T* b, T* e) -> Range<T>;
// OK: compiler uses deduction guide to deduce that T is 'double'
Range range2{&arr[0], &arr[1]};
As we see in this example, deduction guides are written like a
function template with the “trailing return syntax.” The major
difference is that their name is the name of a class template and
their “return type” is a class template with its arguments passed.
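Here is a sketch of a deduction guide whose result can be verified with decltype. The Begin and End data members, the Length function, and CountElements are hypothetical additions so that the deduced Range<T> has something to do.

```cpp
#include <cstddef>
#include <type_traits>

template<class T>
struct Range
{
    T* Begin;
    T* End;

    // Constructor template: T can't be deduced from Pointer alone
    template<class Pointer>
    constexpr Range(Pointer beg, Pointer end)
        : Begin(beg), End(end)
    {
    }

    constexpr std::ptrdiff_t Length() const
    {
        return End - Begin;
    }
};

// Deduction guide: two T* arguments produce a Range<T>
template<class T>
Range(T* b, T* e) -> Range<T>;

constexpr std::ptrdiff_t CountElements()
{
    double arr[] = { 123, 456, 789 };
    Range range{&arr[0], &arr[0] + 3};
    static_assert(std::is_same_v<decltype(range), Range<double>>);
    return range.Length();
}

static_assert(CountElements() == 3);
```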
Specialization
So far, all of our templates have been instantiated the same way
regardless of the template arguments provided to them. Sometimes
we want to use an alternate version of the template when certain
arguments are provided. This is called specialization of a template.
Consider this class template:
Now let’s specialize Vector for a common use case: two float
components.
template<>
struct Vector<float, 2>
{
// No Components
float X;
float Y;
float Dot(const Vector<float, 2>& other) const
noexcept
{
return X*other.X + Y*other.Y;
}
};
Vector<float, 2> v1{2, 4};
// Compiler error: Vector<float, 2> doesn't have a Components data member
DebugLog(v1.Components[0], v1.Components[1]); // 2, 4
// OK: Vector<float, 2> has X and Y
DebugLog(v1.X, v1.Y); // 2, 4
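The sketch below puts a primary template and the Vector<float, 2> specialization side by side. The primary template is an assumed reconstruction (a Components array plus a loop-based Dot), not necessarily the original definition.

```cpp
// Assumed primary template: used for every <T, N> without a specialization
template<typename T, int N>
struct Vector
{
    T Components[N];

    constexpr T Dot(const Vector<T, N>& other) const noexcept
    {
        T result{};
        for (int i = 0; i < N; ++i)
        {
            result += Components[i] * other.Components[i];
        }
        return result;
    }
};

// Specialization: a completely separate definition for <float, 2>
template<>
struct Vector<float, 2>
{
    float X;
    float Y;

    constexpr float Dot(const Vector<float, 2>& other) const noexcept
    {
        return X*other.X + Y*other.Y;
    }
};

// Primary template: 1*4 + 2*5 + 3*6
static_assert(Vector<int, 3>{{1, 2, 3}}.Dot({{4, 5, 6}}) == 32);

// Specialization: 2*1 + 4*1
static_assert(Vector<float, 2>{2, 4}.Dot({1, 1}) == 6.0f);
```

The specialization shares nothing with the primary template except its name, which is why Components is unavailable on Vector<float, 2>.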
template<typename ...TArgs>
void LogAll(TArgs... args)
{
}
To use the parameter pack, we add ... after the name of the
parameter: TArgs.... The compiler expands this to a comma-
delimited list of the arguments.
Note that the compiler can never deduce this with class templates,
so the parameter pack must come at the end.
Pack Expansion
Now that we know how to declare packs of template parameters and
how to use them in function parameters, let’s look at some more
ways to use them. One common way is to pass them as function
arguments:
template<typename ...TArgs>
void LogPointers(TArgs... args)
{
// Apply dereferencing to each value in the pack
DebugLog(*args...);
}
// Pass pointers
float f = 3.14f;
int i1 = 123;
int i2 = 456;
LogPointers(&f, &i1, &i2); // 3.14, 123, 456
// The compiler instantiates this function
void LogPointers(float* arg1, int* arg2, int* arg3)
{
DebugLog(*arg1, *arg2, *arg3);
}
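The same pattern expansion also works inside a braced initializer list. This sketch uses hypothetical names and assumes at least one argument and that every pointee converts to int without narrowing.

```cpp
// The pattern '*ptrs' is applied to every element of the pack
template<typename... TPointers>
constexpr int SumPointees(TPointers... ptrs)
{
    // Expands to something like { *p1, *p2, *p3 }
    const int values[] = { *ptrs... };
    int sum = 0;
    for (int value : values)
    {
        sum += value;
    }
    return sum;
}

constexpr int SumThree()
{
    int a = 1;
    int b = 2;
    int c = 3;
    return SumPointees(&a, &b, &c);
}

static_assert(SumThree() == 6);
```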
struct Pixel
{
int X;
int Y;
Pixel(int x, int y)
: X(x), Y(y)
{
}
};
// Function template takes a parameter pack of ints
template<int ...Components>
Pixel MakePixel()
{
// Expand into parentheses initialization
return Pixel(Components...);
}
Pixel pixel = MakePixel<2, 4>();
DebugLog(pixel.X, pixel.Y); // 2, 4
Or initializing with curly braces:
struct VitalityComponent
{
int Health;
int Armor;
};
struct WeaponComponent
{
float Range;
int Damage;
};
struct SpeedComponent
{
float Speed;
};
template<class... TComponents>
// Expand a pack of base classes
class GameEntity : public TComponents...
{
};
// turret is a class that derives from VitalityComponent and WeaponComponent
GameEntity<VitalityComponent, WeaponComponent> turret;
turret.Health = 100;
turret.Armor = 200;
turret.Range = 10;
turret.Damage = 15;
// civilian is a class that derives from VitalityComponent and SpeedComponent
GameEntity<VitalityComponent, SpeedComponent> civilian;
civilian.Health = 100;
civilian.Armor = 200;
civilian.Speed = 2;
template<class ...Args>
void Print(Args... args)
{
// Expand the 'args' pack into the lambda capture list
auto lambda = [args...] { DebugLog(args...); };
lambda();
}
Print(123, 456, 789); // 123, 456, 789
Fifth, the sizeof operator has a variant that takes a parameter pack.
This evaluates to the number of elements in the pack, regardless of
their sizes:
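A short sketch of sizeof... follows. CountArgs is a hypothetical function name. The operator accepts either the template parameter pack or the function parameter pack.

```cpp
#include <cstddef>

template<typename... TArgs>
constexpr std::size_t CountArgs(TArgs... args)
{
    // Both packs always have the same number of elements
    static_assert(sizeof...(TArgs) == sizeof...(args));
    return sizeof...(args);
}

// Empty pack
static_assert(CountArgs() == 0);

// int, double, char: three elements regardless of their sizes
static_assert(CountArgs(1, 2.0, 'c') == 3);

// float, const char*
static_assert(CountArgs(3.14f, "text") == 2);
```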
DebugLog(15); // 15
DebugLog((x + y) / 2);
Conclusion
Variadic templates enable us to write templates based on arbitrary
numbers of parameters. This saves us from needing to write nearly-
identical versions of the same templates over and over. For example,
C# has Action<T>, Action<T1,T2>, Action<T1,T2,T3>, all the way up to
Action<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16>!
The same massive duplication is applied to its Func counterpart:
Func<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,TResult>.
This is so painful to write that we usually either don’t bother or
write a code generator to output all this redundant C#. At no point do
we end up with a solution that takes arbitrary numbers of
parameters, just “arbitrary enough for now” numbers of parameters.
29. Template Constraints
Constraints
C# has 11 specific where constraints we can put on type parameters
to generics. These include constraints like where T : new()
indicating that T has a public constructor that takes no parameters. In
contrast, C++ provides us with tools to build our own constraints out
of compile-time expressions.
struct Vector2
{
float X;
float Y;
};
// Variable template
// "Default" value is false
template <typename T>
constexpr bool IsVector2 = false;
// Specialization of the variable template
// Change value to true for a specific type
template <>
constexpr bool IsVector2<Vector2> = true;
// Function template
template <typename TVector, typename TComponent>
// Requires clause
// Compile-time expression evaluates to a bool
// Can use template parameters here
requires IsVector2<TVector>
// The function
TComponent Dot(TVector a, TVector b)
{
return a.X*b.X + a.Y*b.Y;
}
// OK
Vector2 vecA{2, 4};
Vector2 vecB{2, 4};
DebugLog(Dot<Vector2, float>(vecA, vecB));
// Compiler error:
//
// Candidate template ignored: constraints not satisfied
// [with TVector = int, TComponent = int]
// TComponent Dot(TVector a, TVector b)
// ^
// test.cpp:60:10: note: because 'IsVector2<int>' evaluated to false
// requires IsVector2<TVector>
DebugLog(Dot<int, int>(2, 4));
// Variable template
template <typename T, int N>
// Recurse to next-lower value
constexpr T SumUpToN = N + SumUpToN<T, N-1>;
// Specialization for 0 stops recursion
template <typename T>
constexpr T SumUpToN<T, 0> = 0;
// OK
DebugLog(SumUpToN<float, 3>); // 6
// Compile error:
//
// test.cpp:44:28: fatal error: recursive template instantiation exceeded
// maximum depth of 1024
// constexpr T SumUpToN = N + SumUpToN<T, N-1>;
// ^
// test.cpp:44:28: note: in instantiation of variable template specialization
// 'SumUpToN<float, -1025>' requested here
// test.cpp:44:28: note: in instantiation of variable template specialization
// 'SumUpToN<float, -1024>' requested here
// test.cpp:44:28: note: in instantiation of variable template specialization
// 'SumUpToN<float, -1023>' requested here
// test.cpp:44:28: note: in instantiation of variable template specialization
// 'SumUpToN<float, -1022>' requested here
// test.cpp:44:28: note: in instantiation of variable template specialization
// 'SumUpToN<float, -1021>' requested here
//
// ... many more lines of errors
DebugLog(SumUpToN<float, -1>);
There are a few alternate syntaxes we can use. First, we can put the
requires clause after the parameter list:
Note that this Number concept is quite incomplete and for example
purposes only. The C++ Standard Library has many well-designed
concepts such as std::integral and std::floating_point that are
suitable for production code.
Combining Concepts
Many concepts are defined in terms of other concepts. Just like how
we used && in our requires clause, we can do the same when
defining a concept:
C# Constraint            C++ Concept (approximation)
where T : struct         template <class T> concept C = std::is_class_v<T>;
where T : class          template <class T> concept C1 = !Nullable<T> && std::is_class_v<T>
where T : class?         template <class T> concept C = std::is_class_v<T>;
where T : notnull        template <class T> concept C = !Nullable<T>;
where T : unmanaged      N/A. All C++ types are unmanaged.
where T : new()          std::default_initializable<T>
where T : BaseClass      template <class T> concept C = !Nullable<T> && std::derived_from<T, BaseClass>;
where T : BaseClass?     std::derived_from<T, BaseClass>
where T : Interface      template <class T> concept C = !Nullable<T> && std::derived_from<T, BaseClass>;
where T : Interface?     std::derived_from<T, BaseClass>
where T : U              std::derived_from<T, U>
C++ type aliases work differently. They can be added to other kinds
of scopes and used across files:
////////////
// Math.h //
////////////
namespace Integers
{
// Add a "uint32" alias for "unsigned int" in the Integers namespace
typedef unsigned int uint32;
}
// Use "uint32" like any other member of the Integers namespace
constexpr Integers::uint32 ZERO = 0;
////////////
// Game.h //
////////////
// Include header file to get access to the Integers namespace and ZERO
#include "Math.h"
constexpr Integers::uint32 MAX_HEALTH = 100;
//////////////
// Game.cpp //
//////////////
// Include header file to get access to Integers, ZERO, and MAX_HEALTH
#include "Game.h"
DebugLog(ZERO); // 0
DebugLog(MAX_HEALTH); // 100
// The type alias is usable here, too
for (Integers::uint32 i = 0; i < 3; ++i)
{
DebugLog(i); // 0, 1, 2
}
void Foo()
{
// Type alias scoped to just one function
typedef unsigned int uint32;
for (uint32 i = 0; i < 3; ++i)
{
DebugLog(i); // 0, 1, 2
}
}
struct Player
{
// Player::HealthType is now an alias for "unsigned int"
typedef unsigned int HealthType;
// We can use it here without the namespace qualifier
HealthType Health = 0;
};
// We can use it outside of the class by adding the namespace qualifier
void ApplyDamage(Player& player, Player::HealthType
amount)
{
player.Health -= amount;
}
// C code
struct Player
{
int Health;
int Speed;
};
struct Player p; // C requires "struct" prefix
p.Health = 100;
p.Speed = 10;
DebugLog(p.Health, p.Speed); // 100, 10
Again, neither the struct prefix nor the typedef workaround is
necessary in C++. It’s just important to know why typedef is used
like this since it’s still commonly seen in C++ codebases.
Using Aliases
Since C++11, typedef is no longer the preferred way of creating type
aliases. The new way looks a lot more like C#’s using X = Y;. Note
that the order of the alias and the type has reversed compared to
typedef:
We’re simply listing the type name on the right side. This is
particularly more readable than typedef for some of the more
complex types we’ve seen since the alias name isn’t mixed in with
the type being aliased:
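The comparison can be sketched as follows. All alias names are hypothetical. Each pair declares the same type, which std::is_same_v confirms.

```cpp
#include <type_traits>

// Simple alias: both forms name the same type
typedef unsigned int uint32_typedef;
using uint32_alias = unsigned int;

// Function pointer: the typedef name is buried inside the declarator
typedef float (*BinaryOpTypedef)(float, float);
using BinaryOpAlias = float (*)(float, float);

// Reference to an array of 3 ints
typedef int (&Int3RefTypedef)[3];
using Int3RefAlias = int (&)[3];

static_assert(std::is_same_v<uint32_typedef, uint32_alias>);
static_assert(std::is_same_v<BinaryOpTypedef, BinaryOpAlias>);
static_assert(std::is_same_v<Int3RefTypedef, Int3RefAlias>);
```

With using, the alias name always sits on the left of the = sign, no matter how complicated the type on the right is.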
class Outer
{
// Member type that is private, the default for "class"
struct Inner
{
int Val = 123;
};
};
// Compiler error: Inner is private
Outer::Inner inner;
class Outer
{
// Member type is still private
struct Inner
{
int Val = 123;
};
public:
// Type alias is public
using InnerAlias = Inner;
};
// OK: uses permission level of InnerAlias, not Inner
Outer::InnerAlias inner;
Usually we’ll just specify the desired permission level to begin with.
In cases such as using third-party libraries, we don’t have the ability
to change that original permission level. This workaround can be
used to get the access we need.
Conclusion
Type aliases in C++ go way beyond their C# counterparts. They’re
not limited to a single source code file or namespace block. Instead,
we can and commonly do declare them in header files as globals, in
namespaces, and as class members. We declare terse names in
functions or even blocks in functions to avoid a lot of type verbosity,
especially when using generic code such as a HashMap<TKey,
TValue>. These aliases can be created once and shared across the
whole project, not just within one file.
// C#
// A type we want to deconstruct
struct Vector2
{
public float X;
public float Y;
// Create a Deconstruct method that takes an 'out' param for each variable
// to deconstruct into and returns void
public void Deconstruct(out float x, out float y)
{
x = X;
y = Y;
}
}
// Instantiate the deconstructable type
var vec = new Vector2{X=2, Y=4};
// Deconstruct. Implicitly calls vec.Deconstruct(out x, out y).
// x is a copy of vec.X and y is a copy of vec.Y
var (x, y) = vec;
DebugLog(x, y); // 2, 4
So far it’s essentially the same in the two languages except for two
changes. First, C++ uses square brackets ([x, y]) instead of
parentheses ((x, y)). Second, C++ doesn’t require us to write a
Deconstruct function. Instead, the compiler simply uses the
declaration order of the fields of Vector2 so that x lines up with X and
y with Y. This mirrors initialization where Vector2{2, 4} initializes the
data members in declaration order: X then Y.
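A minimal sketch of the C++ side of this comparison. The SumComponents function is a hypothetical wrapper added so the example is self-contained:

```cpp
struct Vector2
{
    float X;
    float Y;
};

// Hypothetical wrapper so the structured binding has somewhere to live
float SumComponents()
{
    // Initializes the data members in declaration order: X then Y
    Vector2 vec{2, 4};

    // No Deconstruct function needed
    // x is a copy of vec.X and y is a copy of vec.Y
    auto [x, y] = vec;

    return x + y;
}
```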
struct Vector2
{
float X;
float Y;
// Get a data member of a const vector
template<std::size_t Index>
const float& get() const
{
// Assert the only two valid indices
static_assert(Index == 0 || Index == 1);
// Return the right one based on the index
if constexpr(Index == 0)
{
return X;
}
return Y;
}
// Get a data member of a non-const vector
template <std::size_t Index>
float& get()
{
// Cast to const so we can call the const overload of this function
// to avoid code duplication
const Vector2& constThis = const_cast<const Vector2&>(*this);
// Call the const overload of this function
// Returns a const reference to the data member
const float& constComponent = constThis.get<Index>();
// Cast the data member to non-const
// This is safe since we know the vector is non-const
float& nonConstComponent = const_cast<float&>(constComponent);
// Return the non-const data member reference
return nonConstComponent;
}
};
// Specialize the tuple_size class template to derive from integral_constant
// Pass 2 since Vector2 always has 2 components
template<>
struct std::tuple_size<Vector2> :
std::integral_constant<std::size_t, 2>
{
};
// Specialize the tuple_element struct to indicate that index 0 of Vector2 has
// the type 'float'
template<>
struct std::tuple_element<0, Vector2>
{
// Create a member named 'type' that is an alias for 'float'
using type = float;
};
// Same for index 1
template<>
struct std::tuple_element<1, Vector2>
{
using type = float;
};
// Usage is the same
Vector2 vec{2, 4};
auto [x, y] = vec;
DebugLog(x, y);
The result of the above code is that Vector2 is now a “tuple-like”
type, usable with structured bindings and several generic algorithms
of the C++ Standard Library.
int arr[] = { 2, 4 };
// Compiler error: must use the 'auto' type here
int [x, y] = arr;
There are two aspects of these to take notice of. First, and trivially,
C++ attributes use two square brackets ([[X]]) instead of one in C#
([X]). Second, all of the attributes are for one of two purposes:
controlling compiler warnings and optimizing generated code.
class File
{
FILE* handle = nullptr;
public:
~File()
{
if (handle)
{
::fclose(handle);
}
}
// Generate a compiler warning if the return value is ignored
[[nodiscard]] bool Close()
{
if (!handle)
{
return true;
}
// Clear the handle so the destructor doesn't close it again
const bool closed = ::fclose(handle) == 0;
handle = nullptr;
return closed;
}
// Generate a compiler warning if the return value is ignored
[[nodiscard]] bool Open(const char* path, const char* mode)
{
// Close any file that's already open
if (handle)
{
// No compiler warning because return value is used
if (!Close())
{
return false;
}
}
handle = ::fopen(path, mode);
return handle != nullptr;
}
};
File file{};
// Compiler warning: return value ignored
file.Open("/path/to/file", "r");
// Compiler warning: unused variable
bool success = file.Open("/path/to/file", "r");
// No compiler warning: suppress the unused variable
[[maybe_unused]] bool success =
file.Open("/path/to/file", "r");
// No compiler warning because return value is used
if (!file.Open("/path/to/file", "r"))
{
DebugLog("Failed to open file");
}
To use more than one attribute at a time, add commas like when
declaring multiple variables at a time:
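A short sketch of the comma-separated form, using two of the standard attributes. The Square function is a hypothetical example:

```cpp
// Both attributes apply to the function:
// - nodiscard: warn if a caller ignores the return value
// - maybe_unused: don't warn if the function is never called
[[nodiscard, maybe_unused]] static int Square(int x)
{
    return x * x;
}
```

Writing the attributes separately, as in [[nodiscard]] [[maybe_unused]], has the same effect.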
C++ attributes are one of the rare areas of the language that are
actually less powerful than their C# counterparts. They fulfill
compile-time purposes such as controlling warnings and optimization,
but they don’t support any run-time use cases due to the lack of
reflection in the language. Third-party libraries are required to add
on run-time reflection if needed, but they’re not integrated into the
core language. This may change in a future version as there has been
much work on integrating compile-time reflection into the language.
32. Thread-Local Storage and Volatile
Thread-Local Storage
Thread-Local Storage is a way of storing one variable per thread.
Both C# and C++ have support for this. In C#, we add the
[ThreadStatic] attribute to a static field. A common bug results
from the field’s initializer being run only once, like other static fields,
not once per thread.
// C#
public class Counter
{
// One int stored per thread
// Initialized once, not one per thread
[ThreadStatic] public static int Value = 1;
}
Action a = () => DebugLog(Counter.Value);
Thread t1 = new Thread(new ThreadStart(a));
Thread t2 = new Thread(new ThreadStart(a));
t1.Start();
t2.Start();
t1.Join();
t2.Join();
// First thread runs and the first use of Counter initializes Value to 1
// Second thread runs and doesn't initialize Value. Uses the default of 0.
// Output: 1 then 0
// Global variable
thread_local int global = 1;
namespace Counters
{
// Namespace variable
thread_local int ns = 1;
}
struct Counter
{
// Static data member
// Inline initialization isn't allowed for non-const static data members
static thread_local int member;
};
// Initialization outside the class is OK
thread_local int Counter::member = 1;
void Foo()
{
// Local variable
thread_local int local = 1;
{
// Variable in any nested block
thread_local int block = 1;
}
}
struct LogLifecycle
{
int Value = 1;
LogLifecycle()
{
DebugLog("ctor");
}
~LogLifecycle()
{
DebugLog("dtor");
}
};
thread_local LogLifecycle x{};
auto a = []{ DebugLog(x.Value); };
std::thread t1{a};
std::thread t2{a};
t1.join();
t2.join();
// Possible annotated output, depending on thread execution order:
// ctor // first thread initializes x
// ctor // second thread initializes x
// 1 // first thread prints x.Value
// dtor // first thread de-initializes x
// 1 // second thread prints x.Value
// dtor // second thread de-initializes x
All other types, including double, long, and all structs, can’t be
volatile:
// C#
public class Name
{
public string First;
public string Last;
}
public struct IntWrapper
{
public int Value;
}
public enum IntEnum : int
{
}
public enum LongEnum : long
{
}
unsafe public class Volatiles<T>
where T : class
{
// OK: reference type
volatile Name RefType;
// OK: type parameter known to be a reference type due to where constraint
volatile T TypeParam;
// OK: pointer
volatile int* Pointer;
// OK: permitted primitive type
volatile int GoodPrimitive;
// Compiler error: denied primitive type
volatile long BadPrimitive;
// OK: enum based on permitted primitive type
volatile IntEnum GoodEnum;
// Compiler error: enum based on denied primitive type
volatile LongEnum BadEnum;
// Compiler error: structs can't be volatile
// No exception for structs that only have one field that can be volatile
volatile IntWrapper Struct;
// OK: Special-case for IntPtr and UIntPtr structs
volatile IntPtr SpecialPtr1;
volatile UIntPtr SpecialPtr2;
}
// C#
public struct Counter
{
public volatile int Value;
public void Increment()
{
// Reads get an implicit acquire fence
int cur = this.Value; // acquire-fenced
int next = cur + 1;
// Writes get an implicit release fence
this.Value = next; // release-fenced
}
}
The only problem is that the device status that we log can no longer
change. By marking the value that pDeviceStatus points to as
volatile, the compiler is prohibited from making this optimization. It
has to assume that there’s an external writer that might change the
device status.
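A hedged sketch of the idea. The pDeviceStatus name comes from the text above, but the fake register variable and the StatusChanged function are hypothetical stand-ins, since a real device register would live at a hardware-defined address:

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory-mapped device register
static std::uint32_t g_fakeDeviceRegister = 0;

// The pointee is volatile, so the compiler must perform both reads
// rather than reusing the first value for the second
bool StatusChanged(volatile const std::uint32_t* pDeviceStatus)
{
    const std::uint32_t first = *pDeviceStatus;
    const std::uint32_t second = *pDeviceStatus;
    return first != second;
}
```

Nothing writes to the fake register here, so the two reads agree. With real hardware, an external writer could change the value between them.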
struct Vector3d
{
double X;
double Y;
double Z;
};
volatile Vector3d V{2, 4, 6}; // Struct
volatile uint64_t L; // Long
volatile double D; // Double
volatile int A[1000]; // Array
The general rule here is that we can treat variables as “more const”
or “more volatile” but not “less const” or “less volatile” since this
would remove important restrictions.
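This rule can be sketched with pointer conversions. The function and variable names here are hypothetical:

```cpp
int ReadThroughQualifiedPointers()
{
    int value = 5;
    int* plain = &value;

    // OK: treating the int as "more volatile"
    volatile int* moreVolatile = plain;

    // OK: treating the int as "more const" and still volatile
    const volatile int* moreBoth = moreVolatile;

    // Compiler error if uncommented: this would be "less volatile"
    // int* lessVolatile = moreVolatile;

    return *moreBoth;
}
```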
struct EmptyStruct
{
};
struct Vector3
{
float X;
float Y;
float Z;
};
// Examples on x64 macOS with Clang compiler
DebugLog(alignof(char)); // 1
DebugLog(alignof(int)); // 4
DebugLog(alignof(bool)); // 1
DebugLog(alignof(int*)); // 8
DebugLog(alignof(EmptyStruct)); // 1
DebugLog(alignof(Vector3)); // 4
DebugLog(alignof(int[100])); // 4
Because the alignment requirements of all types in C++ are known at
compile time, the alignof operator is evaluated at compile time. This
means the above is compiled to the same machine code as if we
logged constants:
DebugLog(1);
DebugLog(4);
DebugLog(1);
DebugLog(8);
DebugLog(1);
DebugLog(4);
DebugLog(4);
Note that when using alignof with an array, like we did with
int[100], we get the alignment of the array’s element type. That
means we get 4 for int, not 8 for int* even though arrays and
pointers are very similar in C++.
Alignas
Next we have the alignas specifier. This is applied to classes, data
members, and variables to control how they’re aligned. All kinds of
variables are supported except bit fields, parameters, or variables in
catch clauses. For example, say we wanted to align a struct to 16-
byte boundaries:
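A minimal sketch of such a struct. The AlignedVector3 name is hypothetical and the size check assumes 4-byte floats, as on the x64 platforms used in the earlier examples:

```cpp
// Every instance of this struct starts on a 16-byte boundary
struct alignas(16) AlignedVector3
{
    float X;
    float Y;
    float Z;
};

// The alignment is part of the type and is known at compile time
static_assert(alignof(AlignedVector3) == 16);

// The size is rounded up from 12 bytes to a multiple of the alignment
static_assert(sizeof(AlignedVector3) == 16);
```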
struct AlignedToDouble
{
double Double;
// Each data member has the same alignment as the double type
alignas(double) float Float;
alignas(double) uint16_t Short;
alignas(double) uint8_t Byte;
};
// Struct is 32 bytes because of alignment requirements
DebugLog(sizeof(AlignedToDouble)); // 32
// Print distances between data members to see 8-byte alignment
AlignedToDouble atd;
DebugLog((char*)&atd.Float - (char*)&atd.Double); // 8
DebugLog((char*)&atd.Short - (char*)&atd.Double); // 16
DebugLog((char*)&atd.Byte - (char*)&atd.Double); // 24
It’s rare, but if we specify multiple alignas then the largest value is
used:
struct Aligned
{
// 16 is the largest, so it's used as the alignment
alignas(4) alignas(8) alignas(16) int First = 123;
alignas(16) int Second = 456;
};
DebugLog(sizeof(Aligned)); // 32
Aligned a;
DebugLog((char*)&a.Second - (char*)&a.First); // 16
template<int... Alignments>
struct Aligned
{
alignas(Alignments...) int First = 123;
alignas(16) int Second = 456;
};
DebugLog(sizeof(Aligned<1, 2, 4, 8, 16>)); // 32
Aligned<1, 2, 4, 8, 16> a;
DebugLog((char*)&a.Second - (char*)&a.First); // 16
Assembly
C++ allows us to embed assembly code. This is called “inline
assembly” and its meaning is highly specific to the compiler and the
CPU being compiled for. All that the C++ language standard says is
that we write asm("source code") and the rest is left up to the
compiler. For example, here’s some inline assembly that subtracts 5
from 20 on x86 as compiled by Clang on macOS:
int difference = 0;
asm(
"movl $20, %%eax;" // Put 20 in the eax register
"movl $5, %%ebx;" // Put 5 in the ebx register
"subl %%ebx, %%eax;" // eax = eax - ebx
: "=a"(difference) // Output: copy eax into difference
: // No inputs
: "ebx"); // Tell the compiler that ebx is overwritten
DebugLog(difference); // 15
Each compiler will put its own constraints on inline assembly code.
This includes whether the Intel or AT&T assembly syntax is used,
how C++ code interacts with the inline assembly, and of course the
supported CPU architecture instruction sets.
////////////////////
// library.h (C++)
////////////////////
int Madd(int a, int b, int c);
////////////////////
// library.cpp (C++)
////////////////////
#include "library.h"
// Compiled into object file with name Maddiii_i
// Example name only. Actual name is effectively unpredictable.
int Madd(int a, int b, int c)
{
return a*b + c;
}
////////////////////
// main.c (C)
////////////////////
#include <stdio.h>
#include "library.h"
void Foo()
{
// OK: library.h declares a Madd that takes three ints and returns an int
int result = Madd(2, 4, 6);
// Print the result
printf("%d\n", result);
}
Both library.cpp and main.c compile, but the linker that takes in
library.o and main.o fails to link them together. The problem is that
main.o is trying to find a function called Madd but there isn’t one.
There’s a function called Maddiii_i, but that doesn’t count because
only exact names are matched.
To solve this problem, C++ provides a way to tell the compiler that
code should be compiled with the same language linkage rules as C:
////////////////////
// library.h (C++)
////////////////////
// Everything in this block should be compiled with C's linkage rules
extern "C"
{
int Madd(int a, int b, int c);
}
////////////////////
// library.cpp (C++)
////////////////////
#include "library.h"
// Definitions need to match the language linkage of their declarations
extern "C"
{
// Compiled into object file with name Madd
// Not mangled into Maddiii_i
int Madd(int a, int b, int c)
{
return a*b + c;
}
}
Now that Madd doesn’t have its name mangled the linker can find it
and produce a working executable.
Fourth, and again similarly, variables and functions can’t have the
same name even if they’re in different namespaces. All of these
rules stem from C’s requirement that everything has a unique name.
If only a single entity needs its language linkage changed, the curly
braces can be omitted, similar to how they’re optional for
one-statement if blocks. When the braces are used, they don’t create
a block scope as other curly braces do:
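A sketch of both forms. Madd is the function from the earlier examples, while Negate is a hypothetical second function added to show that names declared inside the braces remain visible outside of them:

```cpp
// Braces omitted: only this one declaration gets C language linkage
extern "C" int Madd(int a, int b, int c);

// Braces used: they group declarations but don't create a scope
extern "C"
{
    int Negate(int x);
}

// Both names are visible here, outside the braces
extern "C" int Madd(int a, int b, int c)
{
    return a * b + c;
}

extern "C" int Negate(int x)
{
    return -x;
}
```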
// Change linkage to C
extern "C"
{
// Change linkage back to C++
extern "C++" int Madd(int a, int b, int c);
}
// OK: linkage is C++
extern "C++" int Madd(int a, int b, int c)
{
return a*b + c;
}
////////////////////
// library.h
////////////////////
// If compiled as C++, this is defined
#ifdef __cplusplus
// Make a macro called EXPORTED with the code to set C language linkage
#define EXPORTED extern "C"
// If compiled as C (assumed since not C++)
#else
// Make an empty macro called EXPORTED
#define EXPORTED
#endif
// Add EXPORTED at the beginning
// For C++, this sets the language linkage to C
// For C, this does nothing
EXPORTED int Madd(int a, int b, int c);
////////////////////
// library.c
////////////////////
#include "library.h"
// Compiled into object file with name Madd regardless of language
EXPORTED int Madd(int a, int b, int c)
{
return a*b + c;
}
Conclusion
In addition to the myriad low-level controls C++ gives us, these
features provide us with even more control. We can query and set
the alignment of various data types and variables to make optimal
use of specific CPU architectures’ requirements to improve
performance in a variety of ways. C# provides some control over
struct field layout, but that’s a far more limited tool than alignas in
C++.
C++ also allows for a high level of compatibility with its predecessor:
C. Despite having far more features, C++ code can be easily
integrated with C code by setting the language linkage and
following a few special rules. This makes our C++ libraries available
for usage in C and in environments that follow C’s linkage rules.
There are quite a few of those, including language bindings for C#,
Rust, Python, and JavaScript via Node.js. The same goes for C#
with its P/Invoke system of language bindings that enables
interoperability with the C linkage model.
34. Fold Expressions and Elaborated Type Specifiers
Fold Expressions
Fold expressions, available since C++17, allow us to apply a binary
operator to all the parameters in a template’s parameter pack. For a
simple example, say we want to add up some integers:
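A minimal sketch of such a sum, written as a variable template. It's declared constexpr here, an assumption made so the result can be checked at compile time:

```cpp
// Variable template that takes a parameter pack of ints
template<int... Vals>
constexpr int SumOfAll = (Vals + ...); // Fold the pack with the + operator

// Expands to 1 + (2 + (3 + 4))
static_assert(SumOfAll<1, 2, 3, 4> == 10);
```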
This kind of fold expression is called a “unary right fold.” This means
that the rightmost arguments have the operator applied to them first.
template<int... Vals>
int SumOfAll = (... + Vals); // Swapped "..." and "Vals"
DebugLog(SumOfAll<1, 2, 3, 4>); // 10
The choice of a left or right fold doesn’t really matter when we’re just
adding integers, but it will surely matter with other types and other
operators.
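Subtraction is one operator where the direction changes the result. This sketch uses hypothetical variable templates, declared constexpr so the values can be checked at compile time:

```cpp
// Unary right fold: 1 - (2 - (3 - 4))
template<int... Vals>
constexpr int RightDifference = (Vals - ...);

// Unary left fold: ((1 - 2) - 3) - 4
template<int... Vals>
constexpr int LeftDifference = (... - Vals);

// Same values, different grouping, different results
static_assert(RightDifference<1, 2, 3, 4> == -2);
static_assert(LeftDifference<1, 2, 3, 4> == -8);
```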
template<bool... Vals>
bool AndAll = (... && Vals);
DebugLog(AndAll<false, false>); // false
DebugLog(AndAll<false, true>); // false
DebugLog(AndAll<true, false>); // false
DebugLog(AndAll<true, true>); // true
DebugLog(AndAll<>); // true
template<bool... Vals>
bool OrAll = (... || Vals);
DebugLog(OrAll<false, false>); // false
DebugLog(OrAll<false, true>); // true
DebugLog(OrAll<true, false>); // true
DebugLog(OrAll<true, true>); // true
DebugLog(OrAll<>); // false
And third, which is by far the least common use case, the , operator
will evaluate to void():
template<bool... Vals>
void Goo()
{
return (... , Vals); // Equivalent to "return void();"
}
// OK
Goo();
Now that we’ve seen the “unary” fold expressions, let’s look at the
“binary” ones. To make these, we add the same binary operator after
the ... then an additional value:
template<int... Vals>
// Add the operator (+) then an additional value (1) after the unary fold
int SumOfAllPlusOne = (Vals + ... + 1);
DebugLog(SumOfAllPlusOne<1, 2, 3, 4>); // 11
Here we’ve converted the unary fold expression (Vals + ...) into a
binary one by adding + 1 to the end of it. This adds another value in
addition to the values in the parameter pack. Since this was a “binary
right fold” the parentheses will be added on the rightmost values first:
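The expansion for four values can be written out by hand and compared at compile time. The variable template is declared constexpr here, an assumption made to allow the check:

```cpp
template<int... Vals>
constexpr int SumOfAllPlusOne = (Vals + ... + 1);

// The additional value is innermost and the leftmost value is applied last
static_assert(SumOfAllPlusOne<1, 2, 3, 4> == (1 + (2 + (3 + (4 + 1)))));
```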
The “binary left fold” version just has the additional value on the left:
template<int... Vals>
int SumOfAllPlusOne = (1 + ... + Vals);
When instantiated with four values in the parameter pack, it’ll look
like this:
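A sketch of that expansion, checked at compile time. The OnePlusSumOfAll name is hypothetical and the variable template is declared constexpr to allow the check:

```cpp
template<int... Vals>
constexpr int OnePlusSumOfAll = (1 + ... + Vals);

// The additional value is innermost and the rightmost value is applied last
static_assert(OnePlusSumOfAll<1, 2, 3, 4> == ((((1 + 1) + 2) + 3) + 4));
```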
Fold expressions can be used with any of these binary operators:
+ - * / % ^ & |
= < > << >>
+= -= *= /= %= ^= &= |= <<= >>=
== != <= >=
&& || , .* ->*
Elaborated Type Specifiers
We’ve seen before that C code requires us to use struct Player
instead of just Player as the type name of the Player struct:
// C code
struct Player
{
int Health;
int Speed;
};
struct Player p; // C requires "struct" prefix
p.Health = 100;
p.Speed = 10;
DebugLog(p.Health, p.Speed); // 100, 10
// A class
struct Player
{
};
// A variable with the same name as the class
int Player = 123;
// Compiler error: Player is not a type
// This is because "Player" refers to the variable, not the class
Player p;
Since struct and class are very similar, we can use them
interchangeably in our elaborated type specifiers:
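A minimal sketch of this interchangeability, reusing the Player example. The Health member and the ReadHealth function are hypothetical additions:

```cpp
// Declared with "struct"
struct Player
{
    int Health = 100;
};

// A variable with the same name as the class
int Player = 123;

int ReadHealth()
{
    // "class" works in the elaborated type specifier even though
    // Player was declared with "struct"
    class Player p;
    return p.Health;
}
```

Some compilers may warn about the mismatched keyword, but the code is still valid.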
Plain enum can be used with both scoped and unscoped enumerations.
The standard doesn’t allow enum class or enum struct in an elaborated
type specifier, even for a scoped enumeration, although some
compilers accept this as an extension:
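A minimal sketch of plain enum referring to a scoped enumeration. The Element enumeration and the StrongestElement function are hypothetical:

```cpp
// A scoped enumeration
enum class Element
{
    Water,
    Fire,
};

// Plain "enum" in the return type refers to the scoped enumeration
enum Element StrongestElement()
{
    // Plain "enum" works in a variable declaration, too
    enum Element e = Element::Fire;
    return e;
}
```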
namespace Gameplay
{
enum DamageType
{
Physical,
Water,
Fire,
Magic,
};
float DamageType = 3.14f;
}
// Elaborated type specifier using scope resolution operator
enum Gameplay::DamageType d;
struct Gameplay
{
// Member type of the class
enum DamageType
{
Physical,
Water,
Fire,
Magic,
};
// Member variable of the class
constexpr static float DamageType = 3.14f;
};
// Elaborated type specifier referring to class member type
enum Gameplay::DamageType d;
Conclusion
Fold expressions provide a way for us to cleanly apply binary
operators to templates’ parameter packs. Without them, we’d need
to resort to alternatives such as recursively instantiating templates
and using specialization to stop the recursion. That’s much less
readable and much slower to compile as many templates would
need to be instantiated and then thrown away. We get our
choice of unary or binary and left or right folds so we can control how
the binary operator is applied to the values of the parameter pack.
Since C# doesn’t have variadic templates, it also doesn’t have fold
expressions.
///////////
// math.ixx
///////////
export module math;
We’ve done two things here. First, we’ve named the module with the
.ixx extension. Module files can be named with any extension, or no
extension at all, just like any other C++ source file. The .ixx
extension is used here simply because it’s the preference of
Microsoft Visual Studio 2019, one of the first compilers to support
modules.
Second, the line export module math; begins a module named math.
Like the rest of C++, the source file is read from top to bottom.
Everything after this statement is part of the math module, but
everything before it is not.
///////////
// math.ixx
///////////
// Normal function before the "export module" statement
float Average(float x, float y)
{
return (x + y) / 2;
}
// Exported function before the "export module" statement
export float MagnitudeSquared(float x, float y)
{
return x*x + y*y;
}
// The module begins here
export module math;
// Normal function after the "export module" statement
float Min(float x, float y)
{
return x < y ? x : y;
}
// Exported function after the "export module" statement
export float Max(float x, float y)
{
return x > y ? x : y;
}
There are a couple things to notice here, too. First, we can add
export before anything we want to be usable from outside the
module. This includes functions like these, variables, types, using
aliases, templates, and namespaces. It does not include
preprocessor directives such as macros.
Modules can seem analogous to namespaces, but the two are quite
distinct. A module can export a namespace and a module doesn’t
imply a namespace. Modules aren’t meant to replace namespaces,
but they may be used for similar purposes in grouping together
related functionality.
Second, two of these functions are before the export module math;
statement. These are part of the “global module” rather than the math
module, just like everything outside of a namespace is part of the
“global namespace.”
There can be only one module in a module unit source file. This isn’t
allowed:
// First module: OK
export module math;
float Min(float x, float y)
{
return x < y ? x : y;
}
// Second module: compiler error
export module util;
export bool IsNearlyZero(float val)
{
return val > -0.0001f && val < 0.0001f;
}
Assuming we don’t do that, let’s now use this module from another
file:
///////////
// main.cpp
///////////
// Import the module for usage
import math;
// OK: Max is found in the "math" module we imported
DebugLog(Max(2, 4)); // 4
// Compiler error: none of these are exported from the "math" module
DebugLog(Average(2, 4));
DebugLog(MagnitudeSquared(2, 4));
DebugLog(Min(2, 4));
///////////////
// geometry.ixx
///////////////
// Specify that this is the "geometry" partition of the "math" module
export module math:geometry;
export float MagnitudeSquared(float x, float y)
{
return x * x + y * y;
}
////////////
// stats.ixx
////////////
// Specify that this is the "stats" partition of the "math" module
export module math:stats;
export float Min(float x, float y)
{
return x < y ? x : y;
}
export float Max(float x, float y)
{
return x > y ? x : y;
}
export float Average(float x, float y)
{
return (x + y) / 2;
}
///////////
// math.ixx
///////////
// This is the primary "math" module
export module math;
// Import the "stats" partition and export it
export import :stats;
// Import the "geometry" partition and export it
export import :geometry;
///////////
// main.cpp
///////////
// Import the "math" module as normal
import math;
// Use its exported entities as normal
DebugLog(Min(2, 4)); // 2
DebugLog(Max(2, 4)); // 4
DebugLog(Average(2, 4)); // 3
DebugLog(MagnitudeSquared(2, 4)); // 20
///////////////
// geometry.ixx
///////////////
// This is a primary "math.geometry" module
export module math.geometry;
export float MagnitudeSquared(float x, float y)
{
return x * x + y * y;
}
////////////
// stats.ixx
////////////
// This is a primary "math.stats" module
export module math.stats;
export float Min(float x, float y)
{
return x < y ? x : y;
}
export float Max(float x, float y)
{
return x > y ? x : y;
}
export float Average(float x, float y)
{
return (x + y) / 2;
}
///////////
// math.ixx
///////////
// This is the primary "math" module
export module math;
// Import the "math.stats" module and export it
export import math.stats;
// Import the "math.geometry" module and export it
export import math.geometry;
///////////
// main.cpp
///////////
// Import the "math" module as normal
import math;
// Use its exported entities as normal
DebugLog(Min(2, 4)); // 2
DebugLog(Max(2, 4)); // 4
DebugLog(Average(2, 4)); // 3
DebugLog(MagnitudeSquared(2, 4)); // 20
The difference here is that math.stats and math.geometry aren’t
partitions; they’re primary modules. Any of them can be used
directly:
Lastly, there is an optional private “fragment” that can hold only code
that can’t possibly affect the module’s interface. This restriction
allows compilers to avoid recompiling code that uses the module
when only the private fragment changes:
// Primary module
export module math;
// Export some function declarations
export float Min(float x, float y);
export float Max(float x, float y);
// This begins the "private fragment"
module :private;
// Define the exported functions without exposing their definitions
float Min(float x, float y)
{
return x < y ? x : y;
}
float Max(float x, float y)
{
return x > y ? x : y;
}
Module Implementation Units
So far all of our module files have been “module interface units”
since they included the export keyword. They’re interfaces to be
used by code outside the module such as our main.cpp.
///////////////
// geometry.ixx
///////////////
// A non-exported module partition
module math:geometry;
// A non-exported function
float MagnitudeSquared(float x, float y)
{
return x * x + y * y;
}
///////////
// math.ixx
///////////
// Primary module
export module math;
// Import the module implementation partition
import :geometry;
// Export a function from the module implementation partition by declaring it
// and adding the "export" keyword
export float MagnitudeSquared(float x, float y);
// Export more functions
export float Magnitude(float x, float y)
{
// Call functions in the imported module implementation partition
float magSq = MagnitudeSquared(x, y);
return Sqrt(magSq); // TODO: write Sqrt()
}
This is similar to how we’d split code across header files (.hpp) and
translation units (.cpp). In that traditional build system, we’d add
declarations of functions in the header files and definitions of those
functions in the translation units.
If we don’t need the partitions but still want to separate the interface
from the implementation, we can drop the import and remove the
partition name:
///////////////
// geometry.cpp
///////////////
// A non-exported module
module math;
// A non-exported function
float MagnitudeSquared(float x, float y)
{
return x * x + y * y;
}
///////////
// math.ixx
///////////
export module math;
// Note: no need to "import math;" since this is already the "math" module
export float MagnitudeSquared(float x, float y);
export float Magnitude(float x, float y)
{
float magSq = MagnitudeSquared(x, y);
return Sqrt(magSq); // TODO: write Sqrt()
}
///////////////////
// statsglobals.ixx
///////////////////
export module stats:globals;
// Variable with "module linkage"
export int NumEnemiesKilled = 0;
////////////
// stats.ixx
////////////
export module stats;
import :globals;
export void CountEnemyKilled()
{
// Refers to the same variable as in statsglobals.ixx
NumEnemiesKilled++;
}
export int GetNumEnemiesKilled()
{
// Refers to the same variable as in statsglobals.ixx
return NumEnemiesKilled;
}
///////////
// main.cpp
///////////
import stats;
DebugLog(GetNumEnemiesKilled()); // 0
CountEnemyKilled();
DebugLog(GetNumEnemiesKilled()); // 1
// Refers to the same variable as in statsglobals.ixx
DebugLog(NumEnemiesKilled); // 1
Compatibility
Given the 40+ year history of C++, the new build system must be
compatible with the old build system. There are a ton of existing
header files that we’ll want to use with modules. Thankfully, C++
provides a new preprocessor directive to do just that:
import "mylibrary.h";
// ...or...
import <mylibrary.h>;
Despite not starting with a # and requiring a ; at the end, this is really
a preprocessor directive. It’s distinct from a regular module import
because it either has double quotes ("mylibrary.h") or angle
brackets (<mylibrary.h>) depending on the header search rules
desired.
////////////////
// mylibrary.ixx
////////////////
// Module that wraps mylibrary.h
export module mylibrary;
// Export everything in the header file that can be exported
import "mylibrary.h";
//////////////
// mylibrary.h
//////////////
int ReadVersion()
{
int version = ReadTextFileAsInteger("version.txt");
#if ENABLE_LOGGING
DebugLog("Version: ", version);
#endif
return version;
}
///////////
// main.cpp
///////////
#include "mylibrary.h"
int version = ReadVersion(); // Does not log
// ...equivalent to...
int ReadVersion()
{
int version = ReadTextFileAsInteger("version.txt");
#if ENABLE_LOGGING // Note: not defined
DebugLog("Version: ", version);
#endif
return version;
}
int version = ReadVersion();
/////////////////
// mainlogged.cpp
/////////////////
// Define a preprocessor symbol before #include
#define ENABLE_LOGGING 1
#include "mylibrary.h"
int version = ReadVersion(); // Does log
// ...equivalent to...
#define ENABLE_LOGGING 1
int ReadVersion()
{
int version = ReadTextFileAsInteger("version.txt");
#if ENABLE_LOGGING // Note: is defined
DebugLog("Version: ", version);
#endif
return version;
}
int version = ReadVersion();
///////////////
// metadata.ixx
///////////////
// No module name means "global module"
module;
// Define a preprocessor symbol before #include
// Only preprocessor directives are allowed in this section
#define ENABLE_LOGGING 1
// Use #include instead of the import directive
#include "mylibrary.h"
// Our named module
export module metadata;
// Export a function from the header file
export int ReadVersion();
///////////
// main.cpp
///////////
// Use the module as normal
import metadata;
DebugLog(ReadVersion()); // 6
The second difference between the import directive and import with
a module is that preprocessor macros in the header file are
exported:
///////////////
// legacymath.h
///////////////
// Macro defined in the header file
#define PI 3.14
///////////
// math.ixx
///////////
export module math;
// Import directive exposes the PI macro
import "legacymath.h";
export double GetCircumference(double radius)
{
// Macros from the import directive are usable
return 2.0 * PI * radius;
}
///////////
// main.cpp
///////////
import math;
// OK
DebugLog(GetCircumference(10.0));
// Compiler error: macros from import directives are not exported
DebugLog(PI);
Notice how the PI macro is available for use in the module that
used the import directive but not in users of that module. This
prevents macros from transitively “leaking” throughout an entire
program.
Conclusion
C++20’s new module build system is much more analogous to C#
than its own legacy header files and #include. In C++ terms, C#
mixes namespaces and modules together somewhat. We write the
name of a namespace (using Math;) in order to gain access to its
contents. C++ separates these two features. We can write import
math; without math being a namespace. We can layer namespaces
on top of modules and even export them.
Since C++ has no GC, our objects never move around. We therefore
have no need for a fixed statement as we can simply take the
address of objects:
struct ByteArray
{
int32_t Length;
uint8_t* Bytes;
};
void ZeroBytes(ByteArray& bytes)
{
ByteArray* pBytes = &bytes;
for (int i = 0; i < pBytes->Length; ++i)
{
pBytes->Bytes[i] = 0;
}
}
C++ has no need for fixed size buffers as it directly supports arrays:
struct FixedLengthArray
{
// 16 integers are directly part of the struct
int32_t Elements[16];
};
struct Vector2
{
float X;
float Y;
};
struct FixedLengthArray
{
// 16 Vector2s are directly part of the struct
Vector2 Elements[16];
};
Properties
C# structs and classes support a special kind of function called
“properties” that give the illusion that the user is referencing a field
rather than calling a function:
class Player
{
// Conventionally called the "backing field"
string m_Name;
// Property called Name of type string
public string Name
{
// Its "get" function takes no parameters and must return the property
// type: string
get
{
return m_Name;
}
// The "set" function is implicitly passed a single parameter of the
// property type (string) and must return void
set
{
m_Name = value;
}
}
}
Player p = new Player();
// Call "set" on the Name property and pass "Jackson" as the value parameter
p.Name = "Jackson";
// Call "get" on the Name property and get the returned string
DebugLog(p.Name);
When the bodies of the get and set functions and the “backing field”
are trivial, as shown above, automatically-implemented properties
can be used to tell the compiler to generate this boilerplate:
class Player
{
public string Name { get; set; }
}
struct Player
{
const char* m_Name;
const char* GetName() const
{
return m_Name;
}
void SetName(const char* value)
{
m_Name = value;
}
};
Player p{};
p.SetName("Jackson");
DebugLog(p.GetName());
struct Player
{
const char* m_Name;
const char* Name() const
{
return m_Name;
}
void Name(const char* value)
{
m_Name = value;
}
};
Player p{};
p.Name("Jackson");
DebugLog(p.Name());
using System.Runtime.InteropServices;
public static class WindowsApi
{
// This function is implemented in Windows' User32.dll
[DllImport("User32.dll", CharSet=CharSet.Unicode)]
public static extern int MessageBox(
IntPtr handle, string message, string caption,
int type);
}
// Call the external function
WindowsApi.MessageBox((IntPtr)0, "Hello!", "Title", 0);
///////////////
// platform.hpp
///////////////
// Windows
#ifdef _WIN32
#include <windows.h>
// Non-Windows (e.g. macOS)
#else
// TODO
#endif
class Platform
{
#if _WIN32
using MessageBoxFuncPtr = int32_t(*)(
void*, const char*, const char*, uint32_t);
HMODULE dll;
MessageBoxFuncPtr mb;
#else
// TODO
#endif
public:
Platform()
{
#if _WIN32
dll = LoadLibraryA("User32.dll");
mb = (MessageBoxFuncPtr)(GetProcAddress(dll,
"MessageBoxA"));
#else
// TODO
#endif
}
~Platform()
{
#if _WIN32
FreeLibrary(dll);
#else
// TODO
#endif
}
// Abstracts calls to MessageBoxA on Windows and something else on other
// platforms (e.g. macOS)
void MessageBox(const char* message, const char* title)
{
#if _WIN32
(*mb)(nullptr, message, title, 0);
#else
// TODO
#endif
}
};
///////////
// game.cpp
///////////
#include "platform.hpp"
Platform platform{};
platform.MessageBox("Hello!", "Title");
Functions like this could be put into a namespace or made into static
member functions of a class, but the principle remains: the function
is disconnected from what it “extends” with no special access to it.
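As a minimal sketch of that point, here is a free function grouped into a namespace. The Vector2 struct, the Vector2Extensions namespace, and the SqrMagnitude function are all hypothetical names chosen for illustration:

```cpp
// Hypothetical type being "extended"
struct Vector2
{
    float X;
    float Y;
};

// Hypothetical namespace that groups the "extension" functions
namespace Vector2Extensions
{
    // An ordinary free function: the object is a regular parameter
    // and only its public members are accessible
    inline float SqrMagnitude(const Vector2& v)
    {
        return v.X * v.X + v.Y * v.Y;
    }
}
```

We'd call it as Vector2Extensions::SqrMagnitude(v) rather than with member syntax like v.SqrMagnitude().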
Checked Arithmetic
C# features the checked keyword to perform runtime checks on
arithmetic. We can opt into this on a per-expression basis or for a
whole block:
struct OverflowException
{
};
struct Player
{
uint32_t Health;
void TakeDamage(uint32_t amount)
{
if (amount > Health)
{
throw OverflowException{};
}
Health -= amount;
}
};
Player p{ 100 };
// OK: Health is now 50
p.TakeDamage(50);
// OverflowException: tried to underflow Health to -20
p.TakeDamage(70);
struct CheckedUint32
{
uint32_t Value;
// Conversion from uint32_t
CheckedUint32(uint32_t value)
: Value(value)
{
}
// Overload the subtraction operator to check for underflow
CheckedUint32 operator-(uint32_t amount)
{
if (amount > Value)
{
throw OverflowException{};
}
return Value - amount;
}
// Implicit conversion back to uint32_t
operator uint32_t()
{
return Value;
}
};
struct Player
{
uint32_t Health;
void TakeDamage(uint32_t amount)
{
// Put Health in a wrapper struct to check its arithmetic operators
Health = CheckedUint32{ Health } - amount;
}
};
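The same manual approach extends to other operators. As a rough sketch, a hypothetical CheckedAdd helper (not part of any library) could detect unsigned overflow before it happens by comparing against the room left below the maximum value:

```cpp
#include <cstdint>
#include <limits>

// Same kind of empty exception type as used above
struct OverflowException
{
};

// Hypothetical helper: add two unsigned integers, throwing on overflow
inline std::uint32_t CheckedAdd(std::uint32_t a, std::uint32_t b)
{
    // The sum would wrap around if b is larger than the room left above a
    if (b > std::numeric_limits<std::uint32_t>::max() - a)
    {
        throw OverflowException{};
    }
    return a + b;
}
```

The check is written as a subtraction from the maximum so that the check itself can never overflow.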
C++ doesn’t have this feature built in, but there’s a library available
that provides a NAMEOF macro for similar functionality:
Player p{};
DebugLog(NAMEOF(p)); // p
float f = 1.0f;
for (int i = 0; i < 10; ++i)
{
f -= 0.1f;
DebugLog(f);
}
0.9
0.8
0.6999999
0.5999999
0.4999999
0.3999999
0.2999999
0.1999999
0.09999993
-7.450581E-08
This prints:
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0
C++ doesn’t have a built-in decimal type, but libraries such as GMP
and decimal_for_cpp create such types. For example, in the latter
library we can write this:
#include "decimal.h"
using namespace dec;
decimal<1> d{ 1.0 };
for (int i = 0; i < 10; ++i)
{
d -= decimal<1>{ 0.1 };
DebugLog(d);
}
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0
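The idea behind such types can be sketched with a scaled integer. The Tenths struct below is hypothetical and far simpler than either library; it only assumes that we need one decimal place:

```cpp
#include <cstdint>

// Hypothetical fixed-point type: stores a value as a whole number of tenths
struct Tenths
{
    std::int64_t Count; // 10 means 1.0 and 1 means 0.1

    Tenths& operator-=(Tenths other)
    {
        // Integer subtraction is exact, so no rounding error accumulates
        Count -= other.Count;
        return *this;
    }
};

// Subtract 0.1 from 1.0 ten times and return the remaining tenths
inline std::int64_t SubtractTenTimes()
{
    Tenths d{ 10 };
    for (int i = 0; i < 10; ++i)
    {
        d -= Tenths{ 1 };
    }
    return d.Count;
}
```

Because every step is integer arithmetic, the result is exactly zero rather than a tiny leftover like the float version produced.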
Reflection
C# implicitly stores a lot of information about the structure of the
program in the binaries it compiles to. This information is then
accessible at runtime for the C# code to query via “reflection”
methods like GetType that return classes like Type.
This prints:
Name: Jackson
Health: 100
The only information like this that C++ stores is data for RTTI to
support dynamic_cast and typeid. It’s a very small subset of what’s
available in C# since even full type names are not usually preserved
in typeid and only classes with virtual functions are supported by
dynamic_cast.
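A minimal sketch of what RTTI does provide, using a hypothetical Weapon class hierarchy invented for this example:

```cpp
#include <typeinfo>

// Hypothetical hierarchy: the virtual destructor makes Weapon polymorphic
struct Weapon
{
    virtual ~Weapon() = default;
};
struct Bow : Weapon
{
};
struct Sword : Weapon
{
};

// dynamic_cast on a pointer yields nullptr if the object isn't that type
inline bool IsBow(Weapon* weapon)
{
    return dynamic_cast<Bow*>(weapon) != nullptr;
}

// typeid on a reference to a polymorphic class reports the dynamic type
inline bool SameDynamicType(const Weapon& a, const Weapon& b)
{
    return typeid(a) == typeid(b);
}
```

We can ask what type an object is, but nothing here lets us enumerate its fields or functions.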
Name: Jackson
Health: 100
#include <rttr/registration>
using namespace rttr;
class Player
{
public:
// Public so the members can be registered and assigned outside the class
const char* Name;
uint32_t Health;
};
RTTR_REGISTRATION
{
registration::class_<Player>("Player")
.property("Name", &Player::Name)
.property("Health", &Player::Health);
}
Player p;
p.Name = "Jackson";
p.Health = 100;
type t = type::get<Player>();
for (auto& prop : t.get_properties())
{
DebugLog(prop.get_name(), ": ", prop.get_value(p));
}
Conclusion
Neither language is a subset of the other. In almost every chapter of
this book, we’ve seen how the C++ version of various language
features is larger and more powerful than the C# equivalent. In this
chapter we’ve seen the opposite: several features that C# has that
C++ doesn’t.
For example, there are “families” of functions that all do the same
thing but on different data types. To take an absolute value of a
floating point value we call fabs for double, fabsf for float, and
fabsl for long double. In C++, we’d just overload abs with different
parameter types and the compiler would choose the right one to call.
The C++ Standard Library includes many more modern designs that
rely on C++ language features. It has that abs overloaded function,
for example. The C Standard Library is included in the C++ Standard
Library largely as part of C++’s broad goal to maintain a high degree
of compatibility with C code. There are a few parts of it that are
genuinely useful on their own, but these are few and far between.
We’re not going to go in depth and cover every little corner of the C
Standard Library in this chapter, but we’ll survey its highlights.
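As a small sketch of the overloading mentioned above, the std::abs overloads from cstdlib and cmath let one name cover every type. The AbsFloat and AbsLong wrappers are hypothetical names used only to show the calls:

```cpp
#include <cmath>
#include <cstdlib>
#include <type_traits>

// The compiler picks the overload from the argument type, so the
// return type matches without needing fabsf, fabs, or fabsl
static_assert(std::is_same<decltype(std::abs(-1)), int>::value, "int");
static_assert(std::is_same<decltype(std::abs(-1.0f)), float>::value, "float");
static_assert(std::is_same<decltype(std::abs(-1.0)), double>::value, "double");
static_assert(
    std::is_same<decltype(std::abs(-1.0L)), long double>::value,
    "long double");

inline float AbsFloat(float x)
{
    return std::abs(x); // Calls the float overload
}

inline long AbsLong(long x)
{
    return std::abs(x); // Calls the long overload
}
```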
General Purpose
As for composition, the C++ Standard Library is made up of header
files. As of C++20, modules are also available. The C Standard
Library is available only as header files. C Standard Library header
files are named with a .h extension: math.h. These can be included
directly into C++ files: #include <math.h>. They are also wrapped by
the C++ Standard Library. The wrapped versions begin with a c and
drop the .h extension, so we can #include <cmath>. These wrapped
header files place everything in the std namespace and may also
place everything in the global namespace so both std::fabs and
::fabs work.
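For instance, a short sketch that uses the wrapped headers and sticks to the std-qualified names, since only those are guaranteed by the wrappers. The Distance and Length functions are hypothetical helpers:

```cpp
#include <cmath>   // C++ wrapper for math.h
#include <cstring> // C++ wrapper for string.h

// Hypothetical helper: absolute difference of two values
inline double Distance(double a, double b)
{
    // The std:: name is always available from the wrapper header
    return std::fabs(a - b);
}

// Hypothetical helper: length of a NUL-terminated string
inline std::size_t Length(const char* str)
{
    return std::strlen(str);
}
```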
#include <stdlib.h>
// sizeof() evaluates to size_t
size_t intSize = sizeof(int);
DebugLog(intSize); // Maybe 4
// NULL can be used as a pointer to indicate "null"
int* ptr = NULL;
// It's vulnerable to accidental misuse in arithmetic
int sum = NULL + NULL;
// nullptr isn't: this is a compiler error
int sum2 = nullptr + nullptr;
Before C++ introduced the new and delete operators for dynamic
memory allocation, C code would use the malloc, calloc, realloc,
and free functions. The C# equivalent of malloc is
Marshal.AllocHGlobal, realloc is Marshal.ReallocHGlobal, and
free is Marshal.FreeHGlobal:
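A minimal sketch of how the C allocation functions fit together, with GrowBuffer as a hypothetical function name. Each call can fail by returning a null pointer, so the sketch checks every result:

```cpp
#include <cstdlib>

// Allocate a zeroed buffer, grow it, then free it
// Returns true if every step succeeded and the contents survived
inline bool GrowBuffer()
{
    // calloc allocates and zero-fills room for 4 ints
    int* values = static_cast<int*>(std::calloc(4, sizeof(int)));
    if (values == nullptr)
    {
        return false;
    }
    values[0] = 123;

    // realloc resizes the allocation, possibly moving it
    int* bigger = static_cast<int*>(std::realloc(values, 8 * sizeof(int)));
    if (bigger == nullptr)
    {
        std::free(values); // The original block is still valid on failure
        return false;
    }

    // The existing contents are preserved by realloc
    bool ok = bigger[0] == 123 && bigger[3] == 0;

    // free releases the memory
    std::free(bigger);
    return ok;
}
```

The casts are needed because these functions return void*, which C++ doesn't implicitly convert to other pointer types.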
// Parse a double
double d = atof("3.14");
DebugLog(d); // 3.14
// Parse an int
int i = atoi("123");
DebugLog(i); // 123
// Parse a float and get a pointer to its end in a string
const char* floatStr = "2.2 123.456";
char* pEnd;
float f = strtof(floatStr, &pEnd);
DebugLog(f); // 2.2
// Use the end pointer to parse more
f = strtof(pEnd, &pEnd);
DebugLog(f); // 123.456
#include <stdint.h>
int32_t i32; // Always signed 32-bit
int_fast32_t if32; // Fastest signed integer type with at least 32 bits
intptr_t ip; // Signed integer that can hold a pointer
int_least32_t il; // Smallest signed integer with at least 32 bits
intmax_t imax; // Biggest available signed integer
// Range of 32-bit integer values
DebugLog(INT32_MIN, INT32_MAX); // -2147483648, 2147483647
// Biggest size_t
DebugLog(SIZE_MAX); // Maybe 18446744073709551615
#include <cstddef>
// C and C++ types
std::max_align_t ma; // Type with the biggest alignment
std::ptrdiff_t pd; // Big enough to hold the subtraction of two pointers
// C++-specific types
std::nullptr_t np = nullptr; // The type of nullptr
std::byte b; // An "enum class" version of a single byte
#include <limits.h>
// Range of int values
DebugLog(INT_MIN, INT_MAX); // Maybe -2147483648, 2147483647
// Range of char values
DebugLog(CHAR_MIN, CHAR_MAX); // Maybe -128, 127
#include <inttypes.h>
// Parse a hexadecimal string to an int
// The nullptr means we don't want to get a pointer to the end
intmax_t i = strtoimax("f0a2", nullptr, 16);
DebugLog(i); // 61602
#include <float.h>
// Biggest float
float f = FLT_MAX;
DebugLog(f); // 3.40282e+38
// Difference between 1.0 and the next larger float
float ep = FLT_EPSILON;
DebugLog(ep); // 1.19209e-07
#include <string.h>
// Compare strings: 0 for equality, negative for less than, positive for
// greater than
DebugLog(strcmp("hello", "hello")); // 0
DebugLog(strcmp("goodbye", "hello")); // Negative, maybe -1
// Copy a string
char buf[32];
strcpy(buf, "hello");
DebugLog(buf);
// Concatenate strings
strcat(buf + 5, " world");
DebugLog(buf); // hello world
// Count characters in a string (its length)
// This iterates until NUL is found
DebugLog(strlen(buf)); // 11
// Get a pointer to the first occurrence of a character in a string
DebugLog(strchr(buf, 'o')); // o world
// Get a pointer to the first occurrence of a string in a string
DebugLog(strstr(buf, "ll")); // llo world
// Get a pointer to the next "token" in a string, separated by a delimiter
// Stores state globally: not thread-safe
char* next = strtok(buf, " ");
DebugLog(next); // hello
next = strtok(nullptr, ""); // null means to continue the global state
DebugLog(next); // world
// Copy the first three bytes of buf ("hel") to later in the buffer
memcpy(buf + 3, buf, 3);
DebugLog(buf); // helhelworld
// Set all bytes in buf to 65 and put a NUL at the end
memset(buf, 65, 31);
buf[31] = 0;
DebugLog(buf); // AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
#include <ctype.h>
// Check for alphabetical characters
DebugLog(isalpha('a') != 0); // true
DebugLog(isalpha('9') != 0); // false
// Check for digit characters
DebugLog(isdigit('a') != 0); // false
DebugLog(isdigit('9') != 0); // true
// Change to uppercase
DebugLog(toupper('a')); // A
#include <wctype.h>
// Check for alphabetical characters
DebugLog(iswalpha(L'a') != 0); // true
DebugLog(iswalpha(L'9') != 0); // false
// Check for digit characters
DebugLog(iswdigit(L'a') != 0); // false
DebugLog(iswdigit(L'9') != 0); // true
// Change to uppercase
DebugLog(towupper(L'a')); // A
#include <stdarg.h>
// The "..." indicates a variadic function
void PrintLogs(int count, ...)
{
// A va_list holds the state
va_list args;
// Use the "va_start" macro to start getting args
va_start(args, count);
for (int i = 0; i < count; ++i)
{
// Use the "va_arg" macro to get the next arg
const char* log = va_arg(args, const char*);
DebugLog(log);
}
// Use the "va_end" macro to stop getting args
va_end(args);
}
// Call the variadic function
PrintLogs(3, "foo", "bar", "baz"); // foo, bar, baz
#include <assert.h>
assert(2 + 2 == 4); // OK
assert(2 + 2 == 5); // Calls std::abort and maybe more
Foo
Foo got status, 0
Foo calling Goo
Goo calling longjmp with, 1
Foo got status, 1
Foo calling Goo
Goo calling longjmp with, 2
Foo got status, 2
Foo calling Goo
Goo calling longjmp with, 3
Foo got status, 3
#include <time.h>
#include <math.h> // For sqrtf
// Get the time in the return value and in the pointer we pass
time_t t1{};
time_t t2 = time(&t1);
DebugLog(t1, t2); // Maybe 1612052060, 1612052060
// Get the amount of CPU time the program has used
// Not in relation to any particular time (like the UNIX epoch)
clock_t c1 = clock();
// Do something expensive we want to benchmark
volatile float f = 123456;
for (int i = 0; i < 1000000; ++i)
{
f = sqrtf(f);
}
// Check the clock again
clock_t c2 = clock();
double secs = ((double)(c2) - c1) / CLOCKS_PER_SEC;
DebugLog("Took", secs, "seconds"); // Maybe: Took 0.011 seconds
#include <signal.h>
signal(SIGTERM, [](int val){DebugLog("terminated with",
val); });
raise(SIGTERM); // Maybe: terminated with 15
#include <locale.h>
// Set the locale for everything to Japanese
// This is global: not thread-safe
setlocale(LC_ALL, "ja_JP.UTF-8");
// Get the global locale
lconv* lc = localeconv();
DebugLog(lc->currency_symbol); // ¥
And finally, we’ll end with the header that enables “Hello, world!” in
C: stdio.h/cstdio. This is like Console in C#. There’s also file
system access, similar to the methods of File in C#:
#include <stdio.h>
// Output a formatted string to stdout
// The first string is the "format string" with value placeholders: %s %d
// Subsequent values must match the placeholders' types
// This is a variadic function
printf("%s %d\n", "Hello, world!", 123); // Hello, world! 123
// Read a value from stdin
// The same "format string" is used to accept different types
int val;
int numValsRead = scanf("%d", &val);
DebugLog(numValsRead); // {1 if the user entered a number, else 0}
if (numValsRead == 1)
{
DebugLog(val); // {Number the user typed}
}
// Open a file, seek to its end, get the position, and close it
FILE* file = fopen("/path/to/myfile.dat", "r");
fseek(file, 0, SEEK_END);
long len = ftell(file);
fclose(file);
DebugLog(len); // {Number of bytes in the file}
// Delete a file
int deleted = remove("/path/to/deleteme.dat");
DebugLog(deleted == 0); // True if the file was deleted
// Rename a file
int renamed = rename("/path/to/oldname.dat",
"/path/to/newname.dat");
DebugLog(renamed == 0); // True if the file was renamed
Conclusion
The C Standard Library is very old, but still very commonly used.
Because it is so old and based on the much less powerful C, much of
its design leaves a lot to be desired. The global state used by
functions like rand and strtok isn’t thread-safe, and state like the
errno macro is difficult to use correctly. Reporting results with
special int values, often inconsistently, instead of more structured
outputs like exceptions and enumerations is similarly error-prone.
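As a sketch of the care this style demands, here is a hypothetical TryParseLong helper that wraps strtol. It has to clear errno before the call and then check two separate failure signals afterward:

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse a base-10 long, reporting failure with a bool
inline bool TryParseLong(const char* str, long& result)
{
    // errno must be cleared first because successful calls don't reset it
    errno = 0;
    char* end = nullptr;
    long value = std::strtol(str, &end, 10);

    // No characters consumed means there was no number to parse
    if (end == str)
    {
        return false;
    }

    // Out-of-range input is reported only through errno
    if (errno == ERANGE)
    {
        return false;
    }

    result = value;
    return true;
}
```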
From here on we’ll be covering the C++ part of the C++ Standard
Library. We’ll see a lot more modern designs for areas like
containers, algorithms, I/O, strings, math, and threading!
39. Language Support Library
Source Location
Let’s start with an easy one that was just added in C++20:
<source_location>. Right away we see how the naming convention
of the C++ Standard Library differs from the C Standard Library and
its C++ wrappers. The C Standard Library header file name would
likely be abbreviated into something like srcloc.h. The C++ wrapper
would then be named csrcloc. The C++ Standard Library usually
prefers to spell out names more verbosely in snake_case and without
any extension, .h or otherwise.
#include <source_location>
void Foo()
{
std::source_location sl =
std::source_location::current();
DebugLog(sl.line()); // 42
DebugLog(sl.column()); // 61
DebugLog(sl.file_name()); // example.cpp
DebugLog(sl.function_name()); // void Foo()
}
It’s worth noting here at the start that a lot of code will include a
using statement to remove the need to type std:: over and over. It’s
a namespace like any other, so we have all the normal options. For
example, a using namespace std; at the file level right after the
#include lines is common:
#include <source_location>
using namespace std;
void Foo()
{
source_location sl = source_location::current();
DebugLog(sl.line()); // 43
DebugLog(sl.column()); // 61
DebugLog(sl.file_name()); // example.cpp
DebugLog(sl.function_name()); // void Foo()
}
To avoid bringing the entire Standard Library into scope, we might
write using declarations for just particular classes:
#include <source_location>
using std::source_location;
void Foo()
{
source_location sl = source_location::current();
DebugLog(sl.line()); // 43
DebugLog(sl.column()); // 61
DebugLog(sl.file_name()); // example.cpp
DebugLog(sl.function_name()); // void Foo()
}
Or we might put the using just where the Standard Library is being
used:
#include <source_location>
void Foo()
{
using namespace std;
source_location sl = source_location::current();
DebugLog(sl.line()); // 43
DebugLog(sl.column()); // 61
DebugLog(sl.file_name()); // example.cpp
DebugLog(sl.function_name()); // void Foo()
}
All of these are commonly seen in C++ codebases and provide good
options for removing a lot of the std:: clutter. There is, however, one
bad option which should be avoided: adding using at the top level of
header files. Because header files are essentially copied and pasted
into other files via #include, these using statements introduce the
Standard Library to name lookup for all the files that #include them.
When header files #include other header files, this impact extends
even further:
// top.h
#include <source_location>
using namespace std; // Bad idea
// middlea.h
#include "top.h" // Pastes "using namespace std;" here
// middleb.h
#include "top.h" // Pastes "using namespace std;" here
// bottoma.cpp
#include "middlea.h" // Pastes "using namespace std;" here
// bottomb.cpp
#include "middlea.h" // Pastes "using namespace std;" here
// bottomc.cpp
#include "middleb.h" // Pastes "using namespace std;" here
// bottomd.cpp
#include "middleb.h" // Pastes "using namespace std;" here
// top.h
#include <source_location>
struct SourceLocationPrinter
{
static void Print()
{
// OK: only applies to this function, not files that #include
using namespace std;
source_location sl = source_location::current();
DebugLog(sl.line()); // 43
DebugLog(sl.column()); // 61
DebugLog(sl.file_name()); // example.cpp
DebugLog(sl.function_name()); // void SourceLocationPrinter::Print()
}
};
// middlea.h
#include "top.h" // Does not paste "using namespace std;" here
// middleb.h
#include "top.h" // Does not paste "using namespace std;" here
Initializer List
Next up let’s look at <initializer_list>. We touched on
std::initializer_list before, but now we’ll take a closer look. An
instance of this class template is automatically created and passed
to the constructor when we use braced list initialization:
#include <initializer_list>
struct AssetLoader
{
AssetLoader(std::initializer_list<const char*> paths)
{
for (const char* path : paths)
{
DebugLog(path);
}
}
};
AssetLoader loader = {
"/path/to/model",
"/path/to/texture",
"/path/to/audioclip"
};
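The same class template works for ordinary function parameters too. Here is a small sketch with hypothetical Sum and Count functions showing that the list supports range-based for loops and reports its size:

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical function: add up a braced list of ints
inline int Sum(std::initializer_list<int> values)
{
    int total = 0;
    // begin() and end() make range-based for loops work
    for (int value : values)
    {
        total += value;
    }
    return total;
}

// Hypothetical function: report how many elements were passed
inline std::size_t Count(std::initializer_list<int> values)
{
    return values.size();
}
```

A call like Sum({ 1, 2, 3 }) creates the std::initializer_list<int> for us from the braced list.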
#include <typeinfo>
struct Vector2
{
float X;
float Y;
};
struct Vector3
{
float X;
float Y;
float Z;
};
Vector2 v2{ 2, 4 };
Vector3 v3{ 2, 4, 6 };
// All constructors are deleted, but we can still get a reference
const std::type_info& ti2 = typeid(v2);
const std::type_info& ti3 = typeid(v3);
// There are only three public members
// They are all implementation-specific
DebugLog(ti2.name()); // Maybe struct Vector2
DebugLog(ti2.hash_code()); // Maybe 3282828341814375180
DebugLog(ti2.before(ti3)); // Maybe true
#include <typeinfo>
struct Vector2
{
float X;
float Y;
// A virtual function makes this class polymorphic
virtual bool IsNearlyZero(float epsilonSq)
{
return abs(X*X + Y*Y) < epsilonSq;
}
};
void Foo()
{
Vector2* pVec = nullptr;
try
{
// Try to take typeid of a null pointer to a polymorphic class
DebugLog(typeid(*pVec).name());
}
// This particular exception is thrown
catch (const std::bad_typeid& e)
{
DebugLog(e.what()); // Maybe "Attempted a typeid of nullptr pointer!"
}
}
#include <typeinfo>
struct Vector2
{
float X;
float Y;
virtual bool IsNearlyZero(float epsilonSq)
{
return abs(X*X + Y*Y) < epsilonSq;
}
};
struct Vector3
{
float X;
float Y;
float Z;
virtual bool IsNearlyZero(float epsilonSq)
{
return abs(X*X + Y*Y + Z*Z) < epsilonSq;
}
};
void Foo()
{
Vector3 vec3{};
try
{
Vector2& vec2 = dynamic_cast<Vector2&>(vec3);
}
catch (const std::bad_cast& e)
{
DebugLog(e.what()); // Maybe "Bad dynamic_cast!"
}
}
The <typeindex> header provides the std::type_index class, not an
integer, which wraps the std::type_info we saw above. This class
provides some overloaded operators so we can compare them in
various ways, not just with the before member function:
#include <typeindex>
struct Vector2
{
float X;
float Y;
};
struct Vector3
{
float X;
float Y;
float Z;
};
Vector2 v2{ 2, 4 };
Vector3 v3{ 2, 4, 6 };
// Pass a std::type_info to the constructor
const std::type_index ti2{ typeid(v2) };
const std::type_index ti3{ typeid(v3) };
// Some member functions from std::type_info carry over
DebugLog(ti2.name()); // Maybe struct Vector2
DebugLog(ti2.hash_code()); // Maybe 3282828341814375180
// Overloaded operators are provided for comparison
DebugLog(ti2 == ti3); // false
DebugLog(ti2 < ti3); // Maybe true
DebugLog(ti2 > ti3); // Maybe false
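Because std::type_index is comparable and has a std::hash specialization, it can serve as a key in containers. A rough sketch, with LookUpName as a hypothetical function and the names in the table chosen by us rather than taken from name():

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Vector2
{
    float X;
    float Y;
};

struct Vector3
{
    float X;
    float Y;
    float Z;
};

// Hypothetical registry mapping types to names we choose ourselves
inline std::string LookUpName(std::type_index key)
{
    static const std::unordered_map<std::type_index, std::string> names{
        { std::type_index(typeid(Vector2)), "Vector2" },
        { std::type_index(typeid(Vector3)), "Vector3" }
    };
    auto it = names.find(key);
    return it != names.end() ? it->second : "unknown";
}
```

A std::type_info converts implicitly to std::type_index, so LookUpName(typeid(Vector2)) works directly.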
#include <compare>
struct Integer
{
int Value;
std::strong_ordering operator<=>(const Integer&
other) const
{
// Determine the relationship once
return Value < other.Value ?
std::strong_ordering::less :
Value > other.Value ?
std::strong_ordering::greater :
std::strong_ordering::equal;
}
};
Integer one{ 1 };
Integer two{ 2 };
std::strong_ordering oneVsTwo = one <=> two;
// All the individual comparison operators are supported
DebugLog(oneVsTwo < 0); // true
DebugLog(oneVsTwo <= 0); // true
DebugLog(oneVsTwo > 0); // false
DebugLog(oneVsTwo >= 0); // false
DebugLog(oneVsTwo == 0); // false
DebugLog(oneVsTwo != 0); // true
DebugLog(std::is_lt(oneVsTwo)); // true
DebugLog(std::is_lteq(oneVsTwo)); // true
DebugLog(std::is_gt(oneVsTwo)); // false
DebugLog(std::is_gteq(oneVsTwo)); // false
DebugLog(std::is_eq(oneVsTwo)); // false
DebugLog(std::is_neq(oneVsTwo)); // true
#include <concepts>
template <typename T1, typename T2>
requires std::same_as<T1, T2>
bool SameAs;
template <typename T>
requires std::integral<T>
bool Integral;
template <typename T>
requires std::default_initializable<T>
bool DefaultInitializable;
SameAs<int, int>; // OK
SameAs<int, float>; // Compiler error
Integral<int>; // OK
Integral<float>; // Compiler error
struct NoDefaultCtor { NoDefaultCtor() = delete; };
DefaultInitializable<int>; // OK
DefaultInitializable<NoDefaultCtor>; // Compiler error
#include <coroutine>
struct ReturnObj
{
ReturnObj()
{
DebugLog("ReturnObj ctor");
}
~ReturnObj()
{
DebugLog("ReturnObj dtor");
}
struct promise_type
{
promise_type()
{
DebugLog("promise_type ctor");
}
~promise_type()
{
DebugLog("promise_type dtor");
}
ReturnObj get_return_object()
{
DebugLog("promise_type::get_return_object");
return ReturnObj{};
}
std::suspend_never initial_suspend()
{
DebugLog("promise_type::initial_suspend");
return std::suspend_never{};
}
void return_void()
{
DebugLog("promise_type::return_void");
}
// final_suspend must be noexcept
std::suspend_never final_suspend() noexcept
{
DebugLog("promise_type::final_suspend");
return std::suspend_never{};
}
void unhandled_exception()
{
DebugLog("promise_type unhandled_exception");
}
};
};
ReturnObj SimpleCoroutine()
{
DebugLog("Start of coroutine");
co_return;
DebugLog("End of coroutine");
}
void Foo()
{
DebugLog("Calling coroutine");
ReturnObj ret = SimpleCoroutine();
DebugLog("Done");
}
Calling coroutine
promise_type ctor
promise_type::get_return_object
ReturnObj ctor
promise_type::initial_suspend
Start of coroutine
promise_type::return_void
promise_type::final_suspend
promise_type dtor
Done
ReturnObj dtor
#include <version>
// Each feature-test macro is defined only if the Standard Library has the
// feature available, so we test for it with the preprocessor
#ifdef __cpp_lib_concepts
DebugLog("Standard Library concepts?", true);
#else
DebugLog("Standard Library concepts?", false);
#endif
#ifdef __cpp_lib_source_location
DebugLog("source_location?", true);
#else
DebugLog("source_location?", false);
#endif
#include <type_traits>
// Use a static value data member of a class template
static_assert(std::is_integral<int>::value); // OK
static_assert(std::is_integral<float>::value); // Compiler error
// Use a variable template
static_assert(std::is_integral_v<int>); // OK
static_assert(std::is_integral_v<float>); // Compiler error
There are tons of these available and they can check for nearly any
feature of a type. Here are some more advanced ones:
#include <type_traits>
struct Vector2
{
float X;
float Y;
};
struct Player
{
int Score;
Player(const Player& other)
{
Score = other.Score;
}
};
static_assert(std::is_bounded_array_v<int[3]>); // OK
static_assert(std::is_bounded_array_v<int[]>); // Compiler error
static_assert(std::is_trivially_copyable_v<Vector2>); // OK
static_assert(std::is_trivially_copyable_v<Player>); // Compiler error
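Type checks like these aren't limited to static_assert. As a sketch, they can also drive compile-time branching with if constexpr. HalfRoundedUp is a hypothetical function made up for this example:

```cpp
#include <type_traits>

// Hypothetical function: halve a value, rounding integers up
template <typename T>
T HalfRoundedUp(T value)
{
    // Only one branch is compiled for each T
    if constexpr (std::is_integral_v<T>)
    {
        // Integer division truncates, so add one first to round up
        return (value + 1) / 2;
    }
    else
    {
        // Floating-point types can represent the half directly
        return value / 2;
    }
}
```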
Besides type checks, there are various utilities for querying types:
#include <type_traits>
DebugLog(std::rank_v<int[10]>); // 1
DebugLog(std::rank_v<int[10][20]>); // 2
DebugLog(std::extent_v<int[10][20], 0>); // 10
DebugLog(std::extent_v<int[10][20], 1>); // 20
DebugLog(std::alignment_of_v<float>); // Maybe 4
DebugLog(std::alignment_of_v<double>); // Maybe 8
#include <type_traits>
// We know T is a pointer (e.g. int*)
// We don't have a name for what it points to (e.g. int)
// Use std::remove_pointer_t to get it
template <typename T>
auto Dereference(T ptr) -> std::remove_pointer_t<T>
{
return *ptr;
}
int x = 123;
int* p = &x;
int result = Dereference(p);
DebugLog(result); // 123
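#include <type_traits>
// "Cast" from an integer to an enum
template <typename TEnum, typename TInt>
TEnum FromInteger(TInt i)
{
// Make sure the integer type is the enum's underlying type
static_assert(std::is_same_v<std::underlying_type_t<TEnum>, TInt>);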
// Perform the cast
return static_cast<TEnum>(i);
}
// "Cast" from an enum to an integer
template <typename TEnum>
auto ToInteger(TEnum e) -> std::underlying_type_t<TEnum>
{
// Make sure the template parameter is an enum
static_assert(std::is_enum_v<TEnum>);
// Perform the cast
return static_cast<std::underlying_type_t<TEnum>>(e);
}
enum class Color : uint64_t
{
Red,
Green,
Blue
};
Color c = Color::Green;
DebugLog(c); // Green
// Cast from enum to integer
auto i = ToInteger(c);
DebugLog(i); // 1
// Cast from integer to enum
Color c2 = FromInteger<Color>(i);
DebugLog(c2); // Green
Some of this functionality exists in C# via the Type class and its
related reflection classes: FieldInfo, PropertyInfo, etc. Unlike their
C++ counterparts, which are evaluated at compile time, these all
execute at runtime.
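As a rough sketch of that compile-time evaluation, the following uses type traits to choose a branch with if constexpr. The Describe function template is a hypothetical helper made up for this illustration, not something from the Standard Library:

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: returns a label chosen at compile time from T's traits
template <typename T>
std::string Describe()
{
    if constexpr (std::is_integral_v<T>)
    {
        return "integral";
    }
    else if constexpr (std::is_floating_point_v<T>)
    {
        return "floating point";
    }
    else if constexpr (std::is_pointer_v<T>)
    {
        return "pointer";
    }
    else
    {
        return "other";
    }
}

// The traits are constant expressions, so they also work in static_assert
static_assert(std::is_integral_v<int>);
static_assert(!std::is_integral_v<float>);
```

Only the branch matching T is instantiated, so no runtime type check is performed.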
Conclusion
Some parts of C++ rely on the Standard Library. We need to use
std::initializer_list to handle braced list initialization and we
need to use std::coroutine_handle to implement coroutine return
objects. This is similar to how C# enshrines parts of the .NET API
into the language: Type, System.Single, etc.
#include <exception>
// Derive our own exception type
struct MyException : public std::exception
{
const char* msg;
MyException(const char* msg)
: msg(msg)
{
}
virtual const char* what() const noexcept override
{
return msg;
}
};
try
{
throw MyException{ "boom" };
}
catch (const std::exception& ex)
{
DebugLog(ex.what()); // boom
}
#include <exception>
// A class that acts like a pointer to a captured exception
std::exception_ptr capturedEx;
try
{
// Do something that throws
throw MyException{ "boom" };
}
// Catch anything
catch (...)
{
// Capture the current exception
capturedEx = std::current_exception();
}
// Later...
try
{
// Check if an exception was captured
if (capturedEx)
{
// If so, re-throw it
std::rethrow_exception(capturedEx);
}
}
catch (const std::exception& ex)
{
DebugLog(ex.what()); // boom
}
#include <cstdio>
#include <exception>
// Recursively print an exception and all its nested exceptions
void PrintNestedExceptions(const std::exception& ex)
{
DebugLog(ex.what());
try
{
// If ex is a std::nested_exception, re-throw its nested std::exception
// Otherwise do nothing
std::rethrow_if_nested(ex);
}
catch (const std::exception& nestedEx)
{
// Recurse to print the nested exception (and its nested exceptions)
PrintNestedExceptions(nestedEx);
}
}
// Function that throws an exception
FILE* OpenFile(const char* path)
{
FILE* handle = fopen(path, "r");
if (!handle)
{
throw MyException{ "Error opening file" };
}
return handle;
}
// Function that calls a function that throws an exception
// It throws an exception with the caught exception nested
void PrintFirstByte(const char* path)
{
try
{
// Call a function that throws an exception
FILE* f = OpenFile(path);
DebugLog("First byte:", fgetc(f));
fclose(f);
}
// Catch OpenFile exceptions
catch (...)
{
// Throw an exception with the caught exception nested in it
std::throw_with_nested(MyException{ "Failed to read file" });
}
}
try
{
// Call a function that throws a std::nested_exception
PrintFirstByte("/path/to/missing/file");
}
// Catch all std::exception objects
// Includes the derived std::nested_exception type
catch (const std::exception& ex)
{
PrintNestedExceptions(ex);
}
#include <exception>
// Set a lambda to be called when std::terminate is called
std::set_terminate([]() { DebugLog("std::terminate called"); });
// Throw an exception and never catch it
// This causes std::terminate to be called
// The lambda is then called
throw MyException{ "boom" };
#include <exception>
#include <stdexcept>
struct Second
{
~Second()
{
DebugLog("Second", std::uncaught_exceptions());
}
};
struct First
{
~First()
{
DebugLog("First before",
std::uncaught_exceptions());
try
{
Second sec;
throw std::runtime_error{ "boom" };
} // Note: sec destructor called
catch (const std::exception& e)
{
DebugLog("First caught", e.what());
}
DebugLog("First after",
std::uncaught_exceptions());
}
};
void Foo()
{
try
{
First fir;
throw std::runtime_error{ "boom" };
} // Note: fir destructor called
catch (const std::exception& e)
{
DebugLog("Foo", e.what()); // boom
}
First fir2;
} // Note: fir2 destructor called
First before 1
Second 2
First caught boom
First after 1
Foo boom
First before 0
Second 1
First caught boom
First after 0
Standard Exceptions
Now let’s look at some of the classes that derive from
std::exception to describe particular categories of errors. These are
available in <stdexcept>:
#include <cmath>
#include <cstdint>
#include <cstring>
#include <stdexcept>
int GetLastElement(int* array, int length)
{
if (array == nullptr || length <= 0)
{
// C# approximation: ArgumentException
throw std::invalid_argument{ "Invalid array" };
}
return array[length - 1];
}
float Sqrt(float val)
{
if (val < 0)
{
// C# approximation: ArgumentNullException, DivideByZeroException, etc.
throw std::domain_error{ "Value must be non-negative" };
}
return std::sqrt(val);
}
template <typename T, int N>
void WriteToBuffer(const T& obj, char buf[N])
{
if (sizeof(T) > N)
{
// C# approximation: ArgumentException
throw std::length_error{ "Object is too big for the buffer" };
}
std::memcpy(buf, &obj, sizeof(T));
}
void CheckedIncrement(uint32_t& x)
{
if (x == 0xffffffff)
{
// C# approximation: ArgumentException
throw std::out_of_range{ "Overflow" };
}
x++;
}
int BinarySearch(int* array, int length)
{
#ifndef NDEBUG
for (int i = 1; i < length; ++i)
{
if (array[i - 1] > array[i])
{
// C# approximation: ArgumentException
// Note: base class of all of the above
throw std::logic_error{ "Array isn't sorted" };
}
}
#endif
// ...implementation...
}
The Standard Library itself throws these exception types. We’re also
free to throw them in our own code and it’s common to do so.
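Because these types form a hierarchy, a handler for a base class also catches the derived types. Here is a minimal sketch, assuming a hypothetical TryGet helper that relies on std::vector::at throwing std::out_of_range:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: calls std::vector::at and reports which category of
// standard exception, if any, was thrown
std::string TryGet(const std::vector<int>& values, std::size_t index)
{
    try
    {
        // at() is specified to throw std::out_of_range for a bad index
        return std::to_string(values.at(index));
    }
    catch (const std::logic_error&)
    {
        // std::out_of_range derives from std::logic_error, so it lands here
        return "logic_error";
    }
    catch (const std::exception&)
    {
        // Any other standard exception type
        return "exception";
    }
}
```

Handlers are tried in order, so the more derived types need to be listed before std::exception.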
System Error
Next up, let’s look at the <system_error> header. As we saw in the C
Standard Library, there are a lot of “error codes” exposed to us via
mechanisms like return values and the global errno macro. These
error codes are platform-specific. The C++ Standard Library includes
a platform-independent alternative in a pair of types:
std::error_condition and std::error_category.
#include <system_error>
// Get the "generic" error category
const std::error_category& category =
std::generic_category();
DebugLog(category.name()); // Maybe "generic"
// Build an error_condition representing the "no space on device" code
std::error_condition condition =
category.default_error_condition(ENOSPC);
DebugLog(condition.value() == ENOSPC); // true
DebugLog(condition.message()); // Maybe "no space on device"
// There are other categories
const std::error_category& sysCat =
std::system_category();
DebugLog(sysCat.name()); // Maybe "system"
We use these classes to convert platform-specific error codes to
platform-independent error codes and then take action on them. We
get the added bonus of stronger typing since these classes aren’t
simply an int and are therefore less likely to be misused.
#include <system_error>
try
{
// Handle a platform-specific error: ENOSPC
throw std::system_error{
ENOSPC, std::generic_category(), "Disk is full"
};
}
catch (const std::system_error& e)
{
// Platform-specific error (ENOSPC) converted to a std::errc enumerator
DebugLog(e.code() == std::errc::no_space_on_device); // true
DebugLog(e.what()); // Maybe "Disk is full: no space on device"
}
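The same comparison works without exceptions. As a sketch, the hypothetical IsDiskFull helper below wraps a raw errno-style value in a std::error_code and compares it against a portable std::errc enumerator:

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical helper: wraps a raw errno-style value in a std::error_code
// and checks it against a portable std::errc enumerator
bool IsDiskFull(int rawError)
{
    std::error_code code{ rawError, std::generic_category() };
    // std::errc converts implicitly to std::error_condition for the comparison
    return code == std::errc::no_space_on_device;
}
```

A caller might pass errno after a failed C library call, for example IsDiskFull(errno).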
Utility
Moving on from error handling, let’s look at some truly generic utility
functions provided by the <utility> header:
#include <utility>
// Swap two values
int x = 2;
int y = 4;
std::swap(x, y);
DebugLog(x, y); // 4, 2
// Set a value and return the old value
int old = std::exchange(x, 6);
DebugLog(x, old); // 6, 4
// Get a const version of anything
const int& c = std::as_const(x);
DebugLog(c); // 6
// Compare integers without conversion
DebugLog(-1 > 1U); // true!
DebugLog(std::cmp_greater(-1, 1U)); // false
// Check if an integer fits in an integer type
DebugLog(std::in_range<uint8_t>(200)); // true
DebugLog(std::in_range<uint8_t>(500)); // false
// Cast to an rvalue reference
int&& rvr = std::move(x);
DebugLog(x, rvr); // 6, 6
// Forward a value as an lvalue or rvalue reference
int f1 = std::forward<int&>(x);
int f2 = std::forward<int&&>(x);
#include <utility>
// Variadic function template taking a std::integer_sequence
template<typename T, T... vals>
void PrintInts(std::integer_sequence<T, vals...> is)
{
// Provides the number of integers
DebugLog(is.size());
// Use the parameter pack to get the values
DebugLog(vals...);
}
// Prints "3" then "123, 456, 789"
PrintInts(std::integer_sequence<int32_t, 123, 456, 789>{});
#include <utility>
// Make a struct with an int and a float as non-static data members
std::pair<int, float> p{ 123, 3.14f };
// Get them in two ways
DebugLog(p.first, p.second); // 123, 3.14
DebugLog(std::get<0>(p), std::get<1>(p)); // 123, 3.14
// We can also use std::make_pair to use type deduction to avoid
// specifying the types ourselves
p = std::make_pair(123, 3.14f);
DebugLog(p.first, p.second); // 123, 3.14
// make_pair is less necessary in C++17 with template argument deduction
std::pair p2{ 456, 2.2f };
DebugLog(p2.first, p2.second); // 456, 2.2
// std::swap works with std::pair
std::swap(p, p2);
DebugLog(p.first, p.second); // 456, 2.2
DebugLog(p2.first, p2.second); // 123, 3.14
Tuple
std::pair has largely been eclipsed by the more generic std::tuple
in <tuple>. It can hold any number of data members, not just two.
This is like the ValueTuple family of classes in C#: ValueTuple<T>,
ValueTuple<T1, T2>, ValueTuple<T1, T2, T3>, etc. There’s only one
class template in C++ since variadic templates are supported, so
truly any number of data members may be added to a std::tuple:
#include <tuple>
// Make a struct with an int and a float as non-static data members
std::tuple<int, float> t{ 123, 3.14f };
// Get them, but only with std::get since there are no names
DebugLog(std::get<0>(t), std::get<1>(t)); // 123, 3.14
// We can also use std::make_tuple to use type deduction to avoid
// specifying the types ourselves
t = std::make_tuple(123, 3.14f);
DebugLog(std::get<0>(t), std::get<1>(t)); // 123, 3.14
// make_tuple is less necessary in C++17 with template argument deduction
std::tuple t2{ 456, 2.2f };
DebugLog(std::get<0>(t2), std::get<1>(t2)); // 456, 2.2
// std::swap works with std::tuple
std::swap(t, t2);
DebugLog(std::get<0>(t), std::get<1>(t)); // 456, 2.2
DebugLog(std::get<0>(t2), std::get<1>(t2)); // 123, 3.14
#include <tuple>
std::tuple t{ 123, 3.14f, "hello" };
// Get the number of elements in the tuple at compile time
constexpr std::size_t size =
std::tuple_size_v<decltype(t)>;
DebugLog(size); // 3
// Get the type of an element of the tuple
std::tuple_element_t<1, decltype(t)> second = std::get<1>
(t);
DebugLog(second); // 3.14
// Create a tuple of lvalue references to variables
int i = 456;
float f = 2.2f;
std::tuple tied = std::tie(i, f);
i = 100;
f = 200;
DebugLog(std::get<0>(tied), std::get<1>(tied)); // 100, 200
// Convert from std::pair to std::tuple
std::pair p{ 2, 4 };
std::tuple t2{ 0, 0 };
t2 = p;
DebugLog(std::get<0>(t2), std::get<1>(t2)); // 2, 4
// Concatenate tuples
std::tuple<
// Types from t
int, float, const char*,
// Types from tied
int, float,
// Types from t2
int, int
> cat = std::tuple_cat(t, tied, t2);
DebugLog(
std::get<0>(cat),
std::get<1>(cat),
std::get<2>(cat),
std::get<3>(cat),
std::get<4>(cat),
std::get<5>(cat),
std::get<6>(cat)); // 123, 3.14, hello, 100, 200, 2, 4
struct IntVector
{
int X;
int Y;
};
// Instantiate a class by passing the data members of a tuple to a
// constructor of that class
IntVector iv = std::make_from_tuple<IntVector>(t2);
DebugLog(iv.X, iv.Y); // 2, 4
// Make a function call, passing the data members of a tuple as arguments
DebugLog(std::apply([](int a, int b) { return a + b; }, t2)); // 6
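A common use of std::tuple is returning several values from one function. The sketch below uses two hypothetical helpers, DivMod and RebuildFromDivMod, and unpacks the result with C++17 structured bindings rather than std::get:

```cpp
#include <tuple>

// Hypothetical helper: returns a quotient and remainder together
// Assumes the denominator is non-zero
std::tuple<int, int> DivMod(int numerator, int denominator)
{
    return std::make_tuple(numerator / denominator, numerator % denominator);
}

// Hypothetical caller: unpacks the tuple with structured bindings (C++17)
int RebuildFromDivMod(int numerator, int denominator)
{
    auto [quotient, remainder] = DivMod(numerator, denominator);
    return quotient * denominator + remainder;
}
```

This is roughly what a C# method returning (int, int) and a deconstructing assignment would look like.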
#include <variant>
// Make a variant that holds either an int32_t or a double
// Start off holding an int32_t
std::variant<int32_t, double> v{ 123 };
DebugLog(std::get<int32_t>(v)); // 123
DebugLog(v.index()); // 0
// Switch to holding a double
v = 3.14;
DebugLog(std::get<double>(v)); // 3.14
DebugLog(v.index()); // 1
// Trying to get a type that's not current throws an exception
DebugLog(std::get<int32_t>(v)); // throws std::bad_variant_access
// Check the type before getting it
if (std::holds_alternative<int32_t>(v))
{
DebugLog("int32_t", std::get<int32_t>(v)); // not printed
}
else
{
DebugLog("double", std::get<double>(v)); // double 3.14
}
// Get an int32_t pointer if that's the current type
// If it's not, get nullptr
if (int32_t* pVal = std::get_if<int32_t>(&v))
{
DebugLog(*pVal); // not printed
}
else
{
DebugLog("not an int"); // printed
}
// These helpers are common boilerplate to use lambdas with std::visit
// They're usually stashed away in some "utilities" header file
template<class... TFuncs> struct overloaded : TFuncs...
{
using TFuncs::operator()...;
};
template<class... TFuncs> overloaded(TFuncs...) -> overloaded<TFuncs...>;
// Call the appropriate lambda for the variant's current type
std::visit(overloaded {
[](double val) { DebugLog("double", val); }, // double 3.14
[](int32_t val) { DebugLog("int32_t", val); } // not printed
}, v);
// A class without a default constructor
struct IntWrapper
{
int Val;
IntWrapper(int val)
: Val(val)
{
}
};
// Compiler error: first type needs to be default constructible
std::variant<IntWrapper, float> v2;
// No compiler error: std::monostate is default constructible
// It's just a placeholder to work around this issue
std::variant<std::monostate, IntWrapper, float> v2;
// We can get a monostate, but it has no members so there's no reason to
std::monostate m = std::get<std::monostate>(v2);
Optional
Similar to std::variant holding one of many types, std::optional
holds either a value or the absence of a value. It’s similar to
Nullable<T>/T? in C# as well as Optional.
#include <optional>
// Create an optional with a value
std::optional<float> f{ 3.14f };
// Dereference it like a pointer to get its value
DebugLog(*f); // 3.14
// By default it has no value
std::optional<float> f2;
// Dereferencing without a value is undefined behavior
DebugLog(*f2); // Could be anything!
// Manually check for a value
if (f2.has_value())
{
DebugLog(*f2); // not printed
}
else
{
DebugLog("no value"); // gets printed
}
// Can also check by converting to bool
if (f2)
{
DebugLog(*f2); // not printed
}
else
{
DebugLog("no value"); // gets printed
}
// The value member function throws an exception if there's no value
DebugLog(f2.value()); // Throws std::bad_optional_access
// Get the value or a default
DebugLog(f2.value_or(0)); // 0
// Assign a value
f2 = 2.2f;
DebugLog(*f2); // 2.2
// Clear a value
f2.reset();
DebugLog(f2.has_value()); // false
// The nullopt constant indicates "no value"
f = std::nullopt;
DebugLog(f.has_value()); // false
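std::optional is most often seen as a return type for operations that may fail. Here is a minimal sketch with a hypothetical ParseDigit function, which is not part of any library:

```cpp
#include <optional>

// Hypothetical parser: returns the digit's value or no value at all
std::optional<int> ParseDigit(char c)
{
    if (c >= '0' && c <= '9')
    {
        // An int converts implicitly to std::optional<int>
        return c - '0';
    }
    return std::nullopt;
}
```

Callers can then choose between checking has_value, converting to bool, or supplying a fallback with value_or.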
Any
Similar to C#’s base System.Object/object type, C++ has std::any
in the <any> header. This is a container for any type of object or,
similar to null, no object at all.
#include <any>
// Create an empty std::any
std::any a;
// Check whether it has a value or is empty
if (a.has_value())
{
DebugLog("has value"); // not printed
}
else
{
DebugLog("empty"); // gets printed
}
// Set its value
a = 3.14f;
// Check the type
DebugLog(a.type() == typeid(float)); // true
// Get the value
// Note: not a real cast. Just a function with "cast" in the name.
DebugLog(std::any_cast<float>(a)); // 3.14
// Getting the wrong type throws an exception
try
{
DebugLog(std::any_cast<int32_t>(a));
}
catch (const std::bad_any_cast& ex)
{
DebugLog(ex.what()); // Maybe "Bad any_cast"
}
// Destroy the value and go back to being empty
a.reset();
DebugLog(a.has_value()); // false
// Another way to create a std::any
a = std::make_any<int32_t>(123);
DebugLog(std::any_cast<int32_t>(a)); // 123
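std::any_cast also has an overload taking a pointer to the std::any. It returns nullptr on a type mismatch instead of throwing. The IntOr helper in this sketch is hypothetical and exists only to show that overload:

```cpp
#include <any>

// Hypothetical helper: returns the held int or a fallback without throwing
int IntOr(const std::any& a, int fallback)
{
    // The pointer overload of any_cast yields nullptr on a type mismatch
    // or when the std::any is empty
    if (const int* p = std::any_cast<int>(&a))
    {
        return *p;
    }
    return fallback;
}
```

This is loosely comparable to the "as" operator in C#, which yields null rather than throwing InvalidCastException.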
Bit Set
C# has BitArray to represent an array of bits. The C++ equivalent is
std::bitset which is a class templated on the number of bits it
holds:
#include <bitset>
// Holds three bits that are all zero
std::bitset<3> zeroes;
// Indexing gives us bool values
DebugLog(zeroes[0], zeroes[1], zeroes[2]); // false, false, false
// Get a bit, but throw an exception if out of bounds
DebugLog(zeroes.test(1)); // false
//DebugLog(zeroes.test(3)); // throws std::out_of_range
// Manually bounds-check against the number of bits
if (3 < zeroes.size())
{
DebugLog(zeroes[3]); // not printed
}
else
{
DebugLog("out of bounds"); // gets printed
}
// Convert the bits of an unsigned long to a bitset
std::bitset<3> bits{ 0b101ul };
DebugLog(bits[0], bits[1], bits[2]); // true, false, true
// Compare bitsets
DebugLog(zeroes == bits); // false
// Check all the bits against 1
DebugLog(bits.all()); // false
DebugLog(bits.any()); // true
DebugLog(bits.none()); // false
DebugLog(bits.count()); // 2
// Set a bit
bits.set(0, false);
DebugLog(bits[0], bits[1], bits[2]); // false, false, true
// Set all bits to true or false
bits.set();
DebugLog(bits[0], bits[1], bits[2]); // true, true, true
bits.reset();
DebugLog(bits[0], bits[1], bits[2]); // false, false, false
// Perform bit operations on the set
bits |= 0b010;
DebugLog(bits[0], bits[1], bits[2]); // false, true, false
bits >>= 1;
DebugLog(bits[0], bits[1], bits[2]); // true, false, false
// Get bits as an integer
unsigned long ul = bits.to_ulong();
DebugLog(ul); // 1
// Bits are represented in a compact manner
std::bitset<1024> kb;
DebugLog(sizeof(kb)); // 128
Functional
Finally for this chapter, <functional> contains function-related
utilities. Some of these are class templates that have an operator()
so they can be called like functions. These were more useful before
lambdas were introduced to the language, but they are still commonly
seen as a named shorthand alternative to them:
#include <functional>
// An object to perform +
std::plus<int32_t> add;
DebugLog(add(2, 3)); // 5
// An object to perform ==
std::equal_to<int32_t> equal;
DebugLog(equal(2, 2)); // true
// An object to perform &&
std::logical_and<int32_t> la;
DebugLog(la(1, 0)); // false
// An object to perform &
std::bit_and<int32_t> ba;
DebugLog(ba(0b110, 0b011)); // 2 (0b010)
// An object to perform the negation of another object
auto ne = std::not_fn(equal);
DebugLog(ne(2, 3)); // true
// Some class with a member function
struct Adder
{
int32_t AddOne(int32_t val)
{
return val + 1;
}
};
// An object to call a member function
auto addOne = std::mem_fn(&Adder::AddOne);
Adder adder;
DebugLog(addOne(adder, 2)); // 3
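These function objects are typically passed to algorithms where a lambda would otherwise be written. The following sketch uses two hypothetical helpers, SortedDescending and Product, to show std::greater and std::multiplies in that role:

```cpp
#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: sorts descending using std::greater instead of a lambda
std::vector<int> SortedDescending(std::vector<int> values)
{
    std::sort(values.begin(), values.end(), std::greater<int>{});
    return values;
}

// Hypothetical helper: multiplies all elements using std::multiplies
int Product(const std::vector<int>& values)
{
    return std::accumulate(
        values.begin(), values.end(), 1, std::multiplies<int>{});
}
```

The equivalent lambdas, such as [](int a, int b) { return a > b; }, behave the same way but are longer to write.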
#include <functional>
// Create a std::function that calls a lambda
std::function<int32_t(int32_t, int32_t)> add{
[](int32_t a, int32_t b) {return a + b; } };
DebugLog(add(2, 3)); // 5
// Create a std::function that calls a free function
int32_t Add(int32_t a, int32_t b)
{
return a + b;
}
std::function<int32_t(int32_t, int32_t)> add2{Add};
DebugLog(add2(2, 3)); // 5
// Create a std::function that calls operator() on a class object
struct Adder
{
int32_t operator()(int32_t a, int32_t b)
{
return a + b;
}
};
std::function<int32_t(int32_t, int32_t)> add3{ Adder{} };
DebugLog(add3(2, 3)); // 5
#include <functional>
// Create an object that calls a lambda
auto add = std::bind(
// Lambda to call
[](int32_t a, int32_t b) { return a + b; },
// Placeholders for parameters
std::placeholders::_1,
std::placeholders::_2);
DebugLog(add(2, 3)); // 5
// Create an object that calls a free function
int32_t Add(int32_t a, int32_t b)
{
return a + b;
}
auto add2 = std::bind(
// Free function to call
Add,
// Placeholders for parameters
std::placeholders::_1,
std::placeholders::_2);
DebugLog(add2(2, 3)); // 5
// Create an object that calls a member function
struct Adder
{
int32_t Add(int32_t a, int32_t b)
{
return a + b;
}
};
Adder adder;
auto add3 = std::bind(
// Member function to call
&Adder::Add,
// Object to call it on
&adder,
// Placeholders for parameters
std::placeholders::_1,
std::placeholders::_2);
DebugLog(add3(2, 3)); // 5
#include <limits>
DebugLog(std::numeric_limits<int32_t>::min()); // -2147483648
DebugLog(std::numeric_limits<int32_t>::max()); // 2147483647
The min and max member functions are constexpr, so they can be
used in compile-time programming just like the equivalent C# const
fields.
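As a sketch of that compile-time use, the hypothetical SaturatingAdd function below clamps to the limits instead of overflowing and is checked with static_assert:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: adds two int32_t values, clamping instead of overflowing
constexpr std::int32_t SaturatingAdd(std::int32_t a, std::int32_t b)
{
    constexpr std::int32_t maxValue = std::numeric_limits<std::int32_t>::max();
    constexpr std::int32_t minValue = std::numeric_limits<std::int32_t>::min();
    if (b > 0 && a > maxValue - b)
    {
        return maxValue;
    }
    if (b < 0 && a < minValue - b)
    {
        return minValue;
    }
    return a + b;
}

// Evaluated entirely at compile time
static_assert(SaturatingAdd(2147483647, 1) == 2147483647);
static_assert(SaturatingAdd(1, 2) == 3);
```

The function is also callable at runtime with values that aren't known until then.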
#include <limits>
// Difference between 1.0 and the next representable floating-point value
DebugLog(std::numeric_limits<float>::epsilon()); // 1.19209e-07
// Largest error in rounding a floating-point value
DebugLog(std::numeric_limits<float>::round_error()); // 0.5
// Floating-point constants
DebugLog(std::numeric_limits<float>::infinity()); // inf
DebugLog(std::numeric_limits<float>::quiet_NaN()); // nan
DebugLog(std::numeric_limits<float>::signaling_NaN()); // nan
// Type info useful when writing templates
DebugLog(std::numeric_limits<float>::is_integer); // false
DebugLog(std::numeric_limits<float>::is_exact); // false
DebugLog(std::numeric_limits<float>::is_modulo); // false
DebugLog(std::numeric_limits<float>::digits10); // 6
Numbers
The <numbers> header was introduced in C++20 to provide
mathematical constants in the std::numbers namespace. C# has a
few of these as const fields of Math, but the selection is limited and
only double values are provided. C++ provides a more robust set as
variable templates for each numeric type:
#include <numbers>
// Base 2 log of e
DebugLog(std::numbers::log2e_v<float>); // 1.4427
// Base 10 log of e
DebugLog(std::numbers::log10e_v<float>); // 0.434294
// Pi
DebugLog(std::numbers::pi_v<float>); // 3.14159
// 1 divided by pi
DebugLog(std::numbers::inv_pi_v<float>); // 0.31831
// 1 divided by the square root of pi
DebugLog(std::numbers::inv_sqrtpi_v<float>); // 0.56419
// Natural logarithm of 2
DebugLog(std::numbers::ln2_v<float>); // 0.693147
// Natural logarithm of 10
DebugLog(std::numbers::ln10_v<float>); // 2.30259
// Square root of 2
DebugLog(std::numbers::sqrt2_v<float>); // 1.41421
// Square root of 3
DebugLog(std::numbers::sqrt3_v<float>); // 1.73205
// 1 divided by the square root of 3
DebugLog(std::numbers::inv_sqrt3_v<float>); // 0.57735
// The Euler–Mascheroni constant
DebugLog(std::numbers::egamma_v<float>); // 0.577216
// The golden ratio
DebugLog(std::numbers::phi_v<float>); // 1.61803
DebugLog(std::numbers::pi); // 3.14159
Numeric
We’ll cover the <numeric> header in two parts because it serves two
quite different purposes. In this chapter we’ll just look at three
common numeric algorithms it provides. These aren’t available in
C#:
#include <numeric>
// Greatest common divisor
DebugLog(std::gcd(12, 9)); // 3
// Least common multiple
DebugLog(std::lcm(12, 9)); // 36
// Half way between two numbers
DebugLog(std::midpoint(12.0, 9.0)); // 10.5
We’ll see the rest of the <numeric> header, which deals with
sequences of numbers, later in the book when we look at generic
algorithms.
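As a small illustration of where std::gcd is handy, this sketch reduces a fraction to lowest terms. The Reduce function is a hypothetical helper written for this example:

```cpp
#include <numeric>
#include <utility>

// Hypothetical helper: reduces a fraction to lowest terms with std::gcd
// Assumes the denominator is non-zero
std::pair<int, int> Reduce(int numerator, int denominator)
{
    const int divisor = std::gcd(numerator, denominator);
    return { numerator / divisor, denominator / divisor };
}
```

For example, Reduce(12, 9) divides both parts by their greatest common divisor of 3.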
Ratio
The <ratio> header centers on one class template: std::ratio. It
takes two integer template parameters representing a numerator and
a denominator. It has only two data members, num and den, and both
are static. These are calculated at compile time by dividing the
template parameters by their greatest common divisor:
#include <ratio>
// Greatest common divisor of 1000 and 60 is 20
using MsPerFrame = std::ratio<1000, 60>;
// num = 1000 / 20 = 50
// den = 60 / 20 = 3
DebugLog(MsPerFrame::num, MsPerFrame::den); // 50, 3
#include <ratio>
DebugLog(std::nano::num, std::nano::den); // 1, 1000000000
DebugLog(std::milli::num, std::milli::den); // 1, 1000
DebugLog(std::kilo::num, std::kilo::den); // 1000, 1
DebugLog(std::mega::num, std::mega::den); // 1000000, 1
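The <ratio> header also includes alias templates for compile-time arithmetic and comparison of ratios. Here is a brief sketch in which the alias names Half, Third, Sum, and One are made up for illustration:

```cpp
#include <ratio>

// 1/2 + 1/3 = 5/6, computed by the compiler
using Half = std::ratio<1, 2>;
using Third = std::ratio<1, 3>;
using Sum = std::ratio_add<Half, Third>;
static_assert(Sum::num == 5 && Sum::den == 6);

// 1000/1 * 1/1000 reduces to 1/1
using One = std::ratio_multiply<std::kilo, std::milli>;
static_assert(One::num == 1 && One::den == 1);

// Comparisons are also available as compile-time constants
static_assert(std::ratio_less<Third, Half>::value);
```

Since everything here is a type or a static constant, none of it costs anything at runtime.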
The durations we saw in the <chrono> header use instantiations of
std::ratio as their period, the number of seconds per tick. For example:
Alias                  Period ratio
std::chrono::seconds   std::ratio<1, 1>
std::chrono::minutes   std::ratio<60, 1>
std::chrono::hours     std::ratio<3600, 1>
std::chrono::days      std::ratio<86400, 1>
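The ratio behind each duration can be inspected through its period member type. This is a short sketch in which HoursToSeconds is a hypothetical helper:

```cpp
#include <chrono>
#include <ratio>
#include <type_traits>

// A duration's period member type is a std::ratio of seconds per tick
static_assert(std::is_same_v<std::chrono::minutes::period, std::ratio<60>>);
static_assert(std::is_same_v<std::chrono::milliseconds::period, std::milli>);

// Hypothetical helper: converts whole hours to seconds using those ratios
long long HoursToSeconds(int wholeHours)
{
    const std::chrono::hours h{ wholeHours };
    return std::chrono::duration_cast<std::chrono::seconds>(h).count();
}
```

The conversion factor of 3600 never appears in the code because it is derived from the two periods.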
#include <complex>
// Real part is 2. Imaginary part is 0.
std::complex<float> c1{ 2, 0 };
DebugLog(c1.real(), c1.imag()); // 2, 0
// Real part is 0. Imaginary part is 1.
std::complex<float> c2{ 0, 1 };
// Some operators are overloaded
DebugLog(c1 + c2); // 2, 1
DebugLog(c1 - c2); // 2, -1
DebugLog(c1 == c2); // false
DebugLog(c1 != c2); // true
DebugLog(-c1); // -2, -0
// Trigonometric functions
DebugLog(std::sin(c1)); // 0.909297, -0
DebugLog(std::cos(c1)); // -0.416147,-0
// Hyperbolic functions
DebugLog(std::sinh(c1)); // 3.62686, 0
DebugLog(std::cosh(c1)); // 3.7622, 0
// Exponential functions
DebugLog(std::pow(c1, c2)); // 0.769239, 0.638961
DebugLog(std::sqrt(c1)); // 1.41421, 0
// Misc functions
DebugLog(std::abs(c1)); // 2
DebugLog(std::norm(c1)); // 4
DebugLog(std::conj(c1)); // 2, -0
#include <complex>
using namespace std::literals::complex_literals;
std::complex<double> d = 2i;
DebugLog(d); // 0, 2
std::complex<float> f = 2if;
DebugLog(f); // 0, 2
std::complex<long double> ld = 2il;
DebugLog(ld); // 0, 2
Bit
The <bit> header, introduced in C++20, provides one enumeration
for dealing with endianness. This can be used like the
BitConverter.IsLittleEndian constant in C#:
#include <bit>
bool isLittleEndian = std::endian::native ==
std::endian::little;
DebugLog(isLittleEndian); // Maybe true
#include <bit>
// Check if only one bit is set, i.e. value is a power of two
DebugLog(std::has_single_bit(2u)); // true
DebugLog(std::has_single_bit(3u)); // false
// Get the smallest power of two greater than or equal to a value
DebugLog(std::bit_ceil(100u)); // 128
// Rotate bits left, wrapping around
DebugLog(
std::rotl(0b10100000000000000000000000000000, 2)
== 0b10000000000000000000000000000010); // true
// Count consecutive zero bits starting at the least-significant
DebugLog(std::countr_zero(0b1000u)); // 3
// Count total one bits
DebugLog(std::popcount(0b10101010101010101010101010101010)); // 16
// Reinterpret the bits of one type as another type
// Not a real cast, just a function with "cast" in the name
uint32_t i = std::bit_cast<uint32_t>(3.14f);
DebugLog(i); // 1078523331
#include <random>
// "Subtract with carry" algorithm for uint32_t values with parameters
std::subtract_with_carry_engine<uint32_t, 24, 10, 24> swc{};
// Generate random numbers
DebugLog(swc()); // Maybe 15039276
DebugLog(swc()); // Maybe 16323925
DebugLog(swc()); // Maybe 14283486
// Advance the engine state 100 steps without getting any numbers
swc.discard(100);
// Reset the seed
swc.seed(123);
// Some "subtract with carry" engines with common types and parameters
std::ranlux24_base r24; // 32-bit
std::ranlux48_base r48; // 64-bit
#include <random>
// "Mersenne Twister" engines
std::mersenne_twister_engine<
uint32_t, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13> mt{}; // Custom
std::mt19937 mt32{}; // 32-bit with common parameters
std::mt19937_64 mt64{}; // 64-bit with common parameters
// "Linear congruential generator" engines
std::linear_congruential_engine<uint32_t, 1, 2, 3> lce{}; // Custom
std::minstd_rand0 msr0; // 32-bit "Minimal standard"
std::minstd_rand msr1; // New version of 32-bit "Minimal standard"
#include <random>
// std::mt19937 is the underlying engine
// For each block of 32 random numbers, keep 2 of them
std::discard_block_engine<std::mt19937, 32, 2> db{};
uint32_t dbr = db();
// std::mt19937_64 is the underlying engine generating 64-bit numbers
// Convert them to 32-bit uint32_t values
std::independent_bits_engine<std::mt19937_64, 32, uint32_t> ib{};
uint32_t ibr = ib();
// std::mt19937 is the underlying engine
// Keep a table of 16 random numbers and shuffle the order returned
std::shuffle_order_engine<std::mt19937, 16> so{};
uint32_t sor = so();
// Alias of std::shuffle_order_engine<std::minstd_rand0, 256>
std::knuth_b kb{};
#include <random>
std::random_device rd{};
DebugLog(rd()); // Maybe 448041643
DebugLog(rd()); // Maybe 1317373389
DebugLog(rd()); // Maybe 393151656
None of these are typically used directly. That’s because they return
numbers on their full range of values. We usually want to generate
random numbers on some particular range, so we use one of many
“distribution” classes. These classes can also shape the random
numbers to fit certain patterns:
#include <random>
// Random number generator engine
std::mt19937 engine{};
// Normal/Gaussian distribution of float values
// The mean is 3 and the standard deviation is 1.5
std::normal_distribution<float> distribution{ 3.0f, 1.5f };
// Generate random numbers with the engine on the distribution
DebugLog(distribution(engine)); // Maybe 3.37974
DebugLog(distribution(engine)); // Maybe 2.56017
DebugLog(distribution(engine)); // Maybe 3.12689
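For the more common case of integers on a fixed range, std::uniform_int_distribution is the usual choice. This is a minimal sketch with a hypothetical RollDie helper, roughly what Random.Next(1, 7) does in C#:

```cpp
#include <random>

// Hypothetical helper: rolls a six-sided die using the given engine
int RollDie(std::mt19937& engine)
{
    // Uniformly distributed integers on the closed range [1, 6]
    std::uniform_int_distribution<int> distribution{ 1, 6 };
    return distribution(engine);
}
```

The engine is passed by reference so that its state advances between calls. Seeding it, perhaps from std::random_device, is left to the caller.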
#include <limits>
DebugLog(std::numeric_limits<int32_t>::min()); //
-2147483648
DebugLog(std::numeric_limits<int32_t>::max()); //
2147483647
The min and max member functions are constexpr, so they can be
used in compile-time programming just like the equivalent C# const
fields.
#include <limits>
// Difference between 1.0 and the next representable
floating-point value
DebugLog(std::numeric_limits<float>::epsilon()); //
1.19209e-07
// Largest error in rounding a floating-point value
DebugLog(std::numeric_limits<float>::round_error()); //
0.5
// Floating-point constants
DebugLog(std::numeric_limits<float>::infinity()); // inf
DebugLog(std::numeric_limits<float>::quiet_NaN()); // nan
DebugLog(std::numeric_limits<float>::signaling_NaN()); //
nan
// Type info useful when writing templates
DebugLog(std::numeric_limits<float>::is_integer); //
false
DebugLog(std::numeric_limits<float>::is_exact); // false
DebugLog(std::numeric_limits<float>::is_modulo); // false
DebugLog(std::numeric_limits<float>::digits10); // 6
Numbers
The <numbers> header was introduced in C++20 to provide
mathematical constants in the std::numbers namespace. C# has a
few of these as const fields of Math, but the selection is limited and
only double values are provided. C++ provides a more robust set as
variable templates for each numeric type:
#include <numbers>
// Base 2 log of e
DebugLog(std::numbers::log2e_v<float>); // 1.4427
// Base 10 log of e
DebugLog(std::numbers::log10e_v<float>); // 0.434294
// Pi
DebugLog(std::numbers::pi_v<float>); // 3.14159
// 1 divided by pi
DebugLog(std::numbers::inv_pi_v<float>); // 0.31831
// 1 divided by the square root of pi
DebugLog(std::numbers::inv_sqrtpi_v<float>); // 0.56419
// Natural logarithm of 2
DebugLog(std::numbers::ln2_v<float>); // 0.693147
// Natural logarithm of 10
DebugLog(std::numbers::ln10_v<float>); // 2.30259
// Square root of 2
DebugLog(std::numbers::sqrt2_v<float>); // 1.41421
// Square root of 3
DebugLog(std::numbers::sqrt3_v<float>); // 1.73205
// 1 divided by the square root of 3
DebugLog(std::numbers::inv_sqrt3_v<float>); // 0.57735
// The Euler–Mascheroni constant
DebugLog(std::numbers::egamma_v<float>); // 0.577216
// The golden ratio
DebugLog(std::numbers::phi_v<float>); // 1.61803
DebugLog(std::numbers::pi); // 3.14159
Numeric
We’ll cover the <numeric> header in two parts because it serves two
quite different purposes. In this chapter we’ll just look at three
common numeric algorithms it provides. These aren’t available in
C#:
#include <numeric>
// Greatest common divisor
DebugLog(std::gcd(12, 9)); // 3
// Least common multiple
DebugLog(std::lcm(12, 9)); // 36
// Half way between two numbers
DebugLog(std::midpoint(12.0, 9.0)); // 10.5
We’ll see the rest of the <numeric> header, which deals with
sequences of numbers, later in the book when we look at generic
algorithms.
Ratio
The <ratio> header provides a single class template: std::ratio. It
takes two integer template parameters representing a numerator and
a denominator. It has only two members, num and den, and both are
static. These are calculated at compile time by dividing the template
parameters by their greatest common divisor:
#include <ratio>
// Greatest common divisor of 1000 and 60 is 20
using MsPerFrame = std::ratio<1000, 60>;
// num = 1000 / 20 = 50
// den = 60 / 20 = 3
DebugLog(MsPerFrame::num, MsPerFrame::den); // 50, 3
#include <ratio>
DebugLog(std::nano::num, std::nano::den); // 1, 1000000000
DebugLog(std::milli::num, std::milli::den); // 1, 1000
DebugLog(std::kilo::num, std::kilo::den); // 1000, 1
DebugLog(std::mega::num, std::mega::den); // 1000000, 1
The durations we saw in the <chrono> header are instantiations of
std::chrono::duration, each of which uses a std::ratio as its period
(the number of seconds per tick). For example:
Alias                  Period
std::chrono::seconds   std::ratio<1, 1>
std::chrono::minutes   std::ratio<60, 1>
std::chrono::hours     std::ratio<3600, 1>
std::chrono::days      std::ratio<86400, 1>
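As a rough sketch of how durations and ratios fit together, the following defines a custom duration whose tick lasts one sixtieth of a second. The names Frames, FramesInSeconds, and MillisecondsInFrames are made up for illustration; only the <chrono> and <ratio> types come from the Standard Library:

```cpp
#include <chrono>
#include <ratio>
#include <type_traits>

// Hypothetical duration: one tick lasts 1/60 of a second
using Frames = std::chrono::duration<long long, std::ratio<1, 60>>;

// Every duration exposes its tick length as a std::ratio named period
static_assert(std::is_same_v<std::chrono::minutes::period, std::ratio<60, 1>>);
static_assert(std::is_same_v<std::chrono::milliseconds::period, std::milli>);

// Seconds convert to frames implicitly because no precision is lost
constexpr long long FramesInSeconds(long long seconds)
{
    Frames frames = std::chrono::seconds{ seconds };
    return frames.count();
}

// Frames to milliseconds can lose precision, so duration_cast is required
constexpr long long MillisecondsInFrames(long long frames)
{
    return std::chrono::duration_cast<std::chrono::milliseconds>(
        Frames{ frames }).count();
}
```

Whether a conversion is implicit is decided at compile time by dividing one period by the other, which is why the first function needs no cast and the second does.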
#include <complex>
// Real part is 2. Imaginary part is 0.
std::complex<float> c1{ 2, 0 };
DebugLog(c1.real(), c1.imag()); // 2, 0
// Real part is 0. Imaginary part is 1.
std::complex<float> c2{ 0, 1 };
// Some operators are overloaded
DebugLog(c1 + c2); // 2, 1
DebugLog(c1 - c2); // 2, -1
DebugLog(c1 == c2); // false
DebugLog(c1 != c2); // true
DebugLog(-c1); // -2, -0
// Trigonometric functions
DebugLog(std::sin(c1)); // 0.909297, -0
DebugLog(std::cos(c1)); // -0.416147,-0
// Hyperbolic functions
DebugLog(std::sinh(c1)); // 3.62686, 0
DebugLog(std::cosh(c1)); // 3.7622, 0
// Exponential functions
DebugLog(std::pow(c1, c2)); // 0.769239, 0.638961
DebugLog(std::sqrt(c1)); // 1.41421, 0
// Misc functions
DebugLog(std::abs(c1)); // 2
DebugLog(std::norm(c1)); // 4
DebugLog(std::conj(c1)); // 2, -0
#include <complex>
using namespace std::literals::complex_literals;
std::complex<double> d = 2i;
DebugLog(d); // 0, 2
std::complex<float> f = 2if;
DebugLog(f); // 0, 2
std::complex<long double> ld = 2il;
DebugLog(ld); // 0, 2
Bit
The <bit> header, introduced in C++20, provides one enumeration
for dealing with endianness. This can be used like the
BitConverter.IsLittleEndian constant in C#:
#include <bit>
bool isLittleEndian = std::endian::native == std::endian::little;
DebugLog(isLittleEndian); // Maybe true
#include <bit>
// Check if only one bit is set, i.e. value is a power of two
DebugLog(std::has_single_bit(2u)); // true
DebugLog(std::has_single_bit(3u)); // false
// Get the smallest power of two greater than or equal to a value
DebugLog(std::bit_ceil(100u)); // 128
// Rotate bits left, wrapping around
DebugLog(
std::rotl(0b10100000000000000000000000000000, 2)
== 0b10000000000000000000000000000010); // true
// Count consecutive zero bits starting at the least-significant bit
DebugLog(std::countr_zero(0b1000u)); // 3
// Count total one bits
DebugLog(std::popcount(0b10101010101010101010101010101010)); // 16
// Reinterpret the bits of one type as another type
// Not a real cast, just a function with "cast" in the name
uint32_t i = std::bit_cast<uint32_t>(3.14f);
DebugLog(i); // 1078523331
#include <random>
// "Subtract with carry" algorithm for uint32_t values with parameters
std::subtract_with_carry_engine<uint32_t, 24, 10, 24> swc{};
// Generate random numbers
DebugLog(swc()); // Maybe 15039276
DebugLog(swc()); // Maybe 16323925
DebugLog(swc()); // Maybe 14283486
// Advance the engine state 100 steps without getting any numbers
swc.discard(100);
// Reset the seed
swc.seed(123);
// Some "subtract with carry" engines with common types and parameters
std::ranlux24_base r24; // 32-bit
std::ranlux48_base r48; // 64-bit
#include <random>
// "Mersenne Twister" engines
// The parameters must satisfy constraints such as m <= n and r <= w
// These custom parameters are the ones std::mt19937 uses
std::mersenne_twister_engine<
uint32_t, 32, 624, 397, 31, 0x9908b0df, 11, 0xffffffff, 7,
0x9d2c5680, 15, 0xefc60000, 18, 1812433253>
mt{}; // Custom
std::mt19937 mt32{}; // 32-bit with common parameters
std::mt19937_64 mt64{}; // 64-bit with common parameters
// "Linear congruential generator" engines
std::linear_congruential_engine<uint32_t, 1, 2, 3> lce{}; // Custom
std::minstd_rand0 msr0; // 32-bit "Minimal standard"
std::minstd_rand msr1; // New version of 32-bit "Minimal standard"
#include <random>
// std::mt19937 is the underlying engine
// For each block of 32 random numbers, keep 2 of them
std::discard_block_engine<std::mt19937, 32, 2> db{};
uint32_t dbr = db();
// std::mt19937_64 is the underlying engine generating 64-bit numbers
// Convert them to 32-bit uint32_t values
std::independent_bits_engine<std::mt19937_64, 32, uint32_t> ib{};
uint32_t ibr = ib();
// std::mt19937 is the underlying engine
// Keep a table of 16 random numbers and shuffle the order returned
std::shuffle_order_engine<std::mt19937, 16> so{};
uint32_t sor = so();
// Alias of std::shuffle_order_engine<std::minstd_rand0, 256>
std::knuth_b kb{};
#include <random>
std::random_device rd{};
DebugLog(rd()); // Maybe 448041643
DebugLog(rd()); // Maybe 1317373389
DebugLog(rd()); // Maybe 393151656
None of these engines is typically used directly because each returns
numbers across its full range of values. We usually want random
numbers in some particular range, so we pair an engine with one of
the many “distribution” classes. These classes can also shape the
random numbers to fit certain patterns:
#include <random>
// Random number generator engine
std::mt19937 engine{};
// Normal/Gaussian distribution of float values
// The mean is 3 and the standard deviation is 1.5
std::normal_distribution<float> distribution{ 3.0f, 1.5f };
// Generate random numbers with the engine on the distribution
DebugLog(distribution(engine)); // Maybe 3.37974
DebugLog(distribution(engine)); // Maybe 2.56017
DebugLog(distribution(engine)); // Maybe 3.12689
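For the common case of wanting integers in a bounded range, a minimal sketch with std::uniform_int_distribution might look like this. RollDie is a hypothetical helper, not part of the Standard Library:

```cpp
#include <random>

// Hypothetical helper: roll a six-sided die using the caller's engine
int RollDie(std::mt19937& engine)
{
    // Both bounds are inclusive, so this produces 1, 2, 3, 4, 5, or 6
    std::uniform_int_distribution<int> die{ 1, 6 };
    return die(engine);
}
```

Passing the engine by reference keeps its state advancing between calls. Seeding it with a fixed value gives a repeatable sequence, while seeding it from std::random_device gives a different sequence on each run.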
#include <thread>
#include <chrono>
void PrintLoop(const char* threadName)
{
for (int i = 0; i < 10; ++i)
{
DebugLog(threadName, i);
// this_thread provides functions that operate on the current thread
// sleep_for takes a std::chrono::duration
std::this_thread::sleep_for(std::chrono::milliseconds{100});
}
}
void MyThread()
{
PrintLoop("MyThread");
}
// Create a thread and immediately start executing MyThread in it
std::thread t{MyThread};
// This happens on the main thread
PrintLoop("Main Thread");
// Block until the thread terminates
t.join();
Main Thread, 0
MyThread, 0
MyThread, 1
Main Thread, 1
Main Thread, 2
MyThread, 2
MyThread, 3
Main Thread, 3
Main Thread, 4
MyThread, 4
MyThread, 5
Main Thread, 5
MyThread, 6
Main Thread, 6
MyThread, 7
Main Thread, 7
Main Thread, 8
MyThread, 8
Main Thread, 9
MyThread, 9
#include <thread>
#include <chrono>
// Enable duration literals like 1500ms
using namespace std::chrono_literals;
// Sleep until a specific time
std::this_thread::sleep_until(std::chrono::system_clock::now() + 1500ms);
// Tell the OS to schedule other threads
std::this_thread::yield();
// Get the current thread's ID
// Has overloaded comparison operators and works with std::hash
std::thread::id i = std::this_thread::get_id();
std::thread t{
[&] {
DebugLog(i == std::this_thread::get_id()); // false
}
};
t.join();
#include <thread>
void Thread(int param)
{
DebugLog(param); // 123 then 456 or vice versa
}
std::thread t1{ Thread, 123 };
std::thread t2{ Thread, 456 };
t1.join();
t2.join();
#include <thread>
std::thread t{ [] {} };
// Get the ID from outside the thread
std::thread::id id = t.get_id();
// Get a platform-dependent handle to the thread
std::thread::native_handle_type handle = t.native_handle();
// Check how many threads the CPU can run at once
// Depends on number of processors, cores, Hyper-threading, etc.
unsigned int hc = std::thread::hardware_concurrency();
DebugLog(hc); // Maybe 8
// Check if the thread is active, i.e. we can join() it
DebugLog(t.joinable()); // true
t.join();
DebugLog(t.joinable()); // false
The last function is detach, which releases the OS thread from the
std::thread:
#include <thread>
#include <chrono>
std::thread t{ [] {
DebugLog("thread start");
std::this_thread::sleep_for(std::chrono::milliseconds{500});
DebugLog("thread done");
} };
// Release the OS thread
t.detach();
DebugLog(t.joinable()); // false
DebugLog("main thread done");
// Can't join() the thread anymore, so sleep longer than it runs
std::this_thread::sleep_for(std::chrono::milliseconds{ 1000 });
false
main thread done
thread start
thread done
#include <thread>
#include <chrono>
void Foo()
{
std::thread t{ [] {
std::this_thread::sleep_for(std::chrono::milliseconds{500});
} };
} // destructor calls std::terminate because t is still joinable
#include <thread>
#include <chrono>
void Foo()
{
std::jthread t{ [] {
std::this_thread::sleep_for(std::chrono::milliseconds{500});
} };
} // destructor calls join()
To support cancelation, the thread function can take a
std::stop_token, defined in <stop_token>, which it uses to check
whether another thread has requested that it stop executing. Using
this “stop” functionality allows us to avoid some tricky inter-thread
communication. The closest analog in C# is CancellationToken:
#include <thread>
#include <chrono>
void Foo()
{
// Thread function takes a stop_token
std::jthread t{ [] (std::stop_token st) {
// Check if a stop is requested
while (!st.stop_requested())
{
std::this_thread::sleep_for(std::chrono::milliseconds{100});
DebugLog("Thread still running");
}
DebugLog("Stop requested");
std::this_thread::sleep_for(std::chrono::milliseconds{ 500 });
} };
std::this_thread::sleep_for(std::chrono::seconds{ 1 });
// Request that the thread stop executing
// This does not block like join() would
t.request_stop();
DebugLog("After requesting stop");
} // jthread destructor calls join(). About 500 milliseconds pass here...
Stop Token
Besides defining std::stop_token, the <stop_token> header has a
couple other features related to std::jthread. First, there is
std::stop_source which issues std::stop_token objects:
#include <stop_token>
// Create a source of tokens
std::stop_source source{};
// Issue tokens from the source
std::stop_token t1 = source.get_token();
std::stop_token t2 = source.get_token();
// No stop is initially requested
DebugLog(t1.stop_requested(), t2.stop_requested()); // false, false
// Request a stop on all tokens issued by the source
source.request_stop();
DebugLog(t1.stop_requested(), t2.stop_requested()); // true, true
#include <thread>
#include <stop_token>
#include <chrono>
// Thread that sleeps for 1 second
std::jthread t1{
[]
{
std::this_thread::sleep_for(std::chrono::milliseconds{1000});
}
};
std::stop_source source1 = t1.get_stop_source();
std::stop_token token1 = t1.get_stop_token();
// Thread that sleeps for 0.5 seconds then stops thread 1's source
std::jthread t2{
[](std::stop_source ss)
{
std::this_thread::sleep_for(std::chrono::milliseconds{500});
// Calls the below callback on this thread
ss.request_stop();
},
source1
};
std::thread::id id2 = t2.get_id();
// Register a callback for when thread 1's token is stopped
std::stop_callback sc{
token1,
[&]
{
// Print which thread the callback was called on
DebugLog(id2 == std::this_thread::get_id()); // true
}
};
// Wait 2 seconds for the threads to do their work
std::this_thread::sleep_for(std::chrono::milliseconds{ 2000 });
Mutex
Proper synchronization between threads is essential to prevent data
corruption and logic errors. To this end, C++ provides numerous
facilities starting with std::mutex, the equivalent of C#’s Mutex class.
#include <thread>
#include <mutex>
// An array to fill up with integers
constexpr int size = 10;
int integers[size];
int index = 0;
// A mutex to control access to the array
std::mutex m{};
auto writer = [&]
{
while (true)
{
// Lock the mutex before accessing shared state: index and integers
m.lock();
// Access shared state by reading index
if (index >= size)
{
// Unlock the mutex when done with the shared state
m.unlock();
break;
}
// Access shared state by reading and writing index and writing integers
integers[index] = index;
index++;
// Unlock the mutex when done with the shared state
m.unlock();
}
};
std::thread t1{writer};
std::thread t2{writer};
t1.join();
t2.join();
for (int i = 0; i < size; ++i)
{
DebugLog(integers[i]); // 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
}
More mutex classes are available besides the basic std::mutex. The
std::timed_mutex class allows us to attempt to lock a mutex for a
certain amount of time:
#include <mutex>
#include <chrono>
std::timed_mutex m{};
// Try to get a lock for up to 1 millisecond then give up
bool didLock = m.try_lock_for(std::chrono::milliseconds{ 1 });
#include <mutex>
std::recursive_mutex m{};
// First lock
m.lock();
// Second lock: OK with recursive_mutex but not regular mutex
m.lock();
// Unlock second lock
m.unlock();
// Unlock first lock
m.unlock();
std::mutex m1{};
std::mutex m2{};
// Lock both mutexes
std::lock(m1, m2);
m1.unlock();
m2.unlock();
#include <thread>
#include <mutex>
// Keeps track of whether the function has been called
std::once_flag of{};
// Function to call once
auto print = [](int x) { DebugLog("called once", x); };
// Two threads racing to call the function
auto threadFunc = [&](int x) { std::call_once(of, print,
x); };
std::thread t1{ threadFunc, 123 };
std::thread t2{ threadFunc, 456 };
t1.join();
t2.join();
#include <mutex>
void Foo()
{
// Mutex to lock
std::mutex m{};
// Create a lock object for the mutex
// Constructor locks the mutex
std::lock_guard g{ m };
} // lock_guard's destructor unlocks the mutex
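To see a lock guard protecting real shared state, here is a small sketch that wraps an int and a mutex in a class. Counter and CountFromThreads are hypothetical names used only for this illustration:

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counter whose value is protected by a mutex
class Counter
{
    int Value = 0;
    mutable std::mutex Mutex;

public:
    void Increment()
    {
        // Locked here, unlocked when g goes out of scope
        std::lock_guard<std::mutex> g{ Mutex };
        ++Value;
    }

    int Get() const
    {
        std::lock_guard<std::mutex> g{ Mutex };
        return Value;
    }
};

// Increment one counter from several threads and return the final value
int CountFromThreads(int numThreads, int incrementsPerThread)
{
    Counter counter;
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t)
    {
        threads.emplace_back([&counter, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i)
            {
                counter.Increment();
            }
        });
    }
    for (std::thread& thread : threads)
    {
        thread.join();
    }
    return counter.Get();
}
```

Because every access to Value goes through a lock guard, no increments are lost, and there is no code path that can forget to unlock the mutex, even if an exception is thrown.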
#include <mutex>
void Foo()
{
std::mutex m1{};
std::mutex m2{};
// Make the locks, but don't lock the mutexes yet
std::unique_lock g1{ m1, std::defer_lock };
std::unique_lock g2{ m2, std::defer_lock };
// Lock them both, avoiding deadlocks
std::lock(g1, g2);
} // unique_lock's destructor unlocks both mutexes
#include <mutex>
void Foo()
{
std::mutex m1{};
std::mutex m2{};
// Lock both mutexes
std::scoped_lock g{ m1, m2 };
} // scoped_lock's destructor unlocks both mutexes
Shared Mutex
C++17 adds another mutex type, std::shared_mutex, in the
<shared_mutex> header. There are two ways to lock this mutex:
“exclusive” and “shared.” An “exclusive” lock can only be taken by
one thread at a time and prevents any threads from taking a “shared”
lock. A “shared” lock allows other threads to take a “shared” lock but
not an “exclusive” lock. Regardless of the kind of lock, any given
thread can only lock once.
#include <mutex>
#include <shared_mutex>
class SharedInt
{
int Value = 0;
// Mutex that protects the value
mutable std::shared_mutex Mutex;
public:
int GetValue() const
{
// Multiple threads can read at once, so take a "shared" lock
std::shared_lock lock{ Mutex };
return Value;
}
void SetValue(int value)
{
// Only one thread can write at once, so take an "exclusive" lock
std::unique_lock lock{ Mutex };
Value = value;
}
};
Semaphore
C++20 introduces more synchronization mechanisms than just
mutexes, starting with std::counting_semaphore in <semaphore>.
This is the analog of C#’s Semaphore class and it allows more than
one access at a time:
#include <semaphore>
// Allow up to 3 accesses with the counter starting at 3
std::counting_semaphore<3> cs{ 3 };
// Block while the counter is 0 then decrement it by 1
cs.acquire();
// Counter is now 2
cs.acquire();
// Counter is now 1
cs.acquire();
// Counter is now 0
// Try to acquire, but fail because the counter is at 0
bool didAcquire = cs.try_acquire();
DebugLog(didAcquire); // false
// Increment the counter
cs.release();
// Counter is now 1
DebugLog(cs.try_acquire()); // true
// Counter is now 0
#include <chrono>
#include <thread>
#include <barrier>
void Foo()
{
// Define a function to call when the barrier is completed
auto complete = []() noexcept {};
// Expect three threads to arrive before the barrier is completed
std::barrier b{ 3, complete };
auto threadFunc = [&](int id)
{
// Do something before arriving at the barrier
DebugLog("before arrival", id);
// Arrive at the barrier and get a token
auto arrivalToken = b.arrive();
// Do something after arriving at the barrier
DebugLog("after arrival", id);
// Wait for the barrier to complete
b.wait(std::move(arrivalToken));
// Do something after the barrier completes
DebugLog("after waiting", id);
};
std::jthread t1{ threadFunc, 1 };
std::jthread t2{ threadFunc, 2 };
std::jthread t3{ threadFunc, 3 };
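// Give the threads time to run
// The barrier completes on its own, calling the completion function,
// once all three threads have arrived
std::this_thread::sleep_for(std::chrono::seconds{ 3 });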
}
This prints:
before arrival, 3
before arrival, 1
before arrival, 2
after arrival, 3
after arrival, 1
after arrival, 2
after waiting, 2
after waiting, 1
after waiting, 3
Latch
The std::latch class in C++20’s <latch> header provides a single-
use version of std::barrier. This class is flexible in different ways
than std::barrier. One is that any given thread can decrement the
counter more than once. Another is that decrementing can be by
more than one step. There’s no completion function though. Instead,
threads blocking on the latch are resumed when the counter hits
zero.
#include <chrono>
#include <thread>
#include <latch>
// Start the counter at three
std::latch latch{ 3 };
auto threadFunc = [&](int id)
{
// Do something before
DebugLog("before", id);
// Decrement the counter by one
latch.count_down();
// Do something after
DebugLog("after", id);
// Wait for the counter to hit zero
latch.wait();
// Do something after the counter hits zero
DebugLog("after zero", id);
};
std::jthread t1{ threadFunc, 1 };
std::this_thread::sleep_for(std::chrono::seconds{ 2 });
std::jthread t2{ threadFunc, 2 };
std::this_thread::sleep_for(std::chrono::seconds{ 2 });
std::jthread t3{ threadFunc, 3 };
This prints:
before, 1
after, 1
before, 2
after, 2
And two more seconds later, thread 3 reduces the latch to zero…
before, 3
after, 3
after zero, 3
after zero, 1
after zero, 2
#include <thread>
#include <mutex>
#include <condition_variable>
// Mutex and condition variable to coordinate the threads
std::mutex m;
std::condition_variable cv;
// Flags to indicate that work is ready and the result of work is ready
bool workReady;
bool resultReady;
// The result of work
int result;
// Thread that does the work
// First waits for the condition to be set indicating that work is ready
void WorkThread()
{
// Lock the mutex
DebugLog("Work thread locking mutex");
std::unique_lock<std::mutex> lock(m);
// Wait for the workReady flag to be set to true
DebugLog("Work thread waiting for workReady flag");
cv.wait(lock, [] { return workReady; });
// Now we have the mutex locked
// Do "work" by setting the shared value to 123
DebugLog("Work thread doing work");
result = 123;
// Set the resultReady flag to tell the other thread our work is done
DebugLog("Work thread setting resultReady flag");
resultReady = true;
// Unlock the mutex
DebugLog("Work thread unlocking mutex");
lock.unlock();
// Notify the condition variable
DebugLog("Work thread notifying CV");
cv.notify_one();
DebugLog("Work thread done");
}
// Initially nothing is ready
result = 0;
workReady = false;
resultReady = false;
// Start the thread
// It'll start waiting for the condition variable
std::thread worker{ WorkThread };
{
// Lock the mutex
DebugLog("Main thread locking mutex");
std::lock_guard lg(m);
// Set the flag to indicate that work is ready
DebugLog("Main thread setting workReady flag");
workReady = true;
} // Unlock the mutex (via lock_guard destructor)
// Notify the condition variable
DebugLog("Main thread notifying CV");
cv.notify_one();
{
// Lock the mutex
DebugLog("Main thread locking mutex to get result");
std::unique_lock ul(m);
// Wait for the resultReady flag to be set to true
DebugLog("Main thread waiting for resultReady flag");
cv.wait(ul, [] { return resultReady; });
}
// Use the result
DebugLog("Main thread got result", result);
worker.join();
This prints:
#include <atomic>
#include <thread>
// Make an atomic int starting at zero
std::atomic<int> val{ 0 };
// Run three threads that each use the atomic int
auto threadFunc = [&]
{
for (int i = 0; i < 1000; ++i)
{
// Call the overloaded ++ operator
// Atomically adds one
val++;
}
};
std::jthread t1{ threadFunc };
std::jthread t2{ threadFunc };
std::jthread t3{ threadFunc };
t1.join();
t2.join();
t3.join();
DebugLog(val); // 3000
struct Player
{
const char* Name;
int32_t Score;
int32_t Health;
};
std::atomic<Player> ap;
std::atomic<int> val{ 0 };
DebugLog(val.is_lock_free()); // true
DebugLog(ap.is_lock_free()); // Maybe false
#include <atomic>
std::atomic<int> val{ 0 };
// Write and customize how memory ordering is affected
// No synchronization
val.store(1, std::memory_order_relaxed);
// No earlier reads or writes reordered after this
val.store(2, std::memory_order_release);
// Sequentially consistent
val.store(3, std::memory_order_seq_cst);
// Read and customize how memory ordering is affected
int i;
// No synchronization
i = val.load(std::memory_order_relaxed);
// No later reads or writes that depend on the value reordered before this
i = val.load(std::memory_order_consume);
// No later reads or writes reordered before this
i = val.load(std::memory_order_acquire);
// Sequentially consistent
i = val.load(std::memory_order_seq_cst);
#include <atomic>
std::atomic<int> v{ 123 };
// Set a new value and return the old value
int old = v.exchange(456);
DebugLog(old); // 123
// Set a new value if the current value is an expected value
int expected = 456;
bool exchanged = v.compare_exchange_strong(expected, 789);
DebugLog(exchanged); // true
DebugLog(v); // 789
exchanged = v.compare_exchange_strong(expected, 1000);
DebugLog(exchanged); // false
DebugLog(v); // 789
// A "weak" version that might fail spuriously even if the value matches
exchanged = v.compare_exchange_weak(expected, 1000);
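Because the weak version is allowed to fail for no visible reason, it is normally called in a retry loop. The following is a sketch of that pattern, assuming we want an atomic "maximum" operation. AtomicMax is a hypothetical helper, not a Standard Library function:

```cpp
#include <atomic>

// Hypothetical helper: atomically raise target to at least value
// Returns true if this call stored value
bool AtomicMax(std::atomic<int>& target, int value)
{
    int current = target.load();
    // Keep trying while the stored value is still smaller than ours
    while (current < value)
    {
        // On failure, current is overwritten with the latest stored value,
        // so the loop condition is re-checked against fresh data
        if (target.compare_exchange_weak(current, value))
        {
            return true;
        }
    }
    return false;
}
```

The loop absorbs both spurious failures and genuine races with other threads. Either way, the failed exchange refreshes current and the condition decides whether another attempt is still needed.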
Future
Lastly, we have <future> with its future and async functionality. The
async function works conceptually similarly to Task in C# in that the
platform takes care of running a function, presumably on another
thread in a thread pool. A future is returned as a placeholder for the
eventual return value of that function:
#include <chrono>
#include <thread>
#include <future>
DebugLog("Calling async");
std::future<int> f {
std::async(
[] {
std::this_thread::sleep_for(std::chrono::seconds{2});
return 123;
}
)
};
DebugLog("Waiting");
f.wait();
DebugLog("Getting return value");
int retVal = f.get();
DebugLog("Got return value", retVal);
This prints:
Calling async
Waiting
#include <chrono>
#include <thread>
#include <future>
DebugLog("Calling async");
std::future<int> f{
std::async(std::launch::deferred, [] {
std::this_thread::sleep_for(std::chrono::seconds{2});
return 123; }) };
DebugLog("Doing something else");
std::this_thread::sleep_for(std::chrono::seconds{ 5 });
DebugLog("Getting return value");
int retVal = f.get();
DebugLog("Got return value", retVal);
This prints:
Calling async
Doing something else
std::this_thread::sleep_for(std::chrono::seconds{2});
p.set_value(123);
}
};
// Block until the value is ready
DebugLog("Getting return value");
int retVal = f.get();
DebugLog(retVal);
This prints:
Getting return value
123
#include <chrono>
#include <thread>
#include <future>
int DoWork()
{
std::this_thread::sleep_for(std::chrono::seconds{ 2
});
return 123;
}
// Wrap DoWork in a class object
std::packaged_task pt{ DoWork };
// Get a future for when the packaged task is executed
std::future<int> f{ pt.get_future() };
// Call the packaged task on another thread
std::jthread t{ [&] { pt(); } };
// Block (for about 2 seconds) until DoWork returns
DebugLog("Getting return value");
int retVal = f.get();
DebugLog(retVal);
This prints:
Getting return value
123
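A future can carry an exception as well as a value. As a minimal sketch, assuming the work is launched with std::async and may throw, the exception is stored in the future and rethrown from get() on the waiting thread. RunAndReport is a hypothetical function written for this example:

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical helper: run work asynchronously and describe the outcome
std::string RunAndReport(bool shouldFail)
{
    std::future<int> f{
        std::async(std::launch::async, [shouldFail]() -> int {
            if (shouldFail)
            {
                throw std::runtime_error{ "work failed" };
            }
            return 123;
        })
    };
    try
    {
        // get() either returns the value or rethrows the stored exception
        return std::to_string(f.get());
    }
    catch (const std::runtime_error& e)
    {
        return e.what();
    }
}
```

This lets the caller handle failures of asynchronous work with ordinary try and catch blocks, much as awaiting a faulted Task does in C#.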
Conclusion
The C++ Standard Library provides us with quite a few multi-threading
tools. At the most basic, we have thread and jthread to create our
own threads. Once we’ve created these, we have a huge variety of
synchronization mechanisms: mutexes, latches, barriers, semaphores,
condition variables, and atomics. The <future> header provides
future, promise, and packaged_task to wrap up work that’ll be
done asynchronously and either complete or throw an exception in
the future. These generic tools allow us to avoid implementing
extremely complex and error-prone thread synchronization strategies
ourselves.
#include <charconv>
// Buffer to print the value to
char buf[100];
char* end = buf + sizeof(buf);
// Print 3.14 to the buffer in scientific notation
std::to_chars_result tcr{
std::to_chars(buf, end, 3.14,
std::chars_format::scientific) };
// Add a NUL terminator at the returned pointer, which points to the
// character after the last printed character
*tcr.ptr = '\0';
DebugLog(buf); // 3.14e+00
DebugLog("Success?", tcr.ec == std::errc()); // true
DebugLog("End pointer index", tcr.ptr - buf); // 8
// Read 3.14e+00 from the buffer
double val;
std::from_chars_result fcr{ std::from_chars(buf, end,
val) };
DebugLog(val); // 3.14
DebugLog("Success?", fcr.ec == std::errc()); // true
DebugLog("End pointer index", fcr.ptr - buf); // 8
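Since these functions report failure through an error code rather than an exception, callers are expected to check the result. Here is one possible sketch of a stricter parser that also rejects trailing characters. ParseInt is a hypothetical helper, and returning std::optional is just one reasonable design choice:

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse an entire string as a base-10 int
// Returns an empty optional on any failure
std::optional<int> ParseInt(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    // from_chars_result has two members: ptr and ec
    auto [ptr, ec] = std::from_chars(first, last, value);
    // ec reports invalid input or out-of-range values
    // ptr != last means some characters were left unparsed
    if (ec != std::errc() || ptr != last)
    {
        return std::nullopt;
    }
    return value;
}
```

Unlike std::stoi, this neither throws nor allocates, and it does not skip leading whitespace or consult the current locale.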
Alias                 Template                           Meaning
std::wstring          std::basic_string<wchar_t>         Wide character string
std::pmr::wstring     std::pmr::basic_string<wchar_t>    Wide character string
std::pmr::u8string    std::pmr::basic_string<char8_t>    UTF-8 string
std::pmr::u16string   std::pmr::basic_string<char16_t>   UTF-16 string
std::pmr::u32string   std::pmr::basic_string<char32_t>   UTF-32 string
Whichever we choose, the class “owns” the memory that the string is
stored in. That means it allocates memory when needed and
deallocates it in the destructor. It also provides a bunch of member
functions to perform common operations on the string. Here’s a
sampling of that functionality:
#include <string>
void Foo()
{
// Allocate memory for the string
std::string s{ "hello world" };
// Read and write individual characters
s[0] = 'H';
s[6] = 'W';
DebugLog(s); // Hello World
// Get a NUL-terminated const pointer to the first character (a C string)
const char* cs = s.c_str();
DebugLog(cs); // Hello World
// Get a non-const pointer to the first character
char* d = s.data();
DebugLog(d); // Hello World
// Check if the string is empty
DebugLog(s.empty()); // false
// Get the number of characters in the string
DebugLog(s.size()); // 11
DebugLog(s.length()); // 11
// Check how much capacity is there to hold characters
DebugLog(s.capacity()); // Maybe 15
// Allocate enough memory to hold a certain number of characters
// Note: cannot be used to shrink the string
s.reserve(128);
DebugLog(s.capacity()); // At least 128
// Request reducing allocated memory to just enough to hold the string
s.shrink_to_fit();
DebugLog(s.capacity()); // Maybe 15
// Add a character to the end
s.push_back('!');
DebugLog(s); // Hello World!
// Check if the string starts with another string
DebugLog(s.starts_with("Hello")); // true
// Replace 1 character starting at index 5 with a comma and a space
s.replace(5, 1, ", ");
DebugLog(s); // Hello, World!
// Get a string of 5 characters starting at index 7
std::string ss{ s.substr(7, 5) };
DebugLog(ss); // World
// Find an index of a string in the string
std::string::size_type i = s.find("llo");
DebugLog(i); // 2
// Copy the string to another string
std::string s2{ "other" };
DebugLog(s2); // other
s2 = s;
DebugLog(s2); // Hello, World!
// Compare strings' characters with overloaded operators
DebugLog(s == s2); // true
// Empty the string
s.clear();
DebugLog(s); //
} // Destructor deallocates the string's memory
There are also some functions outside of the class that operate on
std::basic_string objects:
#include <string>
// Parse a float out of a string
// Throws an exception upon failure
std::string s{ "3.14" };
float f = std::stof(s);
DebugLog(f); // 3.14
// Convert a double to a string
std::string s2{ std::to_string(3.14) };
DebugLog(s2); // 3.140000
// Check if a string is empty
DebugLog(std::empty(s)); // false
// Get a non-const pointer to the first character
char* d = std::data(s);
DebugLog(d); // 3.14
#include <string>
using namespace std::literals::string_literals;
// Plain string literals create a std::string
std::string s{ "hello"s };
// char8_t string literals create a UTF-8 string
std::u8string s8{ u8"hello"s };
Locale and Codecvt
Next up is <locale> to help with localization. The std::locale class
identifies a locale like CultureInfo does in C#. Its member
functions and other functions in <locale> allow us to perform
operations within the context of that locale:
#include <string>
#include <locale>
// Construct a locale for a specific locale name
std::locale loc{ "en_US.UTF-8" };
// Lexicographically compare strings with the overloaded () operator
std::string a{ "apple" };
std::string b{ "banana" };
DebugLog(loc(a, b)); // true
// Check if a character is in a category for this locale
DebugLog(std::isspace(' ', loc)); // true
DebugLog(std::islower('a', loc)); // true
DebugLog(std::isdigit('1', loc)); // true
// Convert between uppercase and lowercase in this locale
DebugLog(std::toupper('a', loc)); // A
DebugLog(std::tolower('Z', loc)); // z
Later in the book we’ll look at I/O and see how we can use
std::locale to localize value categories like time and money.
#include <string>
#include <locale>
#include <codecvt>
void Foo()
{
// Emojis as UTF-8 and UTF-16
std::string u8 = "\xf0\x9f\x98\x8e\xf0\x9f\x91\x8d";
std::u16string u16 = u"\xd83d\xde0e\xd83d\xdc4d";
// Make a converter from UTF-8 to UTF-16
std::wstring_convert<std::codecvt_utf8_utf16<char16_t>,
char16_t> u8u16{};
// Use it to convert from UTF-8 to UTF-16
std::u16string toU16 = u8u16.from_bytes(u8);
DebugLog("Success?", u16 == toU16); // true
DebugLog("UTF-16 size", toU16.size()); // 4
for (uint32_t c : toU16)
{
DebugLog(c);
// Outputs:
// 55357
// 56846
// 55357
// 56397
}
// Make a converter from UTF-16 to UTF-8
std::wstring_convert<std::codecvt_utf8_utf16<char16_t>,
char16_t> u16u8 {};
// Use it to convert UTF-16 to UTF-8
std::string toU8 = u16u8.to_bytes(u16);
DebugLog("Success?", u8 == toU8); // true
DebugLog("UTF-8 size", toU8.size()); // 8
for (uint32_t c : toU8)
{
DebugLog(c);
// Outputs:
// 4294967280
// 4294967199
// 4294967192
// 4294967182
// 4294967280
// 4294967199
// 4294967185
// 4294967181
}
}
Format
C++20 adds the <format> header to make formatting data as strings
easier and safer than existing methods like sprintf in the C
Standard Library. The std::format function is rather similar to string
interpolation in C#: $"Score: {score}".
#include <string>
#include <format>
#include <locale>
// Format a string
int score = 123;
std::string str{ std::format("Score: {}", score) };
DebugLog(str); // Score: 123
// Format a string for a specific locale
std::locale loc{ "en_US.UTF-8" };
str = std::format(loc, "Score: {}", score);
DebugLog(str); // Score: 123
#include <format>
struct Vector2
{
float X;
float Y;
};
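// Specialize std::formatter so std::format knows how to print a Vector2
template<class TChar>
struct std::formatter<Vector2, TChar>
{
// No format specification is supported, so return where parsing began
// Must be constexpr so the format string can be checked at compile time
template<typename TContext>
constexpr auto parse(TContext& pc)
{
return pc.begin();
}
// Write the formatted value to the context's output iterator
template<typename TContext>
auto format(const Vector2& v, TContext& fc) const
{
return std::format_to(fc.out(), "({}, {})", v.X, v.Y);
}
};
Vector2 v{ 1, 2 };
std::string s{ std::format("Vector: {}", v) };
DebugLog(s); // Vector: (1, 2)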
String View
C++17 introduces std::basic_string_view as a class template that
provides a read-only “view” into another string. It’s an adapter for string
literals and other arrays of characters as well as string classes like
std::basic_string. Unlike std::basic_string, it doesn’t “own” the
memory that holds the characters. That means it doesn’t allocate it or
deallocate it but instead acts like a pointer to existing memory and a
size_t to keep track of the length. As with other pointers, it’s important
to not use the std::basic_string_view after the string it points to is
deallocated.
Alias                 Template                           Meaning
std::string_view      std::basic_string_view<char>       View of C string
std::wstring_view     std::basic_string_view<wchar_t>    View of wide character string
std::u8string_view    std::basic_string_view<char8_t>    View of UTF-8 string
std::u16string_view   std::basic_string_view<char16_t>   View of UTF-16 string
std::u32string_view   std::basic_string_view<char32_t>   View of UTF-32 string
#include <string>
#include <string_view>
const char cs[] = "C String";
std::string_view svcs{ cs };
// Check if a string view is empty
DebugLog(std::empty(svcs)); // false
// Get a pointer to the first character
const char* d = std::data(svcs);
DebugLog(d); // C String
#include <string>
#include <string_view>
using namespace std::literals;
const char cs[] = "C String";
std::string_view svcs{ cs };
// Plain string literals create a std::string_view
std::string_view s{ "hello"sv };
// char8_t string literals create a UTF-8 string view
std::u8string_view s8{ u8"hello"sv };
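Where string views tend to pay off is in functions that return part of their input without copying it. The following sketch shows the idea. FileName is a hypothetical helper that assumes '/' is the path separator:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: get the part of a path after the last '/'
// No memory is allocated: the result points into the caller's string
std::string_view FileName(std::string_view path)
{
    std::size_t slash = path.rfind('/');
    if (slash == std::string_view::npos)
    {
        // No separator, so the whole path is the file name
        return path;
    }
    return path.substr(slash + 1);
}
```

The parameter accepts string literals, C strings, and std::string objects alike. The returned view is only valid for as long as the characters passed in remain alive, so it should not outlive a temporary std::string argument.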
#include <string>
#include <regex>
// A regular expression for YYYY-MM-DD dates with ECMAScript grammar
// Each part of the date is captured in a group
std::regex re{
"(\\d{4})-(\\d{2})-(\\d{2})",
std::regex_constants::ECMAScript };
// Search a string for a match and get the results of the match
// std::regex_match would instead require the entire string to match
std::cmatch results{};
DebugLog(std::regex_search("before 2021-03-15 after", results, re)); // true
DebugLog(results.size()); // 4
DebugLog(results[0]); // 2021-03-15 (sub-string that matched)
DebugLog(results[1]); // 2021 (first group)
DebugLog(results[2]); // 03 (second group)
DebugLog(results[3]); // 15 (third group)
// Replace the part of a string that matches
std::basic_string s{
std::regex_replace(
std::string{ "before 2021-03-15 after" }, re,
"YYYY-MM-DD") };
DebugLog(s); // before YYYY-MM-DD after
The C# equivalents are classes like Regex and Match in the
System.Text.RegularExpressions namespace.
Conclusion
The C++ Standard Library layers quite a lot of functionality on top of
a very humble basis. Simple characters and arrays of characters are
extended all the way up to regular expressions, string classes, and
string views. In between we have functionality for quick and
convenient serialization, parsing, and localization.
As is usual for the Standard Library, all of this is done via the
specialization of templates. We choose the most optimal version at
compile time rather than relying on runtime strategies like virtual
functions. We can specialize any of these templates to support new
types of strings or to format our own app’s types and reap all the
same benefits that standardized types like std::basic_string do.
45. Array Containers Library
Vector
Let’s start with one of the most commonly-used container types:
std::vector. This class template, found in <vector>, is the equivalent of
List<T> in C# as it implements a dynamic array. Here’s a sampling of its API:
#include <vector>
void Foo()
{
// Create an empty vector of int
std::vector<int> v{};
// Add an element to the end
v.push_back(123);
// Construct an element in place at the end
v.emplace_back(456);
// Get size information
DebugLog(v.empty()); // false
DebugLog(v.size()); // 2
DebugLog(v.capacity()); // At least 2
DebugLog(v.max_size()); // Maybe 4611686018427387903
// Request changes to capacity
v.reserve(100); // Note: can't shrink
DebugLog(v.capacity()); // At least 100
v.shrink_to_fit();
DebugLog(v.capacity()); // Maybe 2
// Shrink to just the first element
v.resize(1);
// Add two defaulted elements to the end
v.resize(3);
// Access elements with overloaded index operator
v[2] = 789;
DebugLog(v[0], v[1], v[2]); // 123, 0, 789
// Access first and last elements
DebugLog(v.front()); // 123
v.back() = 1000;
DebugLog(v[2]); // 1000
// Get a pointer to the first element
int* p = v.data();
DebugLog(p[0], p[1], p[2]); // 123, 0, 1000
// Create a vector with four elements
std::vector<int> v2{ 2, 4, 6, 8 };
// Compare vectors' elements
DebugLog(v == v2); // false
// Replace the elements of v with the elements of v2
v = v2;
DebugLog(v.size()); // 4
DebugLog(v[0], v[1], v[2], v[3]); // 2, 4, 6, 8
} // Destructors free memory of v and v2
#include <vector>
std::vector<int> v1{ 100, 200, 200, 200, 300 };
// Erase every element that equals 200
std::vector<int>::size_type numErased = std::erase(v1,
200);
DebugLog(numErased); // 3
DebugLog(v1.size()); // 2
DebugLog(v1[0], v1[1]); // 100, 300
std::vector<int> v2{ 1, 2, 3, 4, 5 };
// Erase all the even numbers
numErased = std::erase_if(v2, [](int x) { return (x % 2)
== 0; });
DebugLog(numErased); // 2
DebugLog(v2.size()); // 3
DebugLog(v2[0], v2[1], v2[2]); // 1, 3, 5
#include <array>
// Create an array of 3 int elements
std::array<int, 3> a{ 1, 2, 3 };
// Query its size
DebugLog(a.size()); // 3
DebugLog(a.max_size()); // 3
DebugLog(a.empty()); // false
// Read and write its elements
DebugLog(a[0], a[1], a[2]); // 1, 2, 3
a[0] = 10;
DebugLog(a.front()); // 10
DebugLog(a.back()); // 3
// Get a pointer to the first element
int* p = a.data();
DebugLog(p[0], p[1], p[2]); // 10, 2, 3
// Create another array of 3 int elements
std::array<int, 3> a2{ 10, 2, 3 };
// Compare elements of arrays
DebugLog(a == a2); // true
Note that std::array doesn’t require that it’s allocated on the stack.
We can easily allocate one on the heap like any other class:
#include <array>
std::array<int, 3>* a = new std::array<int, 3>{ 1, 2, 3
};
DebugLog((*a)[0], (*a)[1], (*a)[2]); // 1, 2, 3
delete a;
#include <valarray>
void Foo()
{
// Create an array of two ints
std::valarray<int> va1{ 10, 20 };
// Create another array of two ints
std::valarray<int> va2{ 10, 30 };
// Compare element 0 in each, element 1 in each, and so on
// Return a valarray of comparison results
std::valarray<bool> eq{ va1 == va2 };
DebugLog(eq[0], eq[1]); // true, false
// Add elements
std::valarray<int> sums{ va1 + va2 };
DebugLog(sums[0], sums[1]); // 20, 50
// Access elements
DebugLog(va1[0]); // 10
va1[1] = 200;
DebugLog(va1[0], va1[1]); // 10, 200
// Shift elements 1 toward the front, filling in with zeroes
std::valarray<int> shifted{ va1.shift(1) };
DebugLog(shifted[0], shifted[1]); // 200, 0
// Shift elements 1 toward the front, rotating around to the back
std::valarray<int> cshifted{ va1.cshift(1) };
DebugLog(cshifted[0], cshifted[1]); // 200, 10
// Copy all elements to another valarray
va1 = va2;
DebugLog(va1[0], va1[1]); // 10, 30
// Call a function with each element and store the return values
// in a new valarray
std::valarray<int> plusOne{ va1.apply([](int x) {
return x + 1; }) };
DebugLog(plusOne[0], plusOne[1]); // 11, 31
// Take 2^4 and 3^2
std::valarray<float> bases{ 2, 3 };
std::valarray<float> powers{ 4, 2 };
std::valarray<float> squares{ std::pow(bases, powers)
};
DebugLog(squares[0], squares[1]); // 16, 9
} // Destructors free memory of all valarrays
Some helper classes exist to “slice” more than one element out of a
std::valarray by passing instances of the class to the overloaded
[] operator:
#include <valarray>
std::valarray<int> va1{ 10, 20, 30, 40, 50, 60, 70 };
// A slice that starts at index 1 and takes 2 elements with a stride of 1
std::slice s{ 1, 2, 1 };
// Slice the valarray to get a slice_array that refers to the slice
std::slice_array<int> sa{ va1[s] };
// Copy the slice into a new valarray
std::valarray<int> sliced{ sa };
DebugLog(sliced.size()); // 2
DebugLog(sliced[0], sliced[1]); // 20, 30
// Slice that starts at index 1 with sizes 2 and 3 and strides 1 and 2
std::gslice g{ 1, {2, 3}, {1, 2} };
// Slice the valarray to get a gslice_array that refers to the slice
std::gslice_array ga{ va1[g] };
// Copy the slice into a new valarray
std::valarray<int> gsliced{ ga };
DebugLog(gsliced.size()); // 6
DebugLog(gsliced[0], gsliced[1], gsliced[2]); // 20, 40, 60
DebugLog(gsliced[3], gsliced[4], gsliced[5]); // 30, 50, 70
Deque
std::deque, pronounced like “deck” and located in <deque>, is a
double-ended queue that owns its elements. Internally, it holds a list
of arrays but this is hidden by its API which gives the appearance
that it’s one contiguous array similar to a std::vector. This means
element access involves a second indirection, but it’s fast to add and
remove elements from the beginning and end of a std::deque. C#
has no equivalent to this container type. Here’s how to use it:
#include <deque>
void Foo()
{
// Create a deque of three floats
std::deque<float> d{ 10, 20, 30 };
// Query its size
DebugLog(d.size()); // 3
DebugLog(d.max_size()); // Maybe 4611686018427387903
DebugLog(d.empty()); // false
// Access its elements
DebugLog(d.front()); // 10
d[1] = 200;
DebugLog(d[1]); // 200
DebugLog(d.back()); // 30
// Add to and remove from the front and the back
d.push_front(5);
d.push_back(35);
DebugLog(d[0], d[1], d[2], d[3], d[4]); // 5, 10, 200, 30, 35
d.pop_front();
d.pop_back();
DebugLog(d[0], d[1], d[2]); // 10, 200, 30
// Remove all but the first two elements
d.resize(2);
DebugLog(d.size()); // 2
DebugLog(d[0], d[1]); // 10, 200
// Compare elements of two deques
std::deque<float> d2{ 10, 200 };
DebugLog(d == d2); // true
// Remove all of a particular element value
std::deque<float>::size_type numErased =
std::erase(d, 10);
DebugLog(numErased); // 1
DebugLog(d[0]); // 200
// Remove all elements that a function returns true for
numErased = std::erase_if(d2, [](float x) { return x
< 100; });
DebugLog(numErased); // 1
DebugLog(d2[0]); // 200
} // Destructors free memory of all deques
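One practical consequence of the "list of arrays" layout described above can be sketched as follows. The function name is hypothetical. Unlike a std::vector, which may move every element when it reallocates, adding to either end of a std::deque leaves references and pointers to existing elements valid, although iterators are invalidated.

```cpp
#include <cassert>
#include <deque>

// Hypothetical check: a pointer to an element stays valid across push_front
bool ReferencesSurvivePushFront()
{
    std::deque<int> d{ 10, 20, 30 };
    // Remember where the element 20 lives
    const int* p = &d[1];
    // Insert many elements at the front
    for (int i = 0; i < 1000; ++i)
    {
        d.push_front(i);
    }
    assert(d.size() == 1003);
    // The element didn't move: it's at index 1001 now but at the same address
    return p == &d[1001] && *p == 20;
}
```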
Queue
Unlike std::deque, the std::queue class template in <queue> is an
adapter that provides a queue API on top of another collection. The
default container type is std::deque, but any container with back, front,
push_back, and pop_front member functions may be used. The
std::queue contains an instance of this collection type and implements its
member functions with calls to the contained collection.
#include <queue>
#include <deque>
void Foo()
{
// Explicitly use std::deque<int> to hold elements of a std::queue of int
std::queue<int, std::deque<int>> qd{};
// Use the default collection type, which is std::deque
// It's initially empty
std::queue<int> q{};
// Add elements to the back
q.push(10);
q.push(20);
q.emplace(30); // In-place construction
// Query the size
DebugLog(q.size()); // 3
DebugLog(q.empty()); // false
// Access only the first and last elements
DebugLog(q.front()); // 10
DebugLog(q.back()); // 30
// Remove elements from the front
q.pop();
DebugLog(q.size()); // 2
DebugLog(q.front()); // 20
// Copy elements to another queue
std::queue<int> q2{};
q2 = q;
DebugLog(q2.size()); // 2
DebugLog(q2.front()); // 20
DebugLog(q2.back()); // 30
} // Destructors free memory of all queues
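As a sketch of swapping the underlying collection, std::list also provides back, front, push_back, and pop_front, so it can back a std::queue. The function name here is hypothetical.

```cpp
#include <list>
#include <queue>

// Hypothetical example: a queue whose elements are stored in a std::list
int FrontAfterOnePop()
{
    std::queue<int, std::list<int>> q{};
    q.push(10); // calls push_back on the list
    q.push(20);
    q.push(30);
    q.pop(); // calls pop_front on the list, removing 10
    return q.front(); // 20
}
```

The queue's API is identical regardless of which collection holds the elements, so the choice only affects performance characteristics.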
#include <stack>
#include <vector>
void Foo()
{
// Make a stack backed by a std::vector
std::stack<int, std::vector<int>> sv{};
// Make a stack backed by the default std::deque
std::stack<int> s{};
// Add elements to the back
s.push(10);
s.push(20);
s.emplace(30); // In-place construction
// Query the size
DebugLog(s.size()); // 3
DebugLog(s.empty()); // false
// Access only the last element
DebugLog(s.top()); // 30
// Remove elements from the back
s.pop();
DebugLog(s.size()); // 2
DebugLog(s.top()); // 20
// Copy elements to another stack
std::stack<int> s2{};
s2 = s;
DebugLog(s2.size()); // 2
DebugLog(s2.top()); // 20
} // Destructors free memory of all stacks
Finally, there are queue and stack as adapter types for any collection
providing the necessary member functions. The C# Queue and Stack
classes instead mandate particular collection implementations.
46. Other Containers Library
Unordered Map
The <unordered_map> header provides C++’s Dictionary equivalent:
std::unordered_map. As with other containers like std::vector, it
“owns” the memory that keys and values are stored in. Here’s a
sampling of the API:
#include <unordered_map>
void Foo()
{
// Hash map of int keys to float values
std::unordered_map<int, float> ifum{};
// Add a key-value pair
ifum.insert({ 123, 3.14f });
// Read the value that 123 maps to
DebugLog(ifum[123]); // 3.14
// Try to read the value that 456 maps to
// There's no such key, so insert a default-initialized value
DebugLog(ifum[456]); // 0
// Query size
DebugLog(ifum.empty()); // false
DebugLog(ifum.size()); // 2
DebugLog(ifum.max_size()); // Maybe 768614336404564650
// Try to read and throw an exception if the key isn't found
DebugLog(ifum.at(123)); // 3.14
DebugLog(ifum.at(1000)); // throws std::out_of_range exception
// insert() does not overwrite
ifum.insert({ 123, 2.2f }); // does not overwrite 3.14
DebugLog(ifum[123]); // 3.14
// insert_or_assign() does overwrite
ifum.insert_or_assign(123, 2.2f); // overwrites 3.14
DebugLog(ifum[123]); // 2.2
// emplace() constructs in-place
ifum.emplace(456, 1.123f);
// Remove an element
ifum.erase(456);
DebugLog(ifum.size()); // 1
} // ifum's destructor deallocates the memory storing keys and values
A std::unordered_multimap is also available for when there are
potentially multiple of the same key. C# has no equivalent of this
class template, but it can be approximated with a Dictionary<TKey,
List<TValue>>. Here’s how to use it:
#include <unordered_map>
void Foo()
{
// Create an empty multimap that maps int to float
std::unordered_multimap<int, float> ifumm{};
// Insert two of the same key with different values
ifumm.insert({ 123, 3.14f });
ifumm.insert({ 123, 2.2f });
// Check how many values are mapped to the 123 key
DebugLog(ifumm.count(123)); // 2
// C++20: check if there are any values mapped to the 123 key
DebugLog(ifumm.contains(123)); // true
// Find one of the key-value pairs for the 123 key
const auto& found = ifumm.find(123);
DebugLog(found->first, found->second); // Maybe 123, 3.14
// Loop over all the key-value pairs for the 123 key
auto range = ifumm.equal_range(123);
for (auto i = range.first; i != range.second; ++i)
{
DebugLog(i->first, i->second); // 123, 3.14 and 123, 2.2
}
// Remove all the key-value pairs with a given key
ifumm.erase(123);
DebugLog(ifumm.size()); // 0
} // ifumm's destructor deallocates key and value memory
#include <unordered_map>
// Create maps with 2 key-value pairs each
std::unordered_multimap<int, float> umm{ {123, 3.14},
{456, 2.2f} };
std::unordered_map<int, float> um{ {123, 3.14}, {456,
2.2f} };
// Erase all the key-value pairs where the key is less than 200
auto lessThan200 = [](const auto& pair) {
const auto& [key, value] = pair;
return key < 200;
};
std::erase_if(um, lessThan200);
std::erase_if(umm, lessThan200);
// 123 key has 0 values associated with it
DebugLog(um.count(123)); // 0
DebugLog(umm.count(123)); // 0
Map
The <map> header provides ordered versions of std::unordered_map
and std::unordered_multimap. They’re called, naturally, std::map
and std::multimap. The basics of their APIs are very similar to the
unordered counterparts, which helps with generic programming, but
there are also some differences.
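One such difference can be sketched like this: because the keys are kept in order, ordered maps support queries such as "the first key that is at least some value" via lower_bound, which the unordered versions don't offer. The function name FirstKeyAtLeast and its -1 "not found" convention are assumptions made just for this example.

```cpp
#include <map>
#include <string>

// Hypothetical helper: the smallest key >= minKey, or -1 if there isn't one
int FirstKeyAtLeast(const std::map<int, std::string>& m, int minKey)
{
    // lower_bound uses the ordering of keys to find the element quickly
    const auto it = m.lower_bound(minKey);
    return it == m.end() ? -1 : it->first;
}
```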
#include <map>
void Foo()
{
// Create a map with three keys
std::map<int, float> m{ {456, 2.2f}, {123, 3.14},
{789, 42.42f} };
// Many functions from other containers are available
DebugLog(m.size()); // 3
DebugLog(m.empty()); // false
m.insert({ 1000, 2000.0f });
m.erase(123);
m.emplace(100, 9.99f);
// Ordering by key is guaranteed
for (const auto& item : m)
{
DebugLog(item.first, item.second);
// Prints:
// 100, 9.99
// 456, 2.2
// 789, 42.42
// 1000, 2000
}
} // m's destructor deallocates key and value memory
#include <map>
void Foo()
{
// Create a multimap with three keys: two are duplicated
std::multimap<int, float> mm{ {456, 42.42f}, {123,
3.14f}, {123, 2.2f} };
// Many functions from other containers are available
DebugLog(mm.size()); // 3
DebugLog(mm.empty()); // false
mm.insert({ 1000, 2000.0f });
mm.erase(456);
mm.emplace(100, 9.99f);
// Ordering by key is guaranteed
for (const auto& item : mm)
{
DebugLog(item.first, item.second);
// Prints:
// 100, 9.99
// 123, 3.14
// 123, 2.2
// 1000, 2000
}
} // mm's destructor deallocates key and value memory
Unordered Set
The <unordered_set> header provides the equivalent of HashSet in
C#: std::unordered_set. It’s like a std::unordered_map except that
there are only keys so the API is simpler:
#include <unordered_set>
void Foo()
{
// Create a set with four values
std::unordered_set<int> us{ 123, 456, 789, 1000 };
// Many functions from other containers are available
DebugLog(us.size()); // 4
DebugLog(us.empty()); // false
us.insert(2000);
us.erase(456);
us.emplace(100);
DebugLog(us.count(123)); // 1
} // us's destructor deallocates value memory
The APIs of both std::set and std::multiset are also very similar
to their unordered counterparts. Here’s std::set:
#include <set>
void Foo()
{
// Create a set with four values
std::set<int> s{ 123, 456, 789, 1000 };
// Many functions from other containers are available
DebugLog(s.size()); // 4
DebugLog(s.empty()); // false
s.insert(2000);
s.erase(456);
s.emplace(100);
DebugLog(s.count(123)); // 1
// Ordering by key is guaranteed
for (int x : s)
{
DebugLog(x);
// Prints:
// 100
// 123
// 789
// 1000
// 2000
}
} // s's destructor deallocates value memory
#include <set>
void Foo()
{
// Create a multiset with six values: two are duplicated
std::multiset<int> ms{ 123, 456, 123, 789, 1000, 1000
};
// Many functions from other containers are available
DebugLog(ms.size()); // 6
DebugLog(ms.empty()); // false
ms.insert(2000);
ms.erase(123); // erases both
ms.emplace(100);
DebugLog(ms.count(1000)); // 2
// Ordering is guaranteed
for (int x : ms)
{
DebugLog(x);
// Prints:
// 100
// 456
// 789
// 1000
// 1000
// 2000
}
} // ms's destructor deallocates value memory
List
Next up is the <list> header and its std::list class template that
implements a doubly-linked list. This is equivalent to the LinkedList
class in C#. Its API is similar to std::vector except that operations
on the front of the list are also supported and indexing isn’t allowed
since that would require an expensive walk of the list:
#include <list>
void Foo()
{
// Create an empty list
std::list<int> li{};
// Add some values
li.push_back(456);
li.push_front(123);
// Grow by inserting default-initialized values
li.resize(5);
// Query size
DebugLog(li.empty()); // false
DebugLog(li.size()); // 5
// Indexing isn't supported. Loop instead.
for (int x : li)
{
DebugLog(x);
// Prints:
// 123
// 456
// 0
// 0
// 0
}
// Remove values
li.pop_back();
li.pop_front();
// Special operations
li.sort();
li.remove(0); // remove all zeroes
li.remove_if([](int x) { return x < 200; }); // remove all under 200
} // li's destructor deallocates value memory
Forward List
A singly-linked list, std::forward_list, is also provided via
<forward_list>. C# doesn’t provide an equivalent container. The
API is like the reverse of std::vector since only operations on the
front of the list are supported. Unlike most other containers, a size
function isn’t provided since it would require walking the entire list to
count nodes:
#include <forward_list>
void Foo()
{
// Create an empty list
std::forward_list<int> li{};
// Add some values
li.push_front(123);
li.push_front(456);
// Grow by inserting default-initialized values
li.resize(5);
// Query size
DebugLog(li.empty()); // false
// Indexing isn't supported. Loop instead.
for (int x : li)
{
DebugLog(x);
// Prints:
// 456
// 123
// 0
// 0
// 0
}
// Remove values
li.pop_front();
// Special operations
li.sort();
li.remove(0); // remove all zeroes
li.remove_if([](int x) { return x < 200; }); // remove all under 200
} // li's destructor deallocates value memory
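When a node count really is needed, it can still be computed by walking the list. Here is a minimal sketch using std::distance; the function name CountNodes is hypothetical.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical helper: counts nodes by walking the entire list
// This is O(n), which is why no size() member function is provided
std::ptrdiff_t CountNodes(const std::forward_list<int>& li)
{
    return std::distance(li.begin(), li.end());
}
```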
Conclusion
The C++ Standard Library provides a robust and consistent offering
of non-array collection types to go along with array collection types
like std::vector. The APIs are all very similar, which is great for
generic programming as the collection type can easily be made into
a type parameter.
C++ containers provide the same support for this abstract form of
traversing a collection, but they call the object keeping track of the
traversal an “iterator” instead of an “enumerator.” Regardless of the
container type, it’ll have two member type aliases: iterator and
const_iterator. Here’s how we use them the way we’d manually
traverse a collection with IEnumerator<T> in C#:
#include <vector>
// Create a vector with three elements
std::vector<int> v{ 1, 2, 3 };
// Call begin() to get an iterator that's at the first element: 1
// Call end() to get an iterator that's one past the last element: 3
// Use the overloaded pre-increment operator to advance to the next element
for (std::vector<int>::iterator it{ v.begin() }; it !=
v.end(); ++it)
{
// Use the overloaded dereference operator to get the element
DebugLog(*it); // 1 then 2 then 3
}
It’s worth noting that while both the pre-increment and post-
increment operators are overloaded for iterator types, the pre-
increment operator is always at least as fast as the post-increment
operator. They may be equally fast, but it’s a good habit to use the
pre-increment operator when manually using iterators.
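To see why pre-increment can't be slower, consider how the two operators are typically written. This is a sketch of a hypothetical IntIterator over an int array, not the implementation of any Standard Library iterator. The post-increment operator has to make a copy of the iterator in order to return the old position.

```cpp
// Hypothetical iterator over an array of int
struct IntIterator
{
    const int* Ptr;

    // Pre-increment: advance, then return this same iterator
    IntIterator& operator++()
    {
        ++Ptr;
        return *this;
    }

    // Post-increment: copy, advance, then return the copy
    IntIterator operator++(int)
    {
        IntIterator old = *this; // extra copy that pre-increment avoids
        ++(*this);
        return old;
    }

    // Dereference to get the current element
    int operator*() const
    {
        return *Ptr;
    }
};
```

For an iterator this small the compiler will usually optimize the copy away when the result is unused, which is why the two are often equally fast in practice.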
There are a few variations of the above canonical loop that are
commonly seen:
#include <vector>
std::vector<int> v{ 1, 2, 3 };
// Use the free functions begin() and end() instead of the member functions
for (std::vector<int>::iterator it{ std::begin(v) }; it
!= std::end(v); ++it)
{
DebugLog(*it); // 1 then 2 then 3
}
// Use cbegin() and cend() to get constant (i.e. read-only) iterators
for (
std::vector<int>::const_iterator it{ v.cbegin() };
it != v.cend();
++it)
{
DebugLog(*it); // 1 then 2 then 3
}
// Use auto to avoid the long type name
for (auto it{ v.cbegin() }; it != v.cend(); ++it)
{
DebugLog(*it); // 1 then 2 then 3
}
The design of iterators and functions like begin and end make all of
the container types compatible with range-based for loops. Since
they were introduced in C++11 it’s now far less common to see these
verbose loops that manually control iterators in the simple, and most
common, “start to end” fashion. Instead, we just use a for loop and
let the compiler generate the same code as the manual version:
#include <vector>
std::vector<int> v{ 1, 2, 3 };
// Non-const copies of every element
for (int x : v)
{
DebugLog(x); // 1 then 2 then 3
}
// Non-const references to every element
for (int& x : v)
{
DebugLog(x); // 1 then 2 then 3
}
// const copies of every element
for (const int x : v)
{
DebugLog(x); // 1 then 2 then 3
}
// const references to every element
for (const int& x : v)
{
DebugLog(x); // 1 then 2 then 3
}
Note that auto can be used instead of int in all of the above
examples.
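As a rough sketch of what "the compiler generates the same code" means, the two functions below are approximately equivalent. The function names are hypothetical, and the expansion is simplified from the rules the language actually specifies.

```cpp
#include <vector>

// Sum using a range-based for loop
int SumWithRangeFor(const std::vector<int>& v)
{
    int sum = 0;
    for (int x : v)
    {
        sum += x;
    }
    return sum;
}

// Approximately what the compiler expands the loop above into
int SumExpanded(const std::vector<int>& v)
{
    int sum = 0;
    auto&& range = v;
    auto it = range.begin();
    auto end = range.end();
    for (; it != end; ++it)
    {
        int x = *it; // the loop variable is initialized by dereferencing
        sum += x;
    }
    return sum;
}
```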
#include <cstdlib>
#include <list>
// Custom allocator using malloc() and free() from the C Standard Library
template <class T>
struct MallocFreeAllocator
{
// This allocator allocates objects of type T
using value_type = T;
// Default constructor
MallocFreeAllocator() noexcept = default;
// Converting copy constructor
template<class U>
MallocFreeAllocator(const MallocFreeAllocator<U>&)
noexcept
{
}
// Allocate enough memory for n objects of type T
T* allocate(const size_t n) const
{
DebugLog("allocate");
return reinterpret_cast<T*>(malloc(n *
sizeof(T)));
}
// Deallocate previously-allocated memory
void deallocate(T* const p, size_t) const noexcept
{
DebugLog("deallocate");
free(p);
}
};
void Foo()
{
// Use the custom allocator to allocate the list's memory
std::list<int, MallocFreeAllocator<int>> li{ 1, 2, 3 };
for (int x : li)
{
DebugLog(x);
}
}
What exactly this prints, besides at least one “allocate”, then “1”, “2”,
and “3”, then at least one “deallocate” depends on the
implementation of the std::list type, but it’s likely to look
something like this:
allocate
allocate
allocate
allocate
allocate
1
2
3
deallocate
deallocate
deallocate
deallocate
deallocate
// Create an allocator
MallocFreeAllocator<int> alloc{};
// Pass it to the constructor to be used
std::list<int, MallocFreeAllocator<int>> li{ {1, 2, 3},
alloc };
#include <cstdint>
#include <list>
#include <memory_resource>
#include <typeinfo>
// A std::pmr::memory_resource that uses the new and delete operators
struct NewDeleteMemoryResource : public
std::pmr::memory_resource
{
// Allocate bytes, not a particular type
virtual void* do_allocate(
std::size_t numBytes, std::size_t alignment)
override
{
return new uint8_t[numBytes];
}
// Deallocate bytes
virtual void do_deallocate(
void* p, std::size_t numBytes, std::size_t
alignment) override
{
delete[](uint8_t*)p;
}
// Check if this resource's allocation and deallocation are compatible
// with that of another resource
virtual bool do_is_equal(
const std::pmr::memory_resource& other) const
noexcept override
{
// Compatible if the same type
return typeid(other) ==
typeid(NewDeleteMemoryResource);
}
};
void Foo()
{
// Make the memory resource
NewDeleteMemoryResource mr{};
// Make the polymorphic allocator backed by the memory resource
std::pmr::polymorphic_allocator<int> polyalloc{ &mr };
// Pass the polymorphic allocator to the constructor to be used
std::pmr::list<int> li1{ {1, 2, 3}, polyalloc };
for (int x : li1)
{
DebugLog(x); // 1 then 2 then 3
}
// Class template instantiations are compatible
// This has a different polymorphic allocator: the default
std::pmr::list<int> li2 = li1;
for (int x : li2)
{
DebugLog(x); // 1 then 2 then 3
}
}
Overloads like the one we created above for SimpleArray exist for all
the C++ Standard Library containers. It’s even typical to create
overloads like the above for any custom container types we create.
Usage with types like std::unordered_set looks just the same:
#include <unordered_set>
// Create collections
std::unordered_set<int> a{ 1, 2, 3 };
std::unordered_set<int> b{ 10, 20, 30, 40, 50 };
// Print initial contents
// Note: the iteration order of an unordered set is unspecified
for (int x : a)
{
DebugLog(x); // 1, 2, 3 in some order
}
for (int x : b)
{
DebugLog(x); // 10, 20, 30, 40, 50 in some order
}
// Swap contents
std::swap(a, b);
// Print swapped contents
for (int x : a)
{
DebugLog(x); // 10, 20, 30, 40, 50 in some order
}
for (int x : b)
{
DebugLog(x); // 1, 2, 3 in some order
}
Exceptions
C++ supports a wide variety of error-handling techniques ranging
from simple return codes to exceptions, std::optional, and even the C
Standard Library’s errno. Most of the C++ Standard Library uses
exceptions though. This includes all the container types, such as
detecting out-of-bounds conditions like SimpleArray did above:
#include <stdexcept>
#include <vector>
std::vector<int> v{ 1, 2, 3 };
try
{
// at() checks bounds and throws; the [] operator doesn't check
int x = v.at(1000);
DebugLog(x); // Does not get printed
}
catch (const std::out_of_range& ex)
{
DebugLog(ex.what()); // Maybe "vector subscript out of range"
}
#include <vector>
// A class that logs its lifecycle
struct Noisy
{
int Val;
Noisy(int val)
{
Val = val;
DebugLog(Val, "val ctor");
}
Noisy(const Noisy& other)
{
Val = other.Val;
DebugLog(Val, "copy ctor");
}
~Noisy()
{
DebugLog(Val, "dtor");
}
};
std::vector<Noisy> v{};
v.reserve(2);
v.push_back(123);
// Prints:
// 123, val ctor
// 123, copy ctor
// 123, dtor
v.emplace_back(456);
// Prints:
// 456, val ctor
v.push_back(Noisy{123});
Second, the implementation of push_back copies the parameter to its
array of Noisy objects. It’s as though push_back included a line that
used “placement new” to call the copy constructor with this being the
appropriate location in its array. This is why we see the “123, copy
ctor” log message. Here’s a pseudo-code version of how that might
look:
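The following is only a sketch under stated assumptions: TinyVector is a hypothetical fixed-capacity container and Counted is a hypothetical element type that counts constructor calls. Neither is part of the Standard Library and this isn't how std::vector is actually written, but the push_back here copies its parameter into place with "placement new" as described.

```cpp
#include <new>

// Hypothetical element type that counts constructor calls
struct Counted
{
    static inline int ValueCtors = 0;
    static inline int CopyCtors = 0;
    int Val;
    Counted(int val) : Val(val) { ++ValueCtors; }
    Counted(const Counted& other) : Val(other.Val) { ++CopyCtors; }
};

// Hypothetical fixed-capacity container with no bounds checking
template <typename T, int Capacity>
struct TinyVector
{
    alignas(T) unsigned char Storage[sizeof(T) * Capacity];
    int Size = 0;

    // Address of the raw memory for element i
    void* RawSlot(int i) { return Storage + sizeof(T) * i; }

    // Access an element that has already been constructed
    T& operator[](int i)
    {
        return *std::launder(static_cast<T*>(RawSlot(i)));
    }

    void push_back(const T& value)
    {
        // Call the copy constructor with 'this' being the next free slot
        new (RawSlot(Size)) T(value);
        ++Size;
    }

    ~TinyVector()
    {
        // Manually destroy the elements that were constructed
        for (int i = 0; i < Size; ++i)
        {
            (*this)[i].~T();
        }
    }
};
```

Calling push_back(123) on a TinyVector of Counted first converts 123 to a temporary Counted, then copies that temporary into the array: one value constructor call and one copy constructor call.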
On the other hand, emplace_back does not take the element to add
to the std::vector. Instead, it takes the arguments to pass to the
element’s constructor when it’s called via “placement new”. It’s as
though emplace_back is implemented like this:
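Again this is only a sketch: TinyVector and Counted are hypothetical types made up for illustration, not the real std::vector. The emplace_back here forwards its arguments straight to the element's constructor, so no temporary is created and nothing is copied.

```cpp
#include <new>
#include <utility>

// Hypothetical element type that counts constructor calls
struct Counted
{
    static inline int ValueCtors = 0;
    static inline int CopyCtors = 0;
    int Val;
    Counted(int val) : Val(val) { ++ValueCtors; }
    Counted(const Counted& other) : Val(other.Val) { ++CopyCtors; }
};

// Hypothetical fixed-capacity container with no bounds checking
template <typename T, int Capacity>
struct TinyVector
{
    alignas(T) unsigned char Storage[sizeof(T) * Capacity];
    int Size = 0;

    // Address of the raw memory for element i
    void* RawSlot(int i) { return Storage + sizeof(T) * i; }

    // Access an element that has already been constructed
    T& operator[](int i)
    {
        return *std::launder(static_cast<T*>(RawSlot(i)));
    }

    template <typename... TArgs>
    T& emplace_back(TArgs&&... args)
    {
        // Construct the element in place from the forwarded arguments
        T* p = new (RawSlot(Size)) T(std::forward<TArgs>(args)...);
        ++Size;
        return *p;
    }

    ~TinyVector()
    {
        // Manually destroy the elements that were constructed
        for (int i = 0; i < Size; ++i)
        {
            (*this)[i].~T();
        }
    }
};
```

Calling emplace_back(456) on a TinyVector of Counted results in exactly one value constructor call and no copy constructor calls.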
#include <vector>
#include <span>
// Create a container
std::vector<int> v{ 1, 2, 3 };
// Create a span to view the container
std::span<int> s{ v };
// Use the span to get the contents of the container
for (int x : s)
{
DebugLog(x); // 1 then 2 then 3
}
#include <vector>
#include <span>
void Print(std::span<int> s)
{
for (int x : s)
{
DebugLog(x);
}
}
std::vector<int> v{ 1, 2, 3 };
Print(v); // 1 then 2 then 3
int a[] = { 1, 2, 3 };
Print(a); // 1 then 2 then 3
#include <vector>
#include <span>
std::vector<int> v{ 1, 2, 3, 4, 5 };
// A span covering the whole container
std::span<int> s{ v };
DebugLog(s.size()); // 5
for (int x : s)
{
DebugLog(x); // 1, 2, 3, 4, 5
}
// A sub-span of the middle 3 elements
std::span<int> ss{ s.subspan(1, 3) };
DebugLog(ss.size()); // 3
for (int x : ss)
{
DebugLog(x); // 2, 3, 4
}
Conclusion
The containers library is designed in such a way that containers are
very compatible with each other and with the C++ language itself.
They all have iterators that can be used directly or via range-based
for loops just like C# container types can be used with foreach
loops. They consistently handle errors with exceptions, just like C#
containers do, but often allow disabling those exceptions. The
std::span type provides a view of any contiguous container just like
Span<T> does. All in all, there’s a lot of similarity between C++ and C#.
Container C++ C#
#include <iterator>
#include <vector>
std::vector<int> v{ 1, 2, 3 };
// Adapt iterators to go backward instead of forward when using ++
std::reverse_iterator<std::vector<int>::iterator> it{
v.end() };
std::reverse_iterator<std::vector<int>::iterator> end{
v.begin() };
while (it != end)
{
DebugLog(*it); // 3 then 2 then 1
++it;
}
// Less verbose version using class template argument deduction
for (std::reverse_iterator it{ v.end() };
it != std::reverse_iterator{ v.begin() };
++it)
{
DebugLog(*it); // 3 then 2 then 1
}
// Even less verbose version using rbegin() and rend()
for (auto it{ v.rbegin() }; it != v.rend(); ++it)
{
DebugLog(*it); // 3 then 2 then 1
}
There’s also a std::back_insert_iterator that overloads the
assignment (=) operator to call push_back on a collection:
#include <iterator>
#include <vector>
// Empty collection
std::vector<int> v{};
// Create an iterator to insert into the std::vector
std::back_insert_iterator<std::vector<int>> it{ v };
// Insert three elements
it = 1;
it = 2;
it = 3;
for (int x : v)
{
DebugLog(x); // 1 then 2 then 3
}
#include <iterator>
#include <vector>
std::vector<int> v{ 10, 20, 30, 40, 50 };
// Get the distance (how many iterations) between two iterators
DebugLog(std::distance(v.begin(), v.end())); // 5
// Advance an iterator by a certain number of iterations
std::vector<int>::iterator it{ v.begin() };
std::advance(it, 2);
DebugLog(*it); // 30
// Get the next and previous iterators without modifying the original
DebugLog(*std::next(it)); // 40
DebugLog(*std::prev(it)); // 20
#include <iterator>
#include <vector>
std::vector<int> v{ 10, 20, 30, 40, 50 };
// Check if a container is empty
DebugLog(std::empty(v)); // false
// Get a pointer to a container's data
int* d{ std::data(v) };
DebugLog(*d); // 10
A bunch of overloaded operators are provided outside of any
particular iterator class to perform binary operations on iterators:
#include <iterator>
#include <vector>
std::vector<int> v{ 10, 20, 30, 40, 50 };
std::vector<int>::iterator itA{ v.begin() };
std::vector<int>::iterator itB{ v.end() };
// Subtraction is a synonym for std::distance
DebugLog(itB - itA); // 5
// Equality and relational operators compare iteration position
DebugLog(itA == itB); // false
DebugLog(itA < itB); // true
#include <algorithm>
#include <vector>
std::vector<int> v{ 10, 20, 30, 40, 50 };
bool allEven = std::all_of(
v.begin(), // Iterator to start at
v.end(), // Iterator to stop at
[](int x) { return (x % 2) == 0; }); // Predicate to call with elements
DebugLog(allEven); // true
Right away we see a difference from LINQ: two iterators are passed
in instead of an IEnumerable with its single GetEnumerator method.
This makes it easy to operate on a subset of the container such as
the middle three elements of an array:
#include <algorithm>
#include <vector>
int v[]{ 10, 20, 30, 40, 50 };
bool allSmall = std::all_of(
std::begin(v) + 1,
std::begin(v) + 4,
[](int x) { return x >= 20 && x <= 40; });
DebugLog(allSmall); // true
The same can be done in C#, but only by allocating new managed
class objects that implement the IEnumerable interface:
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
int[] a = { 10, 20, 30, 40, 50 };
// Skip allocates a class instance
IEnumerable<int> skipped = a.Skip(1);
// Take allocates a class instance
IEnumerable<int> taken = skipped.Take(3);
// Operates on only the middle three elements
bool allSmall = taken.All(x => x >= 20 && x <=
40);
Console.WriteLine(allSmall); // true
}
}
#include <algorithm>
#include <iterator>
int main()
{
int v[]{ 10, 20, 30, 40, 50 };
bool allSmall = std::all_of(
std::begin(v) + 1,
std::begin(v) + 4,
[](int x) { return x >= 20 && x <= 40; });
return allSmall;
}
GCC 10.3 with basic optimization (-O1) enabled compiles this to the
constant 1 (for true) on x86-64:
main:
mov eax, 1
ret
#include <algorithm>
#include <vector>
std::vector<int> v1{ 10, 20, 30, 40, 50 };
std::vector<int> v2{ 10, 20, 35, 45, 55 };
auto isEven = [](int x) { return (x % 2) == 0; };
// Query the contents of a vector
DebugLog(std::any_of(v1.begin(), v1.begin(), isEven)); //
true
DebugLog(std::none_of(v1.begin(), v1.begin(), isEven));
// false
// Get a pair of iterators where the vectors diverge
auto mm{ std::mismatch(v1.begin(), v1.end(), v2.begin())
};
DebugLog(*mm.first, *mm.second); // 30, 35
// Get an iterator to the first matching element
auto firstEven{ std::find_if(v2.begin(), v2.end(),
isEven) };
DebugLog(*firstEven); // 10
// Get an iterator to the first matching element that
doesn't match
auto firstOdd{ std::find_if_not(v2.begin(), v2.end(),
isEven) };
DebugLog(*firstOdd); // 35
// Get an iterator to the start of the first occurrence of a
// subsequence
auto seq{ std::search(v1.begin(), v1.end(), v2.begin(),
v2.begin() + 1) };
DebugLog(*seq); // 10
#include <algorithm>
#include <vector>
#include <random>
std::vector<int> v{ 10, 20, 35, 45, 55 };
auto isEven = [](int x) { return (x % 2) == 0; };
auto print = [&]() {
for (int x : v)
{
DebugLog(x);
}
};
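// Remove matching elements by shifting the kept elements toward the front
// Returns an iterator to the new logical end; the vector's size is unchanged
auto end{ std::remove_if(v.begin(), v.end(), isEven) };
DebugLog(*end); // 45 (a leftover element whose value is unspecified)
print(); // 35, 45, 55, then two leftover elements (typically 45, 55)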
// Replace matching elements with a new value
std::replace_if(v.begin(), v.end(), [](int x) { return x
< 50; }, 50);
print(); // 50, 50, 55, 50, 55
// Rotate left by two elements
std::rotate(v.begin(), v.begin() + 2, v.end());
print(); // 55, 50, 55, 50, 50
// Randomly shuffle elements
// Note: random_shuffle() isn't thread-safe and is
deprecated since C++17
std::random_device rd{};
std::mt19937 gen{ rd() };
std::shuffle(v.begin(), v.end(), gen);
print(); // Some permutation of 55, 50, 55, 50, 50
// Assign a value to every element
std::fill(v.begin(), v.end(), 10);
print(); // 10, 10, 10, 10, 10
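Since std::remove_if only shifts elements and never changes the container's size, it's usually paired with the container's own erase function. This is commonly called the "erase-remove idiom." A minimal sketch, where RemoveEvens is a hypothetical name chosen for this example:

```cpp
#include <algorithm>
#include <vector>

// Erase-remove idiom: std::remove_if shifts the kept elements to the
// front and returns the new logical end. vector::erase then drops the
// leftover tail so the size matches.
inline void RemoveEvens(std::vector<int>& v)
{
    auto newEnd = std::remove_if(
        v.begin(), v.end(),
        [](int x) { return (x % 2) == 0; });
    v.erase(newEnd, v.end());
}
```

After calling RemoveEvens on a vector holding 10, 20, 35, 45, 55, it holds just 35, 45, 55 and its size is 3.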
#include <algorithm>
#include <vector>
std::vector<int> v{ 35, 45, 10, 20, 55 };
auto isEven = [](int x) { return (x % 2) == 0; };
auto print = [](auto& c) {
for (int x : c)
{
DebugLog(x);
}
};
// Check if a sequence is sorted
DebugLog(std::is_sorted(v.begin(), v.end())); // false
// Sort only the smallest elements into the range before the middle iterator
std::partial_sort(v.begin(), v.begin() + 2, v.end());
print(v); // 10, 20, then 35, 45, 55 in an unspecified order
DebugLog(std::is_sorted(v.begin(), v.end())); // Depends on that order (e.g. false)
// Sort the whole sequence
std::sort(v.begin(), v.end());
print(v); // 10, 20, 35, 45, 55
DebugLog(std::is_sorted(v.begin(), v.end())); // true
// Binary search a sorted sequence
DebugLog(std::binary_search(v.begin(), v.end(), 45)); //
true
// Merge two sorted sequences into a sorted sequence
std::vector<int> v2{ 15, 25, 40, 50 };
std::vector<int> v3{};
std::merge(
v.begin(), v.end(), // First sequence
v2.begin(), v2.end(), // Second sequence
std::back_insert_iterator<std::vector<int>>{ v3 });
// Output iterator
print(v3); // 10, 15, 20, 25, 35, 40, 45, 50, 55
// Check if a sorted sequence includes another sorted
sequence
// Inclusion doesn't need to be contiguous
bool inc{ std::includes(v3.begin(), v3.end(), v2.begin(),
v2.end()) };
DebugLog(inc); // true
#include <algorithm>
#include <vector>
std::vector<int> v1{ 35, 45, 10, 20, 55 };
std::vector<int> v2{ 35, 45, 10, 15, 30 };
// Check if two sequences' elements are equal
DebugLog(std::equal(v1.begin(), v1.end(), v2.begin(),
v2.end())); // false
DebugLog(
std::equal(
v1.begin(), v1.begin() + 3,
v2.begin(), v2.begin() + 3)); // true
// Find the minimum element, the maximum element, or both in a sequence
DebugLog(*std::min_element(v1.begin(), v1.end())); // 10
DebugLog(*std::max_element(v1.begin(), v1.end())); // 55
auto [minIt, maxIt] = std::minmax_element(v1.begin(),
v1.end());
DebugLog(*minIt, *maxIt); // 10, 55
// Single value versions don't operate on sequences
int a = 10;
int b = 20;
DebugLog(std::min(a, b)); // 10
DebugLog(std::max(a, b)); // 20
auto [minVal, maxVal] = std::minmax(a, b);
DebugLog(minVal, maxVal); // 10, 20
// Other single value functions
DebugLog(std::clamp(1000, 0, 100)); // 100
std::swap(a, b);
DebugLog(a, b); // 20, 10
#include <algorithm>
// A custom enum and a function to get enumerator string
names
enum class Element { Earth, Water, Wind, Fire };
const char* GetName(Element e)
{
switch (e)
{
case Element::Earth: return "Earth";
case Element::Water: return "Water";
case Element::Wind: return "Wind";
case Element::Fire: return "Fire";
default: return "";
}
}
// A custom struct
struct PrimalElement
{
Element Element;
int Power;
};
// Forward-declare a class that holds an array of the
custom struct
class PrimalElementsArray;
// An iterator type for the custom struct
class PrimalElementIterator
{
// Keep track of the current iteration position
PrimalElement* Array;
int Index;
public:
PrimalElementIterator(PrimalElement* array, int
index)
: Array(array)
, Index(index)
{
}
// Advance the iterator
PrimalElementIterator& operator++()
{
Index++;
return *this;
}
// Compare with another iterator
// This is const so it can also be called on const iterators
bool operator==(const PrimalElementIterator& other) const
{
return Array == other.Array && Index == other.Index;
}
// Dereference to get the current element
PrimalElement& operator*()
{
return Array[Index];
}
};
// A class that holds an array of the custom struct
class PrimalElementsArray
{
PrimalElement Elements[4];
public:
PrimalElementsArray()
{
Elements[0] = PrimalElement{ Element::Earth, 50
};
Elements[1] = PrimalElement{ Element::Water, 20
};
Elements[2] = PrimalElement{ Element::Wind, 10 };
Elements[3] = PrimalElement{ Element::Fire, 75 };
}
// Get an iterator to the first element
PrimalElementIterator begin()
{
return PrimalElementIterator{ Elements, 0 };
}
// Get an iterator to one past the last element
PrimalElementIterator end()
{
return PrimalElementIterator{ Elements, 4 };
}
};
// Create our custom array type
PrimalElementsArray pea{};
// Use std::find_if to find the PrimalElement with more
than 50 power
PrimalElementIterator found{
std::find_if(
pea.begin(),
pea.end(),
[](const PrimalElement& pe) { return pe.Power >
50; }) };
DebugLog(GetName((*found).Element), (*found).Power); //
Fire, 75
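The iterator above defines only the three operators that the example exercises. Formally, the algorithms look up an iterator's properties through std::iterator_traits, and code written for C++17 and earlier compares iterators with != rather than ==, so some Standard Library implementations may reject such a minimal type. The following is a sketch of a more complete forward iterator, assuming a plain int array. IntArrayIterator and IndexOfFirstAbove are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical forward iterator over a plain int array
class IntArrayIterator
{
    int* Ptr;
public:
    // Member types that std::iterator_traits looks for
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = int*;
    using reference = int&;

    IntArrayIterator() : Ptr(nullptr) {}
    explicit IntArrayIterator(int* ptr) : Ptr(ptr) {}

    // Dereference to get the current element
    reference operator*() const { return *Ptr; }
    pointer operator->() const { return Ptr; }

    // Pre-increment and post-increment
    IntArrayIterator& operator++() { ++Ptr; return *this; }
    IntArrayIterator operator++(int)
    {
        IntArrayIterator old{ *this };
        ++Ptr;
        return old;
    }

    // Both comparisons, so the type also works before C++20
    friend bool operator==(const IntArrayIterator& a, const IntArrayIterator& b)
    {
        return a.Ptr == b.Ptr;
    }
    friend bool operator!=(const IntArrayIterator& a, const IntArrayIterator& b)
    {
        return a.Ptr != b.Ptr;
    }
};

// Index of the first value greater than `limit`, or `count` if none
inline std::ptrdiff_t IndexOfFirstAbove(int* values, std::ptrdiff_t count, int limit)
{
    IntArrayIterator first{ values };
    IntArrayIterator last{ values + count };
    IntArrayIterator found{
        std::find_if(first, last, [limit](int x) { return x > limit; }) };
    return std::distance(first, found);
}
```

With the member types in place, helpers such as std::distance work on the custom iterator in addition to std::find_if.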
Numeric
Finally for this chapter, we’ll revisit the numbers library by looking at
<numeric>. It turns out that it has some number-specific generic
algorithms. Here are a few of them:
#include <numeric>
#include <vector>
std::vector<int> v{};
v.resize(5);
auto print = [](auto& c) { for (int x : c) DebugLog(x);
};
// Initialize with sequential values starting at 10
std::iota(v.begin(), v.end(), 10);
print(v); // 10, 11, 12, 13, 14
// Sum the range starting at 100
DebugLog(std::accumulate(v.begin(), v.end(), 100)); //
160
// Sum in an arbitrary order
DebugLog(std::reduce(v.begin(), v.end(), 100)); // 160
// C++17: transform each element and then reduce in an arbitrary order
// Equivalent to 1000000 + 10*10 + 11*10 + 12*10 + 13*10 + 14*10
DebugLog(
std::transform_reduce(
v.begin(), v.end(), // Sequence
1000000, // Initial value
[](int a, int b) { return a + b; }, // Reduce
function
[](int x) { return x*10; }) // Transform function
); // 1000600
// Output sums up to current iteration
std::vector<int> sums{};
std::partial_sum(
v.begin(), v.begin() + 3, // Sequence
std::back_insert_iterator{ sums }); // Output
iterator
print(sums); // 10, 21, 33
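std::accumulate isn't limited to addition: it also accepts a custom binary operation. <numeric> also provides std::inner_product for combining two sequences. A short sketch of both follows. Product and Dot are hypothetical helper names for this example:

```cpp
#include <numeric>
#include <vector>

// Multiply all elements together. 1 is the identity for multiplication,
// so an empty sequence yields 1.
inline long long Product(const std::vector<int>& v)
{
    return std::accumulate(
        v.begin(), v.end(), 1LL,
        [](long long acc, int x) { return acc * x; });
}

// Dot product of two sequences. Assumes b has at least as many
// elements as a.
inline int Dot(const std::vector<int>& a, const std::vector<int>& b)
{
    return std::inner_product(a.begin(), a.end(), b.begin(), 0);
}
```

The type of the initial value determines the type of the result, which is why Product passes 1LL rather than 1.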
Conclusion
Both languages have a wide variety of generic algorithms but they
differ quite a bit in implementation. That ranges from the trivial
naming differences of enumerators and iterators to the giant
performance gulf between LINQ and C++ algorithm functions in
<algorithm> and <numeric>.
It’s hard to overstate just how many generic algorithms are available
in the C++ Standard Library. This is especially true when looking at
the huge number of permutations of each of these functions. It’s
common to see five or even ten overloads of these to customize for
a wide variety of parameters running the gamut from simple versions
to extremely generic and flexible versions. That’s another difference
with C# where LINQ functions typically have just one or two
overloads.
The design of the language, especially its powerful support for
compile-time generic programming via templates, combines with the
iterator paradigm to enable all of this functionality not only on the
Standard Library's many container types but also on any container
types we implement in our own code to suit our own needs. We inherit
the same high level of optimization that C++ Standard Library types
receive, which gives us little excuse for writing a lot of “raw” loops.
Library Layout and iosfwd
The I/O library subset of the broader C++ Standard Library contains
several header files that often #include each other. Here’s how
those relationships look:
[Diagram: #include relationships among the I/O Library headers]
The most basic usage of the I/O library is to #include <iosfwd>. This
header provides “forward” declarations of I/O types. These can then
be named, such as by pointer or reference types. They can’t be used
by value or by accessing any of their members. <iosfwd> really just
exists to speed up compilation when I/O types only need to be
named and their full definitions, which are slower to compile, aren’t
needed.
#include <ios>
#include <locale>
void Goo(std::ios_base& base)
{
// Get the flags that control formatting
std::ios_base::fmtflags f{ base.flags() };
DebugLog((f & std::ios_base::dec) != 0); // Maybe
true
DebugLog((f & std::ios_base::hex) != 0); // Maybe
false
DebugLog((f & std::ios_base::boolalpha) != 0); //
Maybe false
// Set and unset a format flag
base.setf(std::ios_base::boolalpha);
DebugLog((base.flags() & std::ios_base::boolalpha) !=
0); // true
base.unsetf(std::ios_base::boolalpha);
DebugLog((base.flags() & std::ios_base::boolalpha) !=
0); // false
// Set and get floating-point precision
base.precision(2);
DebugLog(base.precision()); // 2
// Set and get the minimum number of characters that
some operations print
base.width(10);
DebugLog(base.width()); // 10
// Set and get the locale
base.imbue(std::locale{ "de-DE" });
DebugLog(base.getloc().name()); // de-DE
try
{
throw std::ios_base::failure{ "some I/O error" };
}
catch (const std::ios_base::failure& ex)
{
DebugLog(ex.what()); // some I/O error
}
// Ways of opening streams
// These are bit flags to form a mask
std::ios_base::openmode mode{
std::ios_base::app | // Append
std::ios_base::binary | // Binary
std::ios_base::in | // Read
std::ios_base::out | // Write
std::ios_base::trunc | // Overwrite
std::ios_base::ate // Open at end of stream
};
// Bit flags forming the state of a stream
std::ios_base::iostate state{
std::ios_base::goodbit | // No error
std::ios_base::badbit | // Unrecoverable error
std::ios_base::failbit | // Operation failed
(e.g. formatting failed)
std::ios_base::eofbit // End of stream
};
// Directions to seek
std::ios_base::seekdir dir = std::ios_base::beg; //
Beginning of stream
dir = std::ios_base::end; // End of stream
dir = std::ios_base::cur; // From the current
position
}
There’s also std::char_traits, which is a class template with static
functions that provide functionality for operations on particular kinds
of characters:
#include <ios>
// Single-character operations
DebugLog(std::char_traits<char>::eq('a', 'a')); // true
DebugLog(std::char_traits<char>::eof()); // -1
// Copy multiple characters
char buf[5] = { 0 }; // Zero-initialize so the copy is null-terminated
std::char_traits<char>::copy(buf, "abcd", 4);
DebugLog(buf); // abcd
// Lexicographical comparison
DebugLog(std::char_traits<char>::compare("abcd", "efgh", 4)); // Negative, e.g. -1
Finally, there are some free functions that set flags on std::ios_base
objects as an alternative to the setf member function:
#include <ios>
void Goo(std::ios_base& base)
{
// Use strings like "true" or numbers like 1 for
bools
std::boolalpha(base);
std::noboolalpha(base);
// Use uppercase or lowercase in hexadecimal numbers
and floats
std::uppercase(base);
std::nouppercase(base);
}
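These functions are more commonly passed to the << operator than called directly, since the stream's << operator accepts them and calls them on itself. A minimal sketch using a string stream so the result can be inspected. FormatBool is a hypothetical helper name:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Format a bool as text ("true" or "false") or as a number ("1" or "0")
inline std::string FormatBool(bool value, bool asText)
{
    std::ostringstream stream;
    if (asText)
    {
        // Equivalent to calling std::boolalpha(stream)
        stream << std::boolalpha;
    }
    else
    {
        stream << std::noboolalpha;
    }
    stream << value;
    return stream.str();
}
```

The flag stays set on the stream until it's changed again, so it only needs to be applied once per stream rather than once per value.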
streambuf
The <streambuf> header provides just one class:
std::basic_streambuf. This is an abstract base class of a way to
input and output characters. It’s meant to have its virtual functions
overridden by derived classes in such a way that they implement the
actual reading and writing from the stream. This might mean access
to a network socket, file system, GPU memory, or any other place
that serialized data can be transmitted to and received from.
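To make this concrete, here is a small sketch of a derived stream buffer. It assumes a very simple "device": a std::string that receives every character in upper case. UpperCaseBuf and ToUpperViaStream are hypothetical names, and a production stream buffer would normally set up a put area for buffering rather than handle one character at a time:

```cpp
#include <cctype>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stream buffer that upper-cases each character and
// appends it to a std::string instead of writing to a real device
class UpperCaseBuf : public std::streambuf
{
    std::string& Target;
public:
    explicit UpperCaseBuf(std::string& target) : Target(target) {}
protected:
    // Called for every character because no put area is set up
    int_type overflow(int_type ch) override
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
        {
            unsigned char c = static_cast<unsigned char>(
                traits_type::to_char_type(ch));
            Target.push_back(static_cast<char>(std::toupper(c)));
        }
        // Any value other than eof() signals success
        return traits_type::not_eof(ch);
    }
};

// Formatted output through a std::ostream that uses the custom buffer
inline std::string ToUpperViaStream(const std::string& text, int number)
{
    std::string result;
    UpperCaseBuf buf{ result };
    std::ostream stream{ &buf };
    stream << text << ' ' << number;
    return result;
}
```

All of the formatting still happens in std::ostream. The derived buffer only decides where the resulting characters go.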
Global Object   Use                                   C# Equivalent
std::cout       Standard output of char               Console.OpenStandardOutput
std::cerr       Unbuffered standard error of char     Console.OpenStandardError
std::wcerr      Unbuffered standard error of wchar_t  Console.OpenStandardError
The << operator is overloaded with all of the primitive types like long,
float, char, char* (a C-string), and bool. It’s common for us to add
an overload for our own types so we can format them for output:
#include <iostream>
// Our own type
struct Point2
{
float X;
float Y;
};
// Overload basic_ostream's << operator for our own type
template <typename TChar>
std::basic_ostream<TChar>& operator<<(
std::basic_ostream<TChar>& stream,
const Point2& point)
{
// Use the overloaded << operator with already-
supported primitive types
stream << '(' << point.X << ", " << point.Y << ')';
// Return the stream for operator chaining
return stream;
}
// Print our own type to standard output
Point2 p{ 2, 4 };
std::cout << p << '\n'; // (2, 4)\n
At long last, we can write the DebugLog function! With the support of
variadic templates, function overloading, and type-aware
formatted output to a std::basic_ostream, it’s actually only about 9
lines of code:
#include <iostream>
// Logging nothing just prints an empty line
void DebugLog()
{
std::cout << '\n';
}
// Logging one value. This is the base case.
template <typename T>
void DebugLog(const T& val)
{
// std::boolalpha prints bools as "true" or "false" instead of 1 or 0
std::cout << std::boolalpha << val << '\n';
}
// Logging two or more values
template <typename TFirst, typename TSecond, typename ...TRemain>
void DebugLog(const TFirst& first, const TSecond& second, TRemain... remain)
{
// Log the first value
std::cout << std::boolalpha << first << ", ";
// Recurse with the second value and any remaining values
DebugLog(second, remain...);
}
// Call the first function to print an empty line
DebugLog(); // \n
// Call the second function to print a single value
DebugLog('a'); // a\n
// Call the third function
// It prints "b, "
// It recurses with (1, true, "hello")
// It prints "1, "
// It recurses with (true, "hello")
// It prints "true, "
// It calls the second function with "hello"
// The second function prints "hello\n"
DebugLog('b', 1, true, "hello"); // b, 1, true, hello\n
#include <iostream>
// Unformatted output of a single character
std::cout.put('a');
// Unformatted output of a block of characters
char buf[8];
for (int i = 0; i < sizeof(buf); ++i)
{
buf[i] = 'a' + i;
}
std::cout.write(buf, sizeof(buf)); // abcdefgh
There are also functions for querying and controlling the position in
the output stream. This has no meaning for std::cout, but makes
sense for other output streams such as to files:
#include <iostream>
// Write a null byte at a position then restore the
position
void WriteNullAt(std::ostream& stream, std::streampos
pos)
{
// Get stream position
std::streampos oldPos{ stream.tellp() };
// Seek stream position
stream.seekp(pos);
// Write the null byte
stream.put(0);
// Seek the stream back
stream.seekp(oldPos);
}
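The same pattern can be tried out with a string stream, which supports seeking. A minimal sketch follows. OverwriteAt is a hypothetical helper, and it assumes that pos is a valid offset within the text already written:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Overwrite one character at offset `pos`, then restore the write
// position so that later writes continue where they left off.
// Assumes 0 <= pos < text.size().
inline std::string OverwriteAt(
    const std::string& text,
    std::streamoff pos,
    char ch,
    const std::string& suffix)
{
    std::ostringstream stream;
    stream << text;

    // Remember the current write position
    std::streampos oldPos{ stream.tellp() };

    // Seek relative to the beginning and overwrite one character
    stream.seekp(pos, std::ios_base::beg);
    stream.put(ch);

    // Seek back and keep appending
    stream.seekp(oldPos);
    stream << suffix;
    return stream.str();
}
```

For example, OverwriteAt("hello", 0, 'j', "!") produces "jello!".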
std::cout.flush();
#include <iostream>
int main()
{
std::cout << "Enter x:" << std::endl;
int x;
std::cin >> x;
std::cout << "Enter y:" << std::endl;
int y;
std::cin >> y;
std::cout << "x + y is " << (x+y) << std::endl;
}
Entering in some test values when prompted, we get the following
output:
Enter x:
2
Enter y:
4
x + y is 6
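The program above assumes that the user always types valid numbers. Formatted reads can fail, and the stream converts to false when they do, so it's worth checking. Here is a small sketch of that check. ReadSum is a hypothetical helper that takes any input stream, so it works with std::cin as well as string streams:

```cpp
#include <istream>
#include <sstream>

// Read two ints and add them.
// Returns false, leaving `sum` untouched, if either read fails.
inline bool ReadSum(std::istream& in, int& sum)
{
    int x;
    int y;
    // operator>> returns the stream, which converts to false after a
    // failed read such as non-numeric input or the end of the stream
    if (!(in >> x >> y))
    {
        return false;
    }
    sum = x + y;
    return true;
}
```

After a failure the stream's failbit remains set, so calling in.clear() would be needed before reading from it again.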
#include <iostream>
// Read 3 characters then print them
char buf[4] = { 0 };
std::cin.read(buf, 3); // Enter "abc"
DebugLog(buf); // abc
// Read 1 character and ignore it
std::cin.ignore(1);
// Read until a character is found or the end of the
buffer is hit
std::cin.getline(buf, sizeof(buf), ';'); // Enter "ab;c"
DebugLog(buf); // ab
std::cin.getline(buf, sizeof(buf), ';'); // Enter
"abcdefg"
DebugLog(buf); // abc
// Put a character into the input stream
std::cin.putback('a');
std::cin.read(buf, 1);
DebugLog(buf); // a
iomanip
The <iomanip> header is full of “manipulator” functions that we can
pass to formatted read and write operations. Here’s a sampling of
the options:
#include <iomanip>
#include <iostream>
#include <numbers>
using namespace std;
// Output 255 as hexadecimal
cout << setbase(16) << 255 << endl; // ff
// Output pi with 3 digits of precision (whole and
fractional)
cout << setprecision(3) << numbers::pi << endl; // 3.14
// Set the width of the output and how it's filled.
Useful for columns.
auto row = [](auto num, auto name, char fill = ' ') {
cout << '|' << setw(10) << setfill(fill) << name <<
'|';
cout << setw(10) << setfill(fill) << num << '|' <<
endl;
};
row("Number", "Name");
row('-', '-', '-');
row(1, "One");
row(2, "Two");
// Prints (name column first, each column right-aligned to width 10):
// |      Name|    Number|
// |----------|----------|
// |       One|         1|
// |       Two|         2|
// Output cents as US Dollars
cout.imbue(locale("en_US"));
cout << std::showbase << put_money(250) << endl; // $2.50
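Manipulators work with any stream, not just std::cout. Combining them with a string stream is a common way to build formatted strings. A brief sketch, where ToHex and ToFixed are hypothetical helper names:

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Zero-padded hexadecimal with a fixed number of digits
inline std::string ToHex(unsigned int value, int digits)
{
    std::ostringstream stream;
    // setw applies only to the next formatted output, so it comes last
    stream << "0x" << std::hex << std::setfill('0')
           << std::setw(digits) << value;
    return stream.str();
}

// Fixed-point notation with a given number of fractional digits
inline std::string ToFixed(double value, int decimals)
{
    std::ostringstream stream;
    // With std::fixed, setprecision counts only the fractional digits
    stream << std::fixed << std::setprecision(decimals) << value;
    return stream.str();
}
```

For example, ToHex(255, 4) produces "0x00ff" and ToFixed(3.14159, 2) produces "3.14".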
fstream
The <fstream> header has facilities for file system I/O. At the lowest
level, we have std::basic_filebuf which is a
std::basic_streambuf that we can use for raw file system access.
More typically, we use the std::basic_ifstream,
std::basic_ofstream, and std::basic_fstream classes for input,
output, and both. Aliases such as std::fstream are provided and
most commonly seen. These are the rough equivalent of FileStream
in C#:
#include <fstream>
void Foo()
{
// Open the file for writing
std::fstream stream{ "/path/to/file",
std::ios_base::out };
// Formatted write to the file, including a flush via
endl
stream << "hello" << std::endl;
} // fstream's destructor closes the file
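Opening a file can fail, so it's common to check is_open before using the stream. Here is a sketch of a write-then-read round trip using the more specific std::ofstream and std::ifstream aliases. RoundTrip is a hypothetical helper, and it assumes that the given path is writable:

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Write a line to a file, read it back, then delete the file.
// Returns an empty string if the file can't be opened.
inline std::string RoundTrip(const std::string& path, const std::string& line)
{
    {
        std::ofstream out{ path };
        if (!out.is_open())
        {
            return {};
        }
        out << line << '\n';
    } // ofstream's destructor flushes and closes the file

    std::string result;
    {
        std::ifstream in{ path };
        if (!in.is_open())
        {
            return {};
        }
        // Read up to, but not including, the newline
        std::getline(in, result);
    } // ifstream's destructor closes the file

    // Clean up the temporary file
    std::remove(path.c_str());
    return result;
}
```

The nested scopes are there so that each stream's destructor closes the file before the next step uses it.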
#include <sstream>
// Create a stream for an empty string
std::ostringstream stream{};
// Formatted writing
stream << "Hello" << 123;
// Unformatted writing
stream.write("Goodbye", 7); // 7 characters, excluding the null terminator
// Get a string for what was written
std::string str{ stream.str() };
DebugLog(str); // Hello123Goodbye
#include <sstream>
// Create a stream for a string
std::istringstream stream{ "Hello 123Goodbye" };
// Formatted reading
std::string str;
int num;
stream >> str >> num;
DebugLog(str); // Hello
DebugLog(num); // 123
// Unformatted reading
char buf[8] = { 0 };
// Read 7 characters, leaving room for the null terminator
stream.read(buf, 7);
DebugLog(buf); // Goodbye
#include <sstream>
// Create a stream for an empty string
std::stringstream stream{};
// Formatted writing
stream << "Hello 123";
// Change read position to the beginning
stream.seekg(0, std::ios_base::beg);
// Formatted reading
std::string str;
int num;
stream >> str >> num;
DebugLog(str); // Hello
DebugLog(num); // 123
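String streams are also handy for splitting text, since std::getline accepts a custom delimiter. A minimal sketch, with Split as a hypothetical helper name:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split text on a delimiter character. Empty fields between two
// adjacent delimiters are kept.
inline std::vector<std::string> Split(const std::string& text, char delim)
{
    std::vector<std::string> parts;
    std::istringstream stream{ text };
    std::string part;
    // getline returns the stream, which converts to false when there
    // is nothing left to read
    while (std::getline(stream, part, delim))
    {
        parts.push_back(part);
    }
    return parts;
}
```

For example, Split("a,b,,c", ',') yields the four strings "a", "b", "", and "c".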
syncstream
The final header of the I/O library was introduced with C++20:
<syncstream>. It provides std::basic_syncbuf and
std::basic_osyncstream to synchronize the writing to a stream from
multiple threads. One motivating example is printing logs to standard
output. Consider how this works without synchronization:
#include <iostream>
#include <thread>
#include <chrono>
#include <functional>
// Prints "helloworld" to standard output 100 times
void Print(std::ostream& stream)
{
for (int i = 0; i < 100; ++i)
{
stream << "helloworld" << std::endl;
std::this_thread::sleep_for(std::chrono::microseconds{ 1
});
}
}
// Spawn a thread to print
std::jthread t{ Print, std::ref(std::cout) };
// Print on the main thread while the thread is running
Print(std::cout);
helloworld
helloworld
helloworldhelloworld
helloworld
helloworld
helloworld
helloworld
Here one of the threads printed helloworld but the other thread
interrupted to print helloworld\n before the first thread could print its
\n character. When the first thread resumed execution, it printed that
\n resulting in two \n in a row: \n\n.
#include <syncstream>
#include <iostream>
#include <thread>
#include <chrono>
#include <functional>
void Print(std::ostream& stream)
{
for (int i = 0; i < 100; ++i)
{
stream << "helloworld" << std::endl;
std::this_thread::sleep_for(std::chrono::microseconds{ 1
});
}
}
// Create a synchronized stream backed by std::cout
std::osyncstream out{ std::cout };
// Print to the synchronized stream
std::jthread t{ Print, std::ref(out) };
Print(std::cout);
Conclusion
The C++ “I/O streams” library is far more powerful than basic
functionality like printf found in the C Standard Library. It’s not
nearly as error-prone since it makes use of the C++ type system
rather than manually-entered “format strings.” It’s far more extensible
since we can write our own format functions, manipulator functions,
and stream types to read and write from whatever kind of device we
encounter.
Like the C++ language itself, the C++ Standard Library is a standard.
There are various implementations of the standard, but they all have
approximately the same features. Variance between them is mostly
in the form of unspecified behavior such as exception message
strings and deviations from the standard such as adding or removing
some (usually relatively-minor) features. This is especially true in
newly-released standards such as C++20 at the time of writing.
C++ has no built-in GUI support, but can access Windows Forms and Windows Presentation
Foundation via C++/CLI. Quite a few libraries are also available for cross-platform GUI
development:
Library     Language  Licence                Windows  macOS  Linux  Android  iOS  Web Browser
Qt          C++       LGPL, GPL, Commercial  Yes      Yes    Yes    Yes      Yes  Yes
Dear ImGui  C++       MIT                    Yes      Yes    Yes    Yes      Yes  Yes
CPU Intrinsics
System.Runtime.Intrinsics and its sub-namespaces
System.Runtime.Intrinsics.X86 and
System.Runtime.Intrinsics.Arm contain “intrinsics” for x86 and
ARM CPUs. These are functions whose calls are translated by the
compiler directly into a named CPU instruction. They provide low-
level control without needing to resort to assembly code.
Support for stack traces has been proposed for the Standard Library,
but we’ll have to wait until at least C++23 for it to be adopted. For
now, the Boost-licensed Boost.Stacktrace library that the Standard
Library proposal is based on is available to fill the gap.
Database Clients
The System.Data namespace goes as far as to build in support for particular
databases. System.Data.SqlClient is a client for Microsoft SQL
Server and System.Data.OracleClient is a client for Oracle
Database.
Library                               Language  Licence            Database
Microsoft ODBC Driver for SQL Server  C         Commercial         Microsoft SQL Server
Oracle C++ Call Interface             C++       Commercial         Oracle Database
MySQL Connector                       C++       GPL or Commercial  MySQL
The area of overlap between .NET and the C++ Standard Library
relates to general tools such as collections and file system access.
Sometimes the C++ Standard Library has more available here, such
as with its double-ended queue type. Other times .NET has more
available, such as with its ability to break an interactive debugger or
open a network socket. These tools could be added to the C++
Standard Library in the future, but for now we need to employ other
libraries to get access to them.
52. Idioms and Best Practices
Guides
There are several existing, popular guides that aim to impose
programming standards on C++ codebases for a variety of reasons.
These reasons range from trivialities such as formatting to
standardization of error-handling and outright bans on certain
language features. Additionally, many teams and organizations will
create their own in-house rules and possibly enforce them with tools
like ClangFormat.
// Avoid
#define PI 3.14
// Encourage
const float PI = 3.14;
// Avoid
#define SQUARE(x) x * x
// Encourage
float Square(float x)
{
return x * x;
}
Add include guards to every header
As of this writing, modules are not yet in common usage. While still
using the classic header file-based build system, every header file
should have “include guards” to prevent redundant definitions by
multiple #include directives:
// Avoid
struct Point2
{
float X;
float Y;
};
// Encourage
#ifndef POINT2_HPP
#define POINT2_HPP
struct Point2
{
float X;
float Y;
};
#endif
// Encourage (non-standard but widely-supported alternative)
#pragma once
struct Point2
{
float X;
float Y;
};
Include dependencies directly instead of relying on indirect includes
When one file uses code in another file, it should #include the file
declaring that code directly rather than relying on another header to
#include the desired code. This prevents compilation errors if the
middle header removes its #include.
////////
// Avoid
////////
// a.h
struct A {};
// b.h
#include "a.h"
struct B : A {};
// c.h
#include "b.h"
A a; // Not in b.h
////////////
// Encourage
////////////
// a.h
struct A {};
// b.h
#include "a.h"
struct B : A {};
// c.h
#include "a.h"
A a;
Don’t call virtual functions in constructors
Virtual functions rely on a table that’s initialized by the constructors
of the classes in an inheritance hierarchy. Calling virtual functions
before these are set up can result in crashes or calling the wrong
version of the function:
// Avoid
struct Parent
{
Parent()
{
Foo();
}
virtual void Foo()
{
DebugLog("Parent");
}
};
struct Child : Parent
{
virtual void Foo() override
{
DebugLog("Child");
}
};
Child c; // Prints "Parent" not "Child"!
// Encourage designs that do not require such calls
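One design that does not require such calls moves the virtual call out of the constructor and into a separate step that runs once the object is fully constructed. The following is only a minimal sketch of that idea, not a prescribed pattern. The Create factory, the Init function, the Name function, and the Description field are all hypothetical names chosen for illustration:

```cpp
#include <memory>
#include <string>

struct Parent
{
    std::string Description;

    virtual ~Parent() = default;

    virtual std::string Name() const
    {
        return "Parent";
    }

    // Hypothetical second phase: runs after all constructors have finished,
    // so the virtual call dispatches to the most-derived override
    void Init()
    {
        Description = "I am a " + Name();
    }
};

struct Child : Parent
{
    std::string Name() const override
    {
        return "Child";
    }
};

// Hypothetical factory that constructs the object first and only then
// calls Init on it
template <typename T>
std::unique_ptr<T> Create()
{
    std::unique_ptr<T> obj = std::make_unique<T>();
    obj->Init();
    return obj;
}
```

With this arrangement, Create<Child>() produces an object whose Description mentions Child rather than Parent because Name is called only after the Child part of the object exists. The trade-off is that callers need to go through the factory instead of constructing objects directly.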
Don’t use variadic functions
Variadic functions aren’t type-safe, rely on error-prone macros, are
difficult to optimize, and often result in error-prone APIs. They should
be avoided in favor of techniques such as fold expressions or the
use of container types:
// Avoid
void DebugLog(int count, ...)
{
va_list args;
va_start(args, count);
for (int i = 0; i < count; ++i)
{
const char* log = va_arg(args, const char*);
std::cout << log << ", ";
}
va_end(args);
}
DebugLog(4, "foo", "bar", "baz"); // Whoops! 4 reads beyond the last arg!
// Encourage
void DebugLog()
{
std::cout << '\n';
}
template <typename T>
void DebugLog(const T& val)
{
std::cout << val << '\n';
}
template <typename TFirst, typename TSecond, typename ...TRemain>
void DebugLog(const TFirst& first, const TSecond& second, TRemain... remain)
{
std::cout << first << ", ";
DebugLog(second, remain...);
}
DebugLog("foo", "bar", "baz"); // No need for a count. Can't get it wrong.
No naked new and delete
The new and delete operators to dynamically allocate memory
should be a rare sight. Instead, “owning” types such as containers
and smart pointers should call new in their constructors and delete in
their destructors to ensure that memory is always cleaned up:
// Avoid
struct Game
{
Stats* stats;
Game()
: stats{new Stats()}
{
}
~Game()
{
delete stats;
}
};
// Encourage
struct Game
{
std::unique_ptr<Stats> stats;
Game()
: stats{std::make_unique<Stats>()}
{
}
};
// Avoid
struct FloatBuffer
{
float* floats;
FloatBuffer(int32_t count)
: floats{new float[count]}
{
}
~FloatBuffer()
{
delete [] floats;
}
};
// Encourage
struct FloatBuffer
{
std::vector<float> floats;
FloatBuffer(int32_t count)
: floats(count) // Parentheses select the size constructor, not the initializer list one
{
}
};
Prefer range-based loops
The most common loop is from the beginning to the end of a
collection. To avoid mistakes and make this more terse, use a
range-based for loop instead of the three-part for loop, a while
loop, or a do-while loop:
// Avoid
for (
std::vector<float>::iterator it = floats.begin();
it != floats.end();
++it)
{
DebugLog(*it);
}
// Encourage
for (float f : floats)
{
DebugLog(f);
}
Use scoped enums instead of unscoped enums
To avoid adding all the enumerators of an unscoped enum to the
surrounding scope, use a scoped enumeration:
// Avoid
enum Colors { Red, Green, Blue };
uint32_t Red = 0x00ff00ff; // Error: redefinition because Red escaped the enum
// Encourage
enum class Colors { Red, Green, Blue };
uint32_t Red = 0x00ff00ff; // OK
Don’t breach namespaces in headers
When commonly using the members of a namespace, it can be
convenient to pull them out with using namespace. When this is done
in a header, the files that #include it have this decision forced on
them. This can lead to namespace collisions and confusion, so it
should be avoided.
// Avoid
using namespace std;
struct Name
{
string First;
string Last;
};
// Encourage
struct Name
{
std::string First;
std::string Last;
};
Make single-parameter constructors explicit
Constructors default to allowing implicit conversion, which can be
surprising and expensive. Use the explicit keyword to disallow this
behavior:
// Avoid
struct Buffer
{
std::vector<float> floats;
Buffer(int x)
: floats(x) // Parentheses select the size constructor
{
}
};
void DoStuff(Buffer b)
{
}
DoStuff(1'000'000); // Allocates a Buffer of one million floats!
// Encourage
struct Buffer
{
std::vector<float> floats;
explicit Buffer(int x)
: floats(x) // Parentheses select the size constructor
{
}
};
void DoStuff(Buffer b)
{
}
DoStuff(1'000'000); // Compiler error
Don’t use C casts
C-style casts may just change the type but also might perform value
conversion. They’re hard to search for as they blend in with other
parentheses. Instead, use a named C++ cast for better control and
easier searching:
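// Avoid
const float val = 5.5;
int x = (int)val; // Changes type and truncates, but doesn't say which kind of cast this is
DebugLog(x);
// Encourage
const float val = 5.5;
int x = static_cast<int>(val); // Clearly a value conversion: truncates to 5
// int y = const_cast<int>(val); // Compiler error: const_cast can't convert values
DebugLog(x);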
Use specific integer sizes
All the way back in the second chapter of the book, we learned that
the size guarantees for primitive types like long are very weak. They
should be avoided in favor of guaranteed sizes in <cstdint>:
// Avoid
long GetFileSize(const char* path)
{
// Does this support files larger than 4 GB?
// Depends on whether long is 32-bit or 64-bit
}
// Encourage
int64_t GetFileSize(const char* path)
{
// Definitely supports large files
}
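The difference in guarantees can be checked at compile time. The sketch below is illustrative only. LongIs64Bit and FitsInLong are hypothetical names introduced here, and std::int64_t is assumed to be provided by the platform, as it is on mainstream compilers:

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// long is only guaranteed to be at least 32 bits, so this may be true or false
// depending on the compiler and platform
constexpr bool LongIs64Bit = sizeof(long) * CHAR_BIT >= 64;

// std::int64_t, when provided, is exactly 64 bits on every platform
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64);

// Hypothetical helper: can a file of this many bytes be described by a long?
constexpr bool FitsInLong(std::int64_t numBytes)
{
    return numBytes >= std::numeric_limits<long>::min()
        && numBytes <= std::numeric_limits<long>::max();
}
```

A five-gigabyte file size passes FitsInLong only where long happens to be 64 bits wide, which is exactly the portability question that a std::int64_t return type avoids.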
Use nullptr
NULL is an implementation-defined variable-like macro that even
requires a #include to use. It’s a null pointer constant, but of
unknown type and erroneously usable in arithmetic. In contrast
nullptr is not a macro or an integer, requires no header, and even
has its own type which can be used in overload resolution. It should
be used to represent null pointers:
// Avoid
int* p = NULL;
// Encourage
int* p = nullptr;
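Because nullptr has its own type, std::nullptr_t, it can take part in overload resolution in a way that a plain integer cannot. The following sketch uses a hypothetical set of Describe overloads to show which one each argument selects:

```cpp
#include <cstddef>
#include <string>

// Hypothetical overload set used only to show which overload is chosen
std::string Describe(int)
{
    return "int";
}

std::string Describe(int*)
{
    return "int*";
}

std::string Describe(std::nullptr_t)
{
    return "std::nullptr_t";
}
```

Passing 0 selects the int overload, passing the address of an int selects the int* overload, and passing nullptr selects the std::nullptr_t overload. NULL is deliberately left out of the sketch because the overload it selects depends on how the implementation defines the macro.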
Follow the Rule of Zero
Most classes shouldn’t need any copy constructors, move
constructors, destructors, copy assignment operators, or move
assignment operators. Instead, their data members should take care
of these functions. This is called the “rule of zero” because no
special functions need to be added. It’s the simplest approach and
the hardest to implement incorrectly:
// Avoid
struct Player
{
std::string Name;
int32_t Score;
Player(std::string name, int32_t score)
: Name{ name }
, Score{ score }
{
}
Player(const Player& other)
: Name{ other.Name }
, Score{ other.Score }
{
}
Player(Player&& other) noexcept
: Name{ std::move(other.Name) }
, Score{ std::move(other.Score) }
{
}
virtual ~Player()
{
}
Player& operator=(const Player& other)
{
Name = other.Name;
Score = other.Score;
return *this;
}
Player& operator=(Player&& other) noexcept
{
Name = std::move(other.Name);
Score = std::move(other.Score);
return *this;
}
};
Player p{ "Jackson", 1000 };
// Encourage
struct Player
{
std::string Name;
int32_t Score;
};
Player p{ "Jackson", 1000 };
Follow the Rule of Five
In cases where the Rule of Zero can’t be followed and a special
function needs to be added, add all five of them to handle all the
ways that objects can be copied, moved, and destroyed:
// Avoid
struct File
{
std::string Path;
const char* Mode;
FILE* Handle;
File(std::string path, const char* mode)
: Path{ path }
, Mode{ mode }
, Handle(fopen(path.c_str(), mode))
{
}
File(const File& other)
: Path{ other.Path }
, Mode{ other.Mode }
, Handle(fopen(other.Path.c_str(), other.Mode))
{
}
virtual ~File()
{
if (Handle)
{
fclose(Handle);
}
}
File& operator=(const File& other)
{
Path = other.Path;
Mode = other.Mode;
Handle = fopen(other.Path.c_str(), other.Mode);
return *this;
}
// No move constructor or move assignment operator
// Expensive copies will be required: more file open and close operations!
};
// Encourage
struct File
{
std::string Path;
const char* Mode;
FILE* Handle;
File(std::string path, const char* mode)
: Path{ path }
, Mode{ mode }
, Handle(fopen(path.c_str(), mode))
{
}
File(const File& other)
: Path{ other.Path }
, Mode{ other.Mode }
, Handle(fopen(other.Path.c_str(), other.Mode))
{
}
File(File&& other) noexcept
: Path{ std::move(other.Path) }
, Mode{ other.Mode }
, Handle(other.Handle)
{
other.Handle = nullptr;
}
virtual ~File()
{
if (Handle)
{
fclose(Handle);
}
}
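File& operator=(const File& other)
{
if (this != &other)
{
if (Handle)
{
fclose(Handle); // Close the old file before replacing it
}
Path = other.Path;
Mode = other.Mode;
Handle = fopen(other.Path.c_str(), other.Mode);
}
return *this;
}
File& operator=(File&& other) noexcept
{
if (this != &other)
{
if (Handle)
{
fclose(Handle); // Close the old file before taking ownership
}
Path = std::move(other.Path);
Mode = other.Mode;
Handle = other.Handle;
other.Handle = nullptr;
}
return *this;
}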
};
Avoid raw loops
Hand-implemented algorithms are error-prone and difficult to read.
Many common algorithms, and even parallelized versions, are
implemented in the algorithms library and ranges library for us and
can be used with a broad number of types. Readers of such code
encounter just a named algorithm which they’re likely already familiar
with rather than needing to interpret that from a possibly-complex
loop.
// Avoid
struct Player
{
const char* Name;
int NumPoints;
};
void Avoid(const std::vector<Player>& players)
{
using It = std::reverse_iterator<std::vector<Player>::const_iterator>;
for (It it = players.rbegin(); it != players.rend(); ++it)
{
const Player& player = *it;
if (player.NumPoints > 25)
{
Player copy = player;
copy.NumPoints--;
DebugLog(copy.Name, copy.NumPoints);
}
}
}
// Encourage
struct Player
{
const char* Name;
int NumPoints;
};
void Encourage(const std::vector<Player>& players)
{
using namespace std::ranges::views;
auto result =
players
| filter([](Player p) { return p.NumPoints > 25; })
| transform([](Player p) { p.NumPoints--; return p; })
| reverse;
for (const Player& p : result)
{
DebugLog(p.Name, p.NumPoints);
}
}
Add restrictions
When using objects in a read-only way, make them const. When
code can usefully run at compile time, make it constexpr. When a
function can’t throw any exceptions, make it noexcept. When fields
don’t need to be used outside of a class, make them protected or
private. When derivation or overriding are undesirable, make
classes and member functions final. All of these restrictions will add
compiler-enforced rules that prevent misuse such as field access or
enable new uses such as compile-time code execution.
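As a rough illustration of how these restrictions combine, consider the following sketch of a hypothetical Circle class. It is not taken from any particular codebase and exists only to show each keyword in context:

```cpp
#include <type_traits>

// final: deriving from Circle is a compiler error
class Circle final
{
    // private by default in a class: only member functions can touch the field
    float radius;

public:
    // constexpr: usable at compile time. noexcept: promises not to throw.
    constexpr explicit Circle(float r) noexcept
        : radius{ r }
    {
    }

    // const: callable on read-only objects
    constexpr float Radius() const noexcept
    {
        return radius;
    }

    constexpr float Area() const noexcept
    {
        return 3.14159265f * radius * radius;
    }
};

// All of these checks are evaluated by the compiler
static_assert(std::is_final_v<Circle>);
static_assert(Circle{ 2.0f }.Radius() == 2.0f);
static_assert(noexcept(Circle{ 2.0f }.Area()));
```

Each keyword removes a way to misuse the class or, as with constexpr, enables a new way to use it. Attempting to derive from Circle, access radius from outside the class, or modify a const Circle is rejected at compile time.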
// Avoid
template <typename T>
T GetDefault()
{
T t; // Default constructor for classes but nothing for primitives
return t;
}
struct Point2
{
float X{ 0 };
float Y{ 0 };
};
std::ostream& operator<<(std::ostream& s, const Point2& p)
{
s << p.X << ", " << p.Y;
return s;
}
DebugLog(GetDefault<Point2>()); // 0, 0
DebugLog(GetDefault<int>()); // undefined behavior!
// Encourage
template <typename T>
T GetDefault()
{
T t{}; // Default constructor for classes, primitives are value-initialized
return t;
}
DebugLog(GetDefault<Point2>()); // 0, 0
DebugLog(GetDefault<int>()); // 0
Standardize error-handling
There are two main choices for error-handling in C++: exceptions
and error codes. C’s errno isn’t considered a valid choice due to its
reliance on global state which is not part of the call signature and not
thread-safe. Codebases should choose one approach or the other to
handle errors consistently and safely. For example, introducing
exceptions into a codebase that uses error codes is likely to cause
uncaught exceptions that crash the program.
// Avoid
try
{
throw new std::runtime_error{ "Boom!" };
}
catch (std::runtime_error* err)
{
DebugLog(err->what()); // Boom!
// ... memory leak here ...
}
// Encourage
try
{
throw std::runtime_error{ "Boom!" };
}
catch (const std::runtime_error& err)
{
DebugLog(err.what()); // Boom!
}
////////
// Avoid
////////
// Caller doesn't know what happens upon error
// Caller can ignore error return values
FILE* OpenFile(const char* path, const char* mode)
{
return fopen(path, mode);
}
FILE* handle = OpenFile("/path/to/file", "w+"); // "rw" isn't a standard fopen mode
fprintf(handle, "Hello!"); // Crash if null is returned
fclose(handle);
////////////
// Encourage
////////////
// Caller clearly knows this can fail due to the std::optional return value
// Caller can't ignore it due to the [[nodiscard]] attribute
[[nodiscard]] std::optional<FILE*> OpenFile(const char* path, const char* mode)
{
FILE* handle = fopen(path, mode);
if (!handle)
{
return {};
}
return handle;
}
// Ignoring the return value triggers a compiler warning due to [[nodiscard]]
std::optional<FILE*> result = OpenFile("/path/to/file", "w+");
if (!result.has_value())
{
DebugLog("Failed to open file");
return;
}
// Can't directly use the result. Forced to deal with it being optional.
// Fewer chances to dereference null and crash
FILE* handle = result.value();
fprintf(handle, "Hello!");
fclose(handle);
Mark overridden member functions with override
A virtual member function that overrides a base class’ member
function doesn’t have to be marked that way, but it’s helpful to
indicate this. It provides a keyword that readers of the code can look
for to know how the function fits into the class design. It also
provides the compiler with a way to enforce that the function really
overrides a base class version. If the function signatures subtly
don’t match or the base class no longer has such a function, the
compiler will catch the mistake instead of creating a new function.
// Avoid
struct Animal
{
virtual void Speak(const char* message, bool loud=false)
{
// By default, animals can't speak
}
};
struct Dog : Animal
{
// Missing "loud" parameter creates a new function
virtual void Speak(const char* message)
{
DebugLog("woof: ", message);
}
};
std::unique_ptr<Animal> a = std::make_unique<Dog>();
a->Speak("go for a walk?"); // Prints nothing because Dog doesn't override
// Encourage
struct Animal
{
virtual void Speak(const char* message, bool loud=false)
{
// By default, animals can't speak
}
};
struct Dog : Animal
{
// Missing "loud" parameter is a compiler error
virtual void Speak(const char* message) override
{
DebugLog("woof: ", message);
}
};
std::unique_ptr<Animal> a = std::make_unique<Dog>();
a->Speak("go for a walk?"); // Never executed due to compiler error
Use using, not typedef
C’s typedef alias is still supported, but using is a strictly better
version of it. The alias and the target are written in the familiar
assignment form, with the new name on the left-hand side and the
existing type on the right-hand side. It also supports being
templated, so it fits in better with generic programming.
// Avoid
typedef float f32;
f32 pi = 3.14f;
// Encourage
using f32 = float;
f32 pi = 3.14f;
// Avoid
#define VEC(T) std::vector<T>
VEC(float) floats;
// Encourage
template <typename T> using Vec = std::vector<T>;
Vec<float> floats;
Minimize function definitions in header files
Header files are typically compiled many times as many translation
units directly or indirectly #include them. Any changes to the header
file will require recompiling all the translation units that #include it.
Functions defined in a header must also be marked inline so that the
linker can de-duplicate them, but the repeated compilation is slow and
so build and iteration times suffer. To reduce the time it
takes to compile header files, reduce the number of function
definitions in them. Instead, declare functions in them and define
them in translation units whenever possible.
////////
// Avoid
////////
// math.h
#pragma once
#include <cmath>
// Must be inline to avoid multiple-definition linker errors
inline bool IsNearlyZero(float x)
{
return std::abs(x) < 0.0001f;
}
////////////
// Encourage
////////////
// math.h
#pragma once
bool IsNearlyZero(float x);
// math.cpp
#include "math.h"
#include <cmath>
bool IsNearlyZero(float x)
{
return std::abs(x) < 0.0001f;
}
Use internal linkage for file-specific definitions
By default, entities like variables, functions, and classes have
external linkage at file scope. This slows down compilation and the
linker because they need to consider the possibility that some other
translation unit might want to reference those entities. To speed it up,
use static or an unnamed namespace to give those entities internal
linkage and remove their candidacy for reference by other translation
units.
// Avoid
float PI = 3.14f;
// Encourage
static float PI = 3.14f;
// Encourage
namespace
{
float PI = 3.14f;
}
Use operator overloading and user-defined literals very sparingly
Overloaded operators don’t really get a name and user-defined
literals usually only have a terse one. As such, it’s often hard for
readers to understand what they’re doing. Even worse, overloaded
operators may appear to have one meaning while the
implementation of the overloaded operator does something else.
These should generally be avoided except in cases where the
meaning is already well-understood. For example, the + operator on
two std::string objects is clearly concatenation of the left hand
operand followed by the right hand operand but the + operator on
two Player objects is quite a puzzle.
// Avoid
struct Player
{
int32_t Points;
Player operator+(const Player& other)
{
return { Points + other.Points };
}
};
Player a{ 100 };
Player b{ 200 };
Player c = a + b; // No conventional meaning for what + does
DebugLog(c.Points); // 300
// Encourage
struct Vector2
{
float X;
float Y;
Vector2 operator+(const Vector2& other)
{
return { X + other.X, Y + other.Y };
}
};
Vector2 a{ 100, 200 };
Vector2 b{ 300, 400 };
Vector2 c = a + b; // Well-understood mathematical operator
DebugLog(c.X, c.Y); // 400, 600
Prefer pre-increment to post-increment
Whether we use the pre-increment operator (++x) or the post-
increment operator (x++) on a primitive type like int makes no
difference. With classes that have overloaded this operator,
especially in the case of iterators, the pre-increment operator can be
implemented more efficiently by removing the need to temporarily
have two copies. It’s generally preferable to use the pre-increment
operator for this reason:
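The difference can be sketched with a hypothetical Counter type that overloads both operators. This is only an illustration. A real iterator would carry more state, which makes the extra copy more costly:

```cpp
// Hypothetical type used only to contrast the two operators
struct Counter
{
    int Value{ 0 };

    // Pre-increment: modify in place and return a reference. No copy is made.
    Counter& operator++()
    {
        ++Value;
        return *this;
    }

    // Post-increment: the old state has to be copied so it can be returned
    // after the object itself has been modified
    Counter operator++(int)
    {
        Counter old = *this;
        ++Value;
        return old;
    }
};
```

In a loop header such as for (auto it = begin; it != end; ++it) the returned value is discarded, so the copy made by it++ would be pure overhead unless the compiler manages to remove it.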
Regardless of the decision, and both are popular, both camps tend
to agree that very long types are hard to read and often made more
clear by the use of auto:
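As a sketch of that point, here are two versions of the same hypothetical function that totals the scores in a map. They differ only in how the iterator type is written. The function names and the shape of the map are assumptions made for illustration:

```cpp
#include <map>
#include <string>
#include <vector>

// The iterator type is spelled out in full
int TotalSpelledOut(const std::map<std::string, std::vector<int>>& scores)
{
    int total = 0;
    for (std::map<std::string, std::vector<int>>::const_iterator it = scores.begin();
         it != scores.end();
         ++it)
    {
        for (int score : it->second)
        {
            total += score;
        }
    }
    return total;
}

// The compiler deduces the same iterator type from scores.begin()
int TotalWithAuto(const std::map<std::string, std::vector<int>>& scores)
{
    int total = 0;
    for (auto it = scores.begin(); it != scores.end(); ++it)
    {
        for (int score : it->second)
        {
            total += score;
        }
    }
    return total;
}
```

Both functions compile to the same thing. The auto version simply leaves out a type name that adds little for the reader.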
In the end, we’re not working on some abstract code. We’ll work on
particular codebases and it’s important to be aware of the norms of
those particular environments. Each will have its own written or
unwritten rules, not to mention very subjective thoughts on style such
as the placement of curly braces and whether indentation should be
done with tabs or spaces. These certainly aren’t C++-specific issues,
but in the case of C++ its wide use, large size, and long history
somewhat increase the challenge.
53. Conclusion
Language
C++ and C# have quite different design goals. C++ aims to be able
to be implemented by a compiler so efficiently that a programmer
would never need to use another language, like C, to improve
performance. In practice, assembly is sometimes used when
ultimate performance is required. It’s debatable as to whether this
counts as another language. C++ then tries to provide as much
programmer convenience as it can while also keeping to a high
degree of backward-compatibility.
The same kind of criticisms are made of C++, but in the opposite
direction. Its focus on performance results in many sharp edges.
Variables aren’t initialized by default and it’s pretty easy to use a
“dangling” pointer. There’s a lot of “undefined behavior,” too. Most of
this is necessary because providing these guarantees is deemed to
be too limiting or would entail overhead such as the addition of a GC.
In the end, both languages have different goals and have made
decades of design choices in line with achieving those goals. Each
language becomes rather unpleasant to use outside its intended
purpose. C++ is probably a poor choice for a web service and C#
is probably a poor choice for training a neural network. Heroic efforts
have been made to improve C++’s programmer-friendliness and
C#’s performance, but these remain uphill battles even after many
years of struggle.
Standard Library
C++’s standard library is much more conservative than C#’s. It’s
company- and industry-agnostic and sticks to well-standardized
techniques and algorithms. C#’s standard library has a lot of
company-specific features, especially when it comes to Microsoft-
owned technologies such as Windows. In general, it’s a lot larger
than the C++ Standard Library as it contains all of this company-
specific functionality but also a lot of support for widely-used
standards such as JSON and AES. One consequence of this
broader support is that support for older features such as GDI+ is
carried forward as baggage in C# or dropped at the cost of
backward-compatibility.
In terms of design, the two again diverge quite a bit. C++ provides
powerful language features that enable it to efficiently implement
“core” types like strings and tuples in the C++ Standard Library. C#
prefers to build these into the language. Where C++ provides zero-
overhead extension, such as through template-based compile-time
polymorphism, of the types in its standard library, C# often provides
little extensibility or extensibility via mechanisms such as virtual
functions that entail a runtime cost. The C# standard library is
typically easier to use and more consistent across codebases but
with lower performance and customizability. This is an extension and
implication of the two languages’ design goals to their standard
libraries.
When we need this level of control, we’re outside C#’s comfort
zone and we’ll face headwinds. To illustrate, let’s consider two paths
we could take to solve the problem. First, we can use a subset of C#
that doesn’t include features like classes. This is the route taken by
Unity’s Burst compiler and its “High Performance C#” (HPC#)
language subset. It uses structs and (unsafe) pointers instead of
classes in order to provide its own memory allocation and
deallocation.
The main issue with this approach is that a lot of C# language and
library design assumes that classes are present. When we kick them
out of the language, we lose our only mechanism that supports
inheritance, virtual functions, default constructors, and reference
semantics. We also make almost all C# libraries unusable as they
don’t conform to our language subset. The result is a very
constrained environment where we end up needing to call Dispose
functions to manually manage memory and where we cut ourselves
on sharp edges like the use of uninitialized objects due to the lack of
default constructors or the use of objects after calling Dispose.
Runtime safeguards can and have been added, but with runtime
overhead and feedback on programming errors delayed to runtime.
Neither is necessary in idiomatic C# where classes are used.
The second path is to keep using the whole language but in a very
unidiomatic way. This has been the traditional approach to C#
programming in Unity. One common example is the object pool
where we avoid releasing references so that the GC doesn’t run and
cause a frame spike:
It’s to the point that best practices discourage using these language
features outside of specialty code such as classes that own the
memory through their lifecycle functions. We need to instead use
library code that makes the raw language easier to use:
// Need a library
#include <memory>
void Foo()
{
// When needed, allocate and initialize a particle
std::unique_ptr<Particle> p =
std::make_unique<Particle>(Color::Red);
// ... use p ...
} // unique_ptr's destructor deletes the Particle
Practically, our best option is to learn the strong and weak suits of
the two languages and use them for the purposes they’re best suited
to. A deep knowledge of each language, their standard libraries, and
the surrounding world of libraries and frameworks is extremely
helpful when it comes to knowing what’s possible, what’s feasible,
which language to choose for which task, and, ultimately, how to go
about the process of actually implementing in the chosen language.