
Design-Sys 2

This document discusses namespaces in C++ and provides several examples. It shows how namespaces avoid name collisions and allow the same name to be used in different contexts. Specifically, it demonstrates: 1) How namespaces work and how functions or variables with the same name in different namespaces do not cause ambiguities. 2) Nested namespaces and how names can be qualified to specify which namespace a name belongs to. 3) The std namespace that is reserved for standard library components. 4) How namespaces can affect each other if care is not taken, such as when a definition is added that breaks code in another namespace.

Uploaded by

drbulus

NAMESPACES 63

generated output:
fun called for 0
*/
The compiler is rather smart when handling namespaces. Had Value in the namespace FBB
been defined as typedef int Value, then FBB::Value would be recognized as int. As int does not
belong to any namespace, FBB would not be searched, causing the Koenig lookup to fail.
As another example, consider the next program. Here two namespaces are involved, each defining
their own fun function. There is no ambiguity, since the argument defines the namespace and
FBB::fun is called:
#include <iostream>
namespace FBB
{
enum Value // defines FBB::Value
{
FIRST
};
void fun(Value x)
{
std::cout << "FBB::fun() called for " << x << '\n';
}
}
namespace ES
{
void fun(FBB::Value x)
{
std::cout << "ES::fun() called for " << x << '\n';
}
}
int main()
{
fun(FBB::FIRST); // No ambiguity: argument determines
// the namespace
}
/*
generated output:
FBB::fun() called for 0
*/
Here is an example in which there is an ambiguity: fun has two arguments, one from each namespace.
The ambiguity must be resolved by the programmer:
#include <iostream>
namespace ES
{
enum Value // defines ES::Value
{
FIRST
};
}
namespace FBB
{
enum Value // defines FBB::Value
{
FIRST
};
void fun(Value x, ES::Value y)
{
std::cout << "FBB::fun() called\n";
}
}
namespace ES
{
void fun(FBB::Value x, Value y)
{
std::cout << "ES::fun() called\n";
}
}
int main()
{
// fun(FBB::FIRST, ES::FIRST); ambiguity: resolved by
// explicitly mentioning
// the namespace
ES::fun(FBB::FIRST, ES::FIRST);
}
/*
generated output:
ES::fun() called
*/
An interesting subtlety with namespaces is that definitions in one namespace may break the code
defined in another namespace. It shows that namespaces may affect each other and that namespaces
may backfire if we’re not aware of their peculiarities. Consider the following example:
namespace FBB
{
struct Value
{};
void fun(int x);
void gun(Value x);
}
namespace ES
{
void fun(int x)
{
fun(x);
}
void gun(FBB::Value x)
{
gun(x);
}
}
Whatever happens, the programmer’d better not use any of the functions defined in the ES namespace
since calling them results in infinite recursion. However, that’s not the point. The point is that the
programmer won’t even be given the opportunity to call them since the compilation fails.
Compilation fails for gun but not for fun. But why is that so? Why is ES::fun flawlessly compiling
while ES::gun isn’t? In ES::fun fun(x) is called. As x’s type (int) is not defined in a namespace the
Koenig lookup does not apply and fun calls itself with infinite recursion.
With ES::gun the argument is defined in the FBB namespace. Consequently, the FBB::gun function
is a possible candidate to be called. But ES::gun itself also is possible as ES::gun’s prototype
perfectly matches the call gun(x). The call is therefore ambiguous and the compiler rejects it.
Now consider the situation where FBB::gun has not yet been declared. Then there is of course
no ambiguity. The programmer responsible for the ES namespace is resting happily. Some time
after that the programmer who’s maintaining the FBB namespace decides it may be nice to add
a function gun(Value x) to the FBB namespace. Now suddenly the code in the namespace ES
breaks because of an addition in a completely different namespace (FBB). Namespaces clearly are not
completely independent of each other and we should be aware of subtleties like the above. Later in
the C++ Annotations (chapter 11) we’ll return to this issue.
4.1.3 The standard namespace
The std namespace is reserved by C++. The standard defines many entities in this namespace: the
entities that are part of the runtime available software (e.g., cout, cin, cerr), the templates defined
in the Standard Template Library (cf. chapter 18), and the Generic Algorithms (cf. chapter 19) are
all defined in the std namespace.
Regarding the discussion in the previous section, using declarations may be used when referring
to entities in the std namespace. For example, to use the std::cout stream, the code may declare
this object as follows:
#include <iostream>
using std::cout;
Often, however, the identifiers defined in the std namespace can all be accepted without much
thought. Because of that, one frequently encounters a using directive, allowing the programmer
to omit a namespace prefix when referring to any of the entities defined in the namespace specified
with the using directive. Instead of specifying using declarations a construction like the following
is frequently encountered:
#include <iostream>
using namespace std;
Should a using directive, rather than using declarations, be used? As a rule of thumb one might
decide to stick to using declarations, up to the point where the list becomes impractically long, at
which point a using directive could be considered.
Two restrictions apply to using directives and declarations:
• Programmers should not declare or define anything inside the namespace std. This is not
compiler enforced but is imposed upon user code by the standard;
• Using declarations and directives should not be imposed upon code written by third parties.
In practice this means that using directives and declarations should be banned from header
files and should only be used in source files (cf. section 7.11.1).
4.1.3.1 The std::placeholders namespace
This section contains quite a few forward references. It merely introduces the placeholders namespace,
which is nested under the std namespace; this section can be skipped without loss of continuity.
Before using the namespace std::placeholders the <functional> header file must be included.
Further down the C++ Annotations we will encounter function objects (section 11.10), which are
‘objects’ that can be used as functions. Such function objects (also called functors) are extensively
used in the Standard Template Library (STL, chapter 18). The STL offers a function (bind, see
section 18.1.4.1), returning a function adaptor in which a function is called which may or may not
already have received its arguments. If not, then placeholders for arguments must be used, for
which actual arguments must be specified once the functor that is returned by bind is called.
Such placeholders have predefined names: _1, _2, _3, etc. These placeholders are defined in the
std::placeholders namespace. Several illustrations of the use of these placeholders are found in
section 18.1.4.1.
4.1.4 Nesting namespaces and namespace aliasing
Namespaces can be nested. Here is an example:
namespace CppAnnotations
{
int value;
namespace Virtual
{
void *pointer;
}
}
The variable value is defined in the CppAnnotations namespace. Within the CppAnnotations
namespace another namespace (Virtual) is nested. Within that latter namespace the variable
pointer is defined. To refer to these variables the following options are available:
• The fully qualified names can be used. A fully qualified name of an entity is a list of all the
namespaces that are encountered until reaching the definition of the entity. The namespaces
and entity are glued together by the scope resolution operator:
int main()
{
CppAnnotations::value = 0;
CppAnnotations::Virtual::pointer = 0;
}
• A using namespace CppAnnotations directive can be provided. Now value can be used
without any prefix, but pointer must be used with the Virtual:: prefix:
using namespace CppAnnotations;
int main()
{
value = 0;
Virtual::pointer = 0;
}
• A using namespace directive for the full namespace chain can be used. Now value needs its
CppAnnotations prefix again, but pointer doesn’t require a prefix anymore:
using namespace CppAnnotations::Virtual;
int main()
{
CppAnnotations::value = 0;
pointer = 0;
}
• When using two separate using namespace directives none of the namespace prefixes are
required anymore:
using namespace CppAnnotations;
using namespace Virtual;
int main()
{
value = 0;
pointer = 0;
}
• The same can be accomplished (i.e., no namespace prefixes) for specific variables by providing
specific using declarations:
using CppAnnotations::value;
using CppAnnotations::Virtual::pointer;
int main()
{
value = 0;
pointer = 0;
}
• A combination of using namespace directives and using declarations can also be used. E.g.,
a using namespace directive can be used for the CppAnnotations::Virtual namespace,
and a using declaration can be used for the CppAnnotations::value variable:
using namespace CppAnnotations::Virtual;
using CppAnnotations::value;
int main()
{
pointer = 0;
}
Following a using namespace directive all entities of that namespace can be used without any
further prefix. If a single using namespace directive is used to refer to a nested namespace, then
all entities of that nested namespace can be used without any further prefix. However, the entities
defined in the more shallow namespace(s) still need the shallow namespace’s name(s). Only after
providing specific using namespace directives or using declarations namespace qualifications can
be omitted.
When fully qualified names are preferred but a long name like
CppAnnotations::Virtual::pointer
is considered too long, a namespace alias may be used:
namespace CV = CppAnnotations::Virtual;
This defines CV as an alias for the full name. The variable pointer may now be accessed using:
CV::pointer = 0;
A namespace alias can also be used in a using namespace directive or using declaration:
namespace CV = CppAnnotations::Virtual;
using namespace CV;
4.1.4.1 Defining entities outside of their namespaces
It is not strictly necessary to define members of namespaces inside a namespace region. But before
an entity is defined outside of a namespace it must have been declared inside its namespace.
To define an entity outside of its namespace its name must be fully qualified by prefixing the member
by its namespaces. The definition may be provided at the global level or at intermediate levels in the
case of nested namespaces. This allows us to define an entity belonging to namespace A::B within
the region of namespace A.
Assume the type INT8 is defined in the CppAnnotations::Virtual namespace as typedef int INT8[8]. Furthermore
assume that it is our intent to define a function squares, inside the namespace
CppAnnotations::Virtual returning a pointer to CppAnnotations::Virtual::INT8.
Having defined the prerequisites within the CppAnnotations::Virtual namespace, our function
could be defined as follows (cf. chapter 9 for coverage of the memory allocation operator new[]):
namespace CppAnnotations
{
namespace Virtual
{
void *pointer;
typedef int INT8[8];
INT8 *squares()
{
INT8 *ip = new INT8[1];
for (size_t idx = 0; idx != sizeof(INT8) / sizeof(int); ++idx)
(*ip)[idx] = (idx + 1) * (idx + 1);
return ip;
}
}
}
The function squares defines an array of one INT8 vector, and returns its address after initializing
the vector by the squares of the first eight natural numbers.
Alternatively, the function squares can be declared inside the CppAnnotations::Virtual namespace and defined outside of it:
namespace CppAnnotations
{
namespace Virtual
{
void *pointer;
typedef int INT8[8];
INT8 *squares();
}
}
CppAnnotations::Virtual::INT8 *CppAnnotations::Virtual::squares()
{
INT8 *ip = new INT8[1];
for (size_t idx = 0; idx != sizeof(INT8) / sizeof(int); ++idx)
(*ip)[idx] = (idx + 1) * (idx + 1);
return ip;
}
In the above code fragment note the following:
• squares is declared inside of the CppAnnotations::Virtual namespace.
• The definition outside of the namespace region requires us to use the fully qualified name of
the function and of its return type.
• Inside the body of the function squares we are within the CppAnnotations::Virtual
namespace, so inside the function fully qualified names (e.g., for INT8) are not required any
more.
Finally, note that the function could also have been defined in the CppAnnotations region. In
that case the Virtual namespace would have been required when defining squares() and when
specifying its return type, while the internals of the function would remain the same:
namespace CppAnnotations
{
namespace Virtual
{
void *pointer;
typedef int INT8[8];
INT8 *squares();
}
Virtual::INT8 *Virtual::squares()
{
INT8 *ip = new INT8[1];
for (size_t idx = 0; idx != sizeof(INT8) / sizeof(int); ++idx)
(*ip)[idx] = (idx + 1) * (idx + 1);
return ip;
}
}

Chapter 5
The ‘string’ Data Type
C++ offers many solutions for common problems. Most of these facilities are part of the Standard
Template Library or they are implemented as generic algorithms (see chapter 19).
Among the facilities C++ programmers have developed over and over again are those manipulating
chunks of text, commonly called strings. The C programming language offers rudimentary string
support. C’s NTBS is the foundation upon which an enormous amount of code has been built (see footnote 1).
To process text C++ offers a std::string type. In C++ the traditional C library functions manipulating
NTB strings are discouraged (although not formally deprecated) in favor of using string objects. Many problems in C programs
are caused by buffer overruns, boundary errors and allocation problems that can be traced back
to improperly using these traditional C string library functions. Many of these problems can be
prevented using C++ string objects.
Actually, string objects are class type variables, and in that sense they are comparable to stream
objects like cin and cout. In this section the use of string type objects is covered. The focus is on
their definition and their use. When using string objects the member function syntax is commonly
used:
stringVariable.operation(argumentList)
For example, if string1 and string2 are variables of type std::string, then
string1.compare(string2)
can be used to compare both strings.
In addition to the common member functions the string class also offers a wide variety of operators,
like the assignment (=) and the comparison operator (==). Operators often result in code that is
easy to understand and their use is generally preferred over the use of member functions offering
comparable functionality. E.g., rather than writing
if (string1.compare(string2) == 0)
the following is generally preferred:
if (string1 == string2)
Footnote 1: An NTBS (null-terminated byte string, also NTB string) is a character sequence whose highest-addressed element with
defined content has the value zero (the terminating null character); no other character in the sequence has the value zero.
To define and use string-type objects, sources must include the header file <string>. To merely
declare the string type the header iosfwd can be included.
In addition to std::string, the header file string defines the following string types:
• std::wstring, a string type consisting of wchar_t characters;
• std::u16string, a string type consisting of char16_t characters;
• std::u32string, a string type consisting of char32_t characters.
5.1 Operations on strings
Some of the operations that can be performed on strings return indices within the strings. Whenever
such an operation fails to find an appropriate index, the value string::npos is returned. This
value is a symbolic value of type string::size_type, which is (for all practical purposes) an
unsigned integral type (usually std::size_t).
All string members accepting string objects as arguments also accept char const * (NTBS)
arguments. The same usually holds true for operators accepting string objects.
Some string-members use iterators. Iterators are formally introduced in section 18.2. Member
functions using iterators are listed in the next section (5.2), but the iterator concept itself is not
further covered by this chapter.
Strings support a large variety of members and operators. A short overview listing their capabilities
is provided in this section, with subsequent sections offering a detailed discussion. The bottom line:
C++ strings are extremely versatile and there is hardly a reason for falling back on the C library to
process text. C++ strings handle all the required memory management and thus memory related
problems, which is the #1 source of problems in C programs, can be prevented when C++ strings are
used. Strings do come at a price, though. The class’s extensive capabilities have also turned it into
a beast. It’s hard to learn and master all its features and in the end you’ll find that not all that you
expected is actually there. For example, std::string doesn’t offer case-insensitive comparisons.
But in the end it isn’t even as simple as that. It is there, but it is somewhat hidden and at this
point in the C++ Annotations it’s too early to look into that hidden corner yet. Instead, realize
that the C library and its common extensions do offer useful functions that can be used as long as we’re aware of their
limitations and are able to avoid their traps. So for now, to perform a traditional case-insensitive
comparison of the contents of two std::string objects str1 and str2 the following will do (strcasecmp
is a POSIX function, declared in <strings.h>; it is not part of the C or C++ standard library):
strcasecmp(str1.c_str(), str2.c_str());
Strings support the following functionality:
• initialization:
when string objects are defined they are always properly initialized. In other words,
they are always in a valid state. Strings may be initialized empty or already existing
text can be used to initialize strings.
• assignment:
strings may be given new values. New values may be assigned using member functions
(like assign) but a plain assignment operator (i.e., =) may also be used. Furthermore,
assignment to a character buffer is also supported.
• conversions:
the partial or complete contents of string objects may be interpreted as C strings
but the string’s contents may also be processed as a series of raw binary bytes, not
necessarily terminating in a 0-valued character. Furthermore, in many situations
plain characters and C strings may be used where std::strings are accepted as
well.
• breakdown:
the individual characters stored in a string can be accessed using the familiar index
operator ([]) allowing us to either access or modify information in the middle of a
string.
• comparisons:
strings may be compared to other strings (or to NTB strings) using the familiar logical
comparison operators ==, !=, <, <=, > and >=. There are also member functions
available offering a more fine-grained comparison.
• modification:
the contents of strings may be modified in many ways. Operations are available to add
information to string objects, to insert information in the middle of string objects, or
to replace or erase (parts of) a string’s contents.
• swapping:
the string’s swapping capability allows us in principle to exchange the contents of two
string objects without a byte-by-byte copying operation of the string’s contents.
• searching:
the locations of characters, sets of characters, or series of characters may be searched
for from any position within the string object and either searching in a forward or
backward direction.
• housekeeping:
several housekeeping facilities are offered: the string’s length, or its empty-state may
be interrogated. But string objects may also be resized.
• stream I/O:
strings may be extracted from or inserted into streams. In addition to plain string
extraction a line of a text file may be read without running the risk of a buffer overrun.
Since extraction and insertion operations are stream based the I/O facilities are
device independent.
5.2 A std::string reference
In this section the string members and string-related operations are referenced. The subsections
cover, respectively, the string’s initializers, iterators, operators, and member functions. The following
terminology is used throughout this section:
• object is always a string-object;
• argument is a string const & or a char const * unless indicated otherwise. The contents
of an argument are never modified by the operation processing the argument;
• opos refers to an offset into an object string;
• apos refers to an offset into an argument;
• on represents a number of characters in an object (starting at opos);
• an represents a number of characters in an argument (starting at apos).
Both opos and apos must refer to existing offsets, or an exception (cf. chapter 10) is generated. In
contrast, an and on may exceed the number of available characters, in which case only the available
characters are considered.
Many members declare default values for on, an and apos. Some members declare default values
for opos. Default offset values are 0; the default value of on and an is string::npos, which can
be interpreted as ‘the required number of characters to reach the end of the string’.
With members starting their operations at the end of the string object’s contents proceeding backwards,
the default value of opos is the index of the object’s last character, with on by default equal
to opos + 1, representing the length of the substring ending at opos.
In the overview of member functions presented below it may be assumed that all these parameters
accept default values unless indicated otherwise. Of course, the default argument values cannot be
used if a function requires additional arguments beyond the ones otherwise accepting default values.
Some members have overloaded versions expecting an initial argument of type char const *. But
even if that is not the case the first argument can always be of type char const * where a parameter
of std::string is defined.
Several member functions accept iterators. Section 18.2 covers the technical aspects of iterators,
but these may be ignored at this point without loss of continuity. Like apos and opos, iterators
must refer to existing positions and/or to an existing range of characters within the string object’s
contents.
All string-member functions computing indices return the predefined constant string::npos on
failure.
The C++14 standard offers the s literal suffix to indicate that a std::string constant is intended
when a string literal (like "hello world") is used. The suffix becomes available after specifying
using namespace std::literals (or using namespace std). When string literals are used in the context of
std::string objects, this literal suffix is hardly ever required, but it may come in handy when using
the auto keyword. E.g., auto str = "hello world"s defines std::string str, whereas
it would have been a char const * if the literal suffix had been omitted.
5.2.1 Initializers
After defining string objects they are guaranteed to be in a valid state. At definition time string
objects may be initialized using one of the following string constructors:
• string object:
initializes object to an empty string. When defining a string this way no argument
list may be specified;
• string object(string::size_type count, char ch):
initializes object with count characters ch;
• string object(string const &argument):
initializes object with argument;
• string object(std::string const &argument, string::size_type apos,
string::size_type an):
initializes object with argument’s contents starting at index position apos, using
at most an of argument’s characters;
• string object(InputIterator begin, InputIterator end):
initializes object with the characters in the range of characters defined by the two
InputIterators.
5.2.2 Iterators
See section 18.2 for details about iterators. As a quick introduction to iterators: an iterator acts
like a pointer, and pointers can often be used in situations where iterators are requested. Iterators
usually come in pairs, defining a range of entities. The begin-iterator points to the first entity, the
end-iterator points just beyond the last entity of the range. Their difference is equal to the number
of entities in the iterator-range.
Iterators play an important role in the context of generic algorithms (cf. chapter 19). The class
std::string defines the following iterator types:
• string::iterator and string::const_iterator:
these iterators are random access iterators (and so they can be used wherever forward
iterators are expected). The const_iterator is returned by string
const objects, the plain iterator is returned by non-const string objects. Characters
referred to by plain iterators may be modified, characters referred to by const_iterators may not;
• string::reverse_iterator and string::const_reverse_iterator:
these iterators are also random access iterators but when incrementing the iterator the previous
character in the string object is reached. Other than that they are comparable
to, respectively, string::iterator and string::const_iterator.
5.2.3 Operators
String objects may be manipulated by member functions but also by operators. Using operators
often results in more natural-looking code. In cases where an operator offers the same
functionality as a member function the operator is practically always preferred.
The following operators are available for string objects (in the examples ‘object’ and ‘argument’
refer to existing std::string objects).
• plain assignment:
a character, C or C++ string may be assigned to a string object. The assignment
operator returns its left-hand side operand. Example:
object = argument;
object = "C string";
object = 'x';
object = 120; // same as object = 'x'
• addition:
the arithmetic additive assignment operator and the addition operator add text to
a string object. The compound assignment operator returns its left-hand side
operand, the addition operator returns its result in a temporary string object. When
using the addition operator either the left-hand side operand or the right-hand side
operand must be a std::string object. The other operand may be a char, a C string
or a C++ string. Example:
object += argument;
object += "hello";
object += 'x'; // integral expressions are OK
argument + otherArgument; // two std::string objects
argument + "hello"; // using + at least one
"hello" + argument; // std::string is required
argument + 'a'; // integral expressions are OK
'a' + argument;
• index operator:
The index operator may be used to retrieve object’s individual characters, or to
assign new values to individual characters of a non-const string object. There is no
range-checking (use the at() member function for that). This operator returns a
char & or char const &. Example:
object[3] = argument[5];
• logical operators:
the logical comparison operators may be applied to two string objects or to a string
object and a C string to compare their contents. These operators return a bool value.
The ==, !=, >, >=, <, and <= operators are available. The ordering operators
perform a lexicographical comparison of their contents using the ASCII character
collating sequence. Example:
object == object; // true
object != (object + 'x'); // true
object <= (object + 'x'); // true
• stream related operators:
the insertion-operator (cf. section 3.1.4) may be used to insert a string object into
an ostream, the extraction-operator may be used to extract a string object from an
istream. The extraction operator by default first ignores all white space characters
and then extracts all consecutive non-blank characters from an istream. Instead
of a string a character array may be extracted as well, but the advantage of using a
string object should be clear: the destination string object is automatically resized to
the required number of characters. Example:
cin >> object;
cout << object;
5.2.4 Member functions
The std::string class offers many member functions as well as additional non-member functions
that should be considered part of the string class. All these functions are listed below in alphabetic
order.
The symbolic value string::npos is defined by the string class. It represents ‘index-not-found’
when returned by member functions returning string offset positions. Example: when calling
object.find('x') (see below) on a string object not containing the character 'x', npos is returned,
as the requested position does not exist.
The final 0-byte used in C strings to indicate the end of an NTBS is not considered part of a C++
string, and so the member function will return npos, rather than length() when looking for 0 in a
string object containing the characters of a C string.
Here are the standard functions that operate on objects of the class string. When a parameter
of size_t is mentioned it may be interpreted as a parameter of type string::size_type,
but without defining a default argument value. The type size_type should be read as
string::size_type. With size_type the default argument values mentioned in section 5.2 apply.
All quoted functions are member functions of the class std::string, except where indicated
otherwise.
• char &at(size_t opos):
a reference to the character at the indicated position is returned. When called with
string const objects a char const & is returned. The member function performs
range-checking, throwing a std::out_of_range exception (which terminates the program
if it is not caught) if an invalid index is passed.
• string &append(InputIterator begin, InputIterator end):
the characters in the range defined by begin and end are appended to the current
string object.
• string &append(string const &argument, size_type apos, size_type an):
argument (or a substring) is appended to the current string object.
• string &append(char const *argument, size_type an):
the first an characters of argument are appended to the string object.
• string &append(size_type n, char ch):
n characters ch are appended to the current string object.
• string &assign(string const &argument, size_type apos, size_type an):
argument (or a substring) is assigned to the string object. If argument is of type
char const * and one additional argument is provided the second argument is interpreted
as a value initializing an, using 0 to initialize apos.
• string &assign(size_type n, char ch):
n characters ch are assigned to the current string object.
• char &back():
returns a reference to the last char stored inside the string object. The result is
undefined for empty strings.
• string::iterator begin():
an iterator referring to the first character of the current string object is returned.
With const string objects a const_iterator is returned.
• size_type capacity() const:
the number of characters that can currently be stored in the string object without
needing to resize it is returned.
• string::const_iterator cbegin():
a const_iterator referring to the first character of the current string object is
returned.
• string::const_iterator cend():
a const_iterator referring to the end of the current string object is returned.
• int compare(string const &argument) const:
the text stored in the current string object and the text stored in argument are compared
using a lexicographical comparison based on the ASCII character collating sequence.
Zero is returned if the two strings have identical contents; a negative value
is returned if the text in the current object should be ordered before the text in
argument; a positive value is returned if the text in the current object should be
ordered beyond the text in argument.
• int compare(size_t opos, size_t on, string const &argument) const:
a substring of the text stored in the current string object is compared to the text
stored in argument. At most on characters starting at offset opos are compared to
the text in argument.
• int compare(size_t opos, size_t on, string const &argument, size_type
apos, size_type an):
a substring of the text stored in the current string object is compared to a substring
of the text stored in argument. At most on characters of the current string object,
starting at offset opos, are compared to at most an characters of argument, starting
at offset apos. In this case argument must be a string object.
• int compare(size_t opos, size_t on, char const *argument, size_t an):
a substring of the text stored in the current string object is compared to a substring of
the text stored in argument. At most on characters of the current string object starting
at offset opos are compared to at most an characters of argument. Argument
must have at least an characters. The characters may have arbitrary values:
0-valued characters have no special meanings.
• size_t copy(char *argument, size_t on, size_type opos) const:
the contents of the current string object are (partially) copied into argument. The
actual number of characters copied is returned. The second argument, specifying the
number of characters to copy from the current string object, is required. No 0-valued
character is appended to the copied text, but one can be appended
using an idiom like the following:
argument[object.copy(argument, string::npos)] = 0;
Of course, the programmer should make sure that argument’s size is large enough
to accommodate the additional 0-byte.
5.2. A STD::STRING REFERENCE 79
• string::const_reverse_iterator crbegin():
a const_reverse_iterator referring to the last character of the current string
object is returned.
• string::const_reverse_iterator crend():
a const_reverse_iterator referring to the position just before the first character of the
current string object is returned.
• char const *c_str() const:
the contents of the current string object are returned as an NTBS.
• char const *data() const:
the raw contents of the current string object are returned. Since this member does not
return an NTBS (as c_str does), it can be used to retrieve any kind of information
stored inside the current string object including, e.g., series of 0-bytes:
string s(2, 0);
cout << static_cast<int>(s.data()[1]) << '\n';
• bool empty() const:
true is returned if the current string object contains no data.
• string::iterator end():
an iterator referring to the position just beyond the last character of the current
string object is returned. With const string objects a const_iterator is returned.
• string &erase(size_type opos, size_type on):
a (sub)string of the information stored in the current string object is erased.
• string::iterator erase(string::iterator begin, string::iterator end):
the parameter end is optional. If omitted the value returned by the current object’s
end member is used. The characters defined by the begin and end iterators are
erased. The iterator begin is returned, which is then referring to the position immediately
following the last erased character.
• size_t find(string const &argument, size_type opos) const:
the first index in the current string object where argument is found is returned.
• size_t find(char const *argument, size_type opos, size_type an) const:
the first index in the current string object where argument is found is returned.
When all three arguments are specified the first argument must be a char const *.
• size_t find(char ch, size_type opos) const:
the first index in the current string object where ch is found is returned.
• size_t find_first_of(string const &argument, size_type opos) const:
the first index in the current string object where any character in argument is found
is returned.
• size_type find_first_of(char const *argument, size_type opos, size_type
an) const:
the first index in the current string object where any character in argument is found
is returned. If opos is provided it refers to the first index in the current string object
where the search for argument should start. If omitted, the string object is scanned
completely. If an is provided it indicates the number of characters of the char const *
argument that should be used in the search. It defines a substring starting at the
beginning of argument. If omitted, all of argument’s characters are used.
• size_type find_first_of(char ch, size_type opos):
the first index in the current string object where character ch is found is returned.
• size_t find_first_not_of(char ch, size_type opos) const:
the first index in the current string object where another character than ch is found
is returned.
• size_t find_last_of(string const &argument, size_type opos) const:
the last index in the current string object where any character in argument is found
is returned.
• size_type find_last_of(char const *argument, size_type opos, size_type
an) const:
the last index in the current string object where any character in argument is found
is returned. If opos is provided it refers to the last index in the current string object
where the search for argument should start. If omitted, the string object is scanned
completely. If an is provided it indicates the number of characters of the char const *
argument that should be used in the search. It defines a substring starting at the
beginning of argument. If omitted, all of argument’s characters are used.
• size_type find_last_of(char ch, size_type opos):
the last index in the current string object where character ch is found is returned.
• size_t find_last_not_of(string const &argument, size_type opos) const:
the last index in the current string object where any character not appearing in
argument is found is returned.
• char &front():
returns a reference to the first char stored inside the string object. The result is
undefined for empty strings.
• allocator_type get_allocator():
returns the allocator of the class std::string.
• istream &std::getline(istream &istr, string &object, char delimiter =
'\n'):
Note: this is not a member function of the class string.
A line of text is read from istr. All characters until delimiter (or the end of the
stream, whichever comes first) are read from istr and are stored in object. If the
delimiter is encountered it is removed from the stream, but is not stored in object.
If the delimiter is not found, istr.eof returns true (see section 6.3.1). Since
streams may be interpreted as bool values (cf. section 6.3.1) a commonly encountered
idiom to read all lines from a stream successively into a string object line
looks like this:
while (getline(istr, line))
process(line);
The contents of the last line, whether or not it was terminated by a delimiter, are
eventually also assigned to object.
• string &insert(size_t opos, string const &argument, size_type apos,
size_type an):
a (sub)string of argument is inserted into the current string object at the current
string object’s index position opos. Arguments for apos and an must either both be
provided or they must both be omitted.
• string &insert(size_t opos, char const *argument, size_type an):
argument (of type char const *) is inserted at index opos into the current string
object.
• string &insert(size_t opos, size_t count, char ch):
Count characters ch are inserted at index opos into the current string object.
• string::iterator insert(string::iterator begin, char ch):
the character ch is inserted at the current object’s position referred to by begin.
Begin is returned.
• string::iterator insert(string::iterator begin, size_t count, char ch):
Count characters ch are inserted at the current object’s position referred to by begin.
Begin is returned.
• string::iterator insert(string::iterator begin, InputIterator abegin,
InputIterator aend):
the characters in the range defined by the InputIterators abegin and aend are
inserted at the current object’s position referred to by begin. Begin is returned.
• size_t length() const:
the number of characters stored in the current string object is returned.
• size_t max_size() const:
the maximum number of characters that can be stored in the current string object is
returned.
• void pop_back():
The string’s last character is removed from the string object.
• void push_back(char ch):
The character ch is appended to the string object.
• string::reverse_iterator rbegin():
a reverse iterator referring to the last character of the current string object is returned.
With const string objects a const_reverse_iterator is returned.
• string::reverse_iterator rend():
a reverse iterator referring to the position just before the first character of the current
string object is returned. With const string objects a const_reverse_iterator is
returned.
• string &replace(size_t opos, size_t on, string const &argument,
size_type apos, size_type an):
a (sub)string of characters in object are replaced by the (subset of) characters of
argument. If on is specified as 0 argument is inserted into object at offset opos.
• string &replace(size_t opos, size_t on, char const *argument, size_type
an):
a series of characters in object are replaced by the first an characters of char
const *argument.
• string &replace(size_t opos, size_t on, size_type count, char ch):
on characters of the current string object, starting at index position opos, are replaced
by count characters ch.
• string &replace(string::iterator begin, string::iterator end, string
const &argument):
the series of characters in the current string object defined by the iterators begin
and end are replaced by argument. If argument is a char const *, an additional
argument an may be used, specifying the number of characters of argument that are
used in the replacement.
• string &replace(string::iterator begin, string::iterator end, size_type
count, char ch):
the series of characters in the current string object defined by the iterators begin
and end are replaced by count characters having values ch.
• string &replace(string::iterator begin, string::iterator end,
InputIterator abegin, InputIterator aend):
the series of characters in the current string object defined by the iterators begin
and end are replaced by the characters in the range defined by the InputIterators
abegin and aend.
• void reserve(size_t request):
the current string object’s capacity is changed to at least request. After calling
this member, capacity’s return value will be at least request. A request
for a smaller size than the value returned by capacity is ignored. A
std::length_error exception is thrown if request exceeds the value returned
by max_size (std::length_error is defined in the stdexcept header). Calling
reserve() has the effect of redefining a string’s capacity, not of actually making
available the memory to the program. This is illustrated by the exception thrown by
the string’s at() member when trying to access an element exceeding the string’s
size but not the string’s capacity.
• void resize(size_t size, char ch = 0):
the current string object is resized to size characters. If the string object is resized
to a size larger than its current size the additional characters will be initialized to
ch. If it is reduced in size the characters having the highest indices are chopped off.
• size_t rfind(string const &argument, size_type opos) const:
the last index in the current string object where argument is found is returned.
Searching proceeds from the current object’s offset opos back to its beginning.
• size_t rfind(char const *argument, size_type opos, size_type an) const:
the last index in the current string object where argument is found is returned.
Searching proceeds from the current object’s offset opos back to its beginning. The
parameter an specifies the length of the substring of argument to look for, starting
at argument’s beginning.
• size_t rfind(char ch, size_type opos) const:
the last index in the current string object where ch is found is returned. Searching
proceeds from the current object’s offset opos back to its beginning.
• void shrink_to_fit():
optionally reduces the amount of memory allocated by a string to its current size.
The implementor is free to ignore or otherwise optimize this request. In order to
guarantee a ‘shrink to fit’ operation the
string(stringObject).swap(stringObject)
idiom can be used.
• size_t size() const:
the number of characters stored in the current string object is returned. This member
is a synonym of length().
• string substr(size_type opos, size_type on) const:
a substring of the current string object of at most on characters starting at index
opos is returned.
• void swap(string &argument):
the contents of the current string object are swapped with the contents of argument.
For this member argument must be a string object and cannot be a char const *.
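Some of the members listed above can be combined as in the following minimal sketch. The function names copyToBuffer and replaceAll are hypothetical and are not part of std::string. The first function wraps the copy idiom shown earlier so that the 0-byte always fits into the destination; the second combines find and replace.

```cpp
#include <cstddef>
#include <string>

// Copies at most bufSize - 1 characters of text into buffer and appends the
// terminating 0-byte that string::copy itself does not write.
// Returns the number of characters copied (not counting the 0-byte).
std::size_t copyToBuffer(std::string const &text, char *buffer,
                         std::size_t bufSize)
{
    if (bufSize == 0)               // no room, not even for the 0-byte
        return 0;

    std::size_t copied = text.copy(buffer, bufSize - 1);
    buffer[copied] = 0;
    return copied;
}

// Returns a copy of text in which every occurrence of from was replaced
// by to. An empty from leaves text unaltered.
std::string replaceAll(std::string text, std::string const &from,
                       std::string const &to)
{
    if (from.empty())
        return text;

    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.length(), to);
        pos += to.length();         // continue beyond the inserted text
    }
    return text;
}
```

Advancing pos by to.length() after each replacement keeps the loop from rescanning text that was just inserted, which matters when to itself contains from.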
5.2.5 Conversion functions
Several string conversion functions are available operating on or producing std::string objects.
These functions are listed below in alphabetic order. They are not member functions, but class-less
(free) functions declared in the std namespace. The <string> header file must be included before
they can be used.
• float stof(std::string const &str, size_t *pos = 0):
Initial white space characters in str are ignored. Then the following sequences of
characters are converted to a float value, which is returned:
– A decimal floating point constant:
* An optional + or - character
* A series of decimal digits, possibly containing one decimal point character
* An optional e or E character, followed by an optional - or + character, followed
by a series of decimal digits
– A hexadecimal floating point constant:
* An optional + or - character
* 0x or 0X
* A series of hexadecimal digits, possibly containing one decimal point character
* An optional p or P character, followed by an optional - or + character, followed
by a series of decimal digits
– An infinity expression:
* An optional + or - character
* The words inf or infinity (case insensitive words)
– A ‘not a number’ expression:
* An optional + or - character
* The words nan or nan(alphanumeric character sequence) (nan is a
case insensitive word), resulting in a NaN floating point value
If pos != 0 the index of the first character in str which was not converted is returned
in *pos. A std::invalid_argument exception is thrown if the characters
in str could not be converted to a float; a std::out_of_range exception is thrown
if the converted value would have exceeded the range of float values.
• double stod(std::string const &str, size_t *pos = 0):
A conversion as described with stof is performed, but now to a value of type double.
• long double stold(std::string const &str, size_t *pos = 0):
A conversion as described with stof is performed, but now to a value of type long
double.
• int stoi(std::string const &str, size_t *pos = 0, int base = 10):
Initial white space characters in str are ignored. Then all characters representing
numeric constants of the number system whose base is specified are converted to
an int value, which is returned. An optional + or - character may prefix the numeric
characters. If base is 0, values starting with 0 are interpreted as octal values and
values starting with 0x or 0X as hexadecimal values. The value base must
be 0 or between 2 and 36. If pos != 0 the index of the first character in str which was
not converted is returned in *pos. A std::invalid_argument exception is thrown
if the characters in str could not be converted to an int; a std::out_of_range
exception is thrown if the converted value would have exceeded the range of int
values.
Here is an example of its use:
int value = stoi(string(" -123")); // assigns value -123
value = stoi(string(" 123"), 0, 5); // assigns value 38
• long stol(std::string const &str, size_t *pos = 0, int base = 10):
A conversion as described with stoi is performed, but now to a value of type long.
• long long stoll(std::string const &str, size_t *pos = 0, int base = 10):
A conversion as described with stoi is performed, but now to a value of type long
long.
• unsigned long stoul(std::string const &str, size_t *pos = 0, int base =
10):
A conversion as described with stoi (not allowing an initial + or - character) is performed,
but now to a value of type unsigned long.
• unsigned long long stoull(std::string const &str, size_t *pos = 0, int
base = 10):
A conversion as described with stoul is performed, but now to a value of type
unsigned long long.
• std::string to_string(Type value):
Type can be of the types int, long, long long, unsigned, unsigned long,
unsigned long long, float, double, or long double. The value of the argument
is converted to a textual representation, which is returned as a std::string
value.
• std::wstring to_wstring(Type value):
The conversion as described at to_string is performed, returning a std::wstring.
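The pos argument and the two exceptions described above can be combined to validate input. The following is a minimal sketch; the function name toInt is hypothetical. It accepts a string only if stoi converted all of its characters.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Converts text to an int using std::stoi. The conversion is accepted only
// if all characters of text were used. Returns true on success, in which
// case the converted value is stored in value.
bool toInt(std::string const &text, int &value, int base = 10)
{
    try
    {
        std::size_t pos = 0;
        int result = std::stoi(text, &pos, base);

        if (pos != text.length())   // trailing characters were not converted
            return false;

        value = result;
        return true;
    }
    catch (std::invalid_argument const &)   // no conversion was possible
    {
        return false;
    }
    catch (std::out_of_range const &)       // result does not fit in an int
    {
        return false;
    }
}
```

With this sketch toInt("123", value, 5) succeeds and stores 38, matching the stoi example shown earlier, whereas toInt("12abc", value) fails because pos stops at index 2.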
Chapter 6
The IO-stream Library
Extending the standard stream (FILE) approach, well known from the C programming language,
C++ offers an input/output (I/O) library based on class concepts.
All C++ I/O facilities are defined in the namespace std. The std:: prefix is omitted below, except
for situations where this would result in ambiguities.
Earlier (in chapter 3) we’ve seen several examples of the use of the C++ I/O library, in particular
showing the insertion operator (<<) and the extraction operator (>>). In this chapter we’ll cover I/O in
more detail.
The discussion of input and output facilities provided by the C++ programming language heavily
uses the class concept and the notion of member functions. Although class construction has not
yet been covered (for that see chapter 7) and although inheritance is not covered formally before
chapter 13, it is quite possible to discuss I/O facilities long before the technical background of class
construction has been covered.
Most C++ I/O classes have names starting with basic_ (like basic_ios). However, these basic_
names are not regularly found in C++ programs, as most classes are also defined using typedef
definitions like:
typedef basic_ios<char> ios;
Since C++ supports various kinds of character types (e.g., char, wchar_t), I/O facilities were developed
using the template mechanism allowing for easy conversions to character types other than the
traditional char type. As elaborated in chapter 21, this also allows the construction of generic software,
that could thereupon be used for any particular type representing characters. So, analogously
to the above typedef there exists a
typedef basic_ios<wchar_t> wios;
This type definition can be used for the wchar_t type. Because of the existence of these type definitions,
the basic_ prefix was omitted from the C++ Annotations without loss of continuity. The C++
Annotations primarily focus on the standard 8-bits char type.
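That the short names are merely type definitions of basic_ class template instantiations can be checked at compile time. The following is a small sketch; the function name isNarrowStream is hypothetical. It relies on the nested type char_type, which every stream class makes available.

```cpp
#include <ios>
#include <istream>
#include <ostream>
#include <type_traits>

// The short names and the basic_ template instantiations are the same types:
static_assert(std::is_same<std::ios, std::basic_ios<char>>::value,
              "ios is basic_ios<char>");
static_assert(std::is_same<std::wios, std::basic_ios<wchar_t>>::value,
              "wios is basic_ios<wchar_t>");

// Returns true if Stream was instantiated for the traditional char type.
template <typename Stream>
bool isNarrowStream()
{
    return std::is_same<typename Stream::char_type, char>::value;
}
```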
Iostream objects cannot be declared using standard forward declarations, like:
class std::ostream; // now erroneous
88 CHAPTER 6. THE IO-STREAM LIBRARY
Instead, to declare iostream classes the <iosfwd> header file should be included:
#include <iosfwd> // correct way to declare iostream classes
Using C++ I/O offers the additional advantage of type safety. Objects (or plain values) are inserted
into streams. Compare this to the situation commonly encountered in C where the fprintf function
is used to indicate by a format string what kind of value to expect where. Compared to this latter
situation C++’s iostream approach immediately uses the objects where their values should appear,
as in
cout << "There were " << nMaidens << " virgins present\n";
The compiler notices the type of the nMaidens variable, inserting its proper value at the appropriate
place in the sentence inserted into the cout iostream.
Compare this to the situation encountered in C. Although C compilers are getting smarter and
smarter, and although a well-designed C compiler may warn you about a mismatch between a format
specifier and the type of a variable encountered in the corresponding position of the argument list
of a printf statement, it can’t do much more than warn you. The type safety seen in C++ prevents
you from making type mismatches, as there are no types to match.
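The following sketch illustrates this point. The function template toText is hypothetical; it uses the ostringstream class, which is introduced later in this chapter. No format specifier is written anywhere: for each call the compiler selects the insertion operator matching the argument’s type.

```cpp
#include <sstream>
#include <string>

// Returns the textual representation of value as produced by the insertion
// operator. The compiler selects the operator<< matching Type, so there is
// no format specifier that could disagree with the argument's type.
template <typename Type>
std::string toText(Type const &value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}
```

A call like toText(42) and a call like toText("virgins") use different insertion operators, although the source code of toText is identical for both.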
Apart from this, iostreams offer more or less the same set of possibilities as the standard FILE-
based I/O used in C: files can be opened, closed, positioned, read, written, etc. In C++ the basic
FILE structure, as used in C, is still available. But C++ adds to this I/O based on classes, resulting
in type safety, extensibility, and a clean design.
In the ANSI/ISO standard the intent was to create architecture independent I/O. Previous implementations
of the iostreams library did not always comply with the standard, resulting in many
extensions to the standard. The I/O sections of previously developed software may have to be partially
rewritten. This is tough for those who are now forced to modify old software, but every feature
and extension that was once available can be rebuilt easily using ANSI/ISO standard conforming
I/O. Not all of these reimplementations can be covered in this chapter, as many reimplementations
rely on inheritance and polymorphism, topics that are formally covered in chapters 13 and 14.
Selected reimplementations are provided in chapter 24, and in this chapter references to particular
sections in other chapters are given where appropriate. This chapter is organized as follows (see
also Figure 6.1):
• The class ios_base is the foundation upon which the iostreams I/O library was built. It
defines the core of all I/O operations and offers, among other things, facilities for inspecting
the state of I/O streams and for output formatting.
• The class ios was directly derived from ios_base. Every class of the I/O library doing input or
output is itself derived from this ios class, and so inherits its (and, by implication, ios_base’s)
capabilities. The reader is urged to keep this in mind while reading this chapter. The concept
of inheritance is not discussed here, but rather in chapter 13.
The class ios is important in that it implements the communication with a buffer that is
used by streams. This buffer is a streambuf object which is responsible for the actual I/O
to/from the underlying device. Consequently iostream objects do not perform I/O operations
themselves, but leave these to the (stream)buffer objects with which they are associated.
• Next, basic C++ output facilities are discussed. The basic class used for output operations
is ostream, defining the insertion operator as well as other facilities writing information to
streams. Apart from inserting information into files it is possible to insert information into
memory buffers, for which the ostringstream class is available. Formatting output is to a
great extent possible using the facilities defined in the ios class, but it is also possible to insert
formatting commands directly into streams using manipulators. This aspect of C++ output is
discussed as well.
Figure 6.1: Central I/O Classes
• Basic C++ input facilities are implemented by the istream class. This class defines the extraction
operator and related input facilities. Comparably to inserting information into memory
buffers (using ostringstream) a class istringstream is available to extract information
from memory buffers.
• Finally, several advanced I/O-related topics are discussed, such as reading and writing from the
same stream and mixing C and C++ I/O using filebuf objects. Other I/O related topics are
covered elsewhere in the C++ Annotations, e.g., in chapter 24.
Stream objects have a limited but important role: they are the interface between, on the one hand,
the objects to be input or output and, on the other hand, the streambuf, which is responsible for
the actual input and output to the device accessed by a streambuf object.
This approach allows us to construct a new kind of streambuf for a new kind of device, and use that
streambuf in combination with the ‘good old’ istream- and ostream-class facilities. It is important
to understand the distinction between the formatting roles of iostream objects and the buffering
interface to an external device as implemented in a streambuf object. Interfacing to new devices
(like sockets or file descriptors) requires the construction of a new kind of streambuf, rather than a
new kind of istream or ostream object. A wrapper class may be constructed around the istream
or ostream classes, though, to ease the access to a special device. This is how the stringstream
classes were constructed.
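The division of labor described above can be sketched as follows. The class UpperBuf is hypothetical and merely simulates a ‘device’: every character it receives is stored, converted to upper case, in a std::string. Only the streambuf is new; formatting remains the task of a plain ostream object. Inheritance and the overflow member are covered in later chapters, so this is only a preview under those assumptions.

```cpp
#include <cctype>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical 'device': characters written to it are stored, in upper
// case, in a std::string.
class UpperBuf: public std::streambuf
{
    std::string d_device;

    public:
        std::string const &device() const
        {
            return d_device;
        }

    private:
        // Called whenever a character is written while no buffer space is
        // available. This sketch never defines a buffer, so every character
        // inserted into the stream arrives here.
        int_type overflow(int_type ch) override
        {
            if (traits_type::eq_int_type(ch, traits_type::eof()))
                return traits_type::not_eof(ch);

            unsigned char uch =
                static_cast<unsigned char>(traits_type::to_char_type(ch));
            d_device += static_cast<char>(std::toupper(uch));
            return ch;
        }
};
```

An ostream is then constructed around the buffer, as in UpperBuf buf; std::ostream out(&buf);. Inserting text and numbers into out uses the standard insertion operators, while the characters end up in buf’s string.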
6.1 Special header files
Several iostream related header files are available. Depending on the situation at hand, the following
header files should be used:
• <iosfwd>: sources should include this header file if only a declaration of the stream classes is
required. For example, if a function defines a reference parameter to an ostream then the
compiler does not need to know exactly what an ostream is. When declaring such a function
the ostream class merely needs to be declared. One cannot use
class std::ostream; // erroneous declaration
void someFunction(std::ostream &str);
but, instead, one should use:
#include <iosfwd> // correctly declares class ostream
void someFunction(std::ostream &str);
• <ios>: sources should include this header file when using types and facilities (like
ios::off_type, see below) defined in the ios class.
• <streambuf>: sources should include this header file when using streambuf or filebuf
classes. See sections 14.8 and 14.8.2.
• <istream>: sources should include this header file when using the class istream
or when using classes that do both input and output. See section 6.5.1.
6.2. THE FOUNDATION: THE CLASS ‘IOS_BASE’ 91
• <ostream>: sources should include this header file when using the class ostream or
when using classes that do both input and output. See section 6.4.1.
• <iostream>: sources should include this header file when using the global stream objects
(like cin and cout).
• <fstream>: sources should include this header file when using the file stream classes. See
sections 6.4.2, 6.5.2, and 6.6.3.
• <sstream>: sources should include this header file when using the string stream classes. See
sections 6.4.3 and 6.5.3.
• <iomanip>: sources should include this header file when using parameterized manipulators.
See section 6.3.2.
6.2 The foundation: the class ‘ios_base’
The class std::ios_base forms the foundation of all I/O operations, and defines, among other
things, facilities for inspecting the state of I/O streams and most output formatting facilities. Every
stream class of the I/O library is, through the class ios, derived from this class, and inherits its
capabilities. As ios_base is the foundation on which all C++ I/O was built, we introduce it here as
the first class of the C++ I/O library.
Note that, as in C, I/O in C++ is not part of the language (although it is part of the ANSI/ISO
standard on C++). Although it is technically possible to ignore all predefined I/O facilities, nobody
does so, and the I/O library therefore represents a de facto I/O standard for C++. Also note that,
as mentioned before, the iostream classes themselves are not responsible for the eventual I/O, but
delegate this to an auxiliary class: the class streambuf or its derivatives.
It is neither possible nor required to construct an ios_base object directly. Its construction is
always a side-effect of constructing an object further down the class hierarchy, like std::ios. Ios
is the next class down the iostream hierarchy (see figure 6.1). Since all stream classes in turn inherit
from ios, and thus also from ios_base, the distinction between ios_base and ios is in practice
not important. Therefore, facilities actually provided by ios_base will be discussed as facilities
provided by ios. The reader who is interested in the true class in which a particular facility is
defined should consult the relevant header files (e.g., ios_base.h and basic_ios.h).
6.3 Interfacing ‘streambuf’ objects: the class ‘ios’
The std::ios class is derived directly from ios_base, and it defines de facto the foundation for all
stream classes of the C++ I/O library.
Although it is possible to construct an ios object directly, this is seldom done. The purpose of the
class ios is to provide the facilities of the class basic_ios, and to add several new facilities, all
related to the streambuf object which is managed by objects of the class ios.
All other stream classes are either directly or indirectly derived from ios. This implies, as explained
in chapter 13, that all facilities of the classes ios and ios_base are also available to other stream
classes. Before discussing these additional stream classes, the features offered by the class ios (and
by implication: by ios_base) are now introduced.
In some cases it may be required to include ios explicitly. An example is the situation where the
formatting flags themselves (cf. section 6.3.2.2) are referred to in source code.
The class ios offers several member functions, most of which are related to formatting. Other
frequently used member functions are:
• std::streambuf *ios::rdbuf():
A pointer to the streambuf object forming the interface between the ios object and
the device with which the ios object communicates is returned. See sections 14.8
and 24.1.2 for more information about the class streambuf.
• std::streambuf *ios::rdbuf(std::streambuf *new):
The current ios object is associated with another streambuf object. A pointer to the
ios object’s original streambuf object is returned. The object to which this pointer
points is not destroyed when the stream object goes out of scope, but is owned by the
caller of rdbuf.
• std::ostream *ios::tie():
A pointer to the ostream object that is currently tied to the ios object is returned
(see the next member). The return value 0 indicates that currently no ostream object
is tied to the ios object. See section 6.5.5 for details.
• std::ostream *ios::tie(std::ostream *outs):
The ostream object is tied to the current ios object. This means that the ostream object
is flushed every time before an input or output action is performed by the current ios
object. A pointer to the ios object’s original ostream object is returned. To break
the tie, pass the argument 0. See section 6.5.5 for an example.
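The two rdbuf members can be used to redirect a stream temporarily. The following is a minimal sketch; the function template capture is hypothetical. It lets a stream write into the streambuf of an ostringstream (see section 6.4.3) while a caller-supplied action runs, and then restores the original streambuf.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Lets out write into a memory buffer while action runs, then restores
// out's original streambuf. Returns the text written by action.
// Note: this sketch does not restore the streambuf if action throws.
template <typename Action>
std::string capture(std::ostream &out, Action action)
{
    std::ostringstream memory;

    // rdbuf(new) returns the streambuf that was used so far
    std::streambuf *original = out.rdbuf(memory.rdbuf());
    action();
    out.rdbuf(original);            // restore before memory is destroyed

    return memory.str();
}
```

Restoring the original pointer matters: as stated above, the stream does not own the streambuf it received through rdbuf, so out must not keep pointing at memory’s buffer once memory goes out of scope. As for tie: by default cin is tied to cout, so cin.tie() returns cout’s address.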
6.3.1 Condition states
Operations on streams may fail for various reasons. Whenever an operation fails, further operations
on the stream are suspended. It is possible to inspect, set and possibly clear the condition state
of streams, allowing a program to repair the problem rather than having to abort. The members
that are available for interrogating or manipulating the stream’s state are described in the current
section.
Conditions are represented by the following condition flags:
• ios::badbit:
if this flag has been raised an illegal operation has been requested at the level of the
streambuf object to which the stream interfaces. See the member functions below
for some examples.
• ios::eofbit:
if this flag has been raised, the ios object has sensed end of file.
• ios::failbit:
if this flag has been raised, an operation performed by the stream object has failed
(like an attempt to extract an int when no numeric characters are available on input).
In this case the stream itself could not perform the operation that was requested
of it.
• ios::goodbit:
this flag is raised when none of the other three condition flags were raised.
Several condition member functions are available to manipulate or determine the states of ios
objects. Originally they returned int values, but their current return type is bool:
• bool ios::bad():
the value true is returned when the stream’s badbit has been set and false otherwise.
If true is returned it indicates that an illegal operation has been requested
at the level of the streambuf object to which the stream interfaces. What does this
mean? It indicates that the streambuf itself is behaving unexpectedly. Consider the
following example:
std::ostream error(0);
Here an ostream object is constructed without providing it with a working
streambuf object. Since this ‘streambuf’ will never operate properly, its badbit
flag is raised from the very beginning: error.bad() returns true.
• bool ios::eof():
the value true is returned when end of file (EOF) has been sensed (i.e., the eofbit
flag has been set) and false otherwise. Assume we’re reading lines line-by-line
from cin, but the last line is not terminated by a final \n character. In that case
std::getline attempting to read the \n delimiter hits end-of-file first. This raises
the eofbit flag and cin.eof() returns true. For example, assume std::string
str and main executing the statements:
getline(cin, str);
cout << cin.eof();
Then
echo "hello world" | program
prints the value 0 (no EOF sensed). But after
echo -n "hello world" | program
the value 1 (EOF sensed) is printed.
• bool ios::fail():
the value true is returned when bad returns true or when the failbit flag was
set. The value false is returned otherwise. In the above example, cin.fail()
returns false, whether we terminate the final line with a delimiter or not (as we’ve
read a line). However, executing another getline results in raising the failbit
flag, causing cin.fail() to return true. In general: fail returns true if the
requested stream operation failed. A simple example showing this consists of an
attempt to extract an int when the input stream contains the text hello world.
The value not fail() is returned by the bool interpretation of a stream object (see
below).
• bool ios::good():
the value of the goodbit flag is returned. It equals true when none of the other
condition flags (badbit, eofbit, failbit) was raised. Consider the following
little program:
#include <iostream>
#include <string>
using namespace std;
void state()
{
cout << "\n"
"Bad: " << cin.bad() << " "
"Fail: " << cin.fail() << " "
"Eof: " << cin.eof() << " "
"Good: " << cin.good() << '\n';
}
int main()
{
string line;
int x;
cin >> x;
state();
cin.clear();
getline(cin, line);
state();
getline(cin, line);
state();
}
When this program processes a file having two lines, containing, respectively, hello
and world, while the second line is not terminated by a \n character the following is
shown:
Bad: 0 Fail: 1 Eof: 0 Good: 0
Bad: 0 Fail: 0 Eof: 0 Good: 1
Bad: 0 Fail: 0 Eof: 1 Good: 0
Thus, extracting x fails (good returning false). Then, the error state is cleared, and
the first line is successfully read (good returning true). Finally the second line is
read (incompletely): good returning false, and eof returning true.
• Interpreting streams as bool values:
streams may be used in expressions expecting logical values. Some examples are:
if (cin) // cin itself interpreted as bool
if (cin >> x) // cin interpreted as bool after an extraction
if (getline(cin, str)) // getline returning cin
When interpreting a stream as a logical value, it is actually ‘not fail()’ that is
interpreted. The above examples may therefore be rewritten as:
if (not cin.fail())
if (not (cin >> x).fail())
if (not getline(cin, str).fail())
The former incantation, however, is used almost exclusively.
The following members are available to manage error states:
• void ios::clear():
When an error condition has occurred, and the condition can be repaired, then clear
can be used to clear the error state of the file. An overloaded version exists accepting
state flags, that are set after first clearing the current set of flags: clear(int
state). Its return type is void.
• ios::iostate ios::rdstate():
The current set of flags that are set for an ios object are returned (as an int). To
test for a particular flag, use the bitwise and operator:
if (!(iosObject.rdstate() & ios::failbit))
{
// last operation didn’t fail
}
Note that this test cannot be performed for the goodbit flag as its value equals zero.
To test for ‘good’ use a construction like:
if (iosObject.rdstate() == ios::goodbit)
{
// state is ‘good’
}
• void ios::setstate(ios::iostate state):
The flags specified by state are added to the stream’s current set of state flags
(flags that were already set remain set). Its return type is void. E.g.,
cin.setstate(ios::failbit); // set state to ‘fail’
To set multiple flags in one setstate() call use the bitor operator:
cin.setstate(ios::failbit | ios::eofbit);
The member clear is a shortcut to clear all error flags. Of course, clearing the flags doesn’t
automatically mean the error condition has been cleared too. The strategy should be:
– An error condition is detected,
– The error is repaired
– The member clear is called.
C++ supports an exception mechanism to handle exceptional situations. According to the ANSI/ISO
standard, exceptions can be used with stream objects. Exceptions are covered in chapter 10. Using
exceptions with stream objects is covered in section 10.7.
6.3.2 Formatting output and input
The way information is written to streams (or, occasionally, read from streams) is controlled by
formatting flags.
Formatting is used when it is necessary to, e.g., set the width of an output field or input buffer and to
determine the form (e.g., the radix) in which values are displayed. Most formatting features belong
to the realm of the ios class. Formatting is controlled by flags, defined by the ios class. These flags
may be manipulated in two ways: using specialized member functions or using manipulators, which
are directly inserted into or extracted from streams. There is no special reason for using either
method; usually both methods are possible. In the following overview the various member functions
are first introduced. Following this the flags and manipulators themselves are covered. Examples
are provided showing how the flags can be manipulated and what their effects are.
Many manipulators are parameterless and are available once a stream header file (e.g., iostream)
has been included. Some manipulators require arguments. To use the latter manipulators the
header file iomanip must be included.
6.3.2.1 Format modifying member functions
Several member functions are available manipulating the I/O formatting flags. Instead of using
the members listed below manipulators are often available that may directly be inserted into or
extracted from streams. The available members are listed in alphabetical order, but the most important
ones in practice are setf, unsetf and width.
• ios &ios::copyfmt(ios &obj):
all format flags of obj are copied to the current ios object. The current ios object is
returned.
• char ios::fill() const:
the current padding character is returned. By default, this is the blank space.
• char ios::fill(char padding):
the padding character is redefined, the previous padding character is returned. Instead
of using this member function the setfill manipulator may be inserted directly
into an ostream. Example:
cout.fill('0'); // use '0' as padding char
cout << setfill('+'); // use '+' as padding char
• ios::fmtflags ios::flags() const:
the current set of flags controlling the format state of the stream for which the member
function is called is returned. To inspect whether a particular flag was set, use
the bitand operator. Example:
if (cout.flags() & ios::hex)
cout << "Integral values are printed as hex numbers\n";
• ios::fmtflags ios::flags(ios::fmtflags flagset):
the previous set of flags are returned and the new set of flags are defined by flagset.
Multiple flags are specified using the bitor operator. Example:
// change the representation to hexadecimal
cout.flags(ios::hex | (cout.flags() & ~ios::dec));
• int ios::precision() const:
the number of significant digits used when outputting floating point values is returned
(default: 6).
• int ios::precision(int signif):
the number of significant digits to use when outputting real values is set to signif.
The previously used number of significant digits is returned. If the number of required
digits exceeds signif then the number is displayed in ‘scientific’ notation (cf.
section 6.3.2.2). Manipulator: setprecision. Example:
cout.precision(3); // 3 digits precision
cout << setprecision(3); // same, using the manipulator
cout << 1.23 << " " << 12.3 << " " << 123.12 << " " << 1234.3 << '\n';
// displays: 1.23 12.3 123 1.23e+03
• Type std::put_time(std::tm const *tm, char const *fmt):
The function put_time can be used to display (elements of) the current time. Time
elements are provided in a std::tm object, and the way the time is displayed is
defined by the format string fmt.
In the next example put_time is used to display the current date and time:
#include <ctime>
#include <iomanip> // declares put_time
#include <iostream>
int main()
{
    std::time_t tm {std::time(0)};
    std::cout << std::put_time(std::localtime(&tm), "%c") << '\n';
}
// displays, e.g., Sun Dec 20 15:05:18 2015
The function put_time, as well as std::localtime and gmtime, are described in
more detail in section 20.1.5.
• ios::fmtflags ios::setf(ios::fmtflags flags):
sets one or more formatting flags (use the bitor operator to combine multiple flags).
Already set flags are not affected. The previous set of flags is returned. Instead of
using this member function the manipulator setiosflags may be used. Examples
are provided in the next section (6.3.2.2).
• ios::fmtflags ios::setf(ios::fmtflags flags, ios::fmtflags mask):
clears all flags mentioned in mask and sets the flags specified in flags. The previous
set of flags is returned. Some examples are (but see the next section (6.3.2.2) for a
more thorough discussion):
// left-adjust information in wide fields
cout.setf(ios::left, ios::adjustfield);
// display integral values as hexadecimal numbers
cout.setf(ios::hex, ios::basefield);
// display floating point values in scientific notation
cout.setf(ios::scientific, ios::floatfield);
• void ios::unsetf(ios::fmtflags flags):
the specified formatting flags are cleared (leaving the remaining flags unaltered).
The member does not return a value. A request to unset an active default flag (e.g.,
cout.unsetf(ios::dec)) does not change the way values are displayed. Instead of this
member function the manipulator resetiosflags may also be used. Example:
cout << 12.24; // displays 12.24
cout.setf(ios::fixed); // setf is a member, not a manipulator
cout << 12.24; // displays 12.240000
cout.unsetf(ios::fixed); // undo a previous ios::fixed setting.
cout << 12.24; // displays 12.24
cout << resetiosflags(ios::fixed); // using manipulator rather
// than unsetf
• int ios::width() const:
the currently active output field width to use on the next insertion is returned. The
default value is 0, meaning ‘as many characters as needed to write the value’.
• int ios::width(int nchars):
the field width of the next insertion operation is set to nchars, returning the previously
used field width. This setting is not persistent. It is reset to 0 after every
insertion operation. Manipulator: std::setw(int). Example:
cout.width(5);
cout << 12; // using 5 chars field width
cout << setw(12) << "hello"; // using 12 chars field width
6.3.2.2 Formatting flags
Most formatting flags are related to outputting information. Information can be written to output
streams in basically two ways: using binary output, information is written directly to an output
stream without first converting it to some human-readable format; using formatted output,
values stored in the computer’s memory are first converted to human-readable text. Formatting
flags define the way this conversion takes place. In this section all formatting flags
are covered. Formatting flags may be (un)set using member functions, but often manipulators having
the same effect may also be used. For each of the flags it is shown how it can be controlled by
a member function or (if available) a manipulator.
To display information in wide fields:
• ios::internal:
to add fill characters (blanks by default) between the minus sign of negative numbers
and the value itself. Other values and data types are right-adjusted. Manipulator:
std::internal. Example:
cout.setf(ios::internal, ios::adjustfield);
cout << internal; // same, using the manipulator
cout << '\'' << setw(5) << -5 << "'\n"; // displays '-   5'
• ios::left:
to left-adjust values in fields that are wider than needed to display the values. Manipulator:
std::left. Example:
cout.setf(ios::left, ios::adjustfield);
cout << left; // same, using the manipulator
cout << '\'' << setw(5) << "hi" << "'\n"; // displays 'hi   '
• ios::right:
to right-adjust values in fields that are wider than needed to display the values.
Manipulator: std::right. This is the default. Example:
cout.setf(ios::right, ios::adjustfield);
cout << right; // same, using the manipulator
cout << '\'' << setw(5) << "hi" << "'\n"; // displays '   hi'
Using various number representations:
• ios::dec:
to display integral values as decimal numbers. Manipulator: std::dec. This is the
default. Example:
cout.setf(ios::dec, ios::basefield);
cout << dec; // same, using the manipulator
cout << 0x10; // displays 16
• ios::hex:
to display integral values as hexadecimal numbers. Manipulator: std::hex. Example:
cout.setf(ios::hex, ios::basefield);
cout << hex; // same, using the manipulator
cout << 16; // displays 10
• ios::oct:
to display integral values as octal numbers. Manipulator: std::oct. Example:
cout.setf(ios::oct, ios::basefield);
cout << oct; // same, using the manipulator
cout << 16; // displays 20
• std::setbase(int radix):
This is a manipulator that can be used to change the number representation to decimal,
hexadecimal or octal. Example:
cout << setbase(8); // octal numbers, use 10 for
// decimal, 16 for hexadecimal
cout << 16; // displays 20
Fine-tuning displaying values:
• ios::boolalpha:
logical values may be displayed as text using the text ‘true’ for the true logical
value, and ‘false’ for the false logical value using boolalpha. By default
this flag is not set. There is no separate complementary flag: to clear it use
unsetf(ios::boolalpha) or the noboolalpha manipulator. Manipulators:
std::boolalpha and std::noboolalpha. Example:
cout.setf(ios::boolalpha);
cout << boolalpha; // same, using the manipulator
cout << (1 == 1); // displays true
• ios::showbase:
to display the numeric base of integral values. With hexadecimal values the 0x
prefix is used, with octal values the prefix 0. For the (default) decimal value no
particular prefix is used. To clear the flag use unsetf(ios::showbase) or the
noshowbase manipulator. Manipulators: std::showbase and std::noshowbase. Example:
cout.setf(ios::showbase);
cout << showbase; // same, using the manipulator
cout << hex << 16; // displays 0x10
• ios::showpos:
to display the + sign with positive decimal (only) values. To clear the flag use
unsetf(ios::showpos) or the noshowpos manipulator. Manipulators: std::showpos
and std::noshowpos. Example:
cout.setf(ios::showpos);
cout << showpos; // same, using the manipulator
cout << 16; // displays +16
cout.unsetf(ios::showpos); // Undo showpos
cout << 16; // displays 16
• ios::uppercase:
to display letters in hexadecimal values using capital letters. To clear the flag use
unsetf(ios::uppercase) or the nouppercase manipulator. Manipulators:
std::uppercase and std::nouppercase. By default lower case letters are used. Example:
cout.setf(ios::uppercase);
cout << uppercase; // same, using the manipulator
cout << hex << showbase <<
3735928559; // displays 0XDEADBEEF
Displaying floating point numbers:
• ios::fixed:
to display real values using a fixed decimal point (e.g., 12.25 rather than 1.225e+01),
the fixed formatting flag is used. It can be used to set a fixed number of digits
behind the decimal point. Manipulator: fixed. Example:
cout.setf(ios::fixed, ios::floatfield);
cout.precision(3); // 3 digits behind the .
// Alternatively:
cout << setiosflags(ios::fixed) << setprecision(3);
cout << 3.0 << " " << 3.01 << " " << 3.001 << '\n'
<< 3.0004 << " " << 3.0005 << " " << 3.0006 << '\n';
// Results in:
// 3.000 3.010 3.001
// 3.000 3.001 3.001
The example shows that 3.0005 is rounded away from zero, becoming 3.001 (likewise
-3.0005 becomes -3.001). First setting precision and then fixed has the same effect.
• ios::scientific:
to display real values in scientific notation (e.g., 1.24e+03). Manipulator:
std::scientific. Example:
cout.setf(ios::scientific, ios::floatfield);
cout << scientific; // same, using the manipulator
cout << 12.25; // displays 1.225000e+01 (default precision: 6)
• ios::showpoint:
to display a trailing decimal point and trailing decimal zeros when real numbers
are displayed. To clear the flag use unsetf(ios::showpoint) or the noshowpoint
manipulator. Manipulators: std::showpoint, std::noshowpoint. Example:
cout << fixed << setprecision(3); // 3 digits behind .
cout.setf(ios::showpoint); // set the flag
cout << showpoint; // same, using the manipulator
cout << 16.0 << ", " << 16.1 << ", " << 16;
// displays: 16.000, 16.100, 16
Note that the final 16 is an integral rather than a floating point number, so it has
no decimal point. So showpoint has no effect. If ios::showpoint is not active
trailing zeros are discarded. If the fraction is zero the decimal point is discarded as
well. Example:
cout.unsetf(ios::fixed | ios::showpoint); // unset the flags
cout << 16.0 << ", " << 16.1;
// displays: 16, 16.1
Handling white space and flushing streams:
• std::endl:
manipulator inserting a newline character and flushing the stream. Often flushing
the stream is not required and doing so would needlessly slow down I/O processing.
Consequently, using endl should be avoided (in favor of inserting '\n') unless flushing
the stream is explicitly intended. Note that streams are automatically flushed
when the program terminates or when a stream is ‘tied’ to another stream (cf. tie in
section 6.3). Example:
cout << "hello" << endl; // prefer: << '\n';
• std::ends:
manipulator inserting a 0-byte into a stream. It is usually used in combination with
memory-streams (cf. section 6.4.3).
• std::flush:
a stream may be flushed using this manipulator. Often flushing the stream is not required
and doing so would needlessly slow down I/O processing. Consequently, using flush
should be avoided unless it is explicitly required to do so. Note that streams are
automatically flushed when the program terminates or when a stream is ‘tied’ to
another stream (cf. tie in section 6.3). Example:
cout << "hello" << flush; // avoid if possible.
• ios::skipws:
leading white space characters (blanks, tabs, newlines, etc.) are skipped when a value
is extracted from a stream. This is the default. If the flag is not set, leading white
space characters are not skipped. Manipulator: std::skipws. Example:
cin.setf(ios::skipws); // to unset, use
// cin.unsetf(ios::skipws)
cin >> skipws; // same, using the manipulator
int value;
cin >> value; // skips initial blanks
• ios::unitbuf:
the stream for which this flag is set flushes its buffer after every output operation.
Often flushing a stream is not required and doing so would needlessly slow down
I/O processing. Consequently, setting unitbuf should be avoided unless flushing the
stream is explicitly intended. Note that streams are automatically flushed when the
program terminates or when a stream is ‘tied’ to another stream (cf. tie in section
6.3). To clear the flag use unsetf(ios::unitbuf) or the nounitbuf manipulator.
Manipulators: std::unitbuf, std::nounitbuf. Example:
cout.setf(ios::unitbuf);
cout << unitbuf; // same, using the manipulator
cout.write("xyz", 3); // flush follows write.
• std::ws:
manipulator removing all white space characters (blanks, tabs, newlines, etc.) at
the current file position. White space characters are removed if present even if the
ios::skipws flag has been cleared. Example (assume the input contains 4 blank characters
followed by the character X):
cin >> ws; // skip white space
cin.get(); // returns 'X'
6.4 Output
In C++ output is primarily based on the std::ostream class. The ostream class defines the basic
operators and members inserting information into streams: the insertion operator (<<), and special
members like write writing unformatted information to streams.
The class ostream acts as base class for several other classes, all offering the functionality of the
ostream class, but adding their own specialties. In the upcoming sections the following classes are
discussed:
• The class ostream, offering the basic output facilities;
• The class ofstream, allowing us to write files (comparable to C’s fopen(filename, "w"));
• The class ostringstream, allowing us to write information to memory (comparable to C’s
sprintf function).
6.4.1 Basic output: the class ‘ostream’
The class ostream defines basic output facilities. The cout, clog and cerr objects are all ostream
objects. All facilities related to output as defined by the ios class are also available in the ostream
class.
We may define ostream objects using the following ostream constructor:
• std::ostream object(std::streambuf *sb):
this constructor creates an ostream object which is a wrapper around an existing
std::streambuf object. It isn’t possible to define a plain ostream object (e.g., using
std::ostream out;) that can thereupon be used for insertions. When cout or
its friends are used, we are actually using a predefined ostream object that has
already been defined for us and interfaces to the standard output stream using a
(also predefined) streambuf object handling the actual interfacing.
It is, however, possible to define an ostream object passing it a 0-pointer. Such an
object cannot be used for insertions (i.e., it raises its ios::badbit flag when something
is inserted into it), but it may be given a streambuf later. Thus it may be constructed
in advance, postponing its use until an appropriate streambuf becomes available
(see also section 14.8.3).
To use the ostream class in C++ sources, the <ostream> header file must be included. To use
the predefined ostream objects (std::cout, std::cerr etc.) the <iostream> header file must
be included.
6.4.1.1 Writing to ‘ostream’ objects
The class ostream supports both formatted and binary output.
The insertion operator (<<) is used to insert values in a type safe way into ostream objects. This is
called formatted output, as binary values which are stored in the computer’s memory are converted
to human-readable ASCII characters according to certain formatting rules.
The insertion operator points to the ostream object to receive the information. The normal associativity
of << remains unaltered, so when a statement like
cout << "hello " << "world";
is encountered, the leftmost two operands are evaluated first (cout << "hello "), and an ostream
& object, which is actually the same cout object, is returned. Now, the statement is reduced to
cout << "world";
and the second string is inserted into cout.
The << operator has a lot of (overloaded) variants, so many types of variables can be inserted into
ostream objects. There is an overloaded <<-operator expecting an int, a double, a pointer,
etc. Each operator returns the ostream object into which the information so far has been inserted,
and can thus immediately be followed by the next insertion.
Streams lack facilities for formatted output like C’s printf and vprintf functions. Although it is
not difficult to implement these facilities in the world of streams, printf-like functionality is hardly
ever required in C++ programs. Furthermore, as it is potentially type-unsafe, it might be better to
avoid this functionality completely.
When binary files must be written, normally no text-formatting is used or required: an int value
should be written as a series of raw bytes, not as a series of ASCII numeric characters 0 to 9. The
following member functions of ostream objects may be used to write ‘binary files’:
• ostream& put(char c):
to write a single character to the output stream. Since a character is a byte, this
member function could also be used for writing a single character to a text-file.
• ostream& write(char const *buffer, int length):
to write at most length bytes, stored in the char const *buffer to the ostream
object. Bytes are written as they are stored in the buffer, no formatting is done
whatsoever. Note that the first argument is a char const *: a type cast is required
to write any other type. For example, to write an int as an unformatted series of
byte-values use:
int x;
out.write(reinterpret_cast<char const *>(&x), sizeof(int));
The bytes written by the above write call are written to the ostream in an order depending on
the endian-ness of the underlying hardware. Big-endian computers write the most significant
byte(s) of multi-byte values first, little-endian computers first write the least significant byte(s).
6.4.1.2 ‘ostream’ positioning
Although not every ostream object supports repositioning, they usually do. This means that it is
possible to rewrite a section of the stream which was written earlier. Repositioning is frequently
used in database applications where it must be possible to access the information in the database at
random.
The current position can be obtained and modified using the following members:
• ios::pos_type tellp():
the current (absolute) position in the file where the next write-operation to the stream
will take place is returned.
• ostream &seekp(ios::off_type step, ios::seekdir org):
modifies a stream’s actual position. The function expects an off_type step representing
the number of bytes the current stream position is moved with respect to
org. The step value may be negative, zero or positive.
The origin of the step, org is a value in the ios::seekdir enumeration. Its values
are:
– ios::beg:
the stepsize is computed relative to the beginning of the stream. This
value is used by default.
– ios::cur:
the stepsize is computed relative to the current position of the stream (as
returned by tellp).
– ios::end:
the stepsize is interpreted relative to the current end position of the
stream.
It is OK to seek or write beyond the last file position. Writing bytes to a location
beyond EOF will pad the intermediate bytes with 0-valued bytes: null-bytes. Seeking
before ios::beg raises the ios::failbit flag.
6.4.1.3 ‘ostream’ flushing
Unless the ios::unitbuf flag has been set, information written to an ostream object is not
immediately written to the physical stream. Rather, an internal buffer is filled during the
write-operations, and when full it is flushed.
The stream’s internal buffer can be flushed under program control:
• ostream& flush():
any buffered information stored internally by the ostream object is flushed to the
device to which the ostream object interfaces. A stream is flushed automatically
when:
– the object ceases to exist;
– the endl or flush manipulators (see section 6.3.2.2) are inserted into an
ostream object;
– a stream supporting the close-operation is explicitly closed (e.g., a
std::ofstream object, cf. section 6.4.2).
6.4.2 Output to files: the class ‘ofstream’
The std::ofstream class is derived from the ostream class: it has the same capabilities as the
ostream class, but can be used to access files or create files for writing.
In order to use the ofstream class in C++ sources, the <fstream> header file must be included.
Including fstream does not automatically make available the standard streams cin, cout and
cerr. Include iostream to declare these standard streams.
The following constructors are available for ofstream objects:
• ofstream object:
this is the basic constructor. It defines an ofstream object which may be associated
with an actual file later, using its open() member (see below).
• ofstream object(char const *name, ios::openmode mode = ios::out):
this constructor defines an ofstream object and associates it immediately with the
file named name using output mode mode. Section 6.4.2.1 provides an overview of
available output modes. Example:
ofstream out("/tmp/scratch");
It is not possible to open an ofstream using a file descriptor. The reason for this is (apparently)
that file descriptors are not universally available over different operating systems. Fortunately, file
descriptors can be used (indirectly) with a std::streambuf object (and in some implementations:
with a std::filebuf object, which is also a streambuf). Streambuf objects are discussed in
section 14.8, filebuf objects are discussed in section 14.8.2.
Instead of directly associating an ofstream object with a file, the object can be constructed first,
and opened later.
• void open(char const *name, ios::openmode mode = ios::out):
associates an ofstream object with an actual file. If the ios::failbit flag was set
before calling open and opening succeeds the flag is cleared. Opening an already
open stream fails. To reassociate a stream with another file it must first be closed:
ofstream out("/tmp/out");
out << "hello\n";
out.close(); // flushes and closes out
out.open("/tmp/out2");
out << "world\n";
• void close():
closes the ofstream object. If closing fails (e.g., because the stream was not open)
the ios::failbit flag of the object is set.
Closing the file flushes any buffered information to the associated file. A file is
automatically closed when the associated ofstream object ceases to exist.
• bool is_open() const:
assume a stream was properly constructed, but it has not yet been attached to a
file. E.g., the statement ofstream ostr was executed. When we now check its
status through good(), a non-zero (i.e., OK) value is returned. The ‘good’ status
here indicates that the stream object has been constructed properly. It doesn’t mean
the file is also open. To test whether a stream is actually open, is_open should be
called. If it returns true, the stream is open. Example:
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
ofstream of;
cout << "of’s open state: " << boolalpha << of.is_open() << '\n';
of.open("/dev/null"); // on Unix systems
cout << "of’s open state: " << of.is_open() << '\n';
}
/*
Generated output:
of’s open state: false
of’s open state: true
*/
6.4.2.1 Modes for opening stream objects
The following file modes or file flags are available when constructing or opening ofstream (or
ifstream, see section 6.5.2) objects. The values are of type ios::openmode. Flags may be combined
using the bitor operator.
• ios::app:
reposition the stream to its end before every output command (see also ios::ate
below). The file is created if it doesn’t yet exist. When opening a stream in this mode
any existing contents of the file are kept.
• ios::ate:
start initially at the end of the file. Note that any existing contents are only kept
if some other flag tells the object to do so. For example ofstream out("gone",
ios::ate) rewrites the file gone, because the implied ios::out causes the rewriting.
If rewriting of an existing file should be prevented, the ios::in mode should
be specified too. However, when ios::in is specified the file must already exist. The
ate mode only initially positions the file at the end of file position. After that information
may be written in the middle of the file using seekp. When the app mode is
used information is only written at end of file (effectively ignoring seekp operations).
• ios::binary:
open a file in binary mode (used on systems distinguishing text- and binary files, like
MS-Windows).
• ios::in:
open the file for reading. The file must exist.
• ios::out:
open the file for writing. Create it if it doesn’t yet exist. If it exists, the file is rewritten.
• ios::trunc:
start initially with an empty file. Any existing contents of the file are lost.
The following combinations of file flags have special meanings:
    in | out:           The stream may be read and written. However, the
                        file must exist.
    in | out | trunc:   The stream may be read and written. It is
                        (re)created empty first.
An interesting subtlety is that the open members of the ifstream, ofstream and fstream
classes have a second parameter of type ios::openmode. In contrast to this, the bitor operator
returns an int when applied to two enum-values. The question why the bitor operator may
nevertheless be used here is answered in a later chapter (cf. section 11.11).
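The effect of the most frequently used flags can be verified with a small experiment. The following is merely a sketch: the helper functions writeFresh, appendText and contents, as well as the scratch file openmode_demo.txt, are assumptions made for this illustration and are not part of the stream library.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

// (Hypothetical helper) plain ios::out: an existing file is rewritten.
void writeFresh(std::string const &path, std::string const &text)
{
    std::ofstream out(path);
    out << text;
}

// (Hypothetical helper) ios::app: existing contents are kept, new
// information is written at the end.
void appendText(std::string const &path, std::string const &text)
{
    std::ofstream out(path, std::ios::app);
    out << text;
}

// (Hypothetical helper) returns the file's complete contents.
std::string contents(std::string const &path)
{
    std::ifstream in(path);
    std::ostringstream text;
    text << in.rdbuf();
    return text.str();
}

void openmodeDemo()
{
    char const *path = "openmode_demo.txt";     // assumed scratch file

    writeFresh(path, "one\n");
    appendText(path, "two\n");
    assert(contents(path) == "one\ntwo\n");     // app kept the old contents

    writeFresh(path, "three\n");
    assert(contents(path) == "three\n");        // out rewrote the file

    std::remove(path);                          // clean up
}
```

Calling openmodeDemo() from main runs the experiment. The assertions document the behavior described above.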
6.4.3 Output to memory: the class ‘ostringstream’
To write information to memory using stream facilities, std::ostringstream objects should be
used. As the class ostringstream is derived from the class ostream all ostream’s facilities are
available to ostringstream objects as well. To use and define ostringstream objects the header
file <sstream> must be included. In addition the class ostringstream offers the following constructors
and members:
• ostringstream ostr(string const &init, ios::openmode mode = ios::out):
when specifying openmode as ios::ate, the ostringstream object is initialized
by the string init and remaining insertions are appended to the contents of the
ostringstream object.
• ostringstream ostr(ios::openmode mode = ios::out):
this constructor can also be used as default constructor. Alternatively it allows,
e.g., forced additions at the end of the information stored in the object so far (using
ios::app). Example:
std::ostringstream out;
• std::string str() const:
a copy of the string that is stored inside the ostringstream object is returned.
• void str(std::string const &str):
the current object is reinitialized with new initial contents.
The following example illustrates the use of the ostringstream class: several values are inserted
into the object. Then, the text contained by the ostringstream object is stored in a std::string,
whose length and contents are thereupon printed. Such ostringstream objects are most often
used for doing ‘type to string’ conversions, like converting int values to text. Formatting flags can
be used with ostringstreams as well, as they are part of the ostream class.
Here is an example showing an ostringstream object being used:
    #include <iostream>
    #include <sstream>
    using namespace std;

    int main()
    {
        ostringstream ostr("hello ", ios::ate);

        cout << ostr.str() << '\n';

        ostr.setf(ios::showbase);
        ostr.setf(ios::hex, ios::basefield);
        ostr << 12345;

        cout << ostr.str() << '\n';

        ostr << " -- ";
        ostr.unsetf(ios::hex);
        ostr << 12;

        cout << ostr.str() << '\n';

        ostr.str("new text");
        cout << ostr.str() << '\n';

        ostr.seekp(4, ios::beg);
        ostr << "world";
        cout << ostr.str() << '\n';
    }
    /*
        Output from this program:
    hello
    hello 0x3039
    hello 0x3039 -- 12
    new text
    new world
    */
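The ‘type to string’ conversion mentioned above is easily wrapped in a small function. The next sketch assumes two helper functions, toText and toHex. Both names are made up for this illustration.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// (Hypothetical helper) converts any insertable value to its text form.
template <typename Type>
std::string toText(Type const &value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// (Hypothetical helper) same, but formatting flags are set first, so the
// value is shown as a hexadecimal number with a 0x prefix.
std::string toHex(int value)
{
    std::ostringstream out;
    out.setf(std::ios::showbase);
    out.setf(std::ios::hex, std::ios::basefield);
    out << value;
    return out.str();
}

void toTextDemo()
{
    assert(toText(12345) == "12345");
    assert(toText(1.5) == "1.5");
    assert(toHex(12345) == "0x3039");
}
```

For plain numeric conversions std::to_string is available as well. The ostringstream approach has the advantage that it works for any type offering an insertion operator and that formatting flags can be used.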
6.5 Input
In C++ input is primarily based on the std::istream class. The istream class defines the basic
operators and members extracting information from streams: the extraction operator (>>), and
special members like istream::read reading unformatted information from streams.
The class istream acts as base class for several other classes, all offering the functionality of the
istream class, but adding their own specialties. In the upcoming sections the following classes are
discussed:
• The class istream, offering the basic facilities for doing input;
• The class ifstream, allowing us to read files (comparable to C’s fopen(filename, "r"));
• The class istringstream, allowing us to read information from text that is not stored on files
(streams) but in memory (comparable to C’s sscanf function).
6.5.1 Basic input: the class ‘istream’
The class istream defines basic input facilities. The cin object is an istream object. All facilities
related to input as defined by the ios class are also available in the istream class.
We may define istream objects using the following istream constructor:
• istream object(streambuf *sb):
this constructor can be used to construct a wrapper around an existing
std::streambuf object. Similarly to ostream objects, istream objects may be defined
by initially passing a 0-pointer. See section 6.4.1 for a discussion, see also
section 14.8.3, and see chapter 24 for examples.
To use the istream class in C++ sources, the <istream> header file must be included. To use
the predefined istream object cin, the <iostream> header file must be included.
6.5.1.1 Reading from ‘istream’ objects
The class istream supports both formatted and unformatted binary input. The extraction operator
(operator>>) is used to extract values in a type safe way from istream objects. This is called
formatted input, whereby human-readable ASCII characters are converted, according to certain
formatting rules, to binary values.
The extraction operator points to the objects or variables which receive new values. The normal
associativity of >> remains unaltered, so when a statement like
cin >> x >> y;
is encountered, the leftmost two operands are evaluated first (cin >> x), and an istream & object,
which is actually the same cin object, is returned. Now, the statement is reduced to
cin >> y
and the y variable is extracted from cin.
The >> operator has many (overloaded) variants and thus many types of variables can be extracted
from istream objects. There is an overloaded >> available for the extraction of an int, of a double,
of a string, of an array of characters, possibly to a pointer, etc. String or character array
extraction by default first skips all white space characters, and then extracts all consecutive
non-white space characters. Once an extraction operator has been processed the istream object from
which the information was extracted is returned and it can immediately be used for additional
istream operations that appear in the same expression.
Streams lack facilities for format-string driven input (as offered by, e.g., C’s scanf and vscanf functions). Although
it is not difficult to add these facilities to the world of streams, scanf-like functionality is
hardly ever required in C++ programs. Furthermore, as it is potentially type-unsafe, it might be
better to avoid scanf-like input completely.
When binary files must be read, the information should normally not be formatted: an int value
should be read as a series of unaltered bytes, not as a series of ASCII numeric characters 0 to 9. The
following member functions for reading information from istream objects are available:
• int gcount() const:
the number of characters read from the input stream by the last unformatted input
operation is returned.
• int get():
the next available single character is returned as an unsigned char value using an
int return type. EOF is returned if no more characters are available.
• istream &get(char &ch):
the next single character read from the input stream is stored in ch. The member
function returns the stream itself which may be inspected to determine whether a
character was obtained or not.
• istream& get(char *buffer, int len, char delim = '\n'):
At most len - 1 characters are read from the input stream into the array starting
at buffer, which should be at least len bytes long. Reading also stops when the
delimiter delim is encountered. However, the delimiter itself is not removed from
the input stream.
Having stored the characters into buffer, a 0-valued character is written beyond
the last character stored into the buffer. The functions eof and fail (see section
6.3.1) return 0 (false) if the delimiter was encountered before reading len - 1
characters or if the delimiter was not encountered after reading len - 1 characters.
It is OK to specify a 0-valued character delimiter: this way NTB strings may be
read from a (binary) file.
• istream& getline(char *buffer, int len, char delim = '\n'):
this member function operates analogously to the get member function, but getline
removes delim from the stream if it is actually encountered. The delimiter itself, if
encountered, is not stored in the buffer. If delim was not found (before reading
len - 1 characters) the fail member function, and possibly also eof, return true.
Realize that the std::string class also offers a function std::getline which is
generally preferred over the getline member function described here (see
section 5.2.4).
• istream& ignore():
one character is skipped from the input stream.
• istream& ignore(int n):
n characters are skipped from the input stream.
• istream& ignore(int n, int delim):
at most n characters are skipped but skipping characters stops after having removed
delim from the input stream.
• int peek():
this function returns the next available input character, but does not actually remove
the character from the input stream. EOF is returned if no more characters are
available.
• istream& putback(char ch):
The character ch is ‘pushed back’ into the input stream, to be read again as the next
available character. If this is not allowed the stream’s ios::badbit flag is set. Normally, it is OK to put
back one character. Example:
    string value;
    cin >> value;
    cin.putback('X');
                                    // displays: X
    cout << static_cast<char>(cin.get());
• istream &read(char *buffer, int len):
At most len bytes are read from the input stream into the buffer. If EOF is encountered
first, fewer bytes are read, with the member function eof returning true. This
function is commonly used when reading binary files. Section 6.5.2 contains an example
in which this member function is used. The member function gcount() may
be used to determine the number of characters that were retrieved by read.
• istream& readsome(char *buffer, int len):
at most len bytes are read from the input stream into the buffer. All available characters
are read into the buffer, but if EOF is encountered, fewer bytes are read, without
setting the ios::eofbit or ios::failbit.
• istream &unget():
the last character that was read from the stream is put back.
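The different ways in which get and getline treat the delimiter can be made visible by inspecting the character that is next in the stream once the member has returned. The following sketch uses an istringstream as its source of information. The function names nextAfterGet and nextAfterGetline are assumptions made for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// (Hypothetical helper) reads up to ':' using get, and returns the
// character that is next in the stream thereafter.
char nextAfterGet(std::string const &text)
{
    std::istringstream in(text);
    char buffer[20];

    in.get(buffer, sizeof(buffer), ':');        // ':' stays in the stream
    return static_cast<char>(in.peek());
}

// (Hypothetical helper) same, but now using getline.
char nextAfterGetline(std::string const &text)
{
    std::istringstream in(text);
    char buffer[20];

    in.getline(buffer, sizeof(buffer), ':');    // ':' is removed
    return static_cast<char>(in.peek());
}

void delimiterDemo()
{
    assert(nextAfterGet("ab:cd") == ':');       // delimiter still available
    assert(nextAfterGetline("ab:cd") == 'c');   // delimiter was consumed
}
```

So, after using get the delimiter must still be removed (e.g., using ignore) before the next field can be read.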
6.5.1.2 ‘istream’ positioning
Although not every istream object supports repositioning, some do. This means that it is possible
to read the same section of a stream repeatedly. Repositioning is frequently used in database
applications where it must be possible to access the information in the database randomly.
The current position can be obtained and modified using the following members:
• ios::pos_type tellg():
the stream’s current (absolute) position where the stream’s next read-operation will
take place is returned.
• istream &seekg(ios::off_type step, ios::seekdir org):
modifies a stream’s actual position. The function expects an off_type step representing
the number of bytes the current stream position is moved with respect to
org. The step value may be negative, zero or positive.
The origin of the step, org is a value in the ios::seekdir enumeration. Its values
are:
– ios::beg:
the stepsize is computed relative to the beginning of the stream. This
value is used by default.
– ios::cur:
the stepsize is computed relative to the current position of the stream (as
returned by tellg).
– ios::end:
the stepsize is interpreted relative to the current end position of the
stream.
It is OK to seek beyond the last file position. Seeking before ios::beg raises the
ios::failbit flag.
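As an illustration of tellg and seekg, the next sketch retrieves the final characters of a stream and then returns to the position where reading was interrupted. It assumes a stream supporting repositioning (an istringstream is used here). The function name tail is merely an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// (Hypothetical helper) returns the last `count' characters of `in',
// restoring the original read position afterwards.
std::string tail(std::istream &in, std::streamoff count)
{
    std::istream::pos_type const here = in.tellg(); // remember the position

    in.seekg(-count, std::ios::end);                // step back from the end

    std::string result(static_cast<std::size_t>(count), ' ');
    in.read(&result[0], count);
    result.resize(static_cast<std::size_t>(in.gcount()));

    in.clear();                                     // in case reading failed
    in.seekg(here);                                 // back to where we were
    return result;
}

void tailDemo()
{
    std::istringstream in("hello world");
    std::string word;

    in >> word;                         // reads "hello"
    assert(tail(in, 5) == "world");     // peeks at the stream's end
    in >> word;                         // reading continues after "hello"
    assert(word == "world");
}
```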
6.5.2 Input from files: the class ‘ifstream’
The std::ifstream class is derived from the istream class: it has the same capabilities as the
istream class, but can be used to access files for reading.
In order to use the ifstream class in C++ sources, the <fstream> header file must be included.
Including fstream does not automatically make available the standard streams cin, cout and
cerr. Include iostream to declare these standard streams.
The following constructors are available for ifstream objects:
• ifstream object:
this is the basic constructor. It defines an ifstream object which may be associated
with an actual file later, using its open() member (see below).
• ifstream object(char const *name, ios::openmode mode = ios::in):
this constructor can be used to define an ifstream object and associate it immediately
with the file named name using input mode mode. Section 6.4.2.1 provides an
overview of available input modes. Example:
ifstream in("/tmp/input");
Instead of directly associating an ifstream object with a file, the object can be constructed first,
and opened later.
• void open(char const *name, ios::openmode mode = ios::in):
associates an ifstream object with an actual file. If the ios::fail flag was set
before calling open and opening succeeds the flag is cleared. Opening an already
open stream fails. To reassociate a stream with another file it must first be closed:
ifstream in("/tmp/in");
in >> variable;
in.close(); // closes in
in.open("/tmp/in2");
in >> anotherVariable;
• void close():
closes the ifstream object. If closing fails, the function sets the ios::failbit flag of the
object. A file is automatically closed when the associated ifstream object ceases to exist.
• bool is_open() const:
assume a stream was properly constructed, but it has not yet been attached to a
file. E.g., the statement ifstream ostr was executed. When we now check its
status through good(), a non-zero (i.e., OK) value is returned. The ‘good’ status
here indicates that the stream object has been constructed properly. It doesn’t mean
the file is also open. To test whether a stream is actually open, is_open should be
called. If it returns true, the stream is open. Also see the example in section 6.4.2.
The following example illustrates reading from a binary file (see also section 6.5.1.1):
    #include <fstream>
    using namespace std;

    int main(int argc, char **argv)
    {
        if (argc != 2)                  // the file's name is required
            return 1;

        ifstream in(argv[1], ios::binary);
        double value;
                                // reads double in raw, binary form from file.
        if (!in.read(reinterpret_cast<char *>(&value), sizeof(double)))
            return 1;                   // reading failed
    }
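Binary reading is most easily tested in combination with binary writing. The following sketch stores a double in raw form and retrieves it again. The function names saveDouble and loadDouble and the file name double_demo.bin are assumptions made for this illustration. Note that such a file can only be read back reliably on a system using the same representation of double values.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>

// (Hypothetical helper) writes the bytes of `value' to the file `path'.
bool saveDouble(std::string const &path, double value)
{
    std::ofstream out(path, std::ios::binary);

    out.write(reinterpret_cast<char const *>(&value), sizeof(value));
    out.close();                        // flushes the buffered bytes
    return !out.fail();
}

// (Hypothetical helper) reads the bytes of a double from the file `path'.
bool loadDouble(std::string const &path, double &value)
{
    std::ifstream in(path, std::ios::binary);

    in.read(reinterpret_cast<char *>(&value), sizeof(value));
                                        // gcount tells how much was read
    return in.gcount() == static_cast<std::streamsize>(sizeof(value));
}

void binaryDemo()
{
    char const *path = "double_demo.bin";   // assumed scratch file
    double value = 0;

    assert(saveDouble(path, 3.25));
    assert(loadDouble(path, value));
    assert(value == 3.25);                  // identical bytes, so identical value

    std::remove(path);
    assert(!loadDouble(path, value));       // nothing to read anymore
}
```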
6.5.3 Input from memory: the class ‘istringstream’
To read information from memory using stream facilities, std::istringstream objects should
be used. As the class istringstream is derived from the class istream all istream’s facilities
are available to istringstream objects as well. To use and define istringstream objects the
header file <sstream> must be included. In addition the class istringstream offers the following
constructors and members:
• istringstream istr(string const &init, ios::openmode mode = ios::in):
the object is initialized with init’s contents.
• istringstream istr(ios::openmode mode = ios::in):
this constructor is usually used as the default constructor. Example:
    std::istringstream in;
• void str(std::string const &str):
the current object is reinitialized with new initial contents.
The following example illustrates the use of the istringstream class: several values are extracted
from the object. Such istringstream objects are most often used for doing ‘string to type’ conversions,
like converting text to int values (cf. C’s atoi function). Formatting flags can be used with
istringstreams as well, as they are part of the istream class. In the example note especially the
use of the member seekg:
    #include <iostream>
    #include <sstream>
    using namespace std;

    int main()
    {
        istringstream istr("123 345");  // store some text.
        int x;

        istr.seekg(2);                  // skip "12"
        istr >> x;                      // extract int
        cout << x << '\n';              // write it out
        istr.seekg(0);                  // retry from the beginning
        istr >> x;                      // extract int
        cout << x << '\n';              // write it out
        istr.str("666");                // store another text
        istr >> x;                      // extract it
        cout << x << '\n';              // write it out
    }
    /*
        output of this program:
    3
    123
    666
    */
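A ‘string to type’ conversion usually also has to report whether the conversion succeeded. The following sketch shows one way to do so for int values. The function name toInt and the decision to reject trailing non-blank characters are assumptions made for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// (Hypothetical helper) converts `text' to an int. The conversion only
// succeeds if, apart from surrounding white space, all of `text' is used.
bool toInt(std::string const &text, int &value)
{
    std::istringstream in(text);

    if (!(in >> value))         // extraction failed: not a number
        return false;

    in >> std::ws;              // skip trailing white space, if any
    return in.eof();            // OK if nothing else remains
}

void toIntDemo()
{
    int value = 0;

    assert(toInt("123", value) && value == 123);
    assert(toInt(" 42 ", value) && value == 42);
    assert(!toInt("12abc", value));     // trailing characters
    assert(!toInt("abc", value));       // no number at all
}
```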
6.5.4 Copying streams
Usually, files are copied either by reading a source file character by character or line by line. The
basic mold to process streams is as follows:
• Continuous loop:
1. read from the stream
2. if reading did not succeed (i.e., fail returns true), break from the loop
3. process the information that was read
Note that reading must precede testing, as it is only possible to know after actually attempting
to read from a file whether the reading succeeded or not. Of course, variations are possible:
getline(istream &, string &) (see section 6.5.1.1) returns an istream &, so here reading
and testing may be contracted using one expression. Nevertheless, the above mold represents the
general case. So, the following program may be used to copy cin to cout:
    #include <iostream>
    using namespace std;

    int main()
    {
        while (true)
        {
            char c;

            cin.get(c);
            if (cin.fail())
                break;
            cout << c;
        }
    }
Contraction is possible here by combining get with the if-statement, resulting in:
if (!cin.get(c))
break;
Even so, this would still follow the basic rule: ‘read first, test later’.
Simply copying a file isn’t required very often. More often a situation is encountered where a file
is processed up to a certain point, followed by plain copying the file’s remaining information. The
next program illustrates this. Its first statement uses ignore to skip the first line (for the sake of the example it is
assumed that the first line is at most 80 characters long). Its second statement uses yet another
overloaded version of the <<-operator, in which a streambuf pointer is inserted into a stream. As
the member rdbuf returns a stream’s streambuf *, we have a simple means of inserting a stream’s
contents into an ostream:
    #include <iostream>
    using namespace std;

    int main()
    {
        cin.ignore(80, '\n');   // skip the first line and...
        cout << cin.rdbuf();    // copy the rest through the streambuf *
    }
This way of copying streams only assumes the existence of a streambuf object. Consequently it can
be used with all specializations of the streambuf class.
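Since only a streambuf is required, the same technique can be tried out without using files or the standard streams. In the following sketch the function name copyAfterFirstLine is an assumption made for this illustration. Instead of assuming a maximum line length of 80 characters it passes the largest possible streamsize value to ignore.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>

// (Hypothetical helper) skips the first line of `in', then copies all
// remaining information to `out'.
void copyAfterFirstLine(std::istream &in, std::ostream &out)
{
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    out << in.rdbuf();          // insert the streambuf's remaining contents
}

void copyDemo()
{
    std::istringstream in("header\nline 1\nline 2\n");
    std::ostringstream out;

    copyAfterFirstLine(in, out);
    assert(out.str() == "line 1\nline 2\n");
}
```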
6.5.5 Coupling streams
Ostream objects can be coupled to ios objects using the tie member function. Tying results in
flushing the ostream’s buffer whenever an input or output operation is performed on the ios object
to which the ostream object is tied. By default cout is tied to cin (using cin.tie(&cout)). This
tie means that whenever an operation on cin is requested, cout is flushed first. To break the tie,
ios::tie(0) can be called. In the example: cin.tie(0).
Another useful coupling of streams is shown by the tie between cerr and cout. Because of the tie
standard output and error messages written to the screen are shown in sync with the time at which
they were generated:
    #include <iostream>
    using namespace std;

    int main()
    {
        cerr.tie(0);                    // untie
        cout << "first (buffered) line to cout ";
        cerr << "first (unbuffered) line to cerr\n";
        cout << "\n";

        cerr.tie(&cout);                // tie cout to cerr
        cout << "second (buffered) line to cout ";
        cerr << "second (unbuffered) line to cerr\n";
        cout << "\n";
    }
    /*
        Generated output:
    first (unbuffered) line to cerr
    first (buffered) line to cout
    second (buffered) line to cout second (unbuffered) line to cerr
    */
An alternative way to couple streams is to make streams use a common streambuf object. This can
be implemented using the ios::rdbuf(streambuf *) member function. This way two streams
can use, e.g. their own formatting, one stream can be used for input, the other for output, and
redirection using the stream library rather than operating system calls can be implemented. See
the next sections for examples.
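That two streams sharing a streambuf may each use their own formatting can be shown in a few lines. The following sketch lets two ostream objects write to the same memory buffer. The function name twoFormats is an assumption made for this illustration.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// (Hypothetical helper) writes `value' twice to one buffer: first through
// a stream using decimal formatting, then through a second stream, sharing
// the first stream's streambuf, using hexadecimal formatting.
std::string twoFormats(int value)
{
    std::ostringstream decOut;              // owns the streambuf
    std::ostream hexOut(decOut.rdbuf());    // second stream, same streambuf

    hexOut.setf(std::ios::hex, std::ios::basefield);

    decOut << value << ' ';                 // formatted as decimal
    hexOut << value;                        // formatted as hexadecimal

    return decOut.str();
}

void sharedBufferDemo()
{
    assert(twoFormats(255) == "255 ff");
}
```

The formatting flags belong to the stream objects, not to the streambuf. Therefore changing the flags of hexOut does not affect decOut.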
6.6 Advanced topics
6.6.1 Moving streams
Stream classes (e.g., all stream classes covered in this chapter) are movable and can be swapped.
This implies that factory functions can be designed for stream classes. Here is an example:
    #include <fstream>
    #include <string>
    #include <utility>
    using namespace std;

    ofstream out(string const &name)
    {
        ofstream ret(name);             // construct ofstream
        return ret;                     // return value optimization, but
    }                                   // OK as moving is supported

    int main()
    {
        ofstream mine(out("out"));      // return value optimizations, but
                                        // OK as moving is supported
        ofstream base("base");
        ofstream other;

        base.swap(other);               // swapping streams is OK
        other = std::move(base);        // moving streams is OK
        // other = base;                // this would fail: copy assignment
                                        // is not available for streams
    }
6.6.2 Redirecting streams
Using ios::rdbuf streams can be forced to share their streambuf objects. Thus information written
to one stream is actually written to another stream; a phenomenon normally called redirection.
Redirection is commonly implemented at the operating system level, and sometimes that is still
necessary (see section 24.2.3).
A common situation where redirection is useful is when error messages should be written to file
rather than to the standard error stream, usually indicated by its file descriptor number 2. In the
Unix operating system using the bash shell, this can be realized as follows:
program 2>/tmp/error.log
Following this command any error messages written by program are saved on the file
/tmp/error.log, instead of appearing on the screen.
Here is an example showing how this can be implemented using streambuf objects. Assume
program expects an argument defining the name of the file to write the error messages to. It could
be called as follows:
program /tmp/error.log
The program looks like this; an explanation is provided below the program’s source text:
    #include <iostream>
    #include <fstream>
    using namespace std;

    int main(int argc, char **argv)
    {
        ofstream errlog;                                // 1
        streambuf *cerr_buffer = 0;                     // 2

        if (argc == 2)
        {
            errlog.open(argv[1]);                       // 3
            cerr_buffer = cerr.rdbuf(errlog.rdbuf());   // 4
        }
        else
        {
            cerr << "Missing log filename\n";
            return 1;
        }

        cerr << "Several messages to stderr, msg 1\n";
        cerr << "Several messages to stderr, msg 2\n";

        cout << "Now inspect the contents of " <<
                argv[1] << "... [Enter] ";
        cin.get();                                      // 5

        cerr << "Several messages to stderr, msg 3\n";

        cerr.rdbuf(cerr_buffer);                        // 6
        cerr << "Done\n";                               // 7
    }
    /*
        Generated output on file argv[1]

        at cin.get():

    Several messages to stderr, msg 1
    Several messages to stderr, msg 2

        at the end of the program:

    Several messages to stderr, msg 1
    Several messages to stderr, msg 2
    Several messages to stderr, msg 3
    */
• At lines 1-2 local variables are defined: errlog is the ofstream to write the error messages
to, and cerr_buffer is a pointer to a streambuf, to point to the original cerr buffer.
• At line 3 the alternate error stream is opened.
• At line 4 redirection takes place: cerr now writes to the streambuf defined by errlog. It is
important that the original buffer used by cerr is saved, as explained below.
• At line 5 we pause. At this point, two lines were written to the alternate error file. We get a
chance to take a look at its contents: there were indeed two lines written to the file.
• At line 6 the redirection is terminated. This is very important, as the errlog object is destroyed
at the end of main. If cerr’s buffer would not have been restored, then at that point
cerr would refer to a non-existing streambuf object, which might produce unexpected results.
It is the responsibility of the programmer to make sure that an original streambuf is
saved before redirection, and is restored when the redirection ends.
• Finally, at line 7, Done is again written to the screen, as the redirection has been terminated.
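Saving and restoring the original streambuf can also be left to an object’s constructor and destructor, so that the restoration cannot be forgotten. The following is a sketch of that idea. The class name Redirect is an assumption made for this illustration; it is not a class offered by the stream library.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <streambuf>

// (Hypothetical class) redirects `stream' to `target' for as long as the
// Redirect object exists.
class Redirect
{
    std::ostream &d_stream;
    std::streambuf *d_original;     // the buffer to restore

    public:
        Redirect(std::ostream &stream, std::ostream &target)
        :
            d_stream(stream),
            d_original(stream.rdbuf(target.rdbuf()))    // save and redirect
        {}
        ~Redirect()
        {
            d_stream.rdbuf(d_original);                 // end the redirection
        }
        Redirect(Redirect const &other) = delete;
        Redirect &operator=(Redirect const &other) = delete;
};

void redirectDemo()
{
    std::streambuf *before = std::cerr.rdbuf();
    std::ostringstream log;         // receives the error messages
    {
        Redirect guard(std::cerr, log);
        std::cerr << "msg 1\n";     // ends up in log
    }                               // guard's destructor restores cerr

    assert(log.str() == "msg 1\n");
    assert(std::cerr.rdbuf() == before);
}
```

The Redirect object must cease to exist before the stream receiving the redirected information does. In redirectDemo this is realized by defining guard in a nested compound statement.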
6.6.3 Reading AND Writing streams
In order to both read and write to a stream a std::fstream object must be created. As with
ifstream and ofstream objects, its constructor receives the name of the file to be opened:
fstream inout("iofile", ios::in | ios::out);
Note the use of the constants ios::in and ios::out, indicating that the file must be opened
for both reading and writing. Multiple mode indicators may be used, concatenated by the bitor
operator. Alternatively, instead of ios::out, ios::app could have been used and mere writing
would become appending (at the end of the file).
Reading and writing to the same file is always a bit awkward: what to do when the file may not yet
exist, but if it already exists it should not be rewritten? Having fought with this problem for some
time I now use the following approach:
    #include <fstream>
    #include <iostream>
    #include <string>
    using namespace std;

    int main()
    {
        fstream rw("fname", ios::out | ios::in);

        if (!rw)                    // file didn't exist yet
        {
            rw.clear();             // try again, creating it using ios::trunc
            rw.open("fname", ios::out | ios::trunc | ios::in);
        }

        if (!rw)                    // can't even create it: bail out
        {
            cerr << "Opening 'fname' failed miserably" << '\n';
            return 1;
        }

        cerr << "We're at: " << rw.tellp() << '\n';
                                    // write something
        rw << "Hello world" << '\n';

        rw.seekg(0);                // go back and read what's written

        string s;
        getline(rw, s);

        cout << "Read: " << s << '\n';
    }
Under this approach, if the first construction attempt fails it is assumed that fname doesn’t exist yet. In that
case open is attempted once more, now using the ios::trunc flag. If the file already existed, the construction would
have succeeded. By specifying ios::ate when defining rw, the initial read/write action would by
default have taken place at EOF.
Under DOS-like operating systems that use the multiple character sequence \r\n to separate lines
in text files the flag ios::binary is required to process binary files ensuring that \r\n combinations
are processed as two characters. In general, ios::binary should be specified when binary
(non-text) files are to be processed. By default files are opened as text files. Unix operating systems
do not distinguish text files from binary files.
With fstream objects, combinations of file flags are used to make sure that a stream is or is not
(re)created empty when opened. See section 6.4.2.1 for details.
Once a file has been opened in read and write mode, the << operator can be used to insert information
into the file, while the >> operator may be used to extract information from the file. These
operations may be performed in any order, but a seekg or seekp operation is required when switching
between insertions and extractions. The seek operation is used to activate the stream’s data
used for reading or those used for writing (and vice versa). The istream and ostream parts of
fstream objects share the stream’s data buffer and by performing the seek operation the stream
either activates its istream or its ostream part. If the seek is omitted, reading after writing and
writing after reading simply fails. The example shows a white space delimited word being read from
a file, writing another string to the file, just beyond the point where the just read word terminated.
Finally yet another string is read which is found just beyond the location where the just written
strings ended:
fstream f("filename", ios::in | ios::out);
string str;
f >> str; // read the first word
// write a well known text
f.seekg(0, ios::cur);
f << "hello world";
f.seekp(0, ios::cur);
f >> str; // and read again
Since a seek or clear operation is required when alternating between read and write (extraction and
insertion) operations on the same file it is not possible to execute a series of << and >> operations
in one expression statement.
Of course, random insertions and extractions are hardly ever used. Generally, insertions and extractions
occur at well-known locations in a file. In those cases, the position where insertions or
extractions are required can be controlled and monitored by the seekg, seekp, tellg and tellp
members (see sections 6.4.1.2 and 6.5.1.2).
Error conditions (see section 6.3.1) occurring due to, e.g., reading beyond end of file, reaching end
of file, or positioning before begin of file, can be cleared by the clear member function. Following
clear processing may continue. E.g.,
fstream f("filename", ios::in | ios::out);
string str;
f.seekg(-10); // this fails, but...
f.clear(); // processing f continues
f >> str; // read the first word
A situation where files are both read and written is seen in database applications, using files consisting
of records having fixed sizes, and where locations and sizes of pieces of information are known.
For example, the following program adds text lines to a (possibly existing) file. It can also be used to
retrieve a particular line, given its order-number in the file. A binary file index allows for the quick
retrieval of the location of lines.
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <climits>
    using namespace std;

    void err(char const *msg)
    {
        cout << msg << '\n';
    }

    void err(char const *msg, long value)
    {
        cout << msg << value << '\n';
    }

    void read(fstream &index, fstream &strings)
    {
        int idx;

        if (!(cin >> idx))                          // read index
        {
            cin.clear();                            // allow reading again
            cin.ignore(INT_MAX, '\n');              // skip the line
            return err("line number expected");
        }

        index.seekg(idx * sizeof(long));            // go to index-offset

        long offset;

        if
        (
            !index.read                             // read the line-offset
            (
                reinterpret_cast<char *>(&offset),
                sizeof(long)
            )
        )
            return err("no offset for line", idx);

        if (!strings.seekg(offset))                 // go to the line's offset
            return err("can't get string offset ", offset);

        string line;

        if (!getline(strings, line))                // read the line
            return err("no line at ", offset);

        cout << "Got line: " << line << '\n';       // show the line
    }

    void write(fstream &index, fstream &strings)
    {
        string line;

        if (!getline(cin, line))                    // read the line
            return err("line missing");

        strings.seekp(0, ios::end);                 // to strings
        index.seekp(0, ios::end);                   // to index

        long offset = strings.tellp();

        if
        (
            !index.write                            // write the offset to index
            (
                reinterpret_cast<char *>(&offset),
                sizeof(long)
            )
        )
            return err("Writing failed to index: ", offset);

        if (!(strings << line << '\n'))             // write the line itself
            return err("Writing to 'strings' failed");
                                                    // confirm writing the line
        cout << "Write at offset " << offset << " line: " << line << '\n';
    }

    int main()
    {
        fstream index("index", ios::trunc | ios::in | ios::out);
        fstream strings("strings", ios::trunc | ios::in | ios::out);

        cout << "enter 'r <number>' to read line <number> or "
                "'w <line>' to write a line\n"
                "or enter 'q' to quit.\n";

        while (true)
        {
            cout << "r <nr>, w <line>, q ? ";       // show prompt

            index.clear();
            strings.clear();

            string cmd;
            cin >> cmd;                             // read cmd

            if (cmd == "q")                         // process the cmd.
                return 0;

            if (cmd == "r")
                read(index, strings);
            else if (cmd == "w")
                write(index, strings);
            else if (cin.eof())
            {
                cout << "\n"
                        "Unexpected end-of-file\n";
                return 1;
            }
            else
                cout << "Unknown command: " << cmd << '\n';
        }
    }
Another example showing reading and writing of files is provided by the next program. It also
illustrates the processing of NTB strings:
    #include <iostream>
    #include <fstream>
    using namespace std;

    int main()
    {                                       // r/w the file
        fstream f("hello", ios::in | ios::out | ios::trunc);

        f.write("hello", 6);                // write 2 NTB strings
        f.write("hello", 6);

        f.seekg(0, ios::beg);               // reset to begin of file

        char buffer[100];                   // or: char *buffer = new char[100]
        char c;
                                            // read the first 'hello'
        cout << f.get(buffer, sizeof(buffer), 0).tellg() << '\n';
        f >> c;                             // read the NTB delim
                                            // and read the second 'hello'
        cout << f.get(buffer + 6, sizeof(buffer) - 6, 0).tellg() << '\n';

        buffer[5] = ' ';                    // change asciiz to ' '
        cout << buffer << '\n';             // show 2 times 'hello'
    }
    /*
        Generated output:
    5
    11
    hello hello
    */
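When the NTB strings are to be stored in std::string objects, the function std::getline can be used as well, specifying a 0-valued character as its delimiter. The following sketch illustrates this alternative. The function name readNtbStrings is an assumption made for this illustration, and a stringstream is used instead of a file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// (Hypothetical helper) reads all 0-terminated strings that are available
// in `in'. The 0-valued delimiters themselves are not stored.
std::vector<std::string> readNtbStrings(std::istream &in)
{
    std::vector<std::string> result;
    std::string item;

    while (std::getline(in, item, '\0'))    // read first, test later
        result.push_back(item);

    return result;
}

void ntbDemo()
{
    std::stringstream data;

    data.write("hello", 6);                 // write 2 NTB strings
    data.write("world", 6);

    std::vector<std::string> items = readNtbStrings(data);

    assert(items.size() == 2);
    assert(items[0] == "hello");
    assert(items[1] == "world");
}
```

Here the delimiter is removed by getline, so a separate extraction of the delimiter is not required.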
A completely different way to read and write streams may be implemented using streambuf members.
All considerations mentioned so far remain valid (e.g., before a read operation following a write
operation seekg must be used). When streambuf objects are used, either an istream is associated
with the streambuf object of another ostream object, or an ostream object is associated with
the streambuf object of another istream object. Here is the previous program again, now using
associated streams:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
void err(char const *msg); // see earlier example
void err(char const *msg, long value);
void read(istream &index, istream &strings)
{
index.clear();
strings.clear();
// insert the body of the read() function of the earlier example
}
void write(ostream &index, ostream &strings)
{
index.clear();
strings.clear();
// insert the body of the write() function of the earlier example
}
int main()
{
ifstream index_in("index", ios::trunc | ios::in | ios::out);
ifstream strings_in("strings", ios::trunc | ios::in | ios::out);
ostream index_out(index_in.rdbuf());
ostream strings_out(strings_in.rdbuf());
cout << "enter 'r <number>' to read line <number> or "
"'w <line>' to write a line\n"
"or enter 'q' to quit.\n";
while (true)
{
cout << "r <nr>, w <line>, q ? "; // show prompt
string cmd;
cin >> cmd; // read cmd
if (cmd == "q") // process the cmd.
return 0;
if (cmd == "r")
read(index_in, strings_in);
else if (cmd == "w")
write(index_out, strings_out);
else
cout << "Unknown command: " << cmd << '\n';
}
}
In this example
• the streams associated with the streambuf objects of existing streams are not ifstream or
ofstream objects but basic istream and ostream objects.
• The streambuf object does not have to be defined by an ifstream or ofstream object. It may also be defined
outside of the streams, using a filebuf (cf. section 14.8.2) and constructions like:
filebuf fb; // a filebuf is opened using its open member
fb.open("index", ios::in | ios::out | ios::trunc);
istream index_in(&fb);
ostream index_out(&fb);
• An ifstream object can be constructed using stream modes normally used with ofstream
objects. Conversely, an ofstream object can be constructed using stream modes normally
used with ifstream objects.
• If istream and ostreams share a streambuf, then their read and write pointers (should)
point to the shared buffer: they are tightly coupled.
• The advantage of using an external (separate) streambuf over a predefined fstream object
is (of course) that it opens the possibility of using stream objects with specialized streambuf
objects. These streambuf objects may specifically be constructed to control and interface particular
devices. Elaborating this (see also section 14.8) is left as an exercise to the reader.
Chapter 7
Classes
The C programming language offers two methods for structuring data of different types. The C
struct holds data members of various types, and the C union also defines data members of various
types. However, a union’s data members all occupy the same location in memory and the
programmer may decide on which one to use.
In this chapter classes are introduced. A class is a kind of struct, but its contents are by default
inaccessible to the outside world, whereas the contents of a C++ struct are by default accessible to
the outside world. In C++ structs find little use: they are mainly used to aggregate data within the
context of classes or to define elaborate return values. Often a C++ struct merely contains plain
old data (POD, cf. section 9.9). In C++ the class is the main data structuring device, by default
enforcing two core concepts of current-day software engineering: data hiding and encapsulation (cf.
sections 3.2.1 and 7.1.1).
The union is another data structuring device the language offers. The traditional C union is still
available, but C++ also offers unrestricted unions. Unrestricted unions are unions whose data fields
may be of class types. The C++ Annotations covers these unrestricted unions in section 12.6, after
having introduced several other new concepts of C++.
C++ extends the C struct and union concepts by allowing the definition of member functions
(introduced in this chapter) within these data types. Member functions are functions that can only
be used with objects of these data types or within the scope of these data types. Some of these
member functions are special in that they are always, usually automatically, called when an object
starts its life (the so-called constructor) or ends its life (the so-called destructor). These and other
types of member functions, as well as the design and construction of, and philosophy behind, classes
are introduced in this chapter.
We step-by-step construct a class Person, which could be used in a database application to store a
person’s name, address and phone number.
Let’s start by creating a class Person right away. From the outset, it is important to make the
distinction between the class interface and its implementation. A class may loosely be defined as ‘a
set of data and all the functions operating on those data’. This definition is later refined but for now
it is sufficient to get us started.
A class interface is a definition, defining the organization of objects of that class. Normally a definition
results in memory reservation. E.g., when defining int variable the compiler ensures that
some memory is reserved in the final program storing variable’s values. Although it is a definition
no memory is set aside by the compiler once it has processed the class definition. But a class definition
follows the one definition rule: in C++ entities may be defined only once. As a class definition
does not imply that memory is being reserved the term class interface is preferred instead.
Class interfaces are normally contained in a class header file, e.g., person.h. We’ll start our class
Person interface here (cf. section 7.7 for an explanation of the const keywords behind some of the
class’s member functions):
#include <string>
class Person
{
std::string d_name; // name of person
std::string d_address; // address field
std::string d_phone; // telephone number
size_t d_mass; // the mass in kg.
public: // member functions
void setName(std::string const &name);
void setAddress(std::string const &address);
void setPhone(std::string const &phone);
void setMass(size_t mass);
std::string const &name() const;
std::string const &address() const;
std::string const &phone() const;
size_t mass() const;
};
The member functions that are declared in the interface must still be implemented. The implementation
of these members is properly called their definition.
In addition to the member functions a class defines the data manipulated by the member functions.
These data are called the data members. In Person they are d_name, d_address,
d_phone and d_mass. Data members should be given private access rights. Since the class uses
private access rights by default they may simply be listed at the top of the interface.
All communication between the outer world and the class data is routed through the class’s member
functions. Data members may receive new values (e.g., using setName) or they may be retrieved
for inspection (e.g., using name). Functions merely returning values stored inside the object, not
allowing the caller to modify these internally stored values, are called accessors.
Syntactically there is only a marginal difference between a class and a struct. Classes by default
define private members, structs define public members. Conceptually, though, there are differences.
In C++ structs are used in the way they are used in C: to aggregate data, which are all freely
accessible. Classes, on the other hand, hide their data from access by the outside world (which is
aptly called data hiding) and offer member functions to define the communication between the outer
world and the class’s data members.
Following Lakos (Lakos, J., 2001) Large-Scale C++ Software Design (Addison-Wesley) I suggest
the following setup of class interfaces:
• All data members have private access rights, and are placed at the top of the interface.
• All data members start with d_, followed by a name suggesting their meaning (in chapter 8
we’ll also encounter data members starting with s_).
• Non-private data members do exist, but one should be hesitant to define non-private access
rights for data members (see also chapter 13).
• Two broad categories of member functions are manipulators and accessors. Manipulators allow
the users of objects to modify the internal data of the objects. By convention, manipulators
start with set. E.g., setName.
• With accessors, a get-prefix is still frequently encountered, e.g., getName. However, following
the conventions promoted by the Qt (see http://www.trolltech.com) Graphical User
Interface Toolkit, the get-prefix is now deprecated. So, rather than defining the member
getAddress, it should simply be named address.
• Normally (exceptions exist) the public member functions of a class are listed first, immediately
following the class’s data members. They are the important elements of the interface as they
define the features the class is offering to its users. It’s a matter of convention to list them high
up in the interface. The keyword private is needed beyond the public members to switch back
from public members to private access rights which nicely separates the members that may be
used ‘by the general public’ from the class’s own support members.
Style conventions usually take a long time to develop. There is nothing obligatory about them,
though. I suggest that readers who have compelling reasons not to follow the above style conventions
use their own. All others are strongly advised to adopt the above style conventions.
Finally, recall from section 3.1.2 that
using namespace std;
must be used in most (if not all) examples of source code. As explained in sections 7.11 and 7.11.1
the using directive should follow the preprocessor directive(s) including the header files, using a
setup like the following:
#include <iostream>
#include "person.h"
using namespace std;
int main()
{
...
}
7.1 The constructor
C++ classes may contain two special categories of member functions which are essential to the
proper working of the class. These categories are the constructors and the destructor. The destructor’s
primary task is to return memory allocated by an object to the common pool when an object
goes ‘out of scope’. Allocation of memory is discussed in chapter 9, and destructors are therefore
discussed in depth in that chapter. In this chapter the emphasis is on the class’s organization and
its constructors.
Constructors are recognized by their name, which is equal to the class name. Constructors do
not specify return values, not even void. E.g., the class Person may define a constructor
Person::Person(). The C++ run-time system ensures that the constructor of a class is called
when a variable of the class is defined. It is possible to define a class lacking any constructor. In that
case the compiler defines a default constructor that is called when an object of that class is defined.
What actually happens in that case depends on the data members that are defined by that class (cf.
section 7.3.1).
Objects may be defined locally or globally. However, in C++ most objects are defined locally. Globally
defined objects are hardly ever required and are somewhat deprecated.
When a local object is defined its constructor is called every time the function is called. The object’s
constructor is activated at the point where the object is defined (a subtlety is that an object may be
defined implicitly as, e.g., a temporary variable in an expression).
When an object is defined as a global object it is constructed when the program starts. In this case
its constructor is called even before the function main starts. Example:
#include <iostream>
using namespace std;
class Demo
{
public:
Demo();
};
Demo::Demo()
{
cout << "Demo constructor called\n";
}
Demo d;
int main()
{}
/*
Generated output:
Demo constructor called
*/
The program contains one global object of the class Demo with main having an empty body. Nonetheless,
the program produces some output generated by the constructor of the globally defined Demo
object.
Constructors have a very important and well-defined role. They must ensure that all the class’s
data members have sensible or at least well-defined values once the object has been constructed.
We’ll get back to this important task shortly. The default constructor has no argument. It is defined
by the compiler unless another constructor is defined and unless its definition is suppressed (cf.
section 7.6). If a default constructor is required in addition to another constructor then the default
constructor must explicitly be defined as well. C++ provides special syntax to do that as well, which
is also covered by section 7.6.
7.1.1 A first application
Our example class Person has three string data members and a size_t d_mass data member.
Access to these data members is controlled by interface functions.
Whenever an object is defined the class’s constructor(s) ensure that its data members are given ‘sensible’
values. Thus, objects never suffer from uninitialized values. Data members may be given new
values, but that should never be directly allowed. It is a core principle (called data hiding) of good
class design that its data members are private. The modification of data members is therefore fully
controlled by member functions and thus, indirectly, by the class-designer. The class encapsulates all
actions performed on its data members and due to this encapsulation the class object may assume
the ‘responsibility’ for its own data-integrity. Here is a minimal definition of Person’s manipulating
members:
#include "person.h" // given earlier
using namespace std;
void Person::setName(string const &name)
{
d_name = name;
}
void Person::setAddress(string const &address)
{
d_address = address;
}
void Person::setPhone(string const &phone)
{
d_phone = phone;
}
void Person::setMass(size_t mass)
{
d_mass = mass;
}
It’s a minimal definition in that no checks are performed. But it should be clear that checks are easy
to implement. E.g., to ensure that a phone number only contains digits one could define:
void Person::setPhone(string const &phone)
{
if (phone.find_first_not_of("0123456789") == string::npos)
d_phone = phone;
else
cout << "A phone number may only contain digits\n";
}
Similarly, access to the data members is controlled by encapsulating accessor members. Accessors
ensure that data members cannot suffer from uncontrolled modifications. Since accessors conceptually
do not modify the object’s data (but only retrieve the data) these member functions are given
the predicate const. They are called const member functions, which, as they are guaranteed not to
modify their object’s data, are available to both modifiable and constant objects (cf. section 7.7).
To prevent backdoors we must also make sure that the data member is not modifiable through
an accessor’s return value. For values of built-in primitive types that’s easy, as they are usually
returned by value, which are copies of the values found in variables. But since objects may be fairly
large making copies is usually avoided by returning objects by reference. A backdoor is created
by returning a data member by reference, as in the following example, showing the allowed abuse
below the function definition:
string &Person::name() // returns a modifiable reference
{
return d_name;
}
Person somebody;
somebody.setName("Nemo");
somebody.name() = "Eve"; // Oops, backdoor changing the name
To prevent the backdoor objects are returned as const references from accessors. Here are the implementations
of Person’s accessors:
#include "person.h" // given earlier
using namespace std;
string const &Person::name() const
{
return d_name;
}
string const &Person::address() const
{
return d_address;
}
string const &Person::phone() const
{
return d_phone;
}
size_t Person::mass() const
{
return d_mass;
}
The Person class interface remains the starting point for the class design: its member functions
define what can be asked of a Person object. In the end the implementation of its members merely
is a technicality allowing Person objects to do their jobs.
The next example shows how the class Person may be used. An object is initialized and passed to a
function printperson(), printing the person’s data. Note the reference operator in the parameter
list of the function printperson. Only a reference to an existing Person object is passed to the
function, rather than a complete object. The fact that printperson does not modify its argument
is evident from the fact that the parameter is declared const.
#include <iostream>
#include "person.h" // given earlier
using namespace std;
void printperson(Person const &p)
{
cout << "Name : " << p.name() << "\n"
"Address : " << p.address() << "\n"
"Phone : " << p.phone() << "\n"
"Mass : " << p.mass() << '\n';
}
int main()
{
Person p;
p.setName("Linus Torvalds");
p.setAddress("E-mail: [email protected]");
p.setPhone(" - not sure - ");
p.setMass(75); // kg.
printperson(p);
}
/*
Produced output:
Name : Linus Torvalds
Address : E-mail: [email protected]
Phone : - not sure -
Mass : 75
*/
7.1.2 Constructors: with and without arguments
The class Person’s constructor so far has no parameters. C++ allows constructors to be defined with
or without parameter lists. The arguments are supplied when an object is defined.
For the class Person a constructor expecting three strings and a size_t might be useful, representing,
respectively, the person’s name, address, phone number and mass. This constructor is (but
see also section 7.3.1):
Person::Person(string const &name, string const &address,
string const &phone, size_t mass)
{
d_name = name;
d_address = address;
d_phone = phone;
d_mass = mass;
}
It must of course also be declared in the class interface:
class Person
{
// data members (not altered)
public:
Person(std::string const &name, std::string const &address,
std::string const &phone, size_t mass);
// rest of the class interface (not altered)
};
Now that this constructor has been declared, the default constructor must explicitly be declared as
well if we still want to be able to construct a plain Person object without any specific initial values
for its data members. The class Person would thus support two constructors, and the part declaring
the constructors now becomes:
class Person
{
// data members
public:
Person();
Person(std::string const &name, std::string const &address,
std::string const &phone, size_t mass);
// additional members
};
In this case, the default constructor doesn’t have to do very much, as it doesn’t have to initialize the
string data members of the Person object. As these data members are objects themselves, they are
initialized to empty strings by their own default constructor. However, there is also a size_t data
member. That member is a variable of a built-in type and such variables do not have constructors
and so are not initialized automatically. Therefore, unless the value of the d_mass data member is
explicitly initialized its value is:
• a random value for local Person objects;
• 0 for global and static Person objects.
The 0-value might not be too bad, but normally we don’t want a random value for our data members.
So, even the default constructor has a job to do: initializing the data members which are not
initialized to sensible values automatically. Its implementation can be:
Person::Person()
{
d_mass = 0;
}
Using constructors with and without arguments is illustrated next. The object karel is initialized
by the constructor defining a non-empty parameter list while the default constructor is used with
the anon object:
int main()
{
Person karel("Karel", "Rietveldlaan 37", "542 6044", 70);
Person anon;
}
The two Person objects are defined when main starts as they are local objects, living only for as
long as main is active.
If Person objects must be definable using other arguments, corresponding constructors must be
added to Person’s interface. Apart from overloading class constructors it is also possible to provide
constructors with default argument values. These default arguments must be specified with the
constructor declarations in the class interface, like so:
class Person
{
public:
Person(std::string const &name,
std::string const &address = "--unknown--",
std::string const &phone = "--unknown--",
size_t mass = 0);
};
Often, constructors use highly similar implementations. This results from the fact that the constructor’s
parameters are often defined for convenience: a constructor not requiring a phone number but
requiring a mass cannot be defined using default arguments, since phone is not the constructor’s
last parameter. Consequently a special constructor is required not having phone in its parameter
list.
Before the C++11 standard this situation was commonly handled like this: all constructors must initialize
their reference and const data members, or the compiler (rightfully) complains. To initialize
the remaining members (non-const and non-reference members) we have two options:
• If the body of the construction process is sizeable but (parameterizable) identical to other constructors'
bodies then factorize. Define a private member init which is called by the constructors
to provide the object’s data members with their appropriate values.
• If the constructors act fundamentally differently, then there’s nothing to factorize and each
constructor must be implemented by itself.
Currently, C++ allows constructors to call each other (called constructor delegation). This is illustrated
in section 7.4.1 below.
7.1.2.1 The order of construction
The possibility to pass arguments to constructors allows us to monitor the construction order of
objects during program execution. This is illustrated by the next program using a class Test. The
program defines a global Test object and two local Test objects. The order of construction is as
expected: first global, then main’s first local object, then func’s local object, and then, finally, main’s
second local object:
#include <iostream>
#include <string>
using namespace std;
class Test
{
public:
Test(string const &name); // constructor with an argument
};
Test::Test(string const &name)
{
cout << "Test object " << name << " created" << '\n';
}
Test globaltest("global");
void func()
{
Test functest("func");
}
int main()
{
Test first("main first");
func();
Test second("main second");
}
/*
Generated output:
Test object global created
Test object main first created
Test object func created
Test object main second created
*/
7.2 Ambiguity resolution
Defining objects may result in some unexpected surprises. Assume the following class interface is
available:
class Data
{
public:
Data();
Data(int one);
Data(int one, int two);
void display();
};
The intention is to define two objects of the class Data, using, respectively, the first and second
constructors. Your code looks like this (and compiles correctly):
#include "data.h"
int main(int argc, char **argv)
{
Data d1();
Data d2(argc);
}
Now it’s time to make some good use of the Data objects. You add two statements to main:
d1.display();
d2.display();
But, surprise, the compiler complains about the first of these two:
error: request for member ’display’ in ’d1’, which is of non-class type ’Data()’
What’s going on here? First of all, notice the data type the compiler refers to: Data(), rather than
Data. What are those () doing there?
Before answering that question, let’s broaden our story somewhat. We know that somewhere in
a library a factory function dataFactory exists. A factory function creates and returns an object
of a certain type. This dataFactory function returns a Data object, constructed using Data’s
default constructor. Hence, dataFactory needs no arguments. We want to use dataFactory in
our program, but must declare the function. So we add the declaration to main, as that’s the only
location where dataFactory will be used. It’s a function, not requiring arguments, returning a
Data object:
Data dataFactory();
This, however, looks remarkably similar to our d1 object definition:
Data d1();
We found the source of our problem: Data d1() apparently is not the definition of a d1 object, but
the declaration of a function, returning a Data object. So, what’s happening here and how should
we define a Data object using Data’s default constructor?
First: what’s happening here is that the compiler, when confronted with Data d1(), actually had a
choice. It could either define a Data object, or declare a function. It declares a function.
In fact, we’re encountering an ambiguity in C++’s syntax here, which is solved, according to the
language’s standard, by always letting a declaration prevail over a definition. We’ll encounter more
situations where this ambiguity occurs later on in this section.
Second: there are several ways we can solve this ambiguity the way we want it to be solved. To
define an object using its default constructor:
• merely mention it (like int x): Data d1;
• use the curly-brace initialization: Data d1{};
• use the = sign to initialize it from an anonymous default constructed Data object: Data d1 =
Data().
7.2.1 Types ‘Data’ vs. ‘Data()’
Data(), which in the above context defines a default constructed anonymous Data object, takes us
back to the compiler error. According to the compiler, our original d1 apparently was not of type
Data, but of type Data(). So what’s that?
Let’s first have a look at our second constructor. It expects an int. We would like to define another
Data object, using the second constructor, but want to pass the default int value to the constructor,
using int(). We know this defines a default int value, as cout << int() << '\n' nicely
displays 0, and int x = int() also initializes x to 0. So we define ‘Data di(int())’ in main.
Not good: again the compiler complains when we try to use di. After ‘di.display()’ the compiler
tells us:
error: request for member ’display’ in ’di’, which is of non-class type ’Data(int (*)())’
Oops, not as expected.... Didn’t we pass 0? Why the sudden pointer? It’s that same ‘use a declaration
when possible’ strategy again. The notation Type() not only represents the default value of type
Type, but it’s also a shorthand notation for an anonymous pointer to a function, not expecting arguments,
and returning a Type value, which you can verify by defining ‘int (*ip)() = nullptr’,
and passing ip as argument to di: di(ip) compiles fine.
So why doesn’t the error occur when inserting int() or assigning int() to int x? In these latter
cases nothing is declared. Rather, cout and int x = need expressions determining values, which
is provided by int()’s ‘natural’ interpretation. But with ‘Data di(int())’ the compiler again
has a choice, and (by design) it chooses a declaration because the declaration takes priority. Now
int()’s interpretation as an anonymous pointer is available and therefore used.
Likewise, if int x has been defined, ‘Data b1(int(x))’ declares b1 as a function, expecting an
int (as int(x) represents a type), while ‘Data b2((int)x)’ defines b2 as a Data object, using
the constructor expecting a single int value.
7.2.2 Superfluous parentheses
Let’s play some more. At some point in our program we defined int b. Then, in a compound statement
we need to construct an anonymous Data object, initialized using b, followed by displaying
b:
int b = 18;
{
Data(b);
cout << b;
}
About that cout statement the compiler tells us (I modified the error message to reveal its meaning):
error: cannot bind ‘std::ostream & << Data const &’
Here we didn’t insert int b but Data b. Had we omitted the compound statement, the compiler
would have complained about a doubly defined b entity, as Data(b) simply means Data b, a Data
object constructed by default. The compiler may omit superfluous parentheses when parsing a definition
or declaration.
Of course, the question now becomes how a temporary object Data, initialized with int b can be
defined. Remember that the compiler may remove superfluous parentheses. So, what we need to do
is to pass an int to the anonymous Data object, without using the int’s name.
• We can use a cast: Data(static_cast<int>(b));
• We can use a curly-brace initialization: Data {b}.
Values and types make big differences. Consider the following definitions:
Data (*d4)(int); // 1
Data (*d5)(3); // 2
Definition 1 should cause no problems: it’s a pointer to a function, expecting an int, returning a
Data object. Hence, d4 is a pointer variable.
Definition 2 is slightly more complex. Yes, it’s a pointer. But it has nothing to do with a function.
So what’s that argument list containing 3 doing there? Well, it’s not an argument list. It’s an
initialization that looks like an argument list. Remember that variables can be initialized using the
= sign, by parentheses or by curly braces. So instead of ‘(3)’ we could have
written ‘= 3’ or ‘{3}’. Let’s pick the first alternative, resulting in:
Data (*d5) = 3;
Now we get to ‘play compiler’ again. Removing some superfluous parentheses we get:
Data *d5 = 3;
It’s a pointer to a Data object, initialized to 3 (semantically incorrect, but that’s only clear after the
syntactical analysis. If I had initially written
Data (*d5)(&d1); // 2
the fun resulting from contrasting int and 3 would most likely have been spoiled).
7.2.3 Existing types
Once a type name has been defined it also prevails over identifiers representing variables, if the
compiler is given a choice. This, too, can result in interesting constructions.
Assume a function process expecting an int exists in a library. We want to use this function to
process some int data values. So in main process is declared and called:
int process(int Data);
process(argc);
No problems here. But unfortunately we once decided to ‘beautify’ our code, by throwing in some
superfluous parentheses, like so:
int process(int (Data));
process(argc);
Now we’re in trouble. The compiler now generates an error, caused by its rule to let declarations
prevail over definitions. Data now becomes the name of the class Data, and analogous to int
(x) the parameter int (Data) is parsed as int (*)(Data): a pointer to a function, expecting a
Data object, returning an int.
Here is another example. When, instead of declaring
int process(int Data[10]);
we declare, e.g., to emphasize the fact that an array is passed to process:
int process(int (Data[10]));
the process function does not expect a pointer to int values, but a pointer to a function expecting
a pointer to Data elements, returning an int.
To summarize the findings in the ‘Ambiguity Resolution’ section:
• The compiler will try to remove superfluous parentheses;
• But if the parenthesized construction represents a type, it will try to use the type;
• More in general: when possible the compiler will interpret a syntactic construction as a declaration,
rather than as a definition (of an object or variable).
7.3 Objects inside objects: composition
In the class Person objects are used as data members. This construction technique is called composition.
Composition is neither extraordinary nor C++ specific: in C a struct or union field is commonly
used in other compound types. In C++ it requires some special thought as their initialization sometimes
is subject to restrictions, as discussed in the next few sections.
7.3.1 Composition and const objects: const member initializers
Unless specified otherwise object data members of classes are initialized by their default constructors.
Using the default constructor might not always be the optimal way to initialize an object and it
might not even be possible: a class might simply not define a default constructor.
Earlier we’ve encountered the following constructor of the Person:
Person::Person(string const &name, string const &address,
string const &phone, size_t mass)
{
d_name = name;
d_address = address;
d_phone = phone;
d_mass = mass;
}
Think briefly about what is going on in this constructor. In the constructor’s body we encounter
assignments to string objects. Since assignments are used in the constructor’s body their left-hand
side objects must exist. But when objects are coming into existence constructors must have been
called. The initialization of those objects is thereupon immediately undone by the body of Person’s
constructor. That is not only inefficient but sometimes downright impossible. Assume that the class
interface mentions a string const data member: a data member whose value is not supposed
to change at all (like a birthday, which usually doesn’t change very much and is therefore a good
candidate for a string const data member). Constructing a birthday object and providing it with
an initial value is OK, but changing the initial value isn’t.
The body of a constructor allows assignments to data members. The initialization of data members
happens before that. C++ defines the member initializer syntax allowing us to specify the way
data members are initialized at construction time. Member initializers are specified as a list of
constructor specifications between a colon following a constructor’s parameter list and the opening
curly brace of a constructor’s body, as follows:
Person::Person(string const &name, string const &address,
string const &phone, size_t mass)
:
d_name(name),
d_address(address),
d_phone(phone),
d_mass(mass)
{}
Member initialization always occurs when objects are composed in classes: if no constructors are
mentioned in the member initializer list the default constructors of the objects are called. Note that
this only holds true for objects. Data members of primitive data types are not initialized automatically.
Member initialization can, however, also be used for primitive data members, like int and double.
The above example shows the initialization of the data member d_mass from the parameter mass.
When member initializers are used the data member could even have the same name as the constructor’s
parameter (although this is discouraged) as there is no ambiguity and the first (left) identifier
used in a member initializer is always a data member that is initialized whereas the identifier
between parentheses is interpreted as the parameter.
The order in which class type data members are initialized is defined by the order in which those
members are defined in the composing class interface. If the order of the initialization in the constructor
differs from the order in the class interface, the compiler may issue a warning, and it performs the
initialization in the order of the class interface regardless.
Member initializers should be used as often as possible. As shown it may be required to use them
(e.g., to initialize const data members, or to initialize objects of classes lacking default constructors)
but not using member initializers also results in inefficient code as the default constructor of a data
member is always automatically called unless an explicit member initializer is specified. Reassignment
in the constructor’s body following default construction is then clearly inefficient. Of course,
sometimes it is fine to use the default constructor, but in those cases the explicit member initializer
can be omitted.
As a rule of thumb: if a value is assigned to a data member in the constructor’s body then try to
avoid that assignment in favor of using a member initializer.
7.3.2 Composition and reference objects: reference member initializers
Apart from using member initializers to initialize composed objects (be they const objects or not),
there is another situation where member initializers must be used. Consider the following situation.
A program uses an object of the class Configfile, defined in main to access the information in a
configuration file. The configuration file contains parameters of the program which may be set by
changing the values in the configuration file, rather than by supplying command line arguments.
Assume another object used in main is an object of the class Process, doing ‘all the work’. What
possibilities do we have to tell the object of the class Process that an object of the class Configfile
exists?
• The objects could have been declared as global objects. This is a possibility, but not a very good
one, since all the advantages of local objects are lost.
• The Configfile object may be passed to the Process object at construction time. Bluntly
passing an object (i.e., by value) might not be a very good idea, since the object must be copied
into the Configfile parameter, and then a data member of the Process class can be used to
make the Configfile object accessible throughout the Process class. This might involve yet
another object-copying task, as in the following situation:
Process::Process(Configfile conf) // a copy from the caller
{
d_conf = conf; // copying to d_conf member
}
• The copy-instructions can be avoided if pointers to the Configfile objects are used, as in:
Process::Process(Configfile *conf) // pointer to external object
{
d_conf = conf; // d_conf is a Configfile *
}
This construction as such is OK, but forces us to use the ‘->’ field selector operator, rather
than the ‘.’ operator, which is (disputably) awkward. Conceptually one tends to think of the
Configfile object as an object, and not as a pointer to an object. In C this would probably
have been the preferred method, but in C++ we can do better.
• Rather than using value or pointer parameters, the Configfile parameter could be defined
as a reference parameter of Process’s constructor. Next, use a Configfile reference data member
in the class Process.
But a reference variable cannot be initialized using an assignment, and so the following is incorrect:
Process::Process(Configfile &conf)
{
d_conf = conf; // wrong: no assignment
}
The statement d_conf = conf fails, because it is not an initialization, but an assignment of one
Configfile object (i.e., conf), to another (d_conf). An assignment to a reference variable is
actually an assignment to the variable the reference variable refers to. But which variable does
d_conf refer to? To no variable at all, since we haven’t initialized d_conf. After all, the whole
purpose of the statement d_conf = conf was to initialize d_conf....
How to initialize d_conf? We once again use the member initializer syntax. Here is the correct way
to initialize d_conf:
Process::Process(Configfile &conf)
:
d_conf(conf) // initializing reference member
{}
The above syntax must be used in all cases where reference data members are used. E.g., if d_ir
would have been an int reference data member, a construction like
Process::Process(int &ir)
:
d_ir(ir)
{}
would have been required.
7.4 Data member initializers
Non-static data members of classes are usually initialized by the class’s constructors. Frequently
(but not always) the same initializations are used by different constructors, resulting in multiple
points where the initializations are performed, which in turn complicates class maintenance.
Consider a class defining several data members: a pointer to data, a data member storing the number
of data elements the pointer points at, a data member storing the sequence number of the object.
The class also offers a basic set of constructors, as shown in the following class interface:
class Container
{
Data *d_data;
size_t d_size;
size_t d_nr;
static size_t s_nObjects;
public:
Container();
Container(Container const &other);
Container(Data *data, size_t size);
Container(Container &&tmp);
};
The initial values of the data members are easy to describe, but somewhat hard to implement.
Consider the initial situation and assume the default constructor is used: all data members should
be set to 0, except for d_nr which must be given the value ++s_nObjects. Since these are nondefault
actions, we can’t declare the default constructor using = default, but we must provide an
actual implementation:
Container()
:
d_data(0),
d_size(0),
d_nr(++s_nObjects)
{}
In fact, all constructors require us to state the d_nr(++s_nObjects) initialization. So if d_data’s
type would have been a (move aware) class type, we would still have to provide implementations for
all of the above constructors.
C++, however, also supports data member initializers, simplifying the initialization of non-static
data members. Data member initializers allow us to assign initial values to data members. The
compiler must be able to compute these initial values from initialization expressions, but the initial
values do not have to be constant expressions. So ++s_nObjects can be an initial value.
Using data member initializers for the class Container we get:
class Container
{
Data *d_data = 0;
size_t d_size = 0;
size_t d_nr = ++s_nObjects;
static size_t s_nObjects;
public:
Container() = default;
Container(Container const &other);
Container(Data *data, size_t size);
Container(Container &&tmp);
};
Note that the data member initializations are recognized by the compiler, and are applied to its
implementation of the default constructor. In fact, all constructors will apply the data member initializations,
unless explicitly initialized otherwise. E.g., the move constructor may now be implemented
like this:
Container(Container &&tmp)
:
d_data(tmp.d_data),
d_size(tmp.d_size)
{
tmp.d_data = 0;
}
Although d_nr’s initialization is left out of the implementation it is initialized due to the data member
initialization provided in the class’s interface.
An aggregate is an array or a class (usually a struct with no user-defined constructors, no private
or protected non-static data members, no base classes (cf. chapter 13), and no virtual functions (cf.
chapter 14)). E.g.,
struct POD // defining aggregate POD
{
int first = 5;
double second = 1.28;
std::string hello {"hello"};
};
The C++14 standard allows initialization of such aggregates using braced initializer lists. E.g.,
POD pod {4, 13.5, "hi there"};
When using braced-initializer lists not all data members need to be initialized. Specification may
stop at any data member, in which case the default (or explicitly defined initialization values) of the
remaining data members are used. E.g.,
POD pod {4}; // uses second: 1.28, hello: "hello"
7.4.1 Delegating constructors
Often constructors are specializations of each other, allowing objects to be constructed specifying
only subsets of arguments for all of its data members, using default argument values for the remaining
data members.
Before the C++11 standard common practice was to define a member like init performing all initializations
common to constructors. Such an init function, however, cannot be used to initialize
const or reference data members, nor can it be used to perform so-called base class initializations
(cf. chapter 13).
Here is an example where such an init function might have been used. A class Stat is designed
as a wrapper class around C’s stat(2) function. The class might define three constructors: one
expecting no arguments and initializing all data members to appropriate values; a second one doing
the same, but it calls stat for the filename provided to the constructor; and a third one expecting a
filename and a search path for the provided file name. Instead of repeating the initialization code
in each constructor, the common code can be factorized into a member init which is called by the
constructors.
Currently, C++ offers an alternative by allowing constructors to call each other. This is called delegating
constructors. The C++11 standard allows us to delegate constructors as illustrated by the next
example:
class Stat
{
public:
Stat()
:
Stat("", "") // no filename/searchpath
{}
Stat(std::string const &fileName)
:
Stat(fileName, "") // only a filename
{}
Stat(std::string const &fileName, std::string const &searchPath)
:
d_filename(fileName),
d_searchPath(searchPath)
{
// remaining actions to be performed by the constructor
}
};
C++ allows static const integral data members to be initialized within the class interfaces (cf. chapter
8). The C++11 standard adds to this the facility to define default initializations for plain data
members in class interfaces (these data members may or may not be const or of integral types, but
(of course) they cannot be reference data members).
These default initializations may be overruled by constructors. E.g., if the class Stat uses a data
member bool d_hasPath which is false by default but the third constructor (see above) should
initialize it to true then the following approach is possible:
class Stat
{
bool d_hasPath = false;
public:
Stat(std::string const &fileName, std::string const &searchPath)
:
d_hasPath(true) // overrule the interface-specified
// value
{}
};
Here d_hasPath receives its value only once: it’s always initialized to false except when the shown
constructor is used in which case it is initialized to true.
7.5 Uniform initialization
When defining variables and objects they may immediately be given initial values. Class type objects
are always initialized using one of their available constructors. C already supports the array and
struct initializer list consisting of a list of constant expressions surrounded by a pair of curly braces.
C++ supports a comparable initialization, called uniform initialization. It uses the following syntax:
Type object {value list};
When defining objects using a list of objects each individual object may use its own uniform initialization.
The advantage of uniform initialization over using constructors is that using constructor arguments
may sometimes result in an ambiguity as constructing an object may sometimes be confused with
using the object’s overloaded function call operator (cf. section 11.10). As initializer lists can only be
used with plain old data (POD) types (cf. section 9.9) and with classes that are ‘initializer list aware’
(like std::vector) the ambiguity does not arise when initializer lists are used.
Uniform initialization can be used to initialize an object or variable, but also to initialize data members
in a constructor or implicitly in the return statement of functions. Examples:
class Person
{
// data members
public:
Person(std::string const &name, size_t mass)
:
d_name {name},
d_mass {mass}
{}
Person copy() const
{
return {d_name, d_mass};
}
};
Object definitions may be encountered in unexpected places, easily resulting in (human) confusion.
Consider a function ‘func’ and a very simple class Fun (struct is used, as data hiding is not an
issue here; in-class implementations are used for brevity):
void func();
struct Fun
{
Fun(void (*f)())
{
std::cout << "Constructor\n";
}
void process()
{
std::cout << "process\n";
}
};
In main a Fun object is defined: Fun fun(func). Running this program displays Constructor:
fun is constructed. Next we intend to call process for an anonymous Fun object:
Fun fun(func);
Fun(func).process();
Constructor appears twice, and then process is displayed.
What about just defining an anonymous Fun object? We do:
Fun(func);
Now we’re in for a surprise. The compiler complains that Fun’s default constructor is missing. Why’s
that? Insert some blanks immediately after Fun and you get Fun (func). Parentheses around an
identifier are OK, and are stripped off once the parenthesized expression has been parsed. In this
case: (func) equals func, and so we have Fun func: the definition of a Fun func object, using
Fun’s default constructor (which isn’t provided).
So why does Fun(func).process() compile? In this case we have a member selector operator,
whose left-hand operand must be a class-type object. The object must exist, and Fun(func) represents
that object. It’s not the name of an existing object, but a constructor expecting a function like
func exists. The compiler now creates an anonymous Fun, passing it func as its argument.
Clearly, in this example, parentheses cannot be used to create an anonymous Fun object. However,
the uniform initialization can be used. To define the anonymous Fun object use this syntax:
Fun {func};
(which can also be used to immediately call one of its members. E.g., Fun{func}.process()).
Although the uniform initialization syntax is slightly different from the syntax of an initializer list
(the latter using the assignment operator) the compiler nevertheless uses the initializer list if a
constructor supporting an initializer list is available. As an example consider:
class Vector
{
public:
Vector(size_t size);
Vector(std::initializer_list<int> const &values);
};
Vector vi = {4};
When defining vi the constructor expecting the initializer list is called rather than the constructor
expecting a size_t argument. If the latter constructor is required the definition using the standard
constructor syntax must be used. I.e., Vector vi(4).
Initializer lists are themselves objects that may be constructed using another initializer list. However,
values stored in an initializer list are immutable. Once the initializer list has been defined
its values remain as-is.
Before using the initializer_list the initializer_list header file must be included.
Initializer lists support a basic set of member functions and constructors:
• initializer_list<Type> object:
defines object as an empty initializer list
• initializer_list<Type> object { list of Type values }:
defines object as an initializer list containing Type values
• initializer_list<Type> object(other):
initializes object using the values stored in other
• size_t size() const:
returns the number of elements in the initializer list
• Type const *begin() const:
returns a pointer to the first element of the initializer list
• Type const *end() const:
returns a pointer just beyond the location of the last element of the initializer list
7.6 Defaulted and deleted class members
In everyday class design two situations are frequently encountered:
• A class offering constructors explicitly has to define a default constructor;
• A class (e.g., a class implementing a stream) cannot initialize objects by copying the values
from an existing object of that class (called copy construction) and cannot assign objects to each
other.
Once a class defines at least one constructor its default constructor is not automatically defined by
the compiler. C++ relaxes that restriction somewhat by offering the ‘= default’ syntax. A class
specifying ‘= default’ with its default constructor declaration indicates that the trivial default
constructor should be provided by the compiler. A trivial default constructor performs the following
actions:
• Its data members of built-in or primitive types are not initialized;
• Its composed (class type) data members are initialized by their default constructors.
• If the class is derived from a base class (cf. chapter 13) the base class is initialized by its default
constructor.
Trivial implementations can also be provided for the copy constructor, the overloaded assignment
operator, and the destructor. Those members are introduced in chapter 9.
Conversely, situations exist where some (otherwise automatically provided) members should not be
made available. This is realized by specifying ‘= delete’. Using = default and = delete is
illustrated by the following example. The default constructor receives its trivial implementation,
copy-construction is prevented:
class Strings
{
public:
Strings() = default;
Strings(std::string const *sp, size_t size);
Strings(Strings const &other) = delete;
};
7.7 Const member functions and const objects
The keyword const is often used behind the parameter list of member functions. This keyword
indicates that a member function does not alter the data members of its object. Such member
functions are called const member functions. In the class Person, we see that the accessor functions
were declared const:
class Person
{
public:
std::string const &name() const;
std::string const &address() const;
std::string const &phone() const;
size_t mass() const;
};
The rule of thumb given in section 3.1.1 applies here too: whichever appears to the left of the keyword
const, is not altered. With member functions this should be interpreted as ‘doesn’t alter its own
data’.
When implementing a const member function the const attribute must be repeated:
string const &Person::name() const
{
return d_name;
}
The compiler prevents the data members of a class from being modified by one of its const member
functions. Therefore a statement like
d_name[0] = toupper(static_cast<unsigned char>(d_name[0]));
results in a compiler error when added to the above function’s definition.
Const member functions are used to prevent inadvertent data modification. Except for constructors
and the destructor (cf. chapter 9) only const member functions can be used with (plain, references
or pointers to) const objects.
Const objects are frequently encountered as const & parameters of functions. Inside such functions
only the object’s const members may be used. Here is an example:
void displayMass(ostream &out, Person const &person)
{
out << person.name() << " weighs " << person.mass() << " kg.\n";
}
Since person is defined as a Person const & the function displayMass cannot call, e.g.,
person.setMass(75).
The const member function attribute can be used to overload member functions. When functions
are overloaded by their const attribute the compiler uses the member function matching most
closely the const-qualification of the object:
• When the object is a const object, only const member functions can be used.
• When the object is not a const object, non-const member functions are used, unless only a
const member function is available. In that case, the const member function is used.
The next example illustrates how (non) const member functions are selected:
#include <iostream>
using namespace std;
class Members
{
public:
Members();
void member();
void member() const;
};
Members::Members()
{}
void Members::member()
{
cout << "non const member\n";
}
void Members::member() const
{
cout << "const member\n";
}
int main()
{
Members const constObject;
Members nonConstObject;
constObject.member();
nonConstObject.member();
}
/*
Generated output:
const member
non const member
*/
As a general principle of design: member functions should always be given the const attribute,
unless they actually modify the object’s data.
7.7.1 Anonymous objects
Sometimes objects are used because they offer a certain functionality. The objects only exist because
of their functionality, and nothing in the objects themselves is ever changed. The following
class Print offers a facility to print a string, using a configurable prefix and suffix. A partial class
interface could be:
class Print
{
public:
Print(ostream &out);
void print(std::string const &prefix, std::string const &text,
std::string const &suffix) const;
};
An interface like this would allow us to do things like:
Print print(cout);
for (int idx = 0; idx != argc; ++idx)
print.print("arg: ", argv[idx], "\n");
This works fine, but it could greatly be improved if we could pass print’s invariant arguments to
Print’s constructor. This would simplify print’s prototype (only one argument would need to be
passed rather than three) and we could wrap the above code in a function expecting a Print object:
void allArgs(Print const &print, int argc, char *argv[])
{
for (int idx = 0; idx != argc; ++idx)
print.print(argv[idx]);
}
The above is a fairly generic piece of code, at least it is with respect to Print. Since prefix and
suffix don’t change they can be passed to the constructor which could be given the prototype:
Print(ostream &out, string const &prefix = "", string const &suffix = "");
Now allArgs may be used as follows:
Print p1(cout, "arg: ", "\n"); // prints to cout
Print p2(cerr, "err: --", "--\n"); // prints to cerr
allArgs(p1, argc, argv); // prints to cout
allArgs(p2, argc, argv); // prints to cerr
But now we note that p1 and p2 are only used inside the allArgs function. Furthermore, as we can
see from print’s prototype, print doesn’t modify the internal data of the Print object it is using.
In such situations it is actually not necessary to define objects before they are used. Instead anonymous
objects may be used. Anonymous objects can be used:
• to initialize a function parameter which is a const reference to an object;
• if the object is only used inside the function call.
These anonymous objects are considered constant as they merely exist for passing the information
of (class type) objects to functions. They are not considered ’variables’. Of course, a const_cast
could be used to cast away the const reference’s constness, but any change is lost once the function
returns. These anonymous objects used to initialize const references should not be confused with
rvalue references (section 3.3.2) which have a completely different purpose in life. Rvalue references
primarily exist to be ‘swallowed’ by functions receiving them. Thus, the information made available
by rvalue references outlives the rvalue reference objects which are also anonymous.
Anonymous objects are defined when a constructor is used without providing a name for the constructed
object. Here is the corresponding example:
allArgs(Print(cout, "arg: ", "\n"), argc, argv); // prints to cout
allArgs(Print(cerr, "err: --", "--\n"), argc, argv);// prints to cerr
In this situation the Print objects are constructed and immediately passed as first arguments to
the allArgs functions, where they are accessible as the function’s print parameter. While the
allArgs function is executing they can be used, but once the function has completed, the anonymous
Print objects are no longer accessible.
7.7.1.1 Subtleties with anonymous objects
Anonymous objects can be used to initialize function parameters that are const references to objects.
These objects are created just before such a function is called, and are destroyed once the
function has terminated. C++’s grammar allows us to use anonymous objects in other situations as
well. Consider the following snippet of code:
int main()
{
// initial statements
Print(cout, "hello", "world");
// later statements
}
In this example an anonymous Print object is constructed, and it is immediately destroyed thereafter.
So, following the ‘initial statements’ our Print object is constructed. Then it is destroyed
again followed by the execution of the ‘later statements’.
The example illustrates that the standard lifetime rules do not apply to anonymous objects. Their
lifetimes are limited to the statements, rather than to the end of the block in which they are defined.
Plain anonymous objects are at least useful in one situation. Assume we want to put markers in
our code producing some output when the program’s execution reaches a certain point. An object’s
constructor could be implemented so as to provide that marker-functionality allowing us to put
markers in our code by defining anonymous, rather than named objects.
C++’s grammar contains another remarkable characteristic illustrated by the next example:
int main(int argc, char **argv)
{
Print p(cout, "", ""); // 1
allArgs(Print(p), argc, argv); // 2
}
In this example a non-anonymous object p is constructed in statement 1, which is then used in
statement 2 to initialize an anonymous object. The anonymous object, in turn, is then used to
initialize allArgs’s const reference parameter. This use of an existing object to initialize another
object is common practice, and is based on the existence of a so-called copy constructor. A copy
constructor creates an object (as it is a constructor) using an existing object’s characteristics to
initialize the data of the object that’s created. Copy constructors are discussed in depth in chapter
9, but presently only the concept of a copy constructor is used.
In the above example a copy constructor is used to initialize an anonymous object. The anonymous
object was then used to initialize a parameter of a function. However, when we try to apply the
same trick (i.e., using an existing object to initialize an anonymous object) to a plain statement, the
compiler generates an error: the object p can’t be redefined (in statement 3, below):
int main(int argc, char *argv[])
{
Print p(cout, "", ""); // 1
allArgs(Print(p), argc, argv); // 2
Print(p); // 3 error!
}
Does this mean that using an existing object to initialize an anonymous object that is used as function
argument is OK, while an existing object can’t be used to initialize an anonymous object in a
plain statement?
The compiler actually provides us with the answer to this apparent contradiction. About statement
3 the compiler reports something like:
error: redeclaration of ’Print p’
which solves the problem when realizing that within a compound statement objects and variables
may be defined. Inside a compound statement, a type name followed by a variable name is the
grammatical form of a variable definition. Parentheses can be used to break priorities, but if there
are no priorities to break, they have no effect, and are simply ignored by the compiler. In statement
3 the parentheses allowed us to get rid of the blank that’s required between a type name and the
variable name, but to the compiler we wrote
Print (p);
which is, since the parentheses are superfluous, equal to
Print p;
thus producing p’s redeclaration.
As a further example: when we define a variable using a built-in type (e.g., double) using superfluous
parentheses the compiler quietly removes these parentheses for us:
double ((((a)))); // weird, but OK.
To summarize our findings about anonymous objects:
• Anonymous objects are great for initializing const reference parameters.
• The same syntax, however, can also be used in stand-alone statements, in which they are
interpreted as variable definitions if our intention actually was to initialize an anonymous
object using an existing object.
• Since this may cause confusion, it’s probably best to restrict the use of anonymous objects to
the first (and main) form: initializing function parameters.
7.8 The keyword ‘inline’
Let us take another look at the implementation of the function Person::name():
std::string const &Person::name() const
{
return d_name;
}
This function is used to retrieve the name field of an object of the class Person. Example:
void showName(Person const &person)
{
cout << person.name();
}
To insert person’s name the following actions are performed:
• The function Person::name() is called.
• This function returns person’s d_name as a reference.
• The referenced name is inserted into cout.
Especially the first part of these actions causes some time loss, since an extra function call is necessary
to retrieve the value of the name field. Sometimes a faster procedure immediately making
the d_name data member available is preferred without ever actually calling a function name. This
can be realized using inline functions. An inline function is a request to the compiler to insert the
function’s code at the location of the function’s call. This may speed up execution by avoiding a function
call, which typically comes with some (stack handling and parameter passing) overhead. Note
that inline is a request to the compiler: the compiler may decide to ignore it, and will probably
ignore it when the function’s body contains much code. Good programming discipline suggests being
aware of this, and to avoid inline unless the function’s body is fairly small. More on this in section
7.8.2.
7.8.1 Defining members inline
Inline functions may be implemented in the class interface itself. For the class Person this results
in the following implementation of name:
class Person
{
public:
std::string const &name() const
{
return d_name;
}
};
Note that the inline code of the function name now literally occurs inline in the interface of the class
Person. The keyword const is again added to the function’s header.
Although members can be defined in-class (i.e., inside the class interface itself), it is considered bad
practice for the following reasons:
• Defining members inside the interface contaminates the interface with implementations. The
interface’s purpose is to document what functionality the class offers. Mixing member declarations
and implementation details complicates understanding the interface. Readers need to
skip implementation details which takes time and makes it hard to grab the ‘broad picture’,
and thus to understand at a glance what functionality the class’s objects are offering.
• In-class implementations of private member functions may usually be avoided altogether (as
they are private members). They should be moved to the internal header file (unless inline
public members use such inline private members).
• Although members that are eligible for inline-coding should remain inline, situations do exist
where such inline members migrate from an inline to a non-inline definition. In-class inline
definitions still need editing (sometimes considerable editing) before they can be compiled.
This additional editing is undesirable.
Because of the above considerations inline members should not be defined in-class. Rather, they
should be defined following the class interface. The Person::name member is therefore preferably
defined as follows:
class Person
{
public:
std::string const &name() const;
};
inline std::string const &Person::name() const
{
return d_name;
}
If it is ever necessary to cancel Person::name’s inline implementation, then this becomes its non-inline
implementation:
#include "person.ih"
std::string const &Person::name() const
{
return d_name;
}
Only the inline keyword needs to be removed to obtain the correct non-inline implementation.
Defining members inline has the following effect: whenever an inline-defined function is called, the
compiler may insert the function’s body at the location of the function call. It may be that the function
itself is never actually called.
This construction, where the function code itself is inserted rather than a call to the function, is
called an inline function. Note that using inline functions may result in multiple occurrences of the
code of those functions in a program: one copy for each invocation of the inline function. This is
probably OK if the function is a small one, and needs to be executed fast. It’s not so desirable if
the code of the function is extensive. The compiler knows this too, and handles the use of inline
functions as a request rather than a command. If the compiler considers the function too long, it will
not grant the request. Instead it will treat the function as a normal function.
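The layout recommended above can be sketched in a single file as follows. This is merely an illustration, not the Annotations’ own person.h: the constructor is an assumption that was added to make the fragment complete.

```cpp
#include <string>

class Person
{
    std::string d_name;

    public:
        explicit Person(std::string const &name);   // assumed constructor
        std::string const &name() const;            // declaration only
};

// Inline definitions follow the class interface, keeping the
// interface itself free of implementation details.
inline Person::Person(std::string const &name)
:
    d_name(name)
{}

inline std::string const &Person::name() const
{
    return d_name;
}
```

Whether the compiler actually inserts the body of name at the call site remains its own decision; the program’s behavior is the same in either case.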
7.8.2 When to use inline functions
When should inline functions be used, and when not? There are some rules of thumb which may be
followed:
• In general inline functions should not be used. Voilà; that’s simple, isn’t it?
• Consider defining a function inline once a fully developed and tested program runs too slowly
and shows ‘bottlenecks’ in certain functions, and the bottleneck is removed by defining inline
members. A profiler, which runs a program and determines where most of the time is spent, is
necessary to perform such optimizations.
• Defining inline functions may be considered when they consist of one very simple statement
(such as the return statement in the function Person::name).
• When a function is defined inline, its implementation is inserted in the code wherever the
function is used. As a consequence, when the implementation of the inline function changes, all
sources using the inline function must be recompiled. In practice that means that all functions
must be recompiled that include (either directly or indirectly) the header file of the class in
which the inline function is defined. Not a very attractive prospect.
• It is only useful to implement an inline function when the time spent during a function call is
long compared to the time spent by the function’s body. An example of an inline function which
hardly affects the program’s speed is:
inline void Person::printname() const
{
cout << d_name << '\n';
}
This function contains only one statement. However, the statement takes a relatively long time
to execute. In general, functions which perform input and output take lots of time. Converting
this function printname() to inline would therefore lead to an insignificant
gain in execution time.
All inline functions have one disadvantage: the actual code is inserted by the compiler and must
therefore be known at compile-time. Therefore, as mentioned earlier, an inline function can never
be located in a run-time library. Practically this means that an inline function is found near the
interface of a class, usually in the same header file. The result is a header file which not only
shows the declaration of a class, but also part of its implementation, thus always blurring the
distinction between interface and implementation.
7.8.2.1 A prelude: when NOT to use inline functions
As a prelude to chapter 14 (Polymorphism), there is one situation in which inline functions should
definitely be avoided. At this point in the C++ Annotations it’s a bit too early to expose the full
details, but since the keyword inline is the topic of this section this is considered the appropriate
location for the advice.
There are situations where the compiler is confronted with so-called vague linkage
(cf. http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc/Vague-Linkage.html). These situations
occur when the compiler does not have a clear indication in what object file to put its compiled
code. This happens, e.g., with inline functions, which are usually encountered in multiple source
files. Since the compiler may insert the code of ordinary inline functions in places where these
functions are called, vague linking is usually no problem with these ordinary functions.
However, as explained in chapter 14, when using polymorphism the compiler must ignore the
inline keyword and define so-called virtual members as true (out-of-line) functions. In this situation
the vague linkage may cause problems, as the compiler must decide in which object file(s) to put
their code. Usually that’s not a big problem as long as the function is called at least once. But virtual
functions are special in the sense that they may very well never be explicitly called. On some architectures
(e.g., armel) the compiler may fail to compile such inline virtual functions. This may result
in missing symbols in programs using them. To make matters slightly more complex: the problem
may emerge when shared libraries are used, but not when static libraries are used.
To avoid all of these problems virtual functions should never be defined inline, but they should
always be defined out-of-line. I.e., they should be defined in source files.
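As a small sketch of this advice (the classes Shape and Circle are hypothetical and merely serve as illustration): the virtual members are only declared in the class interfaces, and their definitions are provided out-of-line. In a real project these definitions would be placed in source files such as shape.cc and circle.cc; they are shown together here to keep the example in one piece.

```cpp
#include <string>

class Shape                             // hypothetical base class
{
    public:
        virtual ~Shape();
        virtual std::string name() const;   // declared, not defined in-class
};

class Circle: public Shape              // hypothetical derived class
{
    public:
        std::string name() const override;
};

// Out-of-line definitions: each virtual member has exactly one
// translation unit in which its code is generated.
Shape::~Shape()
{}

std::string Shape::name() const
{
    return "shape";
}

std::string Circle::name() const
{
    return "circle";
}
```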
7.9 Local classes: classes inside functions
Classes are usually defined at the global or namespace level. However, it is entirely possible to
define a local class, i.e., inside a function. Such classes are called local classes.
Local classes can be very useful in advanced applications involving inheritance or templates (cf.
section 13.8). At this point in the C++ Annotations they have limited use, although their main
features can be described. At the end of this section an example is provided.
• Local classes may use almost all characteristics of normal classes. They may have constructors,
destructors, data members, and member functions;
• Local classes cannot define static data members. Static member functions, however, can be
defined.
• Since a local class may define static member functions, it is possible to define nested functions
in C++ somewhat comparable to the way programming languages like Pascal allow nested
functions to be defined.
• If a local class needs access to a constant integral value, a local enum can be used. The enum
may be anonymous, exposing only the enum values.
• Local classes cannot directly access the non-static variables of their surrounding context. For
example, in the example shown below the class Local cannot directly access main’s argc
parameter.
• Local classes may directly access global data and static variables defined by their surrounding
function. This includes variables defined in the anonymous namespace of the source file
containing the local class.
• Local class objects can be defined inside the function body, but they cannot leave the function
as objects of their own type. I.e., a local class name cannot be used for either the return type
or for the parameter types of its surrounding function.
• As a prelude to inheritance (chapter 13): a local class may be derived from an existing class,
allowing the surrounding function to return a dynamically allocated, locally constructed class
object via a base class pointer or reference.
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char *argv[])
{
static size_t staticValue = 0;
class Local
{
int d_argc; // non-static data members OK
public:
enum // enums OK
{
VALUE = 5
};
Local(int argc) // constructors and member functions OK
: // in-class implementation required
d_argc(argc)
{
// global data: accessible
cout << "Local constructor\n";
// static function variables: accessible
staticValue += 5;
}
static void hello() // static member functions: OK
{
cout << "hello world\n";
}
};
Local::hello(); // call Local static member
Local loc(argc); // define object of a local class.
}
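The last item of the list (returning a local class object via a base class pointer) could be sketched as follows. The names Greeter and makeGreeter are hypothetical, and std::unique_ptr is merely one way to hand over the dynamically allocated object. Since Local cannot directly access makeGreeter’s name parameter, the value is passed to Local’s constructor.

```cpp
#include <memory>
#include <string>

class Greeter                           // hypothetical base class
{
    public:
        virtual ~Greeter();
        virtual std::string greet() const = 0;
};

Greeter::~Greeter()
{}

std::unique_ptr<Greeter> makeGreeter(std::string const &name)
{
    class Local: public Greeter         // local class, derived from Greeter
    {
        std::string d_name;

        public:
            Local(std::string const &who)   // in-class implementation
            :                               // is required for local classes
                d_name(who)
            {}
            std::string greet() const override
            {
                return "hello " + d_name;
            }
    };

    // The type Local itself cannot leave the function, but a Local
    // object can, through a pointer to its base class.
    return std::unique_ptr<Greeter>(new Local(name));
}
```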
7.10 The keyword ‘mutable’
Earlier, in section 7.7, the concepts of const member functions and const objects were introduced.
C++ also allows the declaration of data members which may be modified, even by const member functions.
The declaration of such data members in the class interface starts with the keyword mutable.
Mutable should be used for those data members that may be modified without logically changing
the object, which might therefore still be considered a constant object.
An example of a situation where mutable is appropriately used is found in the implementation of a
string class. Consider the std::string’s c_str and data members. The actual data returned by
the two members are identical, but c_str must ensure that the returned string is terminated by a
0-byte. As a string object has both a length and a capacity, an easy way to implement c_str is to
ensure that the string’s capacity exceeds its length by at least one character. This invariant allows
c_str to be implemented as follows:
char const *string::c_str() const
{
d_data[d_length] = 0;
return d_data;
}
This implementation logically does not modify the object’s data as the bytes beyond the object’s
initial (length) characters have undefined values. But in order to use this implementation d_data
must be declared mutable:
mutable char *d_data;
The keyword mutable is also useful in classes implementing, e.g., reference counting. Consider a
class implementing reference counting for text strings. The object doing the reference counting might
be a const object, but the class may define a copy constructor. Since const objects can’t be modified,
how would the copy constructor be able to increment the reference count? Here the mutable keyword
may profitably be used: a mutable reference count can be incremented and decremented, even though
its object is a const object.
The keyword mutable should be used sparingly. Data modified by const member functions should
never logically modify the object, and it should be easy to demonstrate this. As a rule of thumb: do
not use mutable unless there is a very clear reason (the object is logically not altered) for letting a
const member function modify data.
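A minimal sketch of a mutable data member that does not logically alter its object is a cached value. The class Sentence below is hypothetical: its word count is computed only when first requested, and the const member nWords may store the result because the two bookkeeping members are declared mutable.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

class Sentence                          // hypothetical class
{
    std::string d_text;
    mutable bool d_counted;             // has d_nWords been computed?
    mutable std::size_t d_nWords;       // cached result

    public:
        explicit Sentence(std::string const &text);
        std::size_t nWords() const;
};

inline Sentence::Sentence(std::string const &text)
:
    d_text(text),
    d_counted(false),
    d_nWords(0)
{}

inline std::size_t Sentence::nWords() const
{
    if (!d_counted)                     // first request: count the words
    {
        std::istringstream in(d_text);
        std::string word;
        while (in >> word)
            ++d_nWords;                 // OK: d_nWords is mutable
        d_counted = true;               // OK: d_counted is mutable
    }
    return d_nWords;
}
```

The text itself never changes, so a const Sentence object still behaves as a constant object for its users.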
7.11 Header file organization
In section 2.5.10 the requirements for header files when a C++ program also uses C functions were
discussed. Header files containing class interfaces have additional requirements.
First, source files. With the exception of the occasional classless function, source files contain the
code of member functions of classes. Here there are basically two approaches:
• All required header files for a member function are included in each individual source file.
• All required header files (for all member functions of a class) are included in a header file that
is included by each of the source files defining class members.
The first alternative has the advantage of economy for the compiler: it only needs to read the header
files that are necessary for a particular source file. It has the disadvantage that the program developer
must include multiple header files again and again in source files: it both takes time to type the
include-directives and to think about the header files which are needed in a particular source file.
The second alternative has the advantage of economy for the program developer: the header file of
the class accumulates header files, so it tends to become more and more generally useful. It has the
disadvantage that the compiler frequently has to process many header files which aren’t actually
used by the function to compile.
With computers running faster and faster (and compilers getting smarter and smarter) I think the
second alternative is to be preferred over the first alternative. So, as a starting point source files of
a particular class MyClass could be organized according to the following example:
#include <myclass.h>
int MyClass::aMemberFunction()
{}
There is only one include-directive. Note that the directive refers to a header file in a directory
mentioned in the INCLUDE-file environment variable. Local header files (using #include
"myclass.h") could be used too, but that tends to complicate the organization of the class header
file itself somewhat.
The organization of the header file itself requires some attention. Consider the following example,
in which two classes File and String are used.
Assume the File class has a member gets(String &destination), while the class String has
a member function getLine(File &file). The (partial) header file for the class String is
then:
#ifndef STRING_H_
#define STRING_H_
#include <project/file.h> // to know about a File
class String
{
public:
void getLine(File &file);
};
#endif
Unfortunately a similar setup is required for the class File:
#ifndef FILE_H_
#define FILE_H_
#include <project/string.h> // to know about a String
class File
{
public:
void gets(String &string);
};
#endif
Now we have created a problem. The compiler, trying to compile the source file of the function
File::gets proceeds as follows:
• The header file project/file.h is opened to be read.
• FILE_H_ is defined.
• The header file project/string.h is opened to be read.
• STRING_H_ is defined.
• The header file project/file.h is (again) opened to be read.
• Apparently, FILE_H_ is already defined, so the remainder of project/file.h is skipped.
• The interface of the class String is now parsed.
• In the class interface a reference to a File object is encountered.
• As the class File hasn’t been parsed yet, a File is still an undefined type, and the compiler
quits with an error.
The solution to this problem is to use a forward class reference before the class interface, and to
include the corresponding class header file beyond the class interface. So we get:
#ifndef STRING_H_
#define STRING_H_
class File; // forward reference
class String
{
public:
void getLine(File &file);
};
#include <project/file.h> // to know about a File
#endif
A similar setup is required for the class File:
#ifndef FILE_H_
#define FILE_H_
class String; // forward reference
class File
{
public:
void gets(String &string);
};
#include <project/string.h> // to know about a String
#endif
This works well in all situations where either references or pointers to other classes are involved
and with (non-inline) member functions having class-type return values or parameters.
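The effect of the forward references can be tried out in a single file, in which the two class interfaces appear in the order the compiler would encounter them. This is only a sketch: the members assign and text, the data members and File’s constructor are assumptions that were added to obtain a complete example.

```cpp
#include <string>

class File;                             // forward reference (cf. string.h)

class String
{
    std::string d_text;

    public:
        void getLine(File &file);       // a reference: File may be incomplete
        void assign(std::string const &text);   // assumed helper
        std::string const &text() const;        // assumed accessor
};

class String;                           // forward reference (cf. file.h)

class File
{
    std::string d_line;

    public:
        explicit File(std::string const &line); // assumed constructor
        void gets(String &string);
};

// The implementations are compiled once both interfaces are known.
void String::getLine(File &file)
{
    file.gets(*this);
}

void String::assign(std::string const &text)
{
    d_text = text;
}

std::string const &String::text() const
{
    return d_text;
}

File::File(std::string const &line)
:
    d_line(line)
{}

void File::gets(String &string)
{
    string.assign(d_line);
}
```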
This setup doesn’t work with composition, nor with in-class inline member functions. Assume the
class File has a composed data member of the class String. In that case, the class interface of the
class File must include the header file of the class String before the class interface itself, because
otherwise the compiler can’t tell how big a File object is. A File object contains a String member,
but the compiler can’t determine the size of that String data member and thus, by implication, it
can’t determine the size of a File object.
In cases where classes contain composed objects (or are derived from other classes, see chapter 13)
the header files of the classes of the composed objects must have been read before the class interface
itself. In such a case the class File might be defined as follows:
#ifndef FILE_H_
#define FILE_H_
#include <project/string.h> // to know about a String
class File
{
String d_line; // composition !
public:
void gets(String &string);
};
#endif
The class String can’t declare a File object as a composed member: such a situation would again
result in an undefined class while compiling the sources of these classes.
All remaining header files (appearing below the class interface itself) are required only because they
are used by the class’s source files.
This approach allows us to introduce yet another refinement:
• Header files defining a class interface should declare what can be declared before defining the
class interface itself. So, classes that are mentioned in a class interface should be specified
using forward declarations unless
– They are a base class of the current class (see chapter 13);
– They are the class types of composed data members;
– They are used in inline member functions.
In particular: additional actual header files are not required for:
– class-type return values of functions;
– class-type value parameters of functions.
Class header files of objects that are either composed or inherited or that are used in inline
functions, must be known to the compiler before the interface of the current class starts. The
information in the header file itself is protected by the #ifndef ... #endif construction
introduced in section 2.5.10.
• Program sources in which the class is used only need to include this header file. This header
file should be made available in a well-known location, such as a directory or subdirectory of
the standard INCLUDE path. Lakos (2001) refines this process even further; see his book
Large-Scale C++ Software Design for further details.
• To implement member functions the class’s header file is required and usually additional
header files (like the string header file) as well. The class header file itself as well as these
additional header files should be included in a separate internal header file (for which the
extension .ih (‘internal header’) is suggested).
The .ih file should be defined in the same directory as the source files of the class. It has the
following characteristics:
– There is no need for a protective #ifndef .. #endif shield, as the header file is never
included by other header files.
– The standard .h header file defining the class interface is included.
– The header files of all classes used as forward references in the standard .h header file
are included.
– Finally, all other header files that are required in the source files of the class are included.
An example of such a header file organization is:
– First part, e.g., /usr/local/include/myheaders/file.h:
#ifndef FILE_H_
#define FILE_H_
#include <fstream> // for composed ’ifstream’
class Buffer; // forward reference
class File // class interface
{
std::ifstream d_instream;
public:
void gets(Buffer &buffer);
};
#endif
– Second part, e.g., ~/myproject/file/file.ih, where all sources of the class File are
stored:
#include <myheaders/file.h> // make the class File known
#include <buffer.h> // make Buffer known to File
#include <string> // used by members of the class
#include <sys/stat.h> // File.
7.11.1 Using namespaces in header files
When entities from namespaces are used in header files, no using directive should be specified in
those header files if they are to be used as general header files declaring classes or other entities
from a library. When the using directive is used in a header file then users of such a header file are
forced to accept and use the declarations in all code that includes the particular header file.
For example, if in a namespace special an object Inserter cout is declared, then
special::cout is of course a different object than std::cout. Now, if a class Flaw is constructed,
in which the constructor expects a reference to a special::Inserter, then the class should be
constructed as follows:
namespace special { class Inserter; } // forward declaration inside its namespace
class Flaw
{
public:
Flaw(special::Inserter &ins);
};
Now the person designing the class Flaw may be in a lazy mood, and might get bored by continuously
having to prefix special:: before every entity from that namespace. So, the following construction
is used:
using namespace special;
class Inserter;
class Flaw
{
public:
Flaw(Inserter &ins);
};
This works fine, up to the point where somebody wants to include flaw.h in other source files:
because of the using directive, this latter person is now by implication also using namespace
special, which could produce unwanted or unexpected effects:
#include <flaw.h>
#include <iostream>
using std::cout;
int main()
{
cout << "starting\n"; // won’t compile
}
The compiler is confronted with two interpretations for cout: first, because of the using directive
in the flaw.h header file, it considers cout a special::Inserter; then, because of the using
declaration in the user program, it considers cout a std::ostream. Consequently, the compiler
reports an error.
As a rule of thumb, header files intended for general use should not contain using directives or
using declarations. This rule does not hold true for header files which are only included by the
sources of a class: here the programmer is free to apply as many using directives and declarations
as desired, as they never reach other sources.
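A sketch of the alternative, in which all names remain fully qualified, is shown next. The implementation of special::Inserter (it merely collects text) and Flaw’s member note are assumptions made for the sake of illustration. As no using directive is specified, special::cout and std::cout can be used side by side.

```cpp
#include <iostream>
#include <string>

namespace special
{
    class Inserter                      // hypothetical: merely collects text
    {
        std::string d_text;

        public:
            void insert(std::string const &text);
            std::string const &text() const;
    };

    inline void Inserter::insert(std::string const &text)
    {
        d_text += text;
    }

    inline std::string const &Inserter::text() const
    {
        return d_text;
    }

    Inserter cout;                      // special::cout, not std::cout
}

class Flaw
{
    special::Inserter &d_ins;

    public:
        explicit Flaw(special::Inserter &ins);
        void note(std::string const &msg);      // assumed member
};

inline Flaw::Flaw(special::Inserter &ins)
:
    d_ins(ins)
{}

inline void Flaw::note(std::string const &msg)
{
    d_ins.insert(msg);                  // ends up in the special::Inserter
    std::cout << msg << '\n';           // fully qualified: no ambiguity
}
```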
7.12 Sizeof applied to class data members
In C++ the well-known sizeof operator can be applied to data members of classes without the need
to specify an object as well. Consider:
class Data
{
std::string d_name;
...
};
To obtain the size of Data’s d_name member the following expression can be used:
sizeof(Data::d_name);
However, note that the compiler observes data protection here as well. The expression sizeof(Data::d_name)
can only be used where d_name is accessible, i.e., by Data’s member functions and friends.
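As a brief sketch: the member nameSize below is hypothetical. Being a member of Data it is allowed to access the private d_name, and the sizeof expression itself does not refer to any particular Data object.

```cpp
#include <cstddef>
#include <string>

class Data
{
    std::string d_name;

    public:
        std::size_t nameSize() const;   // hypothetical member
};

inline std::size_t Data::nameSize() const
{
    return sizeof(Data::d_name);        // no object is specified
}
```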
Chapter 8
Static Data And Functions
In the previous chapters we provided examples of classes where each object had its own set of data
members. Each of the class’s member functions could access any member of any object of its
class.
In some situations it may be desirable to define common data fields, that may be accessed by all
objects of the class. One example is the name of the startup directory, used by a program that recursively
scans the directory tree of a disk. A second example is a variable that indicates whether some
specific initialization has occurred. In that case the object that was constructed first would perform
the initialization and would set the flag to ‘done’.
Such situations are also encountered in C, where several functions need to access the same variable.
A common solution in C is to define all these functions in one source file and to define the variable
static: the variable name is invisible outside the scope of the source file. This approach is quite
valid, but violates our philosophy of using only one function per source file. Another C-solution is
to give the variable in question an unusual name, e.g., _6uldv8, hoping that other program parts
won’t use this name by accident. Neither the first, nor the second legacy C solution is elegant.
C++ solves the problem by defining static members: data and functions, common to all objects
of a class and (when defined in the private section) inaccessible outside of the class. These static
members are this chapter’s topic.
Static member functions cannot be declared virtual. A virtual member function is an ordinary
member function in that it has a this pointer, whereas static member functions have no this
pointer.
8.1 Static data
Any data member of a class can be declared static; be it in the public or private section of the
class interface. Such a data member is created and initialized only once, in contrast to non-static
data members which are created again and again for each object of the class.
Static data members are created as soon as the program starts. Even though they’re created at the
very beginning of a program’s execution cycle they are nevertheless true members of their classes.
It is suggested to prefix the names of static members with s_ so they may easily be distinguished (in
class member functions) from the class’s non-static data members (which should preferably start with d_).
Public static data members are global variables. They may be accessed by all of the program’s code,
simply by using their class names, the scope resolution operator and their member names. Example:
class Test
{
static int s_private_int;
public:
static int s_public_int;
};
int main()
{
Test::s_public_int = 145; // OK
Test::s_private_int = 12; // wrong, don’t touch
// the private parts
}
The example does not present an executable program. It merely illustrates the interface, and not
the implementation of static data members, which is discussed next.
8.1.1 Private static data
To illustrate the use of a static data member which is a private variable in a class, consider the
following:
class Directory
{
static char s_path[];
public:
// constructors, destructors, etc.
};
The data member s_path[] is a private static data member. During the program’s execution only
one Directory::s_path[] exists, even though multiple objects of the class Directory may exist.
This data member could be inspected or altered by the constructor, destructor or by any other
member function of the class Directory.
Since constructors are called for each new object of a class, static data members are not initialized
by constructors. At most they are modified. The reason for this is that static data members exist
before any constructor of the class has been called. Static data members are initialized when they
are defined, outside of any member function, exactly like the initialization of ordinary (non-class)
global variables.
The definition and initialization of a static data member usually occurs in one of the source files
of the class functions, preferably in a source file dedicated to the definition of static data members,
called data.cc.
The data member s_path[], used above, could thus be defined and initialized as follows in a file
data.cc:
#include "directory.ih"
char Directory::s_path[200] = "/usr/local";
In the class interface the static member is actually only declared. In its implementation (definition)
its type and class name are explicitly mentioned. Note also that the size specification can be left out
of the interface, as shown above. However, its size is (either explicitly or implicitly) required when
it is defined.
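The difference between declaring and defining s_path can be sketched in one file. The enum PATH_SIZE and the members path and chdir are assumptions added for illustration. Note that a change made through one Directory object is visible through any other object, as there is only one s_path.

```cpp
#include <cstring>
#include <string>

class Directory
{
    static char s_path[];               // declaration: size is left out

    public:
        enum { PATH_SIZE = 200 };       // assumed constant
        std::string path() const;               // assumed accessor
        void chdir(char const *newPath);        // assumed modifier
};

// Definition and initialization (normally in data.cc): here the
// type, the class name and the size are required.
char Directory::s_path[Directory::PATH_SIZE] = "/usr/local";

std::string Directory::path() const
{
    return s_path;
}

void Directory::chdir(char const *newPath)
{
    std::strncpy(s_path, newPath, PATH_SIZE - 1);
    s_path[PATH_SIZE - 1] = 0;          // ensure a terminating 0-byte
}
```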
Note that any source file could contain the definition of the static data members of a class. A separate
data.cc source file is advised, but the source file containing, e.g., main() could be used as well. Of
course, any source file defining static data of a class must also include the header file of that class,
in order for the static data member to be known to the compiler.
A second example of a useful private static data member is given below. Assume that a class
Graphics defines the communication of a program with a graphics-capable device (e.g., a VGA
screen). The initialization of the device, which in this case would be to switch from text mode to
graphics mode, is an action of the constructor and depends on a static flag variable s_nobjects.
The variable s_nobjects simply counts the number of Graphics objects which are present at one
time. Similarly, the destructor of the class may switch back from graphics mode to text mode when
the last Graphics object ceases to exist. The class interface for this Graphics class might be:
class Graphics
{
static int s_nobjects; // counts # of objects
public:
Graphics();
~Graphics(); // other members not shown.
private:
void setgraphicsmode(); // switch to graphics mode
void settextmode(); // switch to text-mode
};
The purpose of the variable s_nobjects is to count the number of objects existing at a particular
moment in time. When the first object is created, the graphics device is initialized. At the destruction
of the last Graphics object, the switch from graphics mode to text mode is made:
int Graphics::s_nobjects = 0; // the static data member
Graphics::Graphics()
{
if (!s_nobjects++)
setgraphicsmode();
}
Graphics::~Graphics()
{
if (!--s_nobjects)
settextmode();
}
Obviously, if the class Graphics defined more than one constructor, each constructor would
need to increase the variable s_nobjects and would possibly have to initialize the graphics mode.
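One way to handle multiple constructors is to let all of them call a common private member, as in the following sketch. Several elements are assumptions: the member attach, the additional constructors, and the flag s_graphicsMode, which is a stand-in for the real device state. The static member function inGraphicsMode is used for inspection only.

```cpp
class Graphics
{
    static int s_nobjects;              // counts # of objects
    static bool s_graphicsMode;         // stand-in for the device state

    public:
        Graphics();
        explicit Graphics(int resolution);      // assumed constructor
        Graphics(Graphics const &other);        // copies are counted too
        ~Graphics();
        static bool inGraphicsMode();           // assumed, for inspection

    private:
        void attach();                  // shared by all constructors
        void setgraphicsmode();
        void settextmode();
};

int Graphics::s_nobjects = 0;
bool Graphics::s_graphicsMode = false;

Graphics::Graphics()
{
    attach();
}

Graphics::Graphics(int)
{
    attach();
}

Graphics::Graphics(Graphics const &)
{
    attach();
}

Graphics::~Graphics()
{
    if (!--s_nobjects)                  // last object ceases to exist
        settextmode();
}

bool Graphics::inGraphicsMode()
{
    return s_graphicsMode;
}

void Graphics::attach()
{
    if (!s_nobjects++)                  // first object is constructed
        setgraphicsmode();
}

void Graphics::setgraphicsmode()
{
    s_graphicsMode = true;
}

void Graphics::settextmode()
{
    s_graphicsMode = false;
}
```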
8.1.2 Public static data
Static data members could also be declared in the public section of a class. This, however, is discouraged
(as it violates the principle of data hiding). The static data member s_path[] (cf. section 8.1.1) could
be declared in the public section of the class definition. This would allow all the program’s code to
access this variable directly:
int main()
{
getcwd(Directory::s_path, 199);
}
A declaration is not a definition. Consequently the variable s_path still has to be defined. This
implies that some source file still needs to contain the definition of the s_path[] array.
8.1.3 Initializing static const data
Static const data members should be initialized like any other static data member: in source files
defining these data members.
Usually, if these data members are of integral or built-in primitive data types the compiler accepts
in-class initializations of such data members. However, an in-class initialization is not a definition:
when the member’s address is taken or when the member is bound to a reference, a definition in a
source file is still required. Whether a missing definition is noticed may depend on the optimizations
used by the compiler (e.g., using -O2 may result in a successful build, but -O0 (no optimizations)
may fail, but then maybe only when shared libraries are used...).
In-class initializations of integer constant values (e.g., of types char, int, long, etc., maybe
unsigned) are nevertheless possible using (e.g., anonymous) enums. The following example illustrates
how this can be done:
class X
{
public:
enum { s_x = 34 };
enum: size_t { s_maxWidth = 100 };
};
To avoid confusion caused by different compiler options static data members should always explicitly
be defined and initialized in a source file, whether or not const.
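Both approaches are combined in the following sketch. The enum values are immediately usable as compile-time constants (e.g., as an array size), whereas the static const data member s_limit is declared in the interface and defined and initialized outside of it. The names s_limit, d_buffer and bufferSize are assumptions that merely serve as illustration.

```cpp
#include <cstddef>

class X
{
    public:
        enum { s_x = 34 };
        enum: std::size_t { s_maxWidth = 100 };

        static int const s_limit;       // declaration only
        std::size_t bufferSize() const;         // assumed member

    private:
        char d_buffer[s_maxWidth];      // enum value used as array size
};

// Definition and initialization, normally found in a source file:
int const X::s_limit = 2 * X::s_x;

inline std::size_t X::bufferSize() const
{
    return sizeof(d_buffer);
}
```

As s_limit has a definition, binding it to a reference or taking its address does not depend on the compiler options that are used.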
8.1.4 Generalized constant expressions (constexpr)
In C macros are often used to let the preprocessor perform simple calculations. These macro functions
may have arguments, as illustrated in the next example:
#define xabs(x) ((x) < 0 ? -(x) : (x))
The disadvantages of macros are well-known. The main reason for avoiding macros is that they are
not parsed by the compiler, but are processed by the preprocessor, resulting in mere text replacements
that escape type-safety and syntactic checks of the macro definition itself. Furthermore,
since macros are processed by the preprocessor their use is unconditional, without acknowledging
the context in which they are applied. NULL is an infamous example. Ever tried to define an enum
symbol NULL? Or EOF? Chances are that, if you did, the compiler threw strange error messages at
you.
Generalized const expressions can be used as an alternative.
Generalized const expressions are recognized by the modifier constexpr (a keyword), that is applied
to the expression’s type.
There is a small syntactic difference between the use of the const modifier and the use of the
constexpr modifier. While the const modifier can be applied to definitions and declarations alike,
the constexpr modifier can only be applied to definitions:
extern int const externInt; // OK: declaration of const int
extern int constexpr error; // ERROR: not a definition
Variables defined with the constexpr modifier have constant (immutable) values. But generalized
const expressions are not just used to define constant variables; they have other applications as
well. The constexpr keyword is usually applied to functions, turning the function into a constant-expression
function.
A constant-expression function should not be confused with a function returning a const value
(although a constant-expression function does return a (const) value). A constant expression function
has the following characteristics:
• it returns a value;
• its return type is given the constexpr modifier;
• its body consists of one single return statement (but see also the notes about C++14 at the end of
this section).
Such functions are also called named constant expressions with parameters.
These constant expression functions may or may not be called with arguments that have been evaluated
at compile-time (not just ‘const arguments’, as a const parameter value is not evaluated at
compile-time). If they are called with compile-time evaluated arguments then the returned value is
considered a const value as well.
This allows us to encapsulate expressions that can be evaluated at compile-time in functions, and
it allows us to use these functions in situations where previously the expressions themselves had to
be used. The encapsulation reduces the number of occurrences of the expressions to one, simplifying
maintenance and reducing the probability of errors.
If arguments that could not be compile-time evaluated are passed to constant-expression functions,
then these functions act like any other function, in that their return values are no longer considered
constant expressions.
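The two situations can be sketched as follows. The function twice and the other names used here are hypothetical and merely serve as illustration; the function is written C++11-style, using a single return statement.

```cpp
#include <cstddef>

// Hypothetical constant-expression function.
constexpr std::size_t twice(std::size_t value)
{
    return 2 * value;
}

// Compile-time evaluated argument: the returned value is a constant
// expression and may be used where the language requires one.
static_assert(twice(21) == 42, "evaluated at compile-time");
int fixedSize[twice(4)];                    // an array of 8 ints

// Argument only known at run-time: twice() now acts like any other
// function and its return value is not a constant expression.
std::size_t twiceAtRunTime(std::size_t value)
{
    return twice(value);
}
```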
Assume some two-dimensional arrays must be converted to one-dimensional arrays. The one-dimensional
array must have nRows * nCols + nRows + nCols + 1 elements, to store row, column,
and total marginals, as well as the elements of the source array itself. Furthermore assume
that nRows and nCols have been defined as globally available size_t const values (they could be
a class’s static data). The one-dimensional arrays are data members of a class or struct, or they are
also defined as global arrays.
Now that constant-expression functions are available the expression returning the number of the
required elements can be encapsulated in such a function:
size_t const nRows = 45;
size_t const nCols = 10;
size_t constexpr nElements(size_t rows, size_t cols)
{
    return rows * cols + rows + cols + 1;
}
....
int intLinear[ nElements(nRows, nCols) ];
struct Linear
{
    double d_linear[ nElements(nRows, nCols) ];
};
If another part of the program needs to use a linear array for an array of different sizes then the
constant-expression function can also be used. E.g.,
string stringLinear[ nElements(10, 4) ];
Constant-expression functions can be used in other constant expression functions as well. The following
constant-expression function returns half the value, rounded upwards, that is returned by
nElements:
size_t constexpr halfNElements(size_t rows, size_t cols)
{
    return (nElements(rows, cols) + 1) >> 1;
}
Classes should not expose their data members to external software, so as to reduce coupling between
classes and external software. But if a class defines a static const size_t data member then
that member’s value could very well be used to define entities living outside of the class’s scope, like
the number of elements of an array or to define the value of some enum. In situations like these
constant-expression functions are the perfect tool to maintain proper data hiding:
class Data
{
    static size_t const s_size = 7;
    public:
        static size_t constexpr size();
        size_t constexpr mSize();
};
size_t constexpr Data::size()
{
    return s_size;
}
size_t constexpr Data::mSize()
{
    return size();
}
double data[ Data::size() ]; // OK: 7 elements
short data2[ Data().mSize() ]; // also OK: see below
Please note the following:
• Constant-expression functions are implicitly declared inline;
• Non-static constant-expression member functions are implicitly const, and a const member
modifier for them is optional;
• Constant values (e.g., static constant data members) used by constant-expression functions
must be known by the time the compiler encounters the functions’ definitions. That’s why
s_size was initialized in Data’s class interface.
The C++14 standard relaxes the characteristics of constexpr functions. Using the C++14 standard,
constexpr functions may
• define any kind of variable except for static or thread_local variables;
• define variables without initializers;
• use conditional statements (if and switch);
• use repetition statements, including the range-based for statement;
• use expressions changing the values of objects that are local to the constexpr function.
In addition, C++14 allows constexpr member functions to be non-const. But note that non-const
constexpr member functions can only modify data members of objects that were defined local to
the constexpr function calling the non-const constexpr member function.
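A small sketch of a constexpr function using these relaxed rules is shown next. It requires a compiler supporting C++14 or later, and the function factorial is a hypothetical example, not taken from the text above.

```cpp
#include <cstddef>

// Hypothetical C++14-style constexpr function.
constexpr std::size_t factorial(std::size_t n)
{
    std::size_t result = 1;                     // a local variable
    for (std::size_t idx = 2; idx <= n; ++idx)  // a repetition statement
        result *= idx;                          // changes a local object
    return result;
}

// Called with a compile-time argument the function can still be
// evaluated by the compiler.
static_assert(factorial(5) == 120, "computed at compile-time");
```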
8.1.4.1 Constant expression data
As we’ve seen, (member) functions and variables of primitive data types can be defined with the
constexpr modifier. What about class-type objects?
Objects of classes are values of class type, and like values of primitive types they can be defined with
the constexpr specifier. Constant expression class-type objects must be initialized with constant
expression arguments; the constructor that is actually used must itself have been declared with the
constexpr modifier. Note again that the constexpr constructor’s definition must have been seen
by the compiler before the constexpr object can be constructed:
class ConstExpr
{
    public:
        constexpr ConstExpr(int x);
};
ConstExpr ok(7); // OK: not declared as constexpr
constexpr ConstExpr err(7); // ERROR: constructor’s definition
                            // not yet seen
constexpr ConstExpr::ConstExpr(int x)
{}
constexpr ConstExpr okNow(7); // OK: definition seen
constexpr ConstExpr okToo = ConstExpr(7); // also OK
A constant-expression constructor has the following characteristics:
• it is declared with the constexpr modifier;
• its member initializers only use constant expressions;
• its body is empty.
An object constructed with a constant-expression constructor is called a user-defined literal. Destructors
and copy constructors of user-defined literals must be trivial.
The constexpr characteristic of user-defined literals may or may not be maintained by its class’s
members. If a member is not declared with a constexpr return value, then using that member
does not result in a constant expression. If a member does declare a constexpr return value then
that member’s return value is considered a constexpr if the member is by itself a constant-expression
function. To maintain its constexpr characteristics it can refer to its class’s data members only if
its object has been defined with the constexpr modifier, as illustrated by the example:
class Data
{
    int d_x;
    public:
        constexpr Data(int x)
        :
            d_x(x)
        {}
        int constexpr cMember() const   // const: required as of C++14
        {
            return d_x;
        }
        int member() const
        {
            return d_x;
        }
};
Data d1(0); // OK, but not a constant expression
enum e1 {
    ERR = d1.cMember() // ERROR: cMember(): no constant
};                     // expression anymore
constexpr Data d2(0); // OK, constant expression
enum e2 {
    OK = d2.cMember(), // OK: cMember(): now a constant
                       // expression
    ERR = d2.member(), // ERR: member(): not a constant
};                     // expression
8.2 Static member functions
In addition to static data members, C++ allows us to define static member functions. Similar to
static data that are shared by all objects of the class, static member functions also exist without any
associated object of their class.
Static member functions can access all static members of their class, but also the members (private
or public) of objects of their class if they are informed about the existence of these objects (as in
the upcoming example). As static member functions are not associated with any object of their class
they do not have a this pointer. In fact, a static member function is completely comparable to a
global function, not associated with any class (at least in practice; see the next section (8.2.1)
for a subtle note). Since static member functions do not require an associated object, static member
functions declared in the public section of a class interface may be called without specifying an object
of their class. The following example illustrates this characteristic of static member functions:
class Directory
{
    string d_currentPath;
    static char s_path[];
    public:
        static void setpath(char const *newpath);
        static void preset(Directory &dir, char const *newpath);
};
inline void Directory::preset(Directory &dir, char const *newpath)
{
    // see the text below
    dir.d_currentPath = newpath; // 1
}
char Directory::s_path[200] = "/usr/local"; // 2
void Directory::setpath(char const *newpath)
{
    if (strlen(newpath) >= 200)
        throw "newpath too long";
    strcpy(s_path, newpath); // 3
}
int main()
{
    Directory dir;
    Directory::setpath("/etc"); // 4
    dir.setpath("/etc"); // 5
    Directory::preset(dir, "/usr/local/bin"); // 6
    dir.preset(dir, "/usr/local/bin"); // 7
}
• at 1 a static member function modifies a private data member of an object. However, the object
whose member must be modified is given to the member function as a reference parameter.
Note that static member functions can be defined as inline functions.
• at 2 a relatively long array is defined to be able to accommodate long paths. Alternatively, a
string or a pointer to dynamic memory could be used.
• at 3 a (possibly longer, but not too long) new pathname is stored in the static data member
s_path[]. Note that only static members are used.
• at 4, setpath() is called. It is a static member, so no object is required. But the compiler must
know to which class the function belongs, so the class is mentioned using the scope resolution
operator.
• at 5, the same is implemented as in 4. Here dir is used to tell the compiler that we’re talking
about a function in the Directory class. Static member functions can be called as normal
member functions, but this does not imply that the static member function receives the object’s
address as a this pointer. Here the member-call syntax is used as an alternative for the
classname plus scope resolution operator syntax.
• at 6, d_currentPath is altered. As in 4, the class and the scope resolution operator are used.
• at 7, the same is implemented as in 6. But here dir is used to tell the compiler that we’re
talking about a function in the Directory class. Here in particular note that this is not using
preset() as an ordinary member function of dir: the function still has no this-pointer, so
dir must be passed as argument to inform the static member function preset about the object
whose d_currentPath member it should modify.
In the example only public static member functions were used. C++ also allows the definition of
private static member functions. Such functions can only be called by member functions of their
class.
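A private static member function might be used as sketched below. The class PathCheck and its members are hypothetical; the length limit merely mirrors the 200-character array used in the example above.

```cpp
#include <cstring>

// Hypothetical class: a public static member delegates to a private
// static support member.
class PathCheck
{
    static bool fits(char const *path);     // private: for members only

    public:
        static bool acceptable(char const *path);
};

bool PathCheck::fits(char const *path)
{
    return std::strlen(path) < 200;         // room for the final 0-byte
}

bool PathCheck::acceptable(char const *path)
{
    return path != 0 && fits(path);         // OK: called by a member
}
```

Code outside of the class can call PathCheck::acceptable, but a direct call of PathCheck::fits is rejected by the compiler.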
8.2.1 Calling conventions
As noted in the previous section, static (public) member functions are comparable to classless functions.
However, formally this statement is not true, as the C++ standard does not prescribe the same
calling conventions for static member functions as for classless global functions.
In practice the calling conventions are identical, implying that the address of a static member function
could be used as an argument of functions having parameters that are pointers to (global)
functions.
If unpleasant surprises must be avoided at all cost, it is suggested to create global classless wrapper
functions around static member functions that must be used as call back functions for other
functions.
Recognizing that the traditional situations in which call back functions are used in C are tackled in
C++ using template algorithms (cf. chapter 19), let’s assume that we have a class Person having
data members representing the person’s name, address, phone and mass. Furthermore, assume we
want to sort an array of pointers to Person objects, by comparing the Person objects these pointers
point to. Keeping things simple, we assume that the following public static member exists:
int Person::compare(Person const *const *p1, Person const *const *p2);
A useful characteristic of this member is that it may directly inspect the required data members of
the two Person objects passed to the member function using pointers to pointers (double pointers).
Most compilers allow us to pass this function’s address as the address of the comparison function for
the standard C qsort() function. E.g.,
qsort
(
    personArray, nPersons, sizeof(Person *),
    reinterpret_cast<int(*)(void const *, void const *)>(Person::compare)
);
However, if the compiler uses different calling conventions for static members and for classless
functions, this might not work. In such a case, a classless wrapper function like the following may
be used profitably:
int compareWrapper(void const *p1, void const *p2)
{
    return
        Person::compare
        (
            static_cast<Person const *const *>(p1),
            static_cast<Person const *const *>(p2)
        );
}
resulting in the following call of the qsort() function:
qsort(personArray, nPersons, sizeof(Person *), compareWrapper);
Note:
• The wrapper function takes care of any mismatch in the calling conventions of static member
functions and classless functions;
• The wrapper function handles the required type casts;
• The wrapper function might perform small additional services (like dereferencing pointers if
the static member function expects references to Person objects rather than double pointers);
• As an aside: in C++ programs functions like qsort(), requiring the specification of call back
functions, are seldom used. Instead, using existing generic template algorithms is preferred (cf.
chapter 19).
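To give an impression of the generic alternative, here is a minimal sketch using the std::sort algorithm. The struct Person shown here is a strongly simplified, hypothetical stand-in for the class described above (it only stores a name), and sortByName is an assumed helper function.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified Person: only a name is stored.
struct Person
{
    std::string name;
};

// Sorts pointers to Person objects by the names of the objects they
// point to. No casts and no wrapper function are required.
void sortByName(std::vector<Person const *> &persons)
{
    std::sort(persons.begin(), persons.end(),
        [](Person const *lhs, Person const *rhs)
        {
            return lhs->name < rhs->name;
        }
    );
}
```

As the comparison is fully typed, calling conventions and void pointers play no role in this approach.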
Chapter 9
Classes And Memory Allocation
In contrast to the set of functions that handle memory allocation in C (i.e., malloc etc.), memory
allocation in C++ is handled by the operators new and delete. Important differences between
malloc and new are:
• The function malloc doesn’t ‘know’ what the allocated memory will be used for. E.g., when
memory for ints is allocated, the programmer must supply the correct expression using a
multiplication by sizeof(int). In contrast, new requires a type to be specified; the sizeof
expression is implicitly handled by the compiler. Using new is therefore type safe.
• Memory allocated by malloc is not initialized; calloc merely sets all allocated bytes to
zero. This is not very useful when objects are available. As operator
new knows about the type of the allocated entity it may (and will) call the constructor of an
allocated class type object. This constructor may also be supplied with arguments.
• All C-allocation functions must be inspected for NULL-returns. This is not required anymore
when new is used. In fact, new’s behavior when confronted with failing memory allocation is
configurable through the use of a new_handler (cf. section 9.2.2).
A comparable relationship exists between free and delete: delete makes sure that when an
object is deallocated, its destructor is automatically called.
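The difference can be made visible by a class counting its own living objects. The following is a sketch only: Tracked and both functions are hypothetical names introduced for this illustration.

```cpp
#include <cstdlib>

// Hypothetical class keeping track of the number of living objects.
struct Tracked
{
    static int s_alive;
    Tracked()  { ++s_alive; }
    ~Tracked() { --s_alive; }
};
int Tracked::s_alive = 0;

// Returns the number of living objects between new and delete.
int aliveAfterNew()
{
    Tracked *tp = new Tracked;          // the constructor is called
    int alive = Tracked::s_alive;
    delete tp;                          // the destructor is called
    return alive;
}

// Returns the number of living objects between malloc and free.
int aliveAfterMalloc()
{
    void *raw = std::malloc(sizeof(Tracked));   // merely bytes
    int alive = Tracked::s_alive;       // no constructor was called
    std::free(raw);                     // no destructor is called
    return alive;
}
```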
The automatic calling of constructors and destructors when objects are created and destroyed has
consequences which we shall discuss in this chapter. Many problems encountered during C program
development are caused by incorrect memory allocation or memory leaks: memory is not allocated,
not freed, not initialized, boundaries are overwritten, etc. C++ does not ‘magically’ solve these
problems, but it does provide us with tools to prevent these kinds of problems.
As malloc and friends should be avoided in C++, the frequently used malloc based str...
functions, like strdup, should be avoided in C++ programs as well. Instead, the
facilities of the string class and operators new and delete should be used.
Memory allocation procedures influence the way classes dynamically allocating their own memory
should be designed. Therefore, in this chapter these topics are discussed in addition to discussions
about operators new and delete. We’ll first cover the peculiarities of operators new and delete,
followed by a discussion about:
• the destructor: the member function that’s called when an object ceases to exist;
• the assignment operator, allowing us to assign an object to another object of its own class;
• the this pointer, allowing explicit references to the object for which a member function was
called;
• the copy constructor: the constructor creating a copy of an object;
• the move constructor: a constructor creating an object from an anonymous temporary object.
9.1 Operators ‘new’ and ‘delete’
C++ defines two operators to allocate memory and to return it to the ‘common pool’. These operators
are, respectively, new and delete.
Here is a simple example illustrating their use. An int pointer variable points to memory allocated
by operator new. This memory is later released by operator delete.
int *ip = new int;
delete ip;
Here are some characteristics of operators new and delete:
• new and delete are operators and therefore do not require parentheses, as required for functions
like malloc and free;
• new returns a pointer to the kind of memory that’s asked for by its operand (e.g., it returns a
pointer to an int);
• new uses a type as its operand, which has the important benefit that the correct amount of
memory, given the type of the object to be allocated, is made available;
• as a consequence, new is a type safe operator as it always returns a pointer to the type that
was mentioned as its operand. In addition, the type of the receiving pointer must match the
type specified with operator new;
• new may fail, but this is normally of no concern to the programmer. In particular, the program
does not have to test the success of the memory allocation, as is required for malloc and
friends. Section 9.2.2 delves into this aspect of new;
• delete returns void;
• for each call to new a matching delete should eventually be executed, lest a memory leak
occurs;
• delete can safely operate on a 0-pointer (doing nothing);
• otherwise delete must only be used to return memory allocated by new. It should not be used
to return memory allocated by malloc and friends.
• in C++ malloc and friends are deprecated and should be avoided.
Operator new can be used to allocate primitive types but also to allocate objects. When a primitive
type or a struct type without a constructor is allocated the allocated memory is not guaranteed to
be initialized to 0, but an initialization expression may be provided:
int *v0 = new int; // not guaranteed to be initialized to 0
int *v1 = new int(); // initialized to 0
int *v2 = new int(3); // initialized to 3
int *v3 = new int(3 * *v2); // initialized to 9
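The effect of these initialization expressions can be verified by a small sketch. The struct Pair and the two functions are hypothetical, introduced here for illustration only.

```cpp
// Hypothetical struct type without a constructor.
struct Pair
{
    int first;
    double second;
};

// Using empty parentheses, primitive values and the members of a
// struct without a constructor are initialized to 0.
bool zeroInitialized()
{
    int *ip = new int();
    Pair *pp = new Pair();

    bool allZero = *ip == 0 && pp->first == 0 && pp->second == 0;

    delete pp;
    delete ip;
    return allZero;
}

// An initialization expression provides another initial value.
int initializedTo(int value)
{
    int *ip = new int(value);
    int stored = *ip;
    delete ip;
    return stored;
}
```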
When a class-type object is allocated, the arguments of its constructor (if any) are specified immediately
following the type specification in the new expression and the object is initialized by the
thus specified constructor. For example, to allocate string objects the following statements could
be used:
string *s1 = new string; // uses the default constructor
string *s2 = new string(); // same
string *s3 = new string(4, ' '); // initializes to 4 blanks.
In addition to using new to allocate memory for a single entity or an array of entities there is also
a variant that allocates raw memory: operator new(sizeInBytes). Raw memory is returned as
a void *. Here new allocates a block of memory for unspecified purpose. Although raw memory
may consist of multiple characters it should not be interpreted as an array of characters. Since
raw memory returned by new is returned as a void * its return value can be assigned to a void *
variable. More often it is assigned to a char * variable, using a cast. Here is an example:
char *chPtr = static_cast<char *>(operator new(numberOfBytes));
The use of raw memory is frequently encountered in combination with the placement new operator,
discussed in section 9.1.5.
9.1.1 Allocating arrays
Operator new[] is used to allocate arrays. The generic notation new[] is used in the C++ Annotations.
Actually, the number of elements to be allocated must be specified between the square
brackets and it must, in turn, be prefixed by the type of the entities that must be allocated. Example:
int *intarr = new int[20]; // allocates 20 ints
string *stringarr = new string[10]; // allocates 10 strings.
Operator new is a different operator than operator new[]. A consequence of this difference is discussed
in the next section (9.1.2).
Arrays allocated by operator new[] are called dynamic arrays. They are constructed during the
execution of a program, and their lifetime may exceed the lifetime of the function in which they
were created. Dynamically allocated arrays may last for as long as the program runs.
When new[] is used to allocate an array of primitive values or an array of objects, new[] must
be specified with a type and an (unsigned) expression between its square brackets. The type and
expression together are used by the compiler to determine the required size of the block of memory
to make available. When new[] is used the array’s elements are stored consecutively in memory. An
array index expression may thereafter be used to access the array’s individual elements: intarr[0]
represents the first int value, immediately followed by intarr[1], and so on until the last element
(intarr[19]). With non-class types (primitive types, struct types without constructors) the block
of memory returned by operator new[] is not guaranteed to be initialized to 0.
When operator new[] is used to allocate arrays of objects their constructors are automatically used.
Consequently new string[20] results in a block of 20 initialized string objects. When allocating
arrays of objects the class’s default constructor is used to initialize each individual object in turn. A
non-default constructor cannot be called, but often it is possible to work around that as discussed in
section 13.8.
The expression between brackets of operator new[] represents the number of elements of the array
to allocate. The C++ standard allows allocation of 0-sized arrays. The statement new int[0] is
correct C++. However, it is also pointless and confusing and should be avoided. It is pointless as it
doesn’t refer to any element at all, it is confusing as the returned pointer has a useless non-0 value.
A pointer intending to point to an array of values should be initialized (like any pointer that isn’t
yet pointing to memory) to 0, allowing for expressions like if (ptr) ...
Without using operator new[], arrays of variable sizes can also be constructed as local arrays
(a facility offered by some compilers as an extension; such arrays are not part of the C++ standard).
Such arrays are not dynamic arrays and their lifetimes are restricted to the lifetime of the block in which
they were defined.
Once allocated, all arrays have fixed sizes. There is no simple way to enlarge or shrink arrays. C++
has no operator ‘renew’. Section 9.1.3 illustrates how to enlarge arrays.
9.1.2 Deleting arrays
Dynamically allocated arrays are deleted using operator delete[]. It expects a pointer to a block
of memory, previously allocated by operator new[].
When operator delete[]’s operand is a pointer to an array of objects two actions are performed:
• First, the class’s destructor is called for each of the objects in the array. The destructor, as
explained later in this chapter, performs all kinds of cleanup operations that are required by
the time the object ceases to exist.
• Second, the memory pointed at by the pointer is returned to the common pool.
Here is an example showing how to allocate and delete an array of 10 string objects:
std::string *sp = new std::string[10];
delete[] sp;
No special action is performed if a dynamically allocated array of primitive typed values is deleted.
Following int *it = new int[10] the statement delete[] it simply returns the memory
pointed at by it. Realize that, as a pointer is a primitive type, deleting a dynamically allocated
array of pointers to objects does not result in the proper destruction of the objects the array’s elements
point at. So, the following example results in a memory leak:
string **sp = new string *[5];
for (size_t idx = 0; idx != 5; ++idx)
    sp[idx] = new string;
delete[] sp; // MEMORY LEAK !
In this example the only action performed by delete[] is to return an area the size of five pointers
to strings to the common pool.
Here’s how the destruction in such cases should be performed:
• Call delete for each of the array’s elements;
• Delete the array itself.
Example:
for (size_t idx = 0; idx != 5; ++idx)
    delete sp[idx];
delete[] sp;
One of the consequences is of course that by the time the memory is going to be returned not only
the pointer must be available but also the number of elements of the array it points to. This can
easily be accomplished by storing pointer and number of elements in a simple class and then using
an object of that class.
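Such a class could look like the following sketch. The class StringPtrs and its members are hypothetical; the sketch relies on a destructor (the member that is called when an object ceases to exist) and it ignores failing allocations as well as copying, which is simply disabled here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class owning an array of pointers to strings.
class StringPtrs
{
    std::string **d_ptrs;
    std::size_t d_size;

    public:
        explicit StringPtrs(std::size_t size);
        ~StringPtrs();

        StringPtrs(StringPtrs const &) = delete;    // copying is not
        StringPtrs &operator=(StringPtrs const &) = delete; // supported

        std::string &at(std::size_t idx) { return *d_ptrs[idx]; }
        std::size_t size() const { return d_size; }
};

StringPtrs::StringPtrs(std::size_t size)
:
    d_ptrs(new std::string *[size]),
    d_size(size)
{
    for (std::size_t idx = 0; idx != d_size; ++idx)
        d_ptrs[idx] = new std::string;
}

StringPtrs::~StringPtrs()
{
    for (std::size_t idx = 0; idx != d_size; ++idx)
        delete d_ptrs[idx];             // first the strings themselves
    delete[] d_ptrs;                    // then the array of pointers
}
```

As the object stores its own number of elements, the two-step destruction can be performed without the class's users having to remember the array's size.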
Operator delete[] is a different operator than operator delete. The rule of thumb is: if new[]
was used, also use delete[].
9.1.3 Enlarging arrays
Once allocated, all arrays have fixed sizes. There is no simple way to enlarge or shrink arrays. C++
has no renew operator. The basic steps to take when enlarging an array are the following:
• Allocate a new block of memory of larger size;
• Copy the old array contents to the new array;
• Delete the old array;
• Let the pointer to the array point to the newly allocated array.
Static and local arrays cannot be resized. Resizing is only possible for dynamically allocated arrays.
Example:
#include <string>
using namespace std;
string *enlarge(string *old, unsigned oldsize, unsigned newsize)
{
    string *tmp = new string[newsize]; // allocate larger array
    for (size_t idx = 0; idx != oldsize; ++idx)
        tmp[idx] = old[idx]; // copy old to tmp
    delete[] old; // delete the old array
    return tmp; // return new array
}
int main()
{
    string *arr = new string[4]; // initially: array of 4 strings
    arr = enlarge(arr, 4, 6); // enlarge arr to 6 elements.
    delete[] arr; // return the enlarged array
}
The procedure to enlarge shown in the example has several drawbacks:
• The new array requires newsize constructors to be called;
• Having initialized the strings in the new array, oldsize of them are immediately reassigned
to the corresponding values in the original array;
• All the objects in the old array are destroyed.
Depending on the context various solutions exist to improve the efficiency of this rather inefficient
procedure. An array of pointers could be used (requiring only the pointers to be copied, no destruction,
no superfluous initialization) or raw memory in combination with the placement new operator
could be used (an array of objects remains available, no destruction, no superfluous construction).
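The first alternative, an array of pointers, might be sketched as follows. The function enlargePtrs is a hypothetical name, and the sketch assumes that newsize is at least equal to oldsize.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function enlarging an array of pointers to strings.
// Assumption: newsize >= oldsize.
std::string **enlargePtrs(std::string **old, std::size_t oldsize,
                          std::size_t newsize)
{
    std::string **tmp = new std::string *[newsize];

    for (std::size_t idx = 0; idx != oldsize; ++idx)
        tmp[idx] = old[idx];            // only pointers are copied

    for (std::size_t idx = oldsize; idx != newsize; ++idx)
        tmp[idx] = 0;                   // no strings here yet

    delete[] old;                       // pointers are primitive values:
                                        // the strings are not destroyed
    return tmp;
}
```

No string is constructed, assigned or destroyed by this function. The price to pay is that the strings themselves must be allocated and deleted separately.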
9.1.4 Managing ‘raw’ memory
As we’ve seen operator new allocates the memory for an object and subsequently initializes that
object by calling one of its constructors. Likewise, operator delete calls an object’s destructor and
subsequently returns the memory allocated by operator new to the common pool.
In the next section we’ll encounter another use of new, allowing us to initialize objects in so-called
raw memory: memory merely consisting of bytes that have been made available by either static or
dynamic allocation.
Raw memory is made available by operator new(sizeInBytes). This should not be interpreted
as an array of any kind but just a series of memory locations that were dynamically made available.
operator new returns a void * so a (static) cast is required to use it as memory of some type.
Here are two examples:
// room for 5 ints
int *ip = static_cast<int *>(operator new(5 * sizeof(int)));
// room for 5 strings
string *sp = static_cast<string *>(operator new(5 * sizeof(string)));
As operator new has no concept of data types the size of the intended data type must be specified
when allocating raw memory for a certain number of objects of an intended type. The use of
operator new therefore somewhat resembles the use of malloc.
The counterpart of operator new is operator delete. Operator delete expects a void * (so
a pointer to any type can be passed to it). The pointer is interpreted as a pointer to raw memory
which is returned to the common pool without any further action. In particular, no destructors are
called by operator delete. The use of operator delete therefore resembles the use of free.
To return the memory pointed at by the abovementioned variables ip and sp operator delete
should be used:
// delete raw memory allocated by operator new
operator delete(ip);
operator delete(sp);
9.1.5 The ‘placement new’ operator
A remarkable form of operator new is called the placement new operator. Before using placement
new the <new> header file, declaring it, must be included.
Placement new is passed an existing block of memory into which new initializes an object or value.
The block of memory should be large enough to contain the object (and it should be suitably aligned
for the object's type), but apart from that there are
no further requirements. It is easy to determine how much memory is used by an entity (object or
variable) of type Type: the sizeof operator returns the number of bytes used by a Type entity.
Entities may of course dynamically allocate memory for their own use. Dynamically allocated memory,
however, is not part of the entity’s memory ‘footprint’ but it is always made available externally
to the entity itself. This is why sizeof returns the same value when applied to different string
objects that return different length and capacity values.
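This is easily verified. In the following sketch (the function sameFootprint is a hypothetical name) two strings of very different lengths nevertheless have identical sizeof values:

```cpp
#include <cstddef>
#include <string>

// Returns true if two strings having different lengths have
// identical memory footprints.
bool sameFootprint()
{
    std::string small("a");
    std::string large(1000, 'x');       // 1000 characters

    return small.length() != large.length()
           && sizeof(small) == sizeof(large);
}
```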
The placement new operator uses the following syntax (using Type to indicate the used data type):
Type *new(void *memory) Type(arguments);
Here, memory is a block of memory of at least sizeof(Type) bytes and Type(arguments) is any
constructor of the class Type.
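A minimal sketch of this syntax in action is shown next. The functions construct and demo are hypothetical names, and the locally defined buffer is just one possible source of memory. Since no memory was allocated by new, delete is not used: the object's destructor is called explicitly instead.

```cpp
#include <cstddef>
#include <new>
#include <string>

using std::string;

// Constructs a string in the memory pointed at by `memory', which
// must be suitably aligned and at least sizeof(string) bytes large.
string *construct(void *memory, char const *text)
{
    return new (memory) string(text);   // placement new
}

// Hypothetical demonstration: returns the constructed string's length.
std::size_t demo()
{
    alignas(string) unsigned char buffer[sizeof(string)];

    string *sp = construct(buffer, "hello");
    std::size_t length = sp->length();

    sp->~string();      // explicit destructor call: the buffer itself
                        // is a local array and is not deleted
    return length;
}
```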
The placement new operator is useful in situations where classes set aside memory to be used later.
This is used, e.g., by std::string to change its capacity. Calling string::reserve may enlarge
that capacity without making memory beyond the string’s length immediately available to the
string object’s users. But the object itself may use its additional memory. E.g., when information is
added to a string object it can draw memory from its capacity rather than performing a reallocation
for each single character that is added to its contents.
Let’s apply that philosophy to a class Strings storing std::string objects. The class defines a
string *d_memory accessing the memory holding its d_size string objects as well as d_capacity
- d_size reserved memory. Assuming that a default constructor initializes d_capacity to 1,
doubling d_capacity whenever an additional string must be stored, the class must support the
following essential operations:
• doubling its capacity when all its spare memory (e.g., made available by reserve) has been
consumed;
• adding another string object
• properly deleting the installed strings and memory when a Strings object ceases to exist.
The private member void Strings::reserve is called when the current capacity must be enlarged
to d_capacity. It operates as follows: First new, raw, memory is allocated (line 1). This
memory is in no way initialized with strings. Then the available strings in the old memory are
copied into the newly allocated raw memory using placement new (line 2). Next, the old memory is
deleted (line 3).
void Strings::reserve()
{
using std::string;
string *newMemory = static_cast<string *>( // 1
operator new(d_capacity * sizeof(string)));
for (size_t idx = 0; idx != d_size; ++idx) // 2
new (newMemory + idx) string(d_memory[idx]);
destroy(); // 3
d_memory = newMemory;
}
The member append adds another string object to a Strings object. A (public) member
reserve(request) (enlarging d_capacity if necessary and, if enlarged, calling reserve()) ensures
that the Strings object’s capacity is sufficient. Then placement new is used to install the latest
string into the raw memory’s appropriate location:
void Strings::append(std::string const &next)
{
reserve(d_size + 1);
new (d_memory + d_size) std::string(next);
++d_size;
}
At the end of the Strings object’s lifetime, and during enlarging operations, all currently used dynamically
allocated memory must be returned. This is made the responsibility of the member destroy,
which is called by the class’s destructor and by reserve(). More about the destructor itself in the
next section, but the implementation of the support member destroy is discussed below.
With placement new an interesting situation is encountered. Objects, possibly themselves allocating
memory, are installed in memory that may or may not have been allocated dynamically, but that is
usually not completely filled with such objects. So a simple delete[] can’t be used. On the other
hand, a delete for each of the objects that are available can’t be used either, since those delete
operations would also try to delete the memory of the objects themselves, which wasn’t dynamically
allocated.
This peculiar situation is solved in a peculiar way, only encountered in cases where placement new
is used: memory allocated by objects initialized using placement new is returned by explicitly calling
the object’s destructor. The destructor is declared as a member having as its name the class name
preceded by a tilde, not using any arguments. So, std::string’s destructor is named ~string. An
object’s destructor only returns memory allocated by the object itself and, despite its name, does
not destroy its object. Any memory allocated by the strings stored in our class Strings is therefore
properly destroyed by explicitly calling their destructors. Following this d_memory is back to its
initial status: it again points to raw memory. This raw memory is then returned to the common pool
by operator delete:
void Strings::destroy()
{
for (std::string *sp = d_memory + d_size; sp-- != d_memory; )
sp->~string();
operator delete(d_memory);
}
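The three operations can be combined into one compilable sketch. It is not the class as developed in the text: the members enlarge, size, capacity and at, the in-class initializers, and the deleted copy operations are assumptions made to keep the example self-contained, and the capacity starts at 0 rather than 1. Exception safety is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

class Strings
{
    std::string *d_memory = nullptr;    // raw memory; the first d_size slots hold strings
    std::size_t d_size = 0;
    std::size_t d_capacity = 0;

    public:
        Strings() = default;
        Strings(Strings const &) = delete;              // copying is outside this sketch
        Strings &operator=(Strings const &) = delete;
        ~Strings() { destroy(); }

        void append(std::string const &next)
        {
            if (d_size == d_capacity)
                enlarge();
            new (d_memory + d_size) std::string(next);  // install into raw memory
            ++d_size;
        }
        std::size_t size() const { return d_size; }
        std::size_t capacity() const { return d_capacity; }
        std::string const &at(std::size_t idx) const { return d_memory[idx]; }

    private:
        void enlarge()                                  // doubles the capacity
        {
            std::size_t newCapacity = d_capacity == 0 ? 1 : 2 * d_capacity;
            std::string *newMemory = static_cast<std::string *>(
                ::operator new(newCapacity * sizeof(std::string)));     // 1: raw memory
            for (std::size_t idx = 0; idx != d_size; ++idx)             // 2: copy strings
                new (newMemory + idx) std::string(d_memory[idx]);
            destroy();                                                  // 3: old memory
            d_memory = newMemory;
            d_capacity = newCapacity;
        }
        void destroy()
        {
            using std::string;
            for (std::size_t idx = d_size; idx-- != 0; )
                d_memory[idx].~string();        // return memory owned by each string
            ::operator delete(d_memory);        // return the raw memory itself
        }
};

// Appends count strings and reports the resulting capacity.
std::size_t capacityAfter(std::size_t count)
{
    Strings store;
    for (std::size_t idx = 0; idx != count; ++idx)
        store.append(std::string(idx + 1, 'x'));
    assert(store.size() == count);
    return store.capacity();
}
```

With these assumptions capacityAfter(3) returns 4 and capacityAfter(5) returns 8: the capacity doubles each time the spare memory has been consumed.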
So far, so good. All is well as long as we’re using but one object. What about allocating an array of
objects? Initialization is performed as usual. But as with delete, delete[] cannot be called when
the buffer was allocated statically. Instead, when multiple objects were initialized using placement
new in combination with a statically allocated buffer all the objects’ destructors must be called explicitly,
as in the following example:
using std::string;
char buffer[3 * sizeof(string)];
string *sp = new(buffer) string [3];
for (size_t idx = 0; idx < 3; ++idx)
sp[idx].~string();
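A variant of this example is sketched below. It is an illustration under two assumptions that are not part of the example above: the buffer is declared with alignas, as a plain char array is not guaranteed to be suitably aligned for string objects, and the elements are constructed one at a time so that each can receive its own constructor arguments. The name buildAndDestroy is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Builds three strings in a local buffer, then destroys them explicitly.
// Returns the summed length of the strings that were constructed.
std::size_t buildAndDestroy()
{
    using std::string;
    constexpr std::size_t count = 3;

    alignas(string) unsigned char buffer[count * sizeof(string)];
    string *elements[count];

    std::size_t total = 0;
    for (std::size_t idx = 0; idx != count; ++idx)
    {
        // strings of length 1, 2 and 3, each in its own slot of the buffer
        elements[idx] = new (buffer + idx * sizeof(string)) string(idx + 1, 'x');
        total += elements[idx]->length();
    }

    for (std::size_t idx = count; idx-- != 0; )
        elements[idx]->~string();       // no delete: the buffer is not dynamic memory

    return total;
}
```

Here buildAndDestroy() returns 6, the sum of the three lengths.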
9.2 The destructor
Comparable to the constructor, classes may define a destructor. This function is the constructor’s
counterpart in the sense that it is invoked when an object ceases to exist. A destructor is usually
called automatically, but that’s not always true. The destructors of dynamically allocated objects are
not automatically activated, but in addition to that: when a program is interrupted by an exit call,
only the destructors of already initialized global objects are called. In that situation destructors of
objects defined locally by functions are also not called. This is one (good) reason for avoiding exit
in C++ programs.
Destructors obey the following syntactical requirements:
• a destructor’s name is equal to its class name prefixed by a tilde;
• a destructor has no arguments;
• a destructor has no return value.
Destructors are declared in their class interfaces. Example:
class Strings
{
public:
Strings();
~Strings(); // the destructor
};
By convention the constructors are declared first. The destructor is declared next, to be followed by
other member functions.
A destructor’s main task is to ensure that memory allocated by an object is properly returned when
the object ceases to exist. Consider the following interface of the class Strings:
class Strings
{
std::string *d_string;
size_t d_size;
public:
Strings();
Strings(char const *const *cStrings, size_t n);
~Strings();
std::string const &at(size_t idx) const;
size_t size() const;
};
The constructor’s task is to initialize the data fields of the object. E.g., its constructors are defined as
follows:
Strings::Strings()
:
d_string(0),
d_size(0)
{}
Strings::Strings(char const *const *cStrings, size_t size)
:
d_string(new std::string[size]),
d_size(size)
{
for (size_t idx = 0; idx != size; ++idx)
d_string[idx] = cStrings[idx];
}
As objects of the class Strings allocate memory a destructor is clearly required. Destructors may
or may not be called automatically, but note that destructors are only called (or, in the case of dynamically
allocated objects: should only be called) for fully constructed objects.
C++ considers an object ‘fully constructed’ once at least one of its constructors could normally complete.
It used to be the constructor, but as C++ supports constructor delegation, multiple constructors can
be activated for a single object; hence ‘at least one constructor’. The remaining rules apply to fully
constructed objects:
• Destructors of local non-static objects are called automatically when the execution flow leaves
the block in which they were defined; the destructors of objects defined somewhere in the outer
block of a function are called just before the function terminates.
• Destructors of static or global objects are called when the program itself terminates.
• The destructor of a dynamically allocated object is called by delete using the object’s address
as its operand;
• The destructors of a dynamically allocated array of objects are called by delete[] using the
address of the array’s first element as its operand;
• The destructor of an object initialized by placement new is activated by explicitly calling the
object’s destructor.
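Some of these rules can be observed by logging destructor calls. In the following sketch the class Tracer, the global g_log and the function destructionOrder are hypothetical names introduced for illustration only:

```cpp
#include <cassert>
#include <new>
#include <string>

std::string g_log;                      // hypothetical log of destructor calls

struct Tracer
{
    char d_id;
    explicit Tracer(char id) : d_id(id) {}
    ~Tracer() { g_log += d_id; }        // record which object is destroyed
};

std::string destructionOrder()
{
    g_log.clear();
    {
        Tracer local('L');              // destroyed when the block is left

        Tracer *dynamic = new Tracer('D');
        delete dynamic;                 // delete calls the destructor

        alignas(Tracer) unsigned char buffer[sizeof(Tracer)];
        Tracer *placed = new (buffer) Tracer('P');
        placed->~Tracer();              // placement new: explicit destructor call
    }
    return g_log;
}
```

The function returns "DPL": the dynamically allocated object and the object installed by placement new are destroyed by explicit actions, the local object only when its block ends.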
The destructor’s task is to ensure that all memory that is dynamically allocated and controlled only
by the object itself is returned. The task of the Strings’s destructor would therefore be to delete
the memory to which d_string points. Its implementation is:
Strings::~Strings()
{
delete[] d_string;
}
The next example shows Strings at work. In process a Strings store is created, and its data
are displayed. It returns a dynamically allocated Strings object to main. A Strings * receives
the address of the allocated object and deletes the object again. Another Strings object is then
created in a block of memory made available locally in main, and an explicit call to ~Strings is
required to return the memory allocated by that object. In the example only one Strings object
is destroyed automatically: the local Strings object defined by process. The other two Strings
objects require explicit actions to prevent memory leaks.
#include "strings.h"
#include <iostream>
using namespace std;
void display(Strings const &store)
{
for (size_t idx = 0; idx != store.size(); ++idx)
cout << store.at(idx) << '\n';
}
Strings *process(char *argv[], int argc)
{
Strings store(argv, argc);
display(store);
return new Strings(argv, argc);
}
int main(int argc, char *argv[])
{
Strings *sp = process(argv, argc);
delete sp;
alignas(Strings) char buffer[sizeof(Strings)];  // suitably aligned for a Strings object
sp = new (buffer) Strings(argv, argc);
sp->~Strings();
}
9.2.1 Object pointers revisited
Operators new and delete are used when an object or variable is allocated. One of the advantages
of the operators new and delete over functions like malloc and free is that new and delete call
the corresponding object constructors and destructors.
The allocation of an object by operator new is a two-step process. First the memory for the object
itself is allocated. Then its constructor is called, initializing the object. Analogously to the construction
of an object, the destruction is also a two-step process: first, the destructor of the class is called
deleting the memory controlled by the object. Then the memory used by the object itself is freed.
Dynamically allocated arrays of objects can also be handled by new and delete. When allocating
an array of objects using operator new the default constructor is called for each object in the array.
In cases like this operator delete[] must be used to ensure that the destructor is called for each of
the objects in the array.
However, the addresses returned by new Type and new Type[size] are of identical types, in both
cases a Type *. Consequently it cannot be determined by the type of the pointer whether a pointer
to dynamically allocated memory points to a single entity or to an array of entities.
What happens if delete rather than delete[] is used? Consider the following situation, in which
the destructor ~Strings is modified so that it tells us that it is called. In a main function an array
of two Strings objects is allocated by new, to be deleted by delete []. Next, the same actions are
repeated, albeit that the delete operator is called without []:
#include <iostream>
#include "strings.h"
using namespace std;
Strings::~Strings()
{
cout << "Strings destructor called" << '\n';
}
int main()
{
Strings *a = new Strings[2];
cout << "Destruction with []'s" << '\n';
delete[] a;
a = new Strings[2];
cout << "Destruction without []'s" << '\n';
delete a; // wrong on purpose: delete[] is required here
}
/*
Generated output:
Destruction with []'s
Strings destructor called
Strings destructor called
Destruction without []'s
Strings destructor called
*/
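From the generated output, we see that the destructors of the individual Strings objects are called
when delete[] is used, while only the first object’s destructor is called if the [] is omitted. Formally,
using delete on memory allocated by new[] results in undefined behavior, so this output is merely
what was observed here and cannot be relied upon.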
Conversely, if delete[] is called in a situation where delete should have been called the results
are unpredictable, and the program will most likely crash. This problematic behavior is caused by
the way the run-time system stores information about the size of the allocated array (usually right
before the array’s first element). If a single object is allocated the array-specific information is not
available, but it is nevertheless assumed present by delete[]. Thus this latter operator encounters
bogus values in the memory locations just before the array’s first element. It then dutifully
interprets the value it encounters there as size information, usually causing the program to fail.
If no destructor is defined, a trivial destructor is defined by the compiler. The trivial destructor
ensures that the destructors of composed objects (as well as the destructors of base classes if a
class is a derived class, cf. chapter 13) are called. This has serious implications: objects allocating
memory create memory leaks unless precautionary measures are taken (by defining an appropriate
destructor). Consider the following program:
#include <iostream>
#include "strings.h"
using namespace std;
Strings::~Strings()
{
cout << "Strings destructor called" << '\n';
}
int main()
{
Strings **ptr = new Strings* [2];
ptr[0] = new Strings[2];
ptr[1] = new Strings[2];
delete[] ptr;
}
This program produces no output at all. Why is this? The variable ptr is defined as a pointer to
a pointer. The dynamically allocated array therefore consists of pointer variables and pointers are
of a primitive type. No destructors exist for primitive typed variables. Consequently only the array
itself is returned, and no Strings destructor is called.
Of course, we don’t want this, but require the Strings objects pointed to by the elements of ptr to
be deleted too. In this case we have two options:
• In a for-statement visit all the elements of the ptr array, calling delete for each of the array’s
elements. This procedure was demonstrated in the previous section.
• A wrapper class is designed around a pointer (to, e.g., an object of some class, like Strings).
Rather than using a pointer to a pointer to Strings objects a pointer to an array of wrapper
class objects is used. As a result delete[] ptr calls the destructor of each of the wrapper
class objects, in turn calling the Strings destructor for their d_strings members. Example:
#include <iostream>
using namespace std;
class Strings // partially implemented
{
public:
~Strings();
};
inline Strings::~Strings()
{
cout << "destructor called\n";
}
class Wrapper
{
Strings *d_strings;
public:
Wrapper();
~Wrapper();
};
inline Wrapper::Wrapper()
:
d_strings(new Strings())
{}
inline Wrapper::~Wrapper()
{
delete d_strings;
}
int main()
{
auto ptr = new Strings *[4];
// ... code assigning ‘new Strings’ to ptr’s elements
delete[] ptr; // memory leak: ~Strings() not called
cout << "===========\n";
delete[] new Wrapper[4]; // OK: 4 x destructor called
}
/*
Generated output:
===========
destructor called
destructor called
destructor called
destructor called
*/
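The standard library offers a ready-made class of this kind: std::unique_ptr (declared in <memory>) stores a pointer and deletes the object it points to in its destructor. The following sketch uses it in the role of Wrapper; the class Item, the counter g_destroyed and the function name are hypothetical and only serve to count destructor calls:

```cpp
#include <cassert>
#include <memory>

int g_destroyed = 0;                    // hypothetical counter of destructor calls

struct Item
{
    ~Item() { ++g_destroyed; }
};

int destroyedThroughUniquePtr()
{
    g_destroyed = 0;

    auto *ptr = new std::unique_ptr<Item>[4];   // array of wrapper objects
    for (int idx = 0; idx != 4; ++idx)
        ptr[idx].reset(new Item);               // each wrapper takes over one Item

    delete[] ptr;                       // each unique_ptr destructor deletes its Item
    return g_destroyed;
}
```

Here destroyedThroughUniquePtr() returns 4: as with Wrapper, delete[] reaches the pointed-to objects because each array element has a destructor of its own.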
9.2.2 The function set_new_handler()
The C++ run-time system ensures that when memory allocation fails an error function is activated.
By default this function throws a bad_alloc exception (see section 10.8), terminating the program.
Therefore it is not necessary to check the return value of operator new. Operator new’s default
behavior may be modified in various ways. One way to modify its behavior is to redefine the function
that’s called when memory allocation fails. Such a function must comply with the following
requirements:
• it has no parameters;
• its return type is void.
A redefined error function might, e.g., print a message and terminate the program. The user-written
error function becomes part of the allocation system through the function set_new_handler.
Such an error function is illustrated below (see footnote 1):
#include <iostream>
#include <string>
#include <cstring>
#include <cstdlib> // exit
#include <new> // set_new_handler
using namespace std;
void outOfMemory()
{
cout << "Memory exhausted. Program terminates." << '\n';
exit(1);
}
1 This implementation applies to the Gnu C/C++ requirements. Actually using the program given in the next example is
not advised, as it probably enormously slows down your computer due to the resulting use of the operating system’s swap
area.
int main()
{
long allocated = 0;
set_new_handler(outOfMemory); // install error function
while (true) // eat up all memory
{
memset(new int [100000], 0, 100000 * sizeof(int));
allocated += 100000 * sizeof(int);
cout << "Allocated " << allocated << " bytes\n";
}
}
Once the new error function has been installed it is automatically invoked when memory allocation
fails, and the program is terminated. Memory allocation may fail in indirectly called code as well,
e.g., when constructing or using streams or when strings are duplicated by low-level functions.
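The handler mechanism can also be observed without exhausting memory. The sketch below is an illustration under stated assumptions, not the approach of the program above: it requests a single block that is far too large to be granted on ordinary systems, and its handler (the hypothetical countingHandler) uninstalls itself instead of terminating the program, after which the allocation function throws bad_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>

int g_handlerCalls = 0;                 // hypothetical counter

void countingHandler()
{
    ++g_handlerCalls;
    std::set_new_handler(nullptr);      // uninstall: the retry then throws bad_alloc
}

// Requests `request` bytes and reports whether the handler ran exactly once
// before bad_alloc was thrown.
bool handlerRanBeforeBadAlloc(std::size_t request)
{
    g_handlerCalls = 0;
    std::new_handler previous = std::set_new_handler(countingHandler);

    bool thrown = false;
    try
    {
        void *memory = ::operator new(request);
        ::operator delete(memory);      // only reached if the request is granted
    }
    catch (std::bad_alloc const &)
    {
        thrown = true;
    }

    std::set_new_handler(previous);     // restore the original situation
    return thrown && g_handlerCalls == 1;
}
```

Called with std::numeric_limits<std::size_t>::max() / 2 the function is expected to return true on common platforms. A handler that simply returns without changing anything is called again and again, which is why handlers usually terminate the program, throw an exception, release some memory, or uninstall themselves.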
So much for the theory. On some systems the ‘out of memory’ condition may actually never be reached,
as the operating system may interfere before the run-time support system gets a chance to stop the
program (see also footnote 2).
The standard C functions allocating memory (like strdup, malloc, realloc etc.) do not trigger
the new handler when memory allocation fails and should be avoided in C++ programs.
9.3 The assignment operator
In C++ struct and class type objects can be directly assigned new values in the same way as this
is possible in C. The default action of such an assignment for non-class type data members is a
straight byte-by-byte copy from one data member to another. For now we’ll use the following simple
class Person:
class Person
{
char *d_name;
char *d_address;
char *d_phone;
public:
Person();
Person(char const *name, char const *addr, char const *phone);
~Person();
private:
char *strdupnew(char const *src); // returns a copy of src.
};
// strdupnew is easily implemented, here is its inline implementation:
inline char *Person::strdupnew(char const *src)
{
return strcpy(new char [strlen(src) + 1], src);
}
2 http://www.linuxdevcenter.com/pub/a/linux/2006/11/30/linux-out-of-memory.html
Person’s data members are initialized to zeroes or to copies of the NTB strings passed to Person’s
constructor, using some variant of strdup. The allocated memory is eventually returned by
Person’s destructor.
Now consider the consequences of using Person objects in the following example:
void tmpPerson(Person const &person)
{
Person tmp;
tmp = person;
}
Here’s what happens when tmpPerson is called:
• it expects a reference to a Person as its parameter person.
• it defines a local object tmp, whose data members are initialized to zeroes.
• the object referenced by person is copied to tmp: sizeof(Person) number of bytes are copied
from person to tmp.
Now a potentially dangerous situation has been created. The actual values in person are pointers,
pointing to allocated memory. After the assignment this memory is addressed by two objects:
person and tmp.
• The potentially dangerous situation develops into an acutely dangerous situation once the
function tmpPerson terminates: tmp is destroyed. The destructor of the class Person releases
the memory pointed to by the fields d_name, d_address and d_phone: unfortunately, this
memory is also pointed at by person....
This problematic assignment is illustrated in Figure 9.1.
Having executed tmpPerson, the object referenced by person now contains pointers to deleted
memory.
This is undoubtedly not a desired effect of using a function like tmpPerson. The deleted memory is
likely to be reused by subsequent allocations. The pointer members of person have effectively become
wild pointers, as they don’t point to allocated memory anymore. In general it can be concluded
that
every class containing pointer data members is a potential candidate for trouble.
Fortunately, it is possible to prevent these troubles, as discussed next.
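The aliasing itself is easy to demonstrate. The following sketch uses a hypothetical struct Shallow as a stand-in for Person; it deliberately has no destructor, so that the shared memory can be inspected without deleting it twice:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for Person: one pointer member, no destructor.
struct Shallow
{
    char *d_name;
};

bool sharesMemoryAfterAssignment()
{
    Shallow person{ std::strcpy(new char[6], "Frank") };
    Shallow tmp{ nullptr };

    tmp = person;                       // default assignment copies the pointer only
    bool aliased = tmp.d_name == person.d_name;

    tmp.d_name[0] = 'P';                // also changes person's name
    bool changed = std::strcmp(person.d_name, "Prank") == 0;

    delete[] person.d_name;             // the memory is released exactly once
    return aliased && changed;
}
```

The function returns true: after the assignment both objects address the same characters. Had Shallow been given a destructor deleting d_name, that memory would have been deleted twice.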
9.3.1 Overloading the assignment operator
Obviously, the right way to assign one Person object to another is not to copy the contents of the
object bytewise. A better way is to make an equivalent object: one having its own allocated memory
containing copies of the original strings.
Figure 9.1: Private data and public interface functions of the class Person, using byte-by-byte assignment
Figure 9.2: Private data and public interface functions of the class Person, using the ‘correct’ assignment.
The way to assign a Person object to another is illustrated in Figure 9.2. There are several ways to
assign a Person object to another. One way would be to define a special member function to handle
the assignment. The purpose of this member function would be to create a copy of an object having
its own name, address and phone strings. Such a member function could be:
void Person::assign(Person const &other)
{
// delete our own previously used memory
delete[] d_name;
delete[] d_address;
delete[] d_phone;
// copy the other Person’s data
d_name = strdupnew(other.d_name);
d_address = strdupnew(other.d_address);
d_phone = strdupnew(other.d_phone);
}
Using assign we could rewrite the offending function tmpPerson:
void tmpPerson(Person const &person)
{
Person tmp;
// tmp (having its own memory) holds a copy of person
tmp.assign(person);
// now it doesn’t matter that tmp is destroyed..
}
This solution is valid, although it only tackles a symptom. It requires the programmer to use a
specific member function instead of the assignment operator. The original problem (assignment
produces wild pointers) is still not solved. Since it is hard to ‘strictly adhere to a rule’ a way to solve
the original problem is of course preferred.
Fortunately a solution exists using operator overloading: the possibility C++ offers to redefine the
actions of an operator in a given context. Operator overloading was briefly mentioned earlier, when
the operators << and >> were redefined to be used with streams (like cin, cout and cerr), see
section 3.1.4.
Overloading the assignment operator is probably the most common form of operator overloading in
C++. A word of warning is appropriate, though. The fact that C++ allows operator overloading does
not mean that this feature should indiscriminately be used. Here’s what you should keep in mind:
• operator overloading should be used in situations where an operator has a defined action, but
this default action has undesired side effects in a given context. A clear example is the above
assignment operator in the context of the class Person.
• operator overloading can be used in situations where the operator is commonly applied and no
surprise is introduced when it’s redefined. An example where operator overloading is appropriately
used is found in the class std::string: assigning one string object to another provides
the destination string with a copy of the contents of the source string. No surprises here.
• in all other cases a member function should be defined instead of redefining an operator.
An operator should simply do what it is designed to do. The phrase that’s often encountered in the
context of operator overloading is do as the ints do. The way operators behave when applied to ints
is what is expected, all other implementations probably cause surprises and confusion. Therefore,
overloading the insertion (<<) and extraction (>>) operators in the context of streams is probably
ill-chosen: the stream operations have nothing in common with bitwise shift operations.
9.3.1.1 The member ‘operator=()’
To add operator overloading to a class, the class interface is simply provided with a (usually public)
member function naming the particular operator. That member function is thereupon implemented.
To overload the assignment operator =, a member operator=(Class const &rhs) is added to the
class interface. Note that the function name consists of two parts: the keyword operator, followed
by the operator itself. When we augment a class interface with a member function operator=, then
that operator is redefined for the class, which prevents the default operator from being used. In the
previous section the function assign was provided to solve the problems resulting from using the
default assignment operator. Rather than using an ordinary member function C++ commonly uses
a dedicated operator generalizing the operator’s default behavior to the class in which it is defined.
The assign member mentioned before may be redefined as follows (the member operator= presented
below is a first, rather unsophisticated, version of the overloaded assignment operator. It will
shortly be improved):
class Person
{
public: // extension of the class Person
// earlier members are assumed.
void operator=(Person const &other);
};
Its implementation could be
void Person::operator=(Person const &other)
{
delete[] d_name; // delete old data
delete[] d_address;
delete[] d_phone;
d_name = strdupnew(other.d_name); // duplicate other’s data
d_address = strdupnew(other.d_address);
d_phone = strdupnew(other.d_phone);
}
This member’s actions are similar to those of the previously mentioned member assign, but this
member is automatically called when the assignment operator = is used. Actually there are two
ways to call overloaded operators as shown in the next example:
void tmpPerson(Person const &person)
{
Person tmp;
tmp = person;
tmp.operator=(person); // the same thing
}
Overloaded operators are seldom called explicitly, but explicit calls must be used (rather than using
the plain operator syntax) when you explicitly want to call the overloaded operator from a pointer to
an object (it is also possible to dereference the pointer first and then use the plain operator syntax,
see the next example):
void tmpPerson(Person const &person)
{
Person *tmp = new Person;
tmp->operator=(person);
*tmp = person; // yes, also possible...
delete tmp;
}
9.4 The ‘this’ pointer
A member function of a given class is always called in combination with an object of its class. There
is always an implicit ‘substrate’ for the function to act on. C++ defines a keyword, this, to reach
this substrate.
The this keyword is a pointer variable that always contains the address of the object for which
the member function was called. The this pointer is implicitly declared by each member function
(whether public, protected, or private). The this pointer is a constant pointer to an object of
the member function’s class. For example, the members of the class Person implicitly declare:
extern Person *const this;
A member function like Person::name could be implemented in two ways: with or without using
the this pointer:
char const *Person::name() const // implicitly using ‘this’
{
return d_name;
}
char const *Person::name() const // explicitly using ‘this’
{
return this->d_name;
}
The this pointer is seldom explicitly used, but situations do exist where the this pointer is actually
required (cf. chapter 16).
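That this indeed holds the address of the object for which a member was called can be verified directly. The class Counter and the function thisIsObjectAddress in this sketch are hypothetical:

```cpp
#include <cassert>

class Counter
{
    int d_value = 0;

    public:
        Counter const *address() const { return this; }    // the object's own address
        void increment() { ++this->d_value; }               // same as ++d_value
        int value() const { return d_value; }
};

bool thisIsObjectAddress()
{
    Counter first;
    Counter second;
    first.increment();

    return first.address() == &first        // this points at the calling object
        && second.address() == &second
        && first.value() == 1
        && second.value() == 0;             // second was not affected
}
```

The function returns true: each call receives the address of its own object, which is how increment modifies first while leaving second alone.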
9.4.1 Sequential assignments and this
C++’s syntax allows for sequential assignments, with the assignment operator associating from right
to left. In statements like:
a = b = c;
the expression b = c is evaluated first, and its result in turn is assigned to a.
The implementation of the overloaded assignment operator we’ve encountered thus far does not
permit such constructions, as it returns void.
This imperfection can easily be remedied using the this pointer. The overloaded assignment operator
expects a reference to an object of its class. It can also return a reference to an object of its class.
This reference can then be used as an argument in sequential assignments.
The overloaded assignment operator commonly returns a reference to the current object (i.e., *this).
The next version of the overloaded assignment operator for the class Person thus becomes:
Person &Person::operator=(Person const &other)
{
delete[] d_address;
delete[] d_name;
delete[] d_phone;
d_address = strdupnew(other.d_address);
d_name = strdupnew(other.d_name);
d_phone = strdupnew(other.d_phone);
// return current object as a reference
return *this;
}
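A reduced, self-contained sketch shows the effect of returning *this. The class Name (a Person with a single string) and the function chained are hypothetical. Unlike the version above, the sketch duplicates the other object's data before deleting its own; that ordering is a choice made here so that assigning an object to itself remains harmless.

```cpp
#include <cassert>
#include <cstring>
#include <string>

class Name                              // hypothetical: a Person with one string
{
    char *d_text;

    static char *strdupnew(char const *src)
    {
        return std::strcpy(new char[std::strlen(src) + 1], src);
    }

    public:
        explicit Name(char const *text = "") : d_text(strdupnew(text)) {}
        Name(Name const &other) : d_text(strdupnew(other.d_text)) {}
        ~Name() { delete[] d_text; }

        Name &operator=(Name const &other)
        {
            char *copy = strdupnew(other.d_text);   // duplicate other's data first
            delete[] d_text;                        // then delete the old data
            d_text = copy;
            return *this;               // return the current object as a reference
        }
        char const *text() const { return d_text; }
};

std::string chained()
{
    Name a("a"), b("b"), c("c");
    a = b = c;                          // b = c is evaluated first; its result is assigned to a
    return std::string(a.text()) + b.text() + c.text();
}
```

chained() returns "ccc": the reference returned by b = c serves as the right-hand operand of the assignment to a.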
Overloaded operators may themselves be overloaded. Consider the string class, having
overloaded assignment operators operator=(std::string const &rhs), operator=(char
const *rhs), and several more overloaded versions. These additional overloaded versions are
there to handle different situations which are, as usual, recognized by their argument types. These
overloaded versions all follow the same mold: when necessary dynamically allocated memory controlled
by the object is deleted; new values are assigned using the overloaded operator’s parameter
values and *this is returned.
9.5 The copy constructor: initialization vs. assignment
Consider the class Strings, introduced in section 9.2, once again. As it contains several primitive
type data members as well as a pointer to dynamically allocated memory it needs a constructor,
a destructor, and an overloaded assignment operator. In fact the class offers two constructors: in
addition to the default constructor it offers a constructor expecting a char const *const * and a
size_t.
Now consider the following code fragment. The statement references are discussed following the
example:
int main(int argc, char **argv)
{
Strings s1(argv, argc); // (1)
Strings s2; // (2)
Strings s3(s1); // (3)
s2 = s1; // (4)
}
• At 1 we see an initialization. The object s1 is initialized using main’s parameters: Strings’s
second constructor is used.
• At 2 Strings’s default constructor is used, initializing an empty Strings object.
• At 3 yet another Strings object is created, using a constructor accepting an existing Strings
object. This form of initialization has not yet been discussed. It is called a copy construction
and the constructor performing the initialization is called the copy constructor. Copy constructions
are also encountered in the following form:
Strings s3 = s1;
This is a construction and therefore an initialization. It is not an assignment as an assignment
needs a left-hand operand that has already been defined. C++ allows the assignment syntax
to be used for constructors having only one parameter. It is somewhat deprecated, though.
• At 4 we see a plain assignment.
In the above example three objects were defined, each using a different constructor. The actually
used constructor was deduced from the constructor’s argument list.
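Which member is actually used can be made visible by letting each member leave a mark. The struct Probe and the function whichMembersRan in this sketch are hypothetical:

```cpp
#include <cassert>
#include <string>

struct Probe
{
    std::string d_log;                  // tells which member produced the object's state

    Probe() : d_log("default") {}
    Probe(Probe const &) : d_log("copy-constructed") {}
    Probe &operator=(Probe const &)
    {
        d_log = "assigned";
        return *this;
    }
};

std::string whichMembersRan()
{
    Probe s1;                           // default constructor
    Probe s2;                           // default constructor
    Probe s3(s1);                       // copy constructor
    Probe s4 = s1;                      // also the copy constructor: an initialization
    s2 = s1;                            // assignment: s2 already existed

    return s1.d_log + "/" + s2.d_log + "/" + s3.d_log + "/" + s4.d_log;
}
```

The function returns "default/assigned/copy-constructed/copy-constructed", showing that Probe s4 = s1 is a construction even though it uses the = sign.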
The copy constructor encountered here is new. It does not result in a compilation error even though
it hasn’t been declared in the class interface. This takes us to the following rule:
A copy constructor is (almost) always available, even if it isn’t declared in the class’s
interface.
The reason for the ‘(almost)’ is given in section 9.7.1.
The copy constructor made available by the compiler is also called the trivial copy constructor. Its
use can easily be suppressed (using the = delete idiom). The trivial copy constructor performs
a byte-wise copy operation of the existing object’s primitive data to the newly created object, calls
copy constructors to initialize the object’s class data members from their counterparts in the existing
object and, when inheritance is used, calls the copy constructors of the base class(es) to initialize the
new object’s base classes.
Consequently, in the above example the trivial copy constructor is used. As it performs a byte-by-byte
copy operation of the object’s primitive type data members that is exactly what happens at
statement 3. By the time s3 ceases to exist its destructor deletes its array of strings. Unfortunately
d_string is of a primitive data type and so it also deletes s1’s data. Once again we encounter wild
pointers as a result of an object going out of scope.
The remedy is easy: instead of using the trivial copy constructor a copy constructor must explicitly
be added to the class’s interface and its definition must prevent the wild pointers, comparably to
the way this was realized in the overloaded assignment operator. An object’s dynamically allocated
memory is duplicated, so that it contains its own allocated data. The copy constructor is simpler than
the overloaded assignment operator in that it doesn’t have to delete previously allocated memory.
Since the object is going to be created no previously allocated memory already exists.
Strings’s copy constructor can be implemented as follows:
Strings::Strings(Strings const &other)
:
d_string(new string[other.d_size]),
d_size(other.d_size)
{
for (size_t idx = 0; idx != d_size; ++idx)
d_string[idx] = other.d_string[idx];
}
The copy constructor is always called when an object is initialized using another object of its class.
Apart from the plain copy construction that we encountered thus far, here are other situations where
the copy constructor is used:
• it is used when a function defines a class type value parameter rather than a pointer or a reference.
The function’s argument initializes the function’s parameter using the copy constructor.
Example:
void process(Strings store) // no pointer, no reference
{
store.at(3) = "modified"; // doesn’t modify ‘outer’
}
int main(int argc, char **argv)
{
Strings outer(argv, argc);
process(outer);
}
• it is used when a function defines a class type value return type. Example:
Strings copy(Strings const &store)
{
return store;
}
Here store is used to initialize copy’s return value. The returned Strings object is a temporary,
anonymous object that may be immediately used by code calling copy but no assumptions can be
made about its lifetime thereafter.
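Both situations can be made visible by counting copy constructor calls. The following is a sketch under stated assumptions: the class Counted and its counter s_copies are hypothetical, and the functions process and copy merely mimic the two examples above. Move construction is deliberately not counted, so the count does not depend on whether the compiler elides the initialization from the returned temporary.

```cpp
#include <cstddef>

// Hypothetical class counting how often its copy constructor runs.
struct Counted
{
    static std::size_t s_copies;

    Counted() = default;
    Counted(Counted const &)
    {
        ++s_copies;
    }
    Counted(Counted &&) noexcept    // moves are not counted
    {}
};

std::size_t Counted::s_copies = 0;

void process(Counted store)             // value parameter
{
    (void)store;                        // nothing to do with the copy
}

Counted copy(Counted const &store)      // value return type
{
    return store;                       // copies store into the return value
}
```

Calling process(outer) for some Counted outer object increments the counter once, and calling copy(outer) increments it once more.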
9.6 Revising the assignment operator
The overloaded assignment operator has characteristics also encountered with the copy constructor
and the destructor:
• The copying of (private) data occurs (1) in the copy constructor and (2) in the overloaded assignment
function.
• Allocated memory is deleted (1) in the overloaded assignment function and (2) in the destructor.
The copy constructor and the destructor clearly are required. If the overloaded assignment operator
also needs to return allocated memory and to assign new values to its data members couldn’t the
destructor and copy constructor be used for that?
As we’ve seen in our discussion of the destructor (section 9.2) the destructor can explicitly be called,
but that doesn’t hold true for the (copy) constructor. But let’s briefly summarize what an overloaded
assignment operator is supposed to do:
• It should delete the dynamically allocated memory controlled by the current object;
• It should reassign the current object’s data members using a provided existing object of its
class.
The second part surely looks a lot like copy construction. Copy construction becomes even more
attractive after realizing that the copy constructor also initializes any reference data members the
class might have. Realizing the copy construction part is easy: just define a local object and initialize
it using the assignment operator’s const reference parameter, like this:
Strings &operator=(Strings const &other)
{
Strings tmp(other);
// more to follow
return *this;
}
You may think the optimization operator=(Strings tmp) is attractive, but let’s postpone that for
a little while (at least until section 9.7).
Now that we’ve done the copying part, what about the deleting part? And isn’t there another slight
problem as well? After all we copied all right, but not into our intended (current, *this) object.
At this point it’s time to introduce swapping. Swapping two variables means that the two variables
exchange their values. We’ll discuss swapping in detail in the next section, but let’s for now assume
that we’ve added a member swap(Strings &other) to our class Strings. This allows us to
complete Strings’s operator= implementation:
Strings &operator=(Strings const &other)
{
Strings tmp(other);
swap(tmp);
return *this;
}
This implementation of operator= is generic: it can be applied to every class whose objects are
swappable. How does it work?
• The information in the other object is used to initialize a local tmp object. This takes care of
the copying part of the assignment operator;
• Calling swap ensures that the current object receives its new values (with tmp receiving the
current object’s original values);
• When operator= terminates its local tmp object ceases to exist and its destructor is called.
As it by now contains the data previously owned by the current object, the current object’s
original data are now destroyed, effectively completing the destruction part of the assignment
operation.
Nice?
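To see the three steps working together, here is a self-contained sketch of the copy-and-swap idiom. The class IntArray and its members size and at are hypothetical names chosen for this illustration; they are not part of Strings. The swap member explicitly calls std::swap for the two data members.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical class owning a dynamically allocated int array.
class IntArray
{
    int *d_data;
    std::size_t d_size;

    public:
        IntArray(std::size_t size, int value)
        :
            d_data(new int[size]),
            d_size(size)
        {
            for (std::size_t idx = 0; idx != d_size; ++idx)
                d_data[idx] = value;
        }

        IntArray(IntArray const &other)
        :
            d_data(new int[other.d_size]),
            d_size(other.d_size)
        {
            for (std::size_t idx = 0; idx != d_size; ++idx)
                d_data[idx] = other.d_data[idx];
        }

        ~IntArray()
        {
            delete[] d_data;
        }

        void swap(IntArray &other)
        {
            std::swap(d_data, other.d_data);
            std::swap(d_size, other.d_size);
        }

        IntArray &operator=(IntArray const &other)
        {
            IntArray tmp(other);    // the copying part
            swap(tmp);              // *this receives the new values
            return *this;           // tmp's destructor frees the old data
        }

        std::size_t size() const { return d_size; }
        int at(std::size_t idx) const { return d_data[idx]; }
};
```

After one = two both objects hold equal values in separately allocated arrays, and the array previously owned by one has been deleted by tmp’s destructor.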
9.6.1 Swapping
Many classes (e.g., std::string) offer swap members allowing us to swap two of their objects.
The Standard Template Library (STL, cf. chapter 18) offers various functions related to swapping.
[Figure 9.3: Swapping a linked list. Before swapping 2 and 3 the order is 1 2 3 4; after swapping 2 and 3 it is 1 3 2 4.]
There is even a swap generic algorithm (cf. section 19.1.61), which is commonly implemented using
the assignment operator. When implementing a swap member for our class Strings it could be
used, provided that all of Strings’s data members can be swapped. As this is true (why this is true
is discussed shortly) we can augment class Strings with a swap member:
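void Strings::swap(Strings &other)
{
    // std::swap (declared in <utility>) must be named explicitly: within
    // this member a plain swap refers to Strings::swap itself
    std::swap(d_string, other.d_string);
    std::swap(d_size, other.d_size);
}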
Having added this member to Strings the copy-and-swap implementation of Strings::operator=
can now be used.
When two variables (e.g., double one and double two) are swapped, each one holds the other
one’s value after the swap. So, if one == 12.50 and two == -3.14 then after swap(one, two)
one == -3.14 and two == 12.50.
Variables of primitive data types (pointers and the built-in types) can be swapped; class-type objects
can be swapped if their classes offer a swap member.
So should we provide our classes with a swap member, and if so, how should it be implemented?
The above example (Strings::swap) shows the standard way to implement a swap member: each
of its data members is swapped in turn. But there are situations where a class cannot implement a
swap member this way, even if the class only defines data members of primitive data types. Consider
the situation depicted in figure 9.3.
In this figure there are four objects, each object has a pointer pointing to the next object. The basic
organization of such a class looks like this:
class List
{
List *d_next;
...
};
Initially four objects have their d_next pointer set to the next object: 1 to 2, 2 to 3, 3 to 4. This is
shown in the upper half of the figure. The bottom half shows what happens if objects 2 and 3
are swapped: 3’s d_next pointer is now located in object 2 and still points to 4; 2’s d_next pointer still
holds 3’s address, but it is now located in object 3, which therefore points to itself. Bad news!
[Figure 9.4: Swapping objects with self-referential data (before and after)]
Another situation where swapping of objects goes wrong happens with classes having data members
pointing or referring to data members of the same object. Such a situation is shown in figure 9.4.
Here, objects have two data members, as in the following class setup:
class SelfRef
{
size_t *d_ownPtr; // initialized to &d_data
size_t d_data;
};
The top-half of figure 9.4 shows two objects; their upper data members pointing to their lower data
members. But if these objects are swapped then the situation shown in the figure’s bottom half is
encountered. Here the values at addresses a and c are swapped, and so, rather than pointing to
their bottom data members they suddenly point to the other object’s data members. Again: bad news.
The common cause of these failing swapping operations is easily recognized: simple swapping operations
must be avoided when data members point or refer to data that is involved in the swapping.
If, in figure 9.4 the a and c data members would point to information outside of the two objects (e.g.,
if they would point to dynamically allocated memory) then the simple swapping would succeed.
However, the difficulty encountered with swapping SelfRef objects does not imply that two
SelfRef objects cannot be swapped; it only means that we must be careful when designing swap
members. Here is an implementation of SelfRef::swap:
void SelfRef::swap(SelfRef &other)
{
    std::swap(d_data, other.d_data);    // d_ownPtr is deliberately not swapped
}
In this implementation swapping leaves the self-referential data member as-is, and merely swaps
the remaining data. A similar swap member could be designed for the linked list shown in figure
9.3.
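That such a swap member keeps the objects intact can be verified with a complete version of SelfRef. This is merely a sketch: the constructor and the members value and selfReferential are assumptions added for the illustration, and copying is disabled because it is not needed here.

```cpp
#include <cstddef>
#include <utility>

class SelfRef
{
    std::size_t *d_ownPtr;          // always points at this object's d_data
    std::size_t d_data;

    public:
        explicit SelfRef(std::size_t value)
        :
            d_ownPtr(&d_data),
            d_data(value)
        {}

        // Copying is not part of this sketch, so it is disabled.
        SelfRef(SelfRef const &other) = delete;
        SelfRef &operator=(SelfRef const &other) = delete;

        void swap(SelfRef &other)
        {
            std::swap(d_data, other.d_data);    // d_ownPtr is left as-is
        }

        // Reads the data through the self-referential pointer.
        std::size_t value() const { return *d_ownPtr; }

        // True if the pointer still refers to this object's own data.
        bool selfReferential() const { return d_ownPtr == &d_data; }
};
```

After first.swap(second) the values have been exchanged while each d_ownPtr still points at the d_data member of its own object.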
9.6.1.1 Fast swapping
As we’ve seen with placement new, objects can be constructed in blocks of memory that are sizeof(Class)
bytes large. And so, two objects of the same class each occupy sizeof(Class) bytes.
If objects of our class can be swapped, and if our class’s data members do not refer to data actually
involved in the swapping operation then a very fast swapping method that is based on the fact that
we know how large our objects are can be implemented.
In this fast-swap method we merely swap the contents of the sizeof(Class) bytes. This procedure
may be applied to classes whose objects may be swapped using a member-by-member swapping
operation and can (in practice, although this probably overstretches the allowed operations as
described by the C++ ANSI/ISO standard) also be used in classes having reference data members.
It simply defines a buffer of sizeof(Class) bytes and performs a circular memcpy operation. Here
is its implementation for a hypothetical class Class. It results in very fast swapping:
#include <cstring>
void Class::swap(Class &other)
{
char buffer[sizeof(Class)];
memcpy(buffer, &other, sizeof(Class));
memcpy(&other, this, sizeof(Class));
memcpy(this, buffer, sizeof(Class));
}
Here is a simple example of a class defining a reference data member and offering a swap member
implemented like the one above. The reference data members are initialized to external streams.
After running the program one contains two hello to 1 lines, two contains two hello to 2 lines (for
brevity all members of Reference are defined inline):
#include <fstream>
#include <cstring>
class Reference
{
std::ostream &d_out;
public:
Reference(std::ostream &out)
:
d_out(out)
{}
void swap(Reference &other)
{
char buffer[sizeof(Reference)];
memcpy(buffer, this, sizeof(Reference));
memcpy(this, &other, sizeof(Reference));
memcpy(&other, buffer, sizeof(Reference));
}
std::ostream &out()
{
return d_out;
}
};
int main()
{
std::ofstream one("one");
std::ofstream two("two");
Reference ref1(one); // ref1/ref2 hold references to
Reference ref2(two); // the streams
ref1.out() << "hello to 1\n"; // generate some output
ref2.out() << "hello to 2\n";
ref1.swap(ref2);
ref2.out() << "hello to 1\n"; // more output
ref1.out() << "hello to 2\n";
}
Fast swapping should only be used for self-defined classes for which it can be proven that fast swapping
does not corrupt their objects when they are swapped.
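One way to let the compiler do part of that proof is a static_assert on std::is_trivially_copyable, the property for which byte-wise copying is well defined. The following sketch assumes a hypothetical data-only class Point. Note that the check is stricter than the procedure described above: it rejects classes having reference data members, like Reference.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical plain data class: no pointers or references into itself.
struct Point
{
    double x;
    double y;

    void swap(Point &other)
    {
        // Refuse to compile if byte-wise copying is not well defined.
        static_assert(std::is_trivially_copyable<Point>::value,
                      "fast swapping requires a trivially copyable class");

        char buffer[sizeof(Point)];
        std::memcpy(buffer, &other, sizeof(Point));
        std::memcpy(&other, this, sizeof(Point));
        std::memcpy(this, buffer, sizeof(Point));
    }
};
```

For such a class the circular memcpy exchanges all members at once, which gives the same result as swapping x and y individually.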
9.7 Moving data
Traditionally, C++ offered two ways to assign the information pointed to by a data member of a
temporary object to an lvalue object. Either a copy constructor or reference counting had to be used.
In addition to these two methods C++ now also supports move semantics, allowing transfer of the
data pointed to by a temporary object to its destination.
Moving information is based on the concept of anonymous (temporary) data. Temporary values
are returned by functions like operator-() and operator+(Type const &lhs, Type const
&rhs), and in general by functions returning their results ‘by value’ instead of returning references
or pointers.
Anonymous values are always short-lived. When the returned values are primitive types (int,
double, etc.) nothing special happens, but if a class-type object is returned by value then its destructor
can be called immediately following the function call that produced the value. In any case,
the value itself becomes inaccessible immediately after the call. Of course, a temporary return value
may be bound to a reference (lvalue or rvalue), but as far as the compiler is concerned the value now
has a name, which by itself ends its status as a temporary value.
In this section we concentrate on anonymous temporary values and show how they can be used
to improve the efficiency of object construction and assignment. These special construction and
assignment methods are known as move construction and move assignment. Classes supporting
move operations are called move-aware.
Classes allocating their own memory usually benefit from becoming move-aware. But a class does
not have to use dynamic memory allocation before it can benefit from move operations. Most classes
using composition (or inheritance where the base class uses composition) can benefit from move
operations as well.
Movable parameters for class Class take the form Class &&tmp. The parameter is an rvalue reference,
and an rvalue reference only binds to an anonymous temporary value. The compiler is required
to call functions offering movable parameters whenever possible. This happens when the class defines
functions supporting Class && parameters and an anonymous temporary value is passed
to such functions. Once a temporary value has a name (which already happens inside functions
defining Class const & or Class &&tmp parameters as within such functions the names of these
parameters are available) it is no longer an anonymous temporary value, and within such functions
the compiler no longer calls functions expecting anonymous temporary values when the parameters
are used as arguments.
The next example (using inline member implementations for brevity) illustrates what happens if a
non-const object, a temporary object and a const object are passed to functions fun for which these
kinds of parameters were defined. Each of these functions calls a function gun for which these kinds of
parameters were also defined. The first time fun is called it (as expected) calls gun(Class &). Then
fun(Class &&) is called as its argument is an anonymous (temporary) object. However, inside
fun the anonymous value has received a name, and so it isn’t anonymous anymore. Consequently,
gun(Class &) is called once again. Finally fun(Class const &) is called, and (as expected)
gun(Class const &) is now called.
#include <iostream>
using namespace std;
class Class
{
public:
Class()
{};
void fun(Class const &other)
{
cout << "fun: Class const &\n";
gun(other);
}
void fun(Class &other)
{
cout << "fun: Class &\n";
gun(other);
}
void fun(Class &&tmp)
{
cout << "fun: Class &&\n";
gun(tmp);
}
void gun(Class const &other)
{
cout << "gun: Class const &\n";
}
void gun(Class &other)
{
cout << "gun: Class &\n";
}
void gun(Class &&tmp)
{
cout << "gun: Class &&\n";
}
};
int main()
{
Class c1;
c1.fun(c1);
c1.fun(Class());
Class const c0;
c1.fun(c0);
}
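The selection of gun can also be checked without inspecting printed output. The sketch below is a variation on the example, not a replacement for it: it uses free functions instead of members, and every gun overload returns a string naming its parameter type, so that fun reports which gun it reached.

```cpp
#include <string>

struct Class
{};

// Each gun overload reports which parameter type was selected.
std::string gun(Class const &) { return "gun: Class const &"; }
std::string gun(Class &)       { return "gun: Class &"; }
std::string gun(Class &&)      { return "gun: Class &&"; }

// Each fun overload passes its (named) parameter on to gun.
std::string fun(Class const &other) { return gun(other); }
std::string fun(Class &other)       { return gun(other); }
std::string fun(Class &&tmp)        { return gun(tmp); }   // tmp has a name
```

Passing a non-const object or an anonymous temporary to fun both end in gun(Class &); only a const object ends in gun(Class const &). The overload gun(Class &&) is never reached.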
Generally it is pointless to define a function having an rvalue reference return type. The compiler
decides whether or not to use an overloaded member expecting an rvalue reference on the basis of
the provided argument. If it is an anonymous temporary it calls the function defining the rvalue
reference parameter, if such a function is available. An rvalue reference return type is used, e.g.,
with the std::move call, to keep the rvalue reference nature of its argument, which is known to
be a temporary anonymous object. Such a situation can be exploited also in a situation where a
temporary object is passed to (and returned from) a function which must be able to modify the
temporary object. The alternative, passing a const &, is less attractive as it requires a const_cast
before the object can be modified. Here is an example:
std::string &&doubleString(std::string &&tmp)
{
tmp += tmp;
return std::move(tmp);
}
This allows us to do something like
std::cout << doubleString(std::string("hello "));
to insert hello hello into cout.
The compiler, when selecting a function to call applies a fairly simple algorithm, and also considers
copy elision. This is covered shortly (section 9.8).
9.7.1 The move constructor (dynamic data)
Our class Strings has, among other members, a data member string *d_string. Clearly,
Strings should define a copy constructor, a destructor and an overloaded assignment operator.
Now consider the following function loadStrings(std::istream &in) extracting the strings for
a Strings object from in. Next, the Strings object filled by loadStrings is returned by value.
The function loadStrings returns a temporary object, which can then be used to initialize an external
Strings object:
Strings loadStrings(std::istream &in)
{
Strings ret;
// load the strings into ’ret’
return ret;
}
// usage:
Strings store(loadStrings(cin));
In this example two full copies of a Strings object are required:
• initializing loadStrings’s value return type from its local Strings ret object;
• initializing store from loadStrings’s return value.
We can improve the above procedure by defining a move constructor. Here is the declaration of the
Strings class move constructor:
Strings(Strings &&tmp);
Move constructors of classes using dynamic memory allocation are allowed to assign the values of
pointer data members to their own pointer data members without requiring them to make a copy of
the source’s data. Next, the temporary’s pointer value is set to zero to prevent its destructor from
destroying data now owned by the just constructed object. The move constructor has grabbed or
stolen the data from the temporary object. This is OK as the temporary object cannot be referred to
again (as it is anonymous, it cannot be accessed by other code) and the temporary objects cease to
exist shortly after the constructor’s call. Here is the implementation of Strings move constructor:
Strings::Strings(Strings &&tmp)
:
d_memory(tmp.d_memory),
d_size(tmp.d_size),
d_capacity(tmp.d_capacity)
{
tmp.d_memory = 0;
}
In section 9.5 it was stated that the copy constructor is almost always available. Almost always
as the declaration of a move constructor suppresses the default availability of the copy constructor.
The default copy constructor is also suppressed if a move assignment operator is declared (cf. section
9.7.3).
The following example shows a simple class Class, declaring a move constructor. In the main function
following the class interface a Class object is defined which is then passed to the constructor of
a second Class object. Compilation fails with the compiler reporting:
error: cannot bind ’Class’ lvalue to ’Class&&’
error: initializing argument 1 of ’Class::Class(Class&&)’
class Class
{
public:
Class() = default;
Class(Class &&tmp)
{}
};
int main()
{
Class one;
Class two(one);
}
The cure is easy: after declaring a (possibly default) copy constructor the error disappears:
class Class
{
public:
Class() = default;
Class(Class const &other) = default;
Class(Class &&tmp)
{}
};
int main()
{
Class one;
Class two(one);
}
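The same observation can be expressed as compile-time checks using the type trait std::is_copy_constructible from the header type_traits. The class names MoveOnlyDecl and BothDecl in this sketch are hypothetical; they mirror the two versions of Class shown above.

```cpp
#include <type_traits>

// Declares a move constructor only: the copy constructor is suppressed.
struct MoveOnlyDecl
{
    MoveOnlyDecl() = default;
    MoveOnlyDecl(MoveOnlyDecl &&)
    {}
};

// Declaring the copy constructor as well makes it available again.
struct BothDecl
{
    BothDecl() = default;
    BothDecl(BothDecl const &) = default;
    BothDecl(BothDecl &&)
    {}
};

static_assert(!std::is_copy_constructible<MoveOnlyDecl>::value,
              "copy construction is not available");
static_assert(std::is_copy_constructible<BothDecl>::value,
              "copy construction is available again");
```

If either assertion did not hold the translation unit would fail to compile, so a successful compilation already confirms the rule.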
9.7.2 The move constructor (composition)
Classes not using pointer members pointing to memory controlled by their objects (and not having
base classes doing so, see chapter 13) may also benefit from overloaded members expecting rvalue
references. The class benefits from move operations when one or more of the composed data members
themselves support move operations.
Move operations cannot be implemented if the class type of a composed data member does not support
moving or copying. Currently, stream classes fall into this category.
An example of a move-aware class is the class std::string. A class Person could use composition by
defining std::string d_name and std::string d_address. Its move constructor would then
have the following prototype:
Person(Person &&tmp);
However, the following implementation of this move constructor is incorrect:
Person::Person(Person &&tmp)
:
d_name(tmp.d_name),
d_address(tmp.d_address)
{}
It is incorrect as string’s copy constructors rather than string’s move constructors are called.
If you’re wondering why this happens then remember that move operations are only performed for
anonymous objects. To the compiler anything having a name isn’t anonymous. And so, by implication,
having available a rvalue reference does not mean that we’re referring to an anonymous object.
But we know that the move constructor is only called for anonymous arguments. To use the corresponding
string move operations we have to inform the compiler that we’re talking about anonymous
data members as well. For this a cast could be used (e.g., static_cast<Person &&>(tmp)),
but the C++11 standard provides the function std::move to anonymize a named object. The correct
implementation of Person’s move construction is, therefore:
Person::Person(Person &&tmp)
:
d_name( std::move(tmp.d_name) ),
d_address( std::move(tmp.d_address) )
{}
The function std::move is (indirectly) declared by many header files. If no header is already declaring
std::move then include <utility>.
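The difference between the two implementations can be measured by composing a member type that counts its own copy and move constructions. In this sketch Tracker, CopyingPerson and MovingPerson are hypothetical names. Tracker stands in for std::string, whose moved-from state is unspecified and therefore less suitable for a check.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical member type counting its copy and move constructions.
struct Tracker
{
    static std::size_t s_copies;
    static std::size_t s_moves;

    Tracker() = default;
    Tracker(Tracker const &) { ++s_copies; }
    Tracker(Tracker &&) noexcept { ++s_moves; }
};
std::size_t Tracker::s_copies = 0;
std::size_t Tracker::s_moves = 0;

// The members of the named parameter tmp are lvalues: Tracker is copied.
struct CopyingPerson
{
    Tracker d_name;
    Tracker d_address;

    CopyingPerson() = default;
    CopyingPerson(CopyingPerson &&tmp)
    :
        d_name(tmp.d_name),
        d_address(tmp.d_address)
    {}
};

// std::move turns the members into rvalues: Tracker is moved.
struct MovingPerson
{
    Tracker d_name;
    Tracker d_address;

    MovingPerson() = default;
    MovingPerson(MovingPerson &&tmp)
    :
        d_name(std::move(tmp.d_name)),
        d_address(std::move(tmp.d_address))
    {}
};
```

Move-constructing a CopyingPerson adds two copies and no moves to the counters; move-constructing a MovingPerson adds two moves and no copies.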
When a class using composition not only contains class type data members but also other types of
data (pointers, references, primitive data types), then these other data types can be initialized as
usual. Primitive data type members can simply be copied; references can be initialized as usual and
pointers may use move operations as discussed in the previous section.
The compiler never calls move operations for variables having names. Let’s consider the implications
of this by looking at the next example, assuming Class offers a move constructor and a copy
constructor:
Class factory();
void fun(Class const &other); // a
void fun(Class &&tmp); // b
void callee(Class &&tmp)
{
fun(tmp); // 1
}
int main()
{
callee(factory());
}
• At 1 function a is called. At first sight this might be surprising, but fun’s argument is not an
anonymous temporary object but a named temporary object.
Realizing that fun(tmp) might be called twice the compiler’s choice is understandable. If tmp’s
data would have been grabbed at the first call, the second call would receive tmp without any data.
But at the last call we might know that tmp is never used again and so we might like to ensure that
fun(Class &&) is called. For this, once again, std::move is used:
fun(std::move(tmp)); // last call!
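Both calls can be combined in one sketch. Here the two fun overloads return the letters used in the example above instead of performing real work, and callee (whose std::string return type is an assumption made for this illustration) reports which overloads were selected for its first and for its last call.

```cpp
#include <string>
#include <utility>

struct Class
{};

std::string fun(Class const &) { return "a"; }   // a
std::string fun(Class &&)      { return "b"; }   // b

// Returns the functions selected for the first and for the last call.
std::string callee(Class &&tmp)
{
    std::string first = fun(tmp);               // named: selects a
    std::string last = fun(std::move(tmp));     // last call: selects b
    return first + last;
}
```

Calling callee(Class()) returns the string ab: the named parameter selects function a, and only the explicit std::move selects function b.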
9.7.3 Move-assignment
In addition to the overloaded assignment operator a move assignment operator may be implemented
for classes supporting move operations. In this case, if the class supports swapping the implementation
is surprisingly simple. No copy construction is required and the move assignment operator can
simply be implemented like this:
Class &operator=(Class &&tmp)
{
swap(tmp);
return *this;
}
If swapping is not supported then the assignment can be performed for each of the data members in
turn, using std::move as shown in the previous section with a class Person. Here is an example
showing how to do this with that class Person:
Person &operator=(Person &&tmp)
{
d_name = std::move(tmp.d_name);
d_address = std::move(tmp.d_address);
return *this;
}
As noted previously (section 9.7.1) declaring a move assignment operator suppresses the default
availability of the copy constructor. It is made available again by declaring the copy constructor
in the class’s interface (and of course by providing an explicit implementation or by using the =
default default implementation).
9.7.4 Revising the assignment operator (part II)
Now that we’ve familiarized ourselves with the overloaded assignment operator and the move-assignment,
let’s once again have a look at their implementations for a class Class, supporting
swapping through its swap member. Here is the generic implementation of the overloaded assignment
operator:
Class &operator=(Class const &other)
{
Class tmp(other);
swap(tmp);
return *this;
}
and this is the move-assignment operator:
Class &operator=(Class &&tmp)
{
swap(tmp);
return *this;
}
They look remarkably similar in the sense that the overloaded assignment operator’s code is identical
to the move-assignment operator’s code once a copy of the other object is available. Since the
overloaded assignment operator’s tmp object really is nothing but a temporary Class object we can
use this fact by implementing the overloaded assignment operator in terms of the move-assignment.
Here is a second revision of the overloaded assignment operator:
Class &operator=(Class const &other)
{
Class tmp(other);
return *this = std::move(tmp);
}
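A complete class using this second revision might look as follows. It is a sketch under stated assumptions: the class Buffer and its members size and front are hypothetical, and the swap member uses std::swap for both data members.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical class owning a dynamically allocated array of ints.
class Buffer
{
    int *d_data;
    std::size_t d_size;

    public:
        Buffer(std::size_t size, int value)
        :
            d_data(new int[size]),
            d_size(size)
        {
            for (std::size_t idx = 0; idx != d_size; ++idx)
                d_data[idx] = value;
        }
        Buffer(Buffer const &other)
        :
            d_data(new int[other.d_size]),
            d_size(other.d_size)
        {
            for (std::size_t idx = 0; idx != d_size; ++idx)
                d_data[idx] = other.d_data[idx];
        }
        ~Buffer()
        {
            delete[] d_data;
        }
        void swap(Buffer &other)
        {
            std::swap(d_data, other.d_data);
            std::swap(d_size, other.d_size);
        }
        Buffer &operator=(Buffer &&tmp)         // move assignment
        {
            swap(tmp);
            return *this;
        }
        Buffer &operator=(Buffer const &other)  // copy assignment
        {
            Buffer tmp(other);
            return *this = std::move(tmp);      // calls the move assignment
        }
        std::size_t size() const { return d_size; }
        int front() const { return d_data[0]; }
};
```

Assigning a named Buffer selects the copy assignment operator, which forwards its local copy to the move assignment operator; assigning an anonymous Buffer selects the move assignment operator directly.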
9.7.5 Moving and the destructor
Once a class becomes a move-aware class one should realize that its destructor still performs its job
as implemented. Consequently, when moving pointer values from a temporary source to a destination
the move constructor should make sure that the temporary’s pointer value is set to zero, to
prevent doubly freeing memory.
If a class defines pointers to pointer data members there usually is not only a pointer that is moved,
but also a size_t defining the number of elements in the array of pointers.
Once again, consider the class Strings. Its destructor is implemented like this:
Strings::~Strings()
{
for (string **end = d_string + d_size; end-- != d_string; )
delete *end;
delete[] d_string;
}
The move constructor (and other move operations!) must realize that the destructor not only deletes
d_string, but also considers d_size. A member implementing move operations should therefore
not only set d_string to zero but also d_size. The previously shown move constructor for Strings
is therefore incorrect. Its improved implementation is:
Strings::Strings(Strings &&tmp)
:
d_memory(tmp.d_memory),
d_size(tmp.d_size),
d_capacity(tmp.d_capacity)
{
tmp.d_memory = 0;
tmp.d_size = 0;
}
If operations by the destructor all depend on d_string having a non-zero value then variations of
the above approach are of course possible. The move operations could merely decide to set d_memory
to 0, and then test whether d_memory == 0 in the destructor (and if so, end the destructor’s actions),
saving some d_size assignments.
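The cooperation between a move constructor and a destructor that walks over an array of pointers is sketched below. The class PtrStrings and the function moveLeavesSourceEmpty are hypothetical, and the destructor in this sketch uses an index rather than a pointer for its loop, which is a variation on the destructor shown above.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class storing pointers to separately allocated strings.
class PtrStrings
{
    std::string **d_string;
    std::size_t d_size;

    public:
        explicit PtrStrings(std::size_t size)
        :
            d_string(new std::string *[size]),
            d_size(size)
        {
            for (std::size_t idx = 0; idx != d_size; ++idx)
                d_string[idx] = new std::string("element");
        }

        PtrStrings(PtrStrings &&tmp)
        :
            d_string(tmp.d_string),
            d_size(tmp.d_size)
        {
            tmp.d_string = 0;   // the destructor's delete[] becomes a no-op
            tmp.d_size = 0;     // the destructor's loop is skipped
        }

        ~PtrStrings()
        {
            for (std::size_t idx = d_size; idx-- != 0; )
                delete d_string[idx];
            delete[] d_string;
        }

        std::size_t size() const { return d_size; }
};

// Returns true if moving leaves the source empty and the target filled.
bool moveLeavesSourceEmpty()
{
    PtrStrings source(3);
    PtrStrings dest(std::move(source));
    return dest.size() == 3 && source.size() == 0;
}
```

Without the assignment to tmp.d_size the destructor of the moved-from object would dereference a zero pointer in its loop.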
9.7.6 Move-only classes
Classes may very well allow move semantics without offering copy semantics. Most stream classes
belong to this category. Extending their definition with move semantics greatly enhances their
usability. Once move semantics becomes available for such classes, so called factory functions (functions
returning an object constructed by the function) can easily be implemented. E.g.,
// assume char *filename
ifstream inStream(openIstream(filename));
For this example to work the class ifstream must offer a move constructor. This ensures that
only one object refers to the open istream.
Once classes offer move semantics their objects can also safely be stored in standard containers (cf.
chapter 12). When such containers perform reallocations (e.g., when their sizes are enlarged) they
use the object’s move constructors rather than their copy constructors. As move-only classes suppress
copy semantics containers storing objects of move-only classes implement the correct behavior
in that it is impossible to assign such containers to each other.
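A move-only class, a factory function and a container storing such objects can be sketched as follows. The class Handle and the functions openHandle and makeHandles are hypothetical names; std::vector is used as the container.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical move-only class: copying is disabled, moving is allowed.
class Handle
{
    std::string d_name;

    public:
        explicit Handle(std::string const &name)
        :
            d_name(name)
        {}
        Handle(Handle const &other) = delete;
        Handle &operator=(Handle const &other) = delete;
        Handle(Handle &&tmp) = default;
        Handle &operator=(Handle &&tmp) = default;

        std::string const &name() const { return d_name; }
};

// Factory function: returns a Handle by value.
Handle openHandle(std::string const &name)
{
    return Handle(name);
}

// Stores move-only objects in a vector; reallocations move the elements.
std::vector<Handle> makeHandles()
{
    std::vector<Handle> handles;
    handles.push_back(openHandle("one"));
    handles.push_back(openHandle("two"));
    handles.push_back(openHandle("three"));
    return handles;
}
```

The vector returned by makeHandles can itself be moved, but an attempt to copy it fails to compile because Handle cannot be copied.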
9.7.7 Default move constructors and assignment operators
As we’ve seen, classes by default offer a copy constructor and assignment operator. These class
members are implemented so as to provide basic support: data members of primitive data types
are copied byte-by-byte, but for class type data members their corresponding copy constructors or
assignment operators are called.
The compiler can provide default implementations for move constructors and move assignment operators.
However, except for the copy constructor, default implementations for constructors (or assignment
operators) are no longer provided once a class declares at least one constructor (or assignment operator, respectively),
while the default copy constructor is suppressed by declarations of either the move constructor
or the move assignment operator.
If default implementations should be available in these cases, it’s easy to add them to the class by
adding = default to the appropriate constructor and assignment operator declarations.
Here is an example of a class offering all defaults: constructor, copy constructor, move constructor,
assignment operator and move assignment operator:
class Defaults
{
int d_x;
Mov d_mov;
};
Assuming that Mov is a class offering move operations in addition to the standard deep copy operations,
then the following actions are performed on the destination’s d_mov and d_x:
Defaults factory();
int main()
{                               // Mov operation:               d_x:
                                // ----------------------------------------
    Defaults one;               // Mov(),                       undefined
    Defaults two(one);          // Mov(Mov const &),            one.d_x
    Defaults three(factory());  // Mov(Mov &&tmp),              tmp.d_x
    one = two;                  // Mov::operator=(Mov const &), two.d_x
    one = factory();            // Mov::operator=(Mov &&tmp),   tmp.d_x
}
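These default operations can be traced by letting Mov record the last operation performed on it. The sketch below makes several assumptions: the data member d_last of Mov is hypothetical, the members of Defaults are public so that they can be inspected, and std::move is used instead of factory() because the compiler may otherwise elide the move construction (cf. section 9.8).

```cpp
#include <string>
#include <utility>

// Hypothetical Mov class recording the last operation performed on it.
struct Mov
{
    std::string d_last;

    Mov() : d_last("Mov()") {}
    Mov(Mov const &) : d_last("Mov(Mov const &)") {}
    Mov(Mov &&) : d_last("Mov(Mov &&)") {}
    Mov &operator=(Mov const &)
    {
        d_last = "operator=(Mov const &)";
        return *this;
    }
    Mov &operator=(Mov &&)
    {
        d_last = "operator=(Mov &&)";
        return *this;
    }
};

// All constructors and assignment operators are provided by the compiler.
struct Defaults
{
    int d_x;
    Mov d_mov;
};
```

Copy-constructing a Defaults object leaves Mov(Mov const &) in the new object’s d_mov.d_last, whereas constructing it from std::move(one) leaves Mov(Mov &&). The two assignment operators behave analogously, and d_x is copied in all cases.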
If, however, Defaults declares at least one constructor and one assignment operator, then the remaining
defaults are no longer provided. In the following example the declared move assignment operator also
suppresses the default copy constructor (cf. section 9.7.1). E.g.:
class Defaults
{
    int d_x;
    Mov d_mov;
    public:
        Defaults(int x);
        Defaults &operator=(Defaults &&tmp);
};
Defaults factory();
int main()
{                               // Mov operation:               resulting d_x:
                                // ----------------------------------------------
    Defaults one;               // ERROR: not available
    Defaults two(one);          // ERROR: not available
    Defaults three(factory());  // ERROR: not available
    one = two;                  // ERROR: not available
    one = factory();            // Mov::operator=(Mov &&tmp),   tmp.d_x
}
To reestablish the defaults, append = default to the appropriate declarations:
class Defaults
{
    int d_x;
    Mov d_mov;
    public:
        Defaults() = default;
        // the copy constructor is suppressed by the move members unless declared:
        Defaults(Defaults const &other) = default;
        Defaults(Defaults &&tmp) = default;
        Defaults(int x);
        Defaults &operator=(Defaults const &rhs) = default;
        Defaults &operator=(Defaults &&tmp);
};
Be cautious, declaring defaults, as default implementations copy data members of primitive types
byte-by-byte from the source object to the destination object. This is likely to cause problems with
pointer type data members.
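The = default suffix can only be used when declaring members for which the compiler can provide a
default implementation (e.g., the default, copy and move constructors, the copy and move assignment
operators and the destructor); it does not depend on the section of the class in which the member is declared.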
9.7.8 Moving: implications for class design
Here are some general rules to apply when designing classes offering value semantics (i.e., classes
whose objects can be used to initialize other objects of their class and that can be assigned to other
objects of their class):
• Classes using pointers to dynamically allocated memory, owned by the class’s objects must be
provided with a copy constructor, an overloaded copy assignment operator and a destructor;
• Classes using pointers to dynamically allocated memory, owned by the class’s objects, should
be provided with a move constructor and a move assignment operator;
• Classes using composition may benefit from move constructors and move assignment operators
as well. Some classes support neither move nor copy construction and assignment (for example:
stream classes don’t). If your class contains data members of such class types then defining
move operations is pointless.
In the previous sections we’ve also encountered an important design principle that can be applied to
move-aware classes:
Whenever a member of a class receives a const & to an object of its own class and creates
a copy of that object to perform its actual actions on, then that function’s implementation
can be implemented by an overloaded function expecting an rvalue reference.
The former function can now call the latter by passing std::move(tmp) to it. The advantages of
this design principle should be clear: there is only one implementation of the actual actions, and the
class automatically becomes move-aware with respect to the involved function.
We’ve seen an initial example of the use of this principle in section 9.7.4. Of course, the principle
cannot be applied to the copy constructor itself, as you need a copy constructor to make a copy. The
copy- and move constructors must always be implemented independently from each other.
9.8 Copy Elision and Return Value Optimization
When the compiler selects a member function (or constructor) it applies a simple set of rules, matching
arguments with parameter types.
Below two tables are shown. The first table should be used in cases where a function argument has
a name, the second table should be used in cases where the argument is anonymous. In each table
select the const or non-const column and then use the topmost overloaded function that is available
having the specified parameter type.
The tables do not handle functions defining value parameters. If a function has overloads expecting,
respectively, a value parameter and some form of reference parameter the compiler reports an ambiguity
when such a function is called. In the following selection procedure we may assume, without
loss of generality, that this ambiguity does not occur and that all parameter types are reference
parameters.
Parameter types matching a function’s argument of type T if the argument is:
• a named argument (an lvalue or a named rvalue)
                            the argument is:
                            non-const        const
    Use the topmost         (T &)
    available function      (T const &)      (T const &)
Example: for an int x argument a function fun(int &) is selected rather than a function
fun(int const &). If no fun(int &) is available the fun(int const &) function is used.
If neither is available (and fun(int) hasn’t been defined instead) the compiler reports an error.
• an anonymous argument (an anonymous temporary or a literal value)
                            the argument is:
                            non-const        const
    Use the topmost         (T &&)
    available function      (T const &&)     (T const &&)
                            (T const &)      (T const &)
Example: when the return value of an int arg() function is passed to a function fun for
which various overloaded versions are available fun(int &&) is selected. If this function is
unavailable but fun(int const &) is, then the latter function is used. If none of these two
functions is available the compiler reports an error.
The tables show that eventually all arguments can be used with a function specifying a T const
& parameter. For anonymous arguments a similar catch all is available having a higher priority:
T const && matches all anonymous arguments. Functions having this signature are normally not
defined as their implementations are (should be) identical to the implementations of the functions
expecting a T const & parameter. Since the temporary can apparently not be modified a function
defining a T const && parameter has no alternative but to copy the temporary’s resources. As
this task is already performed by functions expecting a T const &, there is no need for implementing
functions expecting T const && parameters.
As we’ve seen the move constructor grabs the information from a temporary for its own use. That is
OK as the temporary is going to be destroyed after that anyway. It also means that the temporary’s
data members are modified.
Having defined appropriate copy and/or move constructors it may be somewhat surprising to learn
that the compiler may decide to stay clear of a copy or move operation. After all making no copy and
not moving is more efficient than copying or moving.
The option the compiler has to avoid making copies (or perform move operations) is called copy
elision or return value optimization. In all situations where copy or move constructions are appropriate
the compiler may apply copy elision. Here are the rules. In sequence the compiler considers
the following options, stopping once an option can be selected:
• if a copy or move constructor exists, try copy elision
• if a move constructor exists, move.
• if a copy constructor exists, copy.
• report an error
All modern compilers apply copy elision. Here are some examples where it may be encountered:
    class Elide;

    Elide fun()             // 1
    {
        Elide ret;
        return ret;
    }

    void gun(Elide par);

    Elide elide(fun());     // 2

    gun(fun());             // 3
• At 1 ret may never exist. Instead of using ret and copying ret eventually to fun’s return
value it may directly use the area used to contain fun’s return value.
• At 2 fun’s return value may never exist. Instead of defining an area containing fun’s return
value and copying that return value to elide the compiler may decide to use elide to create
fun’s return value in.
• At 3 the compiler may decide to do the same for gun’s par parameter: fun’s return value is
directly created in par’s area, thus eliding the copy operation from fun’s return value to par.
9.9 Plain Old Data
C++ inherited the struct concept from C and extended it with the class concept. Structs are still used
in C++, mainly to store and pass around aggregates of different data types. A common term for
these structs is plain old data (pod). Plain old data is commonly used in C++ programs to aggregate
data. E.g., when a function needs to return a double, bool and std::string these three different
data types may be aggregated using a struct that merely exists to pass along values. Data
protection and functionality is hardly ever an issue. For such cases C and C++ use structs. But
as a C++ struct is just a class with special access rights some members (constructors, destructor,
overloaded assignment operator) may implicitly be defined. The plain old data capitalizes on this
concept by requiring that its definition remains as simple as possible. Pod is considered any class or
struct having the following characteristics:
• it has a trivial default constructor.
If a type has some trivial member then the type (or its base class(es), cf. chapter 13) does
not explicitly define that member. Rather, it is supplied by the compiler. A trivial default
constructor leaves all its non-class data members uninitialized and calls the default constructors
of all its class data members. A class having a trivial default constructor does not define any
constructor at all (nor does/do its base class/classes). It may also define the default constructor
using the default constructor syntax introduced in section 7.6;
• it has a trivial copy constructor.
A trivial copy constructor byte-wise copies the non-class data members from the provided existing
class object and uses copy constructors to initialize its base class(es) and class data
members with the information found in the provided existing class object;
• it has a trivial overloaded assignment operator.
A trivial assignment operator performs a byte-wise copy of the non-class data members of
the provided right-hand class object and uses overloaded assignment operators to assign new
values to its class data members using the corresponding members of the provided right-hand
class object;
• it has a trivial destructor.
A trivial destructor calls the destructors of its base class(es) and class-type data members;
• it has a standard layout.
A standard-layout class or struct
• has only non-static data members that are themselves showing the standard-layout;
• has identical access control (public, private, protected) for all its non-static members;
Furthermore, in the context of class derivation (cf. chapters 14 and 13), a standard-layout class or
struct:
• has only base classes that themselves show the standard-layout;
• has at most one (in)direct base class having non-static members;
• has no base classes of the same type as its first non-static data member;
• has no virtual base classes;
• has no virtual members.
9.10 Conclusion
Four important extensions to classes were introduced in this chapter: the destructor, the copy constructor,
the move constructor and the overloaded assignment operator. In addition the importance
of swapping, especially in combination with the overloaded assignment operator, was stressed.
Classes having pointer data members, pointing to dynamically allocated memory controlled by the
objects of those classes, are potential sources of memory leaks. The extensions introduced in this
chapter implement the standard defense against such memory leaks.
Encapsulation (data hiding) allows us to ensure that the object’s data integrity is maintained. The
automatic activation of constructors and destructors greatly enhances our capabilities to ensure the
data integrity of objects doing dynamic memory allocation.
A simple conclusion is therefore that classes whose objects allocate memory controlled by themselves
must at least implement a destructor, an overloaded assignment operator and a copy constructor.
Implementing a move constructor remains optional, but it allows us to use factory functions with
classes not allowing copy construction and/or assignment.
In the end, assuming the availability of at least a copy or move constructor, the compiler might
avoid them using copy elision. The compiler is free to use copy elision wherever possible; it is,
however, never a requirement. The compiler may therefore always decide not to use copy elision. In
all situations where otherwise a copy or move constructor would have been used the compiler may
consider using copy elision.

Chapter 10
Exceptions
C supports several ways for a program to react to situations breaking the normal unhampered flow
of a program:
• The function may notice the abnormality and issue a message. This is probably the least
disastrous reaction a program may show.
• The function in which the abnormality is observed may decide to stop its intended task, returning
an error code to its caller. This is a great example of postponing decisions: now the
calling function is faced with a problem. Of course the calling function may act similarly, by
passing the error code up to its caller.
• The function may decide that things are going out of hand, and may call exit to terminate the
program completely. A tough way to handle a problem if only because the destructors of local
objects aren’t activated.
• The function may use a combination of the functions setjmp and longjmp to enforce non-local
exits. This mechanism implements a kind of goto jump, allowing the program to continue at
an outer level, skipping the intermediate levels which would have to be visited if a series of
returns from nested functions would have been used.
In C++ all these flow-breaking methods are still available. However, of the mentioned alternatives,
setjmp and longjmp aren’t frequently encountered in C++ (or even in C) programs, due to the fact
that the program flow is completely disrupted.
C++ offers exceptions as the preferred alternative to, e.g., setjmp and longjmp. Exceptions allow
C++ programs to perform a controlled non-local return, without the disadvantages of longjmp and
setjmp.
Exceptions are the proper way to bail out of a situation which cannot be handled easily by a function
itself, but which is not disastrous enough for a program to terminate completely. Also, exceptions
provide a flexible layer of control between the short-range return and the crude exit.
In this chapter exceptions are covered. First an example is given of the different impact exceptions
and the setjmp/longjmp combination have on programs. This example is followed by a discussion
of the formal aspects of exceptions. In this part the guarantees our software should be able to offer
when confronted with exceptions are presented. Exceptions and their guarantees have consequences
for constructors and destructors. We’ll encounter these consequences at the end of this chapter.
10.1 Exception syntax
Before contrasting the traditional C way of handling non-local gotos with exceptions let’s introduce
the syntactic elements that are involved when using exceptions.
• Exceptions are generated by a throw statement. The keyword throw, followed by an expression
of a certain type, throws the expression value as an exception. In C++ anything having
value semantics may be thrown as an exception: an int, a bool, a string, etc. However,
there also exists a standard exception type (cf. section 10.8) that may be used as base class (cf.
chapter 13) when defining new exception types.
• Exceptions are generated within a well-defined local environment, called a try-block. The
run-time support system ensures that all of the program’s code is itself surrounded by a global
try block. Thus, every exception generated by our code will always reach the boundary of at
least one try-block. A program terminates when an exception reaches the boundary of the
global try block, and when this happens destructors of local and global objects that were alive
at the point where the exception was generated are not called. This is not a desirable situation
and therefore all exceptions should be generated within a try-block explicitly defined by the
program. Here is an example of a string exception thrown from within a try-block:
    try
    {
        // any code can be defined here
        if (someConditionIsTrue)
            throw string("this is the std::string exception");
        // any code can be defined here
    }
• catch: Immediately following the try-block, one or more catch-clauses must be defined. A
catch-clause consists of a catch-header defining the type of the exception it can catch followed
by a compound statement defining what to do with the caught exception:
    catch (string const &msg)
    {
        // statements in which the caught string object is handled
    }
Multiple catch clauses may appear underneath each other, one for each exception type that
has to be caught. In general the catch clauses may appear in any order, but there are exceptions
requiring a specific order. To avoid confusion it’s best to put a catch clause for the most
general exception last. At most one catch clause will be activated. C++ does not support a
Java-style finally-clause activated after completing a catch clause.
10.2 An example using exceptions
In the following examples the same basic program is used. The program uses two classes, Outer
and Inner.
First, an Outer object is defined in main, and its member Outer::fun is called. Then, in
Outer::fun an Inner object is defined. Having defined the Inner object, its member Inner::fun
is called.
That’s about it. The function Outer::fun terminates, calling inner’s destructor. Then the program
terminates, activating outer’s destructor. Here is the basic program:
    #include <iostream>
    using namespace std;

    class Inner
    {
        public:
            Inner();
            ~Inner();
            void fun();
    };

    Inner::Inner()
    {
        cout << "Inner constructor\n";
    }

    Inner::~Inner()
    {
        cout << "Inner destructor\n";
    }

    void Inner::fun()
    {
        cout << "Inner fun\n";
    }

    class Outer
    {
        public:
            Outer();
            ~Outer();
            void fun();
    };

    Outer::Outer()
    {
        cout << "Outer constructor\n";
    }

    Outer::~Outer()
    {
        cout << "Outer destructor\n";
    }

    void Outer::fun()
    {
        Inner in;

        cout << "Outer fun\n";
        in.fun();
    }

    int main()
    {
        Outer out;

        out.fun();
    }
    /*
        Generated output:
        Outer constructor
        Inner constructor
        Outer fun
        Inner fun
        Inner destructor
        Outer destructor
    */
After compiling and running, the program’s output is entirely as expected: the destructors are called
in their correct order (reversing the calling sequence of the constructors).
Now let’s focus our attention on two variants in which we simulate a non-fatal disastrous event in
the Inner::fun function. This event must supposedly be handled near main’s end.
We’ll consider two variants. In the first variant the event is handled by setjmp and longjmp; in
the second variant the event is handled using C++’s exception mechanism.
10.2.1 Anachronisms: ‘setjmp’ and ‘longjmp’
The basic program from the previous section is slightly modified to contain a variable jmp_buf
jmpBuf used by setjmp and longjmp.
The function Inner::fun calls longjmp, simulating a disastrous event, to be handled near main’s
end. In main a target location for the long jump is defined through the function setjmp. Setjmp’s
zero return indicates the initialization of the jmp_buf variable, in which case Outer::fun is called.
This situation represents the ‘normal flow’.
The program’s return value is zero only if Outer::fun terminates normally. The program, however,
is designed in such a way that this won’t happen: Inner::fun calls longjmp. As a result the
execution flow returns to the setjmp function. In this case it does not return a zero return value.
Consequently, after calling Inner::fun from Outer::fun main’s if-statement is entered and the
program terminates with return value 1. Try to follow these steps when studying the following
program source, which is a direct modification of the basic program given in section 10.2:
    #include <iostream>
    #include <csetjmp>
    #include <cstdlib>
    using namespace std;

    jmp_buf jmpBuf;

    class Inner
    {
        public:
            Inner();
            ~Inner();
            void fun();
    };

    Inner::Inner()
    {
        cout << "Inner constructor\n";
    }

    void Inner::fun()
    {
        cout << "Inner fun\n";
        longjmp(jmpBuf, 0);         // a 0 argument makes setjmp return 1
    }

    Inner::~Inner()
    {
        cout << "Inner destructor\n";
    }

    class Outer
    {
        public:
            Outer();
            ~Outer();
            void fun();
    };

    Outer::Outer()
    {
        cout << "Outer constructor\n";
    }

    Outer::~Outer()
    {
        cout << "Outer destructor\n";
    }

    void Outer::fun()
    {
        Inner in;

        cout << "Outer fun\n";
        in.fun();
    }

    int main()
    {
        Outer out;

        if (setjmp(jmpBuf) != 0)    // nonzero: we got here via longjmp
            return 1;

        out.fun();
    }
    /*
        Generated output:
        Outer constructor
        Inner constructor
        Outer fun
        Inner fun
        Outer destructor
    */
This program’s output clearly shows that inner’s destructor is not called. This is a direct consequence
of the non-local jump performed by longjmp. Processing proceeds immediately from the
longjmp call inside Inner::fun to setjmp in main. There, its return value is nonzero, and
the program terminates with return value 1. Because of the non-local jump Inner::~Inner is
never executed: upon return to main’s setjmp the existing stack is simply broken down disregarding
any destructors waiting to be called.
This example illustrates that the destructors of objects can easily be skipped when longjmp and
setjmp are used and C++ programs should therefore avoid those functions like the plague.
10.2.2 Exceptions: the preferred alternative
Exceptions are C++’s answer to the problems caused by setjmp and longjmp. Here is an example
using exceptions. The program is once again derived from the basic program of section 10.2:
    #include <iostream>
    using namespace std;

    class Inner
    {
        public:
            Inner();
            ~Inner();
            void fun();
    };

    Inner::Inner()
    {
        cout << "Inner constructor\n";
    }

    Inner::~Inner()
    {
        cout << "Inner destructor\n";
    }

    void Inner::fun()
    {
        cout << "Inner fun\n";
        throw 1;
        cout << "This statement is not executed\n";
    }

    class Outer
    {
        public:
            Outer();
            ~Outer();
            void fun();
    };

    Outer::Outer()
    {
        cout << "Outer constructor\n";
    }

    Outer::~Outer()
    {
        cout << "Outer destructor\n";
    }

    void Outer::fun()
    {
        Inner in;

        cout << "Outer fun\n";
        in.fun();
    }

    int main()
    {
        Outer out;

        try
        {
            out.fun();
        }
        catch (int x)
        {}
    }
    /*
        Generated output:
        Outer constructor
        Inner constructor
        Outer fun
        Inner fun
        Inner destructor
        Outer destructor
    */
Inner::fun now throws an int exception where a longjmp was previously used. Since in.fun is
called by out.fun, the exception is generated within the try block surrounding the out.fun call.
As an int value was thrown this value reappears in the catch clause beyond the try block.
Now Inner::fun terminates by throwing an exception instead of calling longjmp. The exception is
caught in main, and the program terminates. Now we see that inner’s destructor is properly called.
It is interesting to note that Inner::fun’s execution really terminates at the throw statement: The
cout statement, placed just beyond the throw statement, isn’t executed.
What did this example teach us?
• Exceptions provide a means to break a function’s (and program’s) normal flow without having
to use a cascade of return-statements, and without the need to terminate the program using
blunt tools like the function exit.
• Exceptions do not disrupt the proper activation of destructors. Since setjmp and longjmp do
disrupt the proper activation of destructors their use is strongly discouraged in C++.
10.3 Throwing exceptions
Exceptions are generated by throw statements. The throw keyword is followed by an expression,
defining the thrown exception value. Example:
    throw "Hello world";        // throws a char const *
    throw 18;                   // throws an int
    throw string("hello");      // throws a string
Local objects cease to exist when a function terminates. This is no different for exceptions.
Objects defined locally in functions are automatically destroyed once exceptions thrown by these
functions leave these functions. This also happens to objects thrown as exceptions. However, just
before leaving the function context the object is copied and it is this copy that eventually reaches the
appropriate catch clause.
The following example illustrates this process. Object::fun defines a local Object toThrow,
that is thrown as an exception. The exception is caught in main. But by then the object originally
thrown doesn’t exist anymore, and main receives a copy:
    #include <iostream>
    #include <string>
    using namespace std;

    class Object
    {
        string d_name;

        public:
            Object(string name)
            :
                d_name(name)
            {
                cout << "Constructor of " << d_name << "\n";
            }
            Object(Object const &other)
            :
                d_name(other.d_name + " (copy)")
            {
                cout << "Copy constructor for " << d_name << "\n";
            }
            ~Object()
            {
                cout << "Destructor of " << d_name << "\n";
            }
            void fun()
            {
                Object toThrow("’local object’");

                cout << "Calling fun of " << d_name << "\n";
                throw toThrow;
            }
            void hello()
            {
                cout << "Hello by " << d_name << "\n";
            }
    };

    int main()
    {
        Object out("’main object’");

        try
        {
            out.fun();
        }
        catch (Object o)
        {
            cout << "Caught exception\n";
            o.hello();
        }
    }
Object’s copy constructor is special in that it defines its name as the other object’s name to which
the string " (copy)" is appended. This allows us to monitor the construction and destruction of
objects more closely. Object::fun generates an exception, and throws its locally defined object.
Just before throwing the exception the program has produced the following output:
Constructor of ’main object’
Constructor of ’local object’
Calling fun of ’main object’
When the exception is generated the next line of output is produced:
Copy constructor for ’local object’ (copy)
The local object is passed to throw where it is treated as a value argument, creating a copy of
toThrow. This copy is thrown as the exception, and the local toThrow object ceases to exist. The
thrown exception is now caught by the catch clause, defining an Object value parameter. Since
this is a value parameter yet another copy is created. Thus, the program writes the following text:
Destructor of ’local object’
Copy constructor for ’local object’ (copy) (copy)
The catch block now displays:
Caught exception
Following this o’s hello member is called, showing us that we indeed received a copy of the copy of
the original toThrow object:
Hello by ’local object’ (copy) (copy)
Then the program terminates and its remaining objects are now destroyed, reversing their order of
creation:
Destructor of ’local object’ (copy) (copy)
Destructor of ’local object’ (copy)
Destructor of ’main object’
The copy created by the catch clause clearly is superfluous. It can be avoided by defining object
reference parameters in catch clauses: ‘catch (Object &o)’. The program now produces the
following output:
Constructor of ’main object’
Constructor of ’local object’
Calling fun of ’main object’
Copy constructor for ’local object’ (copy)
Destructor of ’local object’
Caught exception
Hello by ’local object’ (copy)
Destructor of ’local object’ (copy)
Destructor of ’main object’
Only a single copy of toThrow was created.
It’s a bad idea to throw a pointer to a locally defined object. The pointer is thrown, but the object
to which the pointer refers ceases to exist once the exception is thrown. The catcher receives a wild
pointer. Bad news....
Let’s summarize the above findings:
• Local objects are thrown as copied objects;
• Don’t throw pointers to local objects;
• It is possible to throw pointers to dynamically generated objects. In this case one must take
care that the generated object is properly deleted by the exception handler to prevent a memory
leak.
Exceptions are thrown in situations where a function can’t complete its assigned task, but the program
is still able to continue. Imagine a program offering an interactive calculator. The program
expects numeric expressions, which are evaluated. Expressions may show syntactic errors or it may
be mathematically impossible to evaluate them. Maybe the calculator allows us to define and use
variables and the user might refer to non-existing variables: plenty of reasons for the expression
evaluation to fail, and so many reasons for exceptions to be thrown. None of those should terminate
the program. Instead, the program’s user is informed about the nature of the problem and is invited
to enter another expression. Example:
    if (!parse(expressionBuffer))       // parsing failed
        throw "Syntax error in expression";

    if (!lookup(variableName))          // variable not found
        throw "Variable not defined";

    if (divisionByZero())               // unable to do division
        throw "Division by zero is not defined";
Where these throw statements are located is irrelevant: they may be found deeply nested inside the
program, or at a more superficial level. Furthermore, functions may be used to generate the exception
to be thrown. An Exception object might support stream-like insertion operations allowing us
to do, e.g.,
    if (!lookup(variableName))
        throw Exception() << "Undefined variable '" << variableName << "'";
10.3.1 The empty ‘throw’ statement
Sometimes it is required to inspect a thrown exception. An exception catcher may decide to ignore
the exception, to process the exception, to rethrow it after inspection or to change it into another
kind of exception. For example, in a server-client application the client may submit requests to the
server by entering them into a queue. Normally every request is eventually answered by the server.
The server may reply that the request was successfully processed, or that some sort of error has
occurred. On the other hand, the server may have died, and the client should be able to discover this
calamity, by not waiting indefinitely for the server to reply.
In this situation an intermediate exception handler is called for. A thrown exception is first inspected
at the middle level. If possible it is processed there. If it is not possible to process the exception at the
middle level, it is passed on, unaltered, to a more superficial level, where the really tough exceptions
are handled.
By placing an empty throw statement in the exception handler’s code the received exception is
passed on to the next level that might be able to process that particular type of exception. The
rethrown exception is never handled by one of its neighboring exception handlers; it is always transferred
to an exception handler at a more superficial level.
In our server-client situation a function
initialExceptionHandler(string &exception)
could be designed to handle the string exception. The received message is inspected. If it’s
a simple message it’s processed, otherwise the exception is passed on to an outer level. In
initialExceptionHandler’s implementation the empty throw statement is used:
    void initialExceptionHandler(string &exception)
    {
        if (!plainMessage(exception))
            throw;

        handleTheMessage(exception);
    }
Below (section 10.5), the empty throw statement is used to pass on the exception received by a
catch-block. Therefore, a function like initialExceptionHandler can be used for a variety of
thrown exceptions, as long as their types match initialExceptionHandler’s parameter, which is
a string.
The next example jumps slightly ahead, using some of the topics covered in chapter 14. The example
may be skipped, though, without loss of continuity.
A basic exception handling class can be constructed from which specific exception types are
derived. Suppose we have a class Exception, having a member function ExceptionType
Exception::severity. This member function tells us (little wonder!) the severity of a thrown
exception. It might be Info, Notice, Warning, Error or Fatal. The information contained in
the exception depends on its severity and is processed by a function handle. In addition, all exceptions
support a member function like textMsg, returning textual information about the exception
in a string.
By defining a polymorphic function handle it can be made to behave differently, depending on the
nature of a thrown exception, when called from a basic Exception pointer or reference.
In this case, a program may throw any of these five exception types. Assuming that the classes
Message and Warning were derived from the class Exception, then the handle function matching
the exception type will automatically be called by the following exception catcher:
    catch (Exception &ex)
    {
        cout << ex.textMsg() << '\n';

        if
        (
            ex.severity() != ExceptionType::Warning
            &&
            ex.severity() != ExceptionType::Message
        )
            throw;              // Pass on other types of Exceptions

        ex.handle();            // Process a message or a warning
    }
Now anywhere in the try block preceding the exception handler Exception objects or objects of
one of its derived classes may be thrown. All those exceptions will be caught by the above handler.
E.g.,
    throw Info();
    throw Warning();
    throw Notice();
    throw Error();
    throw Fatal();
10.4 The try block
The try-block surrounds throw statements. Remember that a program is always surrounded by
a global try block, so throw statements may appear anywhere in your code. More often, though,
throw statements are used in function bodies and such functions may be called from within try
blocks.
A try block is defined by the keyword try followed by a compound statement. This block, in turn,
must be followed by at least one catch handler:
    try
    {
        // any statements here
    }
    catch (...)     // at least one catch clause here
    {}
Try-blocks are commonly nested, creating exception levels. For example, main’s code is surrounded
by a try-block, forming an outer level handling exceptions. Within main’s try-block functions are
called which may also contain try-blocks, forming the next exception level. As we have seen (section
10.3.1), exceptions thrown in inner level try-blocks may or may not be processed at that level. By
placing an empty throw statement in an exception handler, the thrown exception is passed on to
the next (outer) level.
10.5 Catching exceptions
A catch clause consists of the keyword catch followed by a parameter list defining one parameter
specifying type and (parameter) name of the exception caught by that particular catch handler.
This name may then be used as a variable in the compound statement following the catch clause.
Example:
    catch (string &message)
    {
        // code to handle the message
    }
Primitive types and objects may be thrown as exceptions. It’s a bad idea to throw a pointer or reference
to a local object, but a pointer to a dynamically allocated object may be thrown if the exception
handler deletes the allocated memory to prevent a memory leak. Nevertheless, throwing such a
pointer is dangerous as the exception handler won’t be able to distinguish dynamically allocated
memory from non-dynamically allocated memory, as illustrated by the next example:
    try
    {
        static int x;
        int *xp = &x;

        if (condition1)
            throw xp;

        xp = new int(0);
        if (condition2)
            throw xp;
    }
    catch (int *ptr)
    {
        // delete ptr or not?
    }
Close attention should be paid to the nature of the parameter of the exception handler, to make
sure that when pointers to dynamically allocated memory are thrown the memory is returned once
the handler has processed the pointer. In general pointers should not be thrown as exceptions. If
dynamically allocated memory must be passed to an exception handler then the pointer should be
wrapped in a smart pointer, like unique_ptr or shared_ptr (cf. sections 18.3 and 18.4).
Multiple catch handlers may follow a try block, each handler defining its own exception type.
The order of the exception handlers is important. When an exception is thrown, the first exception
handler matching the type of the thrown exception is used and remaining exception handlers are
ignored. Consequently, at most one exception handler following a try-block is activated. Normally this
is of no concern as each exception has its own unique type.
Example: if exception handlers are defined for char *s and void *s then NTB strings are caught
by the former handler. Note that a char * can also be considered a void *, but the exception type
matching procedure is smart enough to use the char * handler with the thrown NTBS. Handlers
should be designed very type specific to catch the correspondingly typed exception. For example,
int-exceptions are not caught by double-catchers, char-exceptions are not caught by int-catchers.
Here is a little example illustrating that the order of the catchers is not important for types not
having any hierarchical relationship to each other (i.e., int is not derived from double; string is
not derived from an NTBS):
#include <iostream>
#include <string>
using namespace std;
int main()
{
    while (true)
    {
        try
        {
            string s;
            cout << "Enter a,c,i,s for ascii-z, char, int, string "
                    "exception\n";
            if (!getline(cin, s))       // stop at end of input
                return 0;
            switch (s[0])
            {
                case 'a':
                    throw "ascii-z";
                case 'c':
                    throw 'c';
                case 'i':
                    throw 12;
                case 's':
                    throw string();
            }
        }
        catch (string const &)
        {
            cout << "string caught\n";
        }
        catch (char const *)
        {
            cout << "ASCII-Z string caught\n";
        }
        catch (double)
        {
            cout << "isn't caught at all\n";
        }
        catch (int)
        {
            cout << "int caught\n";
        }
        catch (char)
        {
            cout << "char caught\n";
        }
    }
}
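For types that do have a hierarchical relationship the order of the handlers does matter. The following sketch (the function classify is a hypothetical name) uses the standard classes std::runtime_error and std::logic_error, both derived from std::exception. The most specific handler is placed first; the std::exception handler catches everything else derived from std::exception. If the two handlers were swapped, the std::exception handler would also match runtime_error objects and the specific handler would never be used.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Reports which handler processed the exception selected by 'kind'.
std::string classify(int kind)
{
    try
    {
        if (kind == 0)
            throw std::runtime_error("run-time problem");
        if (kind == 1)
            throw std::logic_error("logic problem");
        return "nothing thrown";
    }
    catch (std::runtime_error const &)  // most specific handler first
    {
        return "runtime_error handler";
    }
    catch (std::exception const &)      // base class: matches all other
    {                                   // classes derived from it
        return "exception handler";
    }
}
```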
Rather than defining specific exception handlers a specific class can be designed whose objects contain
information about the exception. Such an approach was mentioned earlier, in section 10.3.1.
Using this approach, there’s only one handler required, since we know we don’t throw other types of
exceptions:
try
{
// code throws only Exception objects
}
catch (Exception &ex)
{
ex.handle();
}
When the code of an exception handler has been processed, execution continues beyond the last
exception handler directly following the matching try-block (assuming the handler doesn’t itself
use flow control statements (like return or throw) to break the default flow of execution). The
following cases can be distinguished:
• If no exception was thrown within the try-block no exception handler is activated, and execution
continues from the last statement in the try-block to the first statement beyond the last
catch-block.
• If an exception was thrown within the try-block but neither the current level nor another level
contains an appropriate exception handler, the program’s default exception handler is called,
aborting the program.
• If an exception was thrown from the try-block and an appropriate exception handler is available,
then the code of that exception handler is executed. Following that, the program’s execution
continues at the first statement beyond the last catch-block.
All statements in a try block following an executed throw-statement are ignored. However, objects
that were successfully constructed within the try block before executing the throw statement are
destroyed before any exception handler’s code is executed.
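The order of events described in the last paragraph can be made visible with a small sketch. The global events log and the Tracker struct are hypothetical helpers introduced for this illustration only:

```cpp
#include <string>
#include <vector>

std::vector<std::string> events;    // hypothetical log of what happened

struct Tracker
{
    Tracker()
    {
        events.push_back("constructed");
    }
    ~Tracker()
    {
        events.push_back("destroyed");
    }
};

void run()
{
    events.clear();
    try
    {
        Tracker t;      // successfully constructed inside the try block
        throw 1;        // statements following the throw are ignored
    }
    catch (int)
    {
        events.push_back("handler");    // t has been destroyed by now
    }
}
```

After calling run() the log contains "constructed", "destroyed" and "handler", in that order.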
10.5.1 The default catcher
At a certain level of the program only a limited set of handlers may actually be required. Exceptions
whose types belong to that limited set are processed, all other exceptions are passed on to exception
handlers of an outer level try block.
An intermediate type of exception handling may be implemented using the default exception handler,
which must be (due to the hierarchical nature of exception catchers, discussed in section 10.5)
placed beyond all other, more specific exception handlers.
This default exception handler cannot determine the actual type of the thrown exception and cannot
determine the exception’s value but it may execute some statements, and thus do some default
processing. Moreover, the caught exception is not lost, and the default exception handler may use
the empty throw statement (see section 10.3.1) to pass the exception on to an outer level, where it’s
actually processed. Here is an example showing this use of a default exception handler:
#include <iostream>
using namespace std;
int main()
{
try
{
try
{
throw 12.25; // no specific handler for doubles
}
catch (int value)
{
cout << "Inner level: caught int\n";
}
catch (...)
{
cout << "Inner level: generic handling of exceptions\n";
throw;
}
}
catch(double d)
{
cout << "Outer level may use the thrown double: " << d << '\n';
}
}
/*
Generated output:
Inner level: generic handling of exceptions
Outer level may use the thrown double: 12.25
*/
The program’s output illustrates that an empty throw statement in a default exception handler
throws the received exception to the next (outer) level of exception catchers, keeping type and value
of the thrown exception.
Thus, basic or generic exception handling can be accomplished at an inner level, while specific handling,
based on the type of the thrown exception, can be provided at an outer level. Additionally,
particularly in multi-threaded programs (cf. chapter 20), thrown exceptions can be transferred between
threads after converting std::exception objects to std::exception_ptr objects. This
procedure can even be used from inside the default catcher. Refer to section 20.13.1 for further
coverage of the class std::exception_ptr.
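A minimal single-threaded sketch of that procedure, using the standard functions std::current_exception and std::rethrow_exception (the function names capture and report are hypothetical), could look like this:

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Stores the exception that is being handled by the default catcher.
std::exception_ptr capture()
{
    std::exception_ptr stored;
    try
    {
        throw std::runtime_error("worker failed");
    }
    catch (...)
    {
        stored = std::current_exception();  // keeps type and value
    }
    return stored;
}

// Rethrows the stored exception (possibly in another thread) and
// returns its message.
std::string report(std::exception_ptr ptr)
{
    try
    {
        if (ptr)
            std::rethrow_exception(ptr);
    }
    catch (std::exception const &ex)
    {
        return ex.what();
    }
    return "no exception stored";
}
```

In a multi-threaded program the exception_ptr would be handed to another thread between the two calls; here both functions simply run in the same thread.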
10.6 Declaring exception throwers (deprecated)
Functions defined elsewhere may be linked to code that uses these functions. Such functions are
normally declared in header files, either as standalone functions or as class member functions.
Those functions may of course throw exceptions. Declarations of such functions may contain a (now
deprecated, see also section 23.7) function throw list or exception specification list specifying the
types of the exceptions that can be thrown by the function. For example, a function that may throw
‘char *’ and ‘int’ exceptions can be declared as
void exceptionThrower() throw(char *, int);
A function throw list immediately follows the function header (and it also follows a possible const
specifier). Throw lists may be empty. A throw list has the following general form:
throw([type1 [, type2, type3, ...]])
If a function is guaranteed not to throw exceptions an empty function throw list may be used. E.g.,
void noExceptions() throw ();
In all cases, the function header used in the function definition must exactly match the function
header used in the declaration, including a possibly empty function throw list.
A function for which a function throw list is specified may only throw exceptions of the types mentioned
in its throw list. A run-time error occurs if it throws other types of exceptions than those
mentioned in the function throw list. Example: the function charPintThrower shown below clearly
throws a char const * exception. Since intThrower may throw an int exception, the function
throw list of charPintThrower must also contain int.
#include <iostream>
using namespace std;
void charPintThrower() throw(char const *, int);
class Thrower
{
public:
void intThrower(int) const throw(int);
};
void Thrower::intThrower(int x) const throw(int)
{
if (x)
throw x;
}
void charPintThrower() throw(char const *, int)
{
int x;
cerr << "Enter an int: ";
cin >> x;
Thrower().intThrower(x);
throw "this text is thrown if 0 was entered";
}
void runTimeError() throw(int)
{
throw 12.5;
}
int main()
{
try
{
charPintThrower();
}
catch (char const *message)
{
cerr << "Text exception: " << message << '\n';
}
catch (int value)
{
cerr << "Int exception: " << value << '\n';
}
try
{
cerr << "Generating a run-time error\n";
runTimeError();
}
catch(...)
{
cerr << "not reached\n";
}
}
A function without a throw list may throw any kind of exception. Without a function throw list the
program’s designer is responsible for providing the correct handlers.
For various reasons declaring exception throwers is now deprecated (throw lists naming exception
types were removed from the language in C++17). Declaring exception throwers
does not imply that the compiler checks whether an improper exception is thrown. Rather, the
function will be surrounded by additional code in which the actual exception that is thrown is processed.
Instead of compile time checks one gets run-time overhead, resulting in additional code (and
execution time) that is added to the function’s code. One could write, e.g.,
void fun() throw (int)
{
// code of this function, throwing exceptions
}
but the function would be compiled to something like the following (cf. section 10.11 for the
use of try immediately following the function’s header and section 10.8 for a description of
bad_exception):
void fun()
try // this code resulting from throw(int)
{
// the function’s code, throwing all kinds of exceptions
}
catch (int) // remaining code resulting from throw(int)
{
throw; // rethrow the exception, so it can be caught by the
// ‘intended’ handler
}
catch (...) // catch any other exception
{
throw bad_exception{};
}
Run-time overhead is caused by doubling the number of thrown and caught exceptions. Without a
throw list a thrown int is simply caught by its intended handler; with a throw list the int is first
caught by the ‘safeguarding’ handler added to the function. In there it is rethrown to be caught by
its intended handler next.
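Since C++11 the language offers the noexcept specifier and the noexcept operator as the replacement for throw lists. The sketch below is not taken from the text above; the function names mayThrow and neverThrows are hypothetical. It shows that the question "may this call throw?" is answered at compile time, so no safeguarding handlers are required:

```cpp
#include <stdexcept>

void mayThrow()                 // no specification: any exception may
{                               // leave this function
    throw std::runtime_error("problem");
}

void neverThrows() noexcept     // promise: no exception leaves this
{                               // function
}

// The noexcept operator is evaluated at compile time; the functions
// themselves are not called.
constexpr bool throwerIsNoexcept = noexcept(mayThrow());
constexpr bool quietIsNoexcept   = noexcept(neverThrows());

static_assert(!throwerIsNoexcept && quietIsNoexcept,
              "noexcept reflects the functions' declarations");
```

If an exception nevertheless tries to leave a noexcept function, std::terminate is called.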
10.7 Iostreams and exceptions
The C++ I/O library was used well before exceptions were available in C++. Hence, normally the
classes of the iostream library do not throw exceptions. However, it is possible to modify that behavior
using the ios::exceptions member function. This function has two overloaded versions:
• ios::iostate exceptions():
this member returns the state flags for which the stream will throw exceptions;
• void exceptions(ios::iostate state)
this member causes the stream to throw an exception when state state is observed.
In the I/O library, exceptions are objects of the class ios::failure, derived from
std::exception (since C++11 by way of std::system_error). A std::string const &message may be
specified when defining a failure
object. Its message may then be retrieved using its virtual char const *what() const member.
Exceptions should be used in exceptional circumstances. Therefore, we think it is questionable
to have stream objects throw exceptions for fairly normal situations like EOF. Using exceptions to
handle input errors might be defensible (e.g., in situations where input errors should not occur and
imply a corrupted file) but often aborting the program with an appropriate error message would
probably be the more appropriate action. As an example consider the following interactive program
using exceptions to catch incorrect input:
#include <iostream>
#include <climits>
using namespace std;
int main()
{
    cin.exceptions(ios::failbit);       // throw exception on fail
    while (true)
    {
        try
        {
            cout << "enter a number: ";
            int value;
            cin >> value;
            cout << "you entered " << value << '\n';
        }
        catch (ios::failure const &problem)
        {
            cout << problem.what() << '\n';
            if (cin.eof())              // no more input: stop
                return 0;
            cin.clear();
            cin.ignore(INT_MAX, '\n');  // ignore the faulty line
        }
    }
}
By default, exceptions raised from within ostream objects are caught by these objects, which set
their ios::badbit as a result. See also the paragraph on this issue in section 14.8.
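The same mechanism can be tried non-interactively with a string stream. In this sketch the function extractionThrows is a hypothetical name; it merely reports whether extracting an int resulted in an ios::failure exception:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Returns true if extracting an int from 'text' caused the stream to
// throw an ios::failure exception.
bool extractionThrows(std::string const &text)
{
    std::istringstream in(text);
    in.exceptions(std::ios::failbit);   // throw when failbit is set
    try
    {
        int value = 0;
        in >> value;
        return false;                   // extraction succeeded
    }
    catch (std::ios::failure const &)
    {
        return true;
    }
}
```

Extracting from "abc" sets failbit and therefore throws; extracting from "42" only reaches the end of the string after a successful conversion, which does not set failbit.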
10.8 Standard Exceptions
All data types may be thrown as exceptions. Several additional exception classes are now defined
by the C++ standard. Before using those additional exception classes the <stdexcept> header file
must be included. All of these standard exceptions are class types by themselves, but also offer all
facilities of the std::exception class and objects of the standard exception classes may also be
considered objects of the std::exception class.
The std::exception class offers the member
char const *what() const;
describing in a short textual message the nature of the exception.
C++ defines the following standard exception classes:
• std::bad_alloc (this requires the <new> header file): thrown when operator new fails;
• std::bad_exception (this requires the <exception> header file): thrown when
a function tries to generate another type of exception than declared in its function throw list;
• std::bad_cast (this requires the <typeinfo> header file): thrown in the context of polymorphism
(see section 14.6.1);
• std::bad_typeid (this requires the <typeinfo> header file): also thrown in the context of
polymorphism (see section 14.6.2);
All additional exception classes were derived from std::exception. The constructors of all these
additional classes accept std::string const & arguments summarizing the reason for the exception
(retrieved by the exception::what member). The additionally defined exception classes
are:
• std::domain_error: a (mathematical) domain error is detected;
• std::invalid_argument: the argument of a function has an invalid value;
• std::length_error: thrown when an object would have exceeded its maximum permitted
length;
• std::logic_error: a logic error should be thrown when a problem is detected in the internal
logic of the program. Example: a function like C’s printf is called with more arguments than
there are format specifiers in its format string;
• std::out_of_range: thrown when an argument exceeds its permitted range. Example:
thrown by at members when their arguments exceed the range of admissible index values;
• std::overflow_error: an overflow error should be thrown when an arithmetic overflow is
detected. Example: dividing a value by a very small value;
• std::range_error: a range error should be thrown when an internal computation results in
a value exceeding a permissible range;
• std::runtime_error: a runtime error should be thrown when a problem is encountered that
can only be detected while the program is being executed. Example: a non-integral value is entered
when the program’s input expects an integral value;
• std::underflow_error: an underflow error should be thrown when an arithmetic underflow
is detected. Example: dividing a very small value by a very large value.
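The following sketch shows two of these classes in use. The functions squareRoot, tryRoot and probe are hypothetical names. In the standard library std::domain_error and std::out_of_range are both derived from std::logic_error, which in turn is derived from std::exception, so handlers for the base classes also catch them:

```cpp
#include <cmath>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

double squareRoot(double value)
{
    if (value < 0)
        throw std::domain_error("squareRoot: negative argument");
    return std::sqrt(value);
}

// Returns the square root as text, or the exception's message.
std::string tryRoot(double value)
{
    try
    {
        return std::to_string(squareRoot(value));
    }
    catch (std::logic_error const &ex)  // also catches domain_error
    {
        return ex.what();
    }
}

// Returns the element at idx as text, or the name of the handler used.
std::string probe(std::size_t idx)
{
    std::vector<int> values{10, 20, 30};
    try
    {
        return std::to_string(values.at(idx)); // at() checks the index
    }
    catch (std::out_of_range const &)
    {
        return "out_of_range";
    }
    catch (std::exception const &)
    {
        return "other standard exception";
    }
}
```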
10.9 System error, error code and error category
A std::system_error can be thrown when an error occurs that has an associated error code. Such
errors are typically encountered when calling low-level (like operating system) functions.
Before using system_error the <system_error> header file must be included.
A system_error object can be constructed using the standard textual description of the nature of
the encountered error, but in addition accepts an error_code or error_category object (see the next
two sections), further specifying the nature of the error. The error_code and error_category
classes are also declared in the system_error header file.
The header file system_error also defines an enum class errc whose values are equal to and
describe in a less cryptic way the traditional error code values as offered by C macros, e.g.,
enum class errc
{
address_family_not_supported, // EAFNOSUPPORT
address_in_use, // EADDRINUSE
address_not_available, // EADDRNOTAVAIL
already_connected, // EISCONN
argument_list_too_long, // E2BIG
argument_out_of_domain, // EDOM
bad_address, // EFAULT
...
};
In addition to the standard what member, the system_error class also offers a member code
returning a const reference to the exception’s error code. Here is the class’s public interface:
class system_error: public runtime_error
{
public:
system_error(error_code ec, string const &what_arg);
system_error(error_code ec, char const *what_arg);
system_error(error_code ec);
system_error(int ev, error_category const &ecat,
string const &what_arg);
system_error(int ev, error_category const &ecat,
char const *what_arg);
system_error(int ev, error_category const &ecat);
error_code const &code() const noexcept;
char const *what() const noexcept;
};
The NTBS returned by its what member may be formatted by a system_error object like this:
what_arg + ": " + code().message()
Note that, although system_error was derived from runtime_error, you’ll lose the code member
when catching a std::exception object. Of course, downcasting is always possible, but that’s a
stopgap. Therefore, if a system_error is thrown, a matching catch(system_error const &)
clause should be provided (for a flexible alternative, see the class FBB::Exception in the author’s
Bobcat library, footnote 1).
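A small sketch of throwing and catching a system_error follows. The function openFile is a hypothetical stand-in for a failing low-level call; errorValueOf and whatOf are hypothetical helpers showing what the handler can retrieve:

```cpp
#include <string>
#include <system_error>

// Hypothetical wrapper around a low-level call that always fails with
// the 'no such file or directory' error code.
void openFile(std::string const &name)
{
    throw std::system_error(
        std::make_error_code(std::errc::no_such_file_or_directory),
        "cannot open " + name);
}

// Returns the numeric error value, or 0 if nothing was thrown.
int errorValueOf(std::string const &name)
{
    try
    {
        openFile(name);
    }
    catch (std::system_error const &err)    // code() is available here
    {
        return err.code().value();
    }
    return 0;
}

// Returns the text produced by the what member.
std::string whatOf(std::string const &name)
{
    try
    {
        openFile(name);
    }
    catch (std::system_error const &err)
    {
        return err.what();
    }
    return "";
}
```

The text returned by whatOf contains the what_arg string that was passed to the constructor; the exact wording following it depends on the implementation.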
10.9.1 The class ‘std::error_code’
Objects of the class std::error_code hold error code values, which may be defined by the operating
system or comparable low-level functions.
Before using error_code the <system_error> header file must be included.
The class offers the following constructors, members, and free functions:
Constructors:
• error_code() noexcept:
the default construction initializes the error code with an error value 0 and an error
category set to &system_category();
• error_code(ErrorCodeEnum e) noexcept:
this is a member template (cf. section 22.1.3), defining template <class
ErrorCodeEnum>. It initializes the object with the return value of
make_error_code(e).
The copy constructor is also available.
Members:
• void assign(int val, const error_category& cat):
assigns new values to the current object’s value and category data members;
• error_category const &category() const noexcept:
returns a reference to the object’s error category;
• void clear() noexcept:
after calling this member value is set to 0 and the object’s error category set to
&system_category();
1. http://bobcat.sourceforge.net
• error_condition default_error_condition() const noexcept:
returns category().default_error_condition(value());
• string message() const:
returns category().message(value());
• error_code &operator=(ErrorCodeEnum e) noexcept:
a member template defining template <class ErrorCodeEnum>. It assigns the
return value of make_error_code(e) to the current object;
• explicit operator bool() const noexcept:
returns value() != 0;
• int value() const noexcept:
returns the object’s error value.
Free functions:
• error_code make_error_code(errc e) noexcept:
returns error_code(static_cast<int>(e), generic_category());
• bool operator<(error_code const &lhs, error_code const &rhs) noexcept:
returns
lhs.category() < rhs.category()
||
lhs.category() == rhs.category() && lhs.value() < rhs.value();
• std::ostream &operator<<(std::ostream & os, error_code const &ec):
inserts the following text into os:
os << ec.category().name() << ’:’ << ec.value().
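Several of these members and free functions are combined in the next sketch. The functions describe and timedOut are hypothetical names; everything they call is part of the standard <system_error> interface listed above:

```cpp
#include <sstream>
#include <string>
#include <system_error>

// Uses the insertion operator to obtain "category-name:value".
std::string describe(std::error_code const &ec)
{
    std::ostringstream out;
    out << ec;
    return out.str();
}

// Creates an error_code from an errc value; its category is the
// generic category.
std::error_code timedOut()
{
    return std::make_error_code(std::errc::timed_out);
}
```

A default constructed error_code has value 0 and the system category, so describe(std::error_code()) returns "system:0", and the object converts to false. The object returned by timedOut() converts to true and belongs to generic_category().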
10.9.2 The class ‘std::error_category’
The class std::error_category serves as a base class for types that identify the source and
encoding of a particular category of error codes.
Before using error_category the <system_error> header file must be included.
Classes that are derived from error_category should merely support categories of errors in addition
to those that are already available in C++, and the behavior of such derived classes should
not differ from the behavior of the error_category class itself. Moreover, such derived classes
should not alter errno’s value, or error states provided by other libraries.
The equality of error_category objects is deduced from the equality of their addresses. As
error_category objects are passed by reference, programs using objects of classes derived from
error_category should ensure that only a single object of each such type is actually used: the class
is designed as a Singleton (cf. Gamma et al. (1995), Design Patterns, Addison-Wesley). Looking at
the class’s public interface it becomes clear that no error_category
object can immediately be constructed: there is no public constructor. Nor is it possible to copy
an existing error_category object, as the copy constructor and overloaded assignment operator
have been deleted. Derived classes should enforce these singleton characteristics as well. Here is
the error_category’s non-private class interface:
class error_category
{
public:
error_category(error_category const &) = delete;
virtual ~error_category() noexcept;
error_category& operator=(error_category const &) = delete;
virtual char const *name() const noexcept = 0;
virtual string message(int ev) const = 0;
virtual error_condition
default_error_condition(int ev) const noexcept;
virtual bool equivalent(int code,
error_condition const &condition
) const noexcept;
virtual bool equivalent(error_code const &code,
int condition
) const noexcept;
bool operator==(error_category const &rhs) const noexcept;
bool operator!=(error_category const &rhs) const noexcept;
bool operator<(error_category const &rhs) const noexcept;
protected:
error_category() noexcept;
};
error_category const &generic_category() noexcept;
error_category const &system_category() noexcept;
Members:
• char const *name() const noexcept:
must be overridden, and should return a textual name of the error category;
• string message(int ev) const:
must be overridden, and should return a string describing the error condition denoted
by ev;
• error_condition default_error_condition(int ev) const noexcept:
returns error_condition(ev, *this) (an object of type error_condition that
corresponds to ev);
• bool equivalent(int code, error_condition const &condition) const
noexcept:
returns default_error_condition(code) == condition (true if, for the category
of error represented by *this, code is considered equivalent to condition;
otherwise false);
• bool equivalent(error_code const &code, int condition) const noexcept:
returns *this == code.category() && code.value() == condition (true
if, for the category of error represented by *this, code is considered equivalent to
condition; otherwise false);
• bool operator<(error_category const &rhs) const noexcept:
returns less<const error_category *>()(this, &rhs).
Free functions:
• error_category const &generic_category() noexcept:
returns a reference to an object of a type derived from the class error_category.
Since error_category and its derived classes should be singleton classes, calls to
this function must return references to the same object. The returned object’s name
member shall return a pointer to the string "generic";
• error_category const &system_category() noexcept:
returns a reference to an object of a type derived from the class error_category.
Since error_category and its derived classes should be singleton classes, calls to
this function must return references to the same object. The object’s name member
shall return a pointer to the string "system". If the argument ev of the object’s
default_error_condition member corresponds to a POSIX errno value ‘posv’, then that member
should return error_condition(posv, generic_category()). Otherwise,
error_condition(ev, system_category()) shall be returned.
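Deriving a class from error_category could be sketched as follows. Everything named Parse... is hypothetical and only serves as an illustration of the requirements listed above: name and message are overridden, and a single object is made available through a function returning a reference to a local static object.

```cpp
#include <string>
#include <system_error>

// Hypothetical application-specific error values.
enum class ParseError
{
    ok = 0,
    unexpectedToken = 1,
    unterminatedString = 2
};

class ParseCategory: public std::error_category
{
    public:
        char const *name() const noexcept override
        {
            return "parse";
        }
        std::string message(int ev) const override
        {
            switch (static_cast<ParseError>(ev))
            {
                case ParseError::ok:
                    return "no error";
                case ParseError::unexpectedToken:
                    return "unexpected token";
                case ParseError::unterminatedString:
                    return "unterminated string";
            }
            return "unknown parse error";
        }
};

// All callers receive a reference to the same object, so comparing
// categories by address works as intended.
std::error_category const &parseCategory() noexcept
{
    static ParseCategory instance;
    return instance;
}

std::error_code makeParseError(ParseError e) noexcept
{
    return std::error_code(static_cast<int>(e), parseCategory());
}
```

An error_code created by makeParseError reports parseCategory() as its category, and its message member forwards to ParseCategory::message.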
10.10 Exception guarantees
Software should be exception safe: the program should continue to work according to its specifications
in the face of exceptions. It is not always easy to realize exception safety. In this section some
guidelines and terminology are introduced for discussing exception safety.
Since exceptions may be generated from within all C++ functions, exceptions may be generated in
many situations. Not all of these situations are immediately and intuitively recognized as situations
where exceptions can be thrown. Consider the following function and ask yourself at which points
exceptions may be thrown:
void fun()
{
X x;
cout << x;
X *xp = new X(x);
cout << (x + *xp);
delete xp;
}
If it can be assumed that cout as used above does not throw an exception there are at least 13
opportunities for exceptions to be thrown:
• X x: the default constructor could throw an exception (#1)
• cout << x: the overloaded insertion operator could throw an exception (#2), but its rvalue
argument might not be an X but, e.g., an int, and so X::operator int() const could be
called which offers yet another opportunity for an exception (#3).
• *xp = new X(x): the copy constructor may throw an exception (#4) and operator new (#5a)
too. But did you realize that this latter exception might not be thrown from ::new, but from,
e.g., X’s own overload of operator new? (#5b)
• cout << (x + *xp): we might be seduced into thinking that two X objects are added. But
it doesn’t have to be that way. A separate class Y might exist and X may have a conversion
operator operator Y() const, and operator+(Y const &lhs, X const &rhs),
operator+(X const &lhs, Y const &rhs), and operator+(X const &lhs, X const
&rhs) might all exist. So, if the conversion operator exists, then depending on the kind of overload
of operator+ that is defined either the addition’s left-hand side operand (#6), right-hand
side operand (#7), or operator+ itself (#8) may throw an exception. The resulting value may
again be of any type and so the overloaded cout << return-type-of-operator+ operator
may throw an exception (#9). Since operator+ returns a temporary object it is destroyed
shortly after its use. X’s destructor could throw an exception (#10).
• delete xp: whenever operator new is overloaded operator delete should be overloaded
as well and may throw an exception (#11). And of course, X’s destructor might again throw an
exception (#12).
• }: when the function terminates the local x object is destroyed: again an exception could be
thrown (#13).
It is stressed here (and further discussed in section 10.12) that although it is possible for exceptions
to leave destructors this would violate the C++ standard and so it must be prevented in well-behaved
C++ programs.
How can we expect to create working programs when exceptions might be thrown in so many
situations?
Exceptions may be generated in a great many situations, but serious problems are prevented when
we’re able to provide at least one of the following exception guarantees:
• The basic guarantee: no resources are leaked. In practice this means: all allocated memory is
properly returned when exceptions are thrown.
• The strong guarantee: the program’s state remains unaltered when an exception is thrown (as
an example: the canonical form of the overloaded assignment operator provides this guarantee)
• The nothrow guarantee: this applies to code for which it can be proven that no exception can
be thrown from it.
10.10.1 The basic guarantee
The basic guarantee dictates that functions that fail to complete their assigned tasks must return
all allocated resources, usually memory, before terminating. Since practically all functions and operators
may throw exceptions and since a function may repeatedly allocate resources the blueprint of
a function allocating resources shown below defines a try block to catch all exceptions that might be
thrown. The catch handler’s task is to return all allocated resources and then rethrow the exception.
void allocator(X **xDest, Y **yDest)
{
    X *xp = 0;                  // non-throwing preamble
    Y *yp = 0;
    try                         // this part might throw
    {
        xp = new X[nX];         // alternatively: allocate one object
        yp = new Y[nY];
    }
    catch(...)
    {
        delete[] xp;            // an array was allocated: use delete[]
        throw;
    }
    delete[] *xDest;            // non-throwing postamble
    *xDest = xp;
    delete[] *yDest;
    *yDest = yp;
}
In the pre-try code the pointers to receive the addresses returned by the operator new calls are initialized
to 0. Since the catch handler must be able to return allocated memory they must be available
outside of the try block. If the allocation succeeds the memory pointed to by the destination pointers
is returned and then the pointers are given new values.
Allocation and/or initialization might fail. If allocation fails new throws a std::bad_alloc exception
and the catch handler simply deletes 0 pointers which is OK.
If allocation succeeds but the construction of (some) of the objects fails by throwing an exception
then the following is guaranteed to happen:
• The destructors of all successfully allocated objects are called;
• The dynamically allocated memory to contain the objects is returned
Consequently, there is no memory leak when new fails. Inside the above try block new X may fail:
this does not affect the 0-pointers and so the catch handler merely deletes 0 pointers. When new
Y fails xp points to allocated memory and so it must be returned. This happens inside the catch
handler. The final pointer (here: yp) will only be nonzero when new Y properly completes, so
there’s no need for the catch handler to return the memory pointed at by yp.
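As an alternative to the explicit try block, the basic guarantee can also be obtained by letting smart pointers own the newly allocated arrays. The sketch below is an assumption-laden illustration, not the approach shown above: the Counted struct (with its static counters) and the function allocatePair are hypothetical, and merely allow the absence of leaks to be observed.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical element type counting its live objects. It can be told
// to throw from one particular constructor call.
struct Counted
{
    static int alive;       // objects currently alive
    static int built;       // constructor calls so far
    static int failAt;      // constructor call that throws (0: never)

    Counted()
    {
        if (++built == failAt)
            throw std::runtime_error("construction failed");
        ++alive;
    }
    ~Counted()
    {
        --alive;
    }
};
int Counted::alive = 0;
int Counted::built = 0;
int Counted::failAt = 0;

void allocatePair(std::unique_ptr<Counted[]> &first,
                  std::unique_ptr<Counted[]> &second, std::size_t n)
{
    std::unique_ptr<Counted[]> a(new Counted[n]);   // might throw
    std::unique_ptr<Counted[]> b(new Counted[n]);   // might throw: a's
                                                    // destructor cleans up
    first = std::move(a);       // non-throwing postamble
    second = std::move(b);
}
```

If the second allocation throws, the first array is released while the stack is unwound, and the destination pointers are left untouched.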
10.10.2 The strong guarantee
The strong guarantee dictates that an object’s state should not change in the face of exceptions. This
is realized by performing all operations that might throw on a separate copy of the data. If all this
succeeds then the current object and its (now successfully modified) copy are swapped. An example
of this approach can be observed in the canonical overloaded assignment operator:
Class &operator=(Class const &other)
{
Class tmp(other);
swap(tmp);
return *this;
}
The copy construction might throw an exception, but this keeps the current object’s state intact. If
the copy construction succeeds swap swaps the current object’s contents with tmp’s contents and
returns a reference to the current object. For this to succeed it must be guaranteed that swap won’t
throw an exception. Returning a reference (or a value of a primitive data type) is also guaranteed
not to throw exceptions. The canonical form of the overloaded assignment operator therefore meets
the requirements of the strong guarantee.
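The effect can be observed with a small class whose copy constructor can be made to fail on request. The class Buffer and its failCopies switch are hypothetical and only exist for this illustration:

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

class Buffer
{
    std::vector<int> d_data;

    public:
        static bool failCopies;     // hypothetical switch: simulates a
                                    // failing copy construction
        explicit Buffer(std::vector<int> data)
        :
            d_data(std::move(data))
        {}
        Buffer(Buffer const &other)
        {
            if (failCopies)
                throw std::runtime_error("copy failed");
            d_data = other.d_data;
        }
        void swap(Buffer &other) noexcept   // must not throw
        {
            d_data.swap(other.d_data);
        }
        Buffer &operator=(Buffer const &other)
        {
            Buffer tmp(other);      // may throw: *this is still intact
            swap(tmp);              // cannot throw
            return *this;
        }
        std::vector<int> const &data() const
        {
            return d_data;
        }
};
bool Buffer::failCopies = false;
```

When the copy construction fails, the exception leaves operator= before swap is reached, so the left-hand side object still holds its original data.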
Some rules of thumb were formulated that relate to the strong guarantee (cf. Sutter, H., Exceptional
C++, Addison-Wesley, 2000). E.g.,
• All the code that might throw an exception affecting the current state of an object should perform
its tasks separately from the data controlled by the object. Once this code has performed
its tasks without throwing an exception replace the object’s data by the new data.
• Member functions modifying their object’s data should not return original (contained) objects
by value.
The canonical assignment operator is a good example of the first rule of thumb. Another example is
found in classes storing objects. Consider a class PersonDb storing multiple Person objects. Such
a class might offer a member void add(Person const &next). A plain implementation of this
function (merely intended to show the application of the first rule of thumb, but otherwise completely
disregarding efficiency considerations) might be:
Person *PersonDb::newAppend(Person const &next)
{
    Person *tmp = 0;
    try
    {
        tmp = new Person[d_size + 1];
        for (size_t idx = 0; idx < d_size; ++idx)
            tmp[idx] = d_data[idx];
        tmp[d_size] = next;
        return tmp;             // hand the new array to the caller
    }
    catch (...)
    {
        delete[] tmp;
        throw;
    }
}
void PersonDb::add(Person const &next)
{
Person *tmp = newAppend(next);
delete[] d_data;
d_data = tmp;
++d_size;
}
The (private) newAppend member’s task is to create a copy of the currently allocated Person objects,
including the data of the next Person object. Its catch handler catches any exception that might
be thrown during the allocation or copy process and returns all memory allocated so far, rethrowing
the exception at the end. The function is exception neutral as it propagates all its exceptions to its
caller. The function also doesn’t modify the PersonDb object’s data, so it meets the strong exception
guarantee. Returning from newAppend the member add may now modify its data. Its existing data
are returned and its d_data pointer is made to point to the newly created array of Person objects.
Finally its d_size is incremented. As these three steps don’t throw exceptions add too meets the
strong guarantee.
The second rule of thumb (member functions modifying their object’s data should not return original
(contained) objects by value) may be illustrated using a member PersonDb::erase(size_t idx).
Here is an implementation attempting to return the original d_data[idx] object:
Person PersonDb::erase(size_t idx)
{
if (idx >= d_size)
throw string("Array bounds exceeded");
Person ret(d_data[idx]);
Person *tmp = copyAllBut(idx);
delete[] d_data;
d_data = tmp;
--d_size;
return ret;
}
Although copy elision usually prevents the use of the copy constructor when returning ret, this is
not guaranteed to happen. Furthermore, a copy constructor may throw an exception. If that happens
the function has irrevocably mutated the PersonDb’s data, thus losing the strong guarantee.
Rather than returning d_data[idx] by value it might be assigned to an external Person object
before mutating PersonDb’s data:
void PersonDb::erase(Person *dest, size_t idx)
{
    if (idx >= d_size)
        throw string("Array bounds exceeded");
    *dest = d_data[idx];
    Person *tmp = copyAllBut(idx);
    delete[] d_data;
    d_data = tmp;
    --d_size;
}
This modification works, but it changes the original task, which was to create a member returning
the original object. Moreover, both functions suffer from a task overload as they modify PersonDb’s
data and also return an original object. In situations like these the one-function-one-responsibility
rule of thumb should be kept in mind: a function should have a single, well defined responsibility.
The preferred approach is to retrieve PersonDb’s objects using a member like Person
const &at(size_t idx) const and to erase an object using a member like void
PersonDb::erase(size_t idx).
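The split between an inspecting member and a modifying member might look as follows. This is only a sketch under simplifying assumptions: NameDb is a hypothetical class storing std::string objects in a std::vector instead of Person objects in a plain array, and std::out_of_range is used instead of a thrown string.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for PersonDb, storing names only.
class NameDb
{
    std::vector<std::string> d_data;

    public:
        void add(std::string const &name) { d_data.push_back(name); }
        std::size_t size() const { return d_data.size(); }

        // Inspecting: nothing is modified, so nothing can be lost.
        std::string const &at(std::size_t idx) const
        {
            return d_data.at(idx);      // throws std::out_of_range
        }

        // Modifying: build the new state aside, then commit without throwing.
        void erase(std::size_t idx)
        {
            if (idx >= d_data.size())
                throw std::out_of_range("NameDb::erase: index out of range");

            std::vector<std::string> tmp;
            tmp.reserve(d_data.size() - 1);
            for (std::size_t i = 0; i != d_data.size(); ++i)
                if (i != idx)
                    tmp.push_back(d_data[i]);   // may throw: d_data untouched
            d_data.swap(tmp);                   // nothrow commit
        }
};
```

A caller needing the erased element first copies it using at and then calls erase. If the copy throws, erase has not yet been called, and so the database is still intact.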
10.10.3 The nothrow guarantee
Exception safety can only be realized if some functions and operations are guaranteed not to throw
exceptions. This is called the nothrow guarantee. An example of a function that must offer the
nothrow guarantee is the swap function. Consider once again the canonical overloaded assignment
operator:
Class &operator=(Class const &other)
{
Class tmp(other);
swap(tmp);
return *this;
}
If swap were allowed to throw exceptions then it would most likely leave the current object in a
partially swapped state. As tmp has been destroyed by the time a catch handler receives the thrown
exception it becomes very difficult (as in: impossible) to retrieve the object’s original state,
and the strong guarantee is lost as a consequence.
The swap function must therefore offer the nothrow guarantee. It must have been designed as if
using the following prototype (see also section 23.7):
void Class::swap(Class &other) noexcept;
Likewise, operator delete and operator delete[] offer the nothrow guarantee, and according
to the C++ standard destructors may themselves not throw exceptions (if they do their behavior is
formally undefined, see also section 10.12 below).
Since the C programming language does not define the exception concept, functions from the standard
C library offer the nothrow guarantee by implication. This allowed us to define the generic
swap function in section 9.6 using memcpy.
Operations on primitive types offer the nothrow guarantee. Pointers may be reassigned, references
may be returned, etc., without having to worry about exceptions that might be thrown.
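As an illustration, here is a minimal sketch of a class whose swap member only exchanges values of primitive types and can therefore be declared noexcept. The class Buffer is hypothetical; the static_assert merely lets the compiler confirm that swap promises not to throw.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

class Buffer                            // hypothetical example class
{
    char *d_data;
    std::size_t d_size;

    public:
        explicit Buffer(char const *text)
        :
            d_data(new char[std::strlen(text) + 1]),
            d_size(std::strlen(text))
        {
            std::strcpy(d_data, text);
        }
        Buffer(Buffer const &other)
        :
            d_data(new char[other.d_size + 1]),
            d_size(other.d_size)
        {
            std::strcpy(d_data, other.d_data);
        }
        ~Buffer() { delete[] d_data; }

        void swap(Buffer &other) noexcept   // pointers and integers only
        {
            std::swap(d_data, other.d_data);
            std::swap(d_size, other.d_size);
        }
        Buffer &operator=(Buffer const &other)
        {
            Buffer tmp(other);          // may throw: *this not yet touched
            swap(tmp);                  // cannot throw
            return *this;
        }
        char const *c_str() const { return d_data; }
};

static_assert(noexcept(std::declval<Buffer &>().swap(std::declval<Buffer &>())),
              "swap must offer the nothrow guarantee");
```

The only statement in operator= that may throw is the construction of tmp, and at that point the current object has not been modified yet.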
10.11 Function try blocks
Exceptions may be generated while a constructor is initializing its members. How can exceptions
generated in such situations be caught by the constructor itself, rather than outside the constructor?
The intuitive solution, nesting the object construction in a try block does not solve the problem.
The exception by then has left the constructor and the object we intended to construct isn’t visible
anymore.
Using a nested try block is illustrated in the next example, where main defines an object of class
PersonDb. Assuming that PersonDb’s constructor throws an exception, there is no way we can access
the resources that might have been allocated by PersonDb’s constructor from the catch handler
as the pdb object is out of scope:
int main(int argc, char **argv)
{
try
{
PersonDb pdb(argc, argv); // may throw exceptions
... // main()’s other code
}
catch(...) // and/or other handlers
{
... // pdb is inaccessible from here
}
}
Although all objects and variables defined inside a try block are inaccessible from its associated
catch handlers, object data members were available before starting the try block and so they may be
accessed from a catch handler. In the following example the catch handler in PersonDb’s constructor
is able to access its d_data member:
PersonDb::PersonDb(int argc, char **argv)
:
d_data(0),
d_size(0)
{
try
{
initialize(argc, argv);
}
catch(...)
{
// d_data, d_size: accessible
}
}
Unfortunately, this does not help us much. The initialize member is unable to reassign d_data
and d_size if PersonDb const pdb was defined; the initialize member should at least offer
the basic exception guarantee and return any resources it has acquired before terminating due to
a thrown exception; and although d_data and d_size offer the nothrow guarantee as they are
of primitive data types a class type data member might throw an exception, possibly resulting in
violation of the basic guarantee.
In the next implementation of PersonDb assume that its constructor receives a pointer to an already
allocated block of Person objects. The PersonDb object takes ownership of the allocated memory
and it is therefore responsible for the allocated memory’s eventual destruction. Moreover, d_data
and d_size are also used by a composed object PersonDbSupport, having a constructor expecting
a Person const * and size_t argument. Our next implementation may then look something like
this:
PersonDb::PersonDb(Person *pData, size_t size)
:
d_data(pData),
d_size(size),
d_support(d_data, d_size)
{
// no further actions
}
This setup allows us to define a PersonDb const &pdb. Unfortunately, PersonDb cannot offer the
basic guarantee. If PersonDbSupport’s constructor throws an exception it isn’t caught although
d_data already points to allocated memory.
The function try block offers a solution for this problem. A function try block consists of a try
block and its associated handlers. The function try block starts immediately after the function
header, and its block defines the function body. With constructors base class and data member
initializers may be placed between the try keyword and the opening curly brace. Here is our final
implementation of PersonDb, now offering the basic guarantee:
PersonDb::PersonDb(Person *pData, size_t size)
try
:
d_data(pData),
d_size(size),
d_support(d_data, d_size)
{}
catch (...)
{
delete[] pData; // use the parameter: referring to data members here is undefined
}
Let’s have a look at a stripped-down example. A constructor defines a function try block. The
exception thrown by the Throw object is initially caught by the object itself. Then it is rethrown. The
surrounding Composer’s constructor also defines a function try block, so Throw’s rethrown exception
is properly caught by Composer’s exception handler, even though the exception was generated from
within its member initializer list:
#include <iostream>
class Throw
{
public:
Throw(int value)
try
{
throw value;
}
catch(...)
{
std::cout << "Throw’s exception handled locally by Throw()\n";
throw;
}
};
class Composer
{
Throw d_t;
public:
Composer()
try // NOTE: try precedes initializer list
:
d_t(5)
{}
catch(...)
{
std::cout << "Composer() caught exception as well\n";
}
};
int main()
{
Composer c;
}
When running this example, we’re in for a nasty surprise: the program runs and then breaks with
an abort exception. Here is the output it produces, the last two lines being added by the system’s
final catch-all handler, catching all remaining uncaught exceptions:
Throw’s exception handled locally by Throw()
Composer() caught exception as well
terminate called after throwing an instance of ’int’
Abort
The reason for this is documented in the C++ standard: at the end of a catch-handler belonging to a
constructor or destructor function try block, the original exception is automatically rethrown.
The exception is not rethrown if the handler itself throws another exception, offering the constructor
or destructor a way to replace a thrown exception by another one. The exception is only rethrown if
it reaches the end of the catch handler of a constructor or destructor function try block. Exceptions
caught by nested catch handlers are not automatically rethrown.
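Replacing a thrown exception by another one might be sketched as follows. The classes Connection and Client are hypothetical: Connection’s constructor throws a plain int, which Client’s function try block translates into a std::runtime_error. Note that the handler only uses its own parameter and does not refer to Client’s data members.

```cpp
#include <stdexcept>
#include <string>

struct Connection                   // hypothetical member whose constructor fails
{
    explicit Connection(std::string const &host)
    {
        if (host.empty())
            throw 42;               // some low-level error code
    }
};

class Client                        // hypothetical composing class
{
    Connection d_conn;

    public:
        explicit Client(std::string const &host)
        try
        :
            d_conn(host)
        {}
        catch (int code)
        {
            // Throwing here replaces the int by a more descriptive exception;
            // without this throw the int itself would be rethrown.
            throw std::runtime_error("Client: connection failed, code " +
                                     std::to_string(code));
        }
};
```

Code constructing a Client with an empty host name then receives a std::runtime_error rather than the original int.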
As only constructors and destructors rethrow exceptions caught by their function try block catch
handlers the run-time error encountered in the above example may simply be repaired by providing
main with its own function try block:
int main()
try
{
Composer c;
}
catch (...)
{}
Now the program runs as planned, producing the following output:
Throw’s exception handled locally by Throw()
Composer() caught exception as well
A final note: if a function defining a function try block also declares an exception throw list (a
dynamic exception specification, which was deprecated in C++11 and removed in C++17) then
only the types of rethrown exceptions must match the types mentioned in the throw list.
10.12 Exceptions in constructors and destructors
Object destructors are only activated for completely constructed objects. Although this may sound
like a truism, there is a subtlety here. If the construction of an object fails for some reason, the
object’s destructor is not called when the object goes out of scope. This could happen if an exception
that is generated by the constructor is not caught by the constructor. If the exception is thrown when
the object has already allocated some memory, then that memory is not returned: its destructor isn’t
called as the object’s construction wasn’t successfully completed.
The following example illustrates this situation in its prototypical form. The constructor of the class
Incomplete first displays a message and then throws an exception. Its destructor also displays a
message:
class Incomplete
{
public:
Incomplete()
{
cerr << "Allocated some memory\n";
throw 0;
}
~Incomplete()
{
cerr << "Destroying the allocated memory\n";
}
};
Next, main() creates an Incomplete object inside a try block. Any exception that may be generated
is subsequently caught:
int main()
{
try
{
cerr << "Creating ‘Incomplete’ object\n";
Incomplete();
cerr << "Object constructed\n";
}
catch(...)
{
cerr << "Caught exception\n";
}
}
When this program is run, it produces the following output:
Creating ‘Incomplete’ object
Allocated some memory
Caught exception
Thus, if Incomplete’s constructor had actually allocated some memory, the program would
suffer from a memory leak. To prevent this from happening, the following countermeasures are
available:
• Prevent the exceptions from leaving the constructor.
If part of the constructor’s body may generate exceptions, then this part may be surrounded
by a try block, allowing the exception to be caught by the constructor itself. This approach is
defensible when the constructor is able to repair the cause of the exception and to complete its
construction as a valid object.
• If an exception is generated by a base class constructor or by a member initializing constructor
then a try block within the constructor’s body won’t be able to catch the thrown exception.
This always results in the exception leaving the constructor and the object is not considered to
have been properly constructed. A try block may include the member initializers, and the try
block’s compound statement becomes the constructor’s body as in the following example:
class Incomplete2
{
Composed d_composed;
public:
Incomplete2()
try
:
d_composed(/* arguments */)
{
// body
}
catch (...)
{}
};
An exception thrown by either the member initializers or the body results in the execution
never reaching the body’s closing curly brace. Instead the catch clause is reached. Since the
constructor’s body isn’t properly completed the object is not considered properly constructed
and eventually the object’s destructor won’t be called.
The catch clause of a constructor’s function try block behaves slightly differently from a catch clause
of an ordinary function try block. An exception reaching a constructor’s function try block may be
transformed into another exception (which is thrown from the catch clause) but if no exception is
explicitly thrown from the catch clause the exception originally reaching the catch clause is always
rethrown. Consequently, there’s no way to confine an exception thrown from a base class constructor
or from a member initializer to the constructor: such an exception always propagates to a more
shallow block and in that case the object’s construction is always considered incomplete.
Consequently, if incompletely constructed objects throw exceptions then the constructor’s catch
clause is responsible for preventing memory (generally: resource) leaks. There are several ways
to realize this:
• When multiple inheritance is used: if initial base classes have properly been constructed and a
later base class throws, then the initial base class objects are automatically destroyed (as they
are themselves fully constructed objects)
• When composition is used: already constructed composed objects are automatically destroyed
(as they are fully constructed objects)
• Instead of using plain pointers smart pointers (cf. section 18.4) should be used to manage
dynamically allocated memory. In this case, if the constructor throws either before or after the
allocation of the dynamic memory, then allocated memory is properly returned as shared_ptr
objects are, after all, objects.
• If plain pointer data members must be used then the constructor should first, in its
member initialization section, initialize its plain pointer data members to zero. Then, in its
body it can dynamically allocate memory, reassigning the plain pointer data members. These
statements should be nested in a try block whose generic catch clause deletes the memory
pointed at by the class’s plain pointer data members and rethrows the exception. The handler
of a function try block cannot be used for this: according to the C++ standard referring to an
object’s non-static data members from the handler of its constructor’s function try block
results in undefined behavior. Example:
class Incomplete2
{
    Composed d_composed;
    char *d_cp;                             // plain pointers
    int *d_ip;
    public:
        Incomplete2(size_t nChars, size_t nInts)
        :
            d_composed(/* arguments */),    // might throw
            d_cp(0),
            d_ip(0)
        {
            try
            {
                preamble();                 // might throw
                d_cp = new char[nChars];    // might throw
                d_ip = new int[nInts];      // might throw
                postamble();                // might throw
            }
            catch (...)
            {
                delete[] d_cp;              // clean up
                delete[] d_ip;
                throw;                      // construction failed
            }
        }
};
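The smart pointer alternative mentioned in the list above can be sketched as follows. The names Tracked and Safe are hypothetical, and std::unique_ptr is used here rather than shared_ptr, as ownership isn’t shared. Tracked merely counts its live objects, so that a leak would show up as a non-zero count.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical resource counting its live instances.
struct Tracked
{
    static int s_alive;
    Tracked()  { ++s_alive; }
    ~Tracked() { --s_alive; }
};
int Tracked::s_alive = 0;

class Safe                              // hypothetical example class
{
    std::unique_ptr<Tracked[]> d_first;
    std::unique_ptr<Tracked[]> d_second;

    public:
        Safe(std::size_t nFirst, std::size_t nSecond, bool fail)
        :
            d_first(new Tracked[nFirst]),
            d_second(new Tracked[nSecond])  // if this throws d_first cleans up
        {
            if (fail)                       // stands in for any failing step
                throw std::runtime_error("Safe: construction failed");
        }
};
```

No catch clause is needed: d_first and d_second are fully constructed composed objects by the time the body throws, so their destructors return the allocated memory.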
On the other hand, since C++ supports constructor delegation an object may have been completely
constructed according to the C++ run-time system, but yet its constructor may have thrown an
exception. This happens if a delegated constructor successfully completes (after which the object is
considered ‘completely constructed’), but the constructor itself throws an exception, as illustrated by
the next example:
class Delegate
{
public:
Delegate()
:
Delegate(0)
{
throw 12; // throws but completely constructed
}
Delegate(int x) // completes OK
{}
};
int main()
try
{
Delegate del; // throws
} // del’s destructor is called here
catch (...)
{}
In this example it is the responsibility of Delegate’s designer to ensure that the throwing default
constructor does not invalidate the actions performed by Delegate’s destructor. E.g., if the
delegated constructor allocates memory to be deleted by the destructor, then the default constructor
should either leave the memory as-is, or it can delete the memory and set the corresponding pointer
to zero thereafter. In any case, it is Delegate’s responsibility to ensure that the object remains in a
valid state, even though it throws an exception.
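The ‘leave the memory as-is’ option might be sketched like this. The class Delegate2 and its s_destroyed counter are hypothetical; the counter is only there to make the destructor call observable.

```cpp
#include <stdexcept>

class Delegate2                     // hypothetical variant of Delegate
{
    int *d_buffer;

    public:
        static int s_destroyed;     // counts destructor calls

        Delegate2()
        :
            Delegate2(4)            // the object is complete once this returns
        {
            // d_buffer is left as-is: the destructor returns the memory
            throw std::runtime_error("default constructor failed");
        }
        explicit Delegate2(int n)
        :
            d_buffer(new int[n])
        {}
        ~Delegate2()
        {
            delete[] d_buffer;
            ++s_destroyed;
        }
        Delegate2(Delegate2 const &) = delete;
        Delegate2 &operator=(Delegate2 const &) = delete;
};
int Delegate2::s_destroyed = 0;
```

Although the default constructor throws, the destructor is called once, as the delegated constructor had already completed.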
According to the C++ standard exceptions thrown by destructors may not leave their bodies. Providing
a destructor with a function try block is therefore a violation of the standard: exceptions caught
by a function try block’s catch clause have already left the destructor’s body. If –in violation of the
standard– the destructor is provided with a function try block and an exception is caught by the
try block then that exception is rethrown, similar to what happens in catch clauses of constructor
functions’ try blocks.
The consequences of an exception leaving the destructor’s body are not defined, and may result in
unexpected behavior. Consider the following example:
Assume a carpenter builds a cupboard having a single drawer. The cupboard is finished, and a
customer, buying the cupboard, finds that the cupboard can be used as expected. Satisfied with the
cupboard, the customer asks the carpenter to build another cupboard, this time having two drawers.
When the second cupboard is finished, the customer takes it home and is utterly amazed when the
second cupboard completely collapses immediately after it is used for the first time.
Weird story? Then consider the following program:
int main()
{
try
{
cerr << "Creating Cupboard1\n";
Cupboard1();
cerr << "Beyond Cupboard1 object\n";
}
catch (...)
{
cerr << "Cupboard1 behaves as expected\n";
}
try
{
cerr << "Creating Cupboard2\n";
Cupboard2();
cerr << "Beyond Cupboard2 object\n";
}
catch (...)
{
cerr << "Cupboard2 behaves as expected\n";
}
}
When this program is run it produces the following output:
Creating Cupboard1
Drawer 1 used
Cupboard1 behaves as expected
Creating Cupboard2
Drawer 2 used
Drawer 1 used
terminate called after throwing an instance of ’int’
Abort
The final Abort indicates that the program has aborted instead of displaying a message like
Cupboard2 behaves as expected.
Let’s have a look at the three classes involved. The class Drawer has no particular characteristics,
except that its destructor throws an exception:
class Drawer
{
size_t d_nr;
public:
Drawer(size_t nr)
:
d_nr(nr)
{}
~Drawer() noexcept(false) // needed since C++11: destructors are noexcept by default
{
cerr << "Drawer " << d_nr << " used\n";
throw 0;
}
};
The class Cupboard1 has no special characteristics at all. It merely has a single composed Drawer
object:
class Cupboard1
{
Drawer left;
public:
Cupboard1()
:
left(1)
{}
};
The class Cupboard2 is constructed comparably, but it has two composed Drawer objects:
class Cupboard2
{
Drawer left;
Drawer right;
public:
Cupboard2()
:
left(1),
right(2)
{}
};
When Cupboard1’s destructor is called Drawer’s destructor is eventually called to destroy its composed
object. This destructor throws an exception, which is caught beyond the program’s first try
block. This behavior is completely as expected.
A subtlety here is that Cupboard1’s destructor (and hence Drawer’s destructor) is activated immediately
subsequent to its construction, as Cupboard1() defines an anonymous object. As a result the
Beyond Cupboard1 object text is never inserted into std::cerr.
Because of Drawer’s destructor throwing an exception a problem occurs when Cupboard2’s destructor
is called. Of its two composed objects, the second Drawer’s destructor is called first. This destructor
throws an exception, which ought to be caught beyond the program’s second try block. However,
although the flow of control by then has left the context of Cupboard2’s destructor, that object hasn’t
completely been destroyed yet as the destructor of its other (left) Drawer still has to be called.
Normally that would not be a big problem: once an exception is thrown from Cupboard2’s destructor
any remaining actions would simply be ignored, albeit that (as both drawers are properly constructed
objects) left’s destructor would still have to be called.
This happens here too and left’s destructor also needs to throw an exception. But as we’ve already
left the context of the second try block, the current flow control is now thoroughly mixed up, and
the program has no other option but to abort. It does so by calling terminate(), which in turn calls
abort(). Here we have our collapsing cupboard having two drawers, even though the cupboard
having one drawer behaves perfectly.
The program aborts since there are multiple composed objects whose destructors throw exceptions
leaving the destructors. In this situation one of the composed objects would throw an exception by
the time the program’s flow control has already left its proper context causing the program to abort.
The C++ standard therefore understandably stipulates that exceptions may never leave destructors.
Here is the skeleton of a destructor whose code might throw exceptions. It does not use a function
try block. Instead, all the destructor’s actions are encapsulated in a try block nested within the
destructor’s body.
Class::~Class()
{
try
{
maybe_throw_exceptions();
}
catch (...)
{}
}
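The skeleton can be fleshed out into a small sketch. The class Session, its close member and the global counter g_swallowed are hypothetical; the counter replaces whatever logging a real program would use, and was chosen because incrementing an int cannot itself throw.

```cpp
#include <stdexcept>

int g_swallowed = 0;                // hypothetical: counts suppressed exceptions

class Session                       // hypothetical example class
{
    bool d_failOnClose;

    public:
        explicit Session(bool failOnClose)
        :
            d_failOnClose(failOnClose)
        {}
        ~Session()
        {
            try
            {
                close();            // might throw
            }
            catch (...)
            {
                ++g_swallowed;      // the exception does not leave the destructor
            }
        }

    private:
        void close()
        {
            if (d_failOnClose)
                throw std::runtime_error("close failed");
        }
};
```

Contrary to the Cupboard2 example, two such objects may safely be destroyed in a row, even if both of their close calls fail.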
Chapter 11
More Operator Overloading
Having covered the overloaded assignment operator in chapter 9, and having shown several examples
of other overloaded operators as well (i.e., the insertion and extraction operators in chapters 3
and 6), we now take a look at operator overloading in general.
11.1 Overloading ‘operator[]()’
As our next example of operator overloading, we introduce a class IntArray encapsulating an array
of ints. Indexing the array elements is possible using the standard array index operator [],
but additionally checks for array bounds overflow are performed. Furthermore, the index operator
(operator[]) is interesting in that it can be used in expressions as both lvalue and as rvalue.
Here is an example showing the basic use of the class:
int main()
{
IntArray x(20); // 20 ints
for (int i = 0; i < 20; i++)
x[i] = i * 2; // assign the elements
for (int i = 0; i <= 20; i++) // produces boundary overflow
cout << "At index " << i << ": value is " << x[i] << '\n';
}
First, the constructor is used to create an object containing 20 ints. The elements stored in the
object can be assigned or retrieved. The first for-loop assigns values to the elements using the
index operator, the second for-loop retrieves the values but also results in a run-time error once the
non-existing value x[20] is addressed. The IntArray class interface is:
#include <cstddef>
class IntArray
{
    int *d_data;
    size_t d_size;
    public:
        IntArray(size_t size = 1);
        IntArray(IntArray const &other);
        ~IntArray();
        IntArray const &operator=(IntArray const &other);
                                                    // overloaded index operators:
        int &operator[](size_t index);              // first
        int const &operator[](size_t index) const;  // second
        void swap(IntArray &other);                 // trivial
    private:
        void boundary(size_t index) const;
        int &operatorIndex(size_t index) const;
};
This class has the following characteristics:
• One of its constructors has a size_t parameter having a default argument value, specifying
the number of int elements in the object.
• The class internally uses a pointer to reach allocated memory. Hence, the necessary tools are
provided: a copy constructor, an overloaded assignment operator and a destructor.
• Note that there are two overloaded index operators. Why are there two?
The first overloaded index operator allows us to reach and modify the elements of non-constant
IntArray objects. This overloaded operator’s prototype is a function returning a reference to
an int. This allows us to use expressions like x[10] as rvalues or lvalues.
With non-const IntArray objects operator[] can therefore be used to retrieve and to assign
values. The return value of the non-const operator[] member is not an int const &, but
an int &. In this situation we don’t use const, as we must be able to modify the element we
want to access when the operator is used as lvalue.
This whole scheme fails if there’s nothing to assign. Consider the situation where we have
an IntArray const stable(5). Such an object is an immutable const object. The compiler
detects this and refuses to compile expressions like stable[2] if only the non-const
operator[] is available. Hence the second overloaded index operator is added to the class’s
interface. Here the return value is an int const &, rather than an int &, and the member
function itself is a const member function. This second form of the overloaded index operator
is only used with const objects. It is used for value retrieval instead of value assignment.
That, of course, is precisely what we want when using const objects. In this situation members
are overloaded only by their const attribute. This form of function overloading was introduced
earlier in the C++ Annotations (sections 2.5.4 and 7.7).
Since IntArray stores values of a primitive type IntArray’s operator[] const could also
have defined a value return type. However, with objects one usually doesn’t want the extra
copying that’s implied with value return types. In those cases const & return values are
preferred for const member functions. So, in the IntArray class an int return value could
have been used as well, resulting in the following prototype:
int IntArray::operator[](size_t index) const;
• As there is only one pointer data member, the destruction of the memory allocated by the object
is a simple delete[] d_data.
Now, the implementations of the members (omitting the trivial implementation of swap, cf. chapter
9) are:
#include "intarray.ih"
IntArray::IntArray(size_t size)
:
    d_size(size)
{
    if (d_size < 1)
        throw string("IntArray: size of array must be >= 1");
    d_data = new int[d_size];
}
IntArray::IntArray(IntArray const &other)
:
    d_data(new int[other.d_size]),  // members are initialized in declaration
    d_size(other.d_size)            // order: d_data precedes d_size
{
    memcpy(d_data, other.d_data, d_size * sizeof(int));
}
IntArray::~IntArray()
{
    delete[] d_data;
}
IntArray const &IntArray::operator=(IntArray const &other)
{
    IntArray tmp(other);
    swap(tmp);
    return *this;
}
int &IntArray::operatorIndex(size_t index) const
{
    boundary(index);
    return d_data[index];
}
int &IntArray::operator[](size_t index)
{
    return operatorIndex(index);
}
int const &IntArray::operator[](size_t index) const
{
    return operatorIndex(index);
}
void IntArray::boundary(size_t index) const
{
    if (index < d_size)
        return;
    ostringstream out;
    out << "IntArray: boundary overflow, index = " <<
            index << ", should be < " << d_size << '\n';
    throw out.str();
}
Note how the operator[] members were implemented: as non-const members may call const member
functions and as the implementation of the const member function is identical to the non-const
member function’s implementation both operator[] members could be defined using an auxiliary
function int &operatorIndex(size_t index) const. A const member function may
return a non-const reference (or pointer) return value, referring to data reached through one of
the pointer data members of its object. Of course, this is a potentially dangerous backdoor that
may break data hiding. However, the members in the public interface prevent this breach and so
the two public operator[] members may themselves safely call the same int &operatorIndex() const
member, that defines a private backdoor.
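How the compiler selects between the two index operators can be illustrated with a compact sketch. CheckedArray is a hypothetical stand-in for IntArray using a std::unique_ptr<int[]> for its storage and std::out_of_range for its errors; the free function sum is likewise an assumption made for this illustration.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>

class CheckedArray                  // hypothetical stand-in for IntArray
{
    std::unique_ptr<int[]> d_data;
    std::size_t d_size;

    public:
        explicit CheckedArray(std::size_t size)
        :
            d_data(new int[size]()),    // value-initialized: all zeros
            d_size(size)
        {}
        std::size_t size() const { return d_size; }

        int &operator[](std::size_t index)              // non-const objects
        {
            return element(index);
        }
        int const &operator[](std::size_t index) const  // const objects
        {
            return element(index);
        }

    private:
        int &element(std::size_t index) const           // private backdoor
        {
            if (index >= d_size)
                throw std::out_of_range("CheckedArray: index out of range");
            return d_data[index];
        }
};

// Receives a const reference, so only the const operator[] can be used here.
int sum(CheckedArray const &arr)
{
    int total = 0;
    for (std::size_t idx = 0; idx != arr.size(); ++idx)
        total += arr[idx];
    return total;
}
```

Inside sum a statement like arr[0] = 1 would not compile, as the const operator[] returns an int const &.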
11.2 Overloading the insertion and extraction operators
Classes may be adapted in such a way that their objects may be inserted into and extracted from,
respectively, a std::ostream and std::istream.
The class std::ostream defines insertion operators for primitive types, such as int, char *,
etc. In this section we learn how to extend the existing functionality of classes (in particular
std::istream and std::ostream) in such a way that they can be used also in combination with
classes developed much later in history.
In particular we will show how the insertion operator can be overloaded allowing the insertion of
any type of object, say Person (see chapter 9), into an ostream. Having defined such an overloaded
operator we’re able to use the following code:
Person kr("Kernighan and Ritchie", "unknown", "unknown");
cout << "Name, address and phone number of Person kr:\n" << kr << '\n';
The statement cout << kr uses operator<<. This operator has two operands: an
ostream & and a Person const &. The required action is defined in an overloaded free function
operator<< expecting two arguments:
// declared in ‘person.h’
std::ostream &operator<<(std::ostream &out, Person const &person);
// defined in some source file
ostream &operator<<(ostream &out, Person const &person)
{
return
out <<
"Name: " << person.name() << ", "
"Address: " << person.address() << ", "
"Phone: " << person.phone();
}
The free function operator<< has the following noteworthy characteristics:
• The function returns a reference to an ostream object, to enable ‘chaining’ of the insertion
operator.
• The two operands of operator<< are passed to the free function as its arguments. In the
example, the parameter out was initialized by cout, the parameter person by kr.
In order to overload the extraction operator for, e.g., the Person class, members are needed modifying
the class’s private data members. Such modifiers are normally offered by the class interface. For
the Person class these members could be the following:
void setName(char const *name);
void setAddress(char const *address);
void setPhone(char const *phone);
These members may easily be implemented: the memory pointed to by the corresponding data member
must be deleted, and the data member should point to a copy of the text pointed to by the
parameter. E.g.,
void Person::setAddress(char const *address)
{
delete[] d_address;
d_address = strdupnew(address);
}
A more elaborate function should check the reasonableness of the new address (address also
shouldn’t be a 0-pointer). This, however, is not further pursued here. Instead, let’s have a look
at the final operator>>. A simple implementation is:
istream &operator>>(istream &in, Person &person)
{
string name;
string address;
string phone;
if (in >> name >> address >> phone) // extract three strings
{
person.setName(name.c_str());
person.setAddress(address.c_str());
person.setPhone(phone.c_str());
}
return in;
}
Note the stepwise approach that is followed here. First, the required information is extracted using
available extraction operators. Then, if that succeeds, modifiers are used to modify the data
members of the object to be extracted. Finally, the stream object itself is returned as a reference.
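Both operators can be tried out together in a round trip through a std::stringstream. The following sketch makes some simplifying assumptions: Contact is a hypothetical, reduced version of Person storing std::string members, and roundTrip is a helper introduced only for this illustration.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Contact                       // hypothetical, simplified stand-in for Person
{
    std::string d_name;
    std::string d_phone;

    public:
        Contact() = default;
        Contact(std::string const &name, std::string const &phone)
        :
            d_name(name),
            d_phone(phone)
        {}
        std::string const &name() const  { return d_name; }
        std::string const &phone() const { return d_phone; }
        void setName(std::string const &name)   { d_name = name; }
        void setPhone(std::string const &phone) { d_phone = phone; }
};

std::ostream &operator<<(std::ostream &out, Contact const &contact)
{
    return out << contact.name() << ' ' << contact.phone();
}

std::istream &operator>>(std::istream &in, Contact &contact)
{
    std::string name;
    std::string phone;
    if (in >> name >> phone)        // modify contact only on success
    {
        contact.setName(name);
        contact.setPhone(phone);
    }
    return in;
}

// Inserts a contact into a string stream and extracts it again.
Contact roundTrip(Contact const &original)
{
    std::stringstream stream;
    stream << original;
    Contact copy;
    stream >> copy;
    return copy;
}
```

As the extraction operator reads whitespace-delimited words, this sketch only works for names and phone numbers not containing blanks. If the extraction fails the Contact object is left unaltered and the stream’s failure state can be inspected by the caller.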
11.3 Conversion operators
A class may be constructed around a built-in type. E.g., a class String, constructed around the
char * type. Such a class may define all kinds of operations, like assignments. Take a look at the
following class interface, designed after the string class:
class String
{
char *d_string;
public:
String();
String(char const *arg);
~String();
String(String const &other);
String const &operator=(String const &rvalue);
String const &operator=(char const *rvalue);
};
Objects of this class can be initialized from a char const *, and also from a String itself. There is
an overloaded assignment operator, allowing the assignment from a String object and from a char
const * (see footnote 1).
Usually, in classes that are less directly linked to their data than this String class, there will be
an accessor member function, like a member char const *String::c_str() const. However,
the need to use this latter member doesn’t appeal to our intuition when an array of String objects
is defined by, e.g., a class StringArray. If this latter class provides the operator[] to access
individual String members, it would most likely offer at least the following class interface:
class StringArray
{
String *d_store;
size_t d_n;
public:
StringArray(size_t size);
StringArray(StringArray const &other);
StringArray const &operator=(StringArray const &rvalue);
~StringArray();
String &operator[](size_t index);
};
This interface allows us to assign String elements to each other:
StringArray sa(10);
sa[4] = sa[3]; // String to String assignment
But it is also possible to assign a char const * to an element of sa:
sa[3] = "hello world";
(Footnote 1: note that the assignment from a char const * also allows the null-pointer. An
assignment like stringObject = 0 is perfectly in order.)
Here, the following steps are taken:
• First, sa[3] is evaluated. This results in a String reference.
• Next, the String class is inspected for an overloaded assignment, expecting a char const *
to its right-hand side. This operator is found, and the string object sa[3] receives its new
value.
Now we try to do it the other way around: how to access the char const * that’s stored in sa[3]?
The following attempt fails:
    char const *cp = sa[3];
It fails since we would need an overloaded assignment operator for the ‘class char const *’. Unfortunately,
there isn’t such a class, and therefore we can’t build that overloaded assignment operator
(see also section 11.13). Furthermore, casting won’t work as the compiler doesn’t know how to cast
a String to a char const *. How to proceed?
One possibility is to define an accessor member function c_str():
    char const *cp = sa[3].c_str();
This compiles fine but looks clumsy. A far better approach would be to use a conversion operator.
A conversion operator is a kind of overloaded operator, but this time the overloading is used to cast
the object to another type. In class interfaces, the general form of a conversion operator is:
operator <type>() const;
Conversion operators usually are const member functions: they are automatically called when
their objects are used as rvalues in contexts where a value of the operator’s target type is
required. Using a conversion operator a String object may be interpreted as a char const *
rvalue, allowing us to perform the above assignment.
Conversion operators are somewhat dangerous. The conversion is automatically performed by the
compiler and unless its use is perfectly transparent it may confuse those who read code in which
conversion operators are used. E.g., novice C++ programmers are frequently confused by statements
like ‘if (cin) ...’.
As a rule of thumb: classes should define at most one conversion operator. Multiple conversion operators
may be defined but frequently result in ambiguous code. E.g., if a class defines operator
bool() const and operator int() const then passing an object of this class to a function expecting
a size_t argument results in an ambiguity as an int and a bool may both be used to
initialize a size_t.
In the current example, the class String could define the following conversion operator for char
const *:
String::operator char const *() const
{
return d_string;
}
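The following is a minimal sketch of how such a conversion operator might be used in practice. The class shown is a simplified stand-in for String (copying is disabled to keep it short), and the helper function length is hypothetical; neither is part of the interface shown earlier.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for the String class: it owns a copy of the NTBS
// it is constructed from. Copying is disabled to keep the sketch short.
class String
{
    char *d_string;

    public:
        String(char const *arg)
        :
            d_string(new char[std::strlen(arg) + 1])
        {
            std::strcpy(d_string, arg);
        }
        String(String const &other) = delete;
        String &operator=(String const &other) = delete;
        ~String()
        {
            delete[] d_string;
        }
        operator char const *() const       // the conversion operator
        {
            return d_string;
        }
};

// Hypothetical helper: the String is converted implicitly where a
// char const * is required.
inline std::size_t length(String const &str)
{
    char const *cp = str;                   // calls operator char const *()
    return std::strlen(cp);
}
```

With this in place an initialization like char const *cp = str compiles without calling an accessor, which is exactly the situation the failing sa[3] example ran into.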
Notes:
• Conversion operators do not define return types. The conversion operator returns a value of
the type specified beyond the operator keyword.
• In certain situations (e.g., when a String argument is passed to a function specifying an
ellipsis parameter) the compiler needs a hand to disambiguate our intentions. A static_cast
solves the problem.
• With template functions conversion operators may not work immediately as expected. For example,
when defining a conversion operator X::operator std::string const() const
then cout << X() won’t compile. The reason for this is explained in section 21.9, but a shortcut
allowing the conversion operator to work is to define the following overloaded operator<<
function:
std::ostream &operator<<(std::ostream &out, std::string const &str)
{
return out.write(str.data(), str.length());
}
Conversion operators are also used when objects of classes defining conversion operators are inserted
into streams. Realize that the right hand sides of insertion operators are function parameters that
are initialized by the operator’s right hand side arguments. The rules are simple:
• If a class X defining a conversion operator also defines an insertion operator accepting an X
object the insertion operator is used;
• Otherwise, if the type returned by the conversion operator is insertable then the conversion
operator is used;
• Otherwise, a compilation error results. Note that this happens if the type returned by the
conversion operator itself defines a conversion operator to a type that may be inserted into a
stream.
In the following example an object of class Insertable is directly inserted; an object of the class
Convertor uses the conversion operator; an object of the class Error cannot be inserted since
it does not define an insertion operator and the type returned by its conversion operator cannot be
inserted either (Text does define an operator int() const, but the fact that a Text itself cannot
be inserted causes the error):
    #include <iostream>
    #include <string>
    using namespace std;

    struct Insertable
    {
        operator int() const
        {
            cout << "op int()\n";
            return 0;               // a conversion operator must return a value
        }
    };
    ostream &operator<<(ostream &out, Insertable const &ins)
    {
        return out << "insertion operator";
    }
    struct Convertor
    {
        operator Insertable() const
        {
            return Insertable();
        }
    };
    struct Text
    {
        operator int() const
        {
            return 1;
        }
    };
    struct Error
    {
        operator Text() const
        {
            return Text();
        }
    };
    int main()
    {
        Insertable insertable;
        cout << insertable << '\n';
        Convertor convertor;
        cout << convertor << '\n';
        Error error;
        cout << error << '\n';      // compilation error: a Text cannot be inserted
    }
Some final remarks regarding conversion operators:
• A conversion operator should be a ‘natural extension’ of the facilities of the object. For example,
the stream classes define operator bool(), allowing constructions like if (cin).
• A conversion operator should return an rvalue. It should do so to enforce data-hiding and
because it is the intended use of the conversion operator. Defining a conversion operator as an
lvalue (e.g., defining an operator int &() conversion operator) opens up a back door, and
the operator can only be used as lvalue when explicitly called (as in: x.operator int&() =
5). Don’t use it.
• Conversion operators should be defined as const member functions as they don’t modify their
object’s data members.
• Conversion operators returning composed objects should return const references to these objects
whenever possible to avoid calling the composed object’s copy constructor.
11.4 The keyword ‘explicit’
Conversions are not only performed by conversion operators, but also by constructors accepting one
argument (i.e., constructors having one or multiple parameters, specifying default argument values
for all parameters or for all but the first parameter).
Assume a data base class DataBase is defined in which Person objects can be stored. It defines a
Person *d_data pointer, and so it offers a copy constructor and an overloaded assignment operator.
In addition to the copy constructor DataBase offers a default constructor and several additional
constructors:
• DataBase(Person const &): the DataBase initially contains a single Person object;
• DataBase(istream &in): the data about multiple persons are read from in.
• DataBase(size_t count, istream &in = cin): the data of count persons are read from
in, by default the standard input stream.
The above constructors all are perfectly reasonable. But they also allow the compiler to compile the
following code without producing any warning at all:
DataBase db;
DataBase db2;
Person person;
db2 = db; // 1
db2 = person; // 2
db2 = 10; // 3
db2 = cin; // 4
Statement 1 is perfectly reasonable: db is used to redefine db2. Statement 2 might be understandable
since we designed DataBase to contain Person objects. Nevertheless, we might question the
logic that’s used here as a Person is not some kind of DataBase. The logic becomes even more
opaque when looking at statements 3 and 4. Statement 3 in effect waits for the data of 10 persons
to appear at the standard input stream. Nothing like that is suggested by db2 = 10.
All four statements are the result of implicit promotions. Since constructors accepting, respectively a
Person, an istream, and a size_t and an istream have been defined for DataBase and since the
assignment operator expects a DataBase right-hand side (rhs) argument the compiler first converts
the rhs arguments to anonymous DataBase objects which are then assigned to db2.
It is good practice to prevent implicit promotions by using the explicit modifier when declaring
a constructor. Constructors using the explicit modifier can only be used to construct objects
explicitly. Statements 2-4 would not have compiled if the constructors expecting one argument would
have been declared using explicit. E.g.,
    explicit DataBase(Person const &person);
    explicit DataBase(size_t count, std::istream &in);
Having declared all constructors accepting one argument as explicit the above assignments would
have required the explicit specification of the appropriate constructors, thus clarifying the programmer’s
intent:
DataBase db;
DataBase db2;
Person person;
db2 = db; // 1
db2 = DataBase(person); // 2
db2 = DataBase(10); // 3
db2 = DataBase(cin); // 4
As a rule of thumb prefix one argument constructors with the explicit keyword unless implicit
promotions are perfectly natural (string’s char const * accepting constructor is a case in point).
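The effect of explicit can be checked at compile time. The sketch below uses a much reduced, hypothetical DataBase that merely records a count instead of reading persons from a stream; the member size and the data member d_size are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Person
{};

// Reduced stand-in for DataBase: it only remembers how many persons
// it would contain.
class DataBase
{
    std::size_t d_size = 0;

    public:
        DataBase() = default;
        explicit DataBase(Person const &person)
        :
            d_size(1)
        {}
        explicit DataBase(std::size_t count)
        :
            d_size(count)
        {}
        std::size_t size() const
        {
            return d_size;
        }
};

// Implicit conversions are no longer available ...
static_assert(!std::is_convertible<Person, DataBase>::value,
                "Person must not convert implicitly");
static_assert(!std::is_convertible<std::size_t, DataBase>::value,
                "size_t must not convert implicitly");
// ... but explicit construction still is.
static_assert(std::is_constructible<DataBase, Person>::value,
                "explicit construction from a Person is available");
```

Statements like db2 = person and db2 = 10 are rejected for this class, while db2 = DataBase(person) and db2 = DataBase(10) compile.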
11.4.1 Explicit conversion operators
In addition to explicit constructors, C++ supports explicit conversion operators.
For example, a class might define operator bool() const returning true if an object of that
class is in a usable state and false if not. Since the type bool is an arithmetic type this could
result in unexpected or unintended behavior. Consider:
class StreamHandler
{
public:
operator bool() const; // true: object is fit for use
...
};
int fun(StreamHandler &sh)
{
int sx;
if (sh) // intended use of operator bool()
... use sh as usual; also use ‘sx’
process(sh); // typo: ‘sx’ was intended
}
In this example process unintentionally receives the value returned by operator bool using the
implicit conversion from bool to int.
With explicit conversion operators implicit conversions like the one shown in the example are
prevented and such conversion operators can only be used in situations where the converted type is
explicitly required. E.g., in the condition sections of if or repetition statements where a bool value
is expected. In such cases an explicit operator bool() conversion operator would automatically
be used.
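As an illustration, here is a sketch of a StreamHandler using an explicit operator bool. The constructor, the data member d_ok and the function process(int) are hypothetical; they only serve to make the example complete.

```cpp
#include <cassert>
#include <type_traits>

class StreamHandler
{
    bool d_ok;                  // hypothetical state flag

    public:
        explicit StreamHandler(bool ok)
        :
            d_ok(ok)
        {}
        explicit operator bool() const      // true: object is fit for use
        {
            return d_ok;
        }
};

inline int process(int value)   // hypothetical function expecting an int
{
    return value;
}

inline int fun(StreamHandler const &sh)
{
    int sx = 42;
    if (sh)                     // a condition: explicit operator bool is used
        return process(sx);
    // process(sh);             // the typo no longer compiles
    return -1;
}

// Outside of contexts explicitly requiring a bool no conversion is available.
static_assert(!std::is_convertible<StreamHandler, int>::value,
                "no implicit conversion to int");
static_assert(!std::is_convertible<StreamHandler, bool>::value,
                "no implicit conversion to bool");
```

A static_cast<bool>(sh) remains available for those places where the conversion is really intended.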
11.5 Overloading the increment and decrement operators
Overloading the increment operator (operator++) and decrement operator (operator--) introduces
a small problem: there are two versions of each operator, as they may be used as postfix
operator (e.g., x++) or as prefix operator (e.g., ++x).
Used as postfix operator, the object’s original value is returned as an rvalue (a temporary const
object) and the post-incremented variable itself disappears from view. Used as prefix operator, the
variable is incremented, and its value is returned as lvalue and it may be altered again by modifying
the prefix operator’s return value. Whereas these characteristics are not required when the operator is
overloaded, it is strongly advised to implement these characteristics in any overloaded increment or
decrement operator.
Suppose we define a wrapper class around the size_t value type. Such a class could offer the
following (partially shown) interface:
class Unsigned
{
size_t d_value;
public:
Unsigned();
explicit Unsigned(size_t init);
Unsigned &operator++();
};
The class’s last member declares the prefix overloaded increment operator. The returned lvalue is
Unsigned &. The member is easily implemented:
Unsigned &Unsigned::operator++()
{
++d_value;
return *this;
}
To define the postfix operator, an overloaded version of the operator is defined, expecting a (dummy)
int argument. This might be considered a kludge, or an acceptable application of function overloading.
Whatever your opinion in this matter, the following can be concluded:
• Overloaded increment and decrement operators without parameters are prefix operators, and
should return references to the current object.
• Overloaded increment and decrement operators having an int parameter are postfix operators,
and should return a value which is a copy of the object at the point where its postfix operator
is used.
The postfix increment operator is declared as follows in the class Unsigned’s interface:
Unsigned operator++(int);
It may be implemented as follows:
Unsigned Unsigned::operator++(int)
{
Unsigned tmp(*this);
++d_value;
return tmp;
}
Note that the operator’s parameter is not used. It is only part of the implementation to disambiguate
the prefix- and postfix operators in implementations and declarations.
In the above example the statement incrementing the current object offers the nothrow guarantee
as it only involves an operation on a primitive type. If the initial copy construction throws then the
original object is not modified, if the return statement throws the object has safely been modified.
But incrementing an object could itself throw exceptions. How to implement the increment operators
in that case? Once again, swap is our friend. Here are the pre- and postfix operators offering the
strong guarantee when the member increment performing the increment operation may throw:
Unsigned &Unsigned::operator++()
{
Unsigned tmp(*this);
tmp.increment();
swap(tmp);
return *this;
}
Unsigned Unsigned::operator++(int)
{
Unsigned tmp(*this);
tmp.increment();
swap(tmp);
return tmp;
}
The postfix increment operator first creates a copy of the current object. That copy is incremented
and then swapped with the current object. If increment throws the current object remains unaltered;
the swap operation ensures that the original object is returned and the current object becomes
the incremented object.
When calling the increment or decrement operator using its full member function name then any
int argument passed to the function results in calling the postfix operator. Omitting the argument
results in calling the prefix operator. Example:
Unsigned uns(13);
uns.operator++(); // prefix-incrementing uns
uns.operator++(0); // postfix-incrementing uns
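Putting the pieces together, a compact sketch of the class Unsigned with both operators might look as follows. The accessor value is an assumption added so the effect of the operators can be inspected; the simple, non-throwing implementations are used.

```cpp
#include <cassert>
#include <cstddef>

class Unsigned
{
    std::size_t d_value;

    public:
        Unsigned()
        :
            d_value(0)
        {}
        explicit Unsigned(std::size_t init)
        :
            d_value(init)
        {}
        std::size_t value() const       // hypothetical accessor
        {
            return d_value;
        }
        Unsigned &operator++()          // prefix: returns the object itself
        {
            ++d_value;
            return *this;
        }
        Unsigned operator++(int)        // postfix: returns the old value
        {
            Unsigned tmp(*this);
            ++d_value;
            return tmp;
        }
};
```

The prefix form can be chained or assigned to, as it returns a reference to the object; the postfix form hands back a copy made before the increment.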
11.6 Overloading binary operators
In various classes overloading binary operators (like operator+) can be a very natural extension
of the class’s functionality. For example, the std::string class has various overloaded forms of
operator+.
Most binary operators come in two flavors: the plain binary operator (like the + operator) and the
compound assignment variant (like the += operator). Whereas the plain binary operators return
values, the compound assignment operators return a reference to the object to which the operator
was applied. For example, with std::string objects the following code (annotations below the
example) may be used:
std::string s1;
std::string s2;
std::string s3;
s1 = s2 += s3; // 1
(s2 += s3) + " postfix"; // 2
s1 = "prefix " + s3; // 3
"prefix " + s3 + "postfix"; // 4
• at // 1 the contents of s3 are added to s2. Next, s2 is returned, and its new contents are
assigned to s1. Note that += returns s2 itself.
• at // 2 the contents of s3 are also added to s2, but as += returns s2 itself, s2 can immediately
be used as the left-hand operand of +. The string returned by + is a temporary; s2 itself is not
modified any further.
• at // 3 the + operator returns a std::string containing the concatenation of the text
prefix and the contents of s3. This string returned by the + operator is thereupon assigned
to s1.
• at // 4 the + operator is applied twice. The effect is:
1. The first + returns a std::string containing the concatenation of the text prefix and
the contents of s3.
2. The second + operator takes this returned string as its left hand value, and returns a
string containing the concatenated text of its left and right hand operands.
3. The string returned by the second + operator represents the value of the expression.
Consider the following code, in which a class Binary supports an overloaded operator+:
class Binary
{
public:
Binary();
Binary(int value);
Binary operator+(Binary const &rvalue);
};
int main()
{
Binary b1;
Binary b2(5);
b1 = b2 + 3; // 1
b1 = 3 + b2; // 2
}
Compilation of this little program fails for statement // 2, with the compiler reporting an error
like:
    error: no match for 'operator+' in '3 + b2'
Why is statement // 1 compiled correctly whereas statement // 2 won’t compile?
In order to understand this remember promotions. As we have seen in section 11.4, constructors
expecting a single argument may be implicitly activated when an argument of an appropriate type
is provided. We’ve encountered this repeatedly with std::string objects, where an NTBS may be
used to initialize a std::string object.
Analogously, in statement // 1, the + operator is called for the b2 object. This operator expects
another Binary object as its right hand operand. However, an int is provided. As a constructor
Binary(int) exists, the int value is first promoted to a Binary object. Next, this Binary object
is passed as argument to the operator+ member.
In statement // 2 no promotions are available: here the + operator is applied to a left-hand
operand that is an int. An int is a primitive type and primitive types have no concept of
‘constructors’, ‘member functions’ or ‘promotions’.
How, then, are promotions of left-hand operands implemented in statements like "prefix " +
s3? Since promotions are applied to function arguments, we must make sure that both operands of
binary operators are arguments. This implies that plain binary operators supporting promotions for
either their left-hand side operand or right-hand side operand should be declared as free operators,
also called free functions.
Functions like the plain binary operators conceptually belong to the class for which they implement
the binary operator. Consequently they should be declared in the class’s header file. We cover
their implementations shortly, but here is our first revision of the declaration of the class Binary,
declaring an overloaded + operator as a free function:
class Binary
{
public:
Binary();
Binary(int value);
};
Binary operator+(Binary const &lhs, Binary const &rhs);
By defining binary operators as free functions, the following promotions are possible:
• If the left-hand operand is of the intended class type, the right hand argument is promoted
whenever possible;
• If the right-hand operand is of the intended class type, the left hand argument is promoted
whenever possible;
• No promotions occur when none of the operands are of the intended class type;
• An ambiguity occurs when promotions to different classes are possible for the two operands.
For example:
    class A;
    class B
    {
        public:
            B(A const &a);
    };
    class A
    {
        public:
            A();
            A(B const &b);
    };
    A operator+(A const &a, B const &b);
    B operator+(B const &b, A const &a);
    int main()
    {
        A a;
        a + a;          // ambiguous: both overloaded + operators are viable
    }
Here, both overloaded + operators are possible when compiling the statement a + a. The
ambiguity must be solved by explicitly promoting one of the arguments, e.g., a + B(a) allows
the compiler to resolve the ambiguity to the first overloaded + operator.
The next step is to implement the corresponding overloaded binary compound assignment operators,
having the form @=, where @ represents a binary operator. As this operator always has a left-hand
operand which is an object of its own class, it is implemented as a true member function.
Furthermore, the compound assignment operator should return a reference to the object to which
the binary operation applies, as the object might be modified in the same statement. E.g., (s2 +=
s3) + " postfix". Here is our second revision of the class Binary, showing both the declaration
of the plain binary operator and the corresponding compound assignment operator:
class Binary
{
public:
Binary();
Binary(int value);
Binary &operator+=(Binary const &rhs);
};
Binary operator+(Binary const &lhs, Binary const &rhs);
How should the compound addition assignment operator be implemented? When implementing the
compound assignment operator the strong guarantee should again be kept in mind. Use a temporary
object and swap if the add member might throw. Example:
    Binary &Binary::operator+=(Binary const &other)
    {
        Binary tmp(*this);
        tmp.add(other);         // this may throw
        swap(tmp);
        return *this;
    }
It’s easy to implement the plain binary operator for classes offering the matching compound assignment
operator: the lhs argument is copied into a Binary tmp to which the rhs operand is added.
Then tmp is returned. The copy construction and two statements could be contracted into one single
return statement, but then compilers usually aren’t able to apply copy elision in this case. But copy
elision is usually used when the steps are taken separately:
    class Binary
    {
        public:
            Binary();
            Binary(int value);
            Binary &operator+=(Binary const &other);
    };
    Binary operator+(Binary const &lhs, Binary const &rhs)
    {
        Binary tmp(lhs);
        tmp += rhs;
        return tmp;
    }
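To see the promotions at work, the class can be filled in with a trivial payload. In the sketch below Binary simply wraps an int; the data member d_value and the accessor value are assumptions, and the compound assignment is implemented directly since adding ints cannot throw.

```cpp
#include <cassert>

class Binary
{
    int d_value;                // hypothetical payload

    public:
        Binary()
        :
            d_value(0)
        {}
        Binary(int value)       // not explicit: allows promotions from int
        :
            d_value(value)
        {}
        int value() const       // hypothetical accessor
        {
            return d_value;
        }
        Binary &operator+=(Binary const &rhs)
        {
            d_value += rhs.d_value;
            return *this;
        }
};

// Free function: both operands are arguments, so either may be promoted.
inline Binary operator+(Binary const &lhs, Binary const &rhs)
{
    Binary tmp(lhs);
    tmp += rhs;
    return tmp;
}
```

Both b2 + 3 and 3 + b2 now compile: in the second expression the int is promoted to a Binary before the free operator+ is called.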
But wait! Remember the design principle for move-aware classes that was given in section 9.7.8?
When implementing binary operators we’re doing exactly what was mentioned in that design principle.
A temporary object is constructed and the compound assignment operation is applied to the
temporary object. In the next section we’ll have a look at how we can use this design principle to our
advantage.
If the class Binary is a move-aware class then we can add move-aware binary operators to our
class. The actual work, as mentioned, is performed by the compound addition assignment operator.
Applying the format of the traditional binary operator (receiving two const references) to the
move-aware addition operator we get the following signature:
    Binary operator+(Binary &&lhs, Binary const &rhs);
To implement this, we realize that we already have a temporary object, so we can return lhs after
having added rhs to it. Since lhs already is a temporary, we can avoid a copy construction by
wrapping the compound addition in a std::move call:
Binary operator+(Binary &&lhs, Binary const &rhs)
{
return std::move(lhs += rhs);
}
When executing an expression like (all Binary objects) b1 + b2 + b3 the following functions are
called:
    copy operator+ = b1 + b2
        Copy constructor = tmp(b1)
        Copy += = tmp += b2
            Copy constructor = tmp2(tmp)
            += operation = tmp2.add(b2), swap(tmp2)
    move operator+ = tmp + b3
        Copy += = tmp += b3
            Copy constructor = tmp2(tmp)
            += operation = tmp2.add(b3), swap(tmp2)
        Move constructor = return std::move(tmp)
There’s at least some gain: if the std::move wrap is omitted, then the copy constructor is called.
But since we already have a temporary object, wouldn’t it be nice if we could lure the compiler into
using return value optimization? We can, by telling the compiler that operator+’s return value is
a temporary. We do that by explicitly stating that its return value is an rvalue reference:
Binary &&operator+(Binary &&lhs, Binary const &rhs)
{
return std::move(lhs += rhs);
}
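The same change must not be applied to our ‘traditional’ binary operator. Its tmp is a local object
that ceases to exist when the function returns, so an rvalue reference to it would dangle. That
operator therefore keeps returning its result by value:
    Binary operator+(Binary const &lhs, Binary const &rhs)
    {
        Binary tmp(lhs);
        tmp += rhs;
        return tmp;
    }
The move-aware operator now returns a reference to the temporary it received rather than constructing
a new object, and its final call of the move constructor disappears. The returned reference remains
valid only for as long as that temporary exists, i.e., until the end of the full expression in which
the temporary was created.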
But we’re not there yet: in the next section we’ll encounter possibilities for some additional and
interesting optimizations.
11.6.1 Member function reference bindings (& and &&)
In the previous section we saw that plain binary operators (like operator+) can be implemented
very efficiently if the left-hand side operand is an anonymous temporary object, for which an
rvalue-reference parameter is used. Moreover, by specifying an rvalue reference return type that
temporary can be returned without constructing a new object.
But in cases where the lhs operand is a temporary the rhs operand can directly be added to the
lhs operand, and we don’t need the additional temporary value, created by operator+= anymore.
Our implementation of operator+= thus far looks like this:
Binary &operator+=(Binary const &rhs)
{
Binary tmp(*this);
tmp.add(rhs);
swap(tmp);
return *this;
}
However, when implementing operator+ we either already have a temporary object (when
using operator+(Binary &&lhs, ...)), or just created a temporary object (when using
operator+(Binary const &lhs, ...)). In our current implementation lots of additional temporaries
are being constructed, each of them requiring a copy construction. E.g., in an expression
like
    Binary{} + varB + varC + varD
a temporary is constructed when computing Binary{} + varB, then another one when adding
varC, and yet one more when adding varD. In addition, each addition also performs a swap,
even though we already have a temporary (i.e., Binary{}) in our hands.
How to tell the compiler that we don’t need these temporaries?
For that we need a way to inform the compiler that operator+= is either called by a standard lvalue
left-hand side operand, or by an rvalue reference (i.e., temporary object). This can be realized using
reference bindings a.k.a. reference qualifiers. Reference bindings, which may be used by all member
functions, not just by overloaded operators, consist of a reference token (&) or an rvalue reference
token (&&) which is affixed immediately to the function’s head (this applies to the declaration and
the implementation alike). Functions provided with rvalue reference bindings are used when called
by anonymous temporary objects (i.e., rvalues), whereas functions provided with lvalue reference
bindings are used when called by other types of objects. Where appropriate the const qualifier can
be applied in addition (although it wouldn’t make much sense in combination with rvalue reference
bindings, since rvalue references don’t refer to const objects).
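How the compiler selects among members having different reference bindings can be made visible with a small probe. The class Probe and its member who are hypothetical; each overload merely reports which binding was selected.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Probe
{
    std::string who() &             // called for modifiable lvalues
    {
        return "lvalue";
    }
    std::string who() const &       // called for const lvalues
    {
        return "const lvalue";
    }
    std::string who() &&            // called for temporaries (rvalues)
    {
        return "rvalue";
    }
};
```

A named object selects the & member, an anonymous temporary selects the && member, and std::move applied to a named object also selects the && member, which is what the implementations of operator+ below rely on.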
Now we’re in a position to fine-tune our implementations of operator+(). First we make a distinction
between operator+= when called from a temporary and operator+= when called from
another object. In the latter case we need a temporary, to which rhs is added:
Binary &operator+=(Binary const &rhs) &&
{
// directly add rhs to *this,
return *this;
}
Binary &operator+=(Binary const &rhs) &
{
Binary tmp(*this);
std::move(tmp) += rhs; // directly add rhs to tmp
swap(tmp);
return *this;
}
Next we look at the two implementations of operator+. When using Binary &&lhs we can directly
call operator+=() &&, otherwise we first create a temporary, and then call operator+=() &&
from the temporary. The second operator returns its result by value: its tmp is a local object, and
an rvalue reference to it would dangle once the function returns.
    Binary &&operator+(Binary &&lhs, Binary const &rhs)
    {
        return move(move(lhs) += rhs);
    }
    Binary operator+(Binary const &lhs, Binary const &rhs)
    {
        Binary tmp(lhs);
        return move(tmp) + rhs;
    }
So, why do we still need operator+=() &? Well, only in situations where we want to add something
to an existing Binary object.
And this is what we call when using the above implementations for the expression b1 + b2 + b3:
    Copy constructor = tmp(b1)
    Move += = tmp += b2
    Move constructor = return value of b1 + b2
    Move += = returned temporary += b3
It’s even faster when the first operand already is a temporary (e.g., Binary{} + b2 + b3):
Move += = Binary{} += b2
Move += = Binary{} += b3
It might be illustrative to compare these actions to the ones shown in the previous section, using
traditional implementations.
Summarizing:
• Reference bindings are used to inform the compiler for what type of references functions should
be called;
• Binary operators can capitalize on these functions to minimize the number of copy constructions
that have to be performed;
• By specifying rvalue references as return types of functions returning the temporaries they
received through rvalue reference parameters the construction of additional objects can be
avoided. References to local objects must never be returned.
11.7 Overloading ‘operator new(size_t)’
When operator new is overloaded, it must define a void * return type, and its first parameter
must be of type size_t. The default operator new defines only one parameter, but overloaded
versions may define multiple parameters. The first argument is not explicitly specified but is deduced
from the size of objects of the class for which operator new is overloaded. In this section overloading
operator new is discussed. Overloading new[] is discussed in section 11.9.
It is possible to define multiple versions of the operator new, as long as each version defines its
own unique set of arguments. When overloaded operator new members must dynamically allocate
memory they can do so using the global operator new, applying the scope resolution operator ::.
In the next example the overloaded operator new of the class String initializes the substrate of
dynamically allocated String objects to 0-bytes:
    #include <cstring>
    #include <string>
    class String
    {
        std::string *d_data;
        public:
            void *operator new(std::size_t size)
            {
                return std::memset(::operator new(size), 0, size);
            }
            bool empty() const
            {
                return d_data == 0;
            }
    };
The above operator new is used in the following program, illustrating that even though String’s
default constructor does nothing the object’s data are initialized to zeroes:
#include "string.h"
#include <iostream>
using namespace std;
int main()
{
String *sp = new String;
    cout << boolalpha << sp->empty() << '\n'; // shows: true
}
At new String the following took place:
• First, String::operator new was called, allocating and initializing a block of memory, the
size of a String object.
• Next, a pointer to this block of memory was passed to the (default) String constructor. Since
no constructor was defined, the constructor itself didn’t do anything at all.
As String::operator new initialized the allocated memory to zero bytes the allocated String
object’s d_data member had already been initialized to a 0-pointer by the time it started to exist.
All member functions (including constructors and destructors) we’ve encountered so far define a
(hidden) pointer to the object on which they should operate. This hidden pointer becomes the
function’s this pointer.
In the next example of pseudo C++ code, the pointer is explicitly shown to illustrate what’s happening
when operator new is used. In the first part a String object str is directly defined, in the
second part of the example the (overloaded) operator new is used:
String::String(String *const this); // real prototype of the default
// constructor
String *sp = new String; // This statement is implemented
// as follows:
String *sp = static_cast<String *>( // allocation
String::operator new(sizeof(String))
);
String::String(sp); // initialization
In the above fragment the member functions were treated as object-less member functions of the
class String. Such members are called static member functions (cf. chapter 8). Actually, operator
new is such a static member function. Since it has no this pointer it cannot reach data members of
the object for which it is expected to make memory available. It can only allocate and initialize the
allocated memory, but cannot reach the object’s data members by name as there is as yet no data
object layout defined.
Following the allocation, the memory is passed (as the this pointer) to the constructor for further
processing.
Operator new can have multiple parameters. The first parameter is initialized as an implicit argument
and is always a size_t parameter. Additional overloaded operators may define additional
parameters. An interesting additional operator new is the placement new operator. With the
placement new operator a block of memory has already been set aside and one of the class’s constructors
is used to initialize that memory. Overloading placement new requires an operator new
having two parameters: size_t and char *, pointing to the memory that was already available.
The size_t parameter is implicitly initialized, but the remaining parameters must explicitly be
initialized using arguments to operator new. Hence we reach the familiar syntactical form of the
placement new operator in use:
    alignas(String) char buffer[sizeof(String)];    // predefined, suitably aligned memory
    String *sp = new(buffer) String;                // placement new call
The declaration of the placement new operator in our class String looks like this:
void *operator new(size_t size, char *memory);
It could be implemented like this (also initializing the String’s memory to 0-bytes):
void *String::operator new(size_t size, char *memory)
{
return memset(memory, 0, size);
}
Any other overloaded version of operator new could also be defined. Here is an example showing
the use and definition of an overloaded operator new storing the object’s address immediately in
an existing array of pointers to String objects (assuming the array is large enough):
// use:
String *next(String **pointers, size_t *idx)
{
return new(pointers, (*idx)++) String;
}
// implementation:
void *String::operator new(size_t size, String **pointers, size_t idx)
{
return pointers[idx] = static_cast<String *>(::operator new(size));
}
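The placement form can be tried out with a small stand-in class. The following is a minimal sketch, not the String class itself: the names Text and roundTrip are hypothetical, and the alignas specification and the matching placement operator delete are precautions added for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a class like String.
class Text
{
    char d_data[16];

    public:
        Text()
        {
            std::strcpy(d_data, "hello");
        }
        char const *c_str() const
        {
            return d_data;
        }
        // placement form: allocates nothing, only clears the offered memory
        static void *operator new(std::size_t size, char *memory)
        {
            return std::memset(memory, 0, size);
        }
        // matching placement delete, only used if the constructor throws
        static void operator delete(void *, char *)
        {}
};

std::string roundTrip()
{
    alignas(Text) char buffer[sizeof(Text)];    // memory set aside beforehand
    Text *tp = new(buffer) Text;                // constructs the object in buffer
    std::string ret = tp->c_str();
    tp->~Text();            // buffer wasn't allocated by new: destroy explicitly
    return ret;
}
```

As the buffer was not obtained from an allocating operator new, the object is not deleted: its destructor is called explicitly instead.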
11.8 Overloading ‘operator delete(void *)’
The delete operator may also be overloaded. In fact it’s good practice to overload operator
delete whenever operator new is also overloaded.
Operator delete must define a void * parameter. A second overloaded version defining a second
parameter of type size_t is related to overloading operator new[] and is discussed in section
11.9.
Overloaded operator delete members return void.
The ‘home-made’ operator delete is called when deleting a dynamically allocated object after
executing the destructor of the associated class. So, the statement
delete ptr;
with ptr being a pointer to an object of the class String for which the operator delete was overloaded,
is a shorthand for the following statements:
ptr->~String(); // call the class’s destructor
// and do things with the memory pointed to by ptr
String::operator delete(ptr);
The overloaded operator delete may do whatever it wants to do with the memory pointed to by
ptr. It could, e.g., simply delete it. If that would be the preferred thing to do, then the default
delete operator can be called using the :: scope resolution operator. For example:
void String::operator delete(void *ptr)
{
// any operation considered necessary, then, maybe:
::operator delete(ptr);
}
To declare the above overloaded operator delete simply add the following line to the class’s interface:
void operator delete(void *ptr);
Like operator new, operator delete is a static member function (see also chapter 8).
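To see an overloaded operator new and operator delete cooperate, consider the following sketch. It assumes a hypothetical class Tracked (not part of the Annotations) that merely counts the blocks currently handed out and forwards the real work to the global operators:

```cpp
#include <cstddef>
#include <new>

// Hypothetical class counting its outstanding dynamic allocations.
class Tracked
{
    static int s_live;          // number of blocks currently handed out

    public:
        static void *operator new(std::size_t size)
        {
            ++s_live;
            return ::operator new(size);    // the default allocation
        }
        static void operator delete(void *ptr)
        {
            --s_live;
            ::operator delete(ptr);         // hand the raw memory back
        }
        static int live()
        {
            return s_live;
        }
};

int Tracked::s_live = 0;

int liveAfterTwoNewsAndOneDelete()
{
    Tracked *first = new Tracked;
    Tracked *second = new Tracked;
    delete first;
    int ret = Tracked::live();      // one block is still outstanding
    delete second;
    return ret;
}
```

Both members are static whether or not the keyword is written; it is spelled out here only to stress that no this pointer is available.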
11.9 Operators ‘new[]’ and ‘delete[]’
In sections 9.1.1, 9.1.2 and 9.2.1 operator new[] and operator delete[] were introduced. Like
operator new and operator delete the operators new[] and delete[] may be overloaded.
As it is possible to overload new[] and delete[] as well as operator new and operator delete,
one should be careful in selecting the appropriate set of operators. The following rule of thumb
should always be applied:
If new is used to allocate memory, delete should be used to deallocate memory. If new[]
is used to allocate memory, delete[] should be used to deallocate memory.
By default these operators act as follows:
• operator new is used to allocate a single object or primitive value. With an object, the object’s
constructor is called.
• operator delete is used to return the memory allocated by operator new. Again, with
class-type objects, the class’s destructor is called.
• operator new[] is used to allocate a series of primitive values or objects. If a series of objects
is allocated, the class’s default constructor is called to initialize each object individually.
• operator delete[] is used to delete the memory previously allocated by new[]. If objects
were previously allocated, then the destructor is called for each individual object. Be careful,
though, when pointers to objects were allocated. If pointers to objects were allocated the destructors
of the objects to which the allocated pointers point won’t automatically be called. A
pointer is a primitive type and so no further action is taken when it is returned to the common
pool.
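The difference between deleting an array of objects and deleting an array of pointers can be made visible by counting destructor calls. The sketch below uses a hypothetical struct Counted; to avoid leaking memory the pointers refer to local objects, which are only destroyed when the function ends:

```cpp
// Hypothetical struct counting its destructor calls.
struct Counted
{
    static int s_destroyed;
    ~Counted()
    {
        ++s_destroyed;
    }
};

int Counted::s_destroyed = 0;

// delete[] on an array of objects: every object's destructor is called
int destructorCallsForObjects()
{
    Counted::s_destroyed = 0;
    delete[] new Counted[3];
    return Counted::s_destroyed;
}

// delete[] on an array of pointers: only the pointers themselves are returned
int destructorCallsForPointers()
{
    Counted first;
    Counted second;
    Counted::s_destroyed = 0;

    Counted **pointers = new Counted *[2];
    pointers[0] = &first;
    pointers[1] = &second;
    delete[] pointers;                  // no Counted destructor runs here

    return Counted::s_destroyed;        // determined before first and second
}                                       // go out of scope
```

Had the pointers referred to dynamically allocated objects, each of them would have required its own delete before delete[] is applied to the array of pointers.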
11.9.1 Overloading ‘new[]’
To overload operator new[] in a class (e.g., in the class String) add the following line to the
class’s interface:
void *operator new[](size_t size);
The member’s size parameter is implicitly provided and is initialized by C++’s run-time system
to the amount of memory that must be allocated. Like the simple one-object operator new it
should return a void *. The number of objects that must be initialized can easily be computed from
size / sizeof(String) (and of course replacing String by the appropriate class name when
overloading operator new[] for another class). The overloaded new[] member may allocate raw
memory using e.g., the default operator new[] or the default operator new:
void *String::operator new[](size_t size)
{
return ::operator new[](size);
// alternatively:
// return ::operator new(size);
}
Before returning the allocated memory the overloaded operator new[] has a chance to do something
special. It could, e.g., initialize the memory to zero-bytes.
Once the overloaded operator new[] has been defined, it is automatically used in statements like:
String *op = new String[12];
Like operator new, additional overloads of operator new[] may be defined. One opportunity for
an operator new[] overload is overloading placement new specifically for arrays of objects. This
operator is available by default but becomes unavailable once at least one overloaded operator
new[] is defined. Implementing placement new is not difficult. Here is an example, initializing the
available memory to 0-bytes before returning:
void *String::operator new[](size_t size, char *memory)
{
return memset(memory, 0, size);
}
To use this overloaded operator, the second parameter must again be provided, as in:
char buffer[12 * sizeof(String)];
String *sp = new(buffer) String[12];
11.9.2 Overloading ‘delete[]’
To overload operator delete[] in a class String add the following line to the class’s interface:
void operator delete[](void *memory);
Its parameter is initialized to the address of a block of memory previously allocated by
String::new[].
There are some subtleties to be aware of when implementing operator delete[]. Although the
addresses returned by new and new[] point to the allocated object(s), common implementations store
an additional size_t value immediately before the address returned by new and new[] (the language
standard does not prescribe this layout). This size_t value is
part of the allocated block and contains the actual size of the block. This of course does not hold true
for the placement new operator.
When a class defines a destructor the size_t value preceding the address returned by new[] does
not contain the size of the allocated block, but the number of objects specified when calling new[].
Normally that is of no interest, but when overloading operator delete[] it might become a useful
piece of information. In those cases operator delete[] does not receive the address returned by
new[] but rather the address of the initial size_t value. Whether this is at all useful is not clear.
By the time delete[]’s code is executed all objects have already been destroyed, so operator
delete[] is only able to determine how many objects were destroyed; the objects themselves cannot
be used anymore.
Here is an example showing this behavior of operator delete[] for a minimal Demo class:
#include <iostream>
using namespace std;
struct Demo
{
size_t idx;
Demo()
{
cout << "default cons\n";
}
~Demo()
{
cout << "destructor\n";
}
void *operator new[](size_t size)
{
return ::operator new[](size);
}
void operator delete[](void *vp)
{
cout << "delete[] for: " << vp << '\n';
::operator delete[](vp);
}
};
int main()
{
Demo *xp;
cout << reinterpret_cast<size_t *>(xp = new Demo[3])[-1] << '\n';
cout << xp << '\n';
cout << "==================\n";
delete[] xp;
}
// This program displays (your 0x?????? addresses might differ, but
// the difference between the two should be sizeof(size_t)):
// default cons
// default cons
// default cons
// 3
// 0x8bdd00c
// ==================
// destructor
// destructor
// destructor
// delete[] for: 0x8bdd008
Having overloaded operator delete[] for a class String, it will be used automatically in statements
like:
delete[] new String[5];
Operator delete[] may also be overloaded using an additional size_t parameter:
void operator delete[](void *p, size_t size);
Here size is automatically initialized to the size (in bytes) of the block of memory to which void
*p points. If this form is defined, then void operator delete[](void *) should not be defined, to avoid
ambiguities. An example of this latter form of operator delete[] is:
void String::operator delete[](void *p, size_t size)
{
cout << "deleting " << size << " bytes\n";
::operator delete[](p);
}
Additional overloads of operator delete[] may be defined, but to use them they must explicitly
be called as static member functions (cf. chapter 8). Example:
// declaration:
void String::operator delete[](void *p, ostream &out);
// usage:
String *xp = new String[3];
String::operator delete[](xp, cout);
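The size passed to the two-parameter form can be compared with the size that was requested from operator new[]. The sketch below uses a hypothetical struct Block for this. It assumes, as is the case with the compilers we know of, that the size reported to operator delete[] is the size of the complete block, including any bookkeeping overhead:

```cpp
#include <cstddef>
#include <new>

// Hypothetical struct recording the sizes seen by its array operators.
struct Block
{
    static std::size_t s_requested;     // bytes asked of operator new[]
    static std::size_t s_released;      // bytes reported to operator delete[]

    int d_value;

    Block()
    :
        d_value(0)
    {}
    ~Block()                            // user-provided destructor
    {}
    static void *operator new[](std::size_t size)
    {
        s_requested = size;
        return ::operator new[](size);
    }
    static void operator delete[](void *p, std::size_t size)
    {
        s_released = size;
        ::operator delete[](p);
    }
};

std::size_t Block::s_requested = 0;
std::size_t Block::s_released = 0;

bool sizesAgree()
{
    delete[] new Block[5];
    return Block::s_requested >= 5 * sizeof(Block)
           && Block::s_requested == Block::s_released;
}
```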
11.9.3 C++14: the ‘operator delete(void *, size_t)’ family
As we’ve seen, classes may overload their operator delete and operator delete[] members.
The C++14 standard also supports overloading the global void operator delete(void *,
size_t size) and void operator delete[](void *, size_t size) functions.
When a global sized deallocation function is defined, it is automatically used
instead of the default, non-sized deallocation function. The performance
of programs may improve if a sized deallocation function is available (cf.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3663.html).
11.9.4 ‘new[]’, ‘delete[]’ and exceptions
When an exception is thrown while executing a new[] expression, what will happen? In this section
we’ll show that new[] is exception safe even when only some of the objects were properly
constructed.
To begin, new[] might throw while trying to allocate the required memory. In this case a bad_alloc
is thrown and we don’t leak as nothing was allocated.
Having allocated the required memory the class’s default constructor is going to be used for each
of the objects in turn. At some point a constructor might throw. What happens next is defined by
the C++ standard: the destructors of the already constructed objects are called and the memory
allocated for the objects themselves is returned to the common pool. Assuming that the failing
constructor offers the basic guarantee, new[] is therefore exception safe even if a constructor may
throw.
The following example illustrates this behavior. A request to allocate and initialize five objects
is made, but after constructing two objects construction fails by throwing an exception. The output
shows that the destructors of properly constructed objects are called and that the allocated substrate
memory is properly returned:
#include <iostream>
using namespace std;
static size_t count = 0;
class X
{
int x;
public:
X()
{
if (count == 2)
throw 1;
cout << "Object " << ++count << '\n';
}
~X()
{
cout << "Destroyed " << this << "\n";
}
void *operator new[](size_t size)
{
cout << "Allocating objects: " << size << " bytes\n";
return ::operator new(size);
}
void operator delete[](void *mem)
{
cout << "Deleting memory at " << mem << ", containing: " <<
*static_cast<int *>(mem) << "\n";
::operator delete(mem);
}
};
int main()
try
{
X *xp = new X[5];
cout << "Memory at " << xp << '\n';
delete[] xp;
}
catch (...)
{
cout << "Caught exception.\n";
}
// Output from this program (your 0x??? addresses might differ)
// Allocating objects: 24 bytes
// Object 1
// Object 2
// Destroyed 0x8428010
// Destroyed 0x842800c
// Deleting memory at 0x8428008, containing: 5
// Caught exception.
11.10 Function Objects
Function Objects are created by overloading the function call operator operator(). By defining the
function call operator an object masquerades as a function, hence the term function objects. Function
objects are also known as functors.
Function objects are important when using generic algorithms. The use of function objects is preferred
over alternatives like pointers to functions. The fact that they are important in the context of
generic algorithms leaves us with a didactic dilemma. At this point in the C++ Annotations it would
have been nice if generic algorithms would already have been covered, but for the discussion of the
generic algorithms knowledge of function objects is required. This bootstrapping problem is solved
in a well known way: by ignoring the dependency for the time being, for now concentrating on the
function object concept.
Function objects are objects for which operator() has been defined. Function objects are not just
used in combination with generic algorithms, but also as a (preferred) alternative to pointers to
functions.
Function objects are frequently used to implement predicate functions. Predicate functions return
boolean values. Predicate functions and predicate function objects are commonly referred to as ‘predicates’.
Predicates are frequently used by generic algorithms such as the count_if generic algorithm,
covered in chapter 19, returning the number of times its function object has returned true. In the
standard template library two kinds of predicates are used: unary predicates receive one argument,
binary predicates receive two arguments.
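As a preview of how a unary predicate cooperates with count_if, here is a minimal sketch. The class LongerThan and the function nLonger are hypothetical names; the predicate receives its limit through its constructor and is called by count_if for each element in turn:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Unary predicate: true for strings longer than a limit fixed at construction.
class LongerThan
{
    std::size_t d_limit;

    public:
        explicit LongerThan(std::size_t limit)
        :
            d_limit(limit)
        {}
        bool operator()(std::string const &str) const
        {
            return str.length() > d_limit;
        }
};

// Returns the number of words for which the predicate returns true.
std::size_t nLonger(std::vector<std::string> const &words, std::size_t limit)
{
    return std::count_if(words.begin(), words.end(), LongerThan(limit));
}
```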
Assume we have a class Person and an array of Person objects. Further assume that the array is
not sorted. A well known procedure for finding a particular Person object in the array is to use the
function lsearch, which performs a linear search in an array. Example:
Person &target = targetPerson(); // determine the person to find
Person *pArray;
size_t n = fillPerson(&pArray);
cout << "The target person is";
if (!lsearch(&target, pArray, &n, sizeof(Person), compareFunction))
cout << " not";
cout << " found\n";
The function targetPerson determines the person we’re looking for, and fillPerson is called to
fill the array. Then lsearch is used to locate the target person.
The comparison function must be available, as its address is one of the arguments of lsearch. It
must be a real function having an address. Even if it is defined inline the compiler must generate
an ordinary, out-of-line function for it, as the calls lsearch makes through the function’s address
are not expanded inline. CompareFunction could be
implemented like this:
int compareFunction(void const *p1, void const *p2)
{
return *static_cast<Person const *>(p1) // lsearch wants 0
!= // for equal objects
*static_cast<Person const *>(p2);
}
This, of course, assumes that the operator!= has been overloaded in the class Person. But overloading
operator!= is no big deal, so let’s assume that that operator is actually available.
On average the following actions take place at least n / 2 times:
1. The two arguments of the compare function are pushed on the stack;
2. The value of the final parameter of lsearch is determined, producing compareFunction’s
address;
3. The compare function is called;
4. Then, inside the compare function the address of the right-hand operand of
Person::operator!= is pushed on the stack;
5. Person::operator!= is evaluated;
6. The argument of Person::operator!= is popped off the stack;
7. The two arguments of the compare function are popped off the stack.
Using function objects results in a different picture. Assume we have constructed a function
PersonSearch, having the following prototype (this, however, is not the preferred approach. Normally
a generic algorithm is preferred over a home-made function. But for now we focus on
PersonSearch to illustrate the use and implementation of a function object):
Person const *PersonSearch(Person *base, size_t nmemb,
Person const &target);
This function can be used as follows:
Person &target = targetPerson();
Person *pArray;
size_t n = fillPerson(&pArray);
cout << "The target person is";
if (!PersonSearch(pArray, n, target))
cout << " not";
cout << " found\n";
So far, not much has been changed. We’ve replaced the call to lsearch with a call to another
function: PersonSearch. Now look at PersonSearch itself:
Person const *PersonSearch(Person *base, size_t nmemb,
Person const &target)
{
for (size_t idx = 0; idx < nmemb; ++idx)
if (target(base[idx]))
return base + idx;
return 0;
}
PersonSearch implements a plain linear search. However, in the for-loop we see
target(base[idx]). Here target is used as a function object. Its implementation is simple:
bool Person::operator()(Person const &other) const
{
return *this == other;
}
Note the somewhat peculiar syntax: operator(). The first set of parentheses defines the operator
that is overloaded: the function call operator. The second set of parentheses defines the parameters
that are required for this overloaded operator. In the class header file this overloaded operator is
declared as:
bool operator()(Person const &other) const;
Clearly Person::operator() is a simple function. It contains but one statement, and we could
consider defining it inline. Assuming we do, then this is what happens when operator() is called:
1. The address of the right-hand operand of Person::operator== is pushed on
the stack;
2. The operator== function is evaluated (which probably also is a semantic improvement over
calling operator!= when looking for an object equal to a specified target object);
3. The argument of Person::operator== is popped off the stack.
Due to the fact that operator() is an inline function, it is not actually called. Instead operator==
is called immediately. Moreover, the required stack operations are fairly modest.
Function objects may truly be defined inline. Functions that are called indirectly (i.e., using pointers
to functions) are normally not expanded inline, as such calls must go through their addresses. Therefore, even if the
function needs to do very little work it is compiled as an ordinary function if it is going to be
called through pointers. The overhead of performing the indirect call may annihilate the advantage
of the flexibility of calling functions indirectly. In these cases using inline function objects can result
in an increase of a program’s efficiency.
An added benefit of function objects is that they may access the private data of their objects. In a
search algorithm where a compare function is used (as with lsearch) the target and array elements
are passed to the compare function using pointers, involving extra stack handling. Using function
objects, the target person doesn’t vary within a single search task. Therefore, the target person could
be passed to the function object’s class constructor. This is in fact what happens in the expression
target(base[idx]) receiving as its only argument the subsequent elements of the array to search.
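Passing the target to a function object’s constructor can be sketched as follows. The minimal Person struct, the class SamePerson and the functions personSearch and indexOf are all hypothetical and merely illustrate the idea; they are not the Person class assumed in the text above:

```cpp
#include <cstddef>
#include <string>

struct Person                       // hypothetical minimal Person
{
    std::string name;
};

inline bool operator==(Person const &lhs, Person const &rhs)
{
    return lhs.name == rhs.name;
}

// Function object remembering the target it was constructed with.
class SamePerson
{
    Person const &d_target;

    public:
        explicit SamePerson(Person const &target)
        :
            d_target(target)
        {}
        bool operator()(Person const &other) const      // defined inline
        {
            return d_target == other;
        }
};

// Plain linear search: the function object is called for each element.
Person const *personSearch(Person const *base, std::size_t nmemb,
                           SamePerson const &same)
{
    for (std::size_t idx = 0; idx != nmemb; ++idx)
        if (same(base[idx]))
            return base + idx;
    return 0;
}

// Returns the index of the person named name, or -1 if there is none.
std::ptrdiff_t indexOf(std::string const &name)
{
    Person const people[] = {{"Ann"}, {"Bob"}, {"Cy"}};
    Person const target{name};

    Person const *found = personSearch(people, 3, SamePerson(target));
    return found ? found - people : -1;
}
```

The target is passed only once, when the SamePerson object is constructed; during the search itself only the array elements are passed.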
11.10.1 Constructing manipulators
In chapter 6 we saw constructions like cout << hex << 13 to display the value 13 in hexadecimal
format. One may wonder by what magic the hex manipulator accomplishes this. In this section
the construction of manipulators like hex is covered.
Actually the construction of a manipulator is rather simple. To start, a definition of the manipulator
is needed. Let’s assume we want to create a manipulator w10 which sets the field width of the next
field to be written by the ostream object to 10. This manipulator is constructed as a function. The
w10 function needs to know about the ostream object in which the width must be set. By providing
the function with an ostream & parameter, it obtains this knowledge. Now that the function knows
about the ostream object we’re referring to, it can set the width in that object.
Next, it must be possible to use the manipulator in an insertion sequence. This implies that the
return value of the manipulator must be a reference to an ostream object also.
From the above considerations we’re now able to construct our w10 function:
#include <ostream>
#include <iomanip>
std::ostream &w10(std::ostream &str)
{
return str << std::setw(10);
}
The w10 function can of course be used in a ‘stand alone’ mode, but it can also be used as a manipulator.
E.g.,
#include <iostream>
#include <iomanip>
using namespace std;
extern ostream &w10(ostream &str);
int main()
{
w10(cout) << 3 << " ships sailed to America\n";
cout << "And " << w10 << 3 << " more ships sailed too.\n";
}
The w10 function can be used as a manipulator because the class ostream has an overloaded
operator<< accepting a pointer to a function expecting an ostream & and returning an ostream
&. Its definition is:
ostream& operator<<(ostream &(*func)(ostream &str))
{
return (*func)(*this);
}
In addition to the above overloaded operator<< another one is defined:
ostream &operator<<(ios_base &(*func)(ios_base &base))
{
(*func)(*this);
return *this;
}
This latter function is used when inserting, e.g., hex or internal.
The above procedure does not work for manipulators requiring arguments. It is of course possible to
overload operator<< to accept an ostream reference and the address of a function expecting an
ostream & and, e.g., an int, but while the address of such a function may be specified with the <<-operator,
the arguments themselves cannot be specified. So, one wonders how the following construction
has been implemented:
cout << setprecision(3)
In this case the manipulator cannot simply be a function whose address is inserted. Macros could
be considered, but macros are the realm of the preprocessor, and may easily suffer from unwelcome
side-effects. In C++ programs they should be avoided whenever possible. The following section
introduces a way to implement manipulators requiring arguments without resorting to macros, but
using anonymous objects.
11.10.1.1 Manipulators requiring arguments
Manipulators taking arguments could be implemented as macros, but macros are handled by the preprocessor
and are not available beyond the preprocessing stage. The problem appears to be that you can’t call
a function in an insertion sequence: when using multiple operator<< operators in one statement
the compiler calls the functions, saves their return values, and then uses their return values in the
insertion sequence. That invalidates the ordering of the arguments passed to your <<-operators.
So, one might consider constructing another overloaded operator<< accepting the address of a
function receiving not just the ostream reference, but a series of other arguments as well. But this
creates the problem that it isn’t clear how the function should receive its arguments: you can’t just
call it since that takes us back to the above-mentioned problem. Merely passing its address is fine,
but then no arguments can be passed to the function.
There exists a solution, based on the use of anonymous objects:
• First, a class is constructed, e.g. Align, whose constructor expects multiple arguments. In our
example these represent, respectively, the field width and the alignment.
• Furthermore, we define the function:
ostream &operator<<(ostream &ostr, Align const &align)
so we can insert an Align object into the ostream.
Here is an example of a little program using such a home-made manipulator expecting multiple
arguments:
#include <iostream>
#include <iomanip>
class Align
{
unsigned d_width;
std::ios::fmtflags d_alignment;
public:
Align(unsigned width, std::ios::fmtflags alignment);
std::ostream &operator()(std::ostream &ostr) const;
};
Align::Align(unsigned width, std::ios::fmtflags alignment)
:
d_width(width),
d_alignment(alignment)
{}
std::ostream &Align::operator()(std::ostream &ostr) const
{
ostr.setf(d_alignment, std::ios::adjustfield);
return ostr << std::setw(d_width);
}
std::ostream &operator<<(std::ostream &ostr, Align const &align)
{
return align(ostr);
}
using namespace std;
int main()
{
cout
<< "‘" << Align(5, ios::left) << "hi" << "’"
<< "‘" << Align(10, ios::right) << "there" << "’\n";
}
/*
Generated output:
‘hi   ’‘     there’
*/
Note that in order to insert an anonymous Align object into the ostream, the operator<< function
must define an Align const & parameter (note the const modifier).
11.11 The case of [io]fstream::open()
Earlier, in section 6.4.2.1, it was noted that the [io]fstream::open members expect an
ios::openmode value as their final argument. E.g., to open an fstream object for writing you
could do as follows:
fstream out;
out.open("/tmp/out", ios::out);
Combinations are also possible. To open an fstream object for both reading and writing the following
stanza is often seen:
fstream out;
out.open("/tmp/out", ios::in | ios::out);
When trying to combine enum values using a ‘home made’ enum we may run into problems. Consider
the following:
enum Permission
{
READ = 1 << 0,
WRITE = 1 << 1,
EXECUTE = 1 << 2
};
void setPermission(Permission permission);
int main()
{
setPermission(READ | WRITE);
}
When offering this little program to the compiler it replies with an error message like this:
invalid conversion from ’int’ to ’Permission’
The question is of course: why is it OK to combine ios::openmode values passing these combined
values to the stream’s open member, but not OK to combine Permission values?
Combining enum values using arithmetic operators results in int-typed values. Conceptually this
never was our intention. Conceptually it can be considered correct to combine enum values if the
resulting value conceptually makes sense as a value that is still within the original enumeration
domain. Note that after adding a value READWRITE = READ | WRITE to the above enum we’re still
not allowed to specify READ | WRITE as an argument to setPermission.
To answer the question about combining enumeration values and yet stay within the enumeration’s
domain we turn to operator overloading. Up to this point operator overloading has been applied
to class types. Free functions like operator<< have been overloaded, and those overloads are
conceptually within the domain of their class.
As C++ is a strongly typed language, realize that defining an enum is really something beyond the
mere association of int-values with symbolic names. An enumeration type is really a type of its
own, and as with any type its operators can be overloaded. When writing READ | WRITE the compiler
performs the default conversion from enum values to int values and applies the operator to ints.
It does so when it has no alternative.
But it is also possible to overload the enum type’s operators. Thus we may ensure that we’ll remain
within the enum’s domain even though the resulting value wasn’t defined by the enum. The
advantage of type-safety and conceptual clarity is considered to outweigh the somewhat peculiar
introduction of values hitherto not defined by the enum.
Here is an example of such an overloaded operator:
Permission operator|(Permission left, Permission right)
{
return static_cast<Permission>(static_cast<int>(left) | right);
}
Other operators can easily and analogously be constructed.
Operators like the above were defined for the ios::openmode enumeration type, allowing us to
specify ios::in | ios::out as argument to open while specifying the corresponding parameter
as ios::openmode as well. Clearly, operator overloading can be used in many situations, not necessarily
only involving class-types.
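Putting the pieces together, the Permission enumeration can be given a small set of operators. The following sketch repeats operator| (casting both operands for symmetry) and adds an operator& plus a hypothetical helper allows; the latter two are illustrations that do not appear in the text above:

```cpp
enum Permission
{
    READ    = 1 << 0,
    WRITE   = 1 << 1,
    EXECUTE = 1 << 2
};

// Combines two permissions while remaining within the Permission type.
inline Permission operator|(Permission left, Permission right)
{
    return static_cast<Permission>(
                static_cast<int>(left) | static_cast<int>(right));
}

// Constructed analogously: the permissions that left and right share.
inline Permission operator&(Permission left, Permission right)
{
    return static_cast<Permission>(
                static_cast<int>(left) & static_cast<int>(right));
}

// Hypothetical helper: true if all wanted permissions were granted.
inline bool allows(Permission granted, Permission wanted)
{
    return (granted & wanted) == wanted;
}
```

Having defined these operators, a call like allows(READ | WRITE, WRITE) compiles without casts at the call site.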
11.12 User-defined literals
In addition to the well-known literals, like numerical constants (with or without suffixes), character
constants and string (textual) literals, C++ also supports user-defined literals, also known as
extensible literals.
A user-defined literal is defined by a function (see also section 23.3) that must be defined at namespace
scope. Such a function is called a literal operator. A literal operator cannot be a class member
function. The name of a literal operator must start with an underscore, and a literal operator is
used (called) by suffixing its name (including the underscore) to the argument that must be passed
to it. Assuming _NM2km (nautical mile to km) is the name of a literal operator, then it could be
called as 100_NM2km, producing, e.g., the value 185.2.
Using Type to represent the return type of the literal operator its generic declaration looks like this:
Type operator "" _identifier(parameter-list);
The blank space trailing the empty string was required by the original C++11 rules; current compilers also accept the form without it. The parameter lists of literal operators can
be:
• unsigned long long int. It is used as, e.g., 123_identifier. The argument to this literal
operator can be decimal constants, binary constants (initial 0b), octal constants (initial 0) and
hexadecimal constants (initial 0x);
• long double. It is used as, e.g., 12.25_NM2km;
• char const *text. The text argument is an NTBS. It is used as, e.g., 1234_pental. The
argument must not be given double quotes, and must represent a numeric constant, as also
expected by literal operators defining unsigned long long int parameters.
• char const *text, size_t len. Here, the compiler determines len as if it had called
strlen(text). It is used as, e.g., "hello"_nVowels;
• wchar_t const *text, size_t len, same as the previous one, but accepting a string of
wchar_t characters. It is used as, e.g., L"1234"_charSum;
• char16_t const *text, size_t len, same as the previous one, but accepting a string of
char16_t characters. It is used as, e.g., u"utf 16"_uc;
• char32_t const *text, size_t len, same as the previous one, but accepting a string of
char32_t characters. It is used as, e.g., U"UTF 32"_lc.
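The form receiving a char const * and a size_t can be illustrated by the _nVowels suffix mentioned in the above list. Its implementation below is merely a guess at what such an operator might do (counting lowercase vowels); only its name and parameter list were taken from the list:

```cpp
#include <cstddef>

// Assumed implementation: counts the lowercase vowels in a string literal.
std::size_t operator "" _nVowels(char const *text, std::size_t len)
{
    std::size_t count = 0;

    for (std::size_t idx = 0; idx != len; ++idx)
    {
        switch (text[idx])
        {
            case 'a':
            case 'e':
            case 'i':
            case 'o':
            case 'u':
                ++count;
            break;
        }
    }
    return count;
}
```

The compiler provides len itself: "hello"_nVowels calls the operator with the text hello and the length 5.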
If literal operators are overloaded the compiler will pick the literal operator requiring the least
‘effort’. E.g., 120 is processed by a literal operator defining an unsigned long long int parameter
and not by its overloaded version, defining a char const * parameter. But if overloaded literal
operators exist defining char const * and long double parameters then the operator defining a
char const * parameter is used when the argument 120 is provided, while the operator defining
a long double parameter is used with the argument 120.3.
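The selection rules just described can be checked with two hypothetical suffixes, _first and _second, whose literal operators merely report which overload was selected:

```cpp
#include <string>

// _first: an unsigned long long int and a char const * version exist
std::string operator "" _first(unsigned long long int)
{
    return "unsigned long long int";
}
std::string operator "" _first(char const *)
{
    return "char const *";
}

// _second: only char const * and long double versions exist
std::string operator "" _second(char const *)
{
    return "char const *";
}
std::string operator "" _second(long double)
{
    return "long double";
}
```

With these definitions 120_first selects the unsigned long long int version, 120_second falls back on the char const * version, and 120.3_second selects the long double version.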
A literal operator can define any return type. Here is an example of a definition of the _NM2km
literal operator:
double operator "" _NM2km(char const *nm)
{
return std::stod(nm) * 1.852;
}
double value = 120_NM2km; // example of use
Of course, the argument could also have been a long double constant. Here’s an alternative implementation,
explicitly expecting a long double:
double constexpr operator "" _NM2km(long double nm)
{
return nm * 1.852;
}
double value = 450.5_NM2km; // example of use
A numeric constant can also be processed completely at compile-time. Section 23.3 provides the
details of this type of literal operator.
Arguments to literal operators are themselves always constants. A literal operator like _NM2km
cannot be used to convert, e.g., the value of a variable. A literal operator, although it is defined as
a function, cannot be called like a function using merely its suffix. The following examples therefore result in compilation
errors:
double speed;
speed_NM2km; // no identifier ’speed_NM2km’
_NM2km(speed); // no function _NM2km
_NM2km(120.3); // no function _NM2km
11.13 Overloadable operators
The following operators can be overloaded:
+    -    *    /    %    ^    &    |
~    !    ,    =    <    >    <=   >=
++   --   <<   >>   ==   !=   &&   ||
+=   -=   *=   /=   %=   ^=   &=   |=
<<=  >>=  []   ()   ->   ->*  new  new[]
delete    delete[]
Several operators have textual alternatives:
textual alternative    operator
and                    &&
and_eq                 &=
bitand                 &
bitor                  |
compl                  ~
not                    !
not_eq                 !=
or                     ||
or_eq                  |=
xor                    ^
xor_eq                 ^=
‘Textual’ alternatives of operators are also overloadable (e.g., operator and). However, note that
textual alternatives are not additional operators. So, within the same context operator&& and
operator and cannot both be overloaded.
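That operator and is just another spelling of operator&& is shown by the following sketch. The struct Flag and the function bothOn are hypothetical; the operator is defined using its textual alternative and is subsequently used through the && symbol:

```cpp
// Hypothetical type for which the logical and-operator is overloaded.
struct Flag
{
    bool d_on;
};

// Defined using the textual alternative: this is operator&&.
inline bool operator and(Flag const &lhs, Flag const &rhs)
{
    return lhs.d_on and rhs.d_on;
}

bool bothOn(bool left, bool right)
{
    Flag lhs = {left};
    Flag rhs = {right};
    return lhs && rhs;      // calls the operator defined as 'operator and'
}
```

Note that, as with any overloaded operator&&, both operands are evaluated before the operator function is called: the short-circuit evaluation of the built-in operator is lost.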
Several of these operators may only be overloaded as member functions within a class. This holds
true for the ’=’, the ’[]’, the ’()’ and the ’->’ operators. Consequently, it isn’t possible to
redefine, e.g., the assignment operator globally in such a way that it accepts a char const * as an
lvalue and a String & as an rvalue. Fortunately, that isn’t necessary either, as we have seen in
section 11.3.
Finally, the following operators cannot be overloaded:
. .* :: ?: sizeof typeid

Chapter 12
Abstract Containers
C++ offers several predefined datatypes, all part of the Standard Template Library, which can be
used to implement solutions to frequently occurring problems. The datatypes discussed in this chapter
are all containers: you can put stuff inside them, and you can retrieve the stored information from
them.
The interesting part is that the kind of data that can be stored inside these containers has been left
unspecified at the time the containers were constructed. That’s why they are spoken of as abstract
containers.
Abstract containers rely heavily on templates, covered in chapter 21 and beyond. To use abstract
containers, only a minimal grasp of the template concept is required. In C++ a template is in fact a
recipe for constructing a function or a complete class. The recipe tries to abstract the functionality
of the class or function as much as possible from the data on which the class or function operates. As
the data types on which the templates operate were not known when the template was implemented,
the datatypes are either inferred from the context in which a function template is used, or they are
mentioned explicitly when a class template is used (the term that’s used here is instantiated). In
situations where the types are explicitly mentioned, the angle bracket notation is used to indicate
which data types are required. For example, below (in section 12.2) we’ll encounter the pair container,
which requires the explicit mentioning of two data types. Here is a pair object containing
both an int and a string:
pair<int, string> myPair;
The object myPair is defined as an object holding both an int and a string.
The angle bracket notation is used intensively in the upcoming discussion of abstract containers.
Actually, understanding this part of templates is the only real requirement for using abstract containers.
Now that we’ve introduced this notation, we can postpone the more thorough discussion of
templates to chapter 21, and concentrate on their use in this chapter.
Most of the abstract containers are sequential containers: they contain data that can be stored and
retrieved in some sequential way. Examples are the array, implementing a fixed-sized array; a
vector, implementing an extendable array; the list, implementing a data structure that allows
for the easy insertion or deletion of data; the queue, also called a FIFO (first in, first out) structure,
in which the first element that is entered is the first element to be retrieved again; and the stack,
which is a first in, last out (FILO or LIFO) structure.
In addition to sequential containers several special containers are available. The pair is a basic
container in which a pair of values (of types that are left open for further specification) can be
stored, like two strings, two ints, a string and a double, etc. Pairs are often used to return data
elements that naturally come in pairs. For example, the map is an abstract container storing keys
and their associated values. Elements of these maps are returned as pairs.
A variant of the pair is the complex container, implementing operations that are defined on complex
numbers.
A tuple (cf. section 22.6) generalizes the pair container to a data structure accommodating any
number of different data types.
All abstract containers described in this chapter as well as the string and stream datatypes (cf.
chapters 5 and 6) are part of the Standard Template Library.
All but the unordered containers support the following basic set of operators:
• The overloaded assignment operator, so we can assign two containers of the same types to
each other. If the container’s data type supports move assignment, then assignment of an
anonymous temporary container to a destination container will use move assignment when
assigning new values to the destination container’s elements. Overloaded assignment is also
supported by the unordered containers;
• Tests for equality: == and !=. The equality operator applied to two containers returns true if
the two containers have the same number of elements, which are pairwise equal according to
the equality operator of the contained data type. The inequality operator does the opposite;
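• Ordering operators: <, <=, > and >=. Containers are compared lexicographically: the < operator
compares corresponding elements, starting at the first ones, and returns true if at the first
position where the elements differ the left-hand side container’s element is less than the
right-hand side container’s element. If no such position exists, the < operator returns true
only if the left-hand side container contains fewer elements than the right-hand side container.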
container left;
container right;
left = {0, 2, 4};
right = {1, 3}; // left < right
right = {1, 3, 6, 1, 2}; // left < right
Note that before a user-defined type (usually a class-type) can be stored in a container, the
user-defined type should at least support:
• A default value (e.g., a default constructor)
• The equality operator (==)
• The less-than operator (<)
Sequential containers can also be initialized using initializer lists.
Most containers (exceptions are the stack (section 12.4.11), priority_queue (section 12.4.5), and
queue (section 12.4.4) containers) support members to determine their maximum sizes (through
their member function max_size).
Virtually all containers support copy construction. If the container supports copy construction and
the container’s data type supports move construction, then move construction is automatically used
for the container’s data elements when a container is initialized with an anonymous temporary
container.
Closely linked to the standard template library are the generic algorithms. These algorithms may
be used to perform frequently occurring tasks or more complex tasks than is possible with the containers
themselves, like counting, filling, merging, filtering etc. An overview of generic algorithms
and their applications is given in chapter 19. Generic algorithms usually rely on the availability of
iterators, representing begin and end-points for processing data stored inside containers. The abstract
containers usually support constructors and members expecting iterators, and they often have
members returning iterators (comparable to the string::begin and string::end members). In
this chapter the iterator concept is not further investigated. Refer to chapter 18 for this.
The URL http://www.sgi.com/Technology/STL is worth visiting as it offers more extensive coverage
of abstract containers and the standard template library than can be provided by the C++
annotations.
Containers often collect data during their lifetimes. When a container goes out of scope, its destructor
tries to destroy its data elements. This only succeeds if the data elements themselves are
stored inside the container. If the data elements of containers are pointers to dynamically allocated
memory then the memory pointed to by these pointers is not destroyed, resulting in a memory leak.
A consequence of this scheme is that the data stored in a container should often be considered the
‘property’ of the container: the container should be able to destroy its data elements when the container’s
destructor is called. So, normally containers should not contain pointers to data. Also, a
container should not be required to contain const data, as const data prevent the use of many of
the container’s members, like the assignment operator.
12.1 Notations used in this chapter
In this chapter about containers, the following notational conventions are used:
• Containers live in the standard namespace. In code examples this will be clearly visible, but
in the text std:: is usually omitted.
• A container without angle brackets represents any container of that type. Mentally add the
required type in angle bracket notation. E.g., pair may represent pair<string, int>.
• The notation Type represents the generic type. Type could be int, string, etc.
• Identifiers object and container represent objects of the container type under discussion.
• The identifier value represents a value of the type that is stored in the container.
• Simple, one-letter identifiers, like n represent unsigned values.
• Longer identifiers represent iterators. Examples are pos, from, beyond.
Some containers, e.g., the map container, contain pairs of values, usually called ‘keys’ and ‘values’.
For such containers the following notational convention is used in addition:
• The identifier key indicates a value of the used key-type
• The identifier keyvalue indicates a value of the ‘value_type’ used with the particular container.
12.2 The ‘pair’ container
The pair container is a rather basic container. It is used to store two elements, called first and
second, and that’s about it. Before using pair containers the header file <utility> must be
included.
The pair’s data types are specified when the pair object is defined (or declared) using the template’s
angle bracket notation (cf. chapter 21). Examples:
pair<string, string> piper("PA28", "PH-ANI");
pair<string, string> cessna("C172", "PH-ANG");
Here, the variables piper and cessna are defined as pair variables containing two strings. Both
strings can be retrieved using the first and second fields of the pair type:
cout << piper.first << '\n' << // shows 'PA28'
cessna.second << '\n'; // shows 'PH-ANG'
The first and second members can also be used to reassign values:
cessna.first = "C152";
cessna.second = "PH-ANW";
If a pair object must be completely reassigned, an anonymous pair object can be used as the right-hand
operand of the assignment. An anonymous variable defines a temporary variable (which receives
no name) solely for the purpose of (re)assigning another variable of the same type. Its generic
form is
type(initializer list)
Note that when a pair object is used the type specification is not completed by just mentioning the
container name pair. It also requires the specification of the data types which are stored within
the pair. For this the (template) angle bracket notation is used again. E.g., the reassignment of the
cessna pair variable could have been accomplished as follows:
cessna = pair<string, string>("C152", "PH-ANW");
In cases like these, the type specification can become quite elaborate, which has caused a revival
of interest in the possibilities offered by the typedef keyword. If many pair<type1, type2>
clauses are used in a source, the typing effort may be reduced and readability might be improved by
first defining a name for the clause, and then using the defined name later. E.g.,
typedef pair<string, string> pairStrStr;
cessna = pairStrStr("C152", "PH-ANW");
Apart from this (and the basic set of operations (assignment and comparisons)) the pair offers no
further functionality. It is, however, a basic ingredient of the upcoming abstract containers map,
multimap and hash_map.
C++ also offers a generalized pair container: the tuple, covered in section 22.6.
12.3 Allocators
Most containers use a special object for allocating the memory that is managed by them. This object
is called an allocator, and its type is (usually by default) specified when a container is constructed. A
container’s allocator can be obtained using the container’s get_allocator member, which returns
a copy of the allocator used by the container. Allocators offer the following members:
• value_type *address(value_type &object)
returns the address of object.
• value_type *allocate(size_t count)
allocates raw memory for holding count values of the container’s value_type.
• void construct(value_type *object, Arg &&...args)
using placement new, uses the arguments following object to install a value at
object.
• void destroy(value_type *object)
calls object’s destructor (but doesn’t deallocate object’s own memory).
• void deallocate(value_type *object, size_t count)
calls operator delete to delete object’s memory, previously allocated by
allocate.
• size_t max_size()
returns the maximum number of elements that allocate can allocate.
Here is an example, using the allocator of a vector of strings (see section 12.4.2 below for a description
of the vector container):
#include <iostream>
#include <vector>
#include <string>
using namespace std;

int main()
{
    vector<string> vs;
    auto allocator = vs.get_allocator();        // get the allocator
    string *sp = allocator.allocate(3);         // alloc. space for 3 strings
    allocator.construct(&sp[0], "hello world"); // initialize 1st string
    allocator.construct(&sp[1], sp[0]);         // use the copy constructor
    allocator.construct(&sp[2], 12, '=');       // string of 12 = chars
    cout << sp[0] << '\n' <<                    // show the strings
            sp[1] << '\n' <<
            sp[2] << '\n' <<
            "could have allocated " << allocator.max_size() << " strings\n";
    for (size_t idx = 0; idx != 3; ++idx)
        allocator.destroy(sp + idx);            // delete the string's
                                                // contents
    allocator.deallocate(sp, 3);                // and delete sp itself again.
}
12.4 Available Containers
12.4.1 The ‘array’ container
The array class implements a fixed-size array. Before using the array container the <array>
header file must be included.
To define a std::array both the data type of its elements and its size must be specified: the
data type is given after an opening angular bracket, immediately following the ‘array’ container
name. The array’s size is provided after the data type specification. Finally, a closing angular
bracket completes the array’s type. Specifications like this are common practice with containers.
The combination of array, type and size defines a type. As a result, array<string, 4> defines
another type than array<string, 5>, and a function explicitly defining an array<Type, N>
parameter will not accept an array<Type, M> argument if N and M are unequal.
The array’s size may be defined as 0 (although such an array probably has little use as it cannot
store any element). The elements of an array are stored contiguously. If array<Type, N> arr has
been defined, then &arr[n] + m == &arr[n + m], assuming 0 <= n < N and assuming 0 <= n
+ m < N.
The following constructors, operators, and member functions are available:
• Constructors:
– The copy and move constructors are available;
– An array may be constructed with a fixed number N of default elements:
array<string, N> object;
– An initial subset of the elements of an array may be initialized using a brace delimited
initializer list:
array<double, 4> dArr = {1.2, 2.4};
Here dArr is defined as an array of 4 elements, with dArr[0] and dArr[1] initialized to,
respectively 1.2 and 2.4, and dArr[2] and dArr[3] initialized to 0. An attractive characteristic
of arrays is that elements not mentioned in the initializer list are initialized to
the data type’s default value. The data type’s default constructor is used for this initialization.
With non-class data types the value 0 is used. So, for an array<double, 4> array
we know that all but its explicitly initialized elements are initialized to zero. Note, however,
that the elements of a local array of a non-class data type that is defined without any initializer
list (e.g., array<double, 4> dArr;) are not initialized. An empty initializer list
(array<double, 4> dArr{};) initializes all its elements to zero.
• In addition to the standard operators for containers, the array supports the index operator,
which can be used to retrieve or reassign individual elements of the array. Note that the elements
which are indexed must exist. For example, having defined an empty array a statement
like iarr[0] = 18 produces an error, as the array is empty. Note that operator[] does
not respect its array bounds. If you want run-time array bound checking, use the array’s at
member.
• The array class offers the following member functions:
– Type &at(size_t idx):
returns a reference to the array’s element at index position idx. If idx exceeds
the array’s size a std::out_of_range exception is thrown.
– Type &back():
returns a reference to the last element in the array. It is the responsibility of the
programmer to use the member only if the array is not empty.
– array::iterator begin():
returns an iterator pointing to the first element in the array, returning end if the
array is empty.
– array::const_iterator cbegin():
returns a const_iterator pointing to the first element in the array, returning cend
if the array is empty.
– array::const_iterator cend():
returns a const_iterator pointing just beyond the array’s last element.
– array::const_reverse_iterator crbegin():
returns a const_reverse_iterator pointing to the last element in the array, returning
crend if the array is empty.
– array::const_reverse_iterator crend():
returns a const_reverse_iterator pointing just before the array’s first element.
– value_type *data():
returns a pointer to the array’s first data element. With a const array a
value_type const * is returned.
– bool empty():
returns true if the array contains no elements.
– array::iterator end():
returns an iterator pointing beyond the last element in the array.
– void fill(Type const &item):
fills all the array’s elements with a copy of item.
– Type &front():
returns a reference to the first element in the array. It is the responsibility of the
programmer to use the member only if the array is not empty.
– array::reverse_iterator rbegin():
this member returns an iterator pointing to the last element in the array.
– array::reverse_iterator rend():
returns an iterator pointing before the first element in the array.
– constexpr size_t size():
returns the number of elements the array contains.
– void swap(array<Type, N> &other):
swaps the contents of the current and other array. The array other’s data type
and size must be equal to the data type and size of the object calling swap.
Using an array rather than a standard C style array offers several advantages:
• All its elements can immediately be initialized (e.g., array<int, 4> arr{} initializes all elements to 0);
• Introspection is possible (e.g., size can be used);
• The array container can be used in the context of templates, where code is developed that
operates on data types that become available only after the code itself has been developed;
• Since array supports reverse iterators, it can immediately be used with generic algorithms
performing ‘reversed’ operations (e.g., to perform a descending rather than ascending sort (cf.
section 19.1.58)).
In general, when looking for a sequential data structure, the array or vector (introduced in the
next section) should be your ‘weapon of choice’. Only if these containers demonstrably do not fit the
problem at hand should you use another type of container.
12.4.2 The ‘vector’ container
The vector class implements an expandable array. Before using the vector container the
<vector> header file must be included.
The following constructors, operators, and member functions are available:
• Constructors:
– The copy and move constructors are available;
– A vector may be constructed empty:
vector<string> object;
– A vector may be initialized to a certain number of elements:
vector<string> object(5, string("Hello")); // initialize to 5 Hello’s,
vector<string> container(10); // and to 10 empty strings
vector<string> names = {"george", "frank", "tony", "karel"};
– A vector may be initialized using iterators. To initialize a vector with elements 5 until 10
(including the last one) of an existing vector<string> the following construction may
be used:
extern vector<string> container;
vector<string> object(&container[5], &container[11]);
Note here that the last element pointed to by the second iterator (&container[11]) is
not stored in object. This is a simple example of the use of iterators, in which the range
of values that is used starts at the first value, and includes all elements up to but not
including the element to which the second iterator refers. The standard notation for this
is [begin, end).
• In addition to the standard operators for containers, the vector supports the index operator,
which can be used to retrieve or reassign individual elements of the vector. Note that the elements
which are indexed must exist. For example, having defined an empty vector a statement
like ivect[0] = 18 produces an error, as the vector is empty. So, the vector is not automatically
expanded, and operator[] does not respect its array bounds. In this case the vector
should be resized first, or ivect.push_back(18) should be used (see below). If you need
run-time array bound checking, use the vector’s at member.
• The vector class offers the following member functions:
– void assign(...):
assigns new contents to the vector:
* assign(iterator begin, iterator end) assigns the values at the iterator
range [begin, end) to the vector;
* assign(size_type n, value_type const &val) assigns n copies of val to the
vector;
* assign(initializer_list<value_type> values) assigns the values in the
initializer list to the vector.
– Type &at(size_t idx):
returns a reference to the vector’s element at index position idx. If idx exceeds
the vector’s size a std::out_of_range exception is thrown.
– Type &back():
returns a reference to the last element in the vector. It is the responsibility of the
programmer to use the member only if the vector is not empty.
– vector::iterator begin():
returns an iterator pointing to the first element in the vector, returning end if the
vector is empty.
– size_t capacity():
Number of elements for which memory has been allocated. It returns at least the
value returned by size
– vector::const_iterator cbegin():
returns a const_iterator pointing to the first element in the vector, returning cend
if the vector is empty.
– vector::const_iterator cend():
returns a const_iterator pointing just beyond the vector’s last element.
– void clear():
erases all the vector’s elements.
– vector::const_reverse_iterator crbegin():
returns a const_reverse_iterator pointing to the last element in the vector, returning
crend if the vector is empty.
– vector::const_reverse_iterator crend():
returns a const_reverse_iterator pointing just before the vector’s first element.
– value_type *data():
returns a pointer to the vector’s first data element.
– iterator emplace(const_iterator position, Args &&...args):
a value_type object is constructed from the arguments specified after
position, and the newly created element is inserted at position.
– void emplace_back(Args &&...args):
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted beyond the vector’s last element.
– bool empty():
returns true if the vector contains no elements.
– vector::iterator end():
returns an iterator pointing beyond the last element in the vector.
– vector::iterator erase():
erases a specific range of elements in the vector:
* erase(pos) erases the element pointed to by the iterator pos. The iterator ++pos is
returned.
* erase(first, beyond) erases elements indicated by the iterator range [first,
beyond), returning beyond.
– Type &front():
returns a reference to the first element in the vector. It is the responsibility of the
programmer to use the member only if the vector is not empty.
– allocator_type get_allocator() const:
returns a copy of the allocator object used by the vector object.
– ... insert():
elements may be inserted starting at a certain position. The return value depends
on the version of insert() that is called:
* vector::iterator insert(pos) inserts a default value of type Type at pos, pos
is returned.
* vector::iterator insert(pos, value) inserts value at pos, pos is returned.
* void insert(pos, first, beyond) inserts the elements in the iterator range
[first, beyond).
* void insert(pos, n, value) inserts n elements having value value at position
pos.
– size_t max_size():
returns the maximum number of elements this vector may contain.
– void pop_back():
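removes the last element from the vector. Calling pop_back for an empty vector results in
undefined behavior: it is the responsibility of the programmer to use this member only if
the vector is not empty.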
– void push_back(value):
adds value to the end of the vector.
– vector::reverse_iterator rbegin():
this member returns an iterator pointing to the last element in the vector.
– vector::reverse_iterator rend():
returns an iterator pointing before the first element in the vector.
– void reserve(size_t request):
if request is less than or equal to capacity, this call has no effect. Otherwise, it
is a request to allocate additional memory. If the call is successful, then capacity
returns a value of at least request. Otherwise, capacity is unchanged. In
either case, size’s return value won’t change, until a function like resize is
called, actually changing the number of accessible elements.
– void resize():
can be used to alter the number of elements that are currently stored in the vector:
* resize(n, value) may be used to resize the vector to a size of n. Value is optional.
If the vector is expanded and value is not provided, the additional elements are initialized
to the default value of the used data type, otherwise value is used to initialize
extra elements.
– void shrink_to_fit():
optionally reduces the amount of memory allocated by a vector to its current size.
The implementor is free to ignore or otherwise optimize this request. In order to
guarantee a ‘shrink to fit’ operation the
vector<Type>(vectorObject).swap(vectorObject)
idiom can be used.
Figure 12.1: A list data-structure
– size_t size():
returns the number of elements in the vector.
– void swap():
swaps two vectors using identical data types. Example:
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> v1(7);
    vector<int> v2(10);
    v1.swap(v2);
    cout << v1.size() << " " << v2.size() << '\n';
}
/*
    Produced output:
10 7
*/
12.4.3 The ‘list’ container
The list container implements a list data structure. Before using a list container the header file
<list> must be included.
The organization of a list is shown in figure 12.1. Figure 12.1 shows that a list consists of separate
list-elements, connected by pointers. The list can be traversed in two directions: starting at Front
the list may be traversed from left to right, until the 0-pointer is reached at the end of the rightmost
list-element. The list can also be traversed from right to left: starting at Back, the list is traversed
from right to left, until eventually the 0-pointer emanating from the leftmost list-element is reached.
As a subtlety note that the representation given in figure 12.1 is not necessarily used in actual
implementations of the list. For example, consider the following little program:
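#include <iostream>
#include <list>
using namespace std;

int main()
{
    list<int> l;
    cout << "size: " << l.size() << ", first element: " <<
            l.front() << '\n';  // error: refers to an element of an empty list
}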
When this program is run it might actually produce the output:
size: 0, first element: 0
Its front element can even be assigned a value. In this case the implementor has chosen to provide
the list with a hidden element. The list actually is a circular list, where the hidden element serves
as terminating element, replacing the 0-pointers in figure 12.1. As noted, this is a subtlety, which
doesn’t affect the conceptual notion of a list as a data structure ending in 0-pointers. Note also that
it is well known that various implementations of list-structures are possible (cf. Aho, A.V., Hopcroft
J.E. and Ullman, J.D., (1983) Data Structures and Algorithms (Addison-Wesley)).
Both lists and vectors are often appropriate data structures in situations where an unknown number
of data elements must be stored. However, there are some rules of thumb to follow when selecting
the appropriate data structure.
• When most accesses are random, a vector is the preferred data structure. Example: in a
program counting character frequencies in a text file, a vector<int> frequencies(256) is
the data structure of choice, as the values of the received characters can be used as indices into
the frequencies vector.
• The previous example illustrates a second rule of thumb, also favoring the vector: if the
number of elements is known in advance (and does not notably change during the lifetime of
the program), the vector is also preferred over the list.
• In cases where insertions or deletions prevail and the data structure is large the list is generally
preferred.
At present lists aren’t as useful anymore as they used to be (when computers were much slower and
more memory-constrained). Except maybe for some rare cases, a vector should be the preferred
container; even when implementing algorithms traditionally using lists.
Other considerations related to the choice between lists and vectors should also be given some
thought. Although it is true that the vector is able to grow dynamically, the dynamic growth requires
data-copying. Clearly, copying a million large data structures takes a considerable amount
of time, even on fast computers. On the other hand, inserting a large number of elements in a list
doesn’t require us to copy non-involved data. Inserting a new element in a list merely requires us
to juggle some pointers. In figure 12.2 this is shown: a new element is inserted between the second
and third element, creating a new list of four elements. Removing an element from a list is also
fairly easy. Starting again from the situation shown in figure 12.1, figure 12.3 shows what happens
if element two is removed from our list. Again: only pointers need to be juggled. In this case it’s even
simpler than adding an element: only two pointers need to be rerouted. To summarize the comparison
between lists and vectors: it’s probably best to conclude that there is no clear-cut answer to the
question what data structure to prefer. There are rules of thumb, which may be adhered to. But if
worse comes to worst, a profiler may be required to find out what’s best.
The list container offers the following constructors, operators, and member functions:
• Constructors:
– The copy and move constructors are available;
– A list may be constructed empty:
list<string> object;
As with the vector, it is an error to refer to an element of an empty list.
Figure 12.2: Adding a new element to a list
Figure 12.3: Removing an element from a list
– A list may be initialized to a certain number of elements. By default, if the initialization
value is not explicitly mentioned, the default value or default constructor for the actual
data type is used. For example:
list<string> object(5, string("Hello")); // initialize to 5 Hello’s
list<string> container(10); // and to 10 empty strings
– A list may be initialized using two iterators. To initialize a list with elements 5 until 10
(including the last one) of a vector<string> the following construction may be used:
extern vector<string> container;
list<string> object(&container[5], &container[11]);
• The list does not offer specialized operators, apart from the standard operators for containers.
• The following member functions are available:
– void assign(...):
assigns new contents to the list:
* assign(iterator begin, iterator end) assigns the values at the iterator
range [begin, end) to the list;
* assign(size_type n, value_type const &val) assigns n copies of val to the
list;
– Type &back():
returns a reference to the last element in the list. It is the responsibility of the
programmer to use this member only if the list is not empty.
– list::iterator begin():
returns an iterator pointing to the first element in the list, returning end if the
list is empty.
– void clear():
erases all elements from the list.
– bool empty():
returns true if the list contains no elements.
– list::iterator end():
returns an iterator pointing beyond the last element in the list.
– list::iterator erase():
erases a specific range of elements in the list:
* erase(pos) erases the element pointed to by pos. The iterator ++pos is returned.
* erase(first, beyond) erases elements indicated by the iterator range [first,
beyond). The iterator beyond is returned.
– Type &front():
returns a reference to the first element in the list. It is the responsibility of the
programmer to use this member only if the list is not empty.
– allocator_type get_allocator() const:
returns a copy of the allocator object used by the list object.
– ... insert():
inserts elements into the list. The return value depends on the version of insert
that is called:
_ list::iterator insert(pos) inserts a default value of type Type before pos; an iterator
pointing to the inserted element is returned.
_ list::iterator insert(pos, value) inserts value before pos; an iterator pointing to the
inserted element is returned.
_ void insert(pos, first, beyond) inserts the elements in the iterator range
[first, beyond).
_ void insert(pos, n, value) inserts n elements having value value at position
pos.
– size_t max_size():
returns the maximum number of elements this list may contain.
– void merge(list<Type> &other):
this member function assumes that the current and other lists are sorted (see
below, the member sort). Based on that assumption, it moves the elements of
other into the current list in such a way that the modified list remains sorted,
leaving other empty. If the lists are not sorted, the resulting list will be ordered
‘as much as possible’, given the initial ordering of the elements in the two lists.
list<Type>::merge uses Type::operator< to compare the elements, which operator
must therefore be available. The next example illustrates the use of the merge
member: the list ‘second’ is not sorted, so the resulting list is ordered ‘as much
as possible’.
#include <iostream>
#include <string>
#include <list>
using namespace std;

// displays all elements of target on a single line
void showlist(list<string> &target)
{
    for
    (
        list<string>::iterator from = target.begin();
        from != target.end();
        ++from
    )
        cout << *from << " ";
    cout << '\n';
}

int main()
{
    list<string> first;
    list<string> second;

    first.push_back(string("alpha"));
    first.push_back(string("bravo"));
    first.push_back(string("golf"));
    first.push_back(string("quebec"));

    second.push_back(string("oscar"));      // second is not sorted
    second.push_back(string("mike"));
    second.push_back(string("november"));
    second.push_back(string("zulu"));

    first.merge(second);
    showlist(first);
}
A subtlety is that merge doesn’t alter the list if the list itself is used as argument:
object.merge(object) won’t change the list ‘object’.
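– void pop_back():
removes the last element from the list. Calling this member for an empty list
results in undefined behavior: use it only if the list is not empty.
– void pop_front():
removes the first element from the list. Calling this member for an empty list
results in undefined behavior: use it only if the list is not empty.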
– void push_back(value):
adds value to the end of the list.
– void push_front(value):
adds value before the first element of the list.
– list::reverse_iterator rbegin():
returns an iterator pointing to the last element in the list.
– void remove(value):
removes all occurrences of value from the list. In the following example, the two
strings ‘Hello’ are removed from the list object:
#include <iostream>
#include <string>
#include <list>
using namespace std;

int main()
{
    list<string> object;

    object.push_back(string("Hello"));
    object.push_back(string("World"));
    object.push_back(string("Hello"));
    object.push_back(string("World"));

    object.remove(string("Hello"));     // removes both "Hello" strings

    while (object.size())
    {
        cout << object.front() << '\n';
        object.pop_front();
    }
}
/*
Generated output:
World
World
*/
– void remove_if(Predicate pred):
removes all elements from the list for which the predicate function or function
object pred returns true. For each of the objects stored in the list the predicate
is called as pred(*iter), where iter represents the iterator used internally
by remove_if. If a function pred is used, its prototype should be bool
pred(value_type const &object).
– list::reverse_iterator rend():
this member returns an iterator pointing before the first element in the
list.
– void resize():
alters the number of elements that are currently stored in the list:
_ resize(n, value) may be used to resize the list to a size of n. Value is
optional. If the list is expanded and value is not provided, the extra elements
are initialized to the default value of the used data type, otherwise value is
used to initialize extra elements.
– void reverse():
reverses the order of the elements in the list. The element back becomes
front and vice versa.
– size_t size():
returns the number of elements in the list.
– void sort():
sorts the list. An example of its use is given at the description of the
unique member function below. list<Type>::sort uses Type::operator<
to sort the data in the list, which operator must therefore be available.
– void splice(pos, object):
transfers the contents of object to the current list (the list whose splice
member is called), inserting them before the iterator position pos of the
current list. Following splice, object is empty. For example:
#include <iostream>
#include <string>
#include <list>
using namespace std;

int main()
{
    list<string> object;

    object.push_front(string("Hello"));
    object.push_back(string("World"));

    list<string> argument(object);

    // insert argument's elements before object's second element
    object.splice(++object.begin(), argument);

    cout << "Object contains " << object.size() << " elements, " <<
            "Argument contains " << argument.size() <<
            " elements,\n";

    while (object.size())
    {
        cout << object.front() << '\n';
        object.pop_front();
    }
}
Alternatively, argument may be followed by an iterator of argument, indicating
the single element of argument that should be spliced, or by two
iterators begin and end defining the iterator-range [begin, end) on
argument that should be spliced into object.
– void swap():
swaps two lists using identical data types.
– void unique():
operating on a sorted list, this member function removes all consecutively
identical elements from the list. list<Type>::unique uses
Type::operator== to identify identical data elements, which operator
must therefore be available. Here’s an example removing all multiply occurring
words from the list:
#include <iostream>
#include <string>
#include <list>
using namespace std;

// see the merge() example
void showlist(list<string> &target)
{
    for
    (
        list<string>::iterator from = target.begin();
        from != target.end();
        ++from
    )
        cout << *from << " ";
    cout << '\n';
}

int main()
{
    string
        array[] =
        {
            "charley",
            "alpha",
            "bravo",
            "alpha"
        };

    list<string>
        target
        (
            array, array + sizeof(array)
            / sizeof(string)
        );

    cout << "Initially we have:\n";
    showlist(target);

    target.sort();
    cout << "After sort() we have:\n";
    showlist(target);

    target.unique();
    cout << "After unique() we have:\n";
    showlist(target);
}
/*
Generated output:
Initially we have:
charley alpha bravo alpha
After sort() we have:
alpha alpha bravo charley
After unique() we have:
alpha bravo charley
*/
12.4.4 The ‘queue’ container
The queue class implements a queue data structure. Before using a queue container the header file
<queue> must be included.
A queue is depicted in figure 12.4. The figure shows that a queue has one point (the back)
where items can be added to the queue, and one point (the front) where items can be removed (read)
from the queue. A queue is therefore also called a FIFO data structure, for first in, first out. It
is most often used in situations where events should be handled in the same order as they are
generated.
The following constructors, operators, and member functions are available for the queue container:
• Constructors:
– The copy and move constructors are available;
Figure 12.4: A queue data-structure
– A queue may be constructed empty:
queue<string> object;
As with the vector, it is an error to refer to an element of an empty queue.
• The queue container only supports the basic container operators.
• The following member functions are available for queues:
– Type &back():
returns a reference to the last element in the queue. It is the responsibility of the
programmer to use the member only if the queue is not empty.
– bool empty():
returns true if the queue contains no elements.
– Type &front():
returns a reference to the first element in the queue. It is the responsibility of the
programmer to use the member only if the queue is not empty.
– void pop():
removes the element at the front of the queue. Note that the element is not returned
by this member. Calling pop for an empty queue results in undefined behavior, so the
member should only be used if the queue is not empty. One might wonder why pop
returns void, instead of a value of type Type
(cf. front). One reason is found in the principles of good software design: functions
should perform one task. Combining the removal and return of the removed
element breaks this principle. Moreover, when this principle is abandoned pop’s
implementation is always flawed. Consider the prototypical implementation of a
pop member that is supposed to return the just popped value:
Type queue::pop()
{
    Type ret(front());
    erase_front();
    return ret;
}
The venom, as usual, is in the tail: since queue has no control over Type’s behavior
the final statement (return ret) might throw. By that time the queue’s front
element has already been removed from the queue and so it is lost. Thus, a Type
returning pop member cannot offer the strong guarantee and consequently pop
should not return the former front element. Because of all this, we must first
use front and then pop to obtain and remove the queue’s front element.
– void push(value):
this member adds value to the back of the queue.
– size_t size():
returns the number of elements in the queue.
Note that the queue does not support iterators or a subscript operator. The only elements that can
be accessed are its front and back element. A queue can be emptied by:
• repeatedly removing its front element;
• assigning an empty queue using the same data type to it;
• having its destructor called.
12.4.5 The ‘priority_queue’ container
The priority_queue class implements a priority queue data structure. Before using a
priority_queue container the <queue> header file must have been included.
A priority queue is identical to a queue, but allows the entry of data elements according to priority
rules. A real-life priority queue is found, e.g., at airport check-in terminals. At a terminal the
passengers normally stand in line to wait for their turn to check in, but late passengers are usually
allowed to jump the queue: they receive a higher priority than other passengers.
The priority queue uses operator< of the data type stored in the priority queue to decide about the
priority of the data elements. The smaller the value, the lower the priority. So, the priority queue
could be used to sort values while they arrive. A simple example of such a priority queue application
is the following program: it reads words from cin and writes a sorted list of words to cout:
#include <iostream>
#include <string>
#include <queue>
using namespace std;

int main()
{
    priority_queue<string> q;
    string word;

    while (cin >> word)
        q.push(word);

    while (q.size())
    {
        cout << q.top() << '\n';    // the largest word is at the top
        q.pop();
    }
}
Unfortunately, the words are listed in reversed order: because of the underlying <-operator the
words appearing later in the ASCII-sequence appear first in the priority queue. A solution to that
problem is to define a wrapper class around the string datatype, reversing string’s operator<.
Here is the modified program:
#include <iostream>
#include <string>
#include <queue>

class Text
{
    std::string d_s;

    public:
        Text(std::string const &str)
        :
            d_s(str)
        {}
        operator std::string const &() const
        {
            return d_s;
        }
        // reversed comparison: smaller strings receive a higher priority
        bool operator<(Text const &right) const
        {
            return d_s > right.d_s;
        }
};

using namespace std;

int main()
{
    priority_queue<Text> q;
    string word;

    while (cin >> word)
        q.push(word);

    while (q.size())
    {
        word = q.top();
        cout << word << '\n';
        q.pop();
    }
}
Other possibilities to achieve the same exist. One would be to store the contents of the priority queue
in, e.g., a vector, from which the elements can be read in reversed order.
The following constructors, operators, and member functions are available for the priority_queue
container:
• Constructors:
– The copy and move constructors are available;
– A priority_queue may be constructed empty:
priority_queue<string> object;
As with the vector, it is an error to refer to an element of an empty priority queue.
• The priority_queue only supports the basic operators of containers.
• The following member functions are available for priority queues:
– bool empty():
returns true if the priority queue contains no elements.
– void pop():
removes the element at the top of the priority queue. Note that the element is not
returned by this member. Calling pop for an empty priority queue results in undefined
behavior, so the member should only be used if the priority queue is not empty. See
section 12.4.4 for a discussion about the reason why pop has return type void.
– void push(value):
inserts value at the appropriate position in the priority queue.
– size_t size():
returns the number of elements in the priority queue.
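– Type const &top():
returns a reference to the top element of the priority queue, i.e., the element
having the highest priority. The element cannot be modified through this
reference. It is the responsibility of the programmer to use the member only
if the priority queue is not empty.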
Note that the priority queue does not support iterators or a subscript operator. The only element
that can be accessed is its top element. A priority queue can be emptied by:
• repeatedly removing its top element;
• assigning an empty priority queue using the same data type to it;
• having its destructor called.
12.4.6 The ‘deque’ container
The deque (pronounce: ‘deck’) class implements a double-ended queue data structure (deque). Before
using a deque container the header file <deque> must be included.
A deque is comparable to a queue, but it allows for reading and writing at both ends. Actually, the
deque data type supports a lot more functionality than the queue, as illustrated by the following
overview of available member functions. A deque is a combination of a vector and two queues,
operating at both ends of the vector. In situations where random insertions and the addition and/or
removal of elements at one or both sides of the vector occur frequently, using a deque should be
considered.
The following constructors, operators, and member functions are available for deques:
• Constructors:
– The copy and move constructors are available;
– A deque may be constructed empty:
deque<string> object;
As with the vector, it is an error to refer to an element of an empty deque.
– A deque may be initialized to a certain number of elements. By default, if the initialization
value is not explicitly mentioned, the default value or default constructor for the actual
data type is used. For example:
deque<string> object(5, string("Hello")); // initialize to 5 Hello’s
deque<string> container(10); // and to 10 empty strings
– A deque may be initialized using two iterators. To initialize a deque with elements 5 until
10 (including the last one) of a vector<string> the following construction may be used:
extern vector<string> container;
deque<string> object(&container[5], &container[11]);
• In addition to the standard operators for containers, the deque supports the index operator,
which may be used to retrieve or reassign random elements of the deque. Note that the indexed
elements must exist.
• The following member functions are available for deques:
– void assign(...):
assigns new contents to the deque:
_ assign(iterator begin, iterator end) assigns the values at the iterator
range [begin, end) to the deque;
_ assign(size_type n, value_type const &val) assigns n copies of val to the
deque;
– Type &at(size_t idx):
returns a reference to the deque’s element at index position idx. If idx is not
smaller than the deque’s size a std::out_of_range exception is thrown.
– Type &back():
returns a reference to the last element in the deque. It is the responsibility of the
programmer to use the member only if the deque is not empty.
– deque::iterator begin():
returns an iterator pointing to the first element in the deque.
– deque::const_iterator cbegin():
returns a const_iterator pointing to the first element in the deque, returning cend
if the deque is empty.
– deque::const_iterator cend():
returns a const_iterator pointing just beyond the deque’s last element.
– void clear():
erases all elements in the deque.
– deque::const_reverse_iterator crbegin():
returns a const_reverse_iterator pointing to the last element in the deque, returning
crend if the deque is empty.
– deque::const_reverse_iterator crend():
returns a const_reverse_iterator pointing just before the deque’s first element.
– iterator emplace(const_iterator position, Args &&...args)
a value_type object is constructed from the arguments specified after
position, and the newly created element is inserted at position.
– void emplace_back(Args &&...args)
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted beyond the deque’s last element.
– void emplace_front(Args &&...args)
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted before the deque’s first element.
– bool empty():
returns true if the deque contains no elements.
– deque::iterator end():
returns an iterator pointing beyond the last element in the deque.
– deque::iterator erase():
the member can be used to erase a specific range of elements in the deque:
_ erase(pos) erases the element pointed to by pos. An iterator pointing to the element following the erased element is returned.
_ erase(first, beyond) erases elements indicated by the iterator range [first,
beyond). Beyond is returned.
– Type &front():
returns a reference to the first element in the deque. It is the responsibility of the
programmer to use the member only if the deque is not empty.
– allocator_type get_allocator() const:
returns a copy of the allocator object used by the deque object.
– ... insert():
inserts elements starting at a certain position. The return value depends on the
version of insert that is called:
_ deque::iterator insert(pos) inserts a default value of type Type before pos; an iterator
pointing to the inserted element is returned.
_ deque::iterator insert(pos, value) inserts value before pos; an iterator pointing to the
inserted element is returned.
_ void insert(pos, first, beyond) inserts the elements in the iterator range
[first, beyond).
_ void insert(pos, n, value) inserts n elements having value value starting at
iterator position pos.
– size_t max_size():
returns the maximum number of elements this deque may contain.
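– void pop_back():
removes the last element from the deque. Calling this member for an empty deque
results in undefined behavior: use it only if the deque is not empty.
– void pop_front():
removes the first element from the deque. Calling this member for an empty deque
results in undefined behavior: use it only if the deque is not empty.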
– void push_back(value):
adds value to the end of the deque.
– void push_front(value):
adds value before the first element of the deque.
– deque::reverse_iterator rbegin():
returns an iterator pointing to the last element in the deque.
– deque::reverse_iterator rend():
this member returns an iterator pointing before the first element in the deque.
– void resize():
alters the number of elements that are currently stored in the deque:
_ resize(n, value) may be used to resize the deque to a size of n. Value is optional.
If the deque is expanded and value is not provided, the additional elements are initialized
to the default value of the used data type, otherwise value is used to initialize
extra elements.
– void shrink_to_fit():
optionally reduces the amount of memory allocated by a deque to its
current size. The implementor is free to ignore or otherwise optimize
this request. In order to guarantee a ‘shrink to fit’ operation the
deque<Type>(dequeObject).swap(dequeObject) idiom can be used.
– size_t size():
returns the number of elements in the deque.
– void swap(argument):
swaps two deques using identical data types.
12.4.7 The ‘map’ container
The map class offers a (sorted) associative array. Before using a map container the <map> header
file must be included.
A map is filled with key/value pairs, which may be of any container-accepted type. Since types are
associated with both the key and the value, we must specify two types in the angle bracket notation,
comparable to the specification we’ve seen with the pair container (cf. section 12.2). The first type
represents the key’s type, the second type represents the value’s type. For example, a map in which
the key is a string and the value is a double can be defined as follows:
map<string, double> object;
The key is used to access its associated information. That information is called the value. For
example, a phone book uses the names of people as the key, and uses the telephone number and
maybe other information (e.g., the zip-code, the address, the profession) as value. Since a map sorts
its keys, the key’s operator< must be defined, and it must be sensible to use it. For example, it is
generally a bad idea to use pointers for keys, as sorting pointers is something different than sorting
the values pointed at by those pointers.
The two fundamental operations on maps are the storage of Key/Value combinations, and the retrieval
of values, given their keys. The index operator using a key as the index, can be used for both.
If the index operator is used as lvalue, the expression’s rvalue is inserted into the map. If it is used
as rvalue, the key’s associated value is retrieved. Each key can be stored only once in a map. If the
same key is assigned again using the index operator, the new value replaces the formerly stored value, which is lost.
A specific key/value combination can implicitly or explicitly be inserted into a map. If explicit insertion
is required, the key/value combination must be constructed first. For this, every map defines a
value_type which may be used to create values that can be stored in the map. For example, a value
for a map<string, int> can be constructed as follows:
map<string, int>::value_type siValue("Hello", 1);
The value_type is associated with the map<string, int>: the type of the key is string, the
type of the value is int. Anonymous value_type objects are also often used. E.g.,
map<string, int>::value_type("Hello", 1);
Instead of using the line map<string, int>::value_type(...) over and over again, a
typedef is frequently used to reduce typing and to improve readability:
typedef map<string, int>::value_type StringIntValue;
Using this typedef, values for the map<string, int> may now be constructed using:
StringIntValue("Hello", 1);
Alternatively, pairs may be used to represent key/value combinations used by maps:
pair<string, int>("Hello", 1);
12.4.7.1 The ‘map’ constructors
The following constructors are available for the map container:
• The copy and move constructors are available;
• A map may be constructed empty:
map<string, int> object;
Note that the values stored in maps may be containers themselves. For example, the following
defines a map in which the value is a pair: a container nested under another container:
map<string, pair<string, string>> object;
Note the use of the two consecutive closing angle brackets, which does not result in ambiguities
as their syntactical context differs from their use as binary operators in expressions.
• A map may be initialized using two iterators. The iterators may either point to value_type
values for the map to be constructed, or to plain pair objects. If pairs are used, their first
element represents the type of the keys, and their second element represents the type of the
values. Example:
pair<string, int> pa[] =
{
pair<string,int>("one", 1),
pair<string,int>("two", 2),
pair<string,int>("three", 3),
};
map<string, int> object(&pa[0], &pa[3]);
In this example, map<string, int>::value_type could have been written instead of
pair<string, int> as well.
If begin represents the first iterator that is used to construct a map and if end represents the
second iterator, [begin, end) will be used to initialize the map. Maybe contrary to intuition,
the map constructor only enters new keys. If the last element of pa would have been "one", 3,
only two elements would have entered the map: "one", 1 and "two", 2. The value "one",
3 would silently have been ignored.
The map receives its own copies of the data to which the iterators point as illustrated by the
following example:
#include <iostream>
#include <string>
#include <map>
using namespace std;

// reports which constructors and destructor are called
class MyClass
{
    public:
        MyClass()
        {
            cout << "MyClass constructor\n";
        }
        MyClass(MyClass const &other)
        {
            cout << "MyClass copy constructor\n";
        }
        ~MyClass()
        {
            cout << "MyClass destructor\n";
        }
};

int main()
{
    pair<string, MyClass> pairs[] =
    {
        pair<string, MyClass>("one", MyClass())
    };
    cout << "pairs constructed\n";

    map<string, MyClass> mapsm(&pairs[0], &pairs[1]);
    cout << "mapsm constructed\n";
}
/*
Generated output:
MyClass constructor
MyClass copy constructor
MyClass destructor
pairs constructed
MyClass copy constructor
MyClass copy constructor
MyClass destructor
mapsm constructed
MyClass destructor
MyClass destructor
*/
When tracing the output of this program, we see that, first, the constructor of a MyClass object
is called to initialize the anonymous element of the array pairs. This object is then copied
into the first element of the array pairs by the copy constructor. Next, the original element is
not required anymore and is destroyed. At that point the array pairs has been constructed.
Thereupon, the map constructs a temporary pair object, which is used to construct the map
element. Having constructed the map element, the temporary pair object is destroyed. Eventually,
when the program terminates, the pair element stored in the map is destroyed too.
12.4.7.2 The ‘map’ operators
The map supports, in addition to the standard operators for containers, the index operator.
The index operator may be used to retrieve or reassign individual elements of the map. The argument
of the index operator is called a key.
If the provided key is not available in the map, a new data element is automatically added to the map
using the default value or default constructor to initialize the value part of the new element. This
default value is returned if the index operator is used as an rvalue.
When initializing a new or reassigning another element of the map, the type of the right-hand side
of the assignment operator must be equal to (or promotable to) the type of the map’s value part. E.g.,
to add or change the value of element "two" in a map, the following statement can be used:
mapsm["two"] = MyClass();
12.4.7.3 The ‘map’ public members
The following member functions are available for the map container:
• mapped_type &at(key_type const &key):
returns a reference to the map’s mapped_type associated with key. If the key is not
stored in the map a std::out_of_range exception is thrown.
• map::iterator begin():
returns an iterator pointing to the first element of the map.
• map::const_iterator cbegin():
returns a const_iterator pointing to the first element in the map, returning cend if
the map is empty.
• map::const_iterator cend():
returns a const_iterator pointing just beyond the map’s last element.
• void clear():
erases all elements from the map.
• size_t count(key):
returns 1 if the provided key is available in the map, otherwise 0 is returned.
• map::const_reverse_iterator crbegin() const:
returns a const_reverse_iterator pointing to the last element of the map.
• map::const_reverse_iterator crend() const:
returns a const_reverse_iterator pointing before the first element of the map.
• pair<iterator, bool> emplace(Args &&...args):
a value_type object is constructed from emplace’s arguments. If the map already
contained an object using the same key_type value, then a std::pair is returned
containing an iterator pointing to the object using the same key_type value and the
value false. If no such key_type value was found, the newly constructed object is
inserted into the map, and the returned std::pair contains an iterator pointing to
the newly inserted value_type as well as the value true.
• iterator emplace_hint(const_iterator position, Args &&...args):
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted into the map, unless the (at args) provided key already
exists. The implementation may or may not use position as a hint to start looking
for an insertion point. The returned iterator points to the value_type using the
provided key. It may refer to an already existing value_type or to a newly added
value_type; an existing value_type is not replaced. If a new value was added,
then the container’s size has been incremented when emplace_hint returns.
• bool empty():
returns true if the map contains no elements.
• map::iterator end():
returns an iterator pointing beyond the last element of the map.
• pair<map::iterator, map::iterator> equal_range(key):
this member returns a pair of iterators, being respectively the return values of the
member functions lower_bound and upper_bound, introduced below. An example
illustrating these member functions is given at the discussion of the member function
upper_bound.
• ... erase():
erases a specific element or range of elements from the map:
– size_t erase(key) erases the element having the given key from the map. The number
of erased elements is returned: 1 if the element was removed, 0 if the map did not
contain an element using the given key.
– void erase(pos) erases the element pointed to by the iterator pos.
– void erase(first, beyond) erases all elements indicated by the iterator range
[first, beyond).
• map::iterator find(key):
returns an iterator to the element having the given key. If the element isn’t available,
end is returned. The following example illustrates the use of the find member
function:
#include <iostream>
#include <string>
#include <map>
using namespace std;

int main()
{
    map<string, int> object;

    object["one"] = 1;

    map<string, int>::iterator it = object.find("one");

    cout << "'one' " <<
            (it == object.end() ? "not " : "") << "found\n";

    it = object.find("three");

    cout << "'three' " <<
            (it == object.end() ? "not " : "") << "found\n";
}
/*
Generated output:
'one' found
'three' not found
*/
• allocator_type get_allocator() const:
returns a copy of the allocator object used by the map object.
• ... insert():
inserts elements into the map. Values associated with already existing keys, however,
are not replaced by new values. Its return value depends on the version of insert
that is called:
– pair<map::iterator, bool> insert(keyvalue) inserts a new value_type into
the map. The return value is a pair<map::iterator, bool>. If the returned bool
field is true, keyvalue was inserted into the map. The value false indicates that the
key that was specified in keyvalue was already available in the map, and so keyvalue
was not inserted into the map. In both cases the map::iterator field points to the data
element having the key that was specified in keyvalue. The use of this variant of insert
is illustrated by the following example:
#include <iostream>
#include <string>
#include <map>
using namespace std;

int main()
{
    pair<string, int> pa[] =
    {
        pair<string,int>("one", 10),
        pair<string,int>("two", 20),
        pair<string,int>("three", 30),
    };
    map<string, int> object(&pa[0], &pa[3]);

    // {four, 40} and 'true' is returned
    pair<map<string, int>::iterator, bool>
        ret = object.insert
        (
            map<string, int>::value_type
            ("four", 40)
        );

    cout << boolalpha;
    cout << ret.first->first << " " <<
            ret.first->second << " " <<
            ret.second << " " << object["four"] << '\n';

    // {four, 40} and 'false' is returned
    ret = object.insert
    (
        map<string, int>::value_type
        ("four", 0)
    );

    cout << ret.first->first << " " <<
            ret.first->second << " " <<
            ret.second << " " << object["four"] << '\n';
}
/*
Generated output:
four 40 true 40
four 40 false 40
*/
Note the somewhat peculiar constructions like
cout << ret.first->first << " " << ret.first->second << ...
Here ‘ret’ is equal to the pair returned by the insert member function. Its ‘first’
field is an iterator into the map<string, int>, so it can be considered a pointer to a
map<string, int>::value_type. These value types themselves are pairs too, having
‘first’ and ‘second’ fields. Consequently, ‘ret.first->first’ is the key of the map
value (a string), and ‘ret.first->second’ is the value (an int).
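– map::iterator insert(pos, keyvalue). This way a map::value_type may also
be inserted into the map. pos is merely a hint telling the map where to start looking for
the insertion point. The returned iterator points to the inserted element or, if the key
was already available, to the existing element having that key.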
– void insert(first, beyond) inserts the (map::value_type) elements pointed to by
the iterator range [first, beyond).
• key_compare key_comp():
returns a copy of the object used by the map to compare keys. The type
map<KeyType, ValueType>::key_compare is defined by the map container and
key_compare’s parameters have types KeyType const &. The comparison function
returns true if the first key argument should be ordered before the second key
argument. To compare keys and values, use value_comp, listed below.
• map::iterator lower_bound(key):
returns an iterator pointing to the first keyvalue element of which the key is at
least equal to the specified key. If no such element exists, the function returns end.
• size_t max_size():
returns the maximum number of elements this map may contain.
• map::reverse_iterator rbegin():
returns an iterator pointing to the last element of the map.
• map::reverse_iterator rend():
returns an iterator pointing before the first element of the map.
• size_t size():
returns the number of elements in the map.
• void swap(argument):
swaps two maps using identical key/value types.
• map::iterator upper_bound(key):
returns an iterator pointing to the first keyvalue element having a key exceeding
the specified key. If no such element exists, the function returns end. The following
example illustrates the member functions equal_range, lower_bound and
upper_bound:
    #include <iostream>
    #include <string>
    #include <map>
    using namespace std;

    int main()
    {
        pair<string, int> pa[] =
        {
            pair<string,int>("one", 10),
            pair<string,int>("two", 20),
            pair<string,int>("three", 30),
        };
        map<string, int> object(&pa[0], &pa[3]);
        map<string, int>::iterator it;

        if ((it = object.lower_bound("tw")) != object.end())
            cout << "lower-bound 'tw' is available, it is: " <<
                    it->first << '\n';

        if (object.lower_bound("twoo") == object.end())
            cout << "lower-bound 'twoo' not available" << '\n';

        cout << "lower-bound two: " <<
                object.lower_bound("two")->first <<
                " is available\n";

        if ((it = object.upper_bound("tw")) != object.end())
            cout << "upper-bound 'tw' is available, it is: " <<
                    it->first << '\n';

        if (object.upper_bound("twoo") == object.end())
            cout << "upper-bound 'twoo' not available" << '\n';

        if (object.upper_bound("two") == object.end())
            cout << "upper-bound 'two' not available" << '\n';

        pair
        <
            map<string, int>::iterator,
            map<string, int>::iterator
        >
            p = object.equal_range("two");

        cout << "equal range: 'first' points to " <<
                p.first->first << ", 'second' is " <<
                (
                    p.second == object.end() ?
                        "not available"
                    :
                        p.second->first
                ) <<
                '\n';
    }
    /*
        Generated output:

        lower-bound 'tw' is available, it is: two
        lower-bound 'twoo' not available
        lower-bound two: two is available
        upper-bound 'tw' is available, it is: two
        upper-bound 'twoo' not available
        upper-bound 'two' not available
        equal range: 'first' points to two, 'second' is not available
    */
• value_compare value_comp():
returns a function object comparing value_type objects by their keys, using the
map’s key comparison. The type map<KeyType, ValueType>::value_compare is
defined by the map container and value_compare’s parameters have types
value_type const &. The comparison function returns true if the key of its first
argument should be ordered before the key of its second argument. The ValueType
elements of the value_type objects passed to the returned function are not used by it.
12.4.7.4 The ‘map’: a simple example
As mentioned at the beginning of section 12.4.7, the map represents a sorted associative array. In
a map the keys are sorted. If an application must visit all elements in a map the begin and end
iterators must be used.
The following example illustrates how to make a simple table listing all keys and values found in a
map:
    #include <iostream>
    #include <iomanip>
    #include <string>
    #include <map>
    using namespace std;

    int main()
    {
        pair<string, int>
            pa[] =
            {
                pair<string,int>("one", 10),
                pair<string,int>("two", 20),
                pair<string,int>("three", 30),
            };
        map<string, int>
            object(&pa[0], &pa[3]);

        for
        (
            map<string, int>::iterator it = object.begin();
                it != object.end();
                    ++it
        )
            cout << setw(5) << it->first.c_str() <<
                    setw(5) << it->second << '\n';
    }
    /*
        Generated output:

          one   10
        three   30
          two   20
    */
12.4.8 The ‘multimap’ container
Like the map, the multimap class implements a (sorted) associative array. Before using a multimap
container the header file <map> must be included.
The main difference between the map and the multimap is that the multimap supports multiple
values associated with the same key, whereas the map contains single-valued keys. Note that the
multimap also accepts multiple identical values associated with identical keys.
The map and the multimap have the same set of constructors and member functions, with the
exception of the index operator which is not supported with the multimap. This is understandable:
if multiple entries of the same key are allowed, which of the possible values should be returned for
object[key]?
Refer to section 12.4.7 for an overview of the multimap member functions. Some member functions,
however, deserve additional attention when used in the context of the multimap container. These
members are discussed below.
• size_t count(key):
returns the number of entries in the multimap associated with the given key.
• ... erase():
erases elements from the multimap:
– size_t erase(key) erases all elements having the given key. The number of erased
elements is returned.
– void erase(pos) erases the single element pointed to by pos. Other elements possibly
having the same keys are not erased.
– void erase(first, beyond) erases all elements indicated by the iterator range
[first, beyond).
• pair<multimap::iterator, multimap::iterator> equal_range(key):
returns a pair of iterators, being respectively the return values of lower_bound and
upper_bound, introduced below. The function provides a simple means to determine
all elements in the multimap that have the same keys. An example illustrating the
use of these member functions is given at the end of this section.
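• multimap::iterator find(key):
this member returns an iterator pointing to an element whose key is key. If no such
element is available, end is returned. Commonly the first element having that key is
returned, in which case the iterator could be incremented to visit all elements having
the same key until it is either end, or the iterator’s first member is not equal to key
anymore. As the language standard does not guarantee that the first such element is
returned, lower_bound or equal_range should be preferred for visiting all elements
having the same key.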
• multimap::iterator insert():
this member function normally succeeds, and so a multimap::iterator is returned,
instead of a pair<multimap::iterator, bool> as returned with the map container.
The returned iterator points to the newly added element.
Although the functions lower_bound and upper_bound act identically in the map and multimap
containers, their operation in a multimap deserves some additional attention. The next example
illustrates lower_bound, upper_bound and equal_range applied to a multimap:
    #include <iostream>
    #include <string>
    #include <map>
    using namespace std;

    int main()
    {
        pair<string, int> pa[] =
        {
            pair<string,int>("alpha", 1),
            pair<string,int>("bravo", 2),
            pair<string,int>("charley", 3),
            pair<string,int>("bravo", 6),   // unordered 'bravo' values
            pair<string,int>("delta", 5),
            pair<string,int>("bravo", 4),
        };
        multimap<string, int> object(&pa[0], &pa[6]);

        typedef multimap<string, int>::iterator msiIterator;

        msiIterator it = object.lower_bound("brava");

        cout << "Lower bound for 'brava': " <<
                it->first << ", " << it->second << '\n';

        it = object.upper_bound("bravu");

        cout << "Upper bound for 'bravu': " <<
                it->first << ", " << it->second << '\n';

        pair<msiIterator, msiIterator>
            itPair = object.equal_range("bravo");

        cout << "Equal range for 'bravo':\n";
        for (it = itPair.first; it != itPair.second; ++it)
            cout << it->first << ", " << it->second << '\n';
        cout << "Upper bound: " << it->first << ", " << it->second << '\n';

        cout << "Equal range for 'brav':\n";
        itPair = object.equal_range("brav");
        for (it = itPair.first; it != itPair.second; ++it)
            cout << it->first << ", " << it->second << '\n';
        cout << "Upper bound: " << it->first << ", " << it->second << '\n';
    }
    /*
        Generated output:

        Lower bound for 'brava': bravo, 2
        Upper bound for 'bravu': charley, 3
        Equal range for 'bravo':
        bravo, 2
        bravo, 6
        bravo, 4
        Upper bound: charley, 3
        Equal range for 'brav':
        Upper bound: bravo, 2
    */
In particular note the following characteristics:
• lower_bound and upper_bound produce the same result for non-existing keys: they both
return the first element having a key that exceeds the provided key.
• Although the keys are ordered in the multimap, the values for equal keys are not ordered:
they are retrieved in the order in which they were entered.
12.4.9 The ‘set’ container
The set class implements a sorted collection of values. Before using set containers the <set>
header file must be included.
A set contains unique values (of a container-acceptable type). Each value is stored only once.
A specific value can be explicitly created: Every set defines a value_type which may be used
to create values that can be stored in the set. For example, a value for a set<string> can be
constructed as follows:
set<string>::value_type setValue("Hello");
The value_type is associated with the set<string>. Anonymous value_type objects are also
often used. E.g.,
set<string>::value_type("Hello");
Instead of using the line set<string>::value_type(...) over and over again, a typedef is
often used to reduce typing and to improve readability:
typedef set<string>::value_type StringSetValue;
Using this typedef, values for the set<string> may be constructed as follows:
StringSetValue("Hello");
Alternatively, values of the set’s type may be used immediately. In that case the value of type Type
is implicitly converted to a set<Type>::value_type.
The following constructors, operators, and member functions are available for the set container:
• Constructors:
– The copy and move constructors are available;
– A set may be constructed empty:
set<int> object;
– A set may be initialized using two iterators. For example:
int intarr[] = {1, 2, 3, 4, 5};
set<int> object(&intarr[0], &intarr[5]);
Note that all values in the set must be different: it is not possible to store the same value
repeatedly when the set is constructed. If the same value occurs repeatedly, only the first
instance of the value is entered into the set; the remaining values are silently ignored.
Like the map, the set receives its own copy of the data it contains.
• The set container only supports the standard set of operators that are available for containers.
• The set class has the following member functions:
– set::iterator begin():
returns an iterator pointing to the first element of the set. If the set is empty end
is returned.
– void clear():
erases all elements from the set.
– size_t count(key):
returns 1 if the provided key is available in the set, otherwise 0 is returned.
– bool empty():
returns true if the set contains no elements.
– set::iterator end():
returns an iterator pointing beyond the last element of the set.
– pair<set::iterator, set::iterator> equal_range(key):
this member returns a pair of iterators, being respectively the return values of
the member functions lower_bound and upper_bound, introduced below.
– ... erase():
erases a specific element or range of elements from the set:
* size_t erase(value) erases the element having the given value from the set. The
number of erased elements is returned: 1 if the value was removed, 0 if the set did not
contain an element ‘value’.
* void erase(pos) erases the element pointed to by the iterator pos.
* void erase(first, beyond) erases all elements indicated by the iterator range
[first, beyond).
– set::iterator find(value):
returns an iterator to the element having the given value. If the element isn’t
available, end is returned.
– allocator_type get_allocator() const:
returns a copy of the allocator object used by the set object.
– ... insert():
inserts elements into the set. If the element already exists, the existing element
is left untouched and the element to be inserted is ignored. The return value
depends on the version of insert that is called:
* pair<set::iterator, bool> insert(value) inserts a new
set::value_type into the set. The return value is a pair<set::iterator,
bool>. If the returned bool field is true, value was inserted into the set. The
value false indicates that the value that was specified was already available in
the set, and so the provided value was not inserted into the set. In both cases the
set::iterator field points to the data element in the set having the specified
value.
* set::iterator insert(pos, value). This way a set::value_type may
also be inserted into the set. pos is merely a hint telling the set where to start looking
for the insertion point. The returned iterator points to the inserted element or, if the
value was already available, to the existing element.
* void insert(first, beyond) inserts the (set::value_type) elements pointed
to by the iterator range [first, beyond) into the set.
– key_compare key_comp():
returns a copy of the object used by the set to compare keys. The type
set<ValueType>::key_compare is defined by the set container and
key_compare’s parameters have types ValueType const &. The comparison
function returns true if its first argument should be ordered before its second
argument.
– set::iterator lower_bound(key):
returns an iterator pointing to the first keyvalue element of which the key is at
least equal to the specified key. If no such element exists, the function returns
end.
– size_t max_size():
returns the maximum number of elements this set may contain.
– set::reverse_iterator rbegin():
returns an iterator pointing to the last element of the set.
– set::reverse_iterator rend():
returns an iterator pointing before the first element of the set.
– size_t size():
returns the number of elements in the set.
– void swap(argument):
swaps two sets (argument being the second set) that use identical data types.
– set::iterator upper_bound(key):
returns an iterator pointing to the first keyvalue element having a key exceeding
the specified key. If no such element exists, the function returns end.
– value_compare value_comp():
returns a copy of the object used by the set to compare keys. The type
set<ValueType>::value_compare is defined by the set container and
value_compare’s parameters have types ValueType const &. The comparison
function returns true if its first argument should be ordered before its second
argument. Its operation is identical to that of a key_compare object, returned by
key_comp.
12.4.10 The ‘multiset’ container
Like the set, the multiset class implements a sorted collection of values. Before using multiset
containers the header file <set> must be included.
The main difference between the set and the multiset is that the multiset supports multiple
entries of the same value, whereas the set contains unique values.
The set and the multiset have the same set of constructors and member functions. Refer to section
12.4.9 for an overview of the multiset member functions. Some member functions, however, behave
slightly differently from their counterparts of the set container. Those members are mentioned here.
• size_t count(value):
returns the number of entries in the multiset associated with the given value.
• ... erase():
erases elements from the multiset:
– size_t erase(value) erases all elements having the given value. The number of
erased elements is returned.
– void erase(pos) erases the element pointed to by the iterator pos. Other elements
possibly having the same values are not erased.
– void erase(first, beyond) erases all elements indicated by the iterator range
[first, beyond).
• pair<multiset::iterator, multiset::iterator> equal_range(value):
returns a pair of iterators, being respectively the return values of lower_bound and
upper_bound, introduced below. The function provides a simple means to determine
all elements in the multiset that have the same values.
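• multiset::iterator find(value):
returns an iterator pointing to an element having the specified value. If no such
element is available, end is returned. Commonly the first element having that value
is returned, in which case the iterator could be incremented to visit all elements
having the given value until it is either end, or the iterator doesn’t point to ‘value’
anymore. As the language standard does not guarantee that the first such element is
returned, lower_bound or equal_range should be preferred for visiting all elements
having the same value.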
• ... insert():
this member function normally succeeds and returns a multiset::iterator rather than
a pair<multiset::iterator, bool> as returned with the set container. The
returned iterator points to the newly added element.
Although the functions lower_bound and upper_bound act identically in the set and multiset
containers, their operation in a multiset deserves some additional attention. With a multiset
container lower_bound and upper_bound produce the same result for non-existing keys: they
both return the first element having a key exceeding the provided key.
Here is an example showing the use of various member functions of a multiset:
    #include <iostream>
    #include <string>
    #include <set>
    using namespace std;

    int main()
    {
        string
            sa[] =
            {
                "alpha",
                "echo",
                "hotel",
                "mike",
                "romeo"
            };

        multiset<string>
            object(&sa[0], &sa[5]);

        object.insert("echo");
        object.insert("echo");

        multiset<string>::iterator
            it = object.find("echo");

        for (; it != object.end(); ++it)
            cout << *it << " ";
        cout << '\n';

        cout << "Multiset::equal_range(\"ech\")\n";
        pair
        <
            multiset<string>::iterator,
            multiset<string>::iterator
        >
            itpair = object.equal_range("ech");

        if (itpair.first != object.end())
            cout << "lower_bound() points at " << *itpair.first << '\n';
        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << '\n' <<
                object.count("ech") << " occurrences of 'ech'" << '\n';

        cout << "Multiset::equal_range(\"echo\")\n";
        itpair = object.equal_range("echo");

        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << '\n' <<
                object.count("echo") << " occurrences of 'echo'" << '\n';

        cout << "Multiset::equal_range(\"echoo\")\n";
        itpair = object.equal_range("echoo");

        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << '\n' <<
                object.count("echoo") << " occurrences of 'echoo'" << '\n';
    }
    /*
        Generated output:

        echo echo echo hotel mike romeo
        Multiset::equal_range("ech")
        lower_bound() points at echo

        0 occurrences of 'ech'
        Multiset::equal_range("echo")
        echo echo echo
        3 occurrences of 'echo'
        Multiset::equal_range("echoo")

        0 occurrences of 'echoo'
    */
12.4.11 The ‘stack’ container
The stack class implements a stack data structure. Before using stack containers the header file
<stack> must be included.
A stack is also called a first in, last out (FILO) or last in, first out (LIFO) data structure, as the
first item to enter the stack is the last item to leave. A stack is an extremely useful data structure
in situations where data must temporarily remain available. For example, programs maintain a stack
to store local variables of functions: the lifetime of these variables is determined by the time these
functions are active, contrary to global (or static local) variables, which live for as long as the
program itself lives. Another example is found in calculators using the Reverse Polish Notation (RPN),
in which the operands of operators are kept in a stack, whereas operators pop their operands off the
stack and push the results of their work back onto the stack.
Figure 12.5: The contents of a stack while evaluating 3 4 + 2 *
As an example of the use of a stack, consider figure 12.5, in which the contents of the stack are shown
while the expression (3 + 4) * 2 is evaluated. In the RPN this expression becomes 3 4 + 2 *,
and figure 12.5 shows the stack contents after each token (i.e., the operands and the operators) is
read from the input. Notice that each operand is indeed pushed on the stack, while each operator
changes the contents of the stack. The expression is evaluated in five steps. The caret between
the tokens in the expressions shown on the first line of figure 12.5 shows what token has just been
read. The next line shows the actual stack-contents, and the final line shows the steps for referential
purposes. Note that at step 2, two numbers have been pushed on the stack. The first number (3)
is now at the bottom of the stack. Next, in step 3, the + operator is read. The operator pops two
operands (so that the stack is empty at that moment), calculates their sum, and pushes the resulting
value (7) on the stack. Then, in step 4, the number 2 is read, which is dutifully pushed on the stack
again. Finally, in step 5 the final operator * is read, which pops the values 2 and 7 from the stack,
computes their product, and pushes the result back on the stack. This result (14) could then be
popped to be displayed on some medium.
From figure 12.5 we see that a stack has one location (the top) where items can be pushed onto and
popped off the stack. This top element is the stack’s only immediately visible element. It may be
accessed and modified directly.
Bearing this model of the stack in mind, let’s see what we formally can do with the stack container.
For the stack, the following constructors, operators, and member functions are available:
• Constructors:
– The copy and move constructors are available;
– A stack may be constructed empty:
stack<string> object;
• Only the basic set of container operators is supported by the stack.
• The following member functions are available for stacks:
– bool empty():
this member returns true if the stack contains no elements.
– void pop():
removes the element at the top of the stack. Note that the popped element is not
returned by this member, and refer to section 12.4.4 for a discussion about the
reason why pop has return type void.
Furthermore, it is the responsibility of the stack’s user to assure that pop is not called
when the stack is empty. If pop is called for an empty stack, its internal administration
breaks, resulting, e.g., in a negative size (showing itself as a very large stack size, as its
size member returns a size_t), and other operations (like push) may fail and may crash
your program. Of course, with a well designed algorithm requests to pop from empty
stacks do not occur (which is probably why this implementation was used for the stack
container).
– void push(value):
places value at the top of the stack, hiding the other elements from view.
– size_t size():
this member returns the number of elements in the stack.
– Type &top():
this member returns a reference to the stack’s top (and only visible) element. It is
the responsibility of the programmer to use this member only if the stack is not
empty.
The stack does not support iterators or a subscript operator. The only element that can be
accessed is its top element. To empty a stack:
– repeatedly remove its top element;
– assign an empty stack to it;
– have its destructor called (e.g., by ending its lifetime).
12.4.12 The ‘unordered_map’ container (‘hash table’)
In C++ hash tables are available as objects of the class unordered_map.
Before using unordered_map or unordered_multimap containers the header file
<unordered_map> must be included.
The unordered_map class implements an associative array in which the elements are stored according
to some hashing scheme. As discussed, the map is a sorted data structure. The keys in
maps are sorted using the operator< of the key’s data type. Generally, this is not the fastest way
to either store or retrieve data. The main benefit of sorting is that a listing of sorted keys appeals
more to humans than an unsorted list. However, a much faster way to store and retrieve data is to
use hashing.
Hashing uses a function (called the hash function) to compute an (unsigned) number from the key,
which number is thereupon used as an index in the table storing the keys and their values. This
number is called the bucket number. Retrieval of a key is as simple as computing the hash value of
the provided key, and looking in the table at the computed index location: if the key is present, it is
stored in the table, at the computed bucket location and its value can be returned. If it’s not present,
the key is not currently stored in the container.
Collisions occur when a computed index position is already occupied by another element. For
these situations the abstract containers have solutions available. A simple solution, used by
unordered_maps, consists of using linear chaining, which uses linked lists to store colliding table
elements.
The term unordered_map is used rather than hash to avoid name collisions with hash tables developed
before they were added to the language.
Because of the hashing method, the efficiency of an unordered_map in terms of speed should greatly
exceed the efficiency of the map: on average its lookups and insertions take constant time, whereas
those of the map take logarithmic time. Comparable conclusions may be drawn for the unordered_set,
the unordered_multimap and the unordered_multiset.
12.4.12.1 The ‘unordered_map’ constructors
When defining an unordered_map type five template arguments may be specified:
• a KeyType (becoming unordered_map::key_type),
• a ValueType (becoming unordered_map::mapped_type),
• the type of an object computing a hash value from a key value (becoming
unordered_map::hasher),
• the type of an object that can compare two keys for equality (becoming
unordered_map::key_equal), and
• the type of its allocator. This is usually left unspecified, using the allocator provided by default
by the implementor.
The generic definition of an unordered_map container looks like this:
std::unordered_map <KeyType, ValueType, hash type, predicate type,
allocator type>
When KeyType is std::string or a built-in type then default types are available for the hash
type and the predicate type. In practice the allocator type is not specified, as the default allocator
suffices. In these cases an unordered_map object can be defined by merely specifying the key- and
value types, like this:
std::unordered_map<std::string, ValueType> hash(size_t size = implSize);
Here, implSize is the container’s default initial size, which is specified by the implementor. The
map’s size is automatically enlarged by the unordered_map when necessary, in which case the container
rehashes all its elements. In practice the default size argument provided by the implementor
is completely satisfactory.
The KeyType frequently consists of text. So, an unordered_map using a std::string KeyType
is frequently used. Be careful not to use a plain char const * key_type as two char const *
values pointing to equal C-strings stored at different locations are considered to be different keys, as
their pointer values rather than their textual contents are compared. Here is an example showing
how a char const * KeyType can be used by providing a hash type and a predicate type of its own.
Note that in the example no arguments are specified when constructing months, since default values
and constructors are available:
    #include <unordered_map>
    #include <iostream>
    #include <string>
    #include <cstring>
    using namespace std;

    struct EqualCp      // compares the C-strings' contents, not the pointers
    {
        bool operator()(char const *l, char const *r) const
        {
            return strcmp(l, r) == 0;
        }
    };
    struct HashCp       // hashes the C-string's contents, not the pointer
    {
        size_t operator()(char const *str) const
        {
            return std::hash<std::string>()(str);
        }
    };
    int main()
    {
        unordered_map<char const *, int, HashCp, EqualCp> months;
        // or explicitly:
        unordered_map<char const *, int, HashCp, EqualCp>
                                    monthsTwo(61, HashCp(), EqualCp());

        months["april"] = 30;
        months["november"] = 30;

        string apr("april");    // different pointers, same string

        cout << "april -> " << months["april"] << '\n' <<
                "april -> " << months[apr.c_str()] << '\n';
    }
If other KeyTypes must be used, then the unordered_map’s constructor requires (constant references
to) a hash function object, computing a hash value from a key value, and a predicate function
object, returning true if two unordered_map::key_type objects are identical. A generic algorithm
(see chapter 19) exists performing tests of equality (i.e., equal_to). These tests can be used
if the key’s data type supports the equality operator. Alternatively, an overloaded operator== or
specialized function object could be constructed returning true if two keys are equal and false
otherwise.
Constructors
The unordered_map supports the following constructors:
• The copy and move constructors are available;
• explicit unordered_map(size_type n = implSize, hasher const &hf =
hasher(),
key_equal const &eql = key_equal(),
allocator_type const &alloc = allocator_type()): this constructor can also be
used as default constructor;
• unordered_map(const_iterator begin, const_iterator end, size_type
n = implSize, hasher const &hf = hasher(), key_equal const &eql =
key_equal(), allocator_type const &alloc = allocator_type()): this constructor
expects two iterators specifying a range of unordered_map::value_type const
objects, and
• unordered_map(initializer_list<value_type> initList, size_type n
= implSize, hasher const &hf = hasher(), key_equal const &eql =
key_equal(), allocator_type const &alloc = allocator_type()): a constructor
expecting an initializer_list of unordered_map::value_type values.
The following example shows a program using an unordered_map containing the names of the
months of the year and the number of days these months (usually) have. Then, using the subscript
operator the days in several months are displayed (the predicate used here is
equal_to<string>, which is the default fourth template argument of the unordered_map):
    #include <unordered_map>
    #include <iostream>
    #include <string>
    using namespace std;

    int main()
    {
        unordered_map<string, int> months;

        months["january"] = 31;
        months["february"] = 28;
        months["march"] = 31;
        months["april"] = 30;
        months["may"] = 31;
        months["june"] = 30;
        months["july"] = 31;
        months["august"] = 31;
        months["september"] = 30;
        months["october"] = 31;
        months["november"] = 30;
        months["december"] = 31;

        cout << "september -> " << months["september"] << '\n' <<
                "april -> " << months["april"] << '\n' <<
                "june -> " << months["june"] << '\n' <<
                "november -> " << months["november"] << '\n';
    }
    /*
        Generated output:

        september -> 30
        april -> 30
        june -> 30
        november -> 30
    */
12.4.12.2 The ‘unordered_map’ public members
The unordered_map supports the index operator operating identically to the map’s index operator:
a (const) reference to the ValueType associated with the provided KeyType’s value is returned.
If not yet available, the key is added to the unordered_map, and a default ValueType value is
returned. In addition, it supports operator==.
The unordered_map provides the following member functions (key_type, value_type etc. refer
to the types defined by the unordered_map):
• mapped_type &at(key_type const &key):
returns a reference to the unordered_map’s mapped_type associated with key. If
the key is not stored in the unordered_map an std::out_of_range exception is
thrown.
• unordered_map::iterator begin():
returns an iterator pointing to the first element in the unordered_map, returning end
if the unordered_map is empty.
• size_t bucket(key_type const &key):
returns the index of the bucket in which key is stored or, if key is not stored in the
unordered_map, the index of the bucket in which it would be stored. The container
itself is not modified.
• size_t bucket_count():
returns the number of slots used by the container. Each slot may contain one (or
more, in case of collisions) value_type objects.
• size_t bucket_size(size_t index):
returns the number of value_type objects stored at bucket position index.
• unordered_map::const_iterator cbegin():
returns a const_iterator pointing to the first element in the unordered_map, returning
cend if the unordered_map is empty.
• unordered_map::const_iterator cend():
returns a const_iterator pointing just beyond the unordered_map’s last element.
• void clear():
erases all the unordered_map’s elements.
• size_t count(key_type const &key):
returns the number of times a value_type object using key_type key is stored in
the unordered_map (which is either one or zero).
• pair<iterator, bool> emplace(Args &&...args):
a value_type object is constructed from emplace’s arguments. If the
unordered_map already contained an object using the same key_type value, then
a std::pair is returned containing an iterator pointing to the object using the
same key_type value and the value false. If no such key_type value was found,
the newly constructed object is inserted into the unordered_map, and the returned
std::pair contains an iterator pointing to the newly inserted value_type
as well as the value true.
346 CHAPTER 12. ABSTRACT CONTAINERS
• iterator emplace_hint(const_iterator position, Args &&...args):
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted into the unordered_map, unless the (at args) provided
key already exists. The implementation may or may not use position
as a hint to start looking for an insertion point. The returned iterator points
to the value_type using the provided key. It may refer to an already existing
value_type or to a newly added value_type; an existing value_type is not replaced.
If a new value was added, then the container’s size has been incremented
when emplace_hint returns.
• bool empty():
returns true if the unordered_map contains no elements.
• unordered_map::iterator end():
returns an iterator pointing beyond the last element in the unordered_map.
• pair<iterator, iterator> equal_range(key):
this member returns a pair of iterators defining the range of elements having a key
that is equal to key. With the unordered_map this range includes at most one
element.
• unordered_map::iterator erase():
erases a specific range of elements in the unordered_map:
– erase(pos) erases the element pointed to by the iterator pos. The iterator ++pos is
returned.
– erase(first, beyond) erases elements indicated by the iterator range [first,
beyond), returning beyond.
• iterator find(key):
returns an iterator to the element having the given key. If the element isn’t available,
end is returned.
• allocator_type get_allocator() const:
returns a copy of the allocator object used by the unordered_map object.
• hasher hash_function() const:
returns a copy of the hash function object used by the unordered_map object.
• ... insert():
elements may be inserted starting at a certain position. No insertion is performed
if the provided key is already in use. The return value depends on the version of
insert() that is called. When a pair<iterator, bool> is returned, then the
pair’s first member is an iterator pointing to the element having a key that is
equal to the key of the provided value_type, the pair’s second member is true
if value was actually inserted into the container, and false if not.
– pair<iterator, bool> insert(value_type const &value) attempts to insert
value.
– pair<iterator, bool> insert(value_type &&tmp) attempts to insert value using
value_type’s move constructor.
– pair<iterator, bool> insert(const_iterator hint, value_type const
&value) attempts to insert value, possibly using hint as a starting point when trying
to insert value.
– pair<iterator, bool> insert(const_iterator hint, value_type &&tmp)
attempts to insert a value using value_type’s move constructor, and possibly using
hint as a starting point when trying to insert value.
– void insert(first, beyond) tries to insert the elements in the iterator range
[first, beyond).
– void insert(initializer_list <value_type> iniList) attempts to insert the
elements in iniList into the container.
• key_equal key_eq() const:
returns a copy of the key_equal function object used by the unordered_map object.
• float load_factor() const:
returns the container’s current load factor, i.e. size / bucket_count.
• size_t max_bucket_count():
returns the maximum number of buckets this unordered_map may contain.
• float max_load_factor() const:
returns the container’s current maximum load factor, i.e., the load factor that
the container does not exceed without enlarging its bucket_count.
• void max_load_factor(float max):
changes the current maximum load factor to max. When a load factor of max is
reached, the container will enlarge its bucket_count, followed by a rehash of its
elements. Note that the container’s default maximum load factor equals 1.0.
• size_t max_size():
returns the maximum number of elements this unordered_map may contain.
• void rehash(size_t size):
if size exceeds the current bucket count, then the bucket count is increased to size,
followed by a rehash of its elements.
• void reserve(size_t request):
prepares the container for storing at least request elements: the bucket count is set
to a value that is large enough to hold request elements without exceeding the
maximum load factor, followed by a rehash of the container’s elements. The call
is equivalent to rehash(ceil(request / max_load_factor())).
• size_t size():
returns the number of elements in the unordered_map.
• void swap(unordered_map &other):
swaps the contents of the current and the other unordered_map.
12.4.12.3 The ‘unordered_multimap’ container
The unordered_multimap allows multiple objects using the same keys to be stored in an unordered
map. The unordered_multimap container offers the same set of members and constructors as the
unordered_map, but without the unique-key restriction imposed upon the unordered_map.
The unordered_multimap does not offer operator[] and does not offer the at members.
Below all members are described whose behavior differs from the behavior of the corresponding
unordered_map members:
• at
not supported by the unordered_multimap container
• size_t count(key_type const &key):
returns the number of times a value_type object using key_type key is stored
in the unordered_multimap. This member is commonly used to verify whether key is
available in the unordered_multimap.
• iterator emplace(Args &&...args):
a value_type object is constructed from emplace’s arguments. The returned
iterator points to the newly inserted value_type.
• iterator emplace_hint(const_iterator position, Args &&...args):
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted into the unordered_multimap. The implementation
may or may not use position as a hint to start looking for an insertion point. The
returned iterator points to the value_type using the provided key.
• pair<iterator, iterator> equal_range(key):
this member returns a pair of iterators defining the range of elements having a key
that is equal to key.
• iterator find(key):
returns an iterator to an element having the given key. If no such element is available,
end is returned.
• ... insert():
elements may be inserted starting at a certain position. The return value depends
on the version of insert() that is called. When an iterator is returned, then it
points to the element that was inserted.
– iterator insert(value_type const &value) inserts value.
– iterator insert(value_type &&tmp) inserts value using value_type’s move constructor.
– iterator insert(const_iterator hint, value_type const &value) inserts
value, possibly using hint as a starting point when trying to insert value.
– iterator insert(const_iterator hint, value_type &&tmp) inserts value using
value_type’s move constructor, and possibly using hint as a starting point when
trying to insert value.
– void insert(first, beyond) inserts the elements in the iterator range [first,
beyond).
– void insert(initializer_list <value_type> iniList) inserts the elements in
iniList into the container.
12.4.13 The ‘unordered_set’ container
The set container, like the map container, orders its elements. If ordering is not an issue, but fast
lookups are, then a hash-based set and/or multi-set may be preferred. C++ provides such hash-based
sets and multi-sets: the unordered_set and unordered_multiset.
Before using these hash-based set containers the header file <unordered_set> must be included.
Elements stored in the unordered_set are immutable, but they can be inserted and removed from
the container. Different from the unordered_map, the unordered_set does not use a ValueType.
The set merely stores elements, and the stored element itself is its own key.
The unordered_set has the same constructors as the unordered_map, but the set’s value_type
is equal to its key_type.
When defining an unordered_set type four template arguments can be specified:
• a KeyType (becoming unordered_set::key_type),
• the type of an object computing a hash value from a key value (becoming
unordered_set::hasher),
• the type of an object that can compare two keys for equality (becoming
unordered_set::key_equal), and
• the type of its allocator. This is usually left unspecified, using the allocator provided by default
by the implementor.
The generic definition of an unordered_set container looks like this:
std::unordered_set <KeyType, hash type, predicate type, allocator type>
When KeyType is std::string or a built-in type then default types are available for the hash
type and the predicate type. In practice the allocator type is not specified, as the default allocator
suffices. In these cases an unordered_set object can be defined by merely specifying the key
type, like this:
std::unordered_set<std::string> rawSet(size_t size = implSize);
Here, implSize is the container’s default initial size, which is specified by the implementor. The
set’s size is automatically enlarged when necessary, in which case the container rehashes all its
elements. In practice the default size argument provided by the implementor is completely satisfactory.
The unordered_set supports the following constructors:
• The copy and move constructors are available;
• explicit unordered_set(size_type n = implSize, hasher const &hf =
hasher(),
key_equal const &eql = key_equal(),
allocator_type const &alloc = allocator_type()): this constructor can also be
used as default constructor;
• unordered_set(const_iterator begin, const_iterator end, size_type
n = implSize, hasher const &hf = hasher(), key_equal const &eql =
key_equal(), allocator_type const &alloc = allocator_type()): this constructor
expects two iterators specifying a range of unordered_set::value_type const
objects, and
• unordered_set(initializer_list<value_type> initList, size_type n
= implSize, hasher const &hf = hasher(), key_equal const &eql =
key_equal(), allocator_type const &alloc = allocator_type()): a constructor
expecting an initializer_list of unordered_set::value_type values.
The unordered_set does not offer an index operator, and it does not offer an at member. Other
than those, it offers the same members as the unordered_map. Below the members whose behavior
differs from the behavior of the unordered_map are discussed. For a description of the remaining
members, please refer to section 12.4.12.2.
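• pair<iterator, bool> emplace(Args &&...args):
a value_type object is constructed from emplace’s arguments. It is added to the
set if it is unique. The returned pair contains an iterator to the value_type
stored in the set and a bool which is true if the object was actually inserted.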
• iterator emplace_hint(const_iterator position, Args &&...args):
a value_type object is constructed from the member’s arguments, and if the newly
created element is unique it is inserted into the unordered_set. The implementation
may or may not use position as a hint to start looking for an insertion point.
The returned iterator points to the value_type.
• unordered_set::iterator erase():
erases a specific range of elements in the unordered_set:
– erase(key_type const &key) erases key from the set. The number of erased
elements (zero or one) is returned.
– erase(pos) erases the element pointed to by the iterator pos. The iterator ++pos is
returned.
– erase(first, beyond) erases elements indicated by the iterator range [first,
beyond), returning beyond.
12.4.13.1 The ‘unordered_multiset’ container
The unordered_multiset allows multiple objects using the same keys to be stored in an unordered
set. The unordered_multiset container offers the same set of members and constructors as the
unordered_set, but without the unique-key restriction imposed upon the unordered_set.
Below all members are described whose behavior differs from the behavior of the corresponding
unordered_set members:
• size_t count(key_type const &key):
returns the number of times a value_type object using key_type key is stored
in the unordered_multiset. This member is commonly used to verify whether key is
available in the unordered_multiset.
• iterator emplace(Args &&...args):
a value_type object is constructed from emplace’s arguments. The returned
iterator points to the newly inserted value_type.
• iterator emplace_hint(const_iterator position, Args &&...args):
a value_type object is constructed from the member’s arguments, and the newly
created element is inserted into the unordered_multiset. The implementation
may or may not use position as a hint to start looking for an insertion point. The
returned iterator points to the value_type using the provided key.
• pair<iterator, iterator> equal_range(key):
this member returns a pair of iterators defining the range of elements having a key
that is equal to key.
• iterator find(key):
returns an iterator to an element having the given key. If no such element is available,
end is returned.
• ... insert():
elements may be inserted starting at a certain position. The return value depends
on the version of insert() that is called. When an iterator is returned, then it
points to the element that was inserted.
– iterator insert(value_type const &value) inserts value.
– iterator insert(value_type &&tmp) inserts value using value_type’s move constructor.
– iterator insert(const_iterator hint, value_type const &value) inserts
value, possibly using hint as a starting point when trying to insert value.
– iterator insert(const_iterator hint, value_type &&tmp) inserts value using
value_type’s move constructor, and possibly using hint as a starting point when
trying to insert value.
– void insert(first, beyond) inserts the elements in the iterator range [first,
beyond).
– void insert(initializer_list <value_type> iniList) inserts the elements in
iniList into the container.
12.4.14 C++14: heterogeneous lookup
The associative containers offered by C++ allow us to find a value (or values) matching a given key.
Traditionally, the type of the key used for the lookup must match the container’s key type.
The C++14 standard allows an arbitrary lookup key type to be used, as long as the comparison
operator can compare that type with the container’s key type. Thus, a char const * key (or any
other type for which an operator< overload for std::string is available) can be used to lookup
values in a map<std::string, ValueType>. This is called heterogeneous lookup.
Heterogeneous lookup is allowed when the comparator given to the associative container does allow
this. The standard library classes std::less and std::greater were augmented to allow
heterogeneous lookup.
12.5 The ‘complex’ container
The complex container defines the standard operations that can be performed on complex numbers.
Before using complex containers the header file <complex> must be included.
The complex number’s real and imaginary types are specified as the container’s data type. Examples:
complex<double>
complex<int>
complex<float>
Note that the real and imaginary parts of complex numbers have the same datatypes.
When initializing (or assigning) a complex object, the imaginary part may be omitted from the initialization
or assignment resulting in its value being 0 (zero). By default, both parts are zero.
Below it is silently assumed that the used complex type is complex<double>. Given this assumption,
complex numbers may be initialized as follows:
• target: A default initialization: real and imaginary parts are 0.
• target(1): The real part is 1, imaginary part is 0
• target(0, 3.5): The real part is 0, imaginary part is 3.5
• target(source): target is initialized with the values of source.
Anonymous complex values may also be used. In the next example two anonymous complex values
are pushed on a stack of complex numbers, to be popped again thereafter:
#include <iostream>
#include <complex>
#include <stack>
using namespace std;
int main()
{
    stack<complex<double>>
        cstack;
    cstack.push(complex<double>(3.14, 2.71));
    cstack.push(complex<double>(-3.14, -2.71));
    while (cstack.size())
    {
        cout << cstack.top().real() << ", " <<
                cstack.top().imag() << "i" << '\n';
        cstack.pop();
    }
}
/*
Generated output:
-3.14, -2.71i
3.14, 2.71i
*/
The following member functions and operators are defined for complex numbers (below, value may
be either a primitive scalar type or a complex object):
• Apart from the standard container operators, the following operators are supported from the
complex container.
– complex operator+(value):
this member returns the sum of the current complex container and value.
– complex operator-(value):
this member returns the difference between the current complex container and
value.
– complex operator*(value):
this member returns the product of the current complex container and value.
– complex operator/(value):
this member returns the quotient of the current complex container and value.
– complex operator+=(value):
this member adds value to the current complex container, returning the new
value.
– complex operator-=(value):
this member subtracts value from the current complex container, returning the
new value.
– complex operator*=(value):
this member multiplies the current complex container by value, returning the
new value.
– complex operator/=(value):
this member divides the current complex container by value, returning the new
value.
• Type real():
returns the real part of a complex number.
• Type imag():
returns the imaginary part of a complex number.
• Several mathematical functions are available for the complex container, such as abs, arg,
conj, cos, cosh, exp, log, norm, polar, pow, sin, sinh and sqrt. All these functions are
free functions, not member functions, accepting complex numbers as their arguments. For
example,
abs(complex<double>(3, -5));
pow(target, complex<int>(2, 3));
• Complex numbers may be extracted from istream objects and inserted into ostream objects.
The insertion results in an ordered pair (x, y), in which x represents the real part and y the
imaginary part of the complex number. The same form may also be used when extracting a
complex number from an istream object. However, simpler forms are also allowed. E.g., when
extracting 1.2345 the imaginary part is set to 0.
12.6 Unrestricted Unions
We end this chapter about abstract containers with a small detour, introducing extensions to the
union concept. Although unions themselves aren’t ‘abstract containers’, having covered containers
puts us in a good position to introduce and illustrate unrestricted unions.
Whereas traditional unions can only contain primitive data, unrestricted unions allow addition of
data fields of types for which non-trivial constructors have been defined. Such data fields commonly
are of class-types. Here is an example of such an unrestricted union:
union Union
{
int u_int;
std::string u_string;
};
One of its fields is of a type that defines a constructor, turning this union into an unrestricted
union. As an unrestricted union defines at least one field of a type having a constructor, the
question becomes how these unions can be constructed and destroyed.
The destructor of a union consisting of, e.g., a std::string and an int should of course not call
the string’s destructor if the union’s last (or only) use referred to its int field. Likewise, when
the std::string field is used, and a switch is made next from the std::string to the int field,
std::string’s destructor should be called before any assignment to the int field takes place.
The compiler does not solve the issue for us, and in fact does not implement default constructors
or destructors for unrestricted unions at all. If we try to define an unrestricted union like the one
shown above, an error message like the following is issued:
error: use of deleted function ’Union::Union()’
error: ’Union::Union()’ is implicitly deleted because the default
definition would be ill-formed:
error: union member ’Union::u_string’ with non-trivial
’std::basic_string<...>::basic_string() ...’
12.6.1 Implementing the destructor
Although the compiler won’t provide (default) implementations for constructors and destructors of
unrestricted unions, we can. The task isn’t difficult, but there are some caveats.
Consider our unrestricted union’s destructor. It clearly should destroy u_string’s data if that is its
currently active field; but it should do nothing if u_int is its currently active field. But how does
the destructor know what field to destroy? It doesn’t as the unrestricted union holds no information
about what field is currently active.
Here is one way to tackle this problem:
12.6. UNRESTRICTED UNIONS 355
If we embed the unrestricted union in a larger aggregate, like a class or a struct, then the class or
struct can be provided with a tag data member storing the currently active union-field. The tag can
be an enumeration type, defined by the aggregate. The unrestricted union may then be controlled
by the aggregate. Under this approach we start out with an explicit empty implementation of the
destructor, as there’s no way to tell the destructor itself what field to destroy:
Data::Union::~Union()
{}
12.6.2 Embedding an unrestricted union in a surrounding class
Next, we embed the unrestricted union in a surrounding aggregate: class Data. The aggregate is
provided with an enum Tag, declared in its public section, so Data’s users may request tags. Union
itself is for Data’s internal use only, so Union is declared in Data’s private section. Using a struct
Data rather than class Data we start out in a public section, saving us from having to specify the
initial public: section for enum Tag:
struct Data
{
enum Tag
{
INT,
STRING
};
private:
union Union
{
int u_int;
std::string u_string;
~Union(); // no actions
// ... to do: declarations of members
};
Tag d_tag;
Union d_union;
};
Data’s constructors receive int or string values. To pass these values on to d_union, we need
Union constructors for the various union fields; matching Data constructors also initialize d_tag to
proper values:
Data::Union::Union(int value)
:
u_int(value)
{}
Data::Union::Union(std::string const &str)
:
u_string(str)
{}
Data::Data(std::string const &str)
:
d_tag(STRING),
d_union(str)
{}
Data::Data(int value)
:
d_tag(INT),
d_union(value)
{}
12.6.3 Destroying an embedded unrestricted union
Data’s destructor has a data member which is an unrestricted union. As the union’s destructor can’t
perform any actions, the union’s proper destruction is delegated to a member, Union::destroy
destroying the fields for which destructors are defined. As d_tag stores the currently used Union
field, Data’s destructor can pass d_tag to Union::destroy to inform it about which field should
be destroyed.
Union::destroy does not need to perform any action for INT tags, but for STRING tags the
memory allocated by u_string must be returned. For this an explicit destructor call is used.
Union::destroy and Data’s destructor are therefore implemented like this:
void Data::Union::destroy(Tag myTag)
{
if (myTag == Tag::STRING)
u_string.~string();
}
Data::~Data()
{
d_union.destroy(d_tag);
}
12.6.4 Copy and move constructors
Union’s copy and move constructors suffer from the same problem as Union’s destructor does: the
union does not know which is its currently active field. But again: Data does, and by defining
‘extended’ copy and move constructors, also receiving a Tag argument, these extended constructors
can perform their proper initializations. The Union’s copy- and move-constructors are deleted, and
extended copy- and move constructors are declared:
Union(Union const &other) = delete;
Union &operator=(Union const &other) = delete;
Union(Union const &other, Tag tag);
Union(Union &&tmp, Tag tag);
Shortly we’ll encounter a situation where we must be able to initialize a block of memory using an
existing Union object. This task can be performed by copy members, whose implementations are
trivial, and which may be used by the above constructors. They can be declared in Union’s private
section, and have identical parameter lists as the above constructors:
void copy(Union const &other, Tag tag);
void copy(Union &&other, Tag tag);
The constructors merely have to call these copy members:
inline Data::Union::Union(Union const &other, Tag tag)
{
copy(other, tag);
}
inline Data::Union::Union(Union &&tmp, Tag tag)
{
copy(std::move(tmp), tag);
}
Interestingly, no ‘initialization followed by assignment’ happens here: d_union has not been initialized
in any way by the time we reach the statement blocks of the above constructors. Upon
reaching the statement blocks, d_union’s memory is merely raw memory. This is no problem, as the
copy members use placement new to initialize the Union’s memory:
void Data::Union::copy(Union const &other, Tag tag)
{
    if (tag == INT)
        u_int = other.u_int;
    else                                // construct the string in raw memory
        new (this) string(other.u_string);
}
void Data::Union::copy(Union &&tmp, Tag tag)
{
    if (tag == INT)
        u_int = tmp.u_int;
    else                                // move-construct in raw memory
        new (this) string(std::move(tmp.u_string));
}
12.6.5 Assignment
To assign a Data object to another Data object, we need an assignment operator. The standard mold
for the assignment operator looks like this:
Class &Class::operator=(Class const &other)
{
Class tmp(other);
swap(*this, tmp);
return *this;
}
This implementation is exception safe: it offers the ‘commit or roll-back’ guarantee (cf. section 9.6).
But can it be applied to Data?
It depends. It depends on whether Data objects can be fast swapped (cf. section 9.6.1.1) or not. If
Union’s fields can be fast swapped then we can simply swap bytes and we’re done. In that case Union
does not require any additional members (to be specific: it won’t need an assignment operator).
But now assume that Union’s fields cannot be fast swapped. How to implement an exception-safe
assignment (i.e., an assignment offering the ‘commit or roll-back’ guarantee) in that case? The
d_tag field clearly isn’t a problem, so we delegate the responsibility for proper assignment to Union,
implementing Data’s assignment operators as follows:
Data &Data::operator=(Data const &rhs)
{
if (d_union.assign(d_tag, rhs.d_union, rhs.d_tag))
d_tag = rhs.d_tag;
return *this;
}
Data &Data::operator=(Data &&tmp)
{
if (d_union.assign(d_tag, std::move(tmp.d_union), tmp.d_tag))
d_tag = tmp.d_tag;
return *this;
}
But now for Union::assign. Assume that both Unions use different fields, while swapping objects
of each of the separate types is allowed. Now things may go wrong. Assume the left-side union uses type
X, the right-side union uses type Y and both types use allocation. First, briefly look at standard
swapping. It involves three steps:
• tmp(lhs): initialize a temporary object;
• lhs = rhs: assign the rhs object to the lhs object;
• rhs = tmp: assign the tmp object to the rhs object.
Usually we assume that these steps do not throw exceptions, as swap itself shouldn’t throw exceptions.
How could we implement swapping for our union? Assume the fields are known (easily done
by passing Tag values to Union::swap):
• X tmp(lhs.x): initialize a temporary X;
• in-place destroy lhs.x; placement new initialize lhs.y from rhs.y (alternatively: placement new
default initialize lhs.y, then do the standard lhs.y = rhs.y)
• in-place destroy rhs.y; placement new initialize rhs.x from tmp (alternatively: placement new
default initialize rhs.x, then do the standard rhs.x = tmp)
By C++-standard requirement, the in-place destruction won’t throw. Since the standard swap also
performs an assignment that part should work fine as well. And since the standard swap also does
a copy construction the placement new operations should perform fine as well, and if so, Union may
be provided with the following swap member:
void Data::Union::swap(Tag myTag, Union &other, Tag oTag)
{
    Union tmp(*this, myTag);    // save lhs
    destroy(myTag);             // destroy lhs
    copy(other, oTag);          // assign rhs
    other.destroy(oTag);        // destroy rhs
    other.copy(tmp, myTag);     // save lhs via tmp
    tmp.destroy(myTag);         // ~Union is empty: destroy tmp's field
}
Now that swap is available Data’s assignment operators are easily realized:
Data &Data::operator=(Data const &rhs)
{
Data tmp(rhs); // tmp(std::move(rhs)) for the move assignment
d_union.swap(d_tag, tmp.d_union, tmp.d_tag);
swap(d_tag, tmp.d_tag);
return *this;
}
What if the Union constructors could throw? In that case we can provide Data with a ‘commit or
roll-back’ assignment operator like this:
Data &Data::operator=(Data const &rhs)
{
Data tmp(rhs);
// rolls back before throwing an exception
d_union.assign(d_tag, rhs.d_union, rhs.d_tag);
d_tag = rhs.d_tag;
return *this;
}
How to implement Union::assign? Here are the steps assign must take:
• First save the current union in a block of memory. This merely involves a non-throwing memcpy
operation;
• Then use placement new to copy the other object’s union field into the current object. If this
throws:
– catch the exception, restore the original Union from the saved block and rethrow the
exception: we have rolled-back to our previous (valid) state.
• We still have to delete the original field’s allocated data. To do so, we perform the following
steps:
– (Fast) swap the current union’s new contents with the contents in the previously saved
block;
– Call destroy for the now restored original union;
– Re-install the new union from the memory block.
As none of the above steps will throw, we have committed the new situation.
Here is the implementation of the ‘commit or roll-back’ Union::assign:
void Data::Union::assign(Tag myTag, Union const &other, Tag otag)
{
char saved[sizeof(Union)];
memcpy(saved, this, sizeof(Union)); // raw copy: saved <- *this
try
{
copy(other, otag); // *this = other: may throw
fswap(*this, // *this <-> saved
*reinterpret_cast<Union *>(saved));
destroy(myTag); // destroy original *this
memcpy(this, saved, sizeof(Union)); // install new *this
}
catch (...) // copy threw
{
memcpy(this, saved, sizeof(Union)); // roll back: restore *this
throw;
}
}
The source distribution contains yo/containers/examples/unrestricted2.cc offering a small
demo-program in which the here developed Data class is used.

Chapter 13
Inheritance
When programming in C, programming problems are commonly approached using a top-down structured
approach: functions and actions of the program are defined in terms of sub-functions, which
again are defined in sub-sub-functions, etc. This yields a hierarchy of code: main at the top, followed
by a level of functions which are called from main, etc.
In C++ the relationship between code and data is also frequently defined in terms of dependencies
among classes. This looks like composition (see section 7.3), where objects of a class contain objects
of another class as their data. But the relation described here is of a different kind: a class can be
defined in terms of an older, pre-existing, class. This produces a new class having all the functionality
of the older class, and additionally defining its own specific functionality. Instead of composition,
where a given class contains another class, we here refer to derivation, where a given class is or
is-implemented-in-terms-of another class.
Another term for derivation is inheritance: the new class inherits the functionality of an existing
class, while the existing class does not appear as a data member in the interface of the new class.
When discussing inheritance the existing class is called the base class, while the new class is called
the derived class.
Derivation of classes is often used when the methodology of C++ program development is fully exploited.
In this chapter we first address the syntactic possibilities offered by C++ for deriving classes.
Following this we address some of the specific possibilities offered by class derivation (inheritance).
As we have seen in the introductory chapter (see section 2.4), in the object-oriented approach to
problem solving classes are identified during the problem analysis. Under this approach objects of
the defined classes represent entities that can be observed in the problem at hand. The classes are
placed in a hierarchy, with the top-level class containing limited functionality. Each new derivation
(and hence descent in the class hierarchy) adds new functionality compared to already existing classes.
In this chapter we shall use a simple vehicle classification system to build a hierarchy of classes.
The first class is Vehicle, which implements as its functionality the possibility to set or retrieve
the mass of a vehicle. The next level in the object hierarchy consists of land, water and air vehicles.
The initial object hierarchy is illustrated in Figure 13.1.
This chapter mainly focuses on the technicalities of class derivation. The distinction between inheritance
used to create derived classes whose objects should be considered objects of the base class and
inheritance used to implement derived classes in-terms-of their base classes is postponed until the
next chapter (14).
Inheritance (and polymorphism, cf. chapter 14) can be used with classes and structs. It is not defined
for unions.
Figure 13.1: Initial object hierarchy of vehicles.
13.1 Related types
The relationship between the proposed classes representing different kinds of vehicles is further
investigated here. The figure shows the object hierarchy: a Car is a special case of a Land vehicle,
which in turn is a special case of a Vehicle.
The class Vehicle represents the ‘greatest common divisor’ in the classification system. Vehicle
is given limited functionality: it can store and retrieve a vehicle’s mass:
class Vehicle
{
size_t d_mass;
public:
Vehicle();
Vehicle(size_t mass);
size_t mass() const;
void setMass(size_t mass);
};
Using this class, the vehicle’s mass can be defined as soon as the corresponding object has been
created. At a later stage the mass can be changed or retrieved.
To represent vehicles travelling over land, a new class Land can be defined offering Vehicle’s functionality
and adding its own specific functionality. Assume we are interested in the speed of land
vehicles and in their mass. The relationship between Vehicles and Lands could of course be represented
by composition but that would be awkward: composition suggests that a Land vehicle
is-implemented-in-terms-of, i.e., contains, a Vehicle, while the natural relationship clearly is that
the Land vehicle is a kind of Vehicle.
A relationship in terms of composition would also somewhat complicate our Land class’s design.
Consider the following example showing a class Land using composition (only the setMass functionality
is shown):
class Land
{
Vehicle d_v; // composed Vehicle
public:
void setMass(size_t mass);
};
void Land::setMass(size_t mass)
{
d_v.setMass(mass);
}
Using composition, the Land::setMass function only passes its argument on to
Vehicle::setMass. Thus, as far as mass handling is concerned, Land::setMass introduces
no extra functionality, just extra code. Clearly this code duplication is superfluous: a Land
object is a Vehicle; to state that a Land object contains a Vehicle is at least somewhat peculiar.
The intended relationship is represented better by inheritance. A rule of thumb for choosing between
inheritance and composition distinguishes between is-a and has-a relationships. A truck is a vehicle,
so Truck should probably derive from Vehicle. On the other hand, a truck has an engine; if you
need to model engines in your system, you should probably express this by composing an Engine
class inside the Truck class.
Following the above rule of thumb, Land is derived from the base class Vehicle:
class Land: public Vehicle
{
size_t d_speed;
public:
Land();
Land(size_t mass, size_t speed);
void setspeed(size_t speed);
size_t speed() const;
};
To derive a class (e.g., Land) from another class (e.g., Vehicle) postfix the class name Land in its
interface by : public Vehicle:
class Land: public Vehicle
The class Land now contains all the functionality of its base class Vehicle as well as its own features.
Here those features are a constructor expecting two arguments and member functions to
access the d_speed data member. Here is an example showing the possibilities of the derived class
Land:
Land veh(1200, 145);
int main()
{
    cout << "Vehicle weighs " << veh.mass() << ";\n"
            "its speed is " << veh.speed() << '\n';
}
This example illustrates two features of derivation.
• First, mass is not mentioned as a member in Land’s interface. Nevertheless it is used in
veh.mass. This member function is an implicit part of the class, inherited from its ‘parent’
Vehicle.
• Second, although the derived class Land contains the functionality of Vehicle, the Vehicle’s
private members remain private: they can only be accessed by Vehicle’s own member functions.
This means that Land’s member functions must use Vehicle’s member functions (like
mass and setMass) to address the mass field. Here there’s no difference between the access
rights granted to Land and the access rights granted to other code outside of the class Vehicle.
The class Vehicle encapsulates the specific Vehicle characteristics, and data hiding is one
way to realize encapsulation.
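Both points can be tried out in a small, self-contained sketch. It follows the interfaces shown above, but the inline implementations and the member report are assumptions added for illustration only:

```cpp
#include <cstddef>
#include <string>

class Vehicle
{
    std::size_t d_mass = 0;         // private: hidden from Land as well

    public:
        std::size_t mass() const
        {
            return d_mass;
        }
        void setMass(std::size_t mass)
        {
            d_mass = mass;
        }
};

class Land: public Vehicle
{
    std::size_t d_speed;

    public:
        Land(std::size_t mass, std::size_t speed)
        :
            d_speed(speed)
        {
            setMass(mass);          // 'd_mass = mass' would not compile here
        }
        std::size_t speed() const
        {
            return d_speed;
        }
        std::string report() const  // hypothetical, for illustration only
        {
            // Vehicle's data can only be reached through Vehicle's members
            return "mass " + std::to_string(mass()) +
                   ", speed " + std::to_string(d_speed);
        }
};
```

After defining Land veh(1200, 145), veh.mass() returns 1200 although mass is not declared in Land, and veh.report() returns the text mass 1200, speed 145.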
Encapsulation is a core principle of good class design. Encapsulation reduces the dependencies
among classes improving the maintainability and testability of classes and allowing us to modify
classes without the need to modify depending code. By strictly complying with the principle of
data hiding a class’s internal data organization may change without requiring depending code to be
changed as well. E.g., a class Lines originally storing C-strings could at some point have its data
organization changed. It could abandon its char ** storage in favor of a vector<string> based
storage. When Lines uses perfect data hiding depending source code may use the new Lines class
without requiring any modification at all.
As a rule of thumb, derived classes must be fully recompiled (but don’t have to be modified) when
the data organization (i.e., the data members) of their base classes change. Adding new member
functions to the base class doesn’t alter the data organization so no recompilation is needed when
new member functions are added.
There is one subtle exception to this rule of thumb: if a new member function is added to a base
class and that function happens to be declared as the first virtual member function of the base class
(cf. chapter 14 for a discussion of the virtual member function concept) then that also changes the
data organization of the base class.
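The effect of a first virtual member can be made visible by comparing object sizes. The following is only a sketch: the struct names are hypothetical, and the language does not prescribe how virtual members are implemented. Commonly used compilers store a hidden pointer in objects of classes having virtual members, which is why the comparison is expected (not guaranteed) to hold:

```cpp
#include <cstddef>

struct Plain                        // no virtual members
{
    std::size_t d_mass;
    std::size_t mass() const
    {
        return d_mass;
    }
};

struct WithVirtual                  // same data, one virtual member
{
    std::size_t d_mass;
    virtual std::size_t mass() const
    {
        return d_mass;
    }
};

// true if adding the virtual member enlarged the objects
inline bool virtualChangedLayout()
{
    return sizeof(WithVirtual) > sizeof(Plain);
}
```

If virtualChangedLayout returns true, code compiled against the Plain layout cannot be combined with code compiled against the WithVirtual layout without recompilation.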
Now that Land has been derived from Vehicle we’re ready for our next class derivation. We’ll
define a class Car to represent automobiles. Agreeing that a Car object is a Land vehicle, and that
a Car has a brand name it’s easy to design the class Car:
class Car: public Land
{
std::string d_brandName;
public:
Car();
Car(size_t mass, size_t speed, std::string const &name);
std::string const &brandName() const;
};
In the above class definition, Car was derived from Land, which in turn is derived from Vehicle.
This is called nested derivation: Land is called Car’s direct base class, while Vehicle is called Car’s
indirect base class.
13.1.1 Inheritance depth: desirable?
Now that Car has been derived from Land and Land has been derived from Vehicle we might easily
be seduced into thinking that these class hierarchies are the way to go when designing classes. But
maybe we should temper our enthusiasm.
Repeatedly deriving classes from classes quickly results in big, complex class hierarchies that are
hard to understand, hard to use and hard to maintain. Hard to understand and use as users of our
derived class now also have to learn all its (indirect) base class features as well. Hard to maintain
because all those classes are very closely coupled. While it may be true that when data hiding is
meticulously adhered to derived classes do not have to be modified when their base classes alter
their data organization, it also quickly becomes practically infeasible to change those base classes
once more and more (derived) classes depend on their current organization.
What initially looks like a big gain, inheriting the base class’s interface, thus becomes a liability.
The base class’s interface is hardly ever completely required and in the end a class may benefit from
explicitly defining its own member functions rather than obtaining them through inheritance.
Often classes can be defined in-terms-of existing classes: some of their features are used, but others
need to be shielded off. Consider the stack container: it is commonly implemented in-terms-of a
deque, returning deque::back’s value as stack::top’s value.
When using inheritance to implement an is-a relationship make sure to get the ‘direction of use’
right: inheritance aiming at implementing an is-a relationship should focus on the base class: the
base class facilities aren’t there to be used by the derived class, but the derived class facilities should
redefine (reimplement) the base class facilities using polymorphism (which is the topic of the next
chapter), allowing code to use the derived class facilities polymorphically through the base class.
We’ve seen this approach when studying streams: the base class (e.g., ostream) is used time and
again. The facilities defined by classes derived from ostream (like ofstream and ostringstream)
are then used by code only relying on the facilities offered by the ostream class, never using the
derived classes directly.
When designing classes always aim at the lowest possible coupling. Big class hierarchies usually
indicate poor understanding of robust class design. When a class’s interface is only partially used
and if the derived class is implemented in terms of another class consider using composition rather
than inheritance and define the appropriate interface members in terms of the members offered by
the composed objects.
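The in-terms-of approach mentioned for the stack container could look as follows. This is a minimal sketch using a hypothetical class IntStack; it is not how std::stack itself is written. Only the members that make sense for a stack are offered, the rest of the deque's interface stays shielded:

```cpp
#include <cstddef>
#include <deque>

class IntStack
{
    std::deque<int> d_data;         // composed object, not a base class

    public:
        void push(int value)
        {
            d_data.push_back(value);
        }
        void pop()
        {
            d_data.pop_back();
        }
        int const &top() const      // deque::back's value is the stack's top
        {
            return d_data.back();
        }
        bool empty() const
        {
            return d_data.empty();
        }
        std::size_t size() const
        {
            return d_data.size();
        }
};
```

Users of IntStack cannot reach, e.g., deque::push_front, so the stack's behavior cannot be broken from the outside.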
13.2 Access rights: public, private, protected
Early in the C++ Annotations (cf. section 3.2.1) we encountered two important design principles
when developing classes: data hiding and encapsulation. Data hiding restricts control over an
object’s data to the members of its class, encapsulating is used to restrict access to the functionality
of objects. Both principles are invaluable tools for maintaining data integrity.
The keyword private starts sections in class interfaces in which members are declared which can
only be accessed by members of the class itself. This is our main tool for realizing data hiding.
According to established good practices of class design the public sections are populated with member
functions offering a clean interface to the class’s functionality. These members allow users to
communicate with objects; leaving it to the objects how requests sent to objects are handled. In a
well-designed class its objects are in full control of their data.
Inheritance doesn’t change these principles, nor does it change the way the private and protected
keywords operate. A derived class does not have access to a base class’s private section.
Sometimes this is a bit too restrictive. Consider a class implementing a random number generating
streambuf (cf. chapter 6). Such a streambuf can be used to construct an istream irand, after
which extractions from irand produce the next random number, as in the next example in which
10 random numbers are generated using stream I/O:
RandBuf buffer;
istream irand(&buffer);
for (size_t idx = 0; idx != 10; ++idx)
{
    size_t next;
    irand >> next;
    cout << "next random number: " << next << '\n';
}
The question is, how many random numbers should irand be able to generate? Fortunately, there’s
no need to answer this question, as RandBuf can be made responsible for generating the next random
number. RandBuf, therefore, operates as follows:
• It generates a random number;
• The number is passed in textual form to its base class streambuf;
• The istream object extracts this random number, merely using streambuf’s interface;
• This process is repeated for subsequent random numbers.
Once RandBuf has stored the text representation of the next random number in some buffer, it must
tell its base class (streambuf) where to find the random number’s characters. For this streambuf
offers a member setg, expecting pointers to the begin, the current read position and the end of the
buffer holding the random number’s characters.
The member setg clearly cannot be declared in streambuf’s private section, as RandBuf must use
it to prepare for the extraction of the next random number. But it should also not be in streambuf’s
public section, as that could easily result in unexpected behavior by irand. Consider the following
hypothetical example:
RandBuf randBuf;
istream irand(&randBuf);
char buffer[] = "12";
randBuf.setg(buffer, ...); // setg public: buffer now contains 12
size_t next;
irand >> next; // not a *random* value, but 12.
Clearly there is a close connection between streambuf and its derived class RandBuf. By allowing
RandBuf to specify the buffer from which streambuf reads characters RandBuf remains in control,
denying other parts of the program to break its well-defined behavior.
This close connection between base- and derived-classes is realized by a third keyword related to
the accessibility of class members: protected. Here is how the member setg could have been
declared in a class streambuf:
class streambuf
{
// private data here (as usual)
protected:
void setg(... parameters ...); // available to derived classes
public:
// public members here
};
Protected members are members that can be accessed by derived classes, but are not part of a class’s
public interface.
Avoid the temptation to declare data members in a class’s protected section: it’s a sure sign of bad
class design as it needlessly results in tight coupling of base and derived classes. The principle of
data hiding should not be abandoned now that the keyword protected has been introduced. If a
derived class (but not other parts of the software) should be given access to its base class’s data,
use member functions: accessors and modifiers declared in the base class’s protected section. This
enforces the intended restricted access without resulting in tightly coupled classes.
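To make the role of the protected member setg concrete, here is a possible sketch of a RandBuf. It is not the implementation used elsewhere in the Annotations: the std::mt19937 generator, the fixed default seed, the limit of 1000 and the helper function draw are all assumptions made for this illustration:

```cpp
#include <cstddef>
#include <istream>
#include <random>
#include <streambuf>
#include <string>
#include <vector>

class RandBuf: public std::streambuf
{
    std::mt19937 d_engine;          // assumption: any generator would do
    std::string d_text;             // characters of the current number

    public:
        explicit RandBuf(unsigned seed = 1)
        :
            d_engine(seed)
        {}

    private:
        int_type underflow() override   // called when no characters are left
        {
            // the trailing blank separates this number from the next one
            d_text = std::to_string(d_engine() % 1000) + ' ';
            char *begin = &d_text[0];
            setg(begin, begin, begin + d_text.size());  // protected member
            return traits_type::to_int_type(*gptr());
        }
};

// extracts count numbers using nothing but the istream interface
inline std::vector<std::size_t> draw(std::size_t count)
{
    RandBuf buffer;
    std::istream irand(&buffer);
    std::vector<std::size_t> values;

    for (std::size_t idx = 0; idx != count; ++idx)
    {
        std::size_t next;
        if (irand >> next)
            values.push_back(next);
    }
    return values;
}
```

Code using irand cannot call setg, as it is protected in std::streambuf. Only RandBuf decides which characters the stream reads next.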
13.2.1 Public, protected and private derivation
With inheritance public derivation is frequently used. When public derivation is used the access
rights of the base class’s interface remain unaltered in the derived class. But the type of inheritance
may also be defined as private or protected.
Protected derivation is used when the keyword protected is put in front of the derived class’s base
class:
class Derived: protected Base
When protected derivation is used all the base class’s public and protected members become protected
members in the derived class. The derived class may access all the base class’s public and
protected members. Classes that are in turn derived from the derived class view the base class’s
members as protected. Any other code (outside of the inheritance tree) is unable to access the base
class’s members.
Private derivation is used when the keyword private is put in front of the derived class’s base
class:
class Derived: private Base
When private derivation is used all the base class’s members turn into private members in the
derived class. The derived class members may access all base class public and protected members
but base class members cannot be used elsewhere.
Public derivation should be used to define an is-a relationship between a derived class and a base
class: the derived class object is-a base class object allowing the derived class object to be used polymorphically
as a base class object in code expecting a base class object. Private inheritance is used
in situations where a derived class object is defined in-terms-of the base class where composition
cannot be used. There’s little documented use for protected inheritance, but one could maybe encounter
protected inheritance when defining a base class that is itself a derived class and needs to
make its base class members available to classes derived from itself.
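The three derivation types can be compared in a compact sketch. All class and function names below are hypothetical. Statements that would be rejected by the compiler are shown as comments:

```cpp
class Base
{
    public:
        int value() const
        {
            return 42;
        }
};

class PublicDerived: public Base        // is-a Base
{};

class ProtectedDerived: protected Base  // value becomes protected
{};

class PrivateDerived: private Base      // value becomes private
{
    public:
        int twice() const
        {
            return 2 * value();         // OK: used by the derived class itself
        }
};

class Further: public ProtectedDerived
{
    public:
        int plusOne() const
        {
            return value() + 1;         // OK: protected in ProtectedDerived
        }
};

inline int useAsBase(Base const &base)  // expects a Base object
{
    return base.value();
}

// PrivateDerived object;
// object.value();          // error: value is private in PrivateDerived
// useAsBase(object);       // error: Base is an inaccessible base class
```

Only a PublicDerived object can be passed to useAsBase, which is what the is-a relationship amounts to in code.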
Combinations of inheritance types do occur. For example, when designing a stream-class it is usually
derived from std::istream or std::ostream. However, before a stream can be constructed, a
std::streambuf must be available. Taking advantage of the fact that the inheritance order is
defined in the class interface, we use multiple inheritance (see section 13.6) to derive the class from
both std::streambuf and (then) from std::ostream. To the class’s users it is a std::ostream
and not a std::streambuf. So private derivation is used for std::streambuf, and public derivation
for std::ostream:
class Derived: private std::streambuf, public std::ostream
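Here is a sketch of such a combination, using a hypothetical class UpperStream that converts inserted characters to upper case and collects them in a string. The class, its member text and the helper function shout are assumptions made for this illustration. As both base classes define names like int_type and traits_type, the sketch avoids using these names unqualified:

```cpp
#include <cctype>
#include <ostream>
#include <streambuf>
#include <string>

class UpperStream: private std::streambuf, public std::ostream
{
    std::string d_text;             // collects the converted characters

    public:
        UpperStream()
        :
            std::ostream(this)      // the streambuf base was constructed first
        {}
        std::string const &text() const
        {
            return d_text;
        }

    private:
        // called by streambuf for every character, as no buffer is installed
        int overflow(int ch) override
        {
            if (ch != std::char_traits<char>::eof())
                d_text += static_cast<char>(
                            std::toupper(static_cast<unsigned char>(ch)));
            return std::char_traits<char>::not_eof(ch);
        }
};

// inserts text and a number, returns what the stream collected
inline std::string shout(std::string const &text)
{
    UpperStream out;
    out << text << ' ' << 12;
    return out.text();
}
```

Calling shout("abc") returns ABC 12. Users can insert into an UpperStream like into any std::ostream, but the std::streambuf members are not available to them.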
13.2.2 Promoting access rights
When private or protected derivation is used, users of derived class objects are denied access to
the base class members. Private derivation denies access to all base class members to users of the
derived class, protected derivation does the same, but allows classes that are in turn derived from
the derived class to access the base class’s public and protected members.
In some situations this scheme is too restrictive. Consider a class RandStream derived privately
from a class RandBuf which is itself derived from std::streambuf and also publicly from istream:
class RandBuf: public std::streambuf
{
// implements a buffer for random numbers
};
class RandStream: private RandBuf, public std::istream
{
// implements a stream to extract random values from
};
Such a class could be used to extract, e.g., random numbers using the standard istream interface.
Although the RandStream class is constructed with the functionality of istream objects in mind,
some of the members of the class std::streambuf may be considered useful by themselves. E.g.,
the function streambuf::in_avail returns a lower bound to the number of characters that can be
read immediately. The standard way to make this function available is to define a shadow member
calling the base class’s member:
class RandStream: private RandBuf, public std::istream
{
// implements a stream to extract random values from
public:
std::streamsize in_avail();
};
inline std::streamsize RandStream::in_avail()
{
return std::streambuf::in_avail();
}
This looks like a lot of work for just making available a member from the protected or private base
classes. If the intent is to make available the in_avail member access promotion can be used.
Access promotion allows us to specify which members of private (or protected) base classes become
available in the protected (or public) interface of the derived class. Here is the above example, now
using access promotion:
class RandStream: private RandBuf, public std::istream
{
// implements a stream to extract random values from
public:
using std::streambuf::in_avail;
};
It should be noted that access promotion makes available all overloaded versions of the declared base
class member. So, if streambuf would offer not only in_avail but also, e.g., in_avail(size_t *)
both members would become part of the public interface.
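That a single using declaration promotes all overloads can be verified with a small sketch. The classes Base and Derived and their members are hypothetical:

```cpp
#include <cstddef>
#include <string>

class Base
{
    public:
        std::string show() const
        {
            return "show()";
        }
        std::string show(std::size_t count) const
        {
            return "show(" + std::to_string(count) + ")";
        }
        std::string hidden() const
        {
            return "hidden()";
        }
};

class Derived: private Base
{
    public:
        using Base::show;           // promotes both overloads of show
};

// Derived object;
// object.hidden();                 // error: hidden was not promoted
```

For a Derived object both object.show() and object.show(3) compile, while the other members of Base remain inaccessible.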
13.3 The constructor of a derived class
A derived class inherits functionality from its base class (or base classes, as C++ supports multiple
inheritance, cf. section 13.6). When a derived class object is constructed it is built on top of its base
class object. As a consequence the base class must have been constructed before the actual derived
class elements can be initialized. This results in some requirements that must be observed when
defining derived class constructors.
A constructor exists to initialize the object’s data members. A derived class constructor is also
responsible for the proper initialization of its base class. Looking at the definition of the class Land
introduced earlier (section 13.1), its constructor could simply be defined as follows:
Land::Land(size_t mass, size_t speed)
{
setMass(mass);
setspeed(speed);
}
However, this implementation has several disadvantages.
• When constructing a derived class object a base class constructor is always called before any
action is performed on the derived class object itself. By default the base class’s default constructor
is going to be called.
• Using the base class constructor only to reassign new values to its data members in the derived
class constructor’s body usually is inefficient, and sometimes simply impossible, as in situations
where base class reference or const data members must be initialized. In those cases a specialized
base class constructor must be used instead of the base class default constructor.
A derived class’s base class may be initialized using a dedicated base class constructor by calling
the base class constructor in the derived class constructor’s initializer clause. Calling a base class
constructor in a constructor’s initializer clause is called a base class initializer. The base class initializer
must be called before initializing any of the derived class’s data members and when using
the base class initializer none of the derived class data members may be used. When constructing a
derived class object the base class is constructed first and only after that construction has successfully
completed the derived class data members are available for initialization. Land’s constructor
may therefore be improved:
Land::Land(size_t mass, size_t speed)
:
Vehicle(mass),
d_speed(speed)
{}
Derived class constructors always by default call their base class’s default constructor. This is of
course not correct for a derived class’s copy constructor. Assuming that the class Land must be
provided with a copy constructor it may use the Land const &other to represent the other’s base
class:
Land::Land(Land const &other)   // assume a copy constructor is needed
:
    Vehicle(other),             // copy-construct the base class part.
    d_speed(other.d_speed)      // copy-construct Land's data members
{}
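The next sketch combines both constructors in a complete example. To show a situation in which the base class initializer cannot be omitted, this variant of Vehicle (an assumption made for the illustration) has a const data member and no default constructor:

```cpp
#include <cstddef>

class Vehicle
{
    std::size_t const d_mass;       // const: can only be initialized

    public:
        explicit Vehicle(std::size_t mass)  // no default constructor
        :
            d_mass(mass)
        {}
        std::size_t mass() const
        {
            return d_mass;
        }
};

class Land: public Vehicle
{
    std::size_t d_speed;

    public:
        Land(std::size_t mass, std::size_t speed)
        :
            Vehicle(mass),          // base class initializer: required here
            d_speed(speed)
        {}
        Land(Land const &other)
        :
            Vehicle(other),         // other is-a Vehicle: copies its base part
            d_speed(other.d_speed)
        {}
        std::size_t speed() const
        {
            return d_speed;
        }
};
```

Omitting Vehicle(mass) from the first constructor results in a compilation error, as this Vehicle cannot be default constructed and its mass cannot be assigned afterwards.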
13.3.1 Move construction
As with classes using composition derived classes may benefit from defining a move constructor. A
derived class may offer a move constructor for two reasons:
• it supports move construction for its data members
• its base class is move-aware
The design of move constructors moving data members was covered in section 9.7. A move constructor
for a derived class whose base class is move-aware must anonymize the rvalue reference before
passing it to the base class move constructor. The std::move function should be used when implementing
the move constructor to move the information in base classes or composed objects to their
new destination object.
The first example shows the move constructor for the class Car, assuming it has a movable char
*d_brandName data member and assuming that Land is a move-aware class. The second example
shows the move constructor for the class Land, assuming that it does not itself have movable data
members, but that its Vehicle base class is move-aware:
Car::Car(Car &&tmp)
:
    Land(std::move(tmp)),           // anonymize 'tmp'
    d_brandName(tmp.d_brandName)    // move the char *'s value
{
    tmp.d_brandName = 0;
}
Land::Land(Land &&tmp)
:
    Vehicle(std::move(tmp)),        // move-aware Vehicle
    d_speed(tmp.d_speed)            // plain copying of plain data
{}
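When the brand name is stored in a std::string, as in the Car interface of section 13.1, the data member itself is move-aware and no pointer has to be reset. The following sketch shows that variant. To keep it short Land stores the mass itself (the Vehicle level is left out), which is a simplification made for this illustration:

```cpp
#include <cstddef>
#include <string>
#include <utility>

class Land                          // simplified: no Vehicle base class
{
    std::size_t d_mass;
    std::size_t d_speed;

    public:
        Land(std::size_t mass, std::size_t speed)
        :
            d_mass(mass),
            d_speed(speed)
        {}
        std::size_t mass() const
        {
            return d_mass;
        }
        std::size_t speed() const
        {
            return d_speed;
        }
};

class Car: public Land
{
    std::string d_brandName;

    public:
        Car(std::size_t mass, std::size_t speed, std::string const &name)
        :
            Land(mass, speed),
            d_brandName(name)
        {}
        Car(Car &&tmp)
        :
            // tmp has a name, so it is an lvalue here: std::move
            // turns it into an anonymous rvalue reference again
            Land(std::move(tmp)),
            d_brandName(std::move(tmp.d_brandName))
        {}
        std::string const &brandName() const
        {
            return d_brandName;
        }
};
```

Land's move constructor only handles the Land part of tmp, so tmp.d_brandName is still intact when it is moved in the next initializer.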
13.3.2 Move assignment
Derived classes may also benefit from move assignment operations. If the derived class and its base
class support swapping then the implementation is simple, following the standard shown earlier in
section 9.7.3. For the class Car this could boil down to:
Car &Car::operator=(Car &&tmp)
{
swap(tmp);
return *this;
}
If swapping is not supported then std::move can be used to call the base class’s move assignment
operator:
Car &Car::operator=(Car &&tmp)
{
static_cast<Land &>(*this) = std::move(tmp);
// move Car’s own data members next
return *this;
}
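A complete sketch of the swap-based variant is shown next. The swap members and the simplified Land class (storing only a speed) are assumptions made for this illustration:

```cpp
#include <cstddef>
#include <string>
#include <utility>

class Land                          // simplified for this sketch
{
    std::size_t d_speed;

    public:
        explicit Land(std::size_t speed = 0)
        :
            d_speed(speed)
        {}
        void swap(Land &other)
        {
            std::swap(d_speed, other.d_speed);
        }
        std::size_t speed() const
        {
            return d_speed;
        }
};

class Car: public Land
{
    std::string d_brandName;

    public:
        Car(std::size_t speed, std::string const &name)
        :
            Land(speed),
            d_brandName(name)
        {}
        void swap(Car &other)
        {
            Land::swap(other);      // first swap the base class parts
            d_brandName.swap(other.d_brandName);
        }
        Car &operator=(Car &&tmp)
        {
            swap(tmp);              // tmp receives (and later destroys)
            return *this;           // the current object's old content
        }
        std::string const &brandName() const
        {
            return d_brandName;
        }
};
```

After Car car(100, "A"); car = Car(200, "B"); the object car has speed 200 and brand name B. The temporary object takes the old values with it when it is destroyed.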
13.3.3 Inheriting constructors
Derived classes can be constructed without explicitly defining derived class constructors. In those
cases the available base class constructors are called.
This feature is either used or not. It is not possible to omit some of the derived class constructors,
using the corresponding base class constructors instead. To use this feature for classes that are derived
from multiple base classes (cf. section 13.6) all the base class constructors must have different
signatures. Considering the complexities that are involved here it’s probably best to avoid using
base class constructors for classes using multiple inheritance.
The construction of derived class objects can be delegated to base class constructor(s) using the
following syntax:
class BaseClass
{
public:
// BaseClass constructor(s)
};
class DerivedClass: public BaseClass
{
public:
using BaseClass::BaseClass; // No DerivedClass constructors
// are defined
};
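A filled-in version of this skeleton may look as follows. The data members and the member label are hypothetical and were added only to make the effect observable:

```cpp
#include <cstddef>
#include <string>

class BaseClass
{
    std::string d_name;
    std::size_t d_count = 0;

    public:
        BaseClass() = default;
        explicit BaseClass(std::string const &name)
        :
            d_name(name)
        {}
        BaseClass(std::string const &name, std::size_t count)
        :
            d_name(name),
            d_count(count)
        {}
        std::string label() const
        {
            return d_name + '#' + std::to_string(d_count);
        }
};

class DerivedClass: public BaseClass
{
    public:
        using BaseClass::BaseClass; // No DerivedClass constructors
                                    // are defined
};
```

DerivedClass objects can now be constructed using one or two arguments, e.g., DerivedClass two("wheel", 4), whose label member returns wheel#4. The default constructor is not inherited but, as usual, implicitly provided by the compiler.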
13.4 The destructor of a derived class
Destructors of classes are automatically called when an object is destroyed. This also holds true for
objects of classes derived from other classes. Assume we have the following situation:
class Base
{
public:
~Base();
};
class Derived: public Base
{
public:
~Derived();
};
int main()
{
Derived derived;
}
At the end of main, the derived object ceases to exist. Hence, its destructor (~Derived) is called.
However, since derived is also a Base object, the ~Base destructor is called as well. The base class
destructor is never explicitly called from the derived class destructor.
Constructors and destructors are called in a stack-like fashion: when derived is constructed, the
appropriate base class constructor is called first, then the appropriate derived class constructor is
called. When the object derived is destroyed, its destructor is called first, automatically followed
by the activation of the Base class destructor. A derived class destructor is always called before its
base class destructor is called.
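The stack-like order can be observed by letting the constructors and destructors append their names to a string. The functions trace and lifeCycle are hypothetical helpers added for this sketch:

```cpp
#include <string>

// the string collecting the names of the called members
inline std::string &trace()
{
    static std::string s_trace;
    return s_trace;
}

struct Base
{
    Base()
    {
        trace() += "Base ";
    }
    ~Base()
    {
        trace() += "~Base ";
    }
};

struct Derived: public Base
{
    Derived()
    {
        trace() += "Derived ";
    }
    ~Derived()
    {
        trace() += "~Derived ";
    }
};

// constructs and destroys one Derived object, returns the recorded calls
inline std::string lifeCycle()
{
    trace().clear();
    {
        Derived derived;
    }                               // derived is destroyed here
    return trace();
}
```

The function lifeCycle returns Base Derived ~Derived ~Base (followed by a blank): construction proceeds from the base class to the derived class, destruction proceeds in the opposite direction.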
When the construction of a derived class object did not successfully complete (i.e., the constructor
threw an exception) then its destructor is not called. However, the destructors of properly constructed
base classes will be called if a derived class constructor throws an exception. This, of course,
is as it should be: a properly constructed object should also be destroyed, eventually. Example:
#include <iostream>
struct Base
{
~Base()
{
std::cout << "Base destructor\n";
}
};
struct Derived: public Base
{
Derived()
{
throw 1; // at this time Base has been constructed
}
};
int main()
{
try
{
Derived d;
}
catch(...)
{}
}
/*
This program displays ‘Base destructor’
*/
13.5 Redefining member functions
Derived classes may redefine base class members. Let’s assume that a vehicle classification system
must also cover trucks, consisting of two parts: the front part, the tractor, pulls the rear part, the
trailer. Both the tractor and the trailer have their own mass, and the mass function should return
the combined mass.
The definition of a Truck starts with a class definition. Our initial Truck class is derived from Car
but it is then expanded to hold one more size_t field representing the additional mass information.
Here we choose to represent the mass of the tractor in the Car class and to store the mass of the full
truck (tractor + trailer) in its own d_mass data member:
class Truck: public Car
{
size_t d_mass;
public:
Truck();
Truck(size_t tractor_mass, size_t speed, char const *name,
size_t trailer_mass);
void setMass(size_t tractor_mass, size_t trailer_mass);
size_t mass() const;
};
Truck::Truck(size_t tractor_mass, size_t speed, char const *name,
             size_t trailer_mass)
:
    Car(tractor_mass, speed, name)
{
    d_mass = tractor_mass + trailer_mass;   // mass of the full truck
}
Note that the class Truck now contains two functions already present in the base class Car:
setMass and mass.
• The redefinition of setMass poses no problems: this function is simply redefined to perform
actions which are specific to a Truck object.
• Redefining setMass, however, hides Car::setMass. For a Truck only the setMass function
having two size_t arguments can be used.
• The Vehicle’s setMass function remains available for a Truck, but it must now be called
explicitly, as Car::setMass is hidden from view. This latter function is hidden, even though
Car::setMass has only one size_t argument. To implement Truck::setMass we could
write:
void Truck::setMass(size_t tractor_mass, size_t trailer_mass)
{
d_mass = tractor_mass + trailer_mass;
Car::setMass(tractor_mass); // note: Car:: is required
}
• Outside of the class Car::setMass is accessed using the scope resolution operator. So, if a
Truck truck needs to set its Car mass, it must use
truck.Car::setMass(x);
• An alternative to using the scope resolution operator is to add a member having the same
function prototype as the base class member to the derived class’s interface. This derived class
member could be implemented inline to call the base class member. E.g., we add the following
member to the class Truck:
// in the interface:
void setMass(size_t tractor_mass);
// below the interface:
inline void Truck::setMass(size_t tractor_mass)
{
    // replace the old tractor mass by the new one in the full mass
    (d_mass -= Car::mass()) += tractor_mass;
    Car::setMass(tractor_mass);
}
Now the single argument setMass member function can be used by Truck objects without using
the scope resolution operator. As the function is defined inline, no overhead of an additional
function call is involved.
• To prevent hiding the base class members a using declaration may be added to the derived
class interface. The relevant section of Truck’s class interface then becomes:
class Truck: public Car
{
public:
using Car::setMass;
void setMass(size_t tractor_mass, size_t trailer_mass);
};
A using declaration imports (all overloaded versions of) the mentioned member function directly
into the derived class’s interface. If a base class member has a signature that is identical
to a derived class member then compilation fails (a using Car::mass declaration cannot
be added to Truck’s interface). Now code may use truck.setMass(5000) as well as
truck.setMass(5000, 2000).
Using declarations obey access rights. To prevent non-class members from using
setMass(5000) without a scope resolution operator but allowing derived class members to
do so the using Car::setMass declaration should be put in the class Truck’s private section.
• The function mass is also already defined in Car, as it was inherited from Vehicle. In this
case, the class Truck redefines this member function to return the truck’s full mass:
size_t Truck::mass() const
{
return d_mass;
}
Example:
int main()
{
    Land vehicle(1200, 145);
    Truck lorry(3000, 120, "Juggernaut", 2500);
    lorry.Vehicle::setMass(4000);
    cout << '\n' << "Tractor weighs " <<
            lorry.Vehicle::mass() << '\n' <<
            "Truck + trailer weighs " << lorry.mass() << '\n' <<
            "Speed is " << lorry.speed() << '\n' <<
            "Name is " << lorry.brandName() << '\n';
}
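The effect of hiding and of the using declaration can be checked with a reduced, self-contained sketch. Here Car is a stand-alone class (its own base classes are left out), which is a simplification made for this illustration:

```cpp
#include <cstddef>

class Car                           // simplified: no Land or Vehicle base
{
    std::size_t d_mass = 0;

    public:
        void setMass(std::size_t mass)
        {
            d_mass = mass;
        }
        std::size_t mass() const
        {
            return d_mass;
        }
};

class Truck: public Car
{
    std::size_t d_mass = 0;         // full mass: tractor + trailer

    public:
        using Car::setMass;         // keeps setMass(size_t) visible
        void setMass(std::size_t tractor_mass, std::size_t trailer_mass)
        {
            d_mass = tractor_mass + trailer_mass;
            Car::setMass(tractor_mass);     // note: Car:: is required
        }
        std::size_t mass() const    // hides Car::mass
        {
            return d_mass;
        }
};
```

Without the using declaration a call like truck.setMass(4000) does not compile. With it the call compiles, but note that it only changes the tractor's mass as stored in Car: the full mass returned by truck.mass() is not updated, which is a reason to prefer the dedicated one-argument member shown earlier.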
The class Truck was derived from Car. However, one might question this class design. Since a truck
is conceived of as a combination of a tractor and a trailer it is probably better defined using a mixed
design, using inheritance for the tractor part (inheriting from Car), and composition for the trailer
part.
This redesign changes our point of view from a Truck being a Car (and some strangely added data
members) to a Truck still being a Car (the tractor) and containing a Vehicle (the trailer).
Truck’s interface is now very specific, not requiring users to study Car’s and Vehicle’s interfaces
and it opens up possibilities for defining ‘road trains’: tractors towing multiple trailers. Here is an
example of such an alternate class setup:
class Truck: public Car // the tractor
{
Vehicle d_trailer; // use vector<Vehicle> for road trains
public:
Truck();
Truck(size_t tractor_mass, size_t speed, char const *name,
size_t trailer_mass);
void setMass(size_t tractor_mass, size_t trailer_mass);
void setTractorMass(size_t tractor_mass);
void setTrailerMass(size_t trailer_mass);
size_t tractorMass() const;
size_t trailerMass() const;
// consider:
Vehicle const &trailer() const;
};
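A minimal sketch of how such a mixed design could be implemented is shown next. The Vehicle and Car classes are reduced, hypothetical versions, and only some of the members of the above interface are implemented; the member mass and the function combinedMass were added for the illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical minimal versions of the chapter's Vehicle and Car classes.
class Vehicle
{
    std::size_t d_mass;

    public:
        explicit Vehicle(std::size_t mass = 0) : d_mass(mass) {}
        std::size_t mass() const { return d_mass; }
        void setMass(std::size_t mass) { d_mass = mass; }
};

class Car: public Vehicle
{
    std::size_t d_speed;
    std::string d_name;

    public:
        Car(std::size_t mass, std::size_t speed, char const *name)
        :
            Vehicle(mass), d_speed(speed), d_name(name)
        {}
        std::size_t speed() const { return d_speed; }
        std::string const &name() const { return d_name; }
};

class Truck: public Car             // the tractor: a Truck is-a Car
{
    Vehicle d_trailer;              // the trailer: a Truck has-a Vehicle

    public:
        Truck(std::size_t tractor_mass, std::size_t speed, char const *name,
              std::size_t trailer_mass)
        :
            Car(tractor_mass, speed, name), d_trailer(trailer_mass)
        {}
        std::size_t tractorMass() const { return Car::mass(); }
        std::size_t trailerMass() const { return d_trailer.mass(); }
        std::size_t mass() const { return tractorMass() + trailerMass(); }
};

// Returns the combined mass of a tractor (3000) and its trailer (2500).
std::size_t combinedMass()
{
    Truck lorry(3000, 120, "Juggernaut", 2500);
    return lorry.mass();
}
```

As the trailer is a data member, replacing it by a vector<Vehicle> only affects Truck's own members, not the classes it inherits from.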
13.6 Multiple inheritance
Up to now, a class has always been derived from a single base class. In addition to single inheritance
C++ also supports multiple inheritance. In multiple inheritance a class is derived from several base
classes and hence inherits functionality from multiple parent classes at the same time.
When using multiple inheritance it should be defensible to consider the newly derived class an instantiation
of both base classes. Otherwise, composition is more appropriate. In general, linear
derivation (using only one base class) is used much more frequently than multiple derivation. Good
class design dictates that a class should have a single, well described responsibility and that principle
often conflicts with multiple inheritance where we can state that objects of class Derived are
both Base1 and Base2 objects.
But then, consider the prototype of an object for which multiple inheritance was used to its extreme:
the Swiss army knife! This object is a knife, it is a pair of scissors, it is a can-opener, it is a corkscrew,
it is ....
The ‘Swiss army knife’ is an extreme example of multiple inheritance. In C++ there are some good
reasons for using multiple inheritance without violating the ‘one class, one responsibility’ principle; this is covered in the next chapter.
In this section the technical details of constructing classes using multiple inheritance are discussed.
How to construct a ‘Swiss army knife’ in C++? First we need (at least) two base classes. For example,
let’s assume we are designing a toolkit allowing us to construct an instrument panel of an aircraft’s
cockpit. We design all kinds of instruments, like an artificial horizon and an altimeter. One of the
components that is often seen in aircraft is a nav-com set: a combination of a navigational beacon
receiver (the ‘nav’ part) and a radio communication unit (the ‘com’-part). To define the nav-com set,
we start by designing the NavSet class (assume the existence of the classes Intercom, VHF_Dial
and Message):
class NavSet
{
public:
NavSet(Intercom &intercom, VHF_Dial &dial);
size_t activeFrequency() const;
size_t standByFrequency() const;
void setStandByFrequency(size_t freq);
size_t toggleActiveStandby();
void setVolume(size_t level);
void identEmphasis(bool on_off);
};
Next we design the class ComSet:
class ComSet
{
public:
ComSet(Intercom &intercom);
size_t frequency() const;
size_t passiveFrequency() const;
void setPassiveFrequency(size_t freq);
size_t toggleFrequencies();
void setAudioLevel(size_t level);
void powerOn(bool on_off);
void testState(bool on_off);
void transmit(Message &message);
};
Using objects of this class we can receive messages, transmitted through the Intercom, but we
can also transmit messages using a Message object that’s passed to the ComSet object using its
transmit member function.
Now we’re ready to construct our NavCom set:
class NavComSet: public ComSet, public NavSet
{
public:
NavComSet(Intercom &intercom, VHF_Dial &dial);
};
Done. Now we have defined a NavComSet which is both a NavSet and a ComSet: the facilities of
both base classes are now available in the derived class using multiple inheritance.
Please note the following:
• The keyword public is present before both base class names (NavSet and ComSet). By default
inheritance uses private derivation and the keyword public must be repeated before each of
the base class specifications. Base classes are not required to use the same derivation type. One
base class could have public derivation and another base class could use private derivation.
• The multiply derived class NavComSet introduces no additional functionality of its own, but
merely combines two existing classes into a new aggregate class. Thus, C++ offers the possibility
to simply sweep multiple simple classes into one more complex class.
• Here is the implementation of the NavComSet constructor:
NavComSet::NavComSet(Intercom &intercom, VHF_Dial &dial)
:
ComSet(intercom),
NavSet(intercom, dial)
{}
The constructor requires no extra code: Its purpose is to activate the constructors of its base
classes. The order in which the base class initializers are called is not dictated by their calling
order in the constructor’s code, but by the ordering of the base classes in the class interface.
• The NavComSet class definition requires no additional data members or member functions:
here (and often) the inherited interfaces provide all the required functionality and data for the
multiply derived class to operate properly.
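The remark about the order in which base class constructors are called can be verified with a small experiment. The classes ComPart, NavPart and Combined, the global g_log and the function constructionOrder are hypothetical names used for this sketch only. The initializers are deliberately listed in the 'wrong' order, so a compiler may warn about the reordering.

```cpp
#include <string>

std::string g_log;      // records the order in which constructors run

struct ComPart
{
    ComPart() { g_log += "Com "; }
};

struct NavPart
{
    NavPart() { g_log += "Nav "; }
};

struct Combined: public ComPart, public NavPart
{
    Combined()
    :
        NavPart(),      // listed first in the initializer list ...
        ComPart()       // ... but ComPart comes first in the class header
    {
        g_log += "Combined";
    }
};

// Constructs a Combined object and returns the recorded construction order.
std::string constructionOrder()
{
    g_log.clear();
    Combined object;
    return g_log;
}
```

The function returns "Com Nav Combined": the order of the base classes in the class header determines the construction order, not the order of the initializers.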
Of course, while defining the base classes, we made life easy on ourselves by strictly using different
member function names. So, there is a function setVolume in the NavSet class and a function
setAudioLevel in the ComSet class. A bit cheating, since we could expect that both units in fact
have a composed object Amplifier, handling the volume setting. A revised class might offer an
Amplifier &amplifier() const member function, and leave it to the application to set up its
own interface to the amplifier. Alternatively, a revised class could define members for setting the
volume of either the NavSet or the ComSet parts.
In situations where two base classes offer identically named members special provisions need to be
made to prevent ambiguity:
• The intended base class can explicitly be specified using the base class name and scope resolution
operator:
NavComSet navcom(intercom, dial);
navcom.NavSet::setVolume(5); // sets the NavSet volume level
navcom.ComSet::setVolume(5); // sets the ComSet volume level
• The class interface is provided with member functions that can be called unambiguously. These
additional members are usually defined inline:
class NavComSet: public ComSet, public NavSet
{
public:
NavComSet(Intercom &intercom, VHF_Dial &dial);
void comVolume(size_t volume);
void navVolume(size_t volume);
};
inline void NavComSet::comVolume(size_t volume)
{
ComSet::setVolume(volume);
}
inline void NavComSet::navVolume(size_t volume)
{
NavSet::setVolume(volume);
}
• If the NavComSet class is obtained from a third party, and cannot be modified, a disambiguating
wrapper class may be used:
class MyNavComSet: public NavComSet
{
public:
MyNavComSet(Intercom &intercom, VHF_Dial &dial);
void comVolume(size_t volume);
void navVolume(size_t volume);
};
inline MyNavComSet::MyNavComSet(Intercom &intercom, VHF_Dial &dial)
:
NavComSet(intercom, dial)
{}
inline void MyNavComSet::comVolume(size_t volume)
{
ComSet::setVolume(volume);
}
inline void MyNavComSet::navVolume(size_t volume)
{
NavSet::setVolume(volume);
}
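The first two provisions can be tried out using a self-contained sketch. The classes NavUnit, ComUnit and NavComUnit are simplified, hypothetical stand-ins for NavSet, ComSet and NavComSet; the member volume and the function volumeDifference were added so the effect can be inspected.

```cpp
#include <cstddef>

// Hypothetical base classes, both offering a member named setVolume.
class NavUnit
{
    std::size_t d_volume = 0;

    public:
        void setVolume(std::size_t level) { d_volume = level; }
        std::size_t volume() const { return d_volume; }
};

class ComUnit
{
    std::size_t d_volume = 0;

    public:
        void setVolume(std::size_t level) { d_volume = level; }
        std::size_t volume() const { return d_volume; }
};

class NavComUnit: public ComUnit, public NavUnit
{
    public:
        // forwarding members that can be called unambiguously
        void comVolume(std::size_t volume) { ComUnit::setVolume(volume); }
        void navVolume(std::size_t volume) { NavUnit::setVolume(volume); }
};

// Sets both volume levels and returns the difference between them.
std::size_t volumeDifference()
{
    NavComUnit unit;
    // unit.setVolume(5);           // would not compile: ambiguous
    unit.NavUnit::setVolume(5);     // explicit qualification
    unit.comVolume(2);              // forwarding member
    return unit.NavUnit::volume() - unit.ComUnit::volume();
}
```

Both amplifier settings live independently in the two base class parts, so the function returns 3.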
13.7 Conversions between base classes and derived classes
When public inheritance is used to define classes, an object of a derived class is at the same time an
object of the base class. This has important consequences for object assignment and for the situation
where pointers or references to such objects are used. Both situations are now discussed.
13.7.1 Conversions with object assignments
Continuing our discussion of the NavComSet class, introduced in section 13.6, we now define two objects,
a base class and a derived class object:
ComSet com(intercom);
NavComSet navcom(intercom2, dial2);
The object navcom is constructed using an Intercom and a VHF_Dial object. However, a
NavComSet is at the same time a ComSet, allowing the assignment from navcom (a derived class
object) to com (a base class object):
com = navcom;
The effect of this assignment is that the object com now communicates with intercom2. As a
ComSet does not have a VHF_Dial, the navcom’s dial is ignored by the assignment. When assigning
a base class object from a derived class object only the base class data members are assigned,
other data members are dropped, a phenomenon called slicing. In situations like these slicing probably
does not have serious consequences, but when passing derived class objects to functions defining
base class parameters or when returning derived class objects from functions returning base class
objects slicing also occurs and might have unwelcome side-effects.
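Slicing in assignments and in function calls can be illustrated as follows. Base, Derived, describe and slicingDemo are hypothetical names chosen for this sketch.

```cpp
#include <string>

class Base
{
    std::string d_label;

    public:
        explicit Base(std::string const &label) : d_label(label) {}
        std::string const &label() const { return d_label; }
};

class Derived: public Base
{
    std::string d_extra;

    public:
        Derived(std::string const &label, std::string const &extra)
        :
            Base(label), d_extra(extra)
        {}
        std::string const &extra() const { return d_extra; }
};

// The parameter is a Base object, not a reference: a Derived argument
// is sliced, and only its Base part is copied into 'object'.
std::string describe(Base object)
{
    return object.label();      // object.extra() is not available here
}

// Shows slicing with an assignment and with a by-value argument.
std::string slicingDemo()
{
    Derived derived("label", "extra");
    Base base("other");

    base = derived;             // only the Base part is assigned
    return base.label() + "/" + describe(derived);
}
```

In both cases the Derived object's d_extra member is silently dropped; the function returns "label/label".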
The assignment from a base class object to a derived class object is problematic. In a statement like
navcom = com;
it isn’t clear how to reassign the NavComSet’s VHF_Dial data member as it is missing in the
ComSet object com. Such an assignment is therefore refused by the compiler. Although derived class
objects are also base class objects, the reverse does not hold true: a base class object is not also a
derived class object.
The following general rule applies: in assignments in which base class objects and derived class
objects are involved, assignments in which data are dropped are legal (called slicing). Assignments
in which data remain unspecified are not allowed. Of course, it is possible to overload an assignment
operator to allow the assignment of a derived class object from a base class object. To compile the
statement
navcom = com;
the class NavComSet must have defined an overloaded assignment operator accepting a ComSet
object for its argument. In that case it’s up to the programmer to decide what the assignment
operator will do with the missing data.
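Here is a minimal sketch of such an overloaded assignment operator, using hypothetical classes Base and Derived instead of ComSet and NavComSet. In this sketch the decision was made to assign the base class part and to leave the derived class's own data unaltered; other choices are equally defensible.

```cpp
#include <cstddef>

class Base
{
    std::size_t d_frequency;

    public:
        explicit Base(std::size_t frequency = 0) : d_frequency(frequency) {}
        std::size_t frequency() const { return d_frequency; }
};

class Derived: public Base
{
    std::size_t d_dial;

    public:
        Derived(std::size_t frequency, std::size_t dial)
        :
            Base(frequency), d_dial(dial)
        {}

        // Assignment from a Base object: the Base part is copied,
        // the Derived-only data (d_dial) keeps its current value.
        Derived &operator=(Base const &other)
        {
            Base::operator=(other);
            return *this;
        }
        std::size_t dial() const { return d_dial; }
};

// Assigns a Base to a Derived and reports the resulting state.
bool assignFromBase()
{
    Base base(118);
    Derived derived(121, 7);

    derived = base;             // uses Derived::operator=(Base const &)
    return derived.frequency() == 118 && derived.dial() == 7;
}
```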
13.7.2 Conversions with pointer assignments
We return to our Vehicle classes, and define the following objects and pointer variable:
Land land(1200, 130);
Car car(500, 75, "Daf");
Truck truck(2600, 120, "Mercedes", 6000);
Vehicle *vp;
Now we can assign the addresses of the three objects of the derived classes to the Vehicle pointer:
vp = &land;
vp = &car;
vp = &truck;
Each of these assignments is acceptable. However, an implicit conversion of the derived class to the
base class Vehicle is used, since vp is defined as a pointer to a Vehicle. Hence, when using vp only
the member functions manipulating mass can be called as this is the Vehicle’s only functionality.
As far as the compiler can tell this is the object vp points to.
The same holds true for references to Vehicles. If, e.g., a function is defined having a Vehicle
reference parameter, the function may be passed an object of a class derived from Vehicle. Inside
the function, the specific Vehicle members remain accessible. This analogy between pointers and
references holds true in general. Remember that a reference is nothing but a pointer in disguise: it
mimics a plain variable, but actually it is a pointer.
This restricted functionality has an important consequence for the class Truck. Following vp =
&truck, vp points to a Truck object. So, vp->mass() returns 2600 instead of 8600 (the combined
mass of the cabin and of the trailer: 2600 + 6000), which would have been returned by
truck.mass().
When a function is called using a pointer to an object, then the type of the pointer (and not the type of
the object) determines which member functions are available and can be executed. In other words,
C++ implicitly converts the type of an object reached through a pointer to the pointer’s type.
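The difference between calling mass through a Vehicle pointer and calling it through the Truck object itself can be demonstrated with reduced classes. In this sketch Truck is, for brevity, derived directly from Vehicle, and the functions viaPointer and viaObject are hypothetical helpers.

```cpp
#include <cstddef>

class Vehicle
{
    std::size_t d_mass;

    public:
        explicit Vehicle(std::size_t mass) : d_mass(mass) {}
        std::size_t mass() const { return d_mass; }     // not virtual
};

class Truck: public Vehicle
{
    std::size_t d_trailer;

    public:
        Truck(std::size_t tractor_mass, std::size_t trailer_mass)
        :
            Vehicle(tractor_mass), d_trailer(trailer_mass)
        {}
        std::size_t mass() const { return Vehicle::mass() + d_trailer; }
};

// Static binding: the pointer's type selects Vehicle::mass.
std::size_t viaPointer(Vehicle const *vp)
{
    return vp->mass();
}

// The object's own type selects Truck::mass.
std::size_t viaObject(Truck const &truck)
{
    return truck.mass();
}
```

For a Truck truck(2600, 6000) the call viaPointer(&truck) returns 2600, whereas viaObject(truck) returns 8600.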
If the actual type of the object pointed to by a pointer is known, an explicit type cast can be used to
access the full set of member functions that are available for the object:
Truck truck;
Vehicle *vp;
vp = &truck; // vp now points to a truck object
Truck *trp;
trp = static_cast<Truck *>(vp);
cout << "Make: " << trp->name() << '\n';
Here, the second to last statement specifically casts a Vehicle * variable to a Truck *. As usual
(when using casts), this code is not without risk. It only works if vp really points to a Truck.
Otherwise the program may produce unexpected results.
13.8 Using non-default constructors with new[]
An often heard complaint is that operator new[] calls the default constructor of a class to initialize
the allocated objects. For example, to allocate an array of 10 strings we can do
new string[10];
but it is not possible to use another constructor. Assuming that we’d want to initialize the strings
with the text hello world, we can’t write something like:
new string("hello world")[10];
The initialization of a dynamically allocated object usually consists of a two-step process: first the
array is allocated (implicitly calling the default constructor); second the array’s elements are initialized,
as in the following little example:
string *sp = new string[10];
fill(sp, sp + 10, string("hello world"));
This approach suffers from ‘double initialization’, comparable to not using member initializers
in constructors.
One way to avoid double initialization is to use inheritance. Inheritance can profitably be used to
call non-default constructors in combination with operator new[]. The approach capitalizes on the
following:
• A base class pointer may point to a derived class object;
• A derived class without (non-static) data members has the same size as its base class.
The above also suggests a possible approach:
• Derive a simple, member-less class from the class we’re interested in;
• Use the appropriate base class initializer in its default constructor;
• Allocate the required number of derived class objects, and assign new[]’s return expression to
a pointer to base class objects.
Here is a simple example, producing 10 lines containing the text hello world:
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
struct Xstr: public string
{
Xstr()
:
string("hello world")
{}
};
int main()
{
string *sp = new Xstr[10];
copy(sp, sp + 10, ostream_iterator<string>(cout, "\n"));
}
Of course, the above example is fairly unsophisticated, but it’s easy to polish the example: the
class Xstr can be defined in an anonymous namespace, accessible only to a function getString()
which may be given a size_t nObjects parameter, allowing users to specify the number of hello
world-initialized strings they would like to allocate.
Instead of hard-coding the base class arguments it’s also possible to use variables or functions providing
the appropriate values for the base class constructor’s arguments. In the next example a local
class Xstr is defined inside a function nStrings(size_t nObjects, char const *fname), expecting
the number of string objects to allocate and the name of a file whose subsequent lines are
used to initialize the objects. The local class is invisible outside of the function nStrings, so no
special namespace safeguards are required.
As discussed in section 7.9, members of local classes cannot access local variables from their surrounding
function. However, they can access global and static data defined by the surrounding
function.
Using a local class neatly allows us to hide the implementation details within the function
nStrings, which simply opens the file, allocates the objects, and closes the file again. Since the
local class is derived from string, it can use any string constructor for its base class initializer. In
this particular case it calls the string(char const *) constructor, providing it with subsequent
lines of the just opened stream via its static member function nextLine(). This latter function is,
as it is a static member function, available to Xstr default constructor’s member initializers even
though no Xstr object is available by that time.
#include <fstream>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
string *nStrings(size_t size, char const *fname)
{
static ifstream in;
struct Xstr: public string
{
Xstr()
:
string(nextLine())
{}
static char const *nextLine()
{
static string line;
getline(in, line);
return line.c_str();
}
};
in.open(fname);
string *sp = new Xstr[size];
in.close();
return sp;
}
int main()
{
string *sp = nStrings(10, "nstrings.cc");
copy(sp, sp + 10, ostream_iterator<string>(cout, "\n"));
}
When this program is run, it displays the first 10 lines of the file nstrings.cc.
Note that the above implementation can’t safely be used in a multithreaded environment. In that
case a mutex should be used to protect the three statements just before the function’s return statement.
A completely different way to avoid the double initialization (not using inheritance) is to use placement
new (cf. section 9.1.5): simply allocate the required amount of memory followed by the proper
in-place construction of the objects, using the appropriate constructors. The following example can also
be used in multithreaded environments. The approach uses a pair of static construct/destroy
members to perform the required initialization.
In the program shown below construct expects an istream that provides the initialization strings
for objects of a class String simply containing a std::string object. Construct first allocates
enough memory for the n String objects plus room for an initial size_t value. This initial size_t
value is then initialized with n. Next, in a for statement, lines are read from the provided stream
and the lines are passed to the constructors, using placement new calls. Finally the address of the
first String object is returned.
The member destroy handles the destruction of the objects. It retrieves the number of objects
to destroy from the size_t it finds just before the location of the address of the first object to
destroy. The objects are then destroyed by explicitly calling their destructors. Finally the raw
memory, originally allocated by construct, is returned.
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
class String
{
union Ptrs
{
void *vp;
String *sp;
size_t *np;
};
std::string d_str;
public:
String(std::string const &txt)
:
d_str(txt)
{}
~String()
{
cout << "destructor: " << d_str << '\n';
}
static String *construct(istream &in, size_t n)
{
Ptrs p = {operator new(n * sizeof(String) + sizeof(size_t))};
*p.np++ = n;
string line;
for (size_t idx = 0; idx != n; ++idx)
{
getline(in, line);
new(p.sp + idx) String(line);
}
return p.sp;
}
static void destroy(String *sp)
{
Ptrs p = {sp};
--p.np;
for (size_t n = *p.np; n--; )
sp++->~String();
operator delete (p.vp);
}
};
int main()
{
String *sp = String::construct(cin, 5);
String::destroy(sp);
}
/*
After providing 5 lines containing, respectively
alfa, bravo, charlie, delta, echo
the program displays:
destructor: alfa
destructor: bravo
destructor: charlie
destructor: delta
destructor: echo
*/
Chapter 14
Polymorphism
Using inheritance classes may be derived from other classes, called base classes. In the previous
chapter we saw that base class pointers may be used to point to derived class objects. We also saw
that when a base class pointer points to an object of a derived class it is the type of the pointer
rather than the type of the object it points to that determines which member functions are visible.
So when a Vehicle *vp points to a Car object, Car’s speed or brandName members can’t be used.
In the previous chapter two fundamental ways classes may be related to each other were discussed:
a class may be implemented-in-terms-of another class and it can be stated that a derived class is-a
base class. The former relationship is usually implemented using composition, the latter is usually
implemented using a special form of inheritance, called polymorphism, the topic of this chapter.
An is-a relationship between classes allows us to apply the Liskov Substitution Principle (LSP)
according to which a derived class object may be passed to and used by code expecting a pointer
or reference to a base class object. In the C++ Annotations so far the LSP has been applied many
times. Every time an ostringstream, ofstream or fstream was passed to functions expecting
an ostream we’ve been applying this principle. In this chapter we’ll discover how to design our own
classes accordingly.
LSP is implemented using a technique called polymorphism: although a base class pointer is used it
performs actions defined in the (derived) class of the object it actually points to. So, a Vehicle *vp
might behave like a Car * when pointing to a Car¹.
Polymorphism is implemented using a feature called late binding. It’s called that way because the
decision which function to call (a base class function or a function of a derived class) cannot be made
at compile-time, but is postponed until the program is actually executed: only then it is determined
which member function will actually be called.
In C++ late binding is not the default way functions are called. By default static binding (or early
binding) is used. With static binding the functions that are called are determined by the compiler,
merely using the class types of objects, object pointers or object references.
Late binding is an inherently different (and slightly slower) process as it is decided at run-time,
rather than at compile-time what function is going to be called. As C++ supports both late- and
early-binding C++ programmers are offered an option as to what kind of binding to use. Choices
can be optimized to the situations at hand. Many other languages offering object oriented facilities
(e.g., Java) only or by default offer late binding. C++ programmers should be keenly aware of this.
Expecting early binding and getting late binding may easily produce nasty bugs.
¹ In one of the Star Trek movies, Capt. Kirk was in trouble, as usual. He met an extremely beautiful lady who, however,
later on changed into a hideous troll. Kirk was quite surprised, but the lady told him: “Didn’t you know I am a polymorph?”
Let’s look at a simple example to start appreciating the differences between late and early binding.
The example merely illustrates. Explanations of why things are as shown are provided shortly.
Consider the following little program:
#include <iostream>
using namespace std;
class Base
{
protected:
void hello()
{
cout << "base hello\n";
}
public:
void process()
{
hello();
}
};
class Derived: public Base
{
protected:
void hello()
{
cout << "derived hello\n";
}
};
int main()
{
Derived derived;
derived.process();
}
The important characteristic of the above program is the Base::process function, calling hello.
As process is the only member that is defined in the public interface it is the only member that can
be called by code not belonging to the two classes. The class Derived, derived from Base clearly
inherits Base’s interface and so process is also available in Derived. So the Derived object in
main is able to call process, but not hello.
So far, so good. Nothing new, all this was covered in the previous chapter. One may wonder why
Derived was defined at all. It was presumably defined to create an implementation of hello that’s
appropriate for Derived but differing from Base::hello’s implementation. Derived’s author’s
reasoning was as follows: Base’s implementation of hello is not appropriate; a Derived class object
can remedy that by providing an appropriate implementation. Furthermore our author reasoned:
“since the type of an object determines the interface that is used, process must call
Derived::hello as hello is called via process from a Derived class object”.
Unfortunately our author’s reasoning is flawed, due to static binding. When Base::process was
compiled static binding caused the compiler to bind the hello call to Base::hello().
The author intended to create a Derived class that is-a Base class. That only partially succeeded:
Base’s interface was inherited, but after that Derived has relinquished all control over what happens.
Once we’re in process we’re only able to see Base’s member implementations. Polymorphism
offers a way out, allowing us to redefine (in a derived class) members of a base class allowing these
redefined members to be used from the base class’s interface.
This is the essence of LSP: public inheritance should not be used to reuse the base class members
(in derived classes) but to be reused (by the base class, polymorphically using derived class members
reimplementing base class members).
Take a second to appreciate the implications of the above little program. The hello and process
members aren’t too impressive, but the implications of the example are. The process member
could implement directory travel, hello could define the action to perform when encountering a
file. Base::hello might simply show the name of a file, but Derived::hello might delete the file;
might only list its name if it’s younger than a certain age; might list its name if it contains a certain
text; etc., etc. Up to now Derived would have to implement process’s actions itself; up to now
code expecting a Base class reference or pointer could only perform Base’s actions. Polymorphism
allows us to reimplement members of base classes and to use those reimplemented members in code
expecting base class references or pointers. Using polymorphism existing code may be reused by
derived classes reimplementing the appropriate members of their base classes. It’s about time to
uncover how this magic can be realized.
Polymorphism, which is not the default in C++, solves the problem and allows the author of the
classes to reach their goal. For the curious reader: prefix void hello() in the Base class with
the keyword virtual and recompile. Running the modified program produces the intended and
expected derived hello. Why this happens is explained next.
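The modification can be sketched as follows. To make the effect easy to inspect, the members in this variation return their text instead of inserting it into cout; that change, and the use of override in Derived, are assumptions of this sketch and not part of the program shown earlier.

```cpp
#include <string>

class Base
{
    public:
        // process calls hello; because hello is virtual the call is
        // resolved at run-time, using the actual type of the object
        std::string process() const
        {
            return hello();
        }

    protected:
        virtual std::string hello() const
        {
            return "base hello";
        }
};

class Derived: public Base
{
    protected:
        std::string hello() const override
        {
            return "derived hello";
        }
};
```

Calling process for a Derived object now returns "derived hello", while a Base object's process still returns "base hello".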
14.1 Virtual functions
By default the behavior of a member function called via a pointer or reference is determined by the
implementation of that function in the pointer’s or reference’s class. E.g., a Vehicle * activates
Vehicle’s member functions, even when pointing to an object of a derived class. This is known as
early or static binding: the function to call is determined at compile-time. In C++ late or dynamic
binding is realized using virtual member functions.
A member function becomes a virtual member function when its declaration starts with the keyword
virtual. It is stressed once again that in C++, different from several other object oriented
languages, this is not the default situation. By default static binding is used.
Once a function is declared virtual in a base class, it remains virtual in all derived classes. The
keyword virtual should not be repeated in derived classes for members declared virtual in the base class. In
derived classes those members should be provided with the override indicator.
In the vehicle classification system (see section 13.1), let’s concentrate on the members mass and
setMass. These members define the user interface of the class Vehicle. What we would like to
accomplish is that this user interface can be used for Vehicle and for any class inheriting from
Vehicle, since objects of those classes are themselves also Vehicles.
If we can define the user interface of our base class (e.g., Vehicle) such that it remains usable
irrespective of the classes we derive from Vehicle our software achieves an enormous reusability:
we design our software around Vehicle’s user interface, and our software will also properly function
for derived classes. Using plain inheritance doesn’t accomplish this. If we define
std::ostream &operator<<(std::ostream &out, Vehicle const &vehicle)
{
return out << "Vehicle's mass is " << vehicle.mass() << " kg.";
}
and Vehicle’s member mass returns 0, but Car’s member mass returns 1000, then twice a mass
of 0 is reported when the following program is executed:
int main()
{
Vehicle vehicle;
Car vw(1000);
cout << vehicle << '\n' << vw << endl;
}
We’ve defined an overloaded insertion operator, but since it only knows about Vehicle’s user interface,
‘cout << vw’ will use vw’s Vehicle’s user interface as well, thus displaying a mass of
0.
Reusability is enhanced if we add a redefinable interface to the base class’s interface. A redefinable
interface allows derived classes to fill in their own implementation, without affecting the user interface.
At the same time the user interface will behave according to the derived class’s wishes, and not
just to the base class’s default implementation.
Members of the redefinable interface should be declared in the class’s private sections: conceptually
they merely belong to their own classes (cf. section 14.7). In the base class these members should
be declared virtual. These members can be redefined (overridden) by derived classes, and should
there be provided with override indicators.
We keep our user interface (mass), and add the redefinable member vmass to Vehicle’s interface:
class Vehicle
{
public:
size_t mass() const;
size_t si_mass() const; // see below
private:
virtual size_t vmass() const;
};
Separating the user interface from the redefinable interface is a sensible thing to do. It allows us
to fine-tune the user interface (only one point of maintenance), while at the same time allowing us
to standardize the expected behavior of the members of the redefinable interface. E.g., in many
countries the International system of units is used, using the kilogram as the unit for mass. Some
countries use other units (like the lbs: 1 kg being approx. 2.2046 lbs). By separating the user
interface from the redefinable interface we can use one standard for the redefinable interface, and
keep the flexibility of transforming the information ad-lib in the user interface.
Just to maintain a clean separation of user- and redefinable interface we might consider adding
another accessor to Vehicle, providing the si_mass, simply implemented like this:
size_t Vehicle::si_mass() const
{
return vmass();
}
If Vehicle supports a member d_massFactor then its mass member can be implemented like this:
size_t Vehicle::mass() const
{
return d_massFactor * si_mass();
}
Vehicle itself could define vmass so that it returns a token value. E.g.,
size_t Vehicle::vmass() const
{
return 0;
}
Now let’s have a look at the class Car. It is derived from Vehicle, and it inherits Vehicle’s user
interface. It also has a data member size_t d_mass, and it implements its own redefinable interface:
class Car: public Vehicle
{
...
private:
size_t vmass() const override;
};
If Car constructors require us to specify the car’s mass (stored in d_mass), then Car simply implements
its vmass member like this:
size_t Car::vmass() const
{
return d_mass;
}
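Putting the pieces together, the separation of the user interface (mass) and the redefinable interface (vmass) can be sketched in a compact, self-contained form. This sketch leaves out si_mass and d_massFactor, defines all members in-class, and adds a hypothetical helper function report.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

class Vehicle
{
    public:
        std::size_t mass() const            // user interface
        {
            return vmass();
        }

    private:
        virtual std::size_t vmass() const   // redefinable interface
        {
            return 0;
        }
};

class Car: public Vehicle
{
    std::size_t d_mass;

    public:
        explicit Car(std::size_t mass) : d_mass(mass) {}

    private:
        std::size_t vmass() const override
        {
            return d_mass;
        }
};

// Only knows about Vehicle's user interface.
std::ostream &operator<<(std::ostream &out, Vehicle const &vehicle)
{
    return out << "Vehicle's mass is " << vehicle.mass() << " kg.";
}

// Returns the text the insertion operator produces for 'vehicle'.
std::string report(Vehicle const &vehicle)
{
    std::ostringstream out;
    out << vehicle;
    return out.str();
}
```

Although the insertion operator only uses Vehicle's interface, report(Car(1000)) now mentions a mass of 1000 kg, as mass calls the vmass member of the object's actual class.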
The class Truck, inheriting from Car needs two mass values: the tractor’s mass and the trailer’s
mass. The tractor’s mass is passed to its Car base class, the trailer’s mass is passed to its Vehicle
d_trailer data member. Truck, too, overrides vmass, this time returning the sum of its tractor
and trailer masses:
size_t Truck::vmass() const
{
return Car::si_mass() + d_trailer.si_mass();
}
Once a class member has been declared virtual it becomes a virtual member in all derived classes,
whether or not these members are provided with the override indicator. But override should be
used, as it allows the compiler to catch typos when writing down the derived class interface.
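The following is a minimal sketch, not the actual class definitions, of how the user interface (si_mass, mass) and the redefinable interface (vmass) fit together. The constructors, the default value of d_massFactor and the use of a Car object for d_trailer are assumptions made only to keep the sketch self-contained.

```cpp
#include <cstddef>

class Vehicle
{
    std::size_t d_massFactor = 1;           // assumed: 1 means 'report kg'

    public:
        virtual ~Vehicle() = default;

        std::size_t si_mass() const         // user interface, not virtual
        {
            return vmass();
        }
        std::size_t mass() const            // user interface, not virtual
        {
            return d_massFactor * si_mass();
        }

    private:
        virtual std::size_t vmass() const   // redefinable interface
        {
            return 0;                       // token value
        }
};

class Car: public Vehicle
{
    std::size_t d_mass;

    public:
        explicit Car(std::size_t mass)
        :
            d_mass(mass)
        {}

    protected:                              // Truck needs to call Car::vmass
        std::size_t vmass() const override
        {
            return d_mass;
        }
};

class Truck: public Car
{
    Car d_trailer;                          // assumed type of the trailer

    public:
        Truck(std::size_t tractorMass, std::size_t trailerMass)
        :
            Car(tractorMass),
            d_trailer(trailerMass)
        {}

    protected:
        // Car::vmass is called directly: si_mass() would dispatch back to
        // Truck::vmass through late binding.
        std::size_t vmass() const override
        {
            return Car::vmass() + d_trailer.si_mass();
        }
};
```

Code using these classes only calls mass or si_mass. Which vmass is eventually executed is decided at run-time by the type of the object.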
A member function may be declared virtual anywhere in a class hierarchy, but this probably defeats
the underlying polymorphic class design, as the original base class is no longer capable of
completely covering the redefinable interfaces of derived classes. If, e.g., mass is declared virtual in
Car, but not in Vehicle, then the specific characteristics of virtual member functions would only
be available for Car objects and for objects of classes derived from Car. For a Vehicle pointer or
reference static binding would still be used.
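A small sketch may make this concrete. It uses a hypothetical member name (instead of mass) that is not virtual in Vehicle but is declared virtual in Car. The functions viaVehicle and viaCar are likewise assumptions, added for illustration only.

```cpp
#include <string>

class Vehicle
{
    public:
        std::string name() const            // not virtual here
        {
            return "Vehicle";
        }
};

class Car: public Vehicle
{
    public:
        virtual ~Car() = default;
        virtual std::string name() const    // virtual from Car downwards
        {
            return "Car";
        }
};

class Truck: public Car
{
    public:
        std::string name() const override
        {
            return "Truck";
        }
};

// Static binding: Vehicle::name is always called.
std::string viaVehicle(Vehicle const &vehicle)
{
    return vehicle.name();
}

// Late binding: the object's actual class determines the function.
std::string viaCar(Car const &car)
{
    return car.name();
}
```

Passing the same Truck object to both functions yields different results: only the Car reference offers polymorphic behavior.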
The effect of late binding (polymorphism) is illustrated below:
void showInfo(Vehicle &vehicle)
{
cout << "Info: " << vehicle << '\n';
}
int main()
{
Car car(1200); // car with mass 1200
Truck truck(6000, 115, // truck with cabin mass 6000,
"Scania", 15000); // speed 115, make Scania,
// trailer mass 15000
showInfo(car); // see (1) below
showInfo(truck); // see (2) below
Vehicle *vp = &truck;
cout << vp->speed() << '\n'; // see (3) below
}
Now that mass is defined virtual, late binding is used:
• at (1), Car’s mass is displayed;
• at (2) Truck’s mass is displayed;
• at (3) a compilation error is generated. The member speed is not a member of Vehicle, and hence
not callable via a Vehicle *.
The example illustrates that when a pointer to a class is used only the members of that class can be
called. A member’s virtual characteristic only influences the type of binding (early vs. late), not
the set of member functions that is visible to the pointer.
Through virtual members derived classes may redefine the behavior performed by functions called
from base class members or from pointers or references to base class objects. This redefinition of
base class members by derived classes is called overriding members.
14.2 Virtual destructors
When an object ceases to exist the object’s destructor is called. Now consider the following code
fragment (cf. section 13.1):
Vehicle *vp = new Land(1000, 120);
delete vp; // object destroyed
Here delete is applied to a base class pointer. As the base class defines the available interface
delete vp calls ~Vehicle and ~Land remains out of sight. Assuming that Land allocates memory,
a memory leak results. Freeing memory is not the only action destructors can perform. In general
they may perform any action that’s necessary when an object ceases to exist. But here none of the
actions defined by ~Land are performed. Bad news....
In C++ this problem is solved by virtual destructors. A destructor can be declared virtual. When
a base class destructor is declared virtual then the destructor of the actual class pointed to by a base
class pointer bp is going to be called when delete bp is executed. Thus, late binding is realized for
destructors even though the destructors of derived classes have unique names. Example:
class Vehicle
{
public:
virtual ~Vehicle(); // all derived class destructors are
// now virtual as well.
};
By declaring a virtual destructor, the above delete operation (delete vp) correctly calls Land’s
destructor, rather than Vehicle’s destructor.
Once a destructor is called it performs as usual, whether or not it is a virtual destructor. So, ~Land
first executes its own statements and then calls ~Vehicle. Thus, the above delete vp statement
uses late binding to call ~Land and from this point on the object destruction proceeds as usual.
Destructors should always be defined virtual in classes designed as a base class from which
other classes are going to be derived. Often those destructors themselves have no tasks to perform.
In these cases the virtual destructor is given an empty body. For example, the definition of
Vehicle::~Vehicle() may be as simple as:
Vehicle::~Vehicle()
{}
Resist the temptation to define virtual destructors (even empty destructors) inline as this complicates
class maintenance. Section 14.11 discusses the reason behind this rule of thumb.
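The destruction order described in this section can be made visible with a small sketch. The global destroyed log and the function destroyViaBase are hypothetical helpers, and the destructors are defined in-class only to keep the sketch short (contrary to the rule of thumb just mentioned).

```cpp
#include <string>
#include <vector>

std::vector<std::string> destroyed;     // hypothetical log of destructor calls

class Vehicle
{
    public:
        virtual ~Vehicle()
        {
            destroyed.push_back("~Vehicle");
        }
};

class Land: public Vehicle
{
    public:
        ~Land() override
        {
            destroyed.push_back("~Land");
        }
};

void destroyViaBase()
{
    Vehicle *vp = new Land;
    delete vp;          // late binding selects ~Land, which then calls ~Vehicle
}
```

After calling destroyViaBase the log holds ~Land followed by ~Vehicle. Without the virtual keyword in Vehicle the derived class destructor would not be reached this way.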
14.3 Pure virtual functions
The base class Vehicle is provided with its own concrete implementations of its virtual members
(mass and setMass). However, virtual member functions do not necessarily have to be implemented
in base classes.
When the implementations of virtual members are omitted from base classes the class imposes
requirements upon derived classes. The derived classes are required to provide the ‘missing implementations’.
This approach, in some languages (like C#, Delphi and Java) known as an interface, defines a protocol.
Derived classes must obey the protocol by implementing the as yet not implemented members.
If a class contains at least one member whose implementation is missing no objects of that class can
be defined.
Such incompletely defined classes are always base classes. They enforce a protocol by merely declaring
names, return values and arguments of some of their members. These classes are called abstract
classes or abstract base classes. Derived classes become non-abstract classes by implementing the as
yet not implemented members.
Abstract base classes are the foundation of many design patterns (cf. Gamma et al. (1995)), allowing
the programmer to create highly reusable software. Some of these design patterns are covered by the
C++ Annotations (e.g., the Template Method in section 24.2), but for a thorough discussion of design
patterns the reader is referred to Gamma et al.’s book.
Members that are merely declared in base classes are called pure virtual functions. A virtual member
becomes a pure virtual member by postfixing = 0 to its declaration (i.e., by replacing the semicolon
ending its declaration by ‘= 0;’). Example:
#include <iosfwd>
class Base
{
public:
virtual ~Base();
virtual std::ostream &insertInto(std::ostream &out) const = 0;
};
inline std::ostream &operator<<(std::ostream &out, Base const &base)
{
return base.insertInto(out);
}
All classes derived from Base must implement the insertInto member function, or their objects
cannot be constructed. This is neat: all objects of class types derived from Base can now always be
inserted into ostream objects.
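As an illustration, a derived class only has to implement insertInto to become insertable. The class Point and the helper toText in the sketch below are hypothetical. Base and the insertion operator follow the example above, except that the destructor is defaulted in-class to keep the sketch self-contained.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Base
{
    public:
        virtual ~Base() = default;
        virtual std::ostream &insertInto(std::ostream &out) const = 0;
};

inline std::ostream &operator<<(std::ostream &out, Base const &base)
{
    return base.insertInto(out);
}

class Point: public Base                // hypothetical derived class
{
    int d_x;
    int d_y;

    public:
        Point(int x, int y)
        :
            d_x(x),
            d_y(y)
        {}

        std::ostream &insertInto(std::ostream &out) const override
        {
            return out << '(' << d_x << ", " << d_y << ')';
        }
};

// Inserts any Base-derived object into a string stream.
std::string toText(Base const &base)
{
    std::ostringstream out;
    out << base;
    return out.str();
}
```

The insertion operator is written once, for Base. Every class derived from Base inherits its use by implementing insertInto.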
Could the virtual destructor of a base class ever be a pure virtual function? Not if that leaves the
destructor without an implementation. First of all, there is no need to enforce the availability of
destructors in derived classes as destructors are provided by default (unless a destructor is declared
with the = delete specification). Second, derived class destructors eventually call their base class
destructors. How could they call base class destructors if their implementations are lacking? A
destructor that is declared pure virtual must therefore still be implemented. More about this in the
next section.
Often, but not necessarily, pure virtual member functions are const member functions. This allows
the construction of constant derived class objects. In other situations this might not be necessary
(or realistic), and non-constant member functions might be required. The general rule for const
member functions also applies to pure virtual functions: if the member function alters the object’s
data members, it cannot be a const member function.
Abstract base classes frequently don’t have data members. However, once a base class declares a
pure virtual member it must be declared identically in derived classes. If the implementation of
a pure virtual function in a derived class alters the derived class object’s data, then that function
cannot be declared as a const member. Therefore, the author of an abstract base class should
carefully consider whether a pure virtual member function should be a const member function or
not.
14.3.1 Implementing pure virtual functions
Pure virtual member functions may be implemented. To implement a pure virtual member function,
provide it with its normal = 0; specification, but implement it as well. Since the = 0; ends in
a semicolon, the pure virtual member is always at most a declaration in its class, but an implementation
may be provided outside of its class interface (maybe using inline).
Pure virtual member functions may be called from derived class objects, or from members of their
own class or of derived classes, by specifying the base class and scope resolution operator together
with the member to be called. Example:
#include <iostream>
class Base
{
public:
virtual ~Base();
virtual void pureimp() = 0;
};
Base::~Base()
{}
void Base::pureimp()
{
std::cout << "Base::pureimp() called\n";
}
class Derived: public Base
{
public:
virtual void pureimp();
};
inline void Derived::pureimp()
{
Base::pureimp();
std::cout << "Derived::pureimp() called\n";
}
int main()
{
Derived derived;
derived.pureimp();
derived.Base::pureimp();
Derived *dp = &derived;
dp->pureimp();
dp->Base::pureimp();
}
// Output:
// Base::pureimp() called
// Derived::pureimp() called
// Base::pureimp() called
// Base::pureimp() called
// Derived::pureimp() called
// Base::pureimp() called
Implementing a pure virtual member has limited use. One could argue that the pure virtual member
function’s implementation may be used to perform tasks that can already be performed at the base
class level. However, there is no guarantee that the base class virtual member function is actually
going to be called. Therefore base class specific tasks could as well be offered by a separate member,
without blurring the distinction between a member doing some work and a pure virtual member
enforcing a protocol.
14.4 Explicit virtual overrides
Consider the following situations:
• A class Value is a value class. It offers a copy constructor, an overloaded assignment operator,
maybe move operations, and a public, non-virtual destructor. In section 14.7 it is argued that
such classes are not suited as base classes. New classes should not inherit from Value. How
to enforce this?
• A polymorphic class Base defines a virtual member v_process(int32_t). A class
derived from Base needs to override this member, but the author mistakenly defined
v_proces(int32_t). How to prevent such errors, breaking the polymorphic behavior of the
derived class?
• A class Derived, derived from a polymorphic Base class overrides the member
Base::v_process, but classes that are in turn derived from Derived should no longer override
v_process, but may override other virtual members like v_call and v_display. How
to enforce this restricted polymorphic character for classes derived from Derived?
Two special identifiers, final and override are used to realize the above. These identifiers are
special in the sense that they only acquire their special meanings in specific contexts. Outside of
this context they are just plain identifiers, allowing the programmer to define a variable like bool
final.
The identifier final can be applied to class declarations to indicate that the class cannot be used as
a base class. E.g.:
class Base1 final // cannot be a base class
{};
class Derived1: public Base1 // ERR: Base1 is final
{};
class Base2 // OK as base class
{};
class Derived2 final: public Base2 // OK, but Derived2 can’t be
{}; // used as a base class
class Derived: public Derived2 // ERR: Derived2 is final
{};
The identifier final can also be added to virtual member declarations. This indicates that those
virtual members cannot be overridden by derived classes. The restricted polymorphic character of a
class, mentioned above, can thus be realized as follows:
class Base
{
virtual int v_process(); // define polymorphic behavior
virtual int v_call();
virtual int v_display();
};
class Derived: public Base // Derived restricts polymorphism
{ // to v_call and v_display
virtual int v_process() final;
};
class Derived2: public Derived
{
// int v_process(); No go: Derived::v_process is final
virtual int v_display(); // OK to override
};
To allow the compiler to detect typos, differences in parameter types, or differences in member
function modifiers (e.g., const vs. non-const) the identifier override can (should) be appended to
derived class members overriding base class members. E.g.,
class Base
{
virtual int v_process();
virtual int v_call() const;
virtual int v_display(std::ostream &out);
};
class Derived: public Base
{
virtual int v_proces() override; // ERR: v_proces != v_process
virtual int v_call() override; // ERR: not const
// ERR: parameter types differ
virtual int v_display(std::istream &out) override;
};
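What happens when override is omitted can be shown with a compilable sketch. The classes Sloppy and Careful and the function call are hypothetical, and the members return strings (rather than int) only to make the selected function visible.

```cpp
#include <string>

class Base
{
    public:
        virtual ~Base() = default;
        virtual std::string v_call() const
        {
            return "Base::v_call";
        }
        virtual std::string v_process()
        {
            return "Base::v_process";
        }
};

class Sloppy: public Base
{
    public:
        // No override: the missing const silently declares a NEW virtual
        // member instead of overriding Base::v_call() const.
        virtual std::string v_call()
        {
            return "Sloppy::v_call";
        }
};

class Careful: public Base
{
    public:
        std::string v_call() const override     // checked by the compiler
        {
            return "Careful::v_call";
        }
        std::string v_process() final           // no further overriding
        {
            return "Careful::v_process";
        }
};

// Calls v_call through a base class reference.
std::string call(Base const &base)
{
    return base.v_call();
}
```

Sloppy compiles (possibly with a warning), but Base's implementation is still used through a Base reference. Had override been added to Sloppy::v_call the compiler would have rejected the declaration.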
14.5 Virtual functions and multiple inheritance
In chapter 6 we encountered the class fstream, one class offering features of ifstream and
ofstream. In chapter 13 we learned that a class may be derived from multiple base classes. Such
a derived class inherits the properties of all its base classes. Polymorphism can also be used in
combination with multiple inheritance.
Consider what would happen if more than one ‘path’ leads from the derived class up to its (base)
classes. This is illustrated in the next (fictitious) example where a class Derived is doubly derived
from Base:
class Base
{
int d_field;
public:
void setfield(int val);
int field() const;
};
inline void Base::setfield(int val)
{
d_field = val;
}
inline int Base::field() const
{
return d_field;
}
class Derived: public Base, public Base
{
};
Figure 14.1: Duplication of a base class in multiple derivation.
Due to the double derivation, Base’s functionality now occurs twice in Derived. This results in
ambiguity: when the function setfield() is called for a Derived class object, which function will
that be as there are two of them? The scope resolution operator won’t come to the rescue and so the
C++ compiler cannot compile the above example and (correctly) identifies an error.
The above code clearly duplicates its base class in the derivation, which can of course easily be
avoided by not doubly deriving from Base (or by using composition (!)). But duplication of a base
class can also occur through nested inheritance, where an object is derived from, e.g., a Car and from
an Air (cf. section 13.1). Such a class would be needed to represent, e.g., a flying car (such as the
one in James Bond vs. the Man with the Golden Gun...). An AirCar
would ultimately contain two Vehicles, and hence two mass fields, two setMass() functions and
two mass() functions. Is this what we want?
14.5.1 Ambiguity in multiple inheritance
Let’s investigate more closely why an AirCar introduces ambiguity, when derived from Car and Air.
• An AirCar is a Car, hence a Land, and hence a Vehicle.
• However, an AirCar is also an Air, and hence a Vehicle.
The duplication of Vehicle data is further illustrated in Figure 14.1. The internal organization of
an AirCar is shown in Figure 14.2.
Figure 14.2: Internal organization of an AirCar object.
The C++ compiler detects the ambiguity in an AirCar object,
and will therefore not compile statements like:
AirCar jBond;
cout << jBond.mass() << '\n';
Which member function mass to call cannot be determined by the compiler, but the programmer has
two possibilities to resolve the ambiguity for the compiler:
• First, the function call where the ambiguity originates can be modified. The ambiguity is
resolved using the scope resolution operator:
// let's hope that the mass is kept in the Car
// part of the object..
cout << jBond.Car::mass() << '\n';
The scope resolution operator and the class name are put right before the name of the member
function.
• Second, a dedicated function mass could be created for the class AirCar:
int AirCar::mass() const
{
return Car::mass();
}
The second possibility is preferred as it does not require the compiler to flag an error; nor does it
require the programmer using the class AirCar to take special precautions.
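The second possibility may be sketched as follows. The constructors and the int mass data member are assumptions. The sketch merely illustrates that the dedicated AirCar::mass hides the ambiguity while the object still contains two separate Vehicle parts.

```cpp
class Vehicle
{
    int d_mass;

    public:
        Vehicle(int mass = 0)
        :
            d_mass(mass)
        {}
        int mass() const
        {
            return d_mass;
        }
};

class Land: public Vehicle
{
    public:
        Land(int mass = 0)
        :
            Vehicle(mass)
        {}
};

class Car: public Land
{
    public:
        Car(int mass = 0)
        :
            Land(mass)
        {}
};

class Air: public Vehicle
{
    public:
        Air(int mass = 0)
        :
            Vehicle(mass)
        {}
};

class AirCar: public Car, public Air
{
    public:
        AirCar(int mass)
        :
            Car(mass)           // the mass is kept in the Car part;
        {}                      // the Air part's Vehicle keeps mass 0

        int mass() const        // dedicated member: no ambiguity for users
        {
            return Car::mass();
        }
};
```

Users of AirCar can now simply call mass(), but a call like Air::mass() still reaches the second, unused, mass field.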
However, there exists a more elegant solution, discussed in the next section.
14.5.2 Virtual base classes
As illustrated in Figure 14.2, an AirCar represents two Vehicles. This not only results in an
ambiguity about which function to use to access the mass data, but it also defines two mass fields in
an AirCar. This is slightly redundant, since we can assume that an AirCar has but one mass.
It is, however, possible to define an AirCar as a class consisting of but one Vehicle and yet using
multiple derivation. This is realized by defining the base classes that are multiply mentioned in a
derived class’s inheritance tree as a virtual base class.
For the class AirCar this implies a small change when deriving an AirCar from Land and Air
classes:
class Land: virtual public Vehicle
{
// etc
};
class Car: public Land
{
// etc
};
class Air: virtual public Vehicle
{
// etc
};
class AirCar: public Car, public Air
{
};
Figure 14.3: Internal organization of an AirCar object when the base classes are virtual.
Virtual derivation ensures that a Vehicle is only added once to a derived class. This means that
the route along which a Vehicle is added to an AirCar no longer depends on its direct base
classes; we can only state that an AirCar is a Vehicle. The internal organization of an AirCar
after virtual derivation is shown in Figure 14.3.
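That only one Vehicle remains can be checked with a short sketch. The members of Vehicle and the function sharesOneVehicle are assumptions, added for illustration.

```cpp
#include <cstddef>

class Vehicle
{
    std::size_t d_mass;

    public:
        Vehicle(std::size_t mass = 0)
        :
            d_mass(mass)
        {}
        virtual ~Vehicle() = default;

        std::size_t mass() const
        {
            return d_mass;
        }
        void setMass(std::size_t mass)
        {
            d_mass = mass;
        }
};

class Land: virtual public Vehicle
{};

class Car: public Land
{};

class Air: virtual public Vehicle
{};

class AirCar: public Car, public Air
{};

// Both routes up the hierarchy should arrive at the same Vehicle part.
bool sharesOneVehicle(AirCar &airCar)
{
    Vehicle *viaCar = static_cast<Car *>(&airCar);
    Vehicle *viaAir = static_cast<Air *>(&airCar);
    return viaCar == viaAir;
}
```

With virtual derivation calls like jBond.setMass(1500) and jBond.mass() are no longer ambiguous, as there is only one Vehicle to refer to.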
When a class Third inherits from a base class Second which in turn inherits from a base class
First, then the First class constructor called by the Second class constructor is also the one used
when that Second constructor is called while constructing a Third object. Example:
class First
{
public:
First(int x);
};
class Second: public First
{
public:
Second(int x)
:
First(x)
{}
};
class Third: public Second
{
public:
Third(int x)
:
Second(x) // calls First(x)
{}
};
The above no longer holds true when Second uses virtual derivation. When Second uses virtual
derivation its base class initializer is ignored when Second’s constructor is called from Third.
Instead, First’s default constructor is called by default. This is illustrated by the next example:
class First
{
public:
First()
{
cout << "First()\n";
}
First(int x);
};
class Second: public virtual First // note: virtual
{
public:
Second(int x)
:
First(x)
{}
};
class Third: public Second
{
public:
Third(int x)
:
Second(x)
{}
};
int main()
{
Third third(3); // displays ‘First()’
}
When constructing Third, First’s default constructor is used by default. Third’s constructor, however,
may overrule this default behavior by explicitly specifying the constructor to use. Since the
First object must be available before Second can be constructed it must be specified first. To call
First(int) when constructing Third(int) the latter constructor can be defined as follows:
class Third: public Second
{
public:
Third(int x)
:
First(x), // now First(int) is called.
Second(x)
{}
};
This behavior may seem puzzling when simple linear inheritance is used but it makes sense when
multiple inheritance is used with base classes using virtual inheritance. Consider AirCar: when
Air and Car both virtually inherit from Vehicle will Air and Car both initialize the common
Vehicle object? If so, which one is going to be called first? What if Air and Car use different
Vehicle constructors? All these questions can be avoided by passing the responsibility for the
initialization of a common base class to the class eventually using the common base class object: in
the above example Third. Hence Third is provided an opportunity to specify the constructor to use
when initializing First.
Multiple inheritance may also be used to inherit from classes that do not all use virtual inheritance.
Assume we have two classes, Derived1 and Derived2, both (possibly virtually) derived from Base.
We now address the question which constructors will be called when calling a constructor of the
class Final: public Derived1, public Derived2.
To distinguish the involved constructors Base1 indicates the Base class constructor called as base
class initializer for Derived1 (and analogously: Base2 called from Derived2). A plain Base indicates
Base’s default constructor.
Derived1 and Derived2 indicate the base class initializers used when constructing a Final object.
Now we’re ready to distinguish the various cases when constructing an object of the class Final:
public Derived1, public Derived2:
• classes:
Derived1: public Base
Derived2: public Base
This is normal, non virtual multiple derivation. The following constructors are called
in the order shown:
Base1,
Derived1,
Base2,
Derived2
• classes:
Derived1: public Base
Derived2: virtual public Base
Only Derived2 uses virtual derivation. Derived2’s base class constructor is ignored.
Instead, Base is called and it is called prior to any other constructor:
Base,
Base1,
Derived1,
Derived2
As only one class uses virtual derivation, two Base class objects remain available in
the eventual Final class.
• classes:
Derived1: virtual public Base
Derived2: public Base
Only Derived1 uses virtual derivation. Derived1’s base class constructor is ignored.
Instead, Base is called and it is called prior to any other constructor. Different
from the first (non-virtual) case Base is now called, rather than Base1:
Base,
Derived1,
Base2,
Derived2
• classes:
Derived1: virtual public Base
Derived2: virtual public Base
Both base classes use virtual derivation and so only one Base class object will be
present in the Final class object. The following constructors are called in the order
shown:
Base,
Derived1,
Derived2
Virtual derivation is, in contrast to virtual functions, a pure compile-time issue. Virtual inheritance
merely defines how the compiler defines a class’s data organization and construction process.
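The construction orders listed above can be verified by logging constructor calls. The sketch below covers the case where only Derived2 uses virtual derivation. The calls log, the tag parameter and the function constructionOrder are hypothetical helpers.

```cpp
#include <string>
#include <vector>

std::vector<std::string> calls;         // hypothetical log of constructors

class Base
{
    public:
        Base()                          // plain 'Base' in the lists above
        {
            calls.push_back("Base");
        }
        explicit Base(char const *tag)  // 'Base1' or 'Base2'
        {
            calls.push_back(tag);
        }
};

class Derived1: public Base
{
    public:
        Derived1()
        :
            Base("Base1")
        {
            calls.push_back("Derived1");
        }
};

class Derived2: virtual public Base
{
    public:
        Derived2()
        :
            Base("Base2")               // ignored when constructing a Final
        {
            calls.push_back("Derived2");
        }
};

class Final: public Derived1, public Derived2
{
    public:
        Final()
        :
            Derived1(),
            Derived2()
        {}
};

// Constructs a Final object and returns the logged constructor order.
std::vector<std::string> constructionOrder()
{
    calls.clear();
    Final object;
    return calls;
}
```

The returned log lists Base, Base1, Derived1 and Derived2, in that order. A compiler may warn that the virtual Base is inaccessible in Final due to ambiguity, which reflects the two Base objects remaining in this configuration.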
14.5.3 When virtual derivation is not appropriate
Virtual inheritance can be used to merge multiply occurring base classes. However, situations may
be encountered where multiple occurrences of base classes are appropriate. Consider the definition of
a Truck (cf. section 13.5):
class Truck: public Car
{
int d_trailer_mass;
public:
Truck();
Truck(int engine_mass, int sp, char const *nm,
int trailer_mass);
void setMass(int engine_mass, int trailer_mass);
int mass() const;
};
Truck::Truck(int engine_mass, int sp, char const *nm,
int trailer_mass)
:
Car(engine_mass, sp, nm)
{
d_trailer_mass = trailer_mass;
}
int Truck::mass() const
{
return // sum of:
Car::mass() + // engine part plus
d_trailer_mass; // the trailer
}
This definition shows how a Truck object is constructed to contain two mass fields: one via its
derivation from Car and one via its own int d_trailer_mass data member. Such a definition
is of course valid, but it could also be rewritten. We could derive a Truck from a Car and from
a Vehicle, thereby explicitly requesting the double presence of a Vehicle; one for the mass of
the engine and cabin, and one for the mass of the trailer. A slight complication is that a class
organization like
class Truck: public Car, public Vehicle
is not accepted by the C++ compiler. As a Vehicle is already part of a Car, it is therefore not needed
once again. This organization may, however, be forced using a small trick. By creating an additional
class inheriting from Vehicle and deriving Truck from that additional class rather than directly
from Vehicle the problem is solved. Simply derive a class TrailerVeh from Vehicle, and then
Truck from Car and TrailerVeh:
class TrailerVeh: public Vehicle
{
public:
TrailerVeh(int mass)
:
Vehicle(mass)
{}
};
class Truck: public Car, public TrailerVeh
{
public:
Truck();
Truck(int engine_mass, int sp, char const *nm, int trailer_mass);
void setMass(int engine_mass, int trailer_mass);
int mass() const;
};
inline Truck::Truck(int engine_mass, int sp, char const *nm,
int trailer_mass)
:
Car(engine_mass, sp, nm),
TrailerVeh(trailer_mass)
{}
inline int Truck::mass() const
{
return // sum of:
Car::mass() + // engine part plus
TrailerVeh::mass(); // the trailer
}
14.6 Run-time type identification
C++ offers two ways to (run-time) retrieve the type of objects and expressions. The possibilities
of C++’s run-time type identification are limited compared to languages like Java. Usually static
type checking and static type identification are used in C++. Static type checking is possibly safer
and certainly more efficient than run-time type identification and should therefore be preferred over
run-time type identification. But situations exist where run-time type identification is appropriate.
C++ offers run-time type identification through the dynamic_cast and typeid operators.
• A dynamic_cast is used to convert a base class pointer or reference to a derived class pointer
or reference. This is also known as down-casting.
• The typeid operator returns the actual type of an expression.
These operators can be used with objects of classes having at least one virtual member function.
14.6.1 The dynamic_cast operator
The dynamic_cast<> operator is used to convert a base class pointer or reference to, respectively,
a derived class pointer or reference. This is also called down-casting as the direction of the cast is down
the inheritance tree.
A dynamic cast’s actions are determined run-time; it can only be used if the base class declares at
least one virtual member function. For the dynamic cast to succeed, the object to which the dynamic
cast’s argument refers must actually be an object of the destination class (or of a class derived from
it). Otherwise the cast fails and
returns 0 (if a dynamic cast of a pointer was requested) or throws a std::bad_cast exception (if a
dynamic cast of a reference was requested).
In the following example a pointer to the class Derived is obtained from the Base class pointer bp:
class Base
{
public:
virtual ~Base();
};
class Derived: public Base
{
public:
char const *toString();
};
inline char const *Derived::toString()
{
return "Derived object";
}
int main()
{
Base *bp;
Derived *dp;
Derived d;
bp = &d;
dp = dynamic_cast<Derived *>(bp);
if (dp)
cout << dp->toString() << '\n';
else
cout << "dynamic cast conversion failed\n";
}
In the condition of the above if statement the success of the dynamic cast is verified. This verification
is performed at run-time, as the actual class of the objects to which the pointer points is only
known by then.
If a base class pointer is provided, the dynamic cast operator returns 0 on failure and a pointer to
the requested derived class on success.
Assume a vector<Base *> is used. Such a vector’s pointers may point to objects of various classes,
all derived from Base. A dynamic cast returns a pointer to the specified class if the base class pointer
indeed points to an object of the specified class and returns 0 otherwise.
We could determine the actual class of an object a pointer points to by performing a series of checks
to find the derived class to which a base class pointer points. Example:
class Base
{
public:
virtual ~Base();
};
class Derived1: public Base
{};
class Derived2: public Base
{};
int main()
{
vector<Base *> vb(initializeBase());
Base *bp = vb.front();
if (dynamic_cast<Derived1 *>(bp))
cout << "bp points to a Derived1 class object\n";
else if (dynamic_cast<Derived2 *>(bp))
cout << "bp points to a Derived2 class object\n";
}
Alternatively, a reference to a base class object may be available. In this case the dynamic_cast
operator throws an exception if the down casting fails. Example:
#include <iostream>
#include <typeinfo>
class Base
{
public:
virtual ~Base();
virtual char const *toString();
};
inline char const *Base::toString()
{
return "Base::toString() called";
}
class Derived1: public Base
{};
class Derived2: public Base
{};
Base::~Base()
{}
void process(Base &b)
{
try
{
std::cout << dynamic_cast<Derived1 &>(b).toString() << '\n';
}
catch (std::bad_cast)
{}
try
{
std::cout << dynamic_cast<Derived2 &>(b).toString() << '\n';
}
catch (std::bad_cast)
{
std::cout << "Bad cast to Derived2\n";
}
}
int main()
{
Derived1 d;
process(d);
}
/*
Generated output:
Base::toString() called
Bad cast to Derived2
*/
In this example the type std::bad_cast is used. A std::bad_cast exception is thrown if the
dynamic cast of a reference to a derived class object fails.
Note the form of the catch clause: bad_cast is the name of a type. Section 17.4.1 describes how
such a type can be defined.
The dynamic cast operator is a useful tool when an existing base class cannot or should not be
modified (e.g., when the sources are not available), and a derived class may be modified instead.
Code receiving a base class pointer or reference may then perform a dynamic cast to the derived
class to access the derived class’s functionality.
You may wonder in what way the behavior of the dynamic_cast differs from that of the
static_cast.
When the static_cast is used, we tell the compiler that it must convert a pointer or reference
to its expression type to a pointer or reference of its destination type. This holds true whether the
base class declares virtual members or not. Consequently, all the static_cast’s actions can be
determined by the compiler, and the following compiles fine:
class Base
{
// maybe or not virtual members
};
class Derived1: public Base
{};
class Derived2: public Base
{};
int main()
{
Derived1 derived1;
Base *bp = &derived1;
Derived1 &d1ref = static_cast<Derived1 &>(*bp);
Derived2 &d2ref = static_cast<Derived2 &>(*bp);
}
Pay attention to the second static_cast: here the Base class object is cast to a Derived2 class
reference. The compiler has no problems with this, as Base and Derived2 are related by inheritance.
Semantically, however, it makes no sense as bp in fact points to a Derived1 class object. This is
detected by a dynamic_cast. A dynamic_cast, like the static_cast, converts related pointer or
reference types, but the dynamic_cast provides a run-time safeguard. The dynamic cast fails when
the requested type doesn’t match the actual type of the object we’re pointing at. In addition, the
dynamic_cast’s use is much more restricted than the static_cast’s use, as the dynamic_cast
can only be used for downcasting to derived classes having virtual members.
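The run-time safeguard can be illustrated with a sketch in which a Derived1 object is tested against Derived2. The functions isDerived2 and refersToDerived2 are hypothetical. The first uses the pointer form of the cast, the second the reference form.

```cpp
#include <typeinfo>

class Base
{
    public:
        virtual ~Base() = default;      // makes Base polymorphic
};

class Derived1: public Base
{};

class Derived2: public Base
{};

// Pointer form: a failing cast returns a null pointer.
bool isDerived2(Base *bp)
{
    return dynamic_cast<Derived2 *>(bp) != nullptr;
}

// Reference form: a failing cast throws std::bad_cast.
bool refersToDerived2(Base &base)
{
    try
    {
        (void)dynamic_cast<Derived2 &>(base);
        return true;
    }
    catch (std::bad_cast const &)
    {
        return false;
    }
}
```

For a Derived1 object both functions return false, whereas a static_cast to Derived2 would have been accepted by the compiler without complaint.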
In the end a dynamic cast is a cast, and casts should be avoided whenever possible. When the
need for dynamic casting arises ask yourself whether the base class has correctly been designed.
In situations where code expects a base class reference or pointer the base class interface should
be all that is required and using a dynamic cast should not be necessary. Maybe the base class’s
virtual interface can be modified so as to prevent the use of dynamic casts. Start frowning when
encountering code using dynamic casts. When using dynamic casts in your own code always properly
document why the dynamic cast was appropriately used and was not avoided.
14.6.2 The ‘typeid’ operator
As with the dynamic_cast operator, typeid is usually applied to references to base class objects
that refer to derived class objects. Typeid should only be used with base classes offering virtual
members.
Before using typeid the <typeinfo> header file must be included.
The typeid operator returns an object of type type_info. Different compilers may offer different
implementations of the class type_info, but at the very least type_info must offer the following
interface:
class type_info
{
public:
virtual ~type_info();
bool operator==(type_info const &other) const;
bool operator!=(type_info const &other) const;
bool before(type_info const &rhs) const;
char const *name() const;
private:
type_info(type_info const &other);
type_info &operator=(type_info const &other);
};
Note that this class has a private copy constructor and a private overloaded assignment operator.
This prevents code from copying type_info objects and prevents code from assigning
type_info objects to each other. Instead, type_info objects are constructed and returned by the
typeid operator.
If the typeid operator is passed a base class reference it is able to return the actual name of the
type the reference refers to. Example:
class Base;
class Derived: public Base;
Derived d;
Base &br = d;
cout << typeid(br).name() << '\n';
In this example the typeid operator is given a base class reference. It prints the name of the
class br actually refers to (e.g., “Derived”, although the exact text depends on the compiler).
If Base does not contain virtual functions, the name of the class Base is printed instead.
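The difference between base classes with and without virtual members can be checked without relying on the text returned by name. The following is a minimal sketch; all class and function names are hypothetical. It compares type_info objects instead of their names.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical classes: PolyBase has a virtual member, PlainBase has none.
class PolyBase
{
    public:
        virtual ~PolyBase() {}
};
class PolyDerived: public PolyBase
{};

class PlainBase
{};
class PlainDerived: public PlainBase
{};

// True if typeid, given a base class reference, reports the derived type.
bool polymorphicReportsDerived()
{
    PolyDerived d;
    PolyBase &br = d;
    return typeid(br) == typeid(PolyDerived);   // dynamic type is used
}

bool plainReportsDerived()
{
    PlainDerived d;
    PlainBase &br = d;
    return typeid(br) == typeid(PlainDerived);  // static type (PlainBase) is used
}

void demo()
{
    assert(polymorphicReportsDerived());
    assert(not plainReportsDerived());
}
```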
The typeid operator can be used to determine the name of the actual type of expressions, not just
of class type objects. For example:
cout << typeid(12).name() << '\n'; // prints: int
cout << typeid(12.23).name() << '\n'; // prints: double
Note, however, that the above example is suggestive at most. It may print int and double, but this
is not necessarily the case. If portability is required, make sure no tests against these static, built-in
text-strings are required. Check out what your compiler produces in case of doubt.
In situations where the typeid operator is applied to determine the type of a derived class, a base
class reference should be used as the argument of the typeid operator. Consider the following
example:
class Base; // contains at least one virtual function
class Derived: public Base;
Base *bp = new Derived; // base class pointer to derived object
if (typeid(bp) == typeid(Derived *)) // 1: false
...
if (typeid(bp) == typeid(Base *)) // 2: true
...
if (typeid(bp) == typeid(Derived)) // 3: false
...
if (typeid(bp) == typeid(Base)) // 4: false
...
if (typeid(*bp) == typeid(Derived)) // 5: true
...
if (typeid(*bp) == typeid(Base)) // 6: false
...
Base &br = *bp;
if (typeid(br) == typeid(Derived)) // 7: true
...
if (typeid(br) == typeid(Base)) // 8: false
...
Here, (1) returns false as a Base * is not a Derived *. (2) returns true, as the two pointer
types are the same, (3) and (4) return false as pointers to objects are not the objects themselves.
On the other hand, if *bp is used in the above expressions, then (1) and (2) return false as an
object (or reference to an object) is not a pointer to an object, whereas (5) now returns true: *bp
actually refers to a Derived class object, and typeid(*bp) returns typeid(Derived). A similar
result is obtained if a base class reference is used: (7) returning true and (8) returning false.
The type_info::before(type_info const &rhs) member is used to determine the collating
order of types. This is useful when type_info objects must be ordered, e.g., when they are used as
keys in sorted containers. The function returns a non-zero value if *this precedes rhs in the
collating order used by the implementation. That order is implementation dependent and is not
necessarily related to the class hierarchy. An implementation may, e.g., produce:
cout << typeid(ifstream).before(typeid(istream)) << '\n' << // possibly not 0
typeid(istream).before(typeid(ifstream)) << '\n'; // possibly 0
With built-in types the implementor may implement that non-0 is returned when a ‘wider’ type is
compared to a ‘smaller’ type and 0 otherwise:
cout << typeid(double).before(typeid(int)) << '\n' << // not 0
typeid(int).before(typeid(double)) << '\n'; // 0
When two equal types are compared, 0 is returned:
cout << typeid(ifstream).before(typeid(ifstream)) << '\n'; // 0
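As the collating order is consistent within one program, before can serve as the ordering criterion of a sorted container. The following is a minimal sketch under that assumption: the comparator TypeInfoLess, the type TypeLabels and the labels themselves are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeinfo>

// Orders type_info objects by their (implementation dependent) collating order.
struct TypeInfoLess
{
    bool operator()(std::type_info const *lhs, std::type_info const *rhs) const
    {
        return lhs->before(*rhs);
    }
};

// Maps types to (hypothetical) descriptive labels.
typedef std::map<std::type_info const *, std::string, TypeInfoLess> TypeLabels;

TypeLabels makeLabels()
{
    TypeLabels labels;
    labels[&typeid(int)] = "integral";
    labels[&typeid(double)] = "floating point";
    return labels;
}

// Returns the label of value's type, or "unknown" if the type has no label.
template <typename Type>
std::string labelOf(TypeLabels const &labels, Type const &value)
{
    TypeLabels::const_iterator it = labels.find(&typeid(value));
    if (it == labels.end())
        return "unknown";
    return it->second;
}

void demo()
{
    TypeLabels labels = makeLabels();
    assert(labelOf(labels, 12) == "integral");
    assert(labelOf(labels, 12.23) == "floating point");
    assert(labelOf(labels, 'x') == "unknown");
    // a type never precedes itself
    assert(not typeid(int).before(typeid(int)));
}
```

Since C++11 the class std::type_index (header <typeindex>) wraps the same idea, so a hand-written comparator is often not needed.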
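When typeid is applied to a dereferenced 0-pointer to an object of a polymorphic class (as in
typeid(*bp), where bp is 0) a bad_typeid exception is thrown. Applying typeid to the 0-pointer
itself merely returns the pointer’s type.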
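A minimal sketch showing when bad_typeid is thrown. The function throwsBadTypeid is a hypothetical helper; the class Base merely offers a virtual destructor to make it polymorphic.

```cpp
#include <cassert>
#include <typeinfo>

class Base
{
    public:
        virtual ~Base() {}
};

// Returns true if typeid(*bp) throws std::bad_typeid.
bool throwsBadTypeid(Base *bp)
{
    try
    {
        // Base is polymorphic, so *bp is evaluated at run time
        std::type_info const &info = typeid(*bp);
        (void)info;
        return false;
    }
    catch (std::bad_typeid const &)
    {
        return true;
    }
}

void demo()
{
    Base object;
    assert(not throwsBadTypeid(&object));
    assert(throwsBadTypeid(0));

    // typeid applied to the pointer itself merely yields the pointer's type
    Base *bp = 0;
    assert(typeid(bp) == typeid(Base *));
}
```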
14.7 Inheritance: when to use to achieve what?
Inheritance should not be applied automatically and thoughtlessly. Often composition can be used
instead, improving on a class’s design by reducing coupling. When inheritance is used public inheritance
should not automatically be used but the type of inheritance that is selected should match the
programmer’s intent.
We’ve seen that polymorphic classes on the one hand offer interface members defining the functionality
that can be requested of base classes and on the other hand offer virtual members that can be
overridden. One of the signs of good class design is that member functions are designed according
to the principle of ‘one function, one task’. In the current context: a class member should either be
a member of the class’s public or protected interface or it should be available as a virtual member
for reimplementation by derived classes. Often this boils down to virtual members that are defined
in the base class’s private section. Those functions shouldn’t be called by code using the base class,
but they exist to be overridden by derived classes using polymorphism to redefine the base class’s
behavior.
The underlying principle was mentioned before in the introductory paragraph of this chapter: according
to the Liskov Substitution Principle (LSP) an is-a relationship between classes (indicating
that a derived class object is a base class object) implies that a derived class object may be used in
code expecting a base class object.
In this case inheritance is used not to let the derived class use the facilities already implemented by
the base class but to reuse the base class polymorphically by reimplementing the base class’s virtual
members in the derived class.
In this section we’ll discuss the reasons for using inheritance. Why should inheritance (not) be used?
If it is used what do we try to accomplish by it?
Inheritance often competes with composition. Consider the following two alternative class designs:
class Derived: public Base
{ ... };
class Composed
{
Base d_base;
...
};
Why and when prefer Derived over Composed and vice versa? What kind of inheritance should be
used when designing the class Derived?
• Since Composed and Derived are offered as alternatives we are looking at the design of a
class (Derived or Composed) that is-implemented-in-terms-of another class.
• Since Composed does itself not make Base’s interface available, Derived shouldn’t do so either.
The underlying principle is that private inheritance should be used when deriving a class
Derived from Base where Derived is-implemented-in-terms-of Base.
• Should we use inheritance or composition? Here are some arguments:
– In general terms composition results in looser coupling and should therefore be preferred
over inheritance.
– Composition allows us to define classes having multiple members of the same type (think
about a class having multiple std::string members) which can not be realized using
inheritance.
– Composition allows us to separate the class’s interface from its implementation. This
allows us to modify the class’s data organization without the need to recompile code using
our class. This is also known as the bridge design pattern or the compiler firewall or pimpl
(pointer to the implementation) idiom.
– If Base offers members in its protected interface that must be used when implementing
Derived, inheritance must also be used. Again: since we’re implementing-in-terms-of, the
inheritance type should be private.
– Protected inheritance may be considered when the derived class (D) itself is intended as a
base class that should only make the members of its own base class (B) available to classes
that are derived from it (i.e., D).
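The arguments listed above can be summarized in a minimal sketch. All members shown here (text, append, length, exclaim) are hypothetical and were chosen for illustration only. Composed merely needs Base's public interface, so composition suffices. Derived needs Base's protected member append, so inheritance is required, and since Derived is-implemented-in-terms-of Base private inheritance is used.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class offering some facilities only to derived classes.
class Base
{
    std::string d_text;

    public:
        explicit Base(std::string const &text)
        :
            d_text(text)
        {}
        std::string const &text() const { return d_text; }

    protected:
        void append(std::string const &extra) { d_text += extra; }
};

// Composition: Base's public interface is all that is needed.
class Composed
{
    Base d_base;

    public:
        explicit Composed(std::string const &text)
        :
            d_base(text)
        {}
        std::string::size_type length() const { return d_base.text().length(); }
};

// Private inheritance: Base's protected member append is needed, but
// Base's interface does not become part of Derived's interface.
class Derived: private Base
{
    public:
        explicit Derived(std::string const &text)
        :
            Base(text)
        {}
        void exclaim() { append("!"); }
        std::string const &text() const { return Base::text(); }
};

void demo()
{
    Composed composed("hello");
    assert(composed.length() == 5);

    Derived derived("hello");
    derived.exclaim();
    assert(derived.text() == "hello!");
}
```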
Private inheritance should also be used when a derived class is-a certain type of base class, but in
order to initialize that base class an object of another class type must be available. Example: a new
istream class-type (say: a stream IRandStream from which random numbers can be extracted)
is derived from std::istream. Although an istream can be constructed empty (receiving its
streambuf later using its rdbuf member), it is clearly preferable to initialize the istream base
class right away.
Assuming that a Randbuffer: public std::streambuf has been created for generating random
numbers then IRandStream can be derived from Randbuffer and std::istream. That way
the istream base class can be initialized using the Randbuffer base class.
As an IRandStream is definitely not a Randbuffer, public inheritance is not appropriate. In this case
IRandStream is-implemented-in-terms-of a Randbuffer and so private inheritance should be used.
IRandStream’s class interface should therefore start like this:
class IRandStream: private Randbuffer, public std::istream
{
public:
IRandStream(int lowest, int highest) // defines the range
:
Randbuffer(lowest, highest),
std::istream(this) // passes &Randbuffer
{}
...
};
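To show the construction order at work, here is a minimal, self-contained sketch. The implementation of Randbuffer shown here is an assumption: it uses a fixed seed and produces each random number as text followed by a blank. It is not necessarily the implementation the text has in mind.

```cpp
#include <cassert>
#include <istream>
#include <random>
#include <streambuf>
#include <string>

// Assumed design: a streambuf producing random numbers as text.
class Randbuffer: public std::streambuf
{
    std::mt19937 d_engine;
    std::uniform_int_distribution<int> d_dist;
    std::string d_buffer;               // holds the text of one number

    public:
        Randbuffer(int lowest, int highest)
        :
            d_engine(12345),            // fixed seed: reproducible sketch
            d_dist(lowest, highest)
        {}

    private:
        int underflow() override        // called when the buffer is exhausted
        {
            d_buffer = std::to_string(d_dist(d_engine)) + ' ';
            char *begin = &d_buffer[0];
            setg(begin, begin, begin + d_buffer.size());
            return static_cast<unsigned char>(*gptr());
        }
};

class IRandStream: private Randbuffer, public std::istream
{
    public:
        IRandStream(int lowest, int highest)    // defines the range
        :
            Randbuffer(lowest, highest),        // constructed before istream
            std::istream(this)                  // passes the Randbuffer base
        {}
};

void demo()
{
    IRandStream in(1, 6);
    for (int count = 0; count != 100; ++count)
    {
        int value = 0;
        in >> value;
        assert(not in.fail());
        assert(1 <= value && value <= 6);
    }
}
```

As Randbuffer is listed first in the base class list it is fully constructed by the time its address is passed to the istream base class.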
Public inheritance should be reserved for classes for which the LSP holds true. In those cases the
derived classes can always be used instead of the base class from which they derive by code merely
using base class references, pointers or members (i.e., conceptually the derived class is-a base class).
This most often applies to classes derived from base classes offering virtual members. To separate
the user interface from the redefinable interface the base class’s public interface should not contain
virtual members (except for the virtual destructor) and the virtual members should all be in the base
class’s private section. Such virtual members can still be overridden by derived classes (this should
not come as a surprise, considering how polymorphism is implemented) and this design offers the
base class full control over the context in which the redefined members are used. Often the public
interface merely calls a virtual member, but those members can always be redefined to perform
additional duties.
The prototypical form of a base class therefore looks like this:
class Base
{
public:
virtual ~Base();
void process(); // calls virtual members (e.g.,
// v_process)
private:
virtual void v_process(); // overridden by derived classes
};
Alternatively a base class may offer a non-virtual destructor, which should then be protected. It
shouldn’t be public to prevent deleting objects through their base class pointers (in which case virtual
destructors should be used). It should be protected to allow derived class destructors to call their
base class destructors. Such base classes should, for the same reasons, have non-public constructors
and overloaded assignment operators.
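A minimal sketch of the prototypical form in action. As an assumption made for this illustration v_process returns a std::string (instead of void) and Base keeps a log, so that the effect of overriding can be inspected.

```cpp
#include <cassert>
#include <string>

class Base
{
    std::string d_log;

    public:
        virtual ~Base() {}
        void process()                  // the user interface: not virtual
        {
            d_log += "begin ";          // the context is controlled by Base
            d_log += v_process();       // the redefinable part
            d_log += " end";
        }
        std::string const &log() const { return d_log; }

    private:
        virtual std::string v_process() { return "base"; }
};

class Derived: public Base
{
    private:                            // private virtuals can be overridden
        std::string v_process() override { return "derived"; }
};

void demo()
{
    Derived derived;
    Base &base = derived;
    base.process();
    assert(base.log() == "begin derived end");

    Base plain;
    plain.process();
    assert(plain.log() == "begin base end");
}
```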
14.8 The ‘streambuf’ class
The class std::streambuf receives the character sequences processed by streams and defines the
interface between stream objects and devices (like a file on disk). A streambuf object is usually
not constructed directly; instead it is used as the base class of a derived class implementing the
communication with some concrete device.
The primary reason for existence of the class streambuf is to decouple the stream classes from the
devices they operate upon. The rationale here is to add an extra layer between the classes allowing
us to communicate with devices and the devices themselves. This implements a chain of command
which is seen regularly in software design.
The chain of command is considered a generic pattern when designing reusable software, encountered
also in, e.g., the TCP/IP stack.
A streambuf can be considered yet another example of the chain of command pattern. Here the
program talks to stream objects, which in turn forward their requests to streambuf objects, which
in turn communicate with the devices. Thus, as we will see shortly, we are able to do in user-software
what had to be done via (expensive) system calls before.
The class streambuf has no public constructor, but does make available several public member
functions. In addition to these public member functions, several member functions are only available
to classes derived from streambuf. In section 14.8.2 a predefined specialization of the class
streambuf is introduced. All public members of streambuf discussed here are also available in
filebuf.
The next section shows the streambuf members that may be overridden when deriving classes from
streambuf. Chapter 24 offers concrete examples of classes derived from streambuf.
The class streambuf is used by streams performing input operations and by streams performing
output operations and their member functions can be ordered likewise. The type std::streamsize
used below is a signed integral type; for all practical purposes it may be considered the signed
counterpart of the type size_t.
When inserting information into ostream objects the information is eventually passed on to the
ostream’s streambuf. The streambuf may decide to throw an exception. However, this exception
does not leave the ostream using the streambuf. Rather, the exception is caught by the ostream,
which sets its ios::badbit. Exceptions raised by manipulators inserted into ostream objects are
not caught by the ostream objects.
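This behavior can be observed using a minimal sketch. The class ThrowingBuf and the function exceptionEscapes are hypothetical: the streambuf refuses all output by throwing an exception from its overflow member.

```cpp
#include <cassert>
#include <ostream>
#include <stdexcept>
#include <streambuf>

// Hypothetical streambuf refusing all output by throwing an exception.
class ThrowingBuf: public std::streambuf
{
    int overflow(int) override
    {
        throw std::runtime_error("device unavailable");
    }
};

// Returns true if the exception thrown by the streambuf left the ostream.
bool exceptionEscapes(std::ostream &out)
{
    try
    {
        out << "text";
        return false;
    }
    catch (std::runtime_error const &)
    {
        return true;
    }
}

void demo()
{
    ThrowingBuf buf;
    std::ostream out(&buf);

    assert(not exceptionEscapes(out));  // caught by the ostream,
    assert(out.bad());                  // which sets ios::badbit

    // Once badbit is part of the stream's exception mask the
    // exception is rethrown by the ostream.
    out.clear();
    out.exceptions(std::ios::badbit);
    assert(exceptionEscapes(out));
}
```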
Public members for input operations
• std::streamsize in_avail():
Returns a lower bound on the number of characters that can be read immediately.
• int sbumpc():
The next available character or EOF is returned. The returned character is removed
from the streambuf object. If no input is available, sbumpc calls the (protected)
member uflow (see section 14.8.1 below) to make new characters available. EOF is
returned if no more characters are available.
• int sgetc():
The next available character or EOF is returned. The character is not removed from
the streambuf object. To remove a character from the streambuf object, sbumpc
(or sgetn) can be used.
• std::streamsize sgetn(char *buffer, std::streamsize n):
At most n characters are retrieved from the input buffer, and stored in buffer. The
actual number of characters read is returned. The (protected) member xsgetn (see
section 14.8.1 below) is called to obtain the requested number of characters.
• int snextc():
The current character is removed from the input buffer, after which the next
available character (or EOF) is returned. The returned character itself is not
removed from the streambuf object.
• int sputbackc(char c):
Inserts c into the streambuf’s buffer to be returned as the next character to read
from the streambuf object. Caution should be exercised when using this function:
often there is a maximum of just one character that can be put back.
• int sungetc():
Returns the last character read to the input buffer, to be read again at the next input
operation. Caution should be exercised when using this function: often there is a
maximum of just one character that can be put back.
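The input members can be tried out on a std::stringbuf, a streambuf specialization operating on a std::string. The following is a minimal sketch: the function drain is a hypothetical helper, and the asserts document the values returned by the members.

```cpp
#include <cassert>
#include <cstdio>       // EOF
#include <sstream>
#include <string>

// Returns all remaining characters of buf, removing them using sbumpc.
std::string drain(std::streambuf &buf)
{
    std::string ret;
    for (int ch; (ch = buf.sbumpc()) != EOF; )
        ret += static_cast<char>(ch);
    return ret;
}

void demo()
{
    std::stringbuf buf("abc", std::ios::in);

    assert(buf.in_avail() == 3);
    assert(buf.sgetc() == 'a');         // peek: 'a' remains available
    assert(buf.sbumpc() == 'a');        // 'a' is removed
    assert(buf.snextc() == 'c');        // 'b' is removed, 'c' is peeked
    assert(buf.sungetc() == 'b');       // step back: 'b' is next again
    assert(buf.sputbackc('a') == 'a');  // put back the character read first

    char text[10];
    std::streamsize nRead = buf.sgetn(text, sizeof(text));
    assert(nRead == 3);
    assert(std::string(text, nRead) == "abc");

    assert(buf.sbumpc() == EOF);        // nothing left
}
```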
Public members for output operations
• int pubsync():
Synchronizes (i.e., flushes) the buffer by writing any information currently available in
the streambuf’s buffer to the device. Normally only used by classes derived from
streambuf.
• int sputc(char c):
Character c is inserted into the streambuf object. If the buffer has no room for the
character, the function calls the (protected) member function overflow to flush
the buffer to the device (see section 14.8.1 below).
• std::streamsize sputn(char const *buffer, std::streamsize n):
At most n characters from buffer are inserted into the streambuf object. The
actual number of characters inserted is returned. This member function calls the
(protected) member xsputn (see section 14.8.1 below) to insert the requested number
of characters.
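The output members can likewise be tried out on a std::stringbuf. In this minimal sketch the function writeDemo is a hypothetical helper returning what was received by the stringbuf.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Writes "abcd" using the public streambuf interface and returns
// what arrived at the 'device' (here: the stringbuf's string).
std::string writeDemo()
{
    std::stringbuf buf(std::ios::out);

    int ret = buf.sputc('a');           // returns the inserted character
    assert(ret == 'a');

    std::streamsize nWritten = buf.sputn("bcd", 3);
    assert(nWritten == 3);              // the number of inserted characters

    ret = buf.pubsync();                // 0 indicates success
    assert(ret == 0);

    return buf.str();
}
```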
Public members for miscellaneous operations
The next three members are normally only used by classes derived from streambuf.
• ios::pos_type pubseekoff(ios::off_type offset, ios::seekdir way,
ios::openmode mode = ios::in | ios::out):
Sets the position of the next character to be read or written to offset, relative to the
position indicated by way (one of the standard ios::seekdir values ios::beg, ios::cur or ios::end).
• ios::pos_type pubseekpos(ios::pos_type pos, ios::openmode mode =
ios::in | ios::out):
Sets the absolute position of the next character to be read or written to pos.
• streambuf *pubsetbuf(char *buffer, std::streamsize n):
The streambuf object is going to use buffer, accommodating at least n characters.
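A minimal sketch of the seeking members, again using a std::stringbuf. The function peekAt is a hypothetical helper: it repositions the read pointer and returns the character found at the new position.

```cpp
#include <cassert>
#include <cstdio>       // EOF
#include <sstream>

// Repositions buf's read pointer and returns the character at that
// position (or EOF if the repositioning request failed).
int peekAt(std::streambuf &buf, std::streamoff offset, std::ios::seekdir way)
{
    std::streampos pos = buf.pubseekoff(offset, way, std::ios::in);
    if (pos == std::streampos(std::streamoff(-1)))
        return EOF;
    return buf.sgetc();
}

void demo()
{
    std::stringbuf buf("0123456789", std::ios::in);

    assert(peekAt(buf, 3, std::ios::beg) == '3');   // relative to the begin
    assert(peekAt(buf, 2, std::ios::cur) == '5');   // relative to position 3
    assert(peekAt(buf, -1, std::ios::end) == '9');  // relative to the end

    // pubseekpos uses absolute positions
    assert(buf.pubseekpos(std::streampos(4), std::ios::in) == std::streampos(4));
    assert(buf.sgetc() == '4');
}
```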
14.8.1 Protected ‘streambuf’ members
The protected members of the class streambuf are important for understanding and using
streambuf objects. Although there are both protected data members and protected member functions
defined in the class streambuf the protected data members are not mentioned here as using
them would violate the principle of data hiding. As streambuf’s set of member functions is quite extensive,
it is hardly ever necessary to use its data members directly. The following subsections do not
even list all protected member functions but only those are covered that are useful for constructing
specializations.
Streambuf objects control a buffer, used for input and/or output, for which begin-, actual- and end-pointers
have been defined, as depicted in figure 14.4.
Streambuf offers two protected constructors:
• streambuf::streambuf():
Default (protected) constructor of the class streambuf.
• streambuf::streambuf(streambuf const &rhs):
(Protected) copy constructor of the class streambuf. Note that this copy constructor
merely copies the values of the data members of rhs: after using the copy constructor
both streambuf objects refer to the same data buffer and initially their
pointers point at identical positions. Also note that these are not shared pointers, but
only ‘raw copies’.
14.8.1.1 Protected members for input operations
Several protected member functions are available for input operations. The member functions
marked virtual may of course be redefined in derived classes:
• char *eback():
Streambuf maintains three pointers controlling its input buffer: eback points to the
‘end of the putback’ area: characters can safely be put back up to this position. See
also figure 14.4. Eback points to the beginning of the input buffer.
• char *egptr():
Egptr points just beyond the last character that can be retrieved from the input
buffer. See also figure 14.4. If gptr equals egptr the buffer must be refilled. This
should be implemented by calling underflow, see below.
• void gbump(int n):
The object’s gptr (see below) is advanced over n positions.
• char *gptr():
Gptr points to the next character to be retrieved from the object’s input buffer. See
also figure 14.4.
Figure 14.4: Input- and output buffer pointers of the class ‘streambuf’
• virtual int pbackfail(int c):
This member function may be overridden by derived classes to do something intelligent
when putting back character c fails. One might consider restoring the old read
pointer when the input buffer’s begin has been reached. This member function is called
when ungetting or putting back a character fails. In particular, it is called when
– gptr() == 0: no buffering used,
– gptr() == eback(): no more room to push back,
– gptr()[-1] != c: a character different from the character that was read last must be
pushed back.
If c == EOF then the input device must be reset by one character position.
Otherwise c must be prepended to the characters to be read. The function should
return EOF on failure. Otherwise 0 can be returned.
• void setg(char *beg, char *next, char *beyond):
Initializes an input buffer. beg points to the beginning of the input area, next points
to the next character to be retrieved, and beyond points to the location just beyond
the input buffer’s last character. Usually next is at least beg + 1, to allow for a put
back operation. No input buffering is used when this member is called as setg(0,
0, 0). See also the member uflow, below.
• virtual streamsize showmanyc():
(Pronounce: s-how-many-c) This member function may be overridden by derived
classes. It must return a guaranteed lower bound on the number of characters that
can be read from the device before uflow or underflow returns EOF. By default 0 is
returned (meaning no or some characters are returned before the latter two functions
return EOF). When a positive value is returned then the next call of u(nder)flow
does not return EOF.
• virtual int uflow():
This member function may be overridden by derived classes to reload an input buffer
with fresh characters. Its default implementation is to call underflow (see below).
If underflow() fails, EOF is returned. Otherwise, the next available character is returned
as *gptr(), after which gbump(1) is called. Uflow thus moves the pending character
that is returned to the backup sequence. This is different from underflow(), which
merely returns the next available character, but does not alter the input pointer positions.
When no input buffering is required this function, rather than underflow, can be
overridden to produce the next available character from the device to read from.
• virtual int underflow():
This member function may be overridden by derived classes to read another character
from the device. The default implementation is to return EOF.
It is called when
– there is no input buffer (eback() == 0)
– gptr() >= egptr(): the input buffer is exhausted.
Often, when buffering is used, the complete buffer is not refreshed as this would
make it impossible to put back characters immediately following a reload. Instead,
buffers are often refreshed in halves. This system is called a split buffer.
Classes derived from streambuf for reading normally at least override underflow.
The prototypical example of an overridden underflow function looks like this:
int underflow()
{
if (not refillTheBuffer()) // assume a member d_buffer is available
return EOF;
// reset the input buffer pointers
setg(d_buffer, d_buffer, d_buffer + d_nCharsRead);
// return the next available character
// (the cast is used to prevent
// misinterpretations of 0xff characters
// as EOF)
return static_cast<unsigned char>(*gptr());
}
• virtual streamsize xsgetn(char *buffer, streamsize n):
This member function may be overridden by derived classes to retrieve at once n characters
from the input device. The default implementation is to call sbumpc for every
single character meaning that by default this member (eventually) calls underflow
for every single character. The function returns the actual number of characters read
or EOF. Once EOF is returned the streambuf stops reading the device.
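To put underflow in context, here is a minimal sketch of a complete class derived from streambuf for reading. The class ChunkBuf is hypothetical: its 'device' is a std::string from which at most four characters are read at a time. For brevity no room is reserved for putting back characters after a reload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>       // EOF
#include <istream>
#include <streambuf>
#include <string>

// Sketch: the 'device' is a std::string, read in chunks of at most SIZE chars.
class ChunkBuf: public std::streambuf
{
    enum { SIZE = 4 };

    std::string d_device;
    std::size_t d_offset;           // next position to read from the device
    std::size_t d_nCharsRead;       // number of characters in d_buffer
    char d_buffer[SIZE];

    public:
        explicit ChunkBuf(std::string const &device)
        :
            d_device(device),
            d_offset(0),
            d_nCharsRead(0)
        {}

    private:
        bool refillTheBuffer()      // false: the device is exhausted
        {
            d_nCharsRead = d_device.copy(d_buffer, SIZE, d_offset);
            d_offset += d_nCharsRead;
            return d_nCharsRead != 0;
        }

        int underflow() override
        {
            if (not refillTheBuffer())
                return EOF;
            // reset the input buffer pointers
            setg(d_buffer, d_buffer, d_buffer + d_nCharsRead);
            return static_cast<unsigned char>(*gptr());
        }
};

void demo()
{
    ChunkBuf buf("hello world");
    std::istream in(&buf);          // the stream only sees a streambuf

    std::string first;
    std::string second;
    in >> first >> second;
    assert(first == "hello");
    assert(second == "world");
    assert(in.eof());
}
```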
14.8.1.2 Protected members for output operations
The following protected members are available for output operations. Again, some members may be
overridden by derived classes:
• virtual int overflow(int c):
This member function may be overridden by derived classes to flush the characters
currently stored in the output buffer to the output device, and then to reset the output
buffer pointers so as to represent an empty buffer. Its parameter c is initialized
to the next character to be processed. If no output buffering is used overflow is
called for every single character that is written to the streambuf object. No output
buffering is accomplished by setting the buffer pointers (using setp, see below) to
0. The default implementation returns EOF, indicating that no characters can be
written to the device.
Classes derived from streambuf for writing normally at least override overflow.
The prototypical example of an overridden overflow function looks like this:
int OFdStreambuf::overflow(int c)
{
sync(); // flush the buffer
if (c != EOF) // write a character?
{
*pptr() = static_cast<char>(c); // put it into the buffer
pbump(1); // advance the buffer’s pointer
}
return c;
}
• char *pbase():
Streambuf maintains three pointers controlling its output buffer: pbase points to
the beginning of the output buffer area. See also figure 14.4.
• char *epptr():
Streambuf maintains three pointers controlling its output buffer: epptr points just
beyond the output buffer’s last available location. See also figure 14.4. If pptr (see
below) equals epptr the buffer must be flushed. This is implemented by calling
overflow, see before.
• void pbump(int n):
The location returned by pptr (see below) is advanced by n. The next character
written to the stream will be entered at that location.
• char *pptr():
Streambuf maintains three pointers controlling its output buffer: pptr points to the
location in the output buffer where the next available character should be written.
See also figure 14.4.
• void setp(char *beg, char *beyond):
Streambuf’s output buffer is initialized to the locations passed to setp. Beg points
to the beginning of the output buffer and beyond points just beyond the last available
location of the output buffer. Use setp(0, 0) to indicate that no buffering should
be used. In that case overflow is called for every single character to write to the
device.
• virtual streamsize xsputn(char const *buffer, streamsize n):
This member function may be overridden by derived classes to write a series of at
most n characters to the output buffer. The actual number of inserted characters
is returned. If EOF is returned writing to the device stops. The default implementation
calls sputc for each individual character. Redefine this member if, e.g., the
streambuf should support the ios::openmode ios::app. Assuming the class
MyStreambuf, derived from streambuf, features a data member ios::openmode d_openMode
(representing the requested ios::openmode), and a member write(char const
*buf, streamsize len) (writing len bytes at pptr()), then the following code
acknowledges the ios::app mode:
std::streamsize MyStreambuf::xsputn(char const *buf, std::streamsize len)
{
if (d_openMode & ios::app)
seekoff(0, ios::end);
return write(buf, len);
}
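To see overflow, sync, setp and pbump cooperate, here is a minimal sketch of a complete class derived from streambuf for writing. The class StringDeviceBuf is hypothetical: its 'device' is a std::string, and a deliberately small buffer is used so that overflow is called frequently.

```cpp
#include <cassert>
#include <cstdio>       // EOF
#include <ostream>
#include <streambuf>
#include <string>

// Sketch: the 'device' is a std::string, written via a buffer of SIZE chars.
class StringDeviceBuf: public std::streambuf
{
    enum { SIZE = 4 };

    std::string &d_device;
    char d_buffer[SIZE];

    public:
        explicit StringDeviceBuf(std::string &device)
        :
            d_device(device)
        {
            setp(d_buffer, d_buffer + SIZE);    // define the output buffer
        }
        ~StringDeviceBuf() override
        {
            sync();                             // write what is still buffered
        }

    private:
        int sync() override
        {
            d_device.append(pbase(), pptr());   // write the buffered characters
            setp(d_buffer, d_buffer + SIZE);    // the buffer is empty again
            return 0;
        }

        int overflow(int c) override
        {
            sync();                             // flush the buffer
            if (c != EOF)                       // write a character?
            {
                *pptr() = static_cast<char>(c); // put it into the buffer
                pbump(1);                       // advance the buffer's pointer
            }
            return c;
        }
};

void demo()
{
    std::string device;
    StringDeviceBuf buf(device);
    std::ostream out(&buf);

    out << "hello world";
    out.flush();                    // calls pubsync, which calls sync
    assert(device == "hello world");
}
```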
14.8.1.3 Protected members for buffer manipulation
Several protected members are related to buffer management and positioning:
• virtual streambuf *setbuf(char *buffer, streamsize n):
This member function may be overridden by derived classes to install a buffer. The
default implementation performs no actions. It is called by pubsetbuf.
• virtual ios::pos_type seekoff(ios::off_type offset, ios::seekdir way,
ios::openmode mode = ios::in |ios::out):
This member function may be overridden by derived classes to reset the next pointer
for input or output to a new relative position (using ios::beg, ios::cur or
ios::end). The default implementation indicates failure by returning -1. The function
is called when tellg or tellp are called. When a derived class supports seeking,
then it should also define this function to handle repositioning requests. It is called
by pubseekoff. The new position or an invalid position (i.e., -1) is returned.
• virtual ios::pos_type seekpos(ios::pos_type offset, ios::openmode mode =
ios::in |ios::out):
This member function may be overridden by derived classes to reset the next pointer
for input or output to a new absolute position (i.e., relative to ios::beg). The default
implementation indicates failure by returning -1.
• virtual int sync():
This member function may be overridden by derived classes to flush the output buffer
to the output device or to reset the input device just beyond the position of the character
that was returned last. It returns 0 on success, -1 on failure. The default implementation
(not using a buffer) is to return 0, indicating successful syncing. This
member is used to ensure that any characters that are still buffered are written to
the device or to put unconsumed characters back to the device when the streambuf
object ceases to exist.
14.8.1.4 Deriving classes from ‘streambuf’
When classes are derived from streambuf at least underflow should be overridden by classes
intending to read information from devices, and overflow should be overridden by classes intending
to write information to devices. Several examples of classes derived from streambuf are provided
in chapter 24.
Fstream class type objects use a combined input/output buffer. This results from istream
and ostream being virtually derived from ios, which class contains the streambuf. To construct
a class supporting both input and output using separate buffers, the streambuf itself may define
two buffers. When seekoff is called for reading, a mode parameter can be set to ios::in, otherwise
to ios::out. Thus the derived class knows whether it should access the read buffer or the
write buffer. Of course, underflow and overflow do not have to inspect the mode flag as they by
implication know on which buffer they should operate.
14.8.2 The class ‘filebuf’
The class filebuf is a specialization of streambuf used by the file stream classes. Before using
a filebuf the header file <fstream> must be included.
In addition to the (public) members that are available through the class streambuf, filebuf
offers the following (public) members:
• filebuf():
Filebuf offers a public constructor. It initializes a plain filebuf object that is not
yet connected to a stream.
• bool is_open():
True is returned if the filebuf is actually connected to an open file, false otherwise.
See the open member, below.
• filebuf *open(char const *name, ios::openmode mode):
Associates the filebuf object with a file whose name is provided. The file is opened
according to the provided openmode.
• filebuf *close():
Closes the association between the filebuf object and its file. The association is
automatically closed when the filebuf object ceases to exist.
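A minimal sketch using these members. The function roundTrip is a hypothetical helper: it writes a line of text to a file, reads it back, and removes the file again. It assumes that the file can be created in the current working directory.

```cpp
#include <cassert>
#include <cstdio>       // std::remove
#include <fstream>
#include <istream>
#include <ostream>
#include <string>

// Writes text to the named file and reads it back, using a filebuf
// that is connected to plain ostream and istream objects.
std::string roundTrip(char const *name, std::string const &text)
{
    std::filebuf buf;
    assert(not buf.is_open());          // not yet connected to a file

    if (buf.open(name, std::ios::out | std::ios::trunc) == 0)
        return "";                      // open returns 0 on failure

    std::ostream out(&buf);
    out << text;
    out.flush();
    buf.close();                        // ends the association

    std::string ret;
    if (buf.open(name, std::ios::in) != 0)
    {
        std::istream in(&buf);
        std::getline(in, ret);
        buf.close();
    }

    std::remove(name);                  // clean up the demo file
    return ret;
}
```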
14.8.3 Safely interfacing streams to another std::streambuf
Consider classes derived from std::istream or std::ostream. Such a class could be designed as
follows:
class XIstream: public std::istream
{
public:
...
};
Assuming that the streambuf to which XIstream interfaces is not yet available at construction
time, XIstream only offers default constructors. The class could, however, offer a member void
switchStream(std::streambuf *sb) to provide XIstream objects with a streambuf to interface
to. How to implement switchStream? Simply calling rdbuf, passing it the pointer to
the new streambuf, may work, but the problem is that there may be an existing streambuf, which
may have buffered some information that we don’t want to lose.
Instead of using rdbuf the protected member void init(std::streambuf *sb) should be used
for switching to another streambuf in an existing stream.
The init member expects a pointer to a streambuf which should be associated with the istream or
ostream object. The init member properly ends any existing association before switching to the
streambuf whose address is provided to init.
Assuming that the streambuf to which switchStream’s sb points persists, then switchStream
could simply be implemented like this:
void switchStream(streambuf *sb)
{
init(sb);
}
No further actions are required. The init member ends the current association, and only then
switches to using streambuf *sb.
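A minimal sketch of such a class follows. As an assumption, XIstream's default constructor passes a 0-pointer to the std::istream base class, since not all implementations offer a default std::istream constructor. The two std::stringbuf objects merely act as the streambufs to switch between.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <streambuf>

class XIstream: public std::istream
{
    public:
        XIstream()
        :
            // no streambuf is available yet
            std::istream(static_cast<std::streambuf *>(0))
        {}

        void switchStream(std::streambuf *sb)
        {
            init(sb);       // protected member, inherited from std::ios
        }
};

void demo()
{
    std::stringbuf first("12");
    std::stringbuf second("34");

    XIstream in;
    int value = 0;

    in.switchStream(&first);
    in >> value;
    assert(value == 12);

    in.switchStream(&second);       // also resets the stream's state
    in >> value;
    assert(value == 34);
}
```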
14.9 A polymorphic exception class
Earlier in the C++ Annotations (section 10.3.1) we hinted at the possibility of designing a class
Exception whose process member would behave differently, depending on the kind of exception
that was thrown. Now that we’ve introduced polymorphism we can further develop this example.
It probably does not come as a surprise that our class Exception should be a polymorphic base class
from which special exception handling classes can be derived. In section 10.3.1 a member severity
was used offering functionality that may be replaced by members of the Exception base class.
The base class Exception may be designed as follows:
#ifndef INCLUDED_EXCEPTION_H_
#define INCLUDED_EXCEPTION_H_
#include <iostream>
#include <string>
class Exception
{
std::string d_reason;
public:
Exception(std::string const &reason);
virtual ~Exception();
std::ostream &insertInto(std::ostream &out) const;
void handle() const;
private:
virtual void action() const;
};
inline void Exception::action() const
{
throw;
}
inline Exception::Exception(std::string const &reason)
:
d_reason(reason)
{}
inline void Exception::handle() const
{
action();
}
inline std::ostream &Exception::insertInto(std::ostream &out) const
{
return out << d_reason;
}
inline std::ostream &operator<<(std::ostream &out, Exception const &e)
{
return e.insertInto(out);
}
#endif
Objects of this class may be inserted into ostreams but the core element of this class is the virtual
member function action, by default rethrowing an exception.
A derived class Warning simply prefixes the thrown warning text by the text Warning:, but a
derived class Fatal overrides Exception::action by calling std::terminate, forcefully terminating
the program.
Here are the classes Warning and Fatal:
#ifndef WARNINGEXCEPTION_H_
#define WARNINGEXCEPTION_H_
#include "exception.h"
class Warning: public Exception
{
public:
Warning(std::string const &reason)
:
Exception("Warning: " + reason)
{}
};
#endif
#ifndef FATAL_H_
#define FATAL_H_
#include <exception>
#include "exception.h"
class Fatal: public Exception
{
public:
Fatal(std::string const &reason);
private:
virtual void action() const;
};
inline Fatal::Fatal(std::string const &reason)
:
Exception(reason)
{}
inline void Fatal::action() const
{
std::cout << "Fatal::action() terminates" << '\n';
std::terminate();
}
#endif
When the example program is started without arguments it throws a Fatal exception, otherwise
it throws a Warning exception. Of course, additional exception types could also easily be defined.
To make the example compilable the Exception destructor is defined above main. As the destructor
is declared (as a virtual member) in Exception’s interface the compiler doesn’t provide one
implicitly, so a definition must be supplied. In practice the destructor should be defined
in its own little source file:
#include "warning.h"
#include "fatal.h"
Exception::~Exception()
{}
using namespace std;
int main(int argc, char **argv)
try
{
try
{
if (argc == 1)
throw Fatal("Missing Argument");
else
throw Warning("the argument is ignored");
}
catch (Exception const &e)
{
cout << e << '\n';
e.handle();
}
}
catch(...)
{
cout << "caught rethrown exception\n";
}
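The rethrow performed by Exception::action works because action is (indirectly) called from within a catch clause: an empty throw rethrows the exception that is currently being handled, even when the throw statement itself lives in another function. The following minimal sketch isolates that mechanism; the names rethrowCurrent and demo are made up for this illustration:

```cpp
#include <stdexcept>
#include <string>

// Helper meant to be called from inside a catch clause: the empty throw
// rethrows the exception that is currently being handled.
void rethrowCurrent()
{
    throw;
}

std::string demo()
{
    try
    {
        try
        {
            throw std::runtime_error("original");
        }
        catch (std::exception const &)
        {
            rethrowCurrent();       // leaves this handler with the same exception
        }
    }
    catch (std::runtime_error const &e)
    {
        return e.what();            // the exception's actual type was preserved
    }
    return "not reached";
}
```

Calling a function like rethrowCurrent while no exception is being handled results in a call of std::terminate, so such members should only be used from within exception handlers.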
14.10 How polymorphism is implemented
This section briefly describes how polymorphism is implemented in C++. It is not necessary to understand
how polymorphism is implemented if you just want to use polymorphism. However, we
think it’s nice to know how polymorphism is possible. Also, knowing how polymorphism is implemented
clarifies why there is a (small) penalty to using polymorphism in terms of memory usage
and efficiency.
The fundamental idea behind polymorphism is that the compiler does not know which function
to call at compile-time. The appropriate function is selected at run-time. That means that the
address of the function must be available somewhere, to be looked up prior to the actual call. This
‘somewhere’ place must be accessible to the object in question. So when a Vehicle *vp points to
a Truck object, then vp->mass() calls Truck’s member function. The address of this function is
obtained through the actual object to which vp points.
Polymorphism is commonly implemented as follows: an object containing virtual member functions
also contains, usually as its first data member, a hidden data member pointing to an array containing
the addresses of the class’s virtual member functions. The hidden data member is usually called the
vpointer, the array of virtual member function addresses the vtable.
The class’s vtable is shared by all objects of that class. The overhead of polymorphism in terms of
memory consumption is therefore:
• one vpointer data member per object pointing to:
• one vtable per class.
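The per-object part of this overhead can be made visible using sizeof. The following is a minimal sketch: the struct names Plain and Polymorphic are made up for this illustration, and the actual sizes depend on the compiler and the platform:

```cpp
#include <cstddef>

// Two structs having identical data; only the second one declares
// virtual members. (Hypothetical names, used for this sketch only.)
struct Plain
{
    int d_value;
    int value() const { return d_value; }
};

struct Polymorphic
{
    int d_value;
    virtual ~Polymorphic() {}
    virtual int value() const { return d_value; }
};

// The number of additional bytes occupied by a Polymorphic object. With
// common implementations: the hidden vpointer plus alignment padding.
std::size_t vpointerOverhead()
{
    return sizeof(Polymorphic) - sizeof(Plain);
}
```

With commonly used compilers vpointerOverhead returns at least sizeof(void *). The vtable itself is stored once per class and therefore doesn't show up in the value returned by sizeof.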
Figure 14.5: Internal organization of objects when virtual functions are defined.
Figure 14.6: Complementary figure, provided by Guillaume Caumon
Consequently, a statement like vp->mass first inspects the hidden data member of the object
pointed to by vp. In the case of the vehicle classification system, this data member points to a
table containing two addresses: one pointer to the function mass and one pointer to the function
setMass (three pointers if the class also defines (as it should) a virtual destructor). The actually
called function is determined from this table.
The internal organization of the objects having virtual functions is illustrated in figures 14.5
and 14.6 (originals provided by Guillaume Caumon).
As shown by figures 14.5 and 14.6, objects potentially using virtual member functions
must have one (hidden) data member to address a table of function pointers. The objects of the
classes Vehicle and Car both address the same table. The class Truck, however, overrides mass.
Consequently, Truck needs its own vtable.
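What the compiler arranges behind the scenes can be approximated by hand. The next sketch emulates vpointers and vtables using plain structs and function pointers. All names (VehicleEmu, VTableEmu, callMass, etc.) are hypothetical and the table only holds mass; real compilers generate comparable, but implementation-specific, machinery:

```cpp
#include <string>

// Hand-written emulation of vpointers and vtables. All names are
// hypothetical; a real compiler generates this machinery itself.
struct VehicleEmu;

struct VTableEmu                        // one table per class
{
    std::string (*mass)(VehicleEmu const *);
};

struct VehicleEmu
{
    VTableEmu const *d_vptr;            // the 'hidden' vpointer
};

std::string vehicleMass(VehicleEmu const *) { return "Vehicle::mass"; }
std::string truckMass(VehicleEmu const *)   { return "Truck::mass"; }

// Vehicle and Car share a table as Car doesn't override mass; Truck
// overrides mass and so it needs a table of its own.
VTableEmu const vehicleTable = { vehicleMass };
VTableEmu const truckTable   = { truckMass };

VehicleEmu makeVehicle() { return VehicleEmu{ &vehicleTable }; }
VehicleEmu makeCar()     { return VehicleEmu{ &vehicleTable }; }
VehicleEmu makeTruck()   { return VehicleEmu{ &truckTable }; }

// What a call like vp->mass() boils down to: fetch the vpointer, look up
// the function's address in the table, and call that function.
std::string callMass(VehicleEmu const *vp)
{
    return vp->d_vptr->mass(vp);
}
```

Calling callMass for an object returned by makeCar reaches vehicleMass, whereas for an object returned by makeTruck it reaches truckMass, although callMass itself doesn't know what kind of object it receives.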
A small complication arises when a class is derived from multiple base classes, each defining virtual
functions. Consider the following example:
class Base1
{
public:
virtual ~Base1();
void fun1(); // calls vOne and vTwo
private:
virtual void vOne();
virtual void vTwo();
};
class Base2
{
public:
virtual ~Base2();
void fun2(); // calls vThree
private:
virtual void vThree();
};
class Derived: public Base1, public Base2
{
public:
virtual ~Derived();
private:
virtual void vOne();
virtual void vThree();
};
In the example Derived is multiply derived from Base1 and Base2, each supporting virtual functions.
Because of this, Derived also has virtual functions, and so Derived has a vtable allowing
a base class pointer or reference to access the proper virtual member.
When Derived::fun1 is called (or when fun1 is called through a Base1 pointer pointing to a Derived
object) then fun1 calls Derived::vOne and Base1::vTwo. Likewise, when Derived::fun2 is called
Derived::vThree is called.
The complication occurs with Derived’s vtable. When fun1 is called its class type determines
the vtable to use and hence which virtual member to call. So when vOne is called from fun1, it
is presumably the second entry in Derived’s vtable, as it must match the second entry in Base1’s
vtable. However, when fun2 calls vThree it apparently is also the second entry in Derived’s vtable
as it must match the second entry in Base2’s vtable.
Of course this cannot be realized by a single vtable. Therefore, when multiple inheritance is used
(each base class defining virtual members) another approach is followed to determine which virtual
function to call. In this situation (cf. figure 14.7) the class Derived receives two vtables,
one for each of its base classes and each Derived class object harbors two hidden vpointers, each
one pointing to its corresponding vtable.
Since base class pointers, base class references, or base class interface members unambiguously
refer to one of the base classes the compiler can determine which vpointer to use.
The following therefore holds true for classes multiply derived from base classes offering virtual
member functions:
• The derived class defines a vtable for each of its base classes offering virtual members;
• Each derived class object contains as many hidden vpointers as it has vtables;
• Each of a derived class object’s vpointers points to a unique vtable and the vpointer to use is
determined by the class type of the base class pointer, the base class reference, or the base class
interface function that is used.
Figure 14.7: Vtables and vpointers with multiple base classes
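The Base1, Base2 and Derived classes can be turned into a small testable sketch. The returned strings and the function subobjectsDiffer were added for illustration only, and the members are defined in-class merely for brevity. That the Base1 and Base2 parts of a Derived object live at different addresses is what commonly used compilers do, and it is the reason why each part needs a vpointer of its own:

```cpp
#include <string>

// Variant of the Base1/Base2/Derived example in which each virtual member
// returns its own name, so that the selected members can be inspected.
class Base1
{
    public:
        virtual ~Base1() {}
        std::string fun1() { return vOne() + " " + vTwo(); }
    private:
        virtual std::string vOne() { return "Base1::vOne"; }
        virtual std::string vTwo() { return "Base1::vTwo"; }
};

class Base2
{
    public:
        virtual ~Base2() {}
        std::string fun2() { return vThree(); }
    private:
        virtual std::string vThree() { return "Base2::vThree"; }
};

class Derived: public Base1, public Base2
{
    private:
        std::string vOne() override   { return "Derived::vOne"; }
        std::string vThree() override { return "Derived::vThree"; }
};

// Converting a Derived * to Base2 * adjusts the address so that it refers
// to the Base2 part of the object, having its own vpointer.
bool subobjectsDiffer()
{
    Derived d;
    Base1 *b1 = &d;
    Base2 *b2 = &d;
    return static_cast<void *>(b1) != static_cast<void *>(b2);
}
```

Calling fun1 for a Derived object returns "Derived::vOne Base1::vTwo", calling fun2 returns "Derived::vThree".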
14.11 Undefined reference to vtable ...
Occasionally, the linker generates an error like the following:
In function ‘Derived::Derived()’:
: undefined reference to ‘vtable for Derived’
This error is generated when a virtual function’s implementation is missing in a derived class, but
the function is mentioned in the derived class’s interface.
Such a situation is easily encountered:
• Construct a (complete) base class defining a virtual member function;
• Construct a Derived class mentioning the virtual function in its interface;
• The Derived class’s virtual function is not implemented. Of course, the compiler doesn’t know
that the derived class’s function is not implemented and will, when asked, generate code to
create a derived class object;
• Eventually, the linker is unable to find the derived class’s virtual member function. Therefore,
it is unable to construct the derived class’s vtable;
• The linker complains with the message:
undefined reference to ‘vtable for Derived’
Here is an example producing the error:
class Base
{
virtual void member();
};
inline void Base::member()
{}
class Derived: public Base
{
virtual void member(); // only declared
};
int main()
{
Derived d; // Will compile, since all members were declared.
// Linking will fail, since we don’t have the
// implementation of Derived::member()
}
It’s of course easy to correct the error: implement the derived class’s missing virtual member function.
Virtual functions should never be implemented inline. Since the vtable contains the addresses of the
class’s virtual functions, these functions must have addresses and so they must have been compiled
as real (out-of-line) functions. By defining virtual functions inline you run the risk that the compiler
simply overlooks those functions as they may very well never be explicitly called (but only polymorphically,
from a base class pointer or reference). As a result their addresses may never enter their
class’s vtables (and even the vtable itself might remain undefined), causing linkage problems or resulting
in programs showing unexpected behavior. All these kinds of problems are simply avoided:
never define virtual members inline (see also section 7.8.2.1).
14.12 Virtual constructors
In section 14.2 we learned that C++ supports virtual destructors. Like many other object oriented
languages (e.g., Java), however, the notion of a virtual constructor is not supported. Not having
virtual constructors becomes a liability when only base class references or pointers are available,
and a copy of a derived class object is required. Gamma et al. (1995) discuss the Prototype design
pattern to deal with this situation.
According to the Prototype Design Pattern each derived class is given the responsibility of implementing
a member function returning a pointer to a copy of the object for which the member is called.
The usual name for this function is clone. To separate the user interface from the reimplementation
interface, clone is made part of the user interface and newCopy is defined in the reimplementation
interface. A base class supporting ‘cloning’ defines a virtual destructor, a member clone returning
newCopy’s return value, and the virtual copy constructor itself: a pure virtual function having the
prototype virtual Base *newCopy() const = 0. As newCopy is a pure virtual function all derived
classes must now implement their own ‘virtual constructor’.
This setup suffices in most situations where we have a pointer or reference to a base class, but
it fails when used with abstract containers. We can’t create a vector<Base>, with Base featuring
the pure virtual copy member in its interface, as Base’s constructor is called to initialize new elements of
such a vector. This is impossible as newCopy is a pure virtual function, so a Base object can’t be
constructed.
The intuitive solution, providing newCopy with a default implementation, defining it as an ordinary
virtual function, fails too as the container calls Base(Base const &other), which would have to
call newCopy to copy other. At this point it is unclear what to do with that copy, as the new Base
object already exists, and contains no Base pointer or reference data member to assign newCopy’s
return value to.
Alternatively (and preferred) the original Base class (defined as an abstract base class) is kept as-is
and a wrapper class Clonable is used to manage the Base class pointers returned by newCopy. In
chapter 17 ways to merge Base and Clonable into one class are discussed, but for now we’ll define
Base and Clonable as separate classes.
The class Clonable is a very standard class. It contains a pointer member so it needs a copy
constructor, destructor, and overloaded assignment operator. It’s given at least one non-standard
member: Base &base() const, returning a reference to the derived object to which Clonable’s
Base * data member refers. It is also provided with an additional constructor to initialize its Base
* data member.
Any non-abstract class derived from Base must implement Base *newCopy() const, returning a pointer
to a newly created (allocated) copy of the object for which newCopy is called.
Once we have defined a derived class (e.g., Derived1), we can put our Clonable and Base facilities
to good use. In the next example we see main defining a vector<Clonable>. An anonymous
Derived1 object is then inserted into the vector using the following steps:
• A new anonymous Derived1 object is created;
• It initializes a Clonable using Clonable(Base *bp);
• The just created Clonable object is inserted into the vector, using Clonable’s move constructor.
There are only temporary Derived1 and Clonable objects at this point, so no copy
construction is required.
In this sequence, only the Clonable object containing the Derived1 * is used. No additional copies
need to be made (or destroyed).
Next, the base member is used in combination with typeid to show the actual type of the Base &
object: a Derived1 object.
Next, main contains the interesting definition vector<Clonable> v2(bv). Here a copy of bv is
created. This copy construction observes the actual types of the Base references, making sure that
the appropriate types appear in the vector’s copy.
At the end of the program, we have created two Derived1 objects, which are correctly deleted by
the vector’s destructors. Here is the full program, illustrating the ‘virtual constructor’ concept
(Jesse van den Kieboom created an alternative implementation of a class Clonable, implemented as
a class template):
#include <iostream>
#include <vector>
#include <algorithm>
#include <typeinfo>
// Base and its inline member:
class Base
{
public:
virtual ~Base();
Base *clone() const;
private:
virtual Base *newCopy() const = 0;
};
inline Base *Base::clone() const
{
return newCopy();
}
// Clonable and its inline members:
class Clonable
{
Base *d_bp;
public:
Clonable();
explicit Clonable(Base *base);
~Clonable();
Clonable(Clonable const &other);
Clonable(Clonable &&tmp);
Clonable &operator=(Clonable const &other);
Clonable &operator=(Clonable &&tmp);
Base &base() const;
};
inline Clonable::Clonable()
:
d_bp(0)
{}
inline Clonable::Clonable(Base *bp)
:
d_bp(bp)
{}
inline Clonable::Clonable(Clonable const &other)
:
d_bp(other.d_bp ? other.d_bp->clone() : 0) // other may be default constructed
{}
inline Clonable::Clonable(Clonable &&tmp)
:
d_bp(tmp.d_bp)
{
tmp.d_bp = 0;
}
inline Clonable::~Clonable()
{
delete d_bp;
}
inline Base &Clonable::base() const
{
return *d_bp;
}
// Derived and its inline member:
class Derived1: public Base
{
public:
~Derived1();
private:
virtual Base *newCopy() const;
};
inline Base *Derived1::newCopy() const
{
return new Derived1(*this);
}
// Members not implemented inline:
Base::~Base()
{}
Clonable &Clonable::operator=(Clonable const &other)
{
Clonable tmp(other);
std::swap(d_bp, tmp.d_bp);
return *this;
}
Clonable &Clonable::operator=(Clonable &&tmp)
{
std::swap(d_bp, tmp.d_bp);
return *this;
}
Derived1::~Derived1()
{
std::cout << "~Derived1() called\n";
}
// The main function:
using namespace std;
int main()
{
vector<Clonable> bv;
bv.push_back(Clonable(new Derived1()));
cout << "bv[0].name: " << typeid(bv[0].base()).name() << '\n';
vector<Clonable> v2(bv);
cout << "v2[0].name: " << typeid(v2[0].base()).name() << '\n';
}
/*
Output:
bv[0].name: 8Derived1
v2[0].name: 8Derived1
~Derived1() called
~Derived1() called
*/
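As an aside, the bookkeeping performed by Clonable may also be delegated to std::unique_ptr. The following sketch is not the approach used above but a variation on it: clone returns a std::unique_ptr<Base>, and a hypothetical helper copyAll duplicates a vector through the elements’ virtual constructors. The members name and vName, the class Derived2 and copyAll itself were introduced for this illustration only; copyAll assumes that the vector doesn’t contain null pointers:

```cpp
#include <memory>
#include <string>
#include <vector>

class Base
{
    public:
        virtual ~Base();
        std::unique_ptr<Base> clone() const;    // user interface
        std::string name() const;
    private:
        virtual Base *newCopy() const = 0;      // reimplementation interface
        virtual std::string vName() const = 0;
};

Base::~Base()
{}
std::unique_ptr<Base> Base::clone() const
{
    return std::unique_ptr<Base>(newCopy());
}
std::string Base::name() const
{
    return vName();
}

class Derived1: public Base
{
    Base *newCopy() const override;
    std::string vName() const override;
};
Base *Derived1::newCopy() const
{
    return new Derived1(*this);
}
std::string Derived1::vName() const
{
    return "Derived1";
}

class Derived2: public Base
{
    Base *newCopy() const override;
    std::string vName() const override;
};
Base *Derived2::newCopy() const
{
    return new Derived2(*this);
}
std::string Derived2::vName() const
{
    return "Derived2";
}

// Copies each element through its 'virtual constructor', so the copy
// contains objects of the same derived types as the source vector.
std::vector<std::unique_ptr<Base>>
    copyAll(std::vector<std::unique_ptr<Base>> const &src)
{
    std::vector<std::unique_ptr<Base>> ret;
    for (auto const &ptr: src)
        ret.push_back(ptr->clone());
    return ret;
}
```

The price of this variation is that a vector<unique_ptr<Base>> cannot be copied using the vector’s copy constructor: copying must explicitly be requested through a function like copyAll.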

Chapter 15
Friends
In all examples discussed up to now, we’ve seen that private members are only accessible by the
members of their class. This is good, as it enforces encapsulation and data hiding. By encapsulating
functionality within a class we prevent a class from exposing multiple responsibilities; by hiding
data we promote a class’s data integrity and we prevent other parts of the software from becoming
dependent on the implementation of the data that belong to a class.
In this (very) short chapter we introduce the friend keyword and the principles that underlie its use.
The bottom line is that by using the friend keyword functions are granted access to a class’s
private members. Even so, this does not imply that the principle of data hiding is abandoned when
the friend keyword is used.
In this chapter the topic of friendship among classes is not discussed. Situations in which it is
natural to use friendship among classes are discussed in chapters 17 and 21 and such situations are
natural extensions of the way friendship is handled for functions.
There should be a well-defined conceptual reason for declaring friendship (i.e., using the friend
keyword). The traditionally offered definition of the class concept usually looks something like this:
A class is a set of data together with the functions that operate on that set of data.
As we’ve seen in chapter 11 some functions have to be defined outside of a class interface. They are
defined outside of the class interface to allow promotions for their operands or to extend the facilities
of existing classes not directly under our control. According to the above traditional definition of the
class concept those functions that cannot be defined in the class interface itself should nevertheless
be considered functions belonging to the class. Stated otherwise: if permitted by the language’s
syntax they would certainly have been defined inside the class interface. There are two ways to
implement such functions. One way consists of implementing those functions using available public
member functions. This approach was used, e.g., in section 11.2. Another approach applies the
definition of the class concept to those functions. By stating that those functions in fact belong to
the class they should be given direct access to the data members of objects. This is accomplished by
the friend keyword.
As a general principle we state that all functions operating on the data of objects of a class that are
declared in the same file as the class interface itself belong to that class and may be granted direct
access to the class’s data members.
15.1 Friend functions
In section 11.2 the insertion operator of the class Person (cf. section 9.3) was implemented like this:
ostream &operator<<(ostream &out, Person const &person)
{
return
out <<
"Name: " << person.name() << ", "
"Address: " << person.address() << ", "
"Phone: " << person.phone();
}
Person objects can now be inserted into streams.
However, this implementation required three member functions to be called, which may be
considered a source of inefficiency. An improvement would be reached by defining a member
Person::insertInto and letting operator<< call that function. These two functions could be defined
as follows:
std::ostream &operator<<(std::ostream &out, Person const &person)
{
return person.insertInto(out);
}
std::ostream &Person::insertInto(std::ostream &out) const
{
return
out << "Name: " << d_name << ", "
"Address: " << d_address << ", "
"Phone: " << d_phone;
}
As insertInto is a member function it has direct access to the object’s data members so no additional
member functions must be called when inserting person into out.
The next step consists of realizing that insertInto is only defined for the benefit of operator<<,
and that operator<<, as it is declared in the header file containing Person’s class interface should
be considered a function belonging to the class Person. The member insertInto can therefore be
omitted when operator<< is declared as a friend.
Friend functions must be declared as friends in the class interface. Friend functions are not
member functions, and so their declarations are independent of the class’s private, protected and
public sections. Friend declarations may be placed anywhere in the class interface. Convention dictates
that friend declarations are listed directly at the top of the class interface. The class Person, using
friend declarations for its extraction and insertion operators, starts like this:
class Person
{
friend std::ostream &operator<<(std::ostream &out, Person const &pd);
friend std::istream &operator>>(std::istream &in, Person &pd);
// previously shown interface (data and functions)
};
The insertion operator may now directly access a Person object’s data members:
std::ostream &operator<<(std::ostream &out, Person const &person)
{
return
out << "Name: " << person.d_name << ", "
"Address: " << person.d_address << ", "
"Phone: " << person.d_phone;
}
Friend declarations are true declarations. Once a class contains friend declarations these friend
functions do not have to be declared again below the class’s interface. This also clearly indicates the
class designer’s intent: the friend functions are declared by the class, and can thus be considered
functions belonging to the class.
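Put together, a reduced but compilable version might look as follows. This is merely a sketch: the actual Person class of section 9.3 offers more members, and the constructor as well as the helper function toString are assumptions introduced to keep the example self-contained:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Reduced Person class: only the parts needed to show the friend
// declaration. The constructor is an assumption made for this sketch.
class Person
{
    friend std::ostream &operator<<(std::ostream &out, Person const &person);

    std::string d_name;
    std::string d_address;
    std::string d_phone;

    public:
        Person(std::string const &name, std::string const &address,
               std::string const &phone);
};

Person::Person(std::string const &name, std::string const &address,
               std::string const &phone)
:
    d_name(name),
    d_address(address),
    d_phone(phone)
{}

// Not a member, but as a friend it may access Person's private data.
std::ostream &operator<<(std::ostream &out, Person const &person)
{
    return
        out << "Name: "    << person.d_name    << ", "
               "Address: " << person.d_address << ", "
               "Phone: "   << person.d_phone;
}

// Hypothetical helper: formats a Person using the friend operator.
std::string toString(Person const &person)
{
    std::ostringstream out;
    out << person;
    return out.str();
}
```

Note that the operator inserts into its out parameter rather than into cout, so Person objects can be inserted into any ostream, including the ostringstream used by toString.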
15.2 Extended friend declarations
C++11 added extended friend declarations to the language. When a class is declared as a friend,
then the class keyword no longer has to be provided. E.g.,
class Friend; // declare a class
typedef Friend FriendType; // and a typedef for it
using FName = Friend; // and an alias declaration
class Class1
{
friend Friend; // FriendType and FName: also OK
};
In the pre-C++11 standards the friend declaration required an explicit class keyword; e.g., friend class
Friend.
The explicit use of class remains required if the compiler hasn’t seen the friend’s name yet. E.g.,
class Class1
{
// friend Unseen; // fails to compile: Unseen unknown.
friend class Unseen; // OK
};
Section 22.10 covers the use of extended friend declarations in class templates.
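The following sketch combines both forms in one compilable fragment and shows that the friend declared through the alias indeed obtains access. The data member d_secret and the member Friend::peek are hypothetical, added for illustration only:

```cpp
class Friend;                       // declare a class
using FName = Friend;               // and an alias for it

class Class1
{
    friend FName;                   // extended friend declaration (C++11)
    friend class Unseen;            // 'class' required: Unseen not yet seen

    int d_secret = 42;
};

class Friend
{
    public:
        static int peek(Class1 const &obj);
};

int Friend::peek(Class1 const &obj)
{
    return obj.d_secret;            // OK: Friend is a friend of Class1
}
```

The class Unseen is never defined in this fragment. That is fine: a friend declaration merely grants access and doesn’t require the friend to exist.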

Chapter 16
Classes Having Pointers To Members
Classes having pointer data members have been discussed in detail in chapter 9. Classes defining
pointer data members deserve some special attention, as they usually require the definitions of
copy constructors, overloaded assignment operators and destructors.
Situations exist where we do not need a pointer to an object but rather a pointer to members of
a class. Pointers to members can profitably be used to configure the behavior of objects of classes.
Depending on which member a pointer to a member points to, objects will show certain behavior.
Although pointers to members have their use, polymorphism can frequently be used to realize comparable
behavior. Consider a class having a member process performing one of a series of alternate
behaviors. Instead of selecting the behavior of choice at object construction time the class could use
the interface of some (abstract) base class, passing an object of some derived class to its constructor
and could thus configure its behavior. This allows for easy, extensible and flexible configuration,
but access to the class’s data members would be less flexible and would possibly require the use of
‘friend’ declarations. In such cases pointers to members may actually be preferred as this allows for
(somewhat less flexible) configuration as well as direct access to a class’s data members.
So the choice apparently is between on the one hand ease of configuration and on the other hand
ease of access to a class’s data members. In this chapter we’ll concentrate on pointers to members,
investigating what these pointers have to offer.
16.1 Pointers to members: an example
Knowing how pointers to variables and objects are used does not intuitively lead to the concept of
pointers to members. Even if the return types and parameter types of member functions are taken
into account, surprises can easily be encountered. For example, consider the following class:
class String
{
char const *(*d_sp)() const;
public:
char const *get() const;
};
For this class, it is not possible to let char const *(*d_sp)() const point to the String::get
member function as d_sp cannot be given the address of the member function get.
One of the reasons why this doesn’t work is that the variable d_sp has global scope (it is a pointer
to a function, not a pointer to a function within String), while the member function get is defined
within the String class, and thus has class scope. The fact that d_sp is a data member of
the class String is irrelevant here. According to d_sp’s definition, it points to a function living
somewhere outside of the class.
Consequently, to define a pointer to a member (either data or function, but usually a function)
of a class, the scope of the pointer must indicate class scope. Doing so, a pointer to the member
String::get is defined like this:
char const *(String::*d_sp)() const;
So, by prefixing the *d_sp pointer data member by String::, it is defined as a pointer in the context
of the class String. According to its definition it is a pointer to a function in the class String, not
expecting arguments, not modifying its object’s data, and returning a pointer to constant characters.
16.2 Defining pointers to members
Pointers to members are defined by prefixing the normal pointer notation with the appropriate
class plus scope resolution operator. Therefore, in the previous section, we used char const *
(String::*d_sp)() const to indicate that d_sp
• is a pointer (*d_sp);
• points to something in the class String (String::*d_sp);
• is a pointer to a const function, returning a char const * (char const *
(String::*d_sp)() const).
The prototype of a matching function is therefore:
char const *String::somefun() const;
which is any const parameterless function in the class String, returning a char const *.
When defining pointers to members the standard procedure for constructing pointers to functions
can still be applied:
• put parentheses around the fully qualified function name (i.e., the function’s header, including
the function’s class name):
char const * ( String::somefun ) () const
• Put a pointer (a star (*)) character immediately before the function name itself:
char const * ( String:: * somefun ) () const
• Replace the function name with the name of the pointer variable:
char const * (String::*d_sp)() const
Here is another example, defining a pointer to a data member. Assume the class String contains
a string d_text member. How to construct a pointer to this member? Again we follow standard
procedure:
• put parentheses around the fully qualified variable name:
std::string (String::d_text)
• Put a pointer (a star (*)) character immediately before the variable-name itself:
std::string (String::*d_text)
• Replace the variable name with the name of the pointer variable:
std::string (String::*tp)
In this case, the parentheses are superfluous and may be omitted:
std::string String::*tp
Alternatively, a very simple rule of thumb is
• Define a normal (i.e., global) pointer variable,
• Prefix the class name to the pointer character, once you point to something inside a class
For example, the following pointer to a global function (which, not being a member, cannot be
declared const)
char const * (*sp)();
becomes a pointer to a member function after prefixing the class-scope:
char const * (String::*sp)();
Nothing forces us to define pointers to members in their target (String) classes. Pointers to members
may be defined in their target classes (so they become data members), or in another class, or
as a local variable or as a global variable. In all these cases the pointer to member variable can be
given the address of the kind of member it points to. The important part is that a pointer to member
can be initialized or assigned without requiring the existence of an object of the pointer’s target class.
Initializing or assigning an address to such a pointer merely indicates to which member the pointer
points. This can be considered some kind of relative address; relative to the object for which the
function is called. No object is required when pointers to members are initialized or assigned. While
it is allowed to initialize or assign a pointer to member, it is (of course) not possible to call those
members without specifying an object of the correct type.
In the following example initialization of and assignment to pointers to members is illustrated (for
illustration purposes all members of the class PointerDemo are defined public). In the example
itself the &-operator is used to determine the addresses of the members. These operators as well as
the class-scopes are required, even when used inside member implementations:
#include <cstddef>
class PointerDemo
{
public:
size_t d_value;
size_t get() const;
};
inline size_t PointerDemo::get() const
{
return d_value;
}
int main()
{ // initialization
size_t (PointerDemo::*getPtr)() const = &PointerDemo::get;
size_t PointerDemo::*valuePtr = &PointerDemo::d_value;
getPtr = &PointerDemo::get; // assignment
valuePtr = &PointerDemo::d_value;
}
This involves nothing special. The difference with pointers at global scope is that we’re now restricting
ourselves to the scope of the PointerDemo class. Because of this restriction, all pointer
definitions and all variables whose addresses are used must be given the PointerDemo class scope.
Pointers to members can also be used with virtual member functions. No special syntax is required
when pointing to virtual members. Pointer construction, initialization and assignment is
done identically to the way it is done with non-virtual members.
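A small sketch may illustrate this. The pointer is initialized with the address of Base::name, but when it is used in combination with a Derived object the overriding member is called. The classes Base and Derived and the function call are hypothetical, defined for this illustration only; the .* operator used by call is covered in the next section:

```cpp
#include <string>

class Base
{
    public:
        virtual ~Base();
        virtual std::string name() const;
};

class Derived: public Base
{
    public:
        std::string name() const override;
};

Base::~Base()
{}
std::string Base::name() const
{
    return "Base";
}
std::string Derived::name() const
{
    return "Derived";
}

// Calls the member 'ptr' points to for the object 'obj' refers to. If that
// member is virtual the object's actual type determines what is called.
std::string call(Base const &obj, std::string (Base::*ptr)() const)
{
    return (obj.*ptr)();
}
```

Calling call(object, &Base::name) returns "Base" for a Base object, but "Derived" for a Derived object, even though the same pointer to member is used in both cases.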
16.3 Using pointers to members
Using pointers to members to call a member function requires the existence of an object of the class
of the members to which the pointer to member refers. With pointers operating at global scope,
the dereferencing operator * is used. With pointers to objects the field selector operator operating on
pointers (->) or the field selector operator operating on objects (.) can be used to select appropriate
members.
To use a pointer to member in combination with an object the pointer to member field selector (.*)
must be specified. To use a pointer to a member via a pointer to an object the ‘pointer to member
field selector through a pointer to an object’ (->*) must be specified. These two operators combine
the notions of a field selection (the . and -> parts) to reach the appropriate field in an object and
of dereferencing: a dereference operation is used to reach the function or variable the pointer to
member points to.
Using the example from the previous section, let’s see how we can use pointers to member functions
and pointers to data members:
#include <iostream>
class PointerDemo
{
public:
size_t d_value;
size_t get() const;
};
inline size_t PointerDemo::get() const
{
return d_value;
}
using namespace std;
int main()
{ // initialization
size_t (PointerDemo::*getPtr)() const = &PointerDemo::get;
size_t PointerDemo::*valuePtr = &PointerDemo::d_value;
PointerDemo object; // (1) (see text)
PointerDemo *ptr = &object;
object.*valuePtr = 12345; // (2)
cout << object.*valuePtr << '\n' <<
object.d_value << '\n';
ptr->*valuePtr = 54321; // (3)
cout << object.d_value << '\n' <<
(object.*getPtr)() << '\n' << // (4)
(ptr->*getPtr)() << '\n';
}
We note:
• At (1) a PointerDemo object and a pointer to such an object are defined.
• At (2) we specify an object (and hence the .* operator) to reach the member valuePtr points
to. This member is given a value.
• At (3) the same member is assigned another value, but this time using the pointer to a
PointerDemo object. Hence we use the ->* operator.
• At (4) the .* and ->* are used once again, this time to call a function through a pointer to
member. As the function argument list has a higher priority than the pointer to member field
selector operator, the latter must be protected by parentheses.
Pointers to members can be used profitably in situations where a class has a member that behaves
differently depending on a configuration setting. Consider once again the class Person from section
9.3. Person defines data members holding a person’s name, address and phone number. Assume
we want to construct a Person database of employees. The employee database can be queried,
but depending on the kind of person querying the database either the name, the name and phone
number or all stored information about the person is made available. This implies that a member
function like address must return something like ‘<not available>’ in cases where the person
querying the database is not allowed to see the person’s address, and the actual address in other
cases.
The employee database is opened specifying an argument reflecting the status of the employee who
wants to make some queries. The status could reflect his or her position in the organization, like
BOARD, SUPERVISOR, SALESPERSON, or CLERK. The first two categories are allowed to see all information
about the employees, a SALESPERSON is allowed to see the employee’s phone numbers, while
the CLERK is only allowed to verify whether a person is actually a member of the organization.
We now construct a member string personInfo(char const *name) in the database class. A
standard implementation of this member could be:
string PersonData::personInfo(char const *name)
{
Person *p = lookup(name); // see if ‘name’ exists
if (!p)
return "not found";
switch (d_category)
{
case BOARD:
case SUPERVISOR:
return allInfo(p);
case SALESPERSON:
return noPhone(p);
case CLERK:
return nameOnly(p);
}
}
Although it doesn’t take much time, the switch must nonetheless be evaluated every time
personInfo is called. Instead of using a switch, we could define a member d_infoPtr as a pointer
to a member function of the class PersonData returning a string and expecting a pointer to a
Person as its argument.
Instead of evaluating the switch this pointer can be used to point to allInfo, noPhone or
nameOnly. Furthermore, the member function the pointer points to will be known by the time
the PersonData object is constructed and so its value needs to be determined only once (at the
PersonData object’s construction time).
Having initialized d_infoPtr the personInfo member function is now implemented simply as:
string PersonData::personInfo(char const *name)
{
Person *p = lookup(name); // see if ‘name’ exists
return p ? (this->*d_infoPtr)(p) : "not found";
}
The member d_infoPtr is defined as follows (within the class PersonData, omitting other members):
class PersonData
{
string (PersonData::*d_infoPtr)(Person *p);
};
Finally, the constructor initializes d_infoPtr. This could be realized using a simple switch:
PersonData::PersonData(PersonData::EmployeeCategory cat)
{
switch (cat)
{
case BOARD:
case SUPERVISOR:
d_infoPtr = &PersonData::allInfo;
break;
case SALESPERSON:
d_infoPtr = &PersonData::noPhone;
break;
case CLERK:
d_infoPtr = &PersonData::nameOnly;
break;
}
}
Note how addresses of member functions are determined. The class PersonData scope must be
specified, even though we’re already inside a member function of the class PersonData.
An example using pointers to data members is provided in section 19.1.60, in the context of the
stable_sort generic algorithm.
16.4 Pointers to static members
Static members of a class can be used without having available an object of their class. Public static
members can be called like free functions, albeit that their class names must be specified when they
are called.
Assume a class String has a public static member function count, returning the number of string
objects created so far. Then, without using any String object the function String::count may be
called:
void fun()
{
cout << String::count() << '\n';
}
Public static members can be called like free functions (but see also section 8.2.1). Private static
members can only be called within the context of their class, by their class’s member or friend
functions.
Since static members have no associated objects their addresses can be stored in ordinary function
pointer variables, operating at the global level. Pointers to members cannot be used to store
addresses of static members. Example:
void fun()
{
size_t (*pf)() = String::count;
// initialize pf with the address of a static member function
cout << (*pf)() << '\n';
// displays the value returned by String::count()
}
16.5 Pointer sizes
An interesting characteristic of pointers to members is that their sizes differ from those of ‘normal’
pointers. Consider the following little program:
#include <string>
#include <iostream>
#include <cstdio>
class X
{
public:
void fun();
std::string d_str;
};
inline void X::fun()
{
std::cout << "hello\n";
}
using namespace std;
int main()
{
cout <<
"size of pointer to data-member: " << sizeof(&X::d_str) << "\n"
"size of pointer to member function: " << sizeof(&X::fun) << "\n"
"size of pointer to non-member data: " << sizeof(char *) << "\n"
"size of pointer to free function: " << sizeof(&printf) << '\n';
}
/*
generated output (on 32-bit architectures):
size of pointer to data-member: 4
size of pointer to member function: 8
size of pointer to non-member data: 4
size of pointer to free function: 4
*/
On a 32-bit architecture a pointer to a member function requires eight bytes, whereas other kinds
of pointers require four bytes (using Gnu's g++ compiler).
Pointer sizes are hardly ever explicitly used, but their sizes may cause confusion in statements like:
printf("%p", &X::fun);
Of course, printf is likely not the right tool to produce the value of these C++ specific pointers.
The values of these pointers can be inserted into streams when a union, reinterpreting the 8-byte
pointers as a series of unsigned char values, is used:
#include <string>
#include <iostream>
#include <iomanip>
class X
{
public:
void fun();
std::string d_str;
};
inline void X::fun()
{
std::cout << "hello\n";
}
using namespace std;
int main()
{
union
{
void (X::*f)();
unsigned char cp[sizeof(void (X::*)())]; // the pointer's raw bytes
}
u = { &X::fun };
cout.fill('0');
cout << hex;
for (unsigned idx = sizeof(void (X::*)()); idx-- > 0; )
cout << setw(2) << static_cast<unsigned>(u.cp[idx]);
cout << '\n';
}

Chapter 17
Nested Classes
Classes can be defined inside other classes. Classes that are defined inside other classes are called
nested classes. Nested classes are used in situations where the nested class has a close conceptual relationship
to its surrounding class. For example, with the class string a type string::iterator
is available which provides all characters that are stored in the string. This string::iterator
type could be defined as an object iterator, defined as a nested class in the class string.
A class can be nested in every part of the surrounding class: in the public, protected or
private section. Such a nested class can be considered a member of the surrounding class. The
normal access rules of classes apply to nested classes. If a class is nested in the public section
of a class, it is visible outside the surrounding class. If it is nested in the protected section it is
visible in subclasses derived from the surrounding class; if it is nested in the private section, it is
only visible to the members of the surrounding class.
The surrounding class has no special privileges towards the nested class. The nested class has full
control over the accessibility of its members by the surrounding class. For example, consider the
following class definition:
class Surround
{
public:
class FirstWithin
{
int d_variable;
public:
FirstWithin();
int var() const;
};
private:
class SecondWithin
{
int d_variable;
public:
SecondWithin();
int var() const;
};
};
inline int Surround::FirstWithin::var() const
{
return d_variable;
}
inline int Surround::SecondWithin::var() const
{
return d_variable;
}
Here access to the members is defined as follows:
• The class FirstWithin is visible outside and inside Surround. The class FirstWithin thus
has global visibility.
• FirstWithin’s constructor and its member function var are also globally visible.
• The data member d_variable is only visible to the members of the class FirstWithin.
Neither the members of Surround nor the members of SecondWithin can directly access
FirstWithin::d_variable.
• The class SecondWithin is only visible inside Surround. The public members of the class
SecondWithin can also be used by the members of the class FirstWithin, as nested classes
can be considered members of their surrounding class.
• SecondWithin’s constructor and its member function var also can only be reached by the
members of Surround (and by the members of its nested classes).
• SecondWithin::d_variable is only visible to SecondWithin's members. Neither the members
of Surround nor the members of FirstWithin can access d_variable of the class
SecondWithin directly.
• As always, an object of the class type is required before its members can be called. This also
holds true for nested classes.
To grant the surrounding class access rights to the private members of its nested classes or to grant
nested classes access rights to the private members of the surrounding class, the classes can be
defined as friend classes (see section 17.3).
Nested classes can be considered members of the surrounding class, but members of nested classes
are not members of the surrounding class. So, a member of the class Surround may not access
FirstWithin::var directly. This is understandable considering that a Surround object is not
also a FirstWithin or SecondWithin object. In fact, nested classes are just typenames. It is not
implied that objects of such classes automatically exist in the surrounding class. If a member of
the surrounding class should use a (non-static) member of a nested class then the surrounding class
must define a nested class object, which can thereupon be used by the members of the surrounding
class to use members of the nested class.
For example, in the following class definition there is a surrounding class Outer and a nested class
Inner. The class Outer contains a member function caller. The member function caller uses
the d_inner object that is composed within Outer to call Inner::infunction:
class Outer
{
public:
void caller();
private:
class Inner
{
public:
void infunction();
};
Inner d_inner; // class Inner must be known
};
void Outer::caller()
{
d_inner.infunction();
}
Inner::infunction can be called as part of the inline definition of Outer::caller, even though
the definition of the class Inner is yet to be seen by the compiler. On the other hand, the compiler
must have seen the definition of the class Inner before a data member of that class can be defined.
17.1 Defining nested class members
Member functions of nested classes may be defined as inline functions. Inline member functions
can be defined as if they were defined outside of the class definition. To define the member function
Outer::caller outside of the class Outer, the function’s fully qualified name (starting from the
outermost class scope (Outer)) must be provided to the compiler. Inline and in-class functions can
be defined accordingly. They can use any nested class, even if the nested class's definition
appears later in the outer class's interface.
When (nested) member functions are defined inline, their definitions should be put below their class
interface. Static nested data members are also usually defined outside of their classes. If the class
FirstWithin had a static size_t data member epoch, it could be initialized
as follows:
size_t Surround::FirstWithin::epoch = 1970;
Furthermore, multiple scope resolution operators are needed to refer to public static members in
code outside of the surrounding class:
void showEpoch()
{
cout << Surround::FirstWithin::epoch;
}
Within the class Surround only the FirstWithin:: scope must be used; within the class
FirstWithin there is no need to refer explicitly to the scope.
What about the members of the class SecondWithin? The classes FirstWithin and
SecondWithin are both nested within Surround, and can be considered members of the surrounding
class. Since members of a class may directly refer to each other, members of the
class SecondWithin can refer to (public) members of the class FirstWithin. Consequently,
members of the class SecondWithin could refer to the epoch member of FirstWithin as
FirstWithin::epoch.
17.2 Declaring nested classes
Nested classes may be declared before they are actually defined in a surrounding class. Such forward
declarations are required if a class contains multiple nested classes, and the nested classes contain
pointers, references, parameters or return values to objects of the other nested classes.
For example, the following class Outer contains two nested classes Inner1 and Inner2. The class
Inner1 contains a pointer to Inner2 objects, and Inner2 contains a pointer to Inner1 objects.
Cross references require forward declarations. Forward declarations must be given an access specification
that is identical to the access specification of their definitions. In the following example the
Inner2 forward declaration must be given in a private section, as its definition is also part of the
class Outer’s private interface:
class Outer
{
private:
class Inner2; // forward declaration
class Inner1
{
Inner2 *pi2; // points to Inner2 objects
};
class Inner2
{
Inner1 *pi1; // points to Inner1 objects
};
};
17.3 Accessing private members in nested classes
To grant nested classes access rights to the private members of other nested classes, or to grant a
surrounding class access to the private members of its nested classes the friend keyword must be
used.
Note that no friend declaration is required to grant a nested class access to the private members of
its surrounding class. After all, a nested class is a type defined by its surrounding class and as such
objects of the nested class are members of the outer class and thus can access all the outer class’s
members. Here is an example showing this principle. The example won't compile as members of
the class Extern are denied access to Outer's private members, but Outer::Inner's members can
access Outer's private members:
#include <iostream>
using namespace std;
class Outer
{
int d_value;
static int s_value;
public:
Outer()
:
d_value(12)
{}
class Inner
{
public:
Inner()
{
cout << "Outer's static value: " << s_value << '\n';
}
Inner(Outer &outer)
{
cout << "Outer's value: " << outer.d_value << '\n';
}
};
};
class Extern // won't compile!
{
public:
Extern(Outer &outer)
{
cout << "Outer's value: " << outer.d_value << '\n';
}
Extern()
{
cout << "Outer's static value: " << Outer::s_value << '\n';
}
};
int Outer::s_value = 123;
int main()
{
Outer outer;
Outer::Inner in1;
Outer::Inner in2(outer);
}
Now consider the situation where a class Surround has two nested classes FirstWithin and
SecondWithin. Each of the three classes has a static data member int s_variable:
class Surround
{
static int s_variable;
public:
class FirstWithin
{
static int s_variable;
public:
int value();
};
int value();
private:
class SecondWithin
{
static int s_variable;
public:
int value();
};
};
If the class Surround should be able to access FirstWithin and SecondWithin's private members,
these latter two classes must declare Surround to be their friend. The function Surround::value
can thereupon access the private members of its nested classes. For example (note the friend
declarations in the two nested classes):
class Surround
{
static int s_variable;
public:
class FirstWithin
{
friend class Surround;
static int s_variable;
public:
int value();
};
int value();
private:
class SecondWithin
{
friend class Surround;
static int s_variable;
public:
int value();
};
};
inline int Surround::value()
{
FirstWithin::s_variable = SecondWithin::s_variable;
return s_variable;
}
A friend declaration may be provided after the definition of the entity that is to be considered a
friend, so a class can be declared a friend after it has been defined. In that situation the class's
in-class code may already rely on the friendship that is going to be granted by the upcoming class.
Note that members named identically in outer and inner classes (e.g., ‘s_variable’) may be accessed
using the proper scope resolution expressions, as illustrated below:
class Surround
{
static int s_variable;
public:
class FirstWithin
{
friend class Surround;
static int s_variable; // identically named
public:
int value();
};
int value();
private:
class SecondWithin
{
friend class Surround;
static int s_variable; // identically named
public:
int value();
};
static void classMember();
};
inline int Surround::value()
{ // scope resolution expression
FirstWithin::s_variable = SecondWithin::s_variable;
return s_variable;
}
inline int Surround::FirstWithin::value()
{
Surround::s_variable = 4; // scope resolution expressions
Surround::classMember();
return s_variable;
}
inline int Surround::SecondWithin::value()
{
Surround::s_variable = 40; // scope resolution expression
return s_variable;
}
Nested classes aren't automatically each other's friends. Here friend declarations must be applied
to grant one nested class access to another one's private members. To grant FirstWithin access
to SecondWithin’s private members a friend declaration in SecondWithin is required. But to
grant SecondWithin access to FirstWithin’s private members the class FirstWithin cannot
simply use friend class SecondWithin, as SecondWithin’s definition is as yet unknown.
Now a forward declaration of SecondWithin is required. This forward declaration must be provided
by the class Surround, rather than by the class FirstWithin. It makes no sense to specify a
forward declaration like ‘class SecondWithin;’ in the class FirstWithin itself, as this would
refer to an external (global) class SecondWithin. SecondWithin’s forward declaration can also
not be specified inside FirstWithin as ‘class Surround::SecondWithin;’. This attempt would
generate the following error message:
‘Surround’ does not have a nested type named ‘SecondWithin’
Instead of providing a forward declaration for SecondWithin inside the nested classes the class
SecondWithin must be declared by the class Surround, before the class FirstWithin has been
defined. This way SecondWithin’s friend declaration is accepted inside FirstWithin. Here is an
example in which all classes have full access to all private members of all involved classes:
class Surround
{
class SecondWithin; // forward declaration: makes FirstWithin's
// friend declaration refer to Surround::SecondWithin
static int s_variable;
public:
class FirstWithin
{
friend class Surround;
friend class SecondWithin;
static int s_variable;
public:
int value();
};
int value(); // implementation given above
private:
class SecondWithin
{
friend class Surround;
friend class FirstWithin;
static int s_variable;
public:
int value();
};
};
inline int Surround::FirstWithin::value()
{
Surround::s_variable = SecondWithin::s_variable;
return s_variable;
}
inline int Surround::SecondWithin::value()
{
Surround::s_variable = FirstWithin::s_variable;
return s_variable;
}
17.4 Nesting enumerations
Enumerations may also be nested in classes. Nesting enumerations is a good way to show the close
connection between the enumeration and its class. Nested enumerations have the same controlled
visibility as other class members. They may be defined in the private, protected or public sections
of classes and are inherited by derived classes. In the class ios we’ve seen values like ios::beg
and ios::cur. In the current Gnu C++ implementation these values are defined as values of the
seek_dir enumeration:
class ios: public _ios_fields
{
public:
enum seek_dir
{
beg,
cur,
end
};
};
As an illustration assume that a class DataStructure represents a data structure that may be
traversed in a forward or backward direction. Such a class can define an enumeration Traversal
having the values FORWARD and BACKWARD. Furthermore, a member function setTraversal can be
defined requiring a Traversal type of argument. The class can be defined as follows:
class DataStructure
{
public:
enum Traversal
{
FORWARD,
BACKWARD
};
void setTraversal(Traversal mode);
private:
Traversal
d_mode;
};
Within the class DataStructure the values of the Traversal enumeration can be used directly.
For example:
void DataStructure::setTraversal(Traversal mode)
{
d_mode = mode;
switch (d_mode)
{
case FORWARD:
// ... do something
break;
case BACKWARD:
// ... do something else
break;
}
}
Outside of the class DataStructure the name of the enumeration type is not needed to refer to the
values of the enumeration; the class name is sufficient. The name of the enumeration type is
needed only if a variable of the enumeration type is required, as illustrated by the following piece of
code:
void fun()
{
DataStructure::Traversal // enum typename required
localMode = DataStructure::FORWARD; // enum typename not required
DataStructure ds;
// enum typename not required
ds.setTraversal(DataStructure::BACKWARD);
}
In the above example the constant DataStructure::FORWARD was used to specify a value of an
enum defined in the class DataStructure. Instead of DataStructure::FORWARD the construction
ds.FORWARD is also accepted. In my opinion this syntactic liberty is ugly: FORWARD is a symbolic
value that is defined at the class level; it's not a member of ds, which is suggested by the use of the
member selector operator.
The two class scopes are required only if DataStructure defines a nested class Nested which,
in turn, defines the enumeration Traversal. In that case the latter example should have been
coded as follows:
void fun()
{
DataStructure::Nested::Traversal
localMode = DataStructure::Nested::FORWARD;
DataStructure ds;
ds.setTraversal(DataStructure::Nested::BACKWARD);
}
Here the construction DataStructure::Nested::Traversal localMode =
ds.Nested::FORWARD could also be used. I would avoid it.
17.4.1 Empty enumerations
Enum types usually define symbolic values. However, this is not required. In section 14.6.1 the
std::bad_cast type was introduced. A bad_cast is thrown by the dynamic_cast<> operator
when a reference to a base class object cannot be cast to a derived class reference. The bad_cast
can be caught by type, irrespective of any value it might represent.
Types may be defined without any associated values. An empty enum, i.e., an enum not defining
any values, can be defined, and its type name may thereupon be used as a legitimate
type in, e.g., a catch clause.
The example shows how an empty enum is defined (often, but not necessarily, within a class) and
how it may be thrown (and caught) as an exception:
#include <iostream>
enum EmptyEnum
{};
int main()
try
{
throw EmptyEnum();
}
catch (EmptyEnum)
{
std::cout << "Caught empty enum\n";
}
17.5 Revisiting virtual constructors
In section 14.12 the notion of virtual constructors was introduced. In that section a class Base was
defined as an abstract base class. A class Clonable was defined to manage Base class pointers in
containers like vectors.
As the class Base is a minute class, hardly requiring any implementation, it can very well be defined
as a nested class in Clonable. This emphasizes the close relationship between Clonable and Base.
Nesting Base under Clonable changes
class Derived: public Base
into:
class Derived: public Clonable::Base
Apart from defining Base as a nested class and deriving from Clonable::Base rather than from
Base (and providing Base members with the proper Clonable:: prefix to complete their fully
qualified names), no further modifications are required. Here are the modified parts of the program
shown earlier (cf. section 14.12), now using Base nested under Clonable:
// Clonable and nested Base, including their inline members:
class Clonable
{
public:
class Base;
private:
Base *d_bp;
public:
class Base
{
public:
virtual ~Base();
Base *clone() const;
private:
virtual Base *newCopy() const = 0;
};
Clonable();
explicit Clonable(Base *base);
~Clonable();
Clonable(Clonable const &other);
Clonable(Clonable &&tmp);
Clonable &operator=(Clonable const &other);
Clonable &operator=(Clonable &&tmp);
Base &base() const;
};
inline Clonable::Base *Clonable::Base::clone() const
{
return newCopy();
}
inline Clonable::Base &Clonable::base() const
{
return *d_bp;
}
// Derived1 and its inline member:
class Derived1: public Clonable::Base
{
public:
~Derived1();
private:
virtual Clonable::Base *newCopy() const;
};
inline Clonable::Base *Derived1::newCopy() const
{
return new Derived1(*this);
}
// Members not implemented inline:
Clonable::Base::~Base()
{}

Chapter 18
The Standard Template Library
The Standard Template Library (STL) is a general purpose library consisting of containers,
generic algorithms, iterators, function objects, allocators, adaptors and data structures. The data
structures used by the algorithms are abstract in the sense that the algorithms can be used with
(practically) any data type.
The algorithms can process these abstract data types because they are template based. This chapter
does not cover template construction (see chapter 21 for that). Rather, it focuses on the use of the
algorithms.
Several elements also used by the standard template library have already been discussed in the C++
Annotations. In chapter 12 abstract containers were discussed, and in section 11.10 function objects
were introduced. Also, iterators were mentioned at several places in this document.
The main components of the STL are covered in this and the next chapter. Iterators, adaptors, smart
pointers, multi threading and other features of the STL are discussed in coming sections. Generic
algorithms are covered in the next chapter (19).
Allocators take care of the memory allocation within the STL. The default allocator class suffices for
most applications, and is not further discussed in the C++ Annotations.
All elements of the STL are defined in the standard namespace. Therefore, a using namespace
std or a comparable directive is required unless it is preferred to specify the required namespace
explicitly. In header files the std namespace should explicitly be used (cf. section 7.11.1).
In this chapter the empty angle bracket notation is frequently used. In code a typename must be
supplied between the angle brackets. E.g., plus<> is used in the C++ Annotations, but in code
plus<string> may be encountered.
18.1 Predefined function objects
Before using the predefined function objects presented in this section the <functional> header
file must be included.
Function objects play important roles in generic algorithms. For example, there exists a generic
algorithm sort expecting two iterators defining the range of objects that should be sorted, as well
as a function object calling the appropriate comparison operator for two objects. Let’s take a quick
look at this situation. Assume strings are stored in a vector, and we want to sort the vector in
descending order. In that case, sorting the vector stringVec is as simple as:
sort(stringVec.begin(), stringVec.end(), greater<string>());
The last argument is recognized as a constructor: it is an instantiation of the greater<> class template,
applied to strings. This object is called as a function object by the sort generic algorithm.
The generic algorithm calls the function object’s operator() member to compare two string objects.
The function object’s operator() will, in turn, call operator> of the string data type.
Eventually, when sort returns, the first element of the vector will contain the string having the
greatest string value of all.
The function object’s operator() itself is not visible at this point. Don’t confuse the parentheses
in the ‘greater<string>()’ argument with calling operator(). When operator()
is actually used inside sort, it receives two arguments: two strings to compare for ‘greaterness’.
Since greater<string>::operator() is defined inline, the call itself is not actually
present in the above sort call. Instead sort calls string::operator> through
greater<string>::operator().
Now that we know that a constructor is passed as argument to (many) generic algorithms, we can
design our own function objects. Assume we want to sort our vector case-insensitively. How do
we proceed? First we note that the default string::operator< (for an incremental sort) is not
appropriate, as it does case sensitive comparisons. So, we provide our own CaseInsensitive class,
which compares two strings case insensitively. Using the POSIX function strcasecmp, the following
program performs the trick. It case-insensitively sorts its command-line arguments in ascending
alphabetic order:
#include <iostream>
#include <string>
#include <cstring>
#include <strings.h> // strcasecmp (POSIX)
#include <algorithm>
using namespace std;
class CaseInsensitive
{
public:
bool operator()(string const &left, string const &right) const
{
return strcasecmp(left.c_str(), right.c_str()) < 0;
}
};
int main(int argc, char **argv)
{
sort(argv, argv + argc, CaseInsensitive());
for (int idx = 0; idx < argc; ++idx)
cout << argv[idx] << " ";
cout << '\n';
}
The default constructor of the class CaseInsensitive is used to provide sort with its final
argument. So the only member function that must be defined is CaseInsensitive::operator().
Since we know it’s called with string arguments, we define it to expect two string arguments,
which are used when calling strcasecmp. Furthermore, the operator() function is defined inline, so
that it does not produce overhead when called by the sort function. The sort function calls the
function object with various combinations of strings. If the compiler grants our inline requests, it
will in fact call strcasecmp, skipping two extra function calls.
The comparison function object is often a predefined function object. Predefined function object
classes are available for many commonly used operations. In the following sections the available
predefined function objects are presented, together with some examples showing their use. Near the
end of the section about function objects function adaptors are introduced.
Predefined function objects are used predominantly with generic algorithms. Predefined function
objects exist for arithmetic, relational, and logical operations. In section 24.3 predefined function
objects are developed performing bitwise operations.
18.1.1 Arithmetic function objects
The arithmetic function objects support the standard arithmetic operations: addition, subtraction,
multiplication, division, modulo and negation. These function objects invoke the corresponding operators
of the data types for which they are instantiated. For example, for addition the function object
plus<Type> is available. If we replace Type by size_t then the addition operator for size_t
values is used, if we replace Type by string, the addition operator for strings is used. For example:
#include <iostream>
#include <string>
#include <functional>
using namespace std;
int main(int argc, char **argv)
{
plus<size_t> uAdd; // function object to add size_ts
cout << "3 + 5 = " << uAdd(3, 5) << '\n';
plus<string> sAdd; // function object to add strings
cout << "argv[0] + argv[1] = " << sAdd(argv[0], argv[1]) << '\n';
}
/*
Output when called as: a.out going
3 + 5 = 8
argv[0] + argv[1] = a.outgoing
*/
Why is this useful? Note that the function object can be used with all kinds of data types (not only
with the predefined datatypes) supporting the operator called by the function object.
Suppose we want to perform an operation on a left hand side operand which is always the same
variable and a right hand side argument for which, in turn, all elements of an array should be used.
E.g., we want to compute the sum of all elements in an array; or we want to concatenate all the
strings in a text-array. In situations like these function objects come in handy.
As stated, function objects are heavily used in the context of the generic algorithms, so let’s take a
quick look ahead at yet another one.
The generic algorithm accumulate visits all elements specified by an iterator-range, and performs
a requested binary operation on a common element and each of the elements in the range, returning
the accumulated result after visiting all elements specified by the iterator range. It’s easy to use this
algorithm. The next program accumulates all command line arguments and prints the final string:
    #include <iostream>
    #include <string>
    #include <functional>
    #include <numeric>
    using namespace std;

    int main(int argc, char **argv)
    {
        string result =
                accumulate(argv, argv + argc, string(), plus<string>());
        cout << "All concatenated arguments: " << result << '\n';
    }
The first two arguments define the (iterator) range of elements to visit, the third argument is
string(). This anonymous string object provides an initial value. We could also have used
    string("All concatenated arguments: ")
in which case the cout statement could simply have been cout << result << '\n'. The string addition
operation is used, called from plus<string>. The final concatenated string is returned.
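The same pattern applies to other data types and other function objects. As a minimal sketch (the helper names total and product are merely chosen for this illustration and are not part of the STL), accumulate can compute a sum or a product, depending on the initial value and the function object that is passed:

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>
using namespace std;

// total and product are illustrative helpers, not STL functions

// starts at 0 and applies plus<int> to the running result and each element
int total(vector<int> const &values)
{
    return accumulate(values.begin(), values.end(), 0, plus<int>());
}

// starts at 1 and applies multiplies<int> instead
int product(vector<int> const &values)
{
    return accumulate(values.begin(), values.end(), 1, multiplies<int>());
}
```

Only the initial value and the function object differ between the two helpers: the algorithm itself remains the same.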
Now we define a class Time, overloading operator+. Again, we can apply the predefined function
object plus, now tailored to our newly defined datatype, to add times:
    #include <iostream>
    #include <string>
    #include <vector>
    #include <functional>
    #include <numeric>
    using namespace std;

    class Time
    {
        friend ostream &operator<<(ostream &str, Time const &time);
        size_t d_days;
        size_t d_hours;
        size_t d_minutes;
        size_t d_seconds;
        public:
            Time(size_t hours, size_t minutes, size_t seconds);
            Time &operator+=(Time const &rValue);
    };
    Time operator+(Time const &lValue, Time const &rValue)
    {
        Time ret(lValue);       // returned by value: a reference to the
        ret += rValue;          // local ret would dangle after returning
        return ret;
    }
    Time::Time(size_t hours, size_t minutes, size_t seconds)
    :
        d_days(0),
        d_hours(hours),
        d_minutes(minutes),
        d_seconds(seconds)
    {}
    Time &Time::operator+=(Time const &rValue)
    {
        d_seconds += rValue.d_seconds;
        d_minutes += rValue.d_minutes + d_seconds / 60;
        d_hours += rValue.d_hours + d_minutes / 60;
        d_days += rValue.d_days + d_hours / 24;
        d_seconds %= 60;
        d_minutes %= 60;
        d_hours %= 24;
        return *this;
    }
    ostream &operator<<(ostream &str, Time const &time)
    {
        return str << time.d_days << " days, " << time.d_hours <<
                    " hours, " <<
                    time.d_minutes << " minutes and " <<
                    time.d_seconds << " seconds.";
    }
    int main(int argc, char **argv)
    {
        vector<Time> tvector;
        tvector.push_back(Time( 1, 10, 20));
        tvector.push_back(Time(10, 30, 40));
        tvector.push_back(Time(20, 50, 0));
        tvector.push_back(Time(30, 20, 30));
        cout <<
            accumulate
            (
                tvector.begin(), tvector.end(), Time(0, 0, 0), plus<Time>()
            ) <<
            '\n';
    }
    // Displays: 2 days, 14 hours, 51 minutes and 30 seconds.
The design of the above program is fairly straightforward. Time defines a constructor, it defines
an insertion operator and it defines its own operator+, adding two time objects. In main four
Time objects are stored in a vector<Time> object. Then, accumulate is used to compute the
accumulated time. It returns a Time object, which is inserted into cout.
While the first example did show the use of a named function object, the last two examples showed
the use of anonymous objects that were passed to the (accumulate) function.
The STL supports the following set of arithmetic function objects. The function call operator
(operator()) of these function objects calls the matching arithmetic operator for the objects that
are passed to the function call operator, returning that arithmetic operator’s return value. The
arithmetic operator that is actually called is mentioned below:
• plus<>: calls the binary operator+;
• minus<>: calls the binary operator-;
• multiplies<>: calls the binary operator*;
• divides<>: calls operator/;
• modulus<>: calls operator%;
• negate<>: calls the unary operator-. This arithmetic function object is a unary function
object as it expects one argument.
In the next example the transform generic algorithm is used to toggle the signs of all elements
of an array. Transform expects two iterators, defining the range of objects to be transformed; an
iterator defining the begin of the destination range (which may be the same iterator as the first
argument); and a function object defining a unary operation for the indicated data type.
    #include <iostream>
    #include <string>
    #include <functional>
    #include <algorithm>
    using namespace std;

    int main(int argc, char **argv)
    {
        int iArr[] = { 1, -2, 3, -4, 5, -6 };
        transform(iArr, iArr + 6, iArr, negate<int>());
        for (int idx = 0; idx < 6; ++idx)
            cout << iArr[idx] << ", ";
        cout << '\n';
    }
    // Displays: -1, 2, -3, 4, -5, 6,
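Transform also offers a form accepting two input ranges and a binary function object. As a small sketch (the function name elementwiseProduct is merely chosen for this illustration), multiplies<int> can be used to multiply the corresponding elements of two equally sized vectors:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>
using namespace std;

// elementwiseProduct is an illustrative name, not an STL function.
// Both vectors are assumed to have the same size.
vector<int> elementwiseProduct(vector<int> const &lhs, vector<int> const &rhs)
{
    assert(lhs.size() == rhs.size());

    vector<int> result(lhs.size());
    // the fourth argument is the begin of the destination range, the
    // fifth the binary operation applied to each pair of elements
    transform(lhs.begin(), lhs.end(), rhs.begin(), result.begin(),
              multiplies<int>());
    return result;
}
```

Replacing multiplies<int> by, e.g., minus<int> or modulus<int> changes the operation without changing anything else in the function.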
18.1.2 Relational function objects
The relational operators are called by the relational function objects. All standard relational operators
are supported: ==, !=, >, >=, < and <=.
The STL supports the following set of relational function objects. The function call operator
(operator()) of these function objects calls the matching relational operator for the objects that
are passed to the function call operator, returning that relational operator’s return value. The relational
operator that is actually called is mentioned below:
• equal_to<>: calls operator==;
• not_equal_to<>: calls operator!=;
• greater<>: calls operator>;
• greater_equal<>: calls operator>=;
• less<>: this object’s operator() member calls operator<;
• less_equal<>: calls operator<=.
An example using the relational function objects in combination with sort is:
    #include <iostream>
    #include <string>
    #include <functional>
    #include <algorithm>
    using namespace std;

    int main(int argc, char **argv)
    {
        sort(argv, argv + argc, greater<string>());
        for (int idx = 0; idx < argc; ++idx)
            cout << argv[idx] << " ";
        cout << '\n';
        sort(argv, argv + argc, less<string>());
        for (int idx = 0; idx < argc; ++idx)
            cout << argv[idx] << " ";
        cout << '\n';
    }
The example illustrates how strings may be sorted alphabetically and reversed alphabetically. By
passing greater<string> the strings are sorted in decreasing order (the first word will be
the ‘greatest’), by passing less<string> the strings are sorted in increasing order (the first word
will be the ‘smallest’). Note that sort requires a strict ordering: greater_equal and less_equal
should not be passed to sort, as they return true when comparing equal elements.
Note that argv contains char * values, and that the relational function object expects a string.
The conversion from char const * to string is silently performed.
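Relational function objects can of course also be used with containers. The following sketch (descending is just an illustrative name) sorts a copy of a vector of strings in decreasing order using greater<string>, which defines the strict ordering that sort expects:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>
using namespace std;

// descending is an illustrative helper: it returns a copy of words,
// sorted from the 'greatest' to the 'smallest' string
vector<string> descending(vector<string> words)
{
    sort(words.begin(), words.end(), greater<string>());
    return words;
}
```

As words is passed by value the caller's vector is not modified.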
18.1.3 Logical function objects
The logical operators are called by the logical function objects. The standard logical operators are
supported: and, or, and not.
The STL supports the following set of logical function objects. The function call operator
(operator()) of these function objects calls the matching logical operator for the objects that are
passed to the function call operator, returning that logical operator’s return value. The logical operator
that is actually called is mentioned below:
• logical_and<>: calls operator&&;
• logical_or<>: calls operator||;
• logical_not<>: calls operator!.
An example using operator! is provided in the following trivial program, using transform to
transform the logical values stored in an array:
    #include <iostream>
    #include <string>
    #include <functional>
    #include <algorithm>
    using namespace std;

    int main(int argc, char **argv)
    {
        bool bArr[] = {true, true, true, false, false, false};
        size_t const bArrSize = sizeof(bArr) / sizeof(bool);
        for (size_t idx = 0; idx < bArrSize; ++idx)
            cout << bArr[idx] << " ";
        cout << '\n';
        transform(bArr, bArr + bArrSize, bArr, logical_not<bool>());
        for (size_t idx = 0; idx < bArrSize; ++idx)
            cout << bArr[idx] << " ";
        cout << '\n';
    }
    /*
        Displays:
            1 1 1 0 0 0
            0 0 0 1 1 1
    */
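The binary logical function objects can be used analogously. A minimal sketch, assuming two equally sized vectors of bool values (bothSet is a name made up for this illustration), combines their elements using logical_and<bool>:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>
using namespace std;

// bothSet is an illustrative helper: element idx of the returned vector
// is lhs[idx] && rhs[idx]. Both vectors are assumed to have the same size.
vector<bool> bothSet(vector<bool> const &lhs, vector<bool> const &rhs)
{
    assert(lhs.size() == rhs.size());

    vector<bool> result(lhs.size());
    transform(lhs.begin(), lhs.end(), rhs.begin(), result.begin(),
              logical_and<bool>());
    return result;
}
```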
18.1.4 Function adaptors
Function adaptors modify the working of existing function objects. The STL offers three kinds of
function adaptors: binders, negators and member function wrappers. Binders and negators are
described in the next two subsections; member function adaptors are covered in section 19.2 of the
next chapter, which is a more natural point of coverage than the current chapter.
18.1.4.1 The ‘bind’ function template
Binders are function adaptors converting binary function objects to unary function objects. They do
so by binding one parameter of a binary function object to a constant value. For example, if the first
parameter of the minus<int> function object is bound to 100, then the resulting value is always
equal to 100 minus the value of the function object’s second argument.
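The minus<int> example just mentioned can be sketched using the bind function template, which is described later in this section. The helper's name fromHundred is merely an assumption made for this illustration:

```cpp
#include <cassert>
#include <functional>
using namespace std;
using namespace std::placeholders;

// fromHundred is an illustrative helper: the first parameter of minus<int>
// is bound to 100, the placeholder _1 represents the remaining argument
int fromHundred(int value)
{
    auto subtract = bind(minus<int>(), 100, _1);
    return subtract(value);             // computes 100 - value
}
```

The binary function object minus<int> has thus been converted into a function object (subtract) expecting only one argument.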
Originally two binder adaptors (bind1st and bind2nd) binding, respectively, the first and the second
argument of a binary function were defined. However, bind1st and bind2nd were deprecated in C++11
and removed from the standard in C++17, as they are superseded by the more general bind binder.
In many situations bind itself can easily be replaced by (generic) lambda
functions (cf. section 18.7).
As bind1st and bind2nd are still encountered in existing code, a short example showing their use
(concentrating on bind2nd) is provided. A more elaborate example, using bind, is shown next. Existing code should
be modified so that either bind or a lambda function is used.
Before using bind (or the namespace std::placeholders, see below) the <functional> header
file must be included.
Here is an example showing how to use bind2nd to count the number of strings that are equal to
a string (target) in a vector of strings (vs) (it is assumed that the required headers and using
namespace std have been specified):
    count_if(vs.begin(), vs.end(), bind2nd(equal_to<string>(), target));
In this example the function object equal_to is instantiated for strings, receiving target as its
second argument, and each of the strings in vs is passed in sequence to its first argument. In this
particular example, where equality is being determined, bind1st could also have been used.
The bind adaptor expects a function as its first argument, and then any number of arguments that
the function may need. Although an unspecified number of arguments may be specified when using
bind it is not a variadic function the way the C programming language defines them. Bind is a
variadic function template; such templates are covered in section 22.5.
Bind returns a function object which, when called, calls the function that was specified as bind’s
first argument, passing it the remaining arguments that were passed to bind. Depending on the way
bind is called, calling the returned function object may or may not require
arguments.
Here is an example:
    int sub(int lhs, int rhs);      // returns lhs - rhs
    bind(sub, 3, 4);                // returns a function object whose
                                    // operator() returns sub(3, 4)
Since bind’s return value is a function object it can be called:
    bind(sub, 3, 4)();
but more commonly bind’s return value is assigned to a variable, which then represents the returned
function object, as in:
    auto functor = bind(sub, 3, 4); // define a variable for the functor
    cout << functor() << '\n';      // call the functor, returning -1.
Instead of specifying the arguments when using bind, placeholders (cf. section 4.1.3.1) can be specified.
Explicit argument values must then be specified when the returned functor is called. Here are
some examples:
    using namespace placeholders;
    auto ftor1 = bind(sub, _1, 4);  // 1st argument must be specified
    ftor1(10);                      // returns 10 - 4 = 6
    auto ftor2 = bind(sub, 5, _1);  // 2nd argument must be specified
    ftor2(10);                      // returns 5 - 10 = -5
    auto ftor3 = bind(sub, _1, _2); // Both arguments must be specified
    ftor3(10, 2);                   // returns 10 - 2 = 8
    auto ftor4 = bind(sub, _2, _1); // Both arguments must be specified
    ftor4(10, 2);                   // but in reversed order: returns
                                    // 2 - 10 = -8
Alternatively, the first argument can be the address of a member function. In that case, the second
argument specifies the object for which the member function will be called, while the remaining
arguments specify the arguments (if any) that are passed to the member function. Some examples:
    #include <iostream>
    #include <functional>
    using namespace std;

    struct Object               // Object contains the lhs of a
    {                           // subtraction operation
        int d_lhs;
        Object(int lhs)
        :
            d_lhs(lhs)
        {}
        int sub(int rhs)        // sub modifies d_lhs
        {
            return d_lhs -= rhs;
        }
    };
    int main()
    {
        using namespace placeholders;
        Object obj(5);
        auto ftor = bind(&Object::sub, obj, 12);
        cout << ftor() << '\n';         // shows -7
        cout << obj.d_lhs << '\n';      // obj not modified, bind uses a copy:
                                        // cout shows 5
        auto ftor2 = bind(&Object::sub, ref(obj), 12);
        cout << ftor2() << '\n';        // shows -7
        cout << obj.d_lhs << '\n';      // obj modified, cout shows -7
    }
Note the use of ref in the second bind call: here obj is passed by reference, forwarding obj itself,
rather than its copy, to the ftor2 functor. This is realized using a facility called perfect forwarding,
which is discussed in detail in section 22.5.2.
If the return type of the function that is called by the functor doesn’t match its context (e.g., the
functor is called in an expression where its return value is compared with a size_t) then the
return type of the functor can easily be coerced into the appropriate type (of course, provided that
the requested type conversion is possible). In those cases the requested return type can be specified
between angle brackets immediately following bind. E.g.,
    auto ftor = bind<size_t>(sub, _1, 4);   // ftor's return type is size_t
    size_t target = 5;
    if (target < ftor(3))           // -1 becomes a large positive value
        cout << "smaller\n";        // and so 'smaller' is shown.
Finally, the example given earlier, using bind2nd, can be rewritten using bind like this:
    using namespace placeholders;
    count_if(vs.begin(), vs.end(), bind(equal_to<string>(), _1, target));
Here, bind returns a functor expecting one argument (represented by _1) and count_if passes
the strings in vs in sequence to the functor returned by bind. The second argument (target)
is embedded inside the functor’s implementation, where it is passed as second argument to the
equal_to<string>() function object.
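Since lambda functions are mentioned as an alternative to bind, the following sketch puts both approaches side by side. The function names countWithBind and countWithLambda are assumptions made for this illustration only; both are expected to return the same count:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>
using namespace std;
using namespace std::placeholders;

// countWithBind and countWithLambda are illustrative names

// target is embedded in the functor returned by bind
size_t countWithBind(vector<string> const &vs, string const &target)
{
    return count_if(vs.begin(), vs.end(),
                    bind(equal_to<string>(), _1, target));
}

// the lambda function captures target by reference and compares directly
size_t countWithLambda(vector<string> const &vs, string const &target)
{
    return count_if(vs.begin(), vs.end(),
        [&](string const &str)
        {
            return str == target;
        }
    );
}
```

Which of the two is preferred is largely a matter of taste, although the lambda function shows the comparison that is actually performed.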
18.1.4.2 Negators
Negators are function adaptors negating the values returned by predicate functions. Traditionally,
matching bind1st and bind2nd, two negator function adaptors were predefined: not1 is the
negator to use with unary predicates, not2 is the negator to use with binary predicates. In specific
situations they may still be usable in combination with the bind function template, but like
bind1st and bind2nd they have been superseded: not1 and not2 were deprecated in C++17, which
introduced the more general not_fn (see, e.g., https://isocpp.org/files/papers/n4076.html).
Since not1 and not2 are still encountered in existing code, their use is briefly illustrated here. An
alternate implementation, suggesting how not_fn might be designed and how it can be
used, is provided in section 22.5.5.
Here are some examples illustrating the use of not1 and not2. To count the number of elements
in a vector of strings (vs) that are alphabetically ordered before a certain reference string (target)
one of the following alternatives could be used:
• a binary predicate directly performing the required comparison:
    count_if(vs.begin(), vs.end(), bind2nd(less<string>(), target))
or, using bind:
    count_if(vs.begin(), vs.end(), bind(less<string>(), _1, target));
• The comparison can be reversed, using the not2 negator:
    count_if(vs.begin(), vs.end(),
            bind2nd(not2(greater_equal<string>()), target));
Here not2 is used as it negates the truth value returned by greater_equal. Not2 receives
two arguments (one of vs’s elements and target), passes them on to greater_equal,
and returns the negated return value of the called greater_equal function.
In this example bind could also have been used:
    count_if(vs.begin(), vs.end(),
            bind(not2(greater_equal<string>()), _1, target));
• not1 in combination with the bind2nd predicate: here the arguments that are passed to
not1’s function call operator (i.e., the elements of the vs vector) are passed on to bind2nd’s
function call operator, which in turn calls greater_equal, using target as its second argument.
The value that is returned by bind2nd’s function call operator is then negated and
subsequently returned as the return value of not1’s function call operator:
    count_if(vs.begin(), vs.end(),
            not1(bind2nd(greater_equal<string>(), target)))
When using bind in this example a compilation error results, which can be avoided using
not_fn (section 22.5.5).
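Assuming a C++17 compiler, in which std::not_fn is available after including <functional>, the last alternative can be sketched using bind as follows (countBefore is a name chosen for this illustration only):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>
using namespace std;
using namespace std::placeholders;

// countBefore is an illustrative helper, assuming C++17's std::not_fn:
// not_fn negates the value returned by the functor created by bind
size_t countBefore(vector<string> const &vs, string const &target)
{
    return count_if(vs.begin(), vs.end(),
                    not_fn(bind(greater_equal<string>(), _1, target)));
}
```

Different from not1, not_fn does not require its argument to define types like argument_type, which is why it can be combined with the functor returned by bind.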
18.2 Iterators
In addition to the conceptual iterator types presented in this section the STL defines several adaptors
allowing objects to be passed as iterators. These adaptors are presented in the upcoming sections.
Before those adaptors can be used the <iterator> header file must be included.
Iterators are objects acting like pointers. Iterators have the following general characteristics:
• Two iterators may be compared for (in)equality using the == and != operators. The ordering
operators (e.g., >, <) can usually not be used.
• Given an iterator iter, *iter represents the object the iterator points to (alternatively,
iter-> can be used to reach the members of the object the iterator points to).
• ++iter or iter++ advances the iterator to the next element. The notion of advancing an iterator
to the next element is consistently applied: several containers support reverse_iterator
types, in which the ++iter operation actually reaches a previous element in a sequence.
• Pointer arithmetic may be used with iterators of containers storing their elements consecutively
in memory like vector and deque. For such containers iter + 2 points to the
second element beyond the one to which iter points. See also section 18.2.1, covering
std::distance.
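• Merely defining an iterator is comparable to having a 0-pointer: the iterator doesn’t refer to
any element yet, and it must be assigned a valid position before it is dereferenced. The following
example only illustrates the analogy, as dereferencing such an iterator formally results in
undefined behavior:
    #include <vector>
    #include <iostream>
    using namespace std;

    int main()
    {
        vector<int>::iterator vi;
        cout << &*vi;           // commonly shows 0
    }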
STL containers usually define members offering iterators (i.e., they define their own type
iterator). These members are commonly called begin and end and (for reversed iterators (type
reverse_iterator)) rbegin and rend.
Standard practice requires iterator ranges to be left inclusive. The notation [left, right) indicates
that left is an iterator pointing to the first element, while right is an iterator pointing just
beyond the last element. The iterator range is empty when left == right.
The following example shows how all elements of a vector of strings can be inserted into cout using
its iterator ranges [begin(), end()), and [rbegin(), rend()). Note that the for-loops for
both ranges are identical. Furthermore it nicely illustrates how the auto keyword can be used to
define the type of the loop control variable instead of using a much more verbose variable definition
like vector<string>::iterator (see also section 3.3.5):
    #include <iostream>
    #include <vector>
    #include <string>
    using namespace std;

    int main(int argc, char **argv)
    {
        vector<string> args(argv, argv + argc);
        for (auto iter = args.begin(); iter != args.end(); ++iter)
            cout << *iter << " ";
        cout << '\n';
        for (auto iter = args.rbegin(); iter != args.rend(); ++iter)
            cout << *iter << " ";
        cout << '\n';
    }
Furthermore, the STL defines const_iterator types that must be used when visiting a series of
elements in a constant container. Whereas the elements of the vector in the previous example
could have been altered, the elements of the vector in the next example are immutable, and
const_iterators are required:
    #include <iostream>
    #include <vector>
    #include <string>
    using namespace std;

    int main(int argc, char **argv)
    {
        vector<string> const args(argv, argv + argc);
        for
        (
            vector<string>::const_iterator iter = args.begin();
                iter != args.end();
                    ++iter
        )
            cout << *iter << " ";
        cout << '\n';
        for
        (
            vector<string>::const_reverse_iterator iter = args.rbegin();
                iter != args.rend();
                    ++iter
        )
            cout << *iter << " ";
        cout << '\n';
    }
The examples also illustrate that plain pointers can be used as iterators. The initialization
vector<string> args(argv, argv + argc) provides the args vector with a pair of pointer-based
iterators: argv points to the first element to initialize args with, argv + argc points just
beyond the last element to be used, ++argv reaches the next command line argument. This is a
general pointer characteristic, which is why they too can be used in situations where iterators
are expected.
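As a small sketch of this interchangeability, the following function template (largest is a name made up for this illustration) accepts any iterator range [first, last), and may therefore be called with plain pointers, with a vector’s iterators, or with its reverse iterators:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>
using namespace std;

// largest is an illustrative function template: it accepts any iterator
// range [first, last) of int values, which is assumed to be non-empty
template <typename Iterator>
int largest(Iterator first, Iterator last)
{
    assert(first != last);
    return *max_element(first, last);
}
```

Given int arr[] = {3, 9, 4} and a vector<int> vec holding the same values, largest(arr, arr + 3), largest(vec.begin(), vec.end()) and largest(vec.rbegin(), vec.rend()) all return the same value.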
The STL defines five types of iterators. These iterator types are expected by generic algorithms, and
in order to create a particular type of iterator yourself it is important to know their characteristics.
In general, iterators (see also section 22.14) must define:
• operator==, testing two iterators for equality,
• operator!=, testing two iterators for inequality,
• operator++, incrementing the iterator, as prefix operator,
• operator*, to access the element the iterator refers to.
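To make these four requirements concrete, here is a minimal sketch of a hypothetical class CountIterator, merely stepping through consecutive int values. It is not usable with the generic algorithms as it stands, since those expect some additional type definitions (see section 22.14), but it can be used in a hand-written loop over a half-open range:

```cpp
#include <cassert>

// CountIterator is a hypothetical, minimal iterator-like class: it merely
// steps through consecutive int values
class CountIterator
{
    int d_value;

    public:
        explicit CountIterator(int value)
        :
            d_value(value)
        {}
        bool operator==(CountIterator const &other) const
        {
            return d_value == other.d_value;
        }
        bool operator!=(CountIterator const &other) const
        {
            return !(*this == other);
        }
        CountIterator &operator++()         // prefix increment
        {
            ++d_value;
            return *this;
        }
        int operator*() const               // the 'element' referred to
        {
            return d_value;
        }
};

// sums the values in the half-open range [begin, end)
int sumRange(CountIterator begin, CountIterator end)
{
    int sum = 0;
    for (; begin != end; ++begin)
        sum += *begin;
    return sum;
}
```

For example, sumRange(CountIterator(1), CountIterator(5)) visits the values 1 through 4.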
The following types of iterators are used when describing generic algorithms in chapter 19:
• InputIterators:
InputIterators are used to read from a container. The dereference operator is guaranteed
to work as rvalue in expressions. Instead of an InputIterator it is also possible
to use (see below) Forward-, Bidirectional- or RandomAccessIterators. Notations like
InputIterator1 and InputIterator2 may be used as well. In these cases, numbers
are used to indicate which iterators ‘belong together’. E.g., the generic algorithm
inner_product has the following prototype:
Type inner_product(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, Type init);
InputIterator1 first1 and InputIterator1 last1 define a pair of input iterators
on one range, while InputIterator2 first2 defines the beginning of another
range. Analogous notations may be used with other iterator types.
• OutputIterators:
OutputIterators can be used to write to a container. The dereference operator is guaranteed
to work as an lvalue in expressions, but not necessarily as rvalue. Instead
of an OutputIterator it is also possible to use (see below) Forward-, Bidirectional- or
RandomAccessIterators.
• ForwardIterators:
ForwardIterators combine InputIterators and OutputIterators. They can be used to
traverse containers in one direction, for reading and/or writing. Instead of a ForwardIterator
it is also possible to use (see below) Bidirectional- or RandomAccessIterators.
• BidirectionalIterators:
BidirectionalIterators can be used to traverse containers in both directions, for reading
and writing. Instead of a BidirectionalIterator it is also possible to use (see below)
a RandomAccessIterator.
• RandomAccessIterators:
RandomAccessIterators provide random access to container elements. An algorithm
like sort requires a RandomAccessIterator, and can therefore not be used to sort the
elements of lists or maps, which only provide BidirectionalIterators.
The example given with the RandomAccessIterator illustrates how to relate iterators and generic
algorithms: look for the iterator that’s required by the (generic) algorithm, and then see whether
the datastructure supports the required type of iterator. If not, the algorithm cannot be used with
the particular datastructure.
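For the sort example this means that the elements of a list must either be copied to a container offering RandomAccessIterators, or that the list’s own sort member must be used. Both options are sketched below; the function names sortedValues and sortedList are assumptions made for this illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>
using namespace std;

// sortedValues is an illustrative helper: the list's elements are copied to
// a vector, whose iterators are RandomAccessIterators, and sorted there
vector<int> sortedValues(list<int> const &values)
{
    vector<int> result(values.begin(), values.end());
    sort(result.begin(), result.end());
    return result;
}

// alternatively, std::list offers its own sort member, which doesn't
// need RandomAccessIterators
list<int> sortedList(list<int> values)
{
    values.sort();
    return values;
}
```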
18.2.1 std::distance
Earlier, in section 18.2 it was stated that iterators support pointer arithmetic for containers storing
their elements consecutively in memory. This is not completely true: to determine the number of elements
between the elements to which two iterators refer the iterator must support the subtraction
operator.
Using pointer arithmetic to compute the number of elements between two iterators in, e.g., a
std::list or std::unordered_map is not possible, as these containers do not store their elements
consecutively in memory.
The function std::distance fills in that little gap: std::distance expects two InputIterators
and returns the number of elements between them.
Before using distance the <iterator> header file must be included.
If the iterator specified as first argument exceeds the iterator specified as its second argument
then the number of elements is non-positive, otherwise it is non-negative. If the number of elements
cannot be determined (e.g., the iterators do not refer to elements in the same container), then
distance’s return value is undefined.
Example:
    #include <iostream>
    #include <iterator>
    #include <unordered_map>
    using namespace std;

    int main()
    {
        unordered_map<int, int> myMap = {{1, 2}, {3, 5}, {-8, 12}};
        cout << distance(++myMap.begin(), myMap.end()) << '\n'; // shows: 2
    }
18.2.2 Insert iterators
Generic algorithms often require a target container into which the results of the algorithm are
deposited. For example, the copy generic algorithm has three parameters. The first two define the
range of visited elements, the third defines the first position where the results of the copy operation
should be stored.
With the copy algorithm the number of elements to copy is usually available beforehand, since that
number can usually be provided by pointer arithmetic. However, situations exist where pointer
arithmetic cannot be used. Analogously, the number of resulting elements sometimes differs from
the number of elements in the initial range. The generic algorithm unique_copy is a case in point.
Here the number of elements that are copied to the destination container is normally not known
beforehand.
In situations like these an inserter adaptor function can often be used to create elements in the
destination container. There are three types of inserter adaptors:
• back_inserter: calls the container’s push_back member to add new elements at the end
of the container. E.g., to copy all elements of source in reversed order to the back of
destination, using the copy generic algorithm:
    copy(source.rbegin(), source.rend(), back_inserter(destination));
• front_inserter calls the container’s push_front member, adding new elements at the beginning
of the container. E.g., to copy all elements of source to the front of the destination
container (thereby also reversing the order of the elements):
    copy(source.begin(), source.end(), front_inserter(destination));
• inserter calls the container’s insert member adding new elements starting at a specified
starting point. E.g., to copy all elements of source to the destination container, starting at
the beginning of destination, shifting up existing elements to beyond the newly inserted
elements:
    copy(source.begin(), source.end(), inserter(destination,
                                                destination.begin()));
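The unique_copy situation mentioned above, where the number of resulting elements isn’t known beforehand, can be sketched as follows. The helper names withoutRepeats and reversedCopy are merely chosen for this illustration; the second helper shows the order reversal caused by front_inserter:

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <iterator>
#include <vector>
using namespace std;

// withoutRepeats and reversedCopy are illustrative helpers

// unique_copy skips consecutive equal elements; back_inserter lets the
// destination grow as needed
vector<int> withoutRepeats(vector<int> const &source)
{
    vector<int> destination;
    unique_copy(source.begin(), source.end(), back_inserter(destination));
    return destination;
}

// front_inserter requires push_front, which deque (but not vector) offers
deque<int> reversedCopy(vector<int> const &source)
{
    deque<int> destination;
    copy(source.begin(), source.end(), front_inserter(destination));
    return destination;
}
```

In both helpers destination starts out empty: the inserters create the elements, so no size has to be computed in advance.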
The inserter adaptors require the existence of two typedefs:
• typedef Data value_type, where Data is the data type stored in the class offering
push_back, push_front or insert members (Example: typedef std::string
value_type);
• typedef value_type const &const_reference
Concentrating on back_inserter, this adaptor expects a container supporting a member
push_back. The assignment operator of the iterator returned by back_inserter calls the container’s
push_back member. Objects
of any class supporting a push_back member can be passed as arguments to back_inserter
provided the class adds
    typedef DataType const &const_reference;
to its interface (where DataType const & is the type of the parameter of the class’s member
push_back). Example:
    #include <iostream>
    #include <algorithm>
    #include <iterator>
    using namespace std;

    class Insertable
    {
        public:
            typedef int value_type;
            typedef int const &const_reference;
            void push_back(int const &)
            {}
    };
    int main()
    {
        int arr[] = {1};
        Insertable insertable;
        copy(arr, arr + 1, back_inserter(insertable));
    }
18.2.3 Iterators for ‘istream’ objects
The istream_iterator<Type> can be used to define a set of iterators for istream objects. The
general form of the istream_iterator iterator is:
    istream_iterator<Type> identifier(istream &in)
Here, Type is the type of the data elements read from the istream stream, and identifier is used
as the ‘begin’ iterator in an iterator range. Type may be any type for which operator>> is defined in
combination with istream objects.
The default constructor is used as the end-iterator and corresponds to the end-of-stream. For example,
istream_iterator<string> endOfStream;
The stream object that was specified when defining the begin-iterator is not mentioned with the
default constructor.
Using back_inserter and istream_iterator adaptors, all strings from a stream can easily be
stored in a container. Example (using anonymous istream_iterator adaptors):
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>
    #include <algorithm>
    using namespace std;

    int main()
    {
        vector<string> vs;
        copy(istream_iterator<string>(cin), istream_iterator<string>(),
             back_inserter(vs));
        for
        (
            vector<string>::const_iterator begin = vs.begin(), end = vs.end();
                begin != end; ++begin
        )
            cout << *begin << ' ';
        cout << '\n';
    }
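Istream_iterators can be combined with other generic algorithms as well. As a minimal sketch (sumOfInts is a name made up for this illustration) the int values that can be extracted from a stream are added using accumulate:

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <numeric>
#include <sstream>
using namespace std;

// sumOfInts is an illustrative helper: it extracts int values from in
// until extraction fails (e.g., at end-of-stream) and returns their sum
int sumOfInts(istream &in)
{
    return accumulate(istream_iterator<int>(in), istream_iterator<int>(), 0);
}
```

The function may be called with any istream, e.g., with cin or with an istringstream that was initialized with the text "1 2 3 4".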
18.2.3.1 Iterators for ‘istreambuf’ objects
Input iterators are also available for streambuf objects.
To read from streambuf objects supporting input operations istreambuf_iterators can be used,
supporting the operations that are also available for istream_iterator. Different from the latter
iterator type istreambuf_iterators support three constructors:
• istreambuf_iterator<Type>:
    The end iterator of an iterator range is created using the default
    istreambuf_iterator constructor. It represents the end-of-stream condition
    when extracting values of type Type from the streambuf.
• istreambuf_iterator<Type>(streambuf *):
    A pointer to a streambuf may be used when defining an istreambuf_iterator.
    It represents the begin iterator of an iterator range.
• istreambuf_iterator<Type>(istream):
    An istream may also be used when defining an istreambuf_iterator. It accesses
    the istream’s streambuf and it also represents the begin iterator of an iterator
    range.
In section 18.2.4.1 an example is given using both istreambuf_iterators and
ostreambuf_iterators.
18.2.4 Iterators for ‘ostream’ objects
An ostream_iterator<Type> adaptor can be used to pass an ostream to algorithms expecting
an OutputIterator. Two constructors are available for defining ostream_iterators:
    ostream_iterator<Type> identifier(ostream &outStream);
    ostream_iterator<Type> identifier(ostream &outStream, char const *delim);
Type is the type of the data elements that should be inserted into an ostream. It may be any type for
which operator<< is defined in combination with ostream objects. The latter constructor can be
used to separate the individual Type data elements by delimiter strings. The former constructor
does not use any delimiters.
The example shows how istream_iterators and an ostream_iterator may be used to
copy information of a file to another file. A subtlety here is that you probably want to use
cin.unsetf(ios::skipws). It is used to clear the ios::skipws flag. As a consequence white
space characters are simply returned by the extraction operator, and the file is copied character
by character.
Here is the program:
    #include <iostream>
    #include <algorithm>
    #include <iterator>
    using namespace std;

    int main()
    {
        cin.unsetf(ios::skipws);
        copy(istream_iterator<char>(cin), istream_iterator<char>(),
             ostream_iterator<char>(cout));
    }
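The constructor expecting a delimiter is not used in the above program. As a small sketch of its effect (joined is a name chosen for this illustration), the elements of a vector are inserted into an ostringstream, where each element, including the last one, is followed by the delimiter:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// joined is an illustrative helper: each value is inserted into an
// ostringstream, followed by the delimiter string ", "
string joined(vector<int> const &values)
{
    ostringstream out;
    copy(values.begin(), values.end(), ostream_iterator<int>(out, ", "));
    return out.str();
}
```

Note that the delimiter is written after every element, so a trailing delimiter remains at the end of the returned string.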
18.2.4.1 Iterators for ‘ostreambuf’ objects
Output iterators are also available for streambuf objects.
To write to streambuf objects supporting output operations ostreambuf_iterators
can be used, supporting the operations that are also available for ostream_iterator.
Ostreambuf_iterators support two constructors:
• ostreambuf_iterator<Type>(streambuf *):
    A pointer to a streambuf may be used when defining an ostreambuf_iterator.
    It can be used as an OutputIterator.
• ostreambuf_iterator<Type>(ostream):
    An ostream may also be used when defining an ostreambuf_iterator. It accesses
    the ostream’s streambuf and it can also be used as an OutputIterator.
The next example illustrates the use of both istreambuf_iterators and
ostreambuf_iterators when copying a stream in yet another way. Since the stream’s
streambufs are directly accessed the streams and stream flags are bypassed. Consequently there
is no need to clear ios::skipws as in the previous section, while the next program’s efficiency
probably also exceeds the efficiency of the program shown in the previous section.
#include <iostream>
#include <algorithm>
#include <iterator>
using namespace std;
int main()
{
istreambuf_iterator<char> in(cin.rdbuf());
istreambuf_iterator<char> eof;
ostreambuf_iterator<char> out(cout.rdbuf());
copy(in, eof, out);
}
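The difference between the two copying approaches can be sketched using string streams instead of cin and cout. The function names copyViaStreambuf and copyViaStream are assumptions made for this illustration; the second function deliberately does not clear ios::skipws:

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Copies text through the stream buffers: the stream flags are
// bypassed, so white space is preserved.
std::string copyViaStreambuf(std::string const &text)
{
    std::istringstream in(text);
    std::ostringstream out;
    std::copy(std::istreambuf_iterator<char>(in.rdbuf()),
              std::istreambuf_iterator<char>(),
              std::ostreambuf_iterator<char>(out.rdbuf()));
    return out.str();
}

// Copies text through the streams themselves without clearing
// ios::skipws: white space characters are skipped.
std::string copyViaStream(std::string const &text)
{
    std::istringstream in(text);
    std::ostringstream out;
    std::copy(std::istream_iterator<char>(in),
              std::istream_iterator<char>(),
              std::ostream_iterator<char>(out));
    return out.str();
}
```

With the input "a b\nc" the first function returns the text unchanged, whereas the second returns "abc".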
18.3 The class ’unique_ptr’
Before using the unique_ptr class presented in this section the <memory> header file must be
included.
When pointers are used to access dynamically allocated memory strict bookkeeping is required to
prevent memory leaks. When a pointer variable referring to dynamically allocated memory goes out
of scope, the dynamically allocated memory becomes inaccessible and the program suffers from a
memory leak.
To prevent such memory leaks strict bookkeeping is required: the programmer has to make sure that
the dynamically allocated memory is returned to the common pool just before the pointer variable
goes out of scope.
When a pointer variable points to a dynamically allocated single value or object, bookkeeping requirements
are greatly simplified when the pointer variable is defined as a std::unique_ptr object.
Unique_ptrs are objects masquerading as pointers. Since they are objects, their destructors are
called when they go out of scope. Their destructors automatically delete the dynamically allocated
memory.
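The automatic deletion can be made visible with a small sketch. The class Tracker and the function aliveInsideScope are hypothetical names introduced for this illustration only:

```cpp
#include <memory>

// Hypothetical class counting how many of its objects currently exist.
struct Tracker
{
    static int s_alive;
    Tracker()  { ++s_alive; }
    ~Tracker() { --s_alive; }
};
int Tracker::s_alive = 0;

// Returns the number of live Tracker objects while the unique_ptr is in
// scope. No explicit delete is needed: when up goes out of scope its
// destructor deletes the Tracker.
int aliveInsideScope()
{
    std::unique_ptr<Tracker> up(new Tracker);
    return Tracker::s_alive;
}
```

After aliveInsideScope has returned, Tracker::s_alive is back at zero although the function never calls delete.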
Unique_ptrs have some special characteristics:
• when assigning a unique_ptr to another move semantics is used. If move semantics is not
available compilation fails. On the other hand, if compilation succeeds then the used containers
or generic algorithms support the use of unique_ptrs. Here is an example:
std::unique_ptr<int> up1(new int);
std::unique_ptr<int> up2(up1); // compilation error
The second definition fails to compile as unique_ptr’s copy constructor is deleted (the same
holds true for the copy assignment operator). But the unique_ptr class does offer facilities to
initialize and assign from rvalue references:
class unique_ptr // interface partially shown
{
    public:
        unique_ptr(unique_ptr &&other); // rvalues bind here
        unique_ptr(unique_ptr const &other) = delete; // no copying
};
In the next example move semantics is used and so it compiles correctly:
unique_ptr<int> cp(unique_ptr<int>(new int));
• a unique_ptr object should only point to memory that was made available dynamically, as
only dynamically allocated memory can be deleted.
• multiple unique_ptr objects should not be allowed to point to the same block of dynamically
allocated memory. The unique_ptr’s interface was designed to prevent this from happening.
Once a unique_ptr object goes out of scope, it deletes the memory it points to, immediately
changing any other object also pointing to the allocated memory into a wild pointer.
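• When a class Derived is derived from Base, then a newly allocated Derived class object can
be stored in a smart pointer to Base. With a shared_ptr<Base> (see section 18.4) this does not
require a virtual destructor for Base, as the shared_ptr remembers the type of the pointer it
was constructed with. A unique_ptr<Base> using its default deleter, however, deletes the
object through a Base *, so in that case Base needs a virtual destructor, or a deleter must be
provided that deletes a Derived *. The Base * pointer that is returned by the smart pointer’s
get member can simply be cast statically to Derived *, as shown in the following example:
class Base
{ ... };
class Derived: public Base
{
    // assume Derived has a member void process()
};
int main()
{
    shared_ptr<Base> bp(new Derived);
    static_cast<Derived *>(bp.get())->process(); // OK!
} // here ~Derived is called: no polymorphism required.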
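The following sketch counts calls of Derived’s destructor when Base has no virtual destructor. The names derivedDeleter, useShared and useUnique are made up for this illustration. It assumes that the unique_ptr<Base> receives an explicit deleter deleting a Derived *, which is one way to make sure ~Derived runs; the shared_ptr<Base> variant relies on shared_ptr remembering the type of the pointer that was passed to its constructor:

```cpp
#include <memory>

struct Base                     // deliberately no virtual destructor
{};

struct Derived: public Base
{
    static int s_destroyed;     // counts ~Derived calls
    ~Derived() { ++s_destroyed; }
    int process() const { return 42; }
};
int Derived::s_destroyed = 0;

// Deleter for unique_ptr<Base>: deletes the object as a Derived.
void derivedDeleter(Base *bp)
{
    delete static_cast<Derived *>(bp);
}

int useShared()
{
    std::shared_ptr<Base> bp(new Derived);
    return static_cast<Derived *>(bp.get())->process();
}                               // ~Derived is called here

int useUnique()
{
    std::unique_ptr<Base, void (*)(Base *)> bp(new Derived, derivedDeleter);
    return static_cast<Derived *>(bp.get())->process();
}                               // derivedDeleter calls ~Derived here
```

Each call increments Derived::s_destroyed by one, showing that the complete Derived object is destroyed in both cases.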
The class unique_ptr offers several member functions to access the pointer itself or to have a
unique_ptr point to another block of memory. These member functions (and unique_ptr constructors)
are introduced in the next few sections.
A unique_ptr (as well as a shared_ptr, see section 18.4) can be used as a safe alternative to
the deprecated auto_ptr (which was removed by the C++17 standard). Unique_ptr also improves
on auto_ptr: it can be used with containers and (generic) algorithms, and it supports
customizable deleters. Arrays can also be handled by
unique_ptrs.
18.3.1 Defining ‘unique_ptr’ objects
There are three ways to define unique_ptr objects. Each definition contains the usual <type>
specifier between angle brackets:
• The default constructor simply creates a unique_ptr object that does not point to a particular
block of memory. Its pointer is initialized to 0 (zero):
unique_ptr<type> identifier;
This form is discussed in section 18.3.2.
• The move constructor initializes a unique_ptr object. Following the use of the move constructor
its unique_ptr argument no longer points to the dynamically allocated memory and
its pointer data member is turned into a zero-pointer:
unique_ptr<type> identifier(another unique_ptr for type);
This form is discussed in section 18.3.3.
• The form that is used most often initializes a unique_ptr object to the block of dynamically allocated
memory that is passed to the object’s constructor. Optionally a deleter can be provided.
A (free) function (or function object) receiving the unique_ptr’s pointer as its argument can
be passed as deleter. It is supposed to return the dynamically allocated memory to the common
pool (doing nothing if the pointer equals zero).
unique_ptr<type> identifier (new-expression [, deleter]);
This form is discussed in section 18.3.4.
18.3.2 Creating a plain ‘unique_ptr’
Unique_ptr’s default constructor defines a unique_ptr not pointing to a particular block of memory:
unique_ptr<type> identifier;
The pointer controlled by the unique_ptr object is initialized to 0 (zero). Although the unique_ptr
object itself is not the pointer, its value can be compared to 0. Example:
unique_ptr<int> ip;
if (!ip)
cout << "0-pointer with a unique_ptr object\n";
Alternatively, the member get can be used (cf. section 18.3.5).
18.3.3 Moving another ‘unique_ptr’
A unique_ptr may be initialized using an rvalue reference to a unique_ptr object for the same
type:
unique_ptr<type> identifier(other unique_ptr object);
The move constructor is used, e.g., in the following example:
void mover(unique_ptr<string> &&param)
{
unique_ptr<string> tmp(move(param));
}
Analogously, the assignment operator can be used. A temporary unique_ptr object of the same type
may be assigned to a unique_ptr object (again move-semantics is used). For example:
#include <iostream>
#include <memory>
#include <string>
using namespace std;
int main()
{
unique_ptr<string> hello1(new string("Hello world"));
unique_ptr<string> hello2(move(hello1));
unique_ptr<string> hello3;
hello3 = move(hello2);
cout << // *hello1 << '\n' << // would have segfaulted
// *hello2 << '\n' << // same
*hello3 << '\n';
}
// Displays: Hello world
The example illustrates that
• hello1 is initialized by a pointer to a dynamically allocated string (see the next section).
• The unique_ptr hello2 grabs the pointer controlled by hello1 using a move constructor.
This effectively changes hello1 into a 0-pointer.
• Then hello3 is defined as a default unique_ptr<string>. But then it grabs its value using
move-assignment from hello2 (which, as a consequence, is changed into a 0-pointer as well).
If hello1 or hello2 had been inserted into cout the program’s behavior would have been undefined,
in practice usually resulting in a segmentation fault. The
reason for this should now be clear: it is caused by dereferencing 0-pointers. In the end, only hello3
actually points to the originally allocated string.
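Whether a unique_ptr has become a 0-pointer can safely be tested without dereferencing it. A minimal sketch, in which the struct MoveResult and the function moveAndInspect are hypothetical names used for illustration only:

```cpp
#include <memory>
#include <string>
#include <utility>

struct MoveResult
{
    bool sourceEmpty;       // true if the moved-from pointer is a 0-pointer
    std::string text;       // copy of the string owned by the destination
};

MoveResult moveAndInspect()
{
    std::unique_ptr<std::string> src(new std::string("Hello world"));
    std::unique_ptr<std::string> dest(std::move(src));

    MoveResult result;
    result.sourceEmpty = !src;  // testing is fine, dereferencing is not
    result.text = *dest;
    return result;
}
```

The returned sourceEmpty member is true, while text holds the string that is now controlled by dest.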
18.3.4 Pointing to a newly allocated object
A unique_ptr is most often initialized using a pointer to dynamically allocated memory. The
generic form is:
unique_ptr<type [, deleter_type]> identifier(new-expression
[, deleter = deleter_type()]);
The second (template) argument (deleter(_type)) is optional and may refer to a free function or
function object handling the destruction of the allocated memory. A deleter is used, e.g., in situations
where a double pointer is allocated and the destruction must visit each nested pointer to destroy the
allocated memory (see below for an illustration).
Here is an example initializing a unique_ptr pointing to a string object:
unique_ptr<string> strPtr(new string("Hello world"));
The argument that is passed to the constructor is the pointer returned by operator new. Note
that type does not mention the pointer. The type that is used in the unique_ptr construction is
the same as the type that is used in new expressions.
Here is an example showing how an explicitly defined deleter may be used to delete a dynamically
allocated array of pointers to strings:
#include <iostream>
#include <string>
#include <memory>
using namespace std;
struct Deleter
{
size_t d_size;
Deleter(size_t size = 0)
:
d_size(size)
{}
void operator()(string **ptr) const
{
for (size_t idx = 0; idx < d_size; ++idx)
delete ptr[idx];
delete[] ptr;
}
};
int main()
{
unique_ptr<string *, Deleter> sp2(new string *[10](), Deleter(10)); // (): all pointers start as 0
Deleter &obj = sp2.get_deleter();
}
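Since the deleter is stored inside the unique_ptr, get_deleter can also be used to inspect the deleter’s state. The following sketch assumes a hypothetical CountingDeleter that merely counts how often it has been called:

```cpp
#include <memory>

// Hypothetical deleter remembering how often it was called.
struct CountingDeleter
{
    int d_calls = 0;
    void operator()(int *ptr)
    {
        ++d_calls;
        delete ptr;
    }
};

int callsAfterReset()
{
    std::unique_ptr<int, CountingDeleter> up(new int(5));
    up.reset(new int(6));               // the deleter destroys the first int
    return up.get_deleter().d_calls;    // one call so far
}
```

The second int is deleted by the same deleter object when up goes out of scope at the end of callsAfterReset.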
A unique_ptr can be used to reach the member functions that are available for objects allocated by
the new expression. These members can be reached as if the unique_ptr was a plain pointer to the
dynamically allocated object. For example, in the following program the text ‘C++’ is inserted behind
the word ‘hello’:
#include <iostream>
#include <memory>
#include <string>
#include <cstring>
using namespace std;
int main()
{
    unique_ptr<string> sp(new string("Hello world"));
    cout << *sp << '\n';
    sp->insert(strlen("Hello "), "C++ ");
    cout << *sp << '\n';
}
/*
Displays:
Hello world
Hello C++ world
*/
18.3.5 Operators and members
The class unique_ptr offers the following operators:
• unique_ptr<Type> &operator=(unique_ptr<Type> &&tmp):
This operator transfers the memory pointed to by the rvalue unique_ptr object to
the lvalue unique_ptr object using move semantics. So, the rvalue object loses the
memory it pointed at and turns into a 0-pointer. An existing unique_ptr may be
assigned to another unique_ptr by converting it to an rvalue reference first using
std::move. Example:
unique_ptr<int> ip1(new int);
unique_ptr<int> ip2;
ip2 = std::move(ip1);
• operator bool() const:
This operator returns false if the unique_ptr does not point to memory (i.e., its
get member, see below, returns 0). Otherwise, true is returned.
• Type &operator*():
This operator returns a reference to the information accessible via a unique_ptr
object. It acts like a normal pointer dereference operator.
• Type *operator->():
This operator returns a pointer to the information accessible via a unique_ptr
object. This operator allows you to select members of an object accessible via a
unique_ptr object. Example:
unique_ptr<string> sp(new string("hello"));
cout << sp->c_str();
The class unique_ptr supports the following member functions:
• Type *get():
A pointer to the information controlled by the unique_ptr object is returned. It
acts like operator->. The returned pointer can be inspected. If it is zero the
unique_ptr object does not point to any memory.
• Deleter &unique_ptr<Type>::get_deleter():
A reference to the deleter object used by the unique_ptr is returned.
• Type *release():
A pointer to the information accessible via a unique_ptr object is returned. At the
same time the object itself becomes a 0-pointer (i.e., its pointer data member is turned
into a 0-pointer). This member can be used to transfer the information accessible
via a unique_ptr object to a plain Type pointer. After calling this member the
proper destruction of the dynamically allocated memory is the responsibility of the
programmer.
• void reset(Type *):
The dynamically allocated memory controlled by the unique_ptr object is returned
to the common pool; the object thereupon controls the memory to which the argument
that is passed to the function points. It can also be called without argument, turning
the object into a 0-pointer. This member function can be used to assign a new block
of dynamically allocated memory to a unique_ptr object.
• void swap(unique_ptr<Type> &):
Two identically typed unique_ptrs are swapped.
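The members swap, release, get and reset can be combined as in the following sketch (the function name demoMembers is an assumption made for this illustration):

```cpp
#include <memory>

int demoMembers()
{
    std::unique_ptr<int> a(new int(1));
    std::unique_ptr<int> b(new int(2));

    a.swap(b);                      // a now owns the 2, b owns the 1

    int *raw = b.release();         // b becomes a 0-pointer, raw owns the 1
    bool bIsEmpty = b.get() == 0;

    a.reset(raw);                   // the 2 is deleted, a now owns the 1
    return bIsEmpty ? *a : -1;
}
```

Because the pointer returned by release is immediately handed to reset, no explicit delete is needed and the function returns 1.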
18.3.6 Using ‘unique_ptr’ objects for arrays
When a unique_ptr is used to store arrays the dereferencing operator makes little sense but with
arrays unique_ptr objects benefit from index operators. The distinction between a single object
unique_ptr and a unique_ptr referring to a dynamically allocated array of objects is realized
through a template specialization.
With dynamically allocated arrays the following syntax is available:
• the index ([]) notation is used to specify that the smart pointer controls a dynamically allocated
array. Example:
unique_ptr<int[]> intArr(new int[3]);
• the index operator can be used to access the array’s elements. Example:
intArr[2] = intArr[0];
In these cases the smart pointer’s destructor calls delete[] rather than delete.
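As a small sketch of the array specialization (the function name sumOfSquares is hypothetical), a dynamically allocated array can be filled and read using the index operator, without an explicit delete[]:

```cpp
#include <cstddef>
#include <memory>

int sumOfSquares(std::size_t count)
{
    std::unique_ptr<int[]> squares(new int[count]);
    for (std::size_t idx = 0; idx != count; ++idx)
        squares[idx] = static_cast<int>(idx * idx);

    int sum = 0;
    for (std::size_t idx = 0; idx != count; ++idx)
        sum += squares[idx];
    return sum;
}                                   // delete[] is called here
```

For example, sumOfSquares(4) adds 0, 1, 4 and 9.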
18.3.7 The deprecated class ’auto_ptr’
This class was deprecated by the C++11 standard and removed by the C++17 standard.
The smart pointer class std::auto_ptr<Type> has traditionally been offered by C++. This class
does not support move semantics, but when an auto_ptr object is assigned to another, the righthand
object loses its information.
The class unique_ptr does not have auto_ptr’s drawbacks and consequently using auto_ptr is
now deprecated. Auto_ptrs suffer from the following drawbacks:
• they do not support move semantics;
• they should not be used to point to arrays;
• they cannot be used as data types of abstract containers.
Because of its drawbacks and available replacements the auto_ptr class is no longer covered by
the C++ Annotations. Existing software should be modified to use smart pointers (unique_ptrs
or shared_ptrs) and new software should, where applicable, directly be implemented in terms of
these new smart pointer types.
18.4 The class ’shared_ptr’
In addition to the class unique_ptr the class std::shared_ptr<Type> is available, which is a
reference counting smart pointer.
Before using shared_ptrs the <memory> header file must be included.
The shared pointer automatically destroys its contents once its reference count has decayed to zero.
When defining a shared_ptr<Base> to store a newly allocated Derived
class object, the returned Base * may be cast to a Derived * using a static_cast: polymorphism
isn’t required, and when resetting the shared_ptr or when the shared_ptr goes out of scope, no
slicing occurs, and Derived’s destructor is called (cf. section 18.3).
Shared_ptrs support copy and move constructors as well as standard and move overloaded assignment
operators.
Like unique_ptrs, shared_ptrs may refer to dynamically allocated arrays.
18.4.1 Defining ‘shared_ptr’ objects
There are four ways to define shared_ptr objects. Each definition contains the usual <type>
specifier between angle brackets:
• The default constructor simply creates a shared_ptr object that does not point to a particular
block of memory. Its pointer is initialized to 0 (zero):
shared_ptr<type> identifier;
This form is discussed in section 18.4.2.
• The copy constructor initializes a shared_ptr so that both objects share the memory pointed
at by the existing object. The copy constructor also increments the shared_ptr’s reference
count. Example:
shared_ptr<string> org(new string("hi there"));
shared_ptr<string> copy(org); // reference count now 2
• The move constructor initializes a shared_ptr with the pointer and reference count of a
temporary shared_ptr. The temporary shared_ptr is changed into a 0-pointer. An existing
shared_ptr may have its data moved to a newly defined shared_ptr (turning the
existing shared_ptr into a 0-pointer as well). In the next example a temporary, anonymous
shared_ptr object is constructed, which is then used to construct grabber. Since grabber’s
constructor receives an anonymous temporary object, the compiler uses shared_ptr’s move
constructor:
shared_ptr<string> grabber(shared_ptr<string>(new string("hi there")));
• The form that is used most often initializes a shared_ptr object to the block of dynamically allocated
memory that is passed to the object’s constructor. Optionally a deleter can be provided.
A (free) function (or function object) receiving the shared_ptr’s pointer as its argument can
be passed as deleter. It is supposed to return the dynamically allocated memory to the common
pool (doing nothing if the pointer equals zero).
shared_ptr<type> identifier (new-expression [, deleter]);
This form is discussed in section 18.4.3.
18.4.2 Creating a plain ‘shared_ptr’
Shared_ptr’s default constructor defines a shared_ptr not pointing to a particular block of memory:
shared_ptr<type> identifier;
The pointer controlled by the shared_ptr object is initialized to 0 (zero). Although the shared_ptr
object itself is not the pointer, its value can be compared to 0. Example:
shared_ptr<int> ip;
if (!ip)
cout << "0-pointer with a shared_ptr object\n";
Alternatively, the member get can be used (cf. section 18.4.4).
18.4.3 Pointing to a newly allocated object
Most often a shared_ptr is initialized by a dynamically allocated block of memory. The generic
form is:
shared_ptr<type> identifier(new-expression [, deleter]);
The second argument (deleter) is optional and refers to a function object or free function handling
the destruction of the allocated memory. A deleter is used, e.g., in situations where a double pointer
is allocated and the destruction must visit each nested pointer to destroy the allocated memory (see
below for an illustration). It is used in situations comparable to those encountered with unique_ptr
(cf. section 18.3.4).
Here is an example initializing a shared_ptr pointing to a string object:
shared_ptr<string> strPtr(new string("Hello world"));
The argument that is passed to the constructor is the pointer returned by operator new. Note
that type does not mention the pointer. The type that is used in the shared_ptr construction is
the same as the type that is used in new expressions.
The next example illustrates that two shared_ptrs indeed share their information. After modifying
the information controlled by one of the objects the information controlled by the other object is
modified as well:
#include <iostream>
#include <memory>
#include <string>
#include <cstring>
using namespace std;
int main()
{
    shared_ptr<string> sp(new string("Hello world"));
    shared_ptr<string> sp2(sp);
    sp->insert(strlen("Hello "), "C++ ");
    cout << *sp << '\n' <<
            *sp2 << '\n';
}
/*
Displays:
Hello C++ world
Hello C++ world
*/
18.4.4 Operators and members
The class shared_ptr offers the following operators:
• shared_ptr &operator=(shared_ptr<Type> const &other):
Copy assignment: the reference count of the operator’s left hand side operand is
reduced. If the reference count decays to zero the dynamically allocated memory
controlled by the left hand side operand is deleted. Then it shares the information
with the operator’s right hand side operand, incrementing the information’s reference
count.
• shared_ptr &operator=(shared_ptr<Type> &&tmp):
Move assignment: the reference count of the operator’s left hand side operand is
reduced. If the reference count decays to zero the dynamically allocated memory
controlled by the left hand side operand is deleted. Then it grabs the information
controlled by the operator’s right hand side operand which is turned into a 0-pointer.
• operator bool() const:
If the shared_ptr actually points to memory true is returned, otherwise, false is
returned.
• Type &operator*():
A reference to the information stored in the shared_ptr object is returned. It acts
like a normal pointer dereference operator.
• Type *operator->():
A pointer to the information controlled by the shared_ptr object is returned. Example:
shared_ptr<string> sp(new string("hello"));
cout << sp->c_str() << '\n';
The following member functions are supported:
• Type *get():
A pointer to the information controlled by the shared_ptr object is returned. It
acts like operator->. The returned pointer can be inspected. If it is zero the
shared_ptr object does not point to any memory.
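• Deleter *std::get_deleter<Deleter>(shared_ptr<Type> const &ptr):
This is a free function template rather than a member function. A pointer to ptr’s
deleter (function or function object) is returned if that deleter has type Deleter;
otherwise 0 is returned.
• void reset(Type *):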
The reference count of the information controlled by the shared_ptr object is reduced
and if it decays to zero the memory it points to is deleted. Thereafter the
object’s information will refer to the argument that is passed to the function, setting
its shared count to 1. It can also be called without argument, turning the object into
a 0-pointer. This member function can be used to assign a new block of dynamically
allocated memory to a shared_ptr object.
• void swap(shared_ptr<Type> &):
Two identically typed shared_ptrs are swapped.
• bool unique() const:
If the current object is the only object referring to the memory controlled by the object
true is returned, otherwise (including the situation where the object is a 0-pointer)
false is returned. This member was deprecated by the C++17 standard and removed by the
C++20 standard; comparing use_count() to 1 offers the same test.
• long use_count() const:
The number of objects sharing the memory controlled by the object is returned.
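The reference counting performed by the copy constructor, by destructors and by reset can be observed through use_count. A minimal sketch (the function name observeUseCount is an assumption made for this illustration):

```cpp
#include <memory>
#include <string>
#include <vector>

std::vector<long> observeUseCount()
{
    std::vector<long> counts;
    std::shared_ptr<std::string> sp(new std::string("hi there"));
    counts.push_back(sp.use_count());           // only sp: 1
    {
        std::shared_ptr<std::string> copy(sp);
        counts.push_back(sp.use_count());       // sp and copy: 2
    }
    counts.push_back(sp.use_count());           // copy is gone: 1
    sp.reset();
    counts.push_back(sp.use_count());           // 0-pointer: 0
    return counts;
}
```

The string itself is deleted by the call to reset, as sp was its only remaining owner at that point.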
18.4.5 Casting shared pointers
Be cautious when using standard C++ style casts in combination with shared_ptr objects. Consider
the following two classes:
struct Base
{};
struct Derived: public Base
{};
When defining a shared_ptr<Base> to store a newly allocated Derived
class object, the returned Base * may be cast to a Derived * using a static_cast: polymorphism
isn’t required, and when resetting the shared_ptr or when the shared_ptr goes out of scope, no
slicing occurs, and Derived’s destructor is called (cf. section 18.3).
Of course, a shared_ptr<Derived> can easily be defined. Since a Derived object is also a
Base object, a pointer to Derived can be considered a pointer to Base without using casts, but
a static_cast could be used to force the interpretation of a Derived * as a Base *:
Derived d;
static_cast<Base *>(&d);
However, a plain static_cast cannot be used when initializing a shared pointer to a Base using
the get member of a shared pointer to a Derived object. The following code snippet eventually
results in an attempt to delete the dynamically allocated Derived object twice:
shared_ptr<Derived> sd(new Derived);
shared_ptr<Base> sb(static_cast<Base *>(sd.get()));
Since sd and sb point at the same object but each maintains its own reference count, the object
is destroyed both when sb goes out of scope and when sd goes out of scope, typically resulting in
premature termination of the program due to a double free error.
These errors can be prevented using casts that were specifically designed for being used with
shared_ptrs. These casts use specialized constructors that create a shared_ptr pointing to memory
while sharing ownership (i.e., a reference count) with an existing shared_ptr. These special casts
are:
• std::static_pointer_cast<Base>(std::shared_ptr<Derived> ptr):
A shared_ptr to a Base class object is returned. The returned shared_ptr refers
to the base class portion of the Derived class to which the shared_ptr<Derived>
ptr refers. Example:
shared_ptr<Derived> dp(new Derived());
shared_ptr<Base> bp = static_pointer_cast<Base>(dp);
• std::const_pointer_cast<Class>(std::shared_ptr<Class const> ptr):
A shared_ptr to a Class class object is returned. The returned shared_ptr refers
to a non-const Class object whereas the ptr argument refers to a Class const
object. Example:
shared_ptr<Derived const> cp(new Derived());
shared_ptr<Derived> ncp = const_pointer_cast<Derived>(cp);
• std::dynamic_pointer_cast<Derived>(std::shared_ptr<Base> ptr):
A shared_ptr to a Derived class object is returned. The Base class must have
at least one virtual member function, and the class Derived, inheriting from Base
may have overridden Base’s virtual member(s). The returned shared_ptr refers to
a Derived class object if the dynamic cast from Base * to Derived * succeeded. If
the dynamic cast did not succeed the shared_ptr’s get member returns 0. Example
(assume Derived and Derived2 were derived from Base):
shared_ptr<Base> bp(new Derived());
cout << dynamic_pointer_cast<Derived>(bp).get() << ' ' <<
dynamic_pointer_cast<Derived2>(bp).get() << '\n';
The first get returns a non-0 pointer value, the second get returns 0.
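That the pointer casts share ownership with the original shared_ptr, instead of starting a second reference count, can be sketched as follows. The struct CastResult and the function tryCasts are hypothetical names; Base is given a virtual destructor so that the dynamic casts are possible:

```cpp
#include <memory>

struct Base
{
    virtual ~Base() {}      // dynamic_pointer_cast needs a polymorphic Base
};
struct Derived: public Base {};
struct Derived2: public Base {};

struct CastResult
{
    bool derivedOk;     // cast to the object's actual type succeeded
    bool derived2Ok;    // cast to another derived type succeeded
    long useCount;      // owners of the Derived object after the casts
};

CastResult tryCasts()
{
    std::shared_ptr<Base> bp(new Derived);
    std::shared_ptr<Derived> dp = std::dynamic_pointer_cast<Derived>(bp);
    std::shared_ptr<Derived2> dp2 = std::dynamic_pointer_cast<Derived2>(bp);

    CastResult result;
    result.derivedOk = dp.get() != 0;
    result.derived2Ok = dp2.get() != 0;
    result.useCount = bp.use_count();   // bp and dp share one count
    return result;
}
```

The failed cast yields an empty shared_ptr that does not take part in the ownership, so only bp and dp are counted.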
18.4.6 Using ‘shared_ptr’ objects for arrays
Before the C++17 standard, and different from the unique_ptr class, no specialization existed for
the shared_ptr class to handle dynamically allocated arrays of objects (since C++17 a
shared_ptr<Type[]> can be used for that).
But like unique_ptrs, with shared_ptrs referring to arrays the dereferencing operator makes
little sense while in these circumstances shared_ptr objects would benefit from index operators.
It is not difficult to create a class shared_array offering such facilities. The class template
shared_array, derived from shared_ptr, merely should provide an appropriate deleter to make
sure that the array and its elements are properly destroyed. In addition it should define the index
operator and optionally could declare the dereferencing operators using delete.
Here is an example showing how shared_array can be defined and used:
#include <iostream>
#include <memory>
#include <cstddef>
using namespace std;
struct X
{
    ~X()
    {
        cout << "destr\n"; // show the object's destruction
    }
};
template <typename Type>
class shared_array: public shared_ptr<Type>
{
struct Deleter // Deleter receives the pointer
{ // and calls delete[]
void operator()(Type* ptr)
{
delete[] ptr;
}
};
public:
shared_array(Type *p) // other constructors
: // not shown here
shared_ptr<Type>(p, Deleter())
{}
Type &operator[](size_t idx) // index operators
{
return shared_ptr<Type>::get()[idx];
}
Type const &operator[](size_t idx) const
{
return shared_ptr<Type>::get()[idx];
}
Type &operator*() = delete; // delete pointless members
Type const &operator*() const = delete;
Type *operator->() = delete;
Type const *operator->() const = delete;
};
int main()
{
shared_array<X> sp(new X[3]);
sp[0] = sp[1];
}
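When the index operators are not needed, a lighter alternative to a shared_array class is to pass an array deleter directly to a plain shared_ptr. The following sketch assumes std::default_delete<int[]> as that deleter; the function name lastOfThree is hypothetical:

```cpp
#include <memory>

int lastOfThree()
{
    // assumption: no shared_array class, just a deleter calling delete[]
    std::shared_ptr<int> arr(new int[3]{ 10, 20, 30 },
                             std::default_delete<int[]>());
    return arr.get()[2];        // elements are reached via get
}                               // delete[] is called here
```

The drawback of this approach is that elements must be accessed through get, which is exactly what shared_array’s index operators hide.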
18.5 Smart smart pointer construction: ‘make_shared’ and ‘make_unique’
Usually a shared_ptr is initialized at definition time with a pointer to a newly allocated object.
Here is an example:
std::shared_ptr<std::string> sptr(new std::string("hello world"));
In such statements two memory allocation calls are used: one for the allocation of the std::string
and one used internally by std::shared_ptr’s constructor itself.
The two allocations can be combined into one single allocation (which is also slightly more efficient
than explicitly calling shared_ptr’s constructor) using the make_shared template. The function
template std::make_shared has the following prototype:
template<typename Type, typename ...Args>
std::shared_ptr<Type> std::make_shared(Args &&...args);
Before using make_shared the <memory> header file must be included.
This function template allocates an object of type Type, passing args to its constructor (using perfect
forwarding, see section 22.5.2), and returns a shared_ptr initialized with the address of the newly
allocated Type object.
Here is how the above sptr object can be initialized using std::make_shared. Notice the use of
auto which frees us from having to specify sptr’s type explicitly:
auto sptr(std::make_shared<std::string>("hello world"));
After this initialization std::shared_ptr<std::string> sptr has been defined and initialized.
It could be used as follows:
std::cout << *sptr << '\n';
The C++14 standard also offers std::make_unique, which can be used like make_shared but
constructs a std::unique_ptr rather than a shared_ptr.
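Since make_shared forwards its arguments to Type’s constructor, constructors expecting multiple arguments can be used as well. A minimal sketch, assuming a hypothetical helper makeRepeated that uses std::string’s (count, character) constructor:

```cpp
#include <cstddef>
#include <memory>
#include <string>

std::shared_ptr<std::string> makeRepeated(std::size_t count, char ch)
{
    // the arguments are forwarded to std::string(count, ch)
    return std::make_shared<std::string>(count, ch);
}
```

For example, makeRepeated(3, 'x') returns a shared_ptr to a string containing three x characters, and the returned pointer is the string’s only owner.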
18.6 Classes having pointer data members
Classes having pointer data members require special attention. In particular at construction time
one must be careful to prevent wild pointers and/or memory leaks. Consider the following class
defining two pointer data members:
class Filter
{
istream *d_in;
ostream *d_out;
public:
Filter(char const *in, char const *out);
};
Assume that Filter objects filter information read from *d_in and write the filtered information
to *d_out. Using pointers to streams allows us to have them point at any kind of stream like
istreams, ifstreams, fstreams or istringstreams. The shown constructor could be implemented
like this:
Filter::Filter(char const *in, char const *out)
:
d_in(new ifstream(in)),
d_out(new ofstream(out))
{
if (!*d_in || !*d_out)
throw string("Input and/or output stream not available");
}
Of course, the construction could fail. new could throw an exception; the stream constructors could
throw exceptions; or the streams could not be opened in which case an exception is thrown from the
constructor’s body. Using a function try block helps. Note that if d_in’s initialization throws, there’s
nothing to be worried about. The Filter object hasn’t been constructed, its destructor is not
called and processing continues at the point where the thrown exception is caught. But Filter’s
destructor is also not called when d_out’s initialization or the constructor’s if statement throws:
no object, and hence no destructor is called. This may result in memory leaks, as delete isn’t called
for d_in and/or d_out. To prevent this, d_in and d_out must first be initialized to 0 and only then
the initialization can be performed:
Filter::Filter(char const *in, char const *out)
try
:
d_in(0),
d_out(0)
{
d_in = new ifstream(in);
d_out = new ofstream(out);
if (!*d_in || !*d_out)
throw string("Input and/or output stream not available");
}
catch (...)
{
delete d_out;
delete d_in;
}
This quickly gets complicated, though. If Filter harbors yet another data member of a class whose
constructor needs two streams then that data member cannot be constructed or it must itself be converted
into a pointer:
Filter::Filter(char const *in, char const *out)
try
:
d_in(0),
d_out(0),
d_filterImp(*d_in, *d_out) // won’t work
{ ... }
// instead:
Filter::Filter(char const *in, char const *out)
try
:
d_in(0),
d_out(0),
d_filterImp(0)
{
d_in = new ifstream(in);
d_out = new ofstream(out);
d_filterImp = new FilterImp(*d_in, *d_out);
...
}
catch (...)
{
delete d_filterImp;
delete d_out;
delete d_in;
}
Although the latter alternative works, it quickly gets hairy. In situations like these smart pointers
should be used to prevent the hairiness. By defining the stream pointers as (smart pointer)
objects they will, once constructed, properly be destroyed even if the rest of the constructor’s code
throws exceptions. Using a FilterImp and two unique_ptr data members Filter’s setup and its
constructor become:
class Filter
{
std::unique_ptr<std::ifstream> d_in;
std::unique_ptr<std::ofstream> d_out;
FilterImp d_filterImp;
...
};
Filter::Filter(char const *in, char const *out)
:
d_in(new ifstream(in)),
d_out(new ofstream(out)),
d_filterImp(*d_in, *d_out)
{
if (!*d_in || !*d_out)
throw string("Input and/or output stream not available");
}
We’re back at the original implementation but this time without having to worry about wild pointers
and memory leaks. If one of the member initializers throws the destructors of previously constructed
data members (which are now objects) are always called.
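This behavior can be sketched without using streams. In the following illustration all names (Resource, Thrower, Holder and noLeakWhenConstructionFails) are hypothetical: Holder’s second data member always throws during construction, and the function checks that the Resource owned by the first data member has nevertheless been destroyed:

```cpp
#include <memory>
#include <stdexcept>

struct Resource
{
    static int s_alive;             // number of existing Resource objects
    Resource()  { ++s_alive; }
    ~Resource() { --s_alive; }
};
int Resource::s_alive = 0;

struct Thrower
{
    Thrower() { throw std::runtime_error("construction failed"); }
};

class Holder
{
    std::unique_ptr<Resource> d_resource;   // constructed first
    Thrower d_thrower;                      // its constructor throws
    public:
        Holder()
        :
            d_resource(new Resource)
        {}
};

bool noLeakWhenConstructionFails()
{
    try
    {
        Holder holder;
    }
    catch (std::runtime_error const &)
    {
        return Resource::s_alive == 0;      // d_resource was destroyed
    }
    return false;                           // not reached: Holder throws
}
```

Had d_resource been a plain Resource * the allocated Resource would have leaked, as Holder’s destructor is not called for an object whose construction did not complete.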
As a rule of thumb: when classes need to define pointer data members they should define those
pointer data members as smart pointers if there’s any chance that their constructors throw exceptions.
18.7 Lambda expressions
C++ supports lambda expressions. As we’ll see in chapter 19 generic algorithms often accept arguments
that can either be function objects or plain functions. Examples are the sort (cf. section
19.1.58) and find_if (cf. section 19.1.16) generic algorithms. As a rule of thumb: when a called
function must remember its state a function object is appropriate, otherwise a plain function can be
used.
Frequently the function or function object is not readily available, and it must be defined in or
near the location where it is used. This is commonly realized by defining a class or function in the
anonymous namespace (say: class or function A), passing an A to the code needing A. If that code is
itself a member function of the class B, then A’s implementation might benefit from having access to
the members of class B.
This scheme usually results in a significant amount of code (defining the class), or it results in
complex code (to make available software elements that aren’t automatically accessible to A’s code).
It may also result in code that is irrelevant at the current level of specification. Nested classes don’t
solve these problems either. Moreover, nested classes can’t be used in templates.
Lambda expressions solve these problems. A lambda expression defines an anonymous function object
which may immediately be passed to functions expecting function object arguments, as explained in
the next few sections.
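To make the difference in the amount of code concrete, here is a minimal sketch comparing both approaches. The class LongerThan and the functions countClassic and countLambda are hypothetical names, introduced here for illustration; the capture specification between the square brackets is covered in section 18.7.1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The classic approach: a separately defined function object that
// remembers the length to compare against.
class LongerThan
{
    std::size_t d_length;

    public:
        explicit LongerThan(std::size_t length)
        :
            d_length(length)
        {}
        bool operator()(std::string const &str) const
        {
            return str.length() > d_length;
        }
};

std::size_t countClassic(std::vector<std::string> const &words,
                         std::size_t length)
{
    return std::count_if(words.begin(), words.end(), LongerThan(length));
}

// The same task, with the function object defined where it is used.
std::size_t countLambda(std::vector<std::string> const &words,
                        std::size_t length)
{
    return std::count_if(words.begin(), words.end(),
                [=](std::string const &str)     // uses a copy of length
                {
                    return str.length() > length;
                }
            );
}
```

Both functions return the same count; the lambda version merely avoids the separate class definition.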
18.7.1 Lambda expressions: syntax
A lambda expression defines an anonymous function object, also called a closure object. When a
lambda expression is evaluated it results in a temporary object (the closure object). The type of a
closure object is called its closure type.
Lambda expressions are used inside blocks, classes or namespaces (i.e., pretty much anywhere you
like). Their implied closure type is defined in the smallest block, class or namespace scope containing
the lambda expression. The closure object’s visibility starts at its point of definition and ends where
its closure type ends.
The closure type defines a (const) public inline function call operator. Here is an example of a
lambda expression:
[] // the ‘lambda-introducer’
(int x, int y) // the ‘lambda-declarator’
{ // a normal compound-statement
return x * y;
}
The function call operator of the closure object created by this lambda expression expects two int
arguments and returns their product. It is an inline const member of the closure type. To drop the
const attribute, the lambda expression should specify mutable, as follows:
[](int x, int y) mutable
...
The lambda-declarator may be omitted if no parameters are defined. In C++11 the parameters in a
lambda-declarator may not be given default arguments; C++14 lifted that restriction.
A closure object as defined by the above lambda expression could be used, e.g., in combination with the
accumulate (cf. section 19.1.1) generic algorithm to compute the product of a series of int values
stored in a vector:
cout << accumulate(vi.begin(), vi.end(), 1,
[](int x, int y) { return x * y; });
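Wrapped in a function, the above statement can be tried out directly. This is a minimal sketch; the function name product is hypothetical and not part of the original example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Returns the product of all values in vi (1 for an empty vector).
int product(std::vector<int> const &vi)
{
    return std::accumulate(vi.begin(), vi.end(), 1,
                [](int x, int y) { return x * y; });
}
```

The initial value 1 is passed to the lambda as x during its first call; thereafter x holds the product computed so far and y the next element.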
The above lambda function uses the implicit return type decltype(x * y). An implicit return
type can be used in these cases:
• the lambda expression does not contain a return statement (i.e., a void lambda expression);
• the lambda expression contains a single return statement; or
• the lambda expression contains multiple return statements returning values of identical
types (e.g., all int values).
If there are multiple return statements returning values of different types then the lambda expression’s
return type must be specified explicitly, using a late-specified return type (cf. section
3.3.5):
[](int x, int y) -> int
{
return y < 0 ?
x / static_cast<double>(y)
:
y + x;
}
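The following minimal sketch shows a lambda expression that really contains two return statements returning values of different types. The function ratioOrSum is a hypothetical name used for illustration only. Without the late-specified return type the compiler cannot deduce a single return type and rejects the lambda expression.

```cpp
#include <cassert>

// Returns x / y as a double, or x + y if y equals zero.
double ratioOrSum(int x, int y)
{
    auto compute =
        [](int lhs, int rhs) -> double      // explicit return type
        {
            if (rhs != 0)
                return lhs / static_cast<double>(rhs);  // a double
            return lhs + rhs;               // an int, converted to double
        };

    return compute(x, y);
}
```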
Variables that are visible at the location of a lambda expression can be accessed by the lambda
expression. How these variables are accessed depends on the contents of the lambda-introducer (the
area between the square brackets, called the lambda-capture). The lambda-capture allows passing
a local context to lambda expressions.
Visible global and static variables as well as local variables defined in the lambda expression’s compound
statement itself can directly be accessed and, when applicable, modified. Example:
int global;
void fun()
{
[]() // [] may contain any specification
{
int localVariable = 0;
localVariable = ++global;
};
}
When a lambda expression is defined inside a (non-static) class member function, using an initial
& or = character in the lambda-capture enables the this pointer, allowing the lambda expression
access to all class members (data and functions). In that case the lambda expression may modify
the class’s data members.
If a lambda expression is defined inside a function then the lambda expression may access all the
function’s local variables which are visible at the lambda expression’s point of definition.
An initial & character in the lambda-capture accesses these local variables by reference. These
variables can then be modified from within the lambda expression.
An initial = character in the lambda-capture creates a local copy of the referred-to local variables.
Note that in this case the values of these local copies can only be changed by the lambda expression
if the lambda expression is defined using the mutable keyword. E.g.,
struct Class
{
void fun()
{
int var = 0;
[=]() mutable
{
++var; // modifies the local
}; // copy, not fun’s var
}
};
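The effect of mutable on copied variables can be observed by calling the closure object several times. This is a minimal sketch; the function countFrom is a hypothetical name, not part of the example above.

```cpp
#include <cassert>
#include <utility>

// Calls a mutable lambda three times. Returns the value produced by the
// third call and the value of the function's own variable afterwards.
std::pair<int, int> countFrom(int start)
{
    auto next =
        [=]() mutable       // without mutable ++start doesn't compile
        {
            return ++start; // modifies the copy stored in the closure
        };

    next();
    next();
    int third = next();

    return {third, start};  // countFrom's own start was never modified
}
```

Calling countFrom(10) returns the pair {13, 10}: the closure object keeps its own copy of start between calls, while the function's variable retains its original value.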
Fine-tuning is also possible. With an initial =, comma-separated &var specifications indicate that
the mentioned local variables should be processed by reference, rather than as copies; with an initial
&, comma separated var specifications indicate that local copies should be used of the mentioned
local variables. Again, these copies have immutable values unless the lambda expression is provided
with the mutable keyword.
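Both forms of fine-tuning are illustrated by the next minimal sketch. The function mixedCapture and its variables are hypothetical names, chosen for this illustration.

```cpp
#include <cassert>
#include <utility>

// Returns {sum, count} as computed by two lambdas using mixed captures.
std::pair<int, int> mixedCapture()
{
    int factor = 2;
    int sum = 0;
    int count = 0;

    auto add =
        [=, &sum](int value)    // factor: copy, sum: by reference
        {
            sum += factor * value;
        };

    auto tick =
        [&, factor]()           // count: by reference, factor: copy
        {
            count += factor;
        };

    factor = 100;               // too late: the closures hold copies (2)

    add(1);                     // sum == 2
    add(2);                     // sum == 6
    tick();                     // count == 2

    return {sum, count};
}
```

The copies are made when the lambda expressions are evaluated, so the later assignment to factor is not seen by either closure object.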
Another fine-tuning consists of using this in the lambda-capture: it also allows the lambda expression
to access the surrounding class members. Example:
class Data
{
std::vector<std::string> d_names;
public:
void show() const
{
int count = 0;
std::for_each(d_names.begin(), d_names.end(),
[this, &count](std::string const &name)
{
std::cout << ++count << ' ' <<
capitalized(name) << '\n';
}
);
}
private:
std::string capitalized(std::string name) const; // const: called from show() const
};
Although lambda expressions are anonymous function objects, they can be assigned to variables.
Often, the variable is defined using the keyword auto. E.g.,
auto sqr = [](int x)
{
return x * x;
};
The lifetime of such lambda expressions is equal to the lifetime of the variable receiving the lambda
expression as its value.
18.7.2 Using lambda expressions
Now that the syntax of lambda expressions has been covered let’s see how they can be used in
various situations.
First we consider named lambda expressions. Named lambda expressions nicely fit in the niche of
local functions: when a function needs to perform computations which are at a conceptually lower
level than the function’s task itself, then it’s attractive to encapsulate these computations in a separate
support function and call the support function where needed. Although support functions can
be defined in anonymous namespaces, that quickly becomes awkward when the requiring function
is a class member and the support function also must access the class’s members.
In that case a named lambda expression can be used: it can be defined inside a requiring function,
and it may be given full access to the surrounding class. The name to which the lambda expression is
assigned becomes the name of a function which can be called from the surrounding function. Here is
an example, converting a numeric IP address to a dotted decimal string, which can also be accessed
directly from a Dotted object (all implementations in-class to conserve space):
class Dotted
{
std::string d_dotted;
public:
std::string const &dotted() const
{
return d_dotted;
}
std::string const &dotted(size_t ip)
{
auto octet =
[](size_t idx, size_t numeric)
{
return to_string(numeric >> idx * 8 & 0xff);
};
d_dotted =
octet(3, ip) + '.' + octet(2, ip) + '.' +
octet(1, ip) + '.' + octet(0, ip);
return d_dotted;
}
};
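As a variation on the Dotted example, the named lambda expression may also capture the address instead of receiving it as an argument. The following is a minimal sketch of that idea as a free function; the name toDotted is hypothetical and the function is not part of the class Dotted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts a numeric IPv4 address to its dotted decimal representation.
std::string toDotted(std::size_t ip)
{
    auto octet =
        [ip](std::size_t idx)   // ip is captured, only idx is passed
        {
            // shift the requested octet to the low byte, mask the rest
            return std::to_string(ip >> idx * 8 & 0xff);
        };

    return octet(3) + '.' + octet(2) + '.' + octet(1) + '.' + octet(0);
}
```

For example, toDotted(0xC0A80001) returns the string 192.168.0.1. Capturing ip shortens each call, at the expense of tying the lambda expression to one particular address.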
Next we consider the use of generic algorithms, like for_each (cf. section 19.1.17):
void showSum(vector<int> const &vi)
{
int total = 0;
for_each(
vi.begin(), vi.end(),
[&](int x)
{
total += x;
}
);
std::cout << total << '\n';
}
Here the variable int total is passed to the lambda expression by reference and is directly accessed
by the function. Its parameter list merely defines an int x, which is initialized in sequence
by each of the values stored in vi. Once the generic algorithm has completed showSum’s variable
total has received a value that is equal to the sum of all the vector’s values. It has outlived the
lambda expression and its value is displayed.
But although generic algorithms are extremely useful, there may not always be one that fits the
task at hand. Furthermore, an algorithm like for_each looks a bit unwieldy, now that the language
offers range-based for-loops. So let’s try this, instead of the above implementation:
void showSum(vector<int> const &vi)
{
int total = 0;
for (auto el: vi)
[&](int x)
{
total += x;
};
std::cout << total << '\n';
}
But when showSum is now called, its cout statement consistently reports 0. What’s happening here?
When a generic algorithm is given a lambda function, the algorithm receives the closure object as an
argument, and the algorithm’s implementation calls that object’s function call operator. But in
the above example the range-based for-loop’s nested statement merely represents the definition of a
lambda function. Nothing is actually called, and hence total remains equal to 0.
Thus, to make the above example work we not only must define the lambda expression, but we must
also call the lambda function. We can do this by giving the lambda function a name, and then call
the lambda function by its given name:
void showSum(vector<int> const &vi)
{
int total = 0;
for (auto el: vi)
{
auto lambda = [&](int x)
{
total += x;
};
lambda(el);
}
std::cout << total << '\n';
}
In fact, there is no need to give the lambda function a name: the auto lambda definition represents
the lambda function, which could also directly be called. The syntax for doing this may look a bit
weird, but there’s nothing wrong with it, and it allows us to drop the compound statement, required
in the last example, completely. Here goes:
void showSum(vector<int> const &vi)
{
int total = 0;
for (auto el: vi)
[&](int x)
{
total += x;
}(el); // immediately append the
// argument list to the lambda
// function’s definition
std::cout << total << '\n';
}
Lambda expressions can also be used to prevent spurious returns from condition_variable’s
wait calls (cf. section 20.5.3).
The class condition_variable allows us to do so by offering wait members expecting a lock and
a predicate. The predicate checks the data’s state, and returns true if the data’s state allows the
data’s processing. Here is an alternative implementation of the down member shown in section
20.5.3, checking for the data’s actual availability:
void down()
{
unique_lock<mutex> lock(sem_mutex);
condition.wait(lock,
[&]()
{
return semaphore != 0;
}
);
--semaphore;
}
The lambda expression ensures that wait only returns once semaphore has been incremented.
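For context, here is a minimal sketch of a complete class built around such a down member. The class Semaphore, its data member names and its members up and available are assumptions made for this illustration; they are not necessarily identical to the implementation in section 20.5.3.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

class Semaphore
{
    std::mutex d_mutex;
    std::condition_variable d_condition;
    std::size_t d_available;

    public:
        explicit Semaphore(std::size_t available = 0)
        :
            d_available(available)
        {}

        void up()                   // make one more item available
        {
            std::lock_guard<std::mutex> lock(d_mutex);
            ++d_available;
            d_condition.notify_one();
        }

        void down()                 // wait until an item is available
        {
            std::unique_lock<std::mutex> lock(d_mutex);
            d_condition.wait(lock,
                [&]()               // predicate: rechecked after each wake-up
                {
                    return d_available != 0;
                }
            );
            --d_available;
        }

        std::size_t available()     // current count, read under the lock
        {
            std::lock_guard<std::mutex> lock(d_mutex);
            return d_available;
        }
};
```

The predicate is called while the lock is held. If it returns true at the first check, wait returns without blocking; otherwise wait blocks and repeats the check each time the thread is woken up, so a spurious wake-up merely results in another wait.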
Lambda expressions are primarily used to obtain functors that are used in a very localized section
of a program. Since they are used inside an existing function we should realize that once we use
lambda functions, multiple aggregation levels are mixed. Normally a function implements a task
which can be described at its own aggregation level using just a few sentences. E.g., “the function
std::sort sorts a data structure by comparing its elements in a way that is appropriate to the
context where sort is called”. By using an existing comparison method the aggregation level is
kept, and the statement is clear by itself. E.g.,
sort(data.begin(), data.end(), greater<DataType>());
If an existing comparison method is not available, a tailor-made function object must be created.
This could be realized using a lambda expression. E.g.,
sort(data.begin(), data.end(),
[&](DataType const &lhs, DataType const &rhs)
{
return lhs.greater(rhs);
}
);
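A self-contained variant of the last statement is shown next. This is a minimal sketch: the struct Item and the function sortByWeightDescending are hypothetical names standing in for DataType and its surrounding code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical data type, standing in for DataType.
struct Item
{
    std::string name;
    int weight;
};

// Sorts items by decreasing weight, using a tailor-made comparison.
void sortByWeightDescending(std::vector<Item> &items)
{
    std::sort(items.begin(), items.end(),
        [](Item const &lhs, Item const &rhs)
        {
            return lhs.weight > rhs.weight;
        }
    );
}
```

Since the comparison is spelled out inside the call to sort, the reader has to switch from the level of sorting items to the level of comparing weights, which is the mixing of aggregation levels mentioned above.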
