C++ Course Content
1. C++ Course Home
1.1 A Blatant Plug for the Cadence Quality Initiative
1.2 Notation and Guidelines
1.3 OO Programming
1.3.1 Object Example: A Layout System
1.3.2 Data-less Objects
1.4 Problem Specification
1.5 Abstraction
1.6 Modularity
1.7 Assignment #1
1.8 Correction
1.9 Inheritance
1.10 Polymorphism
1.11 Generic Programming in OOP
1.12 Memory Management
1.13 Namespaces
1.14 The Basics
1.15 Backward Compatibility with C
1.16 What is this?
1.17 Casting
1.18 Assignment #2
1.19 A Sidebar on the Default Constructor
1.20 Operator Overloading
1.21 Memory Allocation and Deallocation
1.22 I/O
1.23 std::string
1.24 Templates And Generic Programming
1.24.1 A Handy Trick to Get Template Parameters
1.25 Standard Libraries
1.26 Cadence Proprietary C++ Class Library
1.27 Traits
1.28 Iterators
1.29 Explicit Construction
1.30 RTTI
1.31 Exceptions
1.32 Machine Model
1.33 Design Patterns
1.34 Template Metaprogramming
1.35 Appendix
1.35.1 Public, Private, and Contract Programming
1.35.2 Templates and Iterators
1.35.3 Creating Class Headers
1.35.4 Compiler-Generated Class Members
2. C++ Course Syllabus
C++ Course Home
TOC
Instructors
Structure
Feedback
Syllabus
Part I - Object-Oriented Programming
Part II - C++ Programming
Reference Books
Instructors
John Croix
26 years C++ programming experience
Taught programming at the University of Louisville and has done a number of programming tutorials on various CS subjects
including GPU programming and multi-threading
Joe Rahmeh
Involved in the design, implementation, and deployment of 4 ECAD systems written in C++
Taught courses on C++ at the University of Texas at Austin and a number of technology companies
Structure
The course interleaves abstract OO concepts with concrete C++ constructs. At the end of each lesson, the student should understand what the
introduced OO concept is and why it is important. Source code is presented to show how the construct is implemented in C++, often
alongside an equivalent C implementation and an analysis of the specific implementation. Its main purpose is illustrative. The class will not dwell
on aspects of a C++ implementation that are beyond the scope of the class in which it is presented.
The C++ code presented may be significantly more advanced than the concepts presented to date in the class. That is intentional. The goal of the
code is two-fold:
Highlight a specific aspect of the C++ implementation for the introduced concept. The more advanced aspects of the code can be ignored
for this introduction.
Act as a reference for students that have already participated in the class, and who now have a better understanding of the code, so that
the concepts are seen within a larger, more complete framework.
During the second part of the class, where the focus turns specifically to the C++ language, specific implementation concepts will be dealt with in
more detail. For example, where the C++ keyword "virtual" may be mentioned with respect to objects and inheritance, a great deal of time will
be spent on this keyword when delving into the specifics of the C++ language.
Your Feedback
Your feedback on each class session will help to make subsequent classes better. Please take 5-10 minutes to fill out a survey on the class. The
survey can be found online at https://www.surveymonkey.com/s/CPPClass.
Syllabus
This course covers object-oriented (OO) programming generically, and delves into the C++ language specifically. Generic OO concepts are useful
across a broad array of programming languages including Java, Python, Perl 5, C++, C#, Smalltalk, and many, many others. The C++ language
was designed by Bjarne Stroustrup during his time at Bell Labs in the late 1970s and early 1980s. He was attempting to marry some of the OO
constructs of the Simula programming language to C while maintaining backward compatibility with C. Backward compatibility was a requirement
due to the millions of lines of C code already in place, but it led to compromises in the language. As a result, there are many things that C++ does
well from an OO standpoint, but others that remain awkward, to put it mildly. Despite its limitations, though, C++ continues to be one of the most
popular programming languages in use today, especially in environments in which execution speed and tight integration with hardware are critical.
After the generic OO constructs have been covered, C++ features and libraries are presented. In this section of the course, C++
implementation specifics and the relationship of the language to the hardware are given greater emphasis. The goal of this portion of the course is
to give the student a sufficient understanding of C++ to analyze implementation tradeoffs and achieve the desired speed or
memory footprint for a given object or module.
NOTE: The syllabus is still under construction and may change as the course content is finalized.
Other Topics
Safe programming
No goto statements
Avoiding "void *"
const implies thread safe -- mutable can thwart this
Principle of least surprise
Optimization
Inline APIs
Compile-time decisions vs runtime decisions
Implication of hardware/compiler <---> software design
Object model on memory layout
Memory model (e.g., volatile, atomic, memory barriers)
Thread safety
Resource management
Resources allocation and release through construction and destruction (RAII)
Code organization
What goes into header vs .cpp file
Cost of building a project due to the layout
Declaration forwarding
Inline
Understanding how compilers work
If statements are assumed to be true, for example
Putting most likely and/or least expensive failure test first in compound condition
Cost of function call
APIs/member functions
Self-documenting APIs
gets() as an example of a bad API
Separation of concerns (messaging)
Defensive programming
Assertions
Exception handling
Tools (GDB, profiler, race condition detection)
Ellipses
While ellipses are a valid C/C++ construct¹, within the context of this class they are shown within classes and code examples simply to indicate
that additional logic or code may appear where the ellipses are. Any code that might be inserted in lieu of the ellipses is not germane to the
example. For example, the code below shows an API to draw a circle onto an X11 display². The actual X11 commands to draw the circle aren't
important, but the definition of the render() API for the Circle class is.
Ellipse Example
void Circle::render()
{
// This routine draws the circle onto an X11 display
...
}
Naming Convention
class SomeClass
{
enum SomeEnum
{
ENUM_VALUE_1,
ENUM_VALUE_2,
ENUM_VALUE_3
};
unsigned mVal1;
static char gChar;
public:
SomeClass();
SomeClass(const SomeClass &copy);
~SomeClass();
};
SomeClass::SomeClass()
{
...
}
Guidelines
Values that don't change are explicitly passed into APIs as const parameters
Compilers can optimize constant values in ways that they can't optimize non-constant values, especially in inline APIs
If a pointer is passed in, and if the class assumes ownership of the memory, explicit commenting within the class should indicate that
memory belongs to the class after the call
As a matter of taste, some people prefer to see the public APIs at the top of a class instead of at the bottom
Makes it easier to see the interface
Must put "public:" at the top of the class
By default, C++ class members are private unless marked otherwise
All members of a C++ struct are public unless marked otherwise
There is no private or public attribute in C, so this maintains backwards compatibility between C++ and C
class SomeClass
{
public:
SomeClass();
SomeClass(const SomeClass &copy);
~SomeClass();
private:
/*
* In this example, the data is at the bottom of the class instead
* of at the top. An explicit "private" must be specified in this
* case to separate the access permissions of the APIs from the data
*/
enum SomeEnum
{
ENUM_VALUE_1,
ENUM_VALUE_2,
ENUM_VALUE_3
};
unsigned mVal1;
static char gChar;
};
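The access defaults and the const-parameter guideline above can be checked with a small sketch. The names PointStruct, PointClass, sumCoordinates, and demoAccess are hypothetical and appear only for illustration; they are not part of the course material.

```cpp
#include <cassert>

// Members of a struct are public unless marked otherwise.
struct PointStruct
{
    int mX;   // public by default
    int mY;
};

// Members of a class are private unless marked otherwise.
class PointClass
{
    int mX;   // private by default
    int mY;
public:
    PointClass(const int x, const int y) : mX(x), mY(y) {}
    int x() const { return mX; }
    int y() const { return mY; }
};

// The parameter is passed as const: the API promises not to modify it.
inline int sumCoordinates(const PointClass &p)
{
    return p.x() + p.y();
}

inline int demoAccess()
{
    PointStruct s;
    s.mX = 1;                 // fine: struct members are public
    s.mY = 2;
    const PointClass c(3, 4);
    // c.mX = 5;              // would not compile: mX is private
    assert(sumCoordinates(c) == 7);
    return s.mX + s.mY + sumCoordinates(c);
}
```

Uncommenting the assignment to c.mX is a quick way to see the compiler enforce the access rule.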
...
SomeStruct *s = (SomeStruct *) malloc( sizeof( SomeStruct ) + 3 * sizeof( float ) );
s->mNumFloats = 3;
s->mFloats[ 0 ] = 3.141592654;
s->mFloats[ 1 ] = 2.718281828;
s->mFloats[ 2 ] = 1.414213562;
Only put 1 class definition into a given header file or source file
Name the files with the name of the object
Use namespaces instead of acronyms to preface your object names
Create a directory for each namespace and put all files for code within that namespace inside of that directory
Inline APIs
C++ allows APIs to be both defined and declared in a class (like Java, Python, and others)
In C++, this creates inline APIs
Inline member APIs can also be defined after the class declaration
Can specify inline on the member to let the compiler know that it's going to be inlined
When defined after the class, an API should be defined before it is used so that the compiler knows it should use an inline API instead of a function call
Unless inline is specified in the class definition
The coding style choice is largely a matter of preference
Inline APIs - Example 1
class SomeClass
{
unsigned mX;
...
void incrementX()
{
/*
* The API definition is embedded in the class definition
*/
mX++;
}
};
Inline APIs - Example 2
class SomeClass
{
unsigned mX;
...
void incrementX();
void otherAPI();
...
};
inline void SomeClass::incrementX()
{
mX++;
}
void SomeClass::otherAPI()
{
...
incrementX();
...
}
Inline APIs - Example 3
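class SomeClass
{
unsigned mX;
...
inline void incrementX();
void otherAPI();
...
};
void SomeClass::otherAPI()
{
...
incrementX();
...
}
inline void SomeClass::incrementX()
{
mX++;
}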
Memory
Compilers usually pad the size of a data structure (struct or class)
Think carefully about memory utilization in your class because an extra byte might result in wasted bytes
Can "pack" your data structure with compiler directives, but probably not portable
Memory allocators usually return memory aligned to 8 bytes
Not a guarantee
16-byte alignment is not common but does occur, especially on numerical applications making heavy use of SSE instructions
Data Padding
$ cat > j.cpp << EOF
#include <iostream>
struct SomeStruct
{
char y;
double x;
};
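The padding described above can be observed with sizeof, alignof, and offsetof. The following is a minimal sketch rather than the course's original listing: SomeStruct matches the struct shown above, while Loose, Tight, paddingBytes, and checkLayout are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct SomeStruct
{
    char y;     // 1 byte; the compiler normally pads after it so that x is aligned
    double x;
};

// Hypothetical structs showing that member order can change the padded size.
struct Loose
{
    char a;
    double d;
    char b;
};

struct Tight
{
    double d;
    char a;
    char b;
};

// Bytes the compiler added beyond the members themselves.
inline std::size_t paddingBytes()
{
    return sizeof(SomeStruct) - (sizeof(char) + sizeof(double));
}

inline void checkLayout()
{
    // x always starts on a boundary suitable for a double.
    assert(offsetof(SomeStruct, x) % alignof(double) == 0);
    // Grouping the small members together never makes the struct larger here.
    assert(sizeof(Tight) <= sizeof(Loose));
}
```

On a typical 64-bit platform sizeof(SomeStruct) is 16, so 7 of those bytes are padding, but the exact numbers are implementation-defined.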
Final Thoughts
It's called "computer science", but it's really an art
Knuth series: The Art of Computer Programming
Another Cadence employee³ who used to teach C put it this way: "When teaching C, I always told my students I would
distinguish fact from opinion. I think that's a good dividing line between the science and art of programming."
Even very experienced C++ programmers may disagree on style and form
There is no single right answer
Some of the decisions come down to experience while others have to do with the way one person thinks vs the way another
person thinks
Even respected OO programming books contradict one another quite frequently
Consistency can be the key to ensuring high code quality
Suppose programmer A always prefaces their member variables with an underscore
Programmer B has to enhance a module written by programmer A while A is on vacation
Programmer B doesn't like underscores and starts all of their member variables with a capital letter
Programmer B decides to utilize their style in A's code, making extensive changes
Programmer A comes back from vacation to find their code is no longer consistent but lives with it because there's no time to do
anything about it
Programmer A and B are transferred, and junior programmer C is put on the project to debug a major problem and expand the
code
Because the code is not consistent, programmer C has a hard time following the code, especially as it weaves between routines
written by programmers A and B
Knowing how the compiler converts your code into an executable and how it allocates memory can make a huge difference to the
performance and memory footprint of your product
Think defensively
If you had to support your own code 3 years from now, having not seen it at all over that time, what would you need to see in the
code to come up to speed rapidly?
¹ Ellipses in C/C++ are used to indicate a variable number of unspecified arguments that can be passed to a function or class member API.
² For information on drawing a circle under X11, please refer to a book on X11 programming.
³ Steve Esposito
OO Programming
Definitions
Wikipedia: "Object-oriented programming (OOP) is a programming paradigm that represents concepts as 'objects' that have data fields
(attributes that describe the object) and associated procedures known as methods. Objects, which are usually instances of classes, are
used to interact with one another to design applications and computer programs."
Whatis.com: "Object-oriented programming (OOP) is a programming language model organized around objects rather than "actions" and
data rather than logic."
Webopedia.com: "A type of programming in which programmers define not only the data type of a data structure, but also the types of
operations (functions) that can be applied to the data structure. In this way, the data structure becomes an object that includes both data
and functions. In addition, programmers can create relationships between one object and another. For example, objects can inherit
characteristics from other objects."
But what does all of this mean?
Procedural Programming
Examples: C, Pascal, FORTRAN, Tcl
Programs are composed of procedures or subroutines
Subroutines are computational steps that operate on data passed into them (or on globals)
Of course, subroutines can invoke other subroutines
There's no association between the data and the subroutine other than through the parameters passed to the subroutines or the global
variables that they operate upon
Callers must understand not just the data passed to a subroutine but also the organization of the data passed in
There's no generic entity called a container that hides the format of the data passed in (linked list, vector, map, set, etc.)
Data is organized in structures or groups (a struct in C or record in Pascal, for example)
Subroutines that utilize these structures must understand the contents of each structure by both name and function
A field should have a descriptive name to indicate what it's doing
Limitations on the use and/or modification of a field in a structure exist only by convention and are not enforceable
Removing data from a struct means scouring code to search for all subroutines that operate upon that data
There is no central area that contains the only code responsible for using or modifying a structure
Adding data to a struct requires scouring code to search for all subroutines that create, copy, initialize, or destroy the structs
(as a collection of data) to ensure that new fields are handled appropriately
Any subroutine can modify the contents of any non-constant data structure passed to it
If it is discovered that, during runtime, a data structure becomes corrupt, debugging requires that each subroutine that operates
on a data structure be validated
This can be especially challenging if access to some source code is not possible (linking in .o, .so, or .a files instead of compiling
all source code into the executable)
Performance tuning requires tuning of each routine that uses the data structure as the code is potentially distributed throughout the
program
Note that many scripting languages support objects, though they don't refer to them that way
Lists
Associative arrays/maps
Object-Oriented Languages
Should provide support for 3 key features
Data abstraction
Implementation details of an object are hidden
APIs (methods) are used to have the object perform an action
May be referred to as sending a message to the object
Implication is that the data in an object and the methods that act upon an object are bound together and treated as a
single entity
Inheritance
New objects are derived from other objects
Exploits the "is-a" relationship
A circle is a shape
A rectangle is a shape
In inheritance, the new objects are typically specialized forms of the objects that they inherit from and add new features
Though not explicitly stated, it is implied that any inheriting object instance can also be passed to a routine expecting an
instance of the object that it inherits from
Object Circle inherits from object Shape
A function expects to receive a pointer to an instance of object Shape
An instance of object Circle can be passed to this function because Circle is a type of object Shape
Polymorphism (also known as dynamic binding or runtime binding)
Inheriting objects can override the methods associated with the objects they inherit from, even when the object is
referenced as the object inherited from
Object Shape has an API named draw()
Object Rectangle inherits from Shape and also has an API named draw()
When calling draw() on an instance of Rectangle, even when doing it as if it were an instance of Shape, Rectangle's draw() is executed
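The Shape, Circle, and Rectangle relationships above can be sketched in C++ as follows. This is an illustrative sketch, not the course's own code: draw() returns the name of the class instead of rendering anything so that the result is easy to check, and drawShape and demoShapes are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

class Shape
{
public:
    virtual ~Shape() {}
    // "virtual" lets inheriting objects override this API.
    virtual std::string draw() const { return "Shape"; }
};

// A circle is a shape.
class Circle : public Shape
{
public:
    std::string draw() const override { return "Circle"; }
};

// A rectangle is a shape.
class Rectangle : public Shape
{
public:
    std::string draw() const override { return "Rectangle"; }
};

// Expects a pointer to a Shape; any object that inherits from Shape may be passed.
inline std::string drawShape(const Shape *shape)
{
    return shape->draw();   // bound at run time to the actual object's draw()
}

inline void demoShapes()
{
    Rectangle r;
    // Referenced as a Shape, yet Rectangle's draw() is executed.
    assert(drawShape(&r) == "Rectangle");
}
```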
Objects
Niklaus Wirth wrote Algorithms + Data Structures = Programs
Still holds true in OO programming
The difference is that the data structures and algorithms are internal to an object
In dealing with objects in programming, we create a paradigm that most closely resembles how people interact with the world in life
People deal with objects as a whole, not with each particular component of an object
Examples:
Car: To drive a car, we associate actions with the object "car": turn left, brake, accelerate, open door, etc.
Mouse: Scroll down, move left, left-click, middle-click, etc.
Home: Doors, windows, roof, flooring, walls, etc.
Objects can be a collection of other objects, establishing relationships between them (composition as opposed to inheritance)
A car is a collection of parts: wheels, brakes, doors, etc.
The engineers that designed the car created relationships between the parts that make up the car
The majority of these parts are hidden from us, and we are unaware of how they operate in order to cause the car to
behave according to our directions
We provide commands (send messages) to the car through our actions that cause the parts of the car to work in concert with
one another to obtain the desired result
We are protected from using parts of the car in a manner for which they were not designed
When driving down the road we have no means to reach through the mechanics associated with the car in real time to
redirect the gas from the fuel injector directly to the muffler
A common attribute to objects across multiple OO languages is that they typically have a mechanism to initialize the object and to destroy
the object
Initialization
May have multiple forms for initialization, depending on the way an instance is declared
Sets the initial state of the object
May be empty
Beware: If not defined explicitly by the programmer, the language may provide a default implementation which may not
yield valid/desired results
Destruction
Invoked when an instance goes out of scope
Cleans up the contents of the object
Example: a file object used to read and/or write files
If the initializer has a name and mode passed to it, it attempts to open the file for reading or writing, depending on the
mode
If the initializer has no name, no file is opened for reading and writing, and another method must be used to open a
specific named file
The destructor purges any buffered contents, closes any open file, and returns dynamic memory to the heap
In C++, the keyword class is used to define an object. The initializer is called the constructor. The method used to destroy an
object is called the destructor
Constructor
The constructor is identified by using the name of the class as the API name
Multiple constructors are allowed, as long as they have different parameters
There is no return type for the constructor
Not required, but compiler will generate default constructors if the programmer doesn't supply them
Destructor
The destructor is identified by a tilde (~) followed by the name of the class
Only one destructor exists per class
There is no return type for the destructor
Not required, but the compiler will generate an empty destructor if not supplied
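The file object described above might look like the following minimal sketch. TextFile and its member names are hypothetical, and the standard C stdio calls std::fopen and std::fclose are assumed as the underlying mechanism; the course's own file class is not shown in the source.

```cpp
#include <cassert>
#include <cstdio>

class TextFile
{
    std::FILE *mFile;   // null when no file is open

public:
    // Initializer with no name: no file is opened.
    TextFile() : mFile(nullptr) {}

    // Initializer with a name and mode: attempts to open the file.
    TextFile(const char *name, const char *mode) : mFile(nullptr)
    {
        open(name, mode);
    }

    // Destructor: closes any open file when the instance goes out of scope.
    ~TextFile() { close(); }

    // Copying is disabled so that two instances never close the same file.
    TextFile(const TextFile &) = delete;
    TextFile &operator=(const TextFile &) = delete;

    bool open(const char *name, const char *mode)
    {
        close();
        mFile = std::fopen(name, mode);
        return isOpen();
    }

    void close()
    {
        if (mFile)
        {
            std::fclose(mFile);   // flushes buffered output before closing
            mFile = nullptr;
        }
    }

    bool isOpen() const { return mFile != nullptr; }
};

inline void demoFile()
{
    TextFile f;             // default constructor: nothing opened
    assert(!f.isOpen());
}                           // destructor runs here
```

Because the destructor does the cleanup, a caller cannot forget to close the file, even when a function returns early.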
OO Programming
In object-oriented programming, the data and methods that work on the data are bound together
The mechanics associated with "how" something is done are abstracted away
Typically this is done by making the internal components of the object inaccessible to the outside world
Python doesn't restrict variables from direct access, though it does provide methods
Using a method for a Python object is convenient and is typically adhered to via convention instead of restriction
The user of an object tells the object "what" it should do (vs "how" it should be done)
The object is responsible for executing the command, utilizing its internal components, and maintaining a correct state upon
completion of the command
It is possible (though not advisable) to break the OO paradigm and separate data from methods, or expose data such that it can be
accessed directly outside of an object
This basically regresses the system from an object-oriented system into a procedural system
Working with objects in this fashion, though, doesn't make it object-oriented. It's still procedural programming under a different
guise
Once explained, OO programming appears easy to master quickly
It's actually not
It requires a mindset change
It takes practice
Personal experience from John Croix, learning C++ in 1987
Took 6 months to call myself an OO programmer
Another 6 months before I said that I was good at it
YMMV
Programmers tend to want to initially go overboard so that even a single character becomes a unique object
Some languages are built with this in mind (Simula, Smalltalk, others)
Knowing how to group related data into an object, and how granular that data should be, is one of the biggest
challenges when learning OO programming
One immediate benefit of OO programming is access to libraries of powerful object types and to patterns that make OO programming
easier
More on these topics later in the course
Example
See "Object Example: A Layout System" for how objects can be used to design a layout system of wires and vias
/*
* Helpful APIs supplied so that the user doesn't manipulate the struct
* Animal directly
*/
...
unsigned i;
const unsigned numOffspring = otherAnimal->mNumOffspring;
animal->mNumOffspring = numOffspring;
if (numOffspring)
{
animal->mOffspring = (Animal *) malloc( numOffspring * sizeof( Animal ) );
}
for (i = 0; i < numOffspring; i++)
{
copyAnimal( animal->mOffspring + i, otherAnimal->mOffspring + i );
}
animal->mNumLegs = otherAnimal->mNumLegs;
animal->mScalesOrFur = otherAnimal->mScalesOrFur;
animal->mHasTail = otherAnimal->mHasTail;
}
Animal *createAnimal()
{
/*
* Allocate memory for the struct Animal from the heap and initialize
* the memory
*/
unsigned i;
if (animal->mNumOffspring)
{
deleteAnimalOffspring( animal );
}
Animal *offspring = (Animal *) malloc( numOffspring * sizeof( Animal ) );
for (i = 0; i < numOffspring; i++)
{
initializeAnimal( offspring + i );
}
animal->mOffspring = offspring;
animal->mNumOffspring = numOffspring;
}
unsigned i;
for (i = 0; i < animal->mNumOffspring; i++)
{
destroyAnimal( animal->mOffspring + i );
}
free( animal->mOffspring );
animal->mOffspring = NULL;
animal->mNumOffspring = 0;
}
deleteAnimalOffspring( animal );
initializeAnimal( animal );
}
animal->mOffspring = NULL;
animal->mNumOffspring = 0;
animal->mNumLegs = 0;
animal->mScalesOrFur = UNDEFINED_COAT;
animal->mHasTail = 0;
}
...
Data hiding not really possible in C because there's no programming construct to keep somebody from directly accessing a data member
If somebody doesn't use an API to access data from an Animal instance, there's no way to easily stop them as long as their
code has visibility into the definition of the struct Animal
Their code will need to change if any struct member name changes
A programmer doesn't have to use any of the APIs just because they're there
Must use "Animal" in every API name so that programmers can tell which API they should call
Another C struct might have similar data members and APIs, so the use of the word "Animal" distinguishes the calls
Instead of showing the struct in the header file, only put it into the source file
Deliver object files and header files, but not source files
If another programmer can't see the struct definition, they can't use the fields directly
They must use the APIs provided to get and set values in the struct
Problems with this approach
Can no longer create an Animal on the stack. Every created instance of an Animal requires a call to allocate and deallocate
heap memory, which can be quite expensive
This is Java's approach
Doing anything requires a function call. Nothing can be inlined. This can be quite expensive, too.
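The hidden-struct approach and its costs can be sketched in a single file. The split into "header" and "source" parts is marked with comments. createAnimal appears in the code above, while freeAnimal, setAnimalNumLegs, getAnimalNumLegs, and demoOpaqueAnimal are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>

/* ---- What would go in the header file: the struct is only declared ---- */
struct Animal;                                   // incomplete (opaque) type

Animal *createAnimal();
void freeAnimal(Animal *animal);
void setAnimalNumLegs(Animal *animal, unsigned numLegs);
unsigned getAnimalNumLegs(const Animal *animal);

/* ---- What would go in the source file: the definition is hidden ---- */
struct Animal
{
    unsigned mNumLegs;
};

Animal *createAnimal()
{
    // Every instance comes from the heap; callers cannot put one on the stack
    // because they never see sizeof(Animal).
    Animal *animal = static_cast<Animal *>(std::malloc(sizeof(Animal)));
    if (animal)
    {
        animal->mNumLegs = 0;
    }
    return animal;
}

void freeAnimal(Animal *animal)
{
    std::free(animal);
}

void setAnimalNumLegs(Animal *animal, unsigned numLegs)
{
    animal->mNumLegs = numLegs;
}

unsigned getAnimalNumLegs(const Animal *animal)
{
    return animal->mNumLegs;
}

/* ---- Caller code: works only through the APIs ---- */
inline void demoOpaqueAnimal()
{
    Animal *animal = createAnimal();
    assert(animal != NULL);
    setAnimalNumLegs(animal, 4);      // a function call just to set one field
    assert(getAnimalNumLegs(animal) == 4);
    freeAnimal(animal);
}
```

In a real project the caller would include only the header part, so even the one-line getter costs a function call that the compiler cannot inline across the object-file boundary.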
C++ solves the problem
Associates APIs with the data
Protects the data from direct modification by the user of the class
class Animal
{
public:
enum ScalesOrFur
{
SCALES,
FUR,
NONE,
UNDEFINED_COAT
};
private:
Animal *mOffspring;
unsigned mNumOffspring;
unsigned mNumLegs;
bool mHasTail;
ScalesOrFur mScalesOrFur;
private:
void assign(const Animal &other);
void destroy();
void init();
public:
Animal();
Animal(const Animal &other);
~Animal();
/*****************************************************************************/
/* Private APIs below */
/*****************************************************************************/
/*****************************************************************************/
/* Public APIs below */
/*****************************************************************************/
inline Animal::Animal()
{
init();
}
inline Animal::~Animal()
{
destroy();
}
Data-less Objects
Typically objects bind data together with APIs that operate on the data
Objects that contain no data can also be created
Such objects may also be referred to as "interfaces"
APIs may do real work, or they may simply declare APIs that must be defined by other objects that inherit from them
More on this when talking about inheritance (virtual APIs)
C++ Data-less Object
struct SomeObject
{
...
};
struct CompareSomeObject
{
/*
* This struct in C++ has no data and only 1 API. Remember: a
* struct is a special type of class in C++
*/
/*
* Can actually declare an instance of CompareSomeObject;
*/
SomeObject a, b;
...
CompareSomeObject compareAB;
const bool aLessThanB = compareAB( a, b );
...
Why would you ever use a data-less object like the one shown above? An example of one in use appears under Sorting With Pointers.
Summary
This was just an introduction to objects and OO programming
Students need to become comfortable with the concept of binding data and APIs together to form an object
Some of the features of C++ can be implemented in C
The first C++ "compiler" was actually a translator to C, so it only makes sense that there is a way to obtain the same capabilities,
though performance will ultimately suffer
C cannot enforce OO constructs because they aren't part of the language
To get the same capabilities, the programmer must enforce the OO constructs through explicit data hiding
Dynamic binding would require a significant amount of effort to emulate in C
OO languages enforce OO styles implicitly
The Animal class was introduced to show both differences and similarities between C and C++
Nobody is expected to understand the concepts fully right now
Future lessons will delve into C++ classes in detail
By the end of the class, this information should be well understood
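To give a feel for the effort involved in emulating dynamic binding by hand, here is a minimal sketch using a table of function pointers. All names (AnimalOps, AnimalBase, Dog, Bird, and the associated functions) are hypothetical and chosen only for illustration. The code is written in a C style but compiled as C++.

```cpp
#include <cassert>

/* A hand-built "virtual table": one function pointer per overridable API */
struct AnimalOps
{
    unsigned (*getNumLegs)(const void *self);
};

/* Every "object" carries a pointer to the table for its concrete type */
struct AnimalBase
{
    const AnimalOps *mOps;
};

struct Dog  { AnimalBase mBase; };
struct Bird { AnimalBase mBase; };

static unsigned dogGetNumLegs(const void *)  { return 4; }
static unsigned birdGetNumLegs(const void *) { return 2; }

static const AnimalOps DOG_OPS  = { dogGetNumLegs };
static const AnimalOps BIRD_OPS = { birdGetNumLegs };

/* The programmer, not the compiler, must remember to wire up the table */
void initializeDog(Dog *dog)    { dog->mBase.mOps = &DOG_OPS; }
void initializeBird(Bird *bird) { bird->mBase.mOps = &BIRD_OPS; }

/* The caller only sees AnimalBase; the table selects the implementation */
unsigned getNumLegs(const AnimalBase *animal)
{
    assert( animal != 0 && animal->mOps != 0 );
    return animal->mOps->getNumLegs( animal );
}
```

Nothing stops a caller from skipping the initialize call or pointing mOps at the wrong table, which illustrates why the programmer has to enforce the construct explicitly.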
NOTE: This is only for instructional purposes. It's not supposed to be optimized.
Wire Example
class Display { … };
class Coordinate
{
/*
* A coordinate is just a (x,y) point
*/
long mX;
long mY;
public:
Coordinate();
Coordinate(const long x, const long y);
Coordinate(const Coordinate &copy);
class Rectangle { … };
class RoundedRectangle { … };
class Via
{
/*
* A via is a rectangle (defined by 2 points) that has a from and to layer
*/
Coordinate mLowerLeft;
Coordinate mUpperRight;
int mStartLayer;
int mEndLayer;
public:
Via();
Via(const Via &copy);
~Via();
class WireSegment
{
/*
* A wire segment is a line with a width on a specific layer
*/
int mLayer;
int mWireWidth;
Coordinate mCoords[ 2 ];
public:
WireSegment();
WireSegment(const WireSegment &copy);
~WireSegment();
class Wire
{
/*
* A complete wire is a collection of wire segments and vias that connect them
*/
typedef std::deque<WireSegment> WireSegmentDeque;
typedef std::deque<Via> ViaDeque;
WireSegmentDeque mWireSegments;
ViaDeque mVias;
public:
typedef WireSegmentDeque::const_iterator WireSegmentIterator;
typedef ViaDeque::const_iterator ViaIterator;
public:
Wire();
Wire(const Wire &copy);
~Wire();
class LayoutSystem
{
/*
* A layout system is, among other things, a collection of wires
*/
typedef std::deque<Wire> WireDeque;
WireDeque mWires;
...
public:
typedef WireDeque::const_iterator WireIterator;
public:
LayoutSystem();
LayoutSystem(const LayoutSystem &copy);
~LayoutSystem();
...
void draw(Display *display) const;
/*
* The following APIs show how wires might be drawn on a display
*/
Things to note:
Data-less Objects
Why would you want to create an object that has no data? Suppose that you have a vector of pointers to coordinates, and you wanted to sort the
coordinates. How would you do it?
Sorting With Pointers
class Coordinate
{
long mX;
long mY;
public:
Coordinate();
Coordinate(const Coordinate &copy);
~Coordinate();
...
...
struct CompareCoordinatePointers
{
bool operator()(const Coordinate *a, const Coordinate *b) const
{
return( *a < *b );
}
};
...
std::deque<Coordinate*> coords;
...
std::sort( coords.begin(), coords.end(), CompareCoordinatePointers() );
...
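The elided parts of the listing above can be filled in many ways. The following is one self-contained sketch, assuming a two-argument constructor and an operator< that orders by x and then by y (neither is shown in the original listing). The helper sortByCoordinate() is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

class Coordinate
{
    long mX;
    long mY;
public:
    Coordinate(const long x, const long y) : mX( x ), mY( y ) {}
    /* Assumed ordering for this sketch: by x first, then by y */
    bool operator<(const Coordinate &other) const
    {
        return (mX != other.mX) ? (mX < other.mX) : (mY < other.mY);
    }
};

/* Data-less object: no members, just one API that std::sort can call */
struct CompareCoordinatePointers
{
    bool operator()(const Coordinate *a, const Coordinate *b) const
    {
        assert( a != 0 && b != 0 );
        return( *a < *b );
    }
};

/* Sort the pointers by the coordinates they point to, not by address */
void sortByCoordinate(std::deque<Coordinate*> &coords)
{
    std::sort( coords.begin(), coords.end(), CompareCoordinatePointers() );
}
```

Without the comparison object, std::sort would compare the pointer values themselves, which orders the deque by memory address rather than by coordinate.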
Problem Specification
In theory, a programmer should be able to get a specification for a project, code to that specification, and return working code to the project.
Reality is often quite different from theory
Even the best specifications often become obsolete before the project is finished
Many projects are burned by the "rush to implementation" and the "rush to optimization"
When specifications change, as they invariably do, it becomes difficult to adapt the code to the new specifications
Realizing that specifications are bound to change can help keep the programmer from falling into constant traps
"Insanity: doing the same thing over and over again and expecting different results." - Albert Einstein (brainyquote.com)
Spend time to gather requirements, write specifications, review specifications, and write pseudo-code
Where there is uncertainty about the specifications, generalize the problem and avoid optimizing for the current understanding of the
problem
Premature optimization is a cardinal sin
Example:
Avoid an interface that exposes the fact that you are using a hash table for object lookup even though exposing that
implementation detail may yield a small performance boost
If you keep the interface generic, you can change the underlying implementation if the need arises
If the implementation doesn't need to change, you can optimize the interface if performance profiling necessitates that
you do so (but not unless it really is a critical bottleneck)
Reduce algorithmic complexity
Example: Use an associative array for lookup instead of a linked list that you must hand code in assembly to obtain performance
Use standard components instead of custom components when possible
Also reduces debug issues
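As an illustration of keeping a lookup interface generic, the sketch below hides its container behind add() and find(). The class name ObjectRegistry and its API are hypothetical. The point is only that no container type appears in the public interface, so the storage could later be swapped for a hash table if profiling called for it.

```cpp
#include <cassert>
#include <map>
#include <string>

/* Hypothetical lookup class: callers see only add() and find() */
class ObjectRegistry
{
    /* Implementation detail. Because this type never leaks out of the
     * class, it can be replaced without changing any caller. */
    typedef std::map<std::string, int> Storage;
    Storage mObjects;
public:
    void add(const std::string &name, const int id)
    {
        assert( !name.empty() );   /* precondition: names are non-empty */
        mObjects[ name ] = id;
    }

    /* Returns true and fills in id when name is known */
    bool find(const std::string &name, int &id) const
    {
        const Storage::const_iterator it = mObjects.find( name );
        if (it == mObjects.end())
        {
            return false;
        }
        id = it->second;
        return true;
    }
};
```

It also uses a standard associative container rather than a hand-coded list, in line with the advice above on reducing algorithmic complexity and preferring standard components.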
Decomposition
Definitions
Wikipedia: "Decomposition in computer science, also known as factoring, refers to the process by which a complex problem or
system is broken down into parts that are easier to conceive, understand, program, and maintain."
Object decomposition involves breaking larger concepts into smaller ones until the concept is sufficiently small and specialized to be
described by an object
A car is composed of many different parts that could be hard to support within a single object
Decomposing the car into individual objects might result in tires, steering wheel, seats, lights, etc. that are much easier to
manage as individual objects
Composition (to be covered later) can aggregate the individual objects to create the car without loss of generalization
Prototype vs Implementation
Definitions:
Wikipedia: "A prototype is an early sample, model or release of a product built to test a concept or process or to act as a thing to
be replicated or learned from."
Prototypes can be an invaluable tool to help define the ultimate form of the solution
Prototypes are rapid approximations of a solution
They are not the solution
Resist the urge to turn a prototype into a solution
Because they are test vehicles and aren't meant to cover all corner cases of a program, they typically have flaws when
applied in general
An attempt to convert a prototype into a product can result in more time being spent covering up the flaws than
designing and implementing the program correctly would have taken
There is a software debt that is accumulated when the prototype becomes the implementation
Paid in support and maintenance as well as future design goal growth
If there isn't time to both write a prototype and write the production-version correctly, don't do the prototype
Spend time in the specification
Keep the implementation generalized until all corner cases are known
Optimize the implementation only once it handles all of the known input test data types, and only with runtime information
available to identify hotspots in the code
Contract Programming
Definition
Wikipedia: "Design by contract (DbC), also known as contract programming, programming by contract and design-by-contract
programming, is an approach for designing software. It prescribes that software designers should define formal, precise and
verifiable interface specifications for software components, which extend the ordinary definition of abstract data types with
preconditions, postconditions and invariants. These specifications are referred to as "contracts", in accordance with a conceptual
metaphor with the conditions and obligations of business contracts."
Dr. Dobbs: "Programming with Contracts (PwC) is a method of developing software using contracts to explicitly state and test
design requirements. The contract is used to define the obligations and benefits of program elements such as subroutines and
classes." (http://www.drdobbs.com/cpp/programming-with-contracts-in-c/184405997)
Encapsulates several ideas
Precondition: Before an API is invoked, a certain condition is guaranteed to exist
Example: Before reading a file, the file must be open
APIs can insert checks to ensure that the preconditions have been met
Postcondition: Upon completion of the API, the object will match an expected state
Example: Opening a file
Precondition
The file must not already be open
Postconditions
The file will be opened if it exists
A status flag will indicate the current state of the object opening the file, including potential error states
Class Invariants: A property of the class is required to be true both before the call to an API and after the completion of an API
Example: Calling an API in a threaded environment
Invariant is that some mutex was locked upon entering the API and is still locked upon exiting the API
Contract programming typically deals with public interfaces
Interfaces that are private may not need the same level of scrutiny since they can only be called by APIs of the object
For an in-depth discussion, see Public, Private, and Contract Programming in the Appendix
One reason that private interfaces tend to be private is that access to these APIs is limited
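The file-opening contract above might be expressed with assert() roughly as follows. This is a sketch, not a prescribed design: the class name TextFile, its State enumeration, and the private helper invariantHolds() are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdio>

/* Hypothetical class illustrating the file-opening contract */
class TextFile
{
public:
    enum State { NOT_OPEN, OPEN, OPEN_FAILED };
private:
    std::FILE *mFile;
    State      mState;

    /* Class invariant: mFile is non-NULL exactly when mState is OPEN */
    bool invariantHolds() const
    {
        return (mFile != NULL) == (mState == OPEN);
    }

    /* Copying is disallowed in this sketch (declared, never defined) */
    TextFile(const TextFile &other);
    TextFile &operator=(const TextFile &other);
public:
    TextFile() : mFile( NULL ), mState( NOT_OPEN ) {}
    ~TextFile() { if (mFile) std::fclose( mFile ); }

    State getState() const { return mState; }

    void open(const char *path)
    {
        assert( invariantHolds() );
        assert( mState != OPEN );       /* precondition: not already open */
        mFile  = std::fopen( path, "r" );
        mState = mFile ? OPEN : OPEN_FAILED;
        /* postcondition: the state reports the outcome, including errors */
        assert( mState == OPEN || mState == OPEN_FAILED );
        assert( invariantHolds() );
    }

    int readChar()
    {
        assert( mState == OPEN );       /* precondition: file must be open */
        return std::fgetc( mFile );
    }
};
```

The checks sit in the public APIs, where callers may violate the contract. The private helper is only ever called from inside the class and carries no checks of its own.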
Private Interfaces
class SomeClass
{
...
private:
void assign(const SomeClass &copy);
void destroy();
void init();
public:
SomeClass();
SomeClass(const SomeClass &copy);
~SomeClass();
...
SomeClass &operator=(const SomeClass &copy);
};
////////////////////////////////////////////////////////////////////////////////
// Private APIs below //
////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
// Public APIs below //
////////////////////////////////////////////////////////////////////////////////
SomeClass::SomeClass()
{
init();
}
SomeClass::~SomeClass()
{
// We destroy the instance and then reset all variables. If somebody tries
// to access the instance after destruction, the old contents will no longer
// be accessible because init() has reset the variables. This means that,
// during program execution, we can detect the deletion instead of potentially
// getting data back that *looks* like it's valid, even though it's not.
destroy();
init();
}
In SomeClass, contracts are spelled out for the APIs assign(), destroy(), and init()
Because they are private APIs, they can only be called from other APIs of the class
It's easier to guarantee internal consistency and usage of private APIs, which can reduce the verification burden
The "contract" is spelled out within the comments and may or may not be further enforced through runtime or compile-time
checks
Public Interfaces
Interfaces that are public require a greater degree of checking because the order of calls to a public API, and the state of an object
when the call is made, may not follow the "contract"
The C macro assert() is used to test the validity of a condition
If the condition argument to the macro is false, the macro prints the textual condition, file, and line number, and the
program terminates
This is a runtime check: a contract violation does not stop compilation
Some contracts are public contracts but are internally maintained
Example: In a netlist, the net always points to the pins it connects to, and each pin points to the net it connects to
If a user calls removePin(), the pin will be removed from the netlist and the net
If the user calls removeNet(), the net will be removed from the netlist, and the pins will be updated to reflect that they
are unconnected
This is an example of a class invariant
Example: A fast square-root routine for numbers between 1.0 and 50.0 will never result in an error of more than 0.01 when
compared to the slower runtime library API (sqrt())
This is supposed to be a fast square root algorithm, so shipping the product with this assertion enabled will actually be slower
than just calling the normal sqrt() API
Pre- and post-condition assertions are seen in the example
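A minimal sketch of such a routine is shown here. The function name fastSqrt() and the use of four Newton-Raphson steps from a fixed starting guess are assumptions standing in for whatever fast algorithm is actually used. The assertions mirror the stated contract.

```cpp
#include <cassert>
#include <cmath>

/*
 * Hypothetical stand-in for a fast square root: four Newton-Raphson
 * steps from a fixed starting guess. Only valid for 1.0 <= x <= 50.0.
 */
double fastSqrt(const double x)
{
    /* Precondition: the input must lie in the supported range */
    assert( x >= 1.0 && x <= 50.0 );

    double guess = 0.5 * (1.0 + x);
    for (unsigned i = 0; i < 4; i++)
    {
        guess = 0.5 * (guess + x / guess);
    }

    /* Postcondition: within 0.01 of the runtime library's sqrt() */
    assert( std::fabs( guess - std::sqrt( x ) ) <= 0.01 );
    return guess;
}
```

The postcondition calls sqrt() itself, so a build with assertions enabled pays for both routines. Defining NDEBUG before including <cassert> compiles the checks out.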
Assertion Debugging
The assert() macro takes a single condition
A debug API that returns a bool can be used as an argument to the assert() macro for complex assertion checking
Example: Graph debugging (this came from a real situation)
Suppose you had a splice operation on a graph that you thought was producing invalid results under certain outlier
circumstances
The state of this object's graph should be complete as should the graph being spliced in
Upon return from the splicing API, this object's graph should still be complete with the passed graph spliced into it
Multiple debugging APIs can be written that validate the state of the graph at each point in the splicing operation
The assert() macro can be sprinkled liberally throughout the API to validate the state of the instance at each point in the
operation by invoking these debug APIs
The assertion checks may slow down the splicing operation by 2-3 orders of magnitude, but they are not meant to be used in the
final product
Compile with assertions enabled and let it run on all test cases over the weekend
The bug was fixed, the assertions removed, but the debug code kept in case it was ever needed in the future
It's OK to have debug APIs that aren't used in production code, though they also require maintenance as the class
evolves
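The pattern of passing a bool-returning debug API to assert() can be sketched as below. The Graph class here is a hypothetical, much simplified undirected graph, and debugIsConsistent() checks only that every edge is recorded in both directions. The validation in the real situation would have been more involved.

```cpp
#include <cassert>
#include <map>
#include <set>

/* Hypothetical undirected graph used to illustrate debug APIs */
class Graph
{
    typedef std::set<int>          NodeSet;
    typedef std::map<int, NodeSet> Adjacency;
    Adjacency mAdjacency;
public:
    void addEdge(const int a, const int b)
    {
        mAdjacency[ a ].insert( b );
        mAdjacency[ b ].insert( a );
    }

    bool hasEdge(const int a, const int b) const
    {
        const Adjacency::const_iterator it = mAdjacency.find( a );
        return it != mAdjacency.end() && it->second.count( b ) != 0;
    }

    /*
     * Debug API: returns true when every edge a-b is also recorded as
     * b-a. Deliberately thorough and slow; meant only for assert().
     */
    bool debugIsConsistent() const
    {
        for (Adjacency::const_iterator node = mAdjacency.begin();
             node != mAdjacency.end(); ++node)
        {
            for (NodeSet::const_iterator peer = node->second.begin();
                 peer != node->second.end(); ++peer)
            {
                if (!hasEdge( *peer, node->first ))
                {
                    return false;
                }
            }
        }
        return true;
    }

    /* Merge every edge of other into this graph */
    void splice(const Graph &other)
    {
        assert( debugIsConsistent() );        /* precondition on this  */
        assert( other.debugIsConsistent() );  /* precondition on other */
        for (Adjacency::const_iterator node = other.mAdjacency.begin();
             node != other.mAdjacency.end(); ++node)
        {
            mAdjacency[ node->first ].insert( node->second.begin(),
                                              node->second.end() );
        }
        assert( debugIsConsistent() );        /* postcondition         */
    }
};
```

With assertions compiled out, debugIsConsistent() is never called from splice(), so the production cost is zero while the debug API stays available for future investigations.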
Summary
Before coding, create the best specifications possible
Avoid the trap of early optimization
Optimization can be performed once the hotspots of a program have been identified
Utilize contract programming to define the interfaces and expectations
Enforce expectations through the use of compile-time and runtime checks
Preconditions validate input assumptions and the state of an object upon calling an API
Postconditions validate assumptions of the result and the state of an object upon completion of an API
Invariants validate that the state of a variable or object remains the same upon completing the API as when the API started
The C assert() macro is one tool for contract programming, but there are others
Be aware of the hidden costs of the assert() macro
Ensure that your use of the macro is consistent with the understanding of everybody else contributing code to your project
Abstraction
Definition
Google ("abstraction definition"): "(1) the quality of dealing with ideas rather than events (2) freedom from representational qualities in art"
Dictionary.com: "the act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or
actual instances"
What Does It Mean With Respect To OO Programming?
Hiding the actual implementation from the user while providing the necessary functionality, accessible via methods
Providing a simplified interface to complex mechanisms
Encapsulation
Encapsulation is a technique, used in C++, whereby the details of the implementation are unavailable to the programmer except via API
Often people interchange the two terms (encapsulation and abstraction) when talking about C++
Abstraction is the concept while encapsulation is the mechanism
Date Classes in C
struct Date1
{
unsigned mMonth:4,
mDay:5,
mYear:6; /* Add 1970 to get actual year */
};
struct Date2
{
unsigned mDaysSinceEpoch;
};
#include <time.h>
struct Date3
{
time_t mSecondsAfterEpoch;
};
All 3 implementations can be used to store a date, but all 3 are dramatically different implementations.
Let's assume that one of these implementations has been chosen by HR to hold the important dates for a new .com startup company. The
struct is used in multiple programs in multiple departments in the company. Dates represented include:
Birthdays for all employees in the company (everybody in the company is in their 30's or younger at the time that the code was written)
Holidays for the company
Founding date for the company
Date for various stock options and stock grants
Others....
In addition to using the date in a lot of different programs, HR has also developed a lot of routines around the date struct to make programming
with this date representation easier. Of course, not every API was available at once, and not everybody chose to use these APIs in the code when
they wrote the program, so the use of the code is haphazard at best. Such APIs might include the following:
The board of directors has now determined that the company needs some experience leading the company due to its success gaining traction in
the market. They've brought on a new CEO whose birthday is before 1970. The CEO wants to aggressively grow the company, and that growth
will bring in even more people that can't be represented using the C date implementation. Now HR has to go back to the drawing board and
re-implement the date struct. That's the easy part. The hard part is finding all of the existing code and retrofitting it with the new date
representation so that all of the code will work correctly.
Consider the C++ class used to represent a date (below). The actual implementation of the class is not shown because, from the perspective
of somebody using the class, it doesn't matter. As long as the class member APIs return the correct result, the actual implementation is
irrelevant to the programmer using the class.
private:
// Implementation details would go here. They aren't being
// shown because we could pick any implementation we want to
// and it wouldn't make a difference to users of this class
public:
Date();
Date(const Date &copy);
~Date();
There are several things to note in this C++ class and the way that it hides the actual implementation:
The class defines a mechanism to copy a date (overloading the "=" operator). In C, assigning one struct to another will cause a
memory copy between them. This may or may not be desirable depending on the implementation. Copying integers is fine, but copying
pointers can lead to issues if two different structs point to the same memory, and a routine attempts to free the same memory twice.
The operations that the HR dept needs to support are all available via 3 APIs. Additional APIs can be added as people request more
functionality.
The actual implementation of the class isn't shown, but it would be in the private section of the class. The private qualifier means
that the implementation is only accessible to APIs defined within the Date class. In the real code, an implementation would be explicitly
given. However, knowing what the implementation is doesn't help anybody from outside the class because they cannot access the
class internals.
The class supports a public embedded enumeration of days. This enumeration is scoped within the class so that it cannot conflict with
enumerations in other class or global enumerations. There is no need to worry about another programmer providing an enumeration for
SUNDAY, for example, that will conflict with this enumeration.
Though C macros are still problematic
Because the enumeration is public, users of the class can create variables of this type and check return values from
getDayOfWeek() against one of the enumeration values.
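To make the notes above concrete, here is one possible sketch of such a class. The private representation (separate year, month, and day integers), the three-argument constructor, and the use of Sakamoto's day-of-week algorithm are all assumptions of this sketch. The point is that callers depend only on the public APIs, so dates before 1970 are representable without any change to calling code.

```cpp
#include <cassert>

class Date
{
public:
    /* Scoped inside the class so SUNDAY etc. cannot clash with other enums */
    enum DayOfWeek
    {
        SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY
    };
private:
    /* One possible hidden representation (an assumption of this sketch).
     * Callers cannot depend on it, so it can change without breaking them. */
    int mYear;
    int mMonth;   /* 1..12 */
    int mDay;     /* 1..31 */
public:
    Date(const int year, const int month, const int day)
        : mYear( year ), mMonth( month ), mDay( day )
    {
        assert( month >= 1 && month <= 12 );
        assert( day >= 1 && day <= 31 );
    }

    int getYear() const { return mYear; }

    /* Day of week for Gregorian dates, via Sakamoto's algorithm */
    DayOfWeek getDayOfWeek() const
    {
        static const int offsets[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
        const int y = (mMonth < 3) ? mYear - 1 : mYear;
        const int index = (y + y / 4 - y / 100 + y / 400
                           + offsets[ mMonth - 1 ] + mDay) % 7;
        return static_cast<DayOfWeek>( index );
    }
};
```

A caller writes, for example, "if (date.getDayOfWeek() == Date::SUNDAY)" and never learns how the date is stored.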
Simplified Interfaces
Software engineering is often concerned with giving the user the illusion of simplicity
The outside interface may be fairly simple to perform a task
The implementation may be much harder to understand and trickier to perform correctly
Abstraction can be utilized to provide a simple interface, bound to an object, while hiding the actual implementation
Example: Thread pool
Example taken from multi-threading tutorial (http://wiki.cadence.com/confluence/display/~jcroix/Multi-Threading+Tutorial#Multi-ThreadingTutorial-ThreadPool)
Official source code in PUSH (Performance Utility Software Hub)
See Software Reuse portal at http://quality.cadence.com
...
};
/*
* Create a thread pool to manage 8 different threads at once. Submit
* 1024 jobs to the thread pool and then wait for them all to complete
*/
...
FIFOThreadPool threadPool;
threadPool.start( 8 );
for (unsigned i = 0; i < 1024; i++)
{
/*
* For each thread, create and initialize a data structure for it to
* operate against. Pass that data in and let the API delete the memory
* when it's done with it.
*/
/*
* Submit a job to the thread pool. The "job" is a single API
* (someThreadedAPI) that will be passed the variable "scs" as its
* sole parameter. The third parameter is set to false to tell the
* system not to retain the return value from the API.
*/
threadPool.add( &someThreadedAPI, scs, false );
}
/*
* Now that all 1024 jobs have been submitted to the thread pool
* queue, wait indefinitely for all of the threads to finish
*/
threadPool.waitAllJobs();
The FIFOThreadPool class hides the complexity associated with pthread programming (mutex variables, condition variables, race
conditions, deadlocks, etc.)
There's a LOT of complexity (http://wiki.cadence.com/confluence/display/~jcroix/Multi-Threading+Tutorial#Multi-ThreadingTutorial-ThreadPool)
The user only has 4 lines of code associated with the thread pool
Constructor: Create the thread pool
Initialization: Start up 8 threads that the thread pool will manage
Submission: Submit 1024 different jobs to the queue. Only 8 will run at any given time. Jobs will be executed in first-in-first-out
order
Synchronization: Wait until all threads complete before moving forward
Natural Interfaces
Consider the case of a class to hold a complex number
class ComplexNumber
{
double mReal;
double mImaginary;
public:
/*
* Define multiple constructors for different ways of initializing
* an instance of the class
*/
ComplexNumber();
ComplexNumber(const double real);
ComplexNumber(const double real, const double imaginary);
ComplexNumber(const ComplexNumber &copy);
~ComplexNumber();
/*
* Access elements of the class
*/
/*
* Provide mathematical operators so that operations on a ComplexNumber
* instance look just like operations on scalars (floats, doubles)
*/
/*
* Define some standalone APIs for ComplexNumber
*/
/*
* Define member APIs for ComplexNumber
*/
ComplexNumber::ComplexNumber()
{
mReal = 0.0;
mImaginary = 0.0;
}
ComplexNumber::~ComplexNumber()
{
mReal = mImaginary = 0.0;
}
/*
* With mathematical operators defined, a ComplexNumber instance can be
* operated upon just like a scalar
*/
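The operator declarations are not shown in the listing above. A minimal sketch of what they might look like follows, assuming member operators for +, *, and == (the exact set and signatures in the course code are not known). The function complexNumberExample() is a hypothetical usage example.

```cpp
#include <cassert>

class ComplexNumber
{
    double mReal;
    double mImaginary;
public:
    ComplexNumber() : mReal( 0.0 ), mImaginary( 0.0 ) {}
    ComplexNumber(const double real, const double imaginary)
        : mReal( real ), mImaginary( imaginary ) {}

    double getReal() const      { return mReal; }
    double getImaginary() const { return mImaginary; }

    ComplexNumber operator+(const ComplexNumber &other) const
    {
        return ComplexNumber( mReal + other.mReal,
                              mImaginary + other.mImaginary );
    }

    /* (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
    ComplexNumber operator*(const ComplexNumber &other) const
    {
        return ComplexNumber(
            mReal * other.mReal - mImaginary * other.mImaginary,
            mReal * other.mImaginary + mImaginary * other.mReal );
    }

    bool operator==(const ComplexNumber &other) const
    {
        return mReal == other.mReal && mImaginary == other.mImaginary;
    }
};

/* Usage reads like scalar arithmetic */
void complexNumberExample()
{
    const ComplexNumber a( 1.0, 2.0 );
    const ComplexNumber b( 3.0, 4.0 );
    const ComplexNumber c = a * b + a;   /* (-5 + 10i) + (1 + 2i) */
    assert( c == ComplexNumber( -4.0, 12.0 ) );
}
```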
class BadAbstraction
{
int *mVectorOfInts;
unsigned mVectorSize;
public:
...
unsigned getSize() const;
const int *getVector() const;
int getValue(const unsigned index) const;
};
The class has a routine that gets the size of the vector, returns the vector, and returns an integer at the specific point in the vector
This locks the object so that it fundamentally must use a vector for the underlying implementation
For example, you can't switch the vector of integers into a red/black tree of integers without causing problems for a user of the
class
In a tree, the order in which you add things won't be the order in which you retrieve things
In general, duplicate items cannot exist in a tree
The getValue() API becomes O(n) instead of O(1). To get the 5th element, you need to start with the 1st element in
the tree and traverse the tree to get to the 5th. In a vector, you go straight to it.
Without also changing the code that uses the class, the API getVector() would cause a memory leak since a vector
would have to be allocated and populated
The callers would have to explicitly change their code to destroy the returned vector
Creating the vector from a red/black tree is also O(n) instead of an O(1) operation to return an actual vector held in the
base class
Iterators, to be covered later in the course, provide a means of abstraction that virtually eliminates this issue
Even though the user doesn't have direct access to the data structure, the effect is the same
By encapsulating data members as private, the compiler limits visibility of the data members to just the methods that are part of the
class
Abstraction is the natural consequence of encapsulation as external access to the data must be provided via APIs
Though a programmer can see the C++ header to identify the storage mechanisms used, they can only access the data via APIs
that are made public
The programmer of the object is then free to change the underlying data representation as long as the APIs don't change
There's no contract between the creator of the object and the user of the object that says that the underlying
implementation won't change. It can and it might.
Changing an API may cause compilation failures if the changed API is used
A consistent public API is a contract between the creator of the object and the user of the object
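As a preview of how iterators avoid the problem, the sketch below reworks the idea behind BadAbstraction so that callers iterate instead of indexing or receiving the raw array. The names IntCollection, sumValues(), and iterationExample() are hypothetical.

```cpp
#include <cassert>
#include <vector>

/* Hypothetical rework of BadAbstraction: callers iterate, never index */
class IntCollection
{
    /* Hidden storage. It could become, say, std::multiset<int> (with add()
     * calling insert()) and the caller code below would compile unchanged. */
    typedef std::vector<int> Storage;
    Storage mValues;
public:
    typedef Storage::const_iterator Iterator;

    void add(const int value) { mValues.push_back( value ); }
    Iterator begin() const    { return mValues.begin(); }
    Iterator end() const      { return mValues.end(); }
};

/* Caller code depends only on Iterator, not on the container behind it */
long sumValues(const IntCollection &collection)
{
    long total = 0;
    for (IntCollection::Iterator it = collection.begin();
         it != collection.end(); ++it)
    {
        total += *it;
    }
    return total;
}

void iterationExample()
{
    IntCollection collection;
    collection.add( 1 );
    collection.add( 2 );
    collection.add( 3 );
    assert( sumValues( collection ) == 6 );
}
```

Because the public API promises neither O(1) indexing nor a contiguous array, the interface no longer forces a vector implementation.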
Thread Abstraction
class MultiThreadedClass
{
enum ThreadState
{
NOT_STARTED,
RUNNING,
FINISHED,
KILLED,
};
#if defined(__LINUX__)
typedef pthread_t ThreadType;
#elif defined(__WINDOWS__)
typedef HANDLE ThreadType;
#elif defined(__MACOS__)
typedef IOThread ThreadType;
#else
#error "Unsupported platform"
#endif
ThreadType *mThreads;
ThreadState *mState;
unsigned mNumThreads;
private:
// Do not allow the copy constructor or assignment operator
// Can't copy thread states
MultiThreadedClass(const MultiThreadedClass &copy);
MultiThreadedClass &operator=(const MultiThreadedClass &copy);
public:
MultiThreadedClass();
~MultiThreadedClass();
...
};
Different operating systems have different ways of creating and destroying threads
Abstraction can hide the OS-specific issues within APIs of the object so that the programmer doesn't have to deal with them
The same class can now be used on multiple platforms without having to force a user to alter their code since the APIs abstract
away the OS-specific data types and operations on those types
The copy constructor and assignment operator have been disallowed
No definition will be given for those APIs (declared but not defined)
Copying a thread doesn't make any sense
If somebody outside the class attempts to use them, the compiler will report an error since they are private
The object author can't use them because no body will be provided for them (causing a link error)
The values in mState must exactly track the state of each thread in mThreads on a 1-to-1 basis
Abstracted operations on threads will also alter the state in tandem
The class author should not provide an API to modify one variable in isolation from the other
No opportunity for another programmer to accidentally desynchronize the two variables
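The declared-but-not-defined idiom can be checked in isolation with a small class. ThreadTable and its start() API are hypothetical names used only for this sketch, and the static_assert checks assume a C++11 or later compiler.

```cpp
#include <cassert>
#include <type_traits>

class ThreadTable
{
    unsigned mNumThreads;
private:
    /* Declared but never defined: outside callers get a compile error
     * (private), and the class's own APIs get a link error (no body). */
    ThreadTable(const ThreadTable &other);
    ThreadTable &operator=(const ThreadTable &other);
public:
    ThreadTable() : mNumThreads( 0 ) {}

    void start(const unsigned numThreads)
    {
        assert( mNumThreads == 0 );   /* precondition: not yet started */
        mNumThreads = numThreads;
    }

    unsigned getNumThreads() const { return mNumThreads; }
};

/* Compile-time confirmation (C++11 or later) that copying is rejected */
static_assert( !std::is_copy_constructible<ThreadTable>::value,
               "ThreadTable must not be copy constructible" );
static_assert( !std::is_copy_assignable<ThreadTable>::value,
               "ThreadTable must not be copy assignable" );
```

In C++11 and later the same intent can be stated directly by declaring the two functions with "= delete", which turns the link error for the class author into a compile error as well.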
Summary
Abstraction is the separation of functionality from the implementation details
Encapsulation is the mechanism used in C++ that enforces the hiding of implementation details
The keywords public, protected, and private are the means through which encapsulation is accomplished in C++
By hiding the implementation, a programmer can alter the implementation as required, without causing users of the object to redesign
their code (as long as the object interface remains backward compatible)
An implementation can still be exposed through poor API use, even if access to the underlying data members is not provided
This breaks a tenet of OO programming
1 Instantiation is the process of creating a concrete representation of an object. In C++, a class, like a struct, is simply a definition. Creating a
variable on the stack (i.e., "struct SomeStruct x;") creates an instance of that class or struct that can be operated against and that can
hold state (assuming that there are variables in the class or struct). In C, dynamic memory is allocated via malloc(), and the programmer
stamps the definition of the struct onto that memory via typecasting. The memory is just a collection of bytes that can
be returned via a call to free(). In C++, the new operator both allocates memory and instantiates a concrete representation of the class or
struct in that memory by applying the constructor to it. The delete operator is responsible for invoking the destructor on that memory
block and returning the allocated memory to the system.
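The difference between allocating bytes and instantiating an object can be observed with a counter. The struct Tracked and the function instantiationExample() are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <cstdlib>

struct Tracked
{
    /* Counts instances that have been constructed and not yet destroyed */
    static int sLiveInstances;
    Tracked()  { ++sLiveInstances; }
    ~Tracked() { --sLiveInstances; }
};
int Tracked::sLiveInstances = 0;

/* Records the live-instance count observed at each step */
void instantiationExample(int counts[4])
{
    /* malloc() only hands back bytes; no constructor runs */
    void *raw = std::malloc( sizeof( Tracked ) );
    assert( raw != NULL );
    counts[0] = Tracked::sLiveInstances;
    std::free( raw );

    /* new allocates and then applies the constructor */
    Tracked *instance = new Tracked;
    counts[1] = Tracked::sLiveInstances;

    /* delete invokes the destructor and then releases the memory */
    delete instance;
    counts[2] = Tracked::sLiveInstances;

    /* A stack variable is instantiated at its declaration and
     * destroyed when it goes out of scope */
    {
        Tracked onStack;
        counts[3] = Tracked::sLiveInstances;
    }
}
```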
Modularity
Definitions
Wikipedia: "Modular programming (also called "top-down design" and "stepwise refinement") is a software design technique that
emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything
necessary to execute only one aspect of the desired functionality. Conceptually, modules represent a separation of concerns, and
improve maintainability by enforcing logical boundaries between components. Modules are typically incorporated into the program
through interfaces. A module interface expresses the elements that are provided and required by the module. The elements defined in
the interface are detectable by other modules. The implementation contains the working code that corresponds to the elements declared
in the interface."
Modularity
Concept is closely linked to that of abstraction
It's the mapping of a problem into subproblems
In OO programming, modules are implemented with classes and namespaces
Modular programming does not require objects
It can be done in any language
Objects naturally map to a modular framework
Communication between the objects is accomplished via the methods that the objects provide
This is typically the hardest part of learning about OO programming
New OO programmers tend to want to make every small data element an object
Some languages support this natively (Smalltalk, Simula, Ruby, ...). C++ doesn't.
In C++, int, float, etc. are not objects, cannot be inherited from, and don't have APIs bound to them
Determining the proper size and scope of an object is an art that's learned over time
"Choose the elements so that they are as independent as possible; that is, elements with low external complexity (low coupling)
and high internal complexity (high cohesion)" (from http://www.theenterprisearchitect.eu/archive/2006/09/21/the-art-of-object-oriented-programming-oop)
Benefits
A large program can be broken up into a number of independent modules
Interfaces between the objects must be agreed upon by those programmers whose objects will communicate with each other
Protocols are also established to define when interfaces can be utilized
Programmers can work on their own modules/objects independently as long as the agreed-upon interfaces and protocols remain intact
Individual unit tests can be created for modules, often independent of the other modules
Where intermodule communication must be performed, "stub" classes can often be used to simulate responses and/or calls from
other modules
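As a rough illustration of the "stub" idea above, the sketch below tests one module against a stand-in for another. The names PriceSource, ReportBuilder, StubPriceSource, and stubbedTotal() are hypothetical and do not come from the course material; the virtual keyword used here is covered later in the notes.

```cpp
#include <string>

// Hypothetical agreed-upon interface between two modules.
class PriceSource
{
public:
    virtual ~PriceSource() {}
    virtual unsigned priceOf(const std::string &item) const = 0;
};

// Module under test: it depends only on the interface, not on the real
// implementation owned by another programmer.
class ReportBuilder
{
    const PriceSource &mSource;
public:
    explicit ReportBuilder(const PriceSource &source) : mSource( source ) {}
    unsigned totalFor(const std::string &a, const std::string &b) const
    {
        return( mSource.priceOf( a ) + mSource.priceOf( b ) );
    }
};

// Stub that simulates the other module's responses with canned values.
class StubPriceSource : public PriceSource
{
public:
    unsigned priceOf(const std::string &item) const
    {
        return( item == "tire" ? 100u : 25u );
    }
};

// Unit test helper: exercises ReportBuilder without the real PriceSource.
inline unsigned stubbedTotal()
{
    StubPriceSource stub;
    ReportBuilder builder( stub );
    return( builder.totalFor( "tire", "cap" ) );
}
```

Because ReportBuilder only sees the interface, the real PriceSource can be swapped in later without changing the test's structure.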
Unit Testing
Unit testing of a C struct isn't possible because the struct only contains data
APIs that utilize the struct can be littered throughout the code
A given routine may only utilize a portion of the data in the struct, making it harder to test
Even after testing a data structure and APIs that use that data structure, it's impossible to keep other programmers from using
the data structure in unsupported ways or to force them to utilize tested APIs
In OO programming, only the interfaces need to be tested (from the perspective of the object user)
The public interfaces require testing because they're the method through which operations occur on the objects
Object user must implicitly assume that non-public portions of the object work correctly
Only the object programmer can provide unit tests for private and protected APIs
In C++, private APIs can make assumptions about the data content because the class itself can guarantee the condition of the
data
C assert() macros can validate the assumptions
Example: A class that has a text string as one of its components
Other components are not shown for brevity's sake
private:
/*
* The APIs below can only be invoked by other APIs within this class.
*/
public:
SomeClass();
SomeClass(const SomeClass &copy);
~SomeClass();
...
void setText(const char *newText);
const char *getText() const;
...
SomeClass &operator=(const SomeClass &other);
};
//////////////////////////////////////////////////////////////////////////////
// PRIVATE APIs //
//////////////////////////////////////////////////////////////////////////////
mTextSize = other.mTextSize;
if (mTextSize)
{
mText = new char[ mTextSize ];
memcpy( mText, other.mText, mTextSize );
}
else
{
mText = NULL;
}
...
}
delete[] mText;
...
}
mText = NULL;
mTextSize = 0;
...
}
inline SomeClass::SomeClass()
{
/*
* Instance is currently in an invalid state. Call init() to set the
* initial state
*/
init();
}
assign( copy );
}
inline SomeClass::~SomeClass()
{
/*
* Instance is currently in a valid state. Call destroy() to return
* dynamic memory to the heap, making the state invalid. Then call
* init() so that the instance is reset. We do this so that any
* attempt to access the instance after destruction won't yield
* results that reflect the prior state, potentially making the
* instance look like it's still valid.
*/
destroy();
init();
}
return( mText );
}
delete[] mText;
mTextSize = text ? strlen( text ) : 0;
if (mTextSize)
{
mText = new char[ mTextSize + 1 ];
memcpy( mText, text, mTextSize + 1 );
}
else
{
mText = NULL;
}
}
if (&other != this)
{
/*
* The current instance is valid, so destroy its
* contents. Then, from this invalid state, make a copy
* of the passed instance within the current instance
*/
destroy();
assign( other );
}
return( *this );
}
The APIs assign(), init(), and destroy() are helper functions that can only be called by other APIs of the object
They localize common operations that are performed on the data (code factorization)
Only APIs within this object can invoke these private APIs
They make assumptions about the state of the object
Calling assign() twice in a row will cause a memory leak
Because it's a private API, we can guarantee within the object itself that assign() will never be called twice in a row
If the assign() API was public, no such guarantee could be made
The validate() API can be used to validate the state of the object
An object can be validated at each public API via assert() (as shown)
In this case, you don't want assert() executed in shipped code, since the runtime penalty could be large depending on the complexity
of the validation code
Validation assertions can also be included in private APIs, judiciously
They can't be placed at the beginning of init(), since init() is called to initialize an object instance and the state may be invalid prior to
initialization
External unit testing can be performed only on the public APIs
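The body of validate() is not shown in the listing above, so the following is only a minimal sketch of how such a check might look. TextHolder is a hypothetical, cut-down stand-in for SomeClass, and the invariant it checks (the stored size matches the string length) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class TextHolder
{
    char *mText;             // NULL when empty, else a NUL-terminated copy
    std::size_t mTextSize;   // strlen( mText ), or 0 when mText is NULL

    // Copying is omitted for brevity: declared private and never defined.
    TextHolder(const TextHolder &);
    TextHolder &operator=(const TextHolder &);

    // Private helper: true when the data members agree with each other.
    bool validate() const
    {
        if (mText == NULL)
        {
            return( mTextSize == 0 );
        }
        return( std::strlen( mText ) == mTextSize );
    }
    // Private helper: must NOT assert on entry; the state may be garbage.
    void init()
    {
        mText = NULL;
        mTextSize = 0;
    }
    void destroy()
    {
        delete[] mText;
    }
public:
    TextHolder() { init(); assert( validate() ); }
    ~TextHolder() { assert( validate() ); destroy(); init(); }

    void setText(const char *newText)
    {
        assert( validate() );   // compiled out when NDEBUG is defined
        const std::size_t size = newText ? std::strlen( newText ) : 0;
        char *fresh = NULL;
        if (size)
        {
            fresh = new char[ size + 1 ];
            std::memcpy( fresh, newText, size + 1 );
        }
        destroy();              // release the old text only after copying
        mText = fresh;
        mTextSize = size;
        assert( validate() );
    }
    const char *getText() const
    {
        assert( validate() );
        return( mText );
    }
};
```

Building with NDEBUG defined removes every assert() call, which is one way to keep the validation cost out of shipped code.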
Composition
Combining simpler building blocks (in this case, objects) into more complex ones
Composition is not limited to objects
C Composition
/*
* This is an example of composition in C
*/
struct CDate
{
unsigned mMonth;
unsigned mDay;
unsigned mYear;
};
The C struct shows how 3 unsigned values are integrated to form a date
In procedural languages, where data and interfaces are not bound together, composition consists only of combining data
members
In OO programming, composition consists of both data and API members
C++ Example
Employee information may contain a person's name and birthdate
The name can be represented by a string object
The birthday can be represented by a date object
Much easier to instantiate multiple object elements to create the composite object than to have low-level data elements and APIs
that must be written, tested, and maintained
public:
Date();
Date(const Date &copy);
~Date();
...
Date &operator=(const Date &other);
};
class Name
{
std::string mFirst;
std::string mLast;
char mMiddleInitial;
public:
Name();
Name(const Name &copy);
~Name();
...
Name &operator=(const Name &other);
};
class Employee
{
Date mBirthDate;
Date mHireDate;
Name mName;
...
public:
Employee();
Employee(const Employee &copy);
~Employee();
...
Employee &operator=(const Employee &other);
};
In this example, the Date class is used to represent 2 different dates: birthday and hire date
Unit testing of the Employee class may involve validation that hire date is after birthday, but no need to test Date class
Date class testing is the responsibility of the Date object programmer
Can later enhance class to include a termination date, simply by adding another Date instance
If the CDate struct is used instead of the C++ class Date, separate APIs must also be provided to manipulate the CDate
struct so that the Employee can still be treated as a single entity
The Employee class is simply a container of other objects
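To make the composition point concrete, here is a small sketch in which an Employee-level rule (hire date after birth date) is built entirely from the Date API. The Date members, the isBefore() API, and hasPlausibleDates() are hypothetical; the real class bodies are not shown above.

```cpp
#include <string>

// Simplified stand-in for the Date class.
class Date
{
    unsigned mYear;
    unsigned mMonth;
    unsigned mDay;
public:
    Date(const unsigned year, const unsigned month, const unsigned day)
        : mYear( year ), mMonth( month ), mDay( day ) {}
    // Hypothetical comparison API, tested by the Date programmer.
    bool isBefore(const Date &other) const
    {
        if (mYear != other.mYear) return( mYear < other.mYear );
        if (mMonth != other.mMonth) return( mMonth < other.mMonth );
        return( mDay < other.mDay );
    }
};

class Employee
{
    Date mBirthDate;
    Date mHireDate;
    std::string mName;
public:
    Employee(const std::string &name, const Date &birth, const Date &hire)
        : mBirthDate( birth ), mHireDate( hire ), mName( name ) {}
    // Employee-level rule; no date arithmetic is re-implemented here.
    bool hasPlausibleDates() const
    {
        return( mBirthDate.isBefore( mHireDate ) );
    }
};
```

An Employee unit test would only need to check hasPlausibleDates(); whether isBefore() compares dates correctly is the Date programmer's responsibility.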
Summary
OO languages do not enforce modularity or good coding practices
OO languages provide a framework that, when properly applied, lends itself to modularity
Taking advantage of OO constructs to build modular objects yields benefits in speed of programming and speed of
testing/debugging
Often yields runtime and memory benefits, too (targeted optimization)
Modular programs are easier to understand and maintain
When new programmers start on the project, modularity can help them come up to speed faster
If you're the author of the code, but haven't touched the code in a long time, a modular approach can make it easier for you to
relearn your own code
Bugs can be easier to find in modular code since bugs tend to be contained within a module instead of spread throughout a
program
By applying the principles of modularity, it's possible to re-implement an object for speed, memory, or accuracy gains, independent of the
rest of the program, as long as the API to the object is maintained
Composition is the process of building bigger modules out of other components, grouping them together logically
Assignment #1
This is an optional assignment that is designed to help you become familiar with C++ and object-oriented programming. Given the examples in
class so far, try translating the following C program into a C++ program. A skeleton of the C++ program is provided.
To compile and run the program, go to http://www.compileonline.com/compile_cpp_online.php and paste your C++ program into the window.
Compile and execute the program to verify that it matches the output provided by the C program.
C Program
C Program to Translate
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum Color
{
RED,
GREEN,
BLUE,
BURNT_ORANGE,
OTHER_COLOR
};
enum TireManufacturer
{
GOODYEAR,
MICHELIN,
BRIDGESTONE,
OTHER_MANUFACTURER
};
enum Brand
{
FORD,
GM,
TOYOTA,
HYUNDAI,
OTHER_BRAND
};
struct Car
{
unsigned mModelYear;
unsigned mSalesPrice;
enum Brand mBrand;
enum Color mColor;
enum TireManufacturer mTires;
};
destroyCar( myCar );
destroyCar( myWifesCar );
}
C Program Output
Output
My car
======
Model year: 2011
Sales price: $35000
Tire manufacturer: Goodyear
Color: Burnt orange
Brand: GM
My wife's car
=============
Model year: 2014
Sales price: $50000
Tire manufacturer: Bridgestone
Color: Blue
Brand: Hyundai
class Car
{
public:
enum Brand
{
FORD,
};
private:
unsigned mModelYear;
public:
Car(const Color color, );
~Car();
Car::Car()
{
}
Car myWifesCar;
myWifesCar.setColor( Car::BLUE );
Correction
In a prior class, I stated that the following code would result in a memory leak in the event that a vector is created on the fly in response to a C++
class method.
class BadAbstraction
{
int *mVectorOfInts;
unsigned mVectorSize;
public:
...
unsigned getSize() const;
const int *getVector() const;
int getValue(const unsigned index) const;
};
In class I had said that changing from a vector to a linked-list would mean that a call to getVector() would require creation and population of a
vector of data on the fly. Furthermore, I said that the caller wouldn't be able to return the dynamic memory to the heap because the method is
returning a const pointer. In other words, I said that the following would be illegal.
It was pointed out to me after class that it is, in fact, possible to delete the memory returned in this fashion. The language does allow this. While
there are valid reasons for the language to allow this, IMHO this presents an even bigger issue as a caller can delete something that they
shouldn't. In other words, though I was wrong about the compiler not allowing this construct, it reinforces the problem of breaking abstraction
since the caller can destroy your underlying data structure without explicitly calling an API to the class.
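The point of the correction can be sketched as follows. LeakyAbstraction and lastValueThenFree() are hypothetical names, and the class is only a simplified stand-in for one that builds its vector on the fly.

```cpp
// Simplified stand-in for a class that creates a vector on demand.
class LeakyAbstraction
{
public:
    // Caller receives freshly allocated memory through a pointer-to-const.
    const int *getVector(const unsigned size) const
    {
        int *v = new int[ size ];
        for (unsigned i = 0; i < size; i++)
        {
            v[ i ] = static_cast<int>( i );
        }
        return( v );
    }
};

inline int lastValueThenFree()
{
    LeakyAbstraction obj;
    const int *v = obj.getVector( 4 );
    const int last = v[ 3 ];
    // v[ 3 ] = 0;   // would NOT compile: the ints are const through v
    delete[] v;      // compiles: const restricts modification, not deletion
    return( last );
}
```

The same delete[] applied to a pointer into the class's own storage would also compile, which is exactly the abstraction problem described above.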
Relationships
Inheritance is used to define "is-a" relationships between objects
Example: A shape class
A circle is a shape
A rectangle is a shape
Because a rectangle doesn't have a radius, and a circle doesn't have a width or length, none of those dimensions is
shared by a shape
However, a shape does have an area
Individual shapes have different ways of calculating the area
Parent objects ("superclass" or "base class") provide common attributes and properties for inheriting objects
Inheriting objects ("subclass") are derived from superclass objects and expand upon the definition of the superclass
Inheritance is different from composition
Composition groups a number of (potentially disparate) objects together to form a new object
In inheritance, a subclass expands upon the definition of a superclass
Composition does not require an "is-a" relationship
In OO Programming, both data and APIs are inherited
Simple Example
Consider designing classes for various shapes: rectangle, circle, triangle
Can design a base class (Shape) from which Rectangle, Circle, and Triangle are derived
This can be implemented in C as shown below
C Pseudo Inheritance: Shape
typedef enum
{
CIRCLE,
TRIANGLE,
RECTANGLE,
UNDEFINED_SHAPE
} ShapeType;
By including the superclass (Shape) type within each subclass, we've effectively emulated inheritance
In this example, in order to draw a specific Shape, information in the Shape has to be used to cast the object to a specific type in order to
achieve object-specific behavior
Common code invariably ends up with a switch statement to implement type-specific behavior
To add a new shape, a new struct needs to be created, and all APIs that deal with different types of shapes would need to be updated
to handle yet another shape type
If you don't have full access to the source code for all APIs that operate on different shapes (for example, you only have the
library and header file), such updates may be impossible
C trick: Note that by putting the Shape struct at the beginning of Rectangle, Circle, and Triangle, we can pass any of these to the
renderShape() API
Now consider the C++ implementation below
public:
Shape();
Shape(const Shape &copy);
virtual ~Shape();
...
double getX() const;
double getY() const;
void setX(const double x);
void setY(const double y);
virtual void render() const;
...
Shape &operator=(const Shape &copy);
};
public:
Circle();
Circle(const Circle &copy);
~Circle();
...
double getRadius() const;
void setRadius(const double radius);
void render() const;
...
Circle &operator=(const Circle &copy);
};
public:
Triangle();
Triangle(const Triangle &copy);
~Triangle();
...
double getBase() const;
double getHeight() const;
void setBase(const double base);
void setHeight(const double height);
void render() const;
...
Triangle &operator=(const Triangle &copy);
};
class Rectangle : public Shape
{
double mWidth;
double mHeight;
public:
Rectangle();
Rectangle(const Rectangle &copy);
~Rectangle();
...
double getWidth() const;
double getHeight() const;
void setWidth(const double width);
void setHeight(const double height);
void render() const;
...
Rectangle &operator=(const Rectangle &copy);
};
...
Circle c;
Rectangle r;
Triangle t;
...
c.render();
r.render();
t.render();
...
Things to note
No more enumerations for types of shapes
New syntax for class definitions to support inheritance (": public Shape")
Tells the system that each of the three shapes is derived from the superclass Shape
The keyword public tells the compiler that all data and APIs marked public in Shape are public in the inheriting
classes
Can also specify private and protected – to be covered in later classes
The Shape class defines a render() API to draw the shape
Don't worry about the virtual keyword for now – to be covered in later classes
Used to implement polymorphism
Each type of Shape later redefines the render() API to be able to render the specific type of Shape without an explicit
cast
Because Circle, Triangle, and Rectangle are all derived from Shape, they can be passed to any routine that expects a
Shape instance, without having to perform a typecast
Shape defines APIs getX(), getY(), setX(), and setY()
All objects inheriting from Shape also inherit these APIs
Each subclass adds to Shape with APIs specific to its attributes
Rectangle adds width and height APIs
Triangle adds base and height APIs
Circle adds radius APIs
Shape *shapes[ 3 ];
shapes[ 0 ] = &c;
shapes[ 1 ] = &r;
shapes[ 2 ] = &t;
renderShapes( shapes, sizeof( shapes ) / sizeof( *shapes ) );
With this definition of Shape, a new class can be created without having to modify existing source code
Original source code for APIs that process Shape instances isn't even needed in order to extend the class
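A minimal sketch of this extensibility is shown below. It assumes a render() that returns a description string instead of drawing (so the result can be checked), and the Hexagon class is hypothetical; neither comes from the course listing.

```cpp
#include <string>

class Shape
{
public:
    virtual ~Shape() {}
    // Returns a description instead of drawing, to keep the sketch testable.
    virtual std::string render() const { return( "shape" ); }
};

class Circle : public Shape
{
public:
    std::string render() const { return( "circle" ); }
};

// Existing routine: written once against Shape and never modified again.
inline std::string renderShapes(const Shape *const *shapes, const unsigned numShapes)
{
    std::string out;
    for (unsigned i = 0; i < numShapes; i++)
    {
        out += shapes[ i ]->render();
        out += ';';
    }
    return( out );
}

// Added later, possibly by someone who has only the header and library.
class Hexagon : public Shape
{
public:
    std::string render() const { return( "hexagon" ); }
};
```

renderShapes() was compiled before Hexagon existed, yet it renders a Hexagon correctly because the call is dispatched through the Shape interface.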
Virtual APIs
The virtual keyword tells the compiler that the render() API may be overridden by a new API in an inheriting class
Even if calling the API using a Shape pointer instead of a Circle pointer, the Circle::render() API will be invoked
The render() API for Shape doesn't make sense because it can't actually draw anything
The Shape class represents an abstract concept, not a concrete object
What would you do if this was actually invoked for a Shape?
Two ways to approach this issue
Always abort if the API is invoked
Don't define a body for the Shape::render() method and tell the compiler that a method MUST be supplied by any
inheriting classes
Called "pure virtual" and will be covered later
Shape becomes an abstract base class
Advanced Example
Consider the following objects
Animal
Mammal
Canine
This can be implemented in C as shown below
...
Canine myDog;
myDog.mSuperClass.mSuperClass.mNumLegs = 4;
myDog.mSuperClass.mSuperClass.mHasTail = true;
...
By including the superclass type within each subclass, we've effectively emulated inheritance
There are no APIs bound to a struct in C, so APIs don't inherit
To get to any subclass data element, we must explicitly navigate to the data member
If a new class is created and inserted between Mammal and Canine, all explicit navigations will fail
To pass a type of Canine to an API expecting a Mammal or Animal pointer, we need to explicitly navigate
In OO Programming, the compiler (or interpreter) knows that a Canine is a Mammal is an Animal
APIs are inherited (depending on inheritance permissions) so that explicit qualification is not required unless a conflict exists
between APIs of different superclasses
Data members are also inherited with the same resolution as available to APIs
Below is the C++ implementation
enum SkinType
{
FUR,
SCALES,
FEATHERS
UNDEFINED_SKIN
};
enum BloodType
{
WARM_BLOODED,
COLD_BLOODED,
UNDEFINED_BLOOD_TYPE
};
private:
void assign(const Animal &copy);
void destroy();
void init();
private:
unsigned mNumLegs;
SkinType mSkinType;
LivesOn mLivesOn;
BloodType mBloodType;
bool mHasTail;
public:
Animal();
Animal(const Animal &copy);
~Animal();
...
void setBloodType(const BloodType bloodType);
void setHabitat(const LivesOn habitat);
void setHasTail(const bool hasTail);
void setNumLegs(const unsigned numLegs);
void setSkinType(const SkinType skinType);
...
Animal &operator=(const Animal &other);
};
private:
void assign(const Mammal &copy);
void destroy();
void init();
public:
Mammal();
Mammal(const Mammal &copy);
~Mammal();
...
void setGestationPeriod(const unsigned gestationPeriod);
void setHasWiskers(const bool hasWiskers);
...
Mammal &operator=(const Mammal &other);
};
enum CanineSpecies
{
GERMAN_SHEPARD,
BLACK_RUSSIAN_TERRIER,
FRENCH_BULLDOG,
ROTTWEILER,
UNDEFINED_SPECIES
};
private:
void assign(const Canine &copy);
void destroy();
void init();
private:
FurColor mFurColor;
CanineSpecies mSpecies;
public:
Canine();
Canine(const Canine &copy);
~Canine();
...
void setFurColor(const FurColor furColor);
void setSpecies(const CanineSpecies species);
...
Canine &operator=(const Canine &copy);
};
C++ Inheritance
Inheritance is specified by using a colon (:) after the class name, specifying a visibility attribute (public, private, or protected),
and the class name of the superclass that is being inherited from
From this example, Canine publicly inherits from Mammal, and Mammal inherits from Animal
Any public API in Mammal or Animal is also public in Canine
If Mammal and Animal each have the same public API, Canine will inherit Mammal's API
Inherit the one in the nearest base class
Canine can still access Animal's API explicitly in this case
APIs that are private (like init()) are visible/accessible only to the class that they are defined within
Canine cannot directly invoke Mammal::init() or Animal::init()
Data members that are private are visible/accessible only to the class that they are defined within
Canine cannot directly access Mammal::mHasWiskers or Animal::mHasTail
Private inheritance
A class can inherit the implementation without inheriting the type
Using the syntax "class B : private A" allows B to use the implementation of A in its own implementation
B is not a subtype of A
An instance of B cannot be passed to an API expecting an instance of A
Public inheritance
A class inherits both the implementation and the type
Syntax "class B : public A"
B can still override APIs in A
An instance of type B can be passed where an instance of class A is expected
Protected inheritance will be described later in the C++ portion of the class
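The difference between the two forms can be sketched as follows. The classes A, PublicB, and PrivateB and the functions useA() and demoInheritanceKinds() are hypothetical names chosen for illustration.

```cpp
class A
{
public:
    int twice(const int v) const { return( 2 * v ); }
};

// Public inheritance: PublicB is a subtype of A.
class PublicB : public A
{
};

// Private inheritance: PrivateB reuses A's implementation only.
class PrivateB : private A
{
public:
    int quadruple(const int v) const { return( twice( twice( v ) ) ); }
};

inline int useA(const A &a)
{
    return( a.twice( 21 ) );
}

inline int demoInheritanceKinds()
{
    PublicB pub;
    PrivateB priv;
    // useA( priv );      // would NOT compile: A is an inaccessible base here
    // priv.twice( 1 );   // would NOT compile: twice() is private in PrivateB
    return( useA( pub ) + priv.quadruple( 1 ) );   // 42 + 4
}
```

PrivateB still uses A::twice() internally, but outside code can neither call it through PrivateB nor treat a PrivateB as an A.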
void Mammal::init()
{
// All mammals are warm blooded
setBloodType( WARM_BLOODED );
mGestationPeriodInMonths = 0;
mHasWiskers = false;
}
void Canine::init()
{
// Set Mammal attributes
setHasWiskers( true );
// Set Animal attributes
setHabitat( LIVES_ON_LAND );
setHasTail( true );
setNumLegs( 4 );
setSkinType( FUR );
// Set Canine attributes
mFurColor = UNDEFINED_FUR_COLOR;
mSpecies = UNDEFINED_SPECIES;
}
Animal::Animal()
{
init();
}
Mammal::Mammal() : Animal()
{
init();
}
Canine::Canine() : Mammal()
{
init();
}
The constructor for a Mammal invokes the constructor of an Animal by using the syntax ": Animal()" in the constructor for Mammal
If no superclass constructor is specified, the superclass's default constructor is used
If there are multiple constructors to choose from, any of the constructors may be specified
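The constructor rules above can be observed with a small trace. This is only a sketch: the gTrace string, the Animal(unsigned) and Mammal(unsigned) constructors, and traceOf() are hypothetical additions that are not part of the Animal example.

```cpp
#include <string>

// Records the order in which constructors run.
static std::string gTrace;

class Animal
{
public:
    Animal() { gTrace += "Animal();"; }
    explicit Animal(const unsigned) { gTrace += "Animal(unsigned);"; }
};

class Mammal : public Animal
{
public:
    // No superclass constructor named: Animal's default constructor is used.
    Mammal() { gTrace += "Mammal();"; }
    // Explicitly selects the Animal(unsigned) constructor.
    explicit Mammal(const unsigned numLegs) : Animal( numLegs )
    {
        gTrace += "Mammal(unsigned);";
    }
};

// Constructs one Mammal and reports which constructors ran, in order.
inline std::string traceOf(const bool withLegs)
{
    gTrace.clear();
    if (withLegs)
    {
        Mammal m( 4 );
    }
    else
    {
        Mammal m;
    }
    return( gTrace );
}
```

In both cases the Animal portion is constructed before the body of the Mammal constructor runs.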
Design Challenge
In the Animal example, an enumeration was used to distinguish various types of dogs
Instead of using an enumeration to specify dog styles, could have created specific types of dogs, inheriting from Canine
The choice of when to use inheritance and when to use enumerations is not clear cut
Would a subclass differentiate itself enough from the superclass to warrant the generation of a new class?
Would the subclass be used so frequently that it would be a benefit to the programmer to have a specific subclass? Or would it
simply cause additional work?
There's a cost associated with developing a new class. Does the benefit outweigh the cost?
Development cost
Maintenance cost
Single- vs Multiple-Inheritance
Examples so far have shown single inheritance
A new object inherits from just one other object
Sometimes there may be more than one "is-a" relationship
Object "A" may inherit from objects "B" and "C"
Called "multiple inheritance"
Not supported by all languages
Java only supports single inheritance of classes
C++ supports multiple inheritance
Can lead to confusing applications
The order of inheritance becomes important
A different access specifier (public, protected, or private) may be applied to each inherited class
C++ Multiple Inheritance
class B
{
...
public:
B();
...
void setValue(const unsigned val);
};
class C
{
...
public:
C();
...
void setValue(const unsigned val);
};
class D
{
...
public:
D();
...
void setValue(const unsigned val);
};
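A class that inherits from more than one of these might look like the sketch below. It assumes a class A derived publicly from B and C, as in the bullet above; the getValue() and setBoth() APIs and the mValue members are hypothetical.

```cpp
class B
{
    unsigned mValue;
public:
    B() : mValue( 0 ) {}
    void setValue(const unsigned val) { mValue = val; }
    unsigned getValue() const { return( mValue ); }
};

class C
{
    unsigned mValue;
public:
    C() : mValue( 0 ) {}
    void setValue(const unsigned val) { mValue = val; }
    unsigned getValue() const { return( mValue ); }
};

// A "is-a" B and "is-a" C; the bases are constructed in the order listed.
class A : public B, public C
{
public:
    // An unqualified setValue( 1 ) would be ambiguous, so qualify the base.
    void setBoth(const unsigned forB, const unsigned forC)
    {
        B::setValue( forB );
        C::setValue( forC );
    }
};
```

Callers face the same ambiguity: a.setValue( 1 ) does not compile, while a.B::setValue( 1 ) and a.C::setValue( 1 ) do.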
People quite correctly say that you don't need multiple inheritance, because anything you can do with multiple inheritance you can also do
with single inheritance. You just use the delegation trick I mentioned. Furthermore, you don't need any inheritance at all, because anything
you do with single inheritance you can also do without inheritance by forwarding through a class. Actually, you don't need any classes
either, because you can do it all with pointers and data structures. But why would you want to do that? When is it convenient to use the
language facilities? When would you prefer a workaround? I've seen cases where multiple inheritance is useful, and I've even seen cases
where quite complicated multiple inheritance is useful. Generally, I prefer to use the facilities offered by the language to doing workarounds.
protected:
Shape();
Shape(const Shape &copy);
public:
virtual ~Shape();
virtual double area() const=0;
virtual void render() const=0;
};
double area() const;
public:
Circle();
Circle(const Circle &copy);
~Circle();
Shape s; // Illegal -- compiler error because area() and render() are undefined
MyShape m; // Illegal -- compiler error because only area() has been defined;
// render() is also needed
Circle c; // Legal because both area() and render() have been defined
The keyword "virtual" and the assignment "=0" on the area() API in Shape mark Shape as an abstract base class in C++
The class Circle must provide a definition for both render() and area() if it is to be instantiated
The class MyShape could not be instantiated because it doesn't have a definition for render()
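Since the listing above is only partial, here is a compact sketch of the same idea. The member bodies, the Circle(double) constructor, the render() return type (a label rather than actual drawing), and areaThroughBase() are assumptions made so the example is self-contained.

```cpp
class Shape
{
protected:
    Shape() {}
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
    virtual const char *render() const = 0;   // returns a label, draws nothing
};

// Defines only area(), so it is still abstract.
class MyShape : public Shape
{
public:
    double area() const { return( 0.0 ); }
};

// Defines both pure virtual APIs, so it can be instantiated.
class Circle : public Shape
{
    double mRadius;
public:
    explicit Circle(const double radius) : mRadius( radius ) {}
    double area() const { return( 3.14159265358979 * mRadius * mRadius ); }
    const char *render() const { return( "circle" ); }
};

// Works for any concrete Shape without knowing its actual type.
inline double areaThroughBase(const Shape &s)
{
    return( s.area() );
}

// MyShape m;   // would NOT compile: render() is still pure virtual
```

Note that area() and render() are declared const in every class; a non-const area() in a subclass would be a different API and would not satisfy the pure virtual declaration.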
Summary
Inheritance is an extremely powerful concept that exploits the "is-a" relationship between objects
Inheritance is distinct from composition
A car object would not inherit from an engine object
A car object might inherit from a transportation object
Inheritance is not mandatory, but it is extremely helpful
99.99% of the time, a superclass will never know about the subclasses that inherit from it
There should be no enumerations within the superclass to indicate the subclass type
The superclass should contain common attributes associated with different types of subclasses
The point at which commonality stops is typically the point at which a new subclass should be considered
A fender isn't part of all transportation objects (like a plane or train), so a transportation object probably shouldn't contain
a field for fenders
Cars and bikes often do have fenders, so an object between transportation object and a car or bike object, from which
both car and bike can inherit, might make sense
Multiple inheritance exists in C++ but not in all OO languages
Not often used
Advanced topic that has the potential to cause more problems than it solves
Abstract base classes are a mechanism to enforce the definition of an API in a subclass
Useful when the formulation of the API cannot be known by the superclass, and no default implementation exists
This is just an introduction to inheritance
It's a very complicated subject, and much more time will be spent on individual aspects of inheritance in C++
Polymorphism
Definitions
Wikipedia: ...the ability to create a variable, a function, or an object that has more than one form.
Techopedia: Polymorphism is an object oriented programming concept to refer to the ability of a variable, function, or object to take on
multiple forms. A language that features polymorphism allows developers to program in the general rather than program in the specific.
Shape Revisit
Consider the Shape class introduced in the Inheritance section (replicated below)
The render() API has been redefined for each inheriting subclass (Circle, Rectangle, and Triangle)
The programmer that wants to render a generic Shape doesn't need to know the specific type of Shape
The language has bound a different implementation of the render() API to each subclass
The render() API is an example of polymorphism in action
By using the "=0" syntax, we're telling the C++ compiler that this API has no body associated with it
Called a pure virtual function
A class that has a pure virtual function cannot be instantiated
public:
Shape();
Shape(const Shape &copy);
virtual ~Shape();
...
virtual void render() const=0;
...
Shape &operator=(const Shape &copy);
};
public:
Circle();
Circle(const Circle &copy);
~Circle();
...
void render() const;
...
Circle &operator=(const Circle &copy);
};
public:
Triangle();
Triangle(const Triangle &copy);
~Triangle();
...
void render() const;
...
Triangle &operator=(const Triangle &copy);
};
public:
Rectangle();
Rectangle(const Rectangle &copy);
~Rectangle();
...
void render() const;
...
Rectangle &operator=(const Rectangle &copy);
};
Polymorphism and Inheritance
Inheritance is an obvious case in which polymorphism benefits the programmer
Objects that inherit from the same superclass can redefine APIs to specify a new behavior
Redefined APIs must have the same name and signature as the inherited APIs
A developer can write code that operates on base APIs without needing to know about the actual object being operated upon
Polymorphism in C
It is possible to implement polymorphism within C using function pointers (callbacks)
Not as straightforward as C++
Must ensure that the correct function pointer is assigned for the right shape
The API can never be set to another API or to NULL while the shape is still in use
Must explicitly pass in the pointer to each shape to the render routine
Must use unsafe casting to explicitly set the pointer to the render API
Adds 1 pointer per API that must be virtualized
C++ uses one pointer per instance instead of 1 per API
Polymorphism in C
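typedef struct Shape Shape;
/* Assumed signature; the original typedef is not shown */
typedef void (*RenderShape)(const Shape *shape);
struct Shape
{
RenderShape render;
};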
struct Circle
{
Shape mShape;
double mRadius;
double mX;
double mY;
};
...
void renderCircle(const Circle *c)
{
// Code to render a circle using the radius and (x,y) coordinates
}
...
void initCircle(Circle *c)
{
c->mShape.render = (RenderShape) renderCircle; /* unsafe cast between function pointer types */
c->mX = c->mY = c->mRadius = 0;
}
...
void renderShapes(const Shape **shapes, const unsigned numShapes)
{
// This works as long as the first parameter in each object (Circle, Triangle,
// etc.) is a Shape
for (unsigned i = 0; i < numShapes; i++)
{
shapes[ i ]->render( shapes[ i ] );
}
}
class ComplexNumber
{
double mA;
double mB;
public:
ComplexNumber();
ComplexNumber(const ComplexNumber &copy);
~ComplexNumber();
// Math operations
void addTo(const ComplexNumber &other);
void subFrom(const ComplexNumber &other);
void mulAgainst(const ComplexNumber &other);
...
...
ComplexNumber x, y;
...
x.addTo( y ); // x.mA += y.mA and x.mB += y.mB
x += y; // Compiler error since += is not defined for ComplexNumber
public:
ComplexNumber();
ComplexNumber(const ComplexNumber &copy);
~ComplexNumber();
ComplexNumber x, y;
...
y = x; // Uses operator=()
x *= y; // Uses operator*=()
Don't worry about the exact formulation or the "return( *this )" code in the example for now
The point is that basic operators can be defined for classes so that a more natural syntax can be employed
Operator overloading doesn't have to be limited to operators within a class
Operator Overloading Outside of a Class
ComplexNumber operator+(ComplexNumber a, const ComplexNumber &b)
{
return( a += b );
}
ComplexNumber x, y, z;
...
z = x + y; // Uses redefined operator+() and overloaded operator=() from within the class
Later, when discussing templates, we'll learn why it's so important to be able to overload operators, like less-than, that are used for POD
types
POD = "plain old data"
Ex: int, float, char
Precedence order remains for redefined operators
See http://en.cppreference.com/w/cpp/language/operator_precedence
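Because the ComplexNumber listings above are only partial, the sketch below fills in one possible set of operators to show precedence at work. The two-argument constructor, the real() and imag() accessors, and the operator bodies are assumptions; only the operator names come from the notes.

```cpp
class ComplexNumber
{
    double mA;   // real part
    double mB;   // imaginary part
public:
    ComplexNumber(const double a = 0.0, const double b = 0.0) : mA( a ), mB( b ) {}
    double real() const { return( mA ); }
    double imag() const { return( mB ); }

    ComplexNumber &operator+=(const ComplexNumber &other)
    {
        mA += other.mA;
        mB += other.mB;
        return( *this );
    }
    ComplexNumber &operator*=(const ComplexNumber &other)
    {
        // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        const double a = mA * other.mA - mB * other.mB;
        const double b = mA * other.mB + mB * other.mA;
        mA = a;
        mB = b;
        return( *this );
    }
};

// Non-member operators built on the compound-assignment members.
inline ComplexNumber operator+(ComplexNumber a, const ComplexNumber &b)
{
    return( a += b );
}
inline ComplexNumber operator*(ComplexNumber a, const ComplexNumber &b)
{
    return( a *= b );
}
```

With these definitions, an expression such as x + y * y is still parsed as x + (y * y), exactly as it would be for int or double.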
...
const char *testString = "This is a test";
String a;
a.mLength = strlen( testString );
a.mData = strndup( testString, a.mLength );
...
String b = a; // Bitwise copy of the contents of a into b
...
free( a.mData ); // At this point, b no longer references valid data!
By overloading the assignment operator in C++, a valid copy of a string can be made
private:
void assign(const String &copy);
void destroy();
void init();
public:
String();
String(const String &copy);
~String();
...
String &operator=(const String &copy);
};
//////////////////////////////////////////////////////////////////////////////
// Private inline APIs below //
//////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////
// Public inline APIs below //
//////////////////////////////////////////////////////////////////////////////
inline String::String()
{
init();
}
inline String::~String()
{
destroy();
init();
}
The assignment operator will first destroy any existing contents via destroy() and then copy the new contents via assign()
The assign() API allocates its own storage for the string data. It does not share the string data with the instance passed into it
When the String instance destructor is invoked, it can safely return dynamic memory to the heap without worrying that something else
is pointing to its data via assignment or copy construction
Not possible without overloading the assignment operator or copy constructor
Note the test to ensure that the following works properly
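The bodies of the String APIs are not shown above, so the following is a hedged sketch of how they might fit together, including the self-assignment test. The String(const char *) constructor and c_str() accessor are hypothetical conveniences added so the behavior can be checked.

```cpp
#include <cstddef>
#include <cstring>

class String
{
    char *mData;
    std::size_t mLength;

    void init() { mData = NULL; mLength = 0; }
    void destroy() { delete[] mData; }
    // Assumes this instance holds no storage; allocates a private copy.
    void assign(const String &copy)
    {
        mLength = copy.mLength;
        if (mLength)
        {
            mData = new char[ mLength + 1 ];
            std::memcpy( mData, copy.mData, mLength + 1 );
        }
        else
        {
            mData = NULL;
        }
    }
public:
    String() { init(); }
    String(const char *text)
    {
        init();
        if (text && *text)
        {
            mLength = std::strlen( text );
            mData = new char[ mLength + 1 ];
            std::memcpy( mData, text, mLength + 1 );
        }
    }
    String(const String &copy) { assign( copy ); }
    ~String() { destroy(); init(); }

    String &operator=(const String &other)
    {
        if (&other != this)   // without this test, a = a would free its own data
        {
            destroy();
            assign( other );
        }
        return( *this );
    }
    const char *c_str() const { return( mData ? mData : "" ); }
};
```

After b = a, the two instances hold separate buffers, so destroying one leaves the other valid; assigning an instance to itself leaves it unchanged.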
Swapping 2 Integers in C
int a, b;
...
const int temp = a;
a = b;
b = temp;
...
This is a generic operation that can be repeated for many different types as long as the assignment operator is defined for a type
Using generic programming (C++ specifics to be covered in more detail later), an API can be created to swap any two types
This API can be called to generically swap two floating-point values, integers, or a C++ class for which the assignment operator has been
defined
Don't need to write unique APIs per class
Don't need to call class-specific names to do assignment operations between instances (like we would have to do in C)
We can call swapTwo() on two String instances (from the example above)
The C API qsort() requires a pointer to a function to do comparisons between two structs because operator overloading is not
supported in C
void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));
Not required in C++ if specific operators are provided for a class
It also happens to do a swap of values via bit copy which can utilize a generic swap instead
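A minimal sketch of both points follows: a generic swapTwo() and a sort that relies on operator< instead of a comparison callback. The template syntax is covered later in the notes; the Part struct and sortParts() are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Generic swap: works for any type that can be copied and assigned.
template <typename T>
void swapTwo(T &a, T &b)
{
    const T temp = a;
    a = b;
    b = temp;
}

// Hypothetical record type; operator< lets std::sort compare it directly.
struct Part
{
    std::string mName;
    unsigned mPrice;
};

inline bool operator<(const Part &a, const Part &b)
{
    return( a.mPrice < b.mPrice );
}

inline void sortParts(std::vector<Part> &parts)
{
    std::sort( parts.begin(), parts.end() );   // no comparison callback needed
}
```

The same swapTwo() serves int, std::string, and any class with a correct assignment operator, where C would need one routine per type.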
Delegation
Very loose definition: Delegation is the process of wrapping a variable, class, or function with something else, and executing on that
delegate through the wrapper
Powerful concept
More detail to come later in the class
Simple example: composite shape
A composite shape is a single object that consists of other shapes
When rendering the composite shape, it renders individual shapes (that might also be composites)
Shape Delegation
class CompositeShape : public Shape
{
Shape **mShapes;
unsigned mNumShapes;
public:
CompositeShape();
CompositeShape(const CompositeShape &copy);
~CompositeShape();
...
void addShape(const Shape &shape);
void render() const;
...
CompositeShape &operator=(const CompositeShape &copy);
};
...
inline void CompositeShape::render() const
{
for (unsigned i = 0; i < mNumShapes; i++)
{
mShapes[ i ]->render();
}
}
Summary
Polymorphism is a powerful concept, often considered a requirement of any true OO programming language
Lets developers that are using objects abstract out the details of the objects
Object developers are responsible for correctly implementing the underlying functionality per specification
Operator overloading is a particularly useful aspect of polymorphism
Utilizing the same operators as POD types means that generic operations can be constructed that can operate upon any type of
object for which the operator is defined
Generic Programming in C
C doesn't offer true generic programming constructs, but it does offer macros
Ex: #define min(a,b) ((a) < (b) ? (a) : (b))
Not a good generic approach
Not type safe. Passing in a "char *" to the min() macro won't work because it will take the minimum based on address, not
content
Without operator overloading, limited to C basic types (int, char, double, etc.) and not C structs
Macros have potential side effects
min(x++, --y) will have unexpected effects on x or y based on their values
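The side effect can be made visible with a small sketch. The macro is the one from the text, renamed MIN_MACRO here (an assumed name) so that it cannot clash with a library min(). The helper incrementsCausedByMacro() is hypothetical.

```cpp
// Same definition as the min() macro in the text, under a different name.
#define MIN_MACRO(a, b) ((a) < (b) ? (a) : (b))

// Returns how many times x was incremented by one use of the macro.
inline int incrementsCausedByMacro(int x, int y)
{
    const int before = x;
    // Expands to ((x++) < (y) ? (x++) : (y)), so x++ is evaluated
    // a second time whenever the comparison is true.
    (void) MIN_MACRO(x++, y);
    return x - before;
}
```

With x smaller than y the argument x++ runs twice; otherwise it runs once. The caller cannot tell which from the call site.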
C++ defines a new keyword, template, that is used to tell the compiler that the class or API being defined is to be used as the
implementation of the class or API once the type for it has been defined
The generic type (in this case "T") is identified by the keyword typename
Compiler creates functions on the fly based on the types used
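The template definition itself is not reproduced in this part of the notes. A minimal sketch follows, assuming a single type parameter T and a type that supports operator<. It is named minOf() here (an assumed name) to avoid any clash with std::min, and minWithSideEffect() is a hypothetical helper.

```cpp
// Generic minimum of two values of the same type T.
template<typename T>
T minOf(const T &a, const T &b)
{
    return (a < b) ? a : b;
}

// Unlike the macro, the argument expression x++ is evaluated exactly
// once, before the function body runs.
inline int minWithSideEffect(int &x, const int y)
{
    return minOf(x++, y);
}
```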
Template Expansion
double a, b;
int x, y;
...
double minAB = min( a, b ); // Works because a and b are same type
int minXY = min( x, y ); // Works because x and y are same type
double minAX = min( a, x ); // Compiler error because a and x are not the same type
// The following shows the APIs that are ***auto-generated*** by the compiler based
// upon the use of min() above
The min() API doesn't suffer from side effects the way that the macro implementation does because "a" and "b" are parameters and are
not expanded
When the compiler comes across a call to min(), it expands the template, using the type of the parameters to define the API
Passing in 2 double values causes a double version of the API to be created on the fly
Still have a problem if passing in two "char *" values
Must use template specialization to solve
Define a specific implementation that the compiler will use for that type and only that type
There are drawbacks to the C++ approach
The compiler produces each class and API on the fly, increasing compilation times
Each unique template adds to the size of the code
Debugging can be challenging
You may have an error in your template API that isn't found by the compiler until that API is actually
used
Compiling of templates is a light-weight scan until they are instantiated
Use of deeply nested templates can lead to confusing error messages
Templates can't be precompiled (object files can't be created for them independent of an instantiation)
Must be carefully designed
In the above implementation, you can't take the min() of an int and a double value
The Python list is a generic list that can accept multiple items of different types
Associative arrays (maps) in these languages share similar properties
These data structures handle arbitrary types without specific support required from the programmer
They also provide a means to obtain information about the data structure using the same APIs, no matter what's stored in them
Collections (lists, maps, sets, etc.) are the most common form of generic programming and can be quite powerful
Many OO languages support collections
The form of the collection is often important from both a memory and a performance standpoint
A linked list is fast at insertion but poor at sorting or random access
The selection of a specific implementation may not be possible for a particular OO language
The selection of an implementation
The generic routine takes two of the same parameter types (2 ints, 2 floats, etc)
Can't mix two different data types with this definition
Not a C++ restriction. Templates will be covered in greater detail later in the course
Not possible to do in C, even with a macro
No way to create an intermediate variable of an undefined type in a macro (ignoring the GCC-specific extension)
Advanced example: quicksort (See example 2)
Generic Override
Sometimes the generic operation doesn't make sense "as is"
Finding the min of two values is simple, and a generic min() API can be written
What about finding the min of two values pointed to by two pointers?
Min Using Pointers
int *a, *b;
...
int *c = min( a, b );
This code will return the minimum of two pointers, not the pointer that points to the minimum of two values
To handle this case, we can define a second version of min() that is specific to pointers
Note that we need yet a third implementation if finding the min of two C strings
// Below is the syntax used to manually create an explicit version of the template.
// By explicitly creating your own version, you keep the compiler from auto-generating
// one for you based upon the template definition provided for this API
typedef char *CharPtr; // Used for clarity sake when comparing against min() above
int a = 5;
int b = 3;
int minAB = min( a, b );
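The explicit version referred to in the comments above is not shown in full. The sketch below is one way it might be written, under these assumptions: the generic routine is called minOf(), the C string case is handled by an explicit specialization for the typedef ConstCharPtr, and the pointer case gets its own hypothetical name, minByPointee(), to keep overload resolution simple.

```cpp
#include <cstring>

// Generic version: compares the values directly.
template<typename T>
T minOf(const T &a, const T &b)
{
    return (a < b) ? a : b;
}

typedef const char *ConstCharPtr;

// Explicit specialization: the compiler uses this body for
// ConstCharPtr instead of generating one from the template above,
// so the comparison is by content rather than by address.
template<>
inline ConstCharPtr minOf<ConstCharPtr>(const ConstCharPtr &a,
                                        const ConstCharPtr &b)
{
    return (std::strcmp(a, b) < 0) ? a : b;
}

// Pointer case: returns the pointer whose target holds the smaller value.
template<typename T>
T *minByPointee(T *a, T *b)
{
    return (*a < *b) ? a : b;
}
```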
Generic Containers
template<typename T> class MyArray
{
T *mArrayData;
unsigned mArraySize;
public:
MyArray();
MyArray(const T *someData, const unsigned dataSize);
MyArray(const MyArray &copy);
~MyArray();
...
MyArray<int> intArray;
MyArray<char> charArray;
This container holds an array of objects whose type has yet to be defined
When we need a container of a specific type, we declare it explicitly at instantiation
Compiler instantiates required APIs for that data type as required (on the fly)
Because it's an array, we've overloaded the indexing operator (operator[]())
This operator makes sense since we explicitly named the class MyArray, implying the use of this operator
In general, this is a bad idea as it locks the implementation
This operator makes no sense for a linked list or a tree
Retrofitting this operator onto a linked list or tree would be very expensive computationally, potentially leading to O(n^2)
algorithms for loops, for example
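The overloaded indexing operator is not visible in the listing above. A reduced sketch of how MyArray might provide it follows. The size() accessor is an assumption added for the example, copying is left declared but undefined, and no bounds checking is done.

```cpp
template<typename T> class MyArray
{
    T *mArrayData;
    unsigned mArraySize;

    // Copying is outside the scope of this sketch: declared, not defined.
    MyArray(const MyArray &copy);
    MyArray &operator=(const MyArray &copy);
public:
    MyArray(const T *someData, const unsigned dataSize)
        : mArrayData(new T[dataSize]), mArraySize(dataSize)
    {
        for (unsigned i = 0; i < dataSize; i++)
        {
            mArrayData[i] = someData[i];
        }
    }
    ~MyArray() { delete [] mArrayData; }

    unsigned size() const { return mArraySize; }   // assumed helper

    // Indexing is O(1) because the storage is a contiguous array.
    T &operator[](const unsigned index) { return mArrayData[index]; }
    const T &operator[](const unsigned index) const { return mArrayData[index]; }
};
```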
Iterators
Being able to store and manipulate objects within a container is extremely powerful, but eventually a programmer will need to have
access to the contents of the container
When performance is critical, the storage mechanism used can be extremely important (linked list, vector, balanced binary tree, etc)
OO programming advocates data and implementation hiding
The choice of a given container should be hidden within an object
The object programmer should be able to choose a different container type in the future, without changing the APIs, if required
How do you abstract the access mechanism for data in a container from the type of container so that APIs don't need to change?
Wikipedia: In object-oriented computer programming, an iterator is an object that enables a programmer to traverse a container. ... Note
that an iterator performs traversal and also gives access to data elements in a container, but does not perform iteration.
An entire section in the C++ portion of the class is dedicated to iterators
C Iterators
Iterators are commonly used by all programmers
Because C cannot define new types, and APIs are not bound to the data, iterators are explicitly tied to the implementation
A programmer must know exactly how something is organized so that the proper access method can be utilized
Iterators in C
/*
* Simple example of an iterator through characters
*/
/*
* More complex example of an iterator through structs
*/
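The bodies of the two examples above are missing from these notes. The sketch below shows what such C-style iteration typically looks like, written so that it also compiles as C++. Node, countChar(), and sumList() are hypothetical names; note how each loop is tied to the layout of the data it walks.

```cpp
#include <cstddef>

// Iterating through characters: the "iterator" is a raw pointer and
// the loop must know that a '\0' terminates the data.
inline unsigned countChar(const char *text, const char wanted)
{
    unsigned count = 0;
    for (const char *p = text; *p != '\0'; ++p)
    {
        if (*p == wanted)
        {
            count++;
        }
    }
    return count;
}

// Hypothetical singly linked list of structs.
struct Node
{
    int mValue;
    Node *mNext;
};

// Iterating through structs: the loop must know about mNext and
// that NULL marks the end of the list.
inline int sumList(const Node *head)
{
    int total = 0;
    for (const Node *n = head; n != NULL; n = n->mNext)
    {
        total += n->mValue;
    }
    return total;
}
```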
C++ Iterators
C++ iterators are classes themselves that are used to sequence through specific collections
They overload operators so that they conform to natural syntax
Support all of the operators that pointers normally support (++, --, ->, *, ==, etc)
Iterator Example: Strings
std::string myString = "Some long string";
std::string::iterator i = myString.begin();
std::string::iterator end = myString.end();
while (i != end)
{
if (*i == 'l')
{
// Do some operation
}
++i;
}
Designing an Iterator
class MyDataStructure
{
SomeTreeStructure mData;
...
public:
typedef SomeTreeStructure::iterator iterator;
public:
...
iterator begin();
iterator end();
...
};
Later, if you want to change the data structure that you're using to a linked-list, you can do so because you haven't exposed the data
structure to the user, only the iterator (and then only through a typedef)
Changing a Data Structure with Iterators
class MyDataStructure
{
LinkedListStructure mData;
...
public:
typedef LinkedListStructure::iterator iterator;
public:
...
iterator begin();
iterator end();
...
};
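A runnable sketch of the same idea follows, with std::list standing in for LinkedListStructure. The add() method and sumAll() routine are assumptions made for the example. sumAll() only uses the iterator typedef, begin(), and end(), so changing mData and the typedef to std::vector<int> would require recompiling sumAll() but not rewriting it.

```cpp
#include <list>

class MyDataStructure
{
    std::list<int> mData;   // hidden choice of container
public:
    typedef std::list<int>::const_iterator iterator;

    void add(const int value) { mData.push_back(value); }
    iterator begin() const { return mData.begin(); }
    iterator end() const { return mData.end(); }
};

// Knows nothing about the container behind MyDataStructure.
inline int sumAll(const MyDataStructure &data)
{
    int total = 0;
    for (MyDataStructure::iterator i = data.begin(); i != data.end(); ++i)
    {
        total += *i;
    }
    return total;
}
```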
Summary
Generic programming provides a simple means through which a function or class can be coded once, debugged once, and used for
many different underlying types
Often goes hand-in-hand with operator overloading
POD types utilize basic operators to perform many functions
C++ classes need to define APIs for those same operators so that the same generic approach can be used for POD as well as
C++ class instances
Generic objects can easily be defined
Scripting languages have offered these facilities for many, many years
Available in C++ via templates
Iterators further expand upon generic programming by providing a means of sequencing through different data structures in a consistent
manner
Routines don't need to know about each underlying data structure as long as they can use iterators to access data
Changing the underlying data structure doesn't necessitate a code rewrite of APIs that use that data as long as they use iterators
Will require recompilation because iterators differ from one another
Memory Management
Memory Types
Heap
Memory that is dynamically allocated and freed
Amount of heap memory is limited by the OS, physical memory size, and swap size
In today's computers, measured in gigabytes
Stack
Holds local variables
Holds register contents and return addresses for called subroutines
Limited in size (8MB in Linux is typical)
Linux default stack size for threads is 2MB
May also be used to hold thread-local storage
If more stack memory is used than is allocated, it may overwrite heap memory
Shared
Memory in a fixed address space region whose contents can be shared between 2 or more different processes
Threads share address space already
Shared memory is for different executables, not threads
Image from http://www.eventhelix.com/realtimemantra/basics/debugging_software_crashes_2.htm
Stack can grow beyond its limits
Excessive recursion
Excessive number/size of local variables
If stack overwrites allocated heap memory, or heap memory is allocated at a memory address that the stack grew into, the
program may crash or (worse) produce incorrect values
Heap cannot grow beyond its bounds
An out-of-memory exception is generated if heap memory is exhausted upon a memory request
Threading
The thread stack is allocated on the heap
Thread-local variables may end up taking up a portion of the thread stack, which may cause problems
A runtime module occasionally sweeps memory, following pointers to determine what memory is accessible
Memory that is no longer accessible is collected for destruction
Advantages
No reference counting required, and no orphaned objects
Disadvantages
Requires CPU resources to occasionally run the garbage collector
An object isn't immediately destroyed when it is no longer referenced
The garbage collector must first collect the object, which occurs when the collector is run
Results in a potentially larger memory footprint
May result in noticeable lag if a large amount of memory is collected at once
In generic garbage collection, no way to control when garbage collection might occur
Conservative garbage collection may still result in leaked memory while aggressive garbage collection would result in an
unstable system
Memory Leaks
Memory leaks are easy to insert into the code by mistake
The longer the API, the easier it is to have a memory leak
Example
A function allocates memory at the entry point of the function
The amount of memory required means that it has to be allocated from the heap
The function is very long, and to see the entire API requires scrolling a few pages in the editor
A bug is found in the API at some point, and a different programmer puts in a section of code that exits the routine upon
discovering the error
The programmer was unaware of the initial allocation or the fact that the memory was deallocated at the end. They just knew
enough to fix the bug
Memory Leak in C++
int someAPI(SomeObject *obj)
{
/*
* Here we don't use std::auto_ptr<T>, and we can see that a
* memory leak occurs because we branch out of the routine
* without destroying "x"
*/
SomeBigObject *x = new SomeBigObject;
...
if (!obj)
{
return( -1 );
}
...
x->someAPI();
...
(*x).someOtherAPI();
...
delete x;
return( 0 );
}
While C++ can't prevent these types of issues, it can help with them
Consider a new object to automatically handle memory (using generic programming techniques)
It is small enough to fit on the stack
It overloads the pointer dereference operators ("->" and "*")
It automatically destroys the underlying object when it goes out of scope
Automatic Pointer
template<typename T> class AutomaticPointer
{
T *mObj;
private:
// By declaring these and putting them into the private section of
// the class, we prevent the compiler from auto-generating versions
// for us. Because they're private, no user can attempt to use them
// by mistake. By not defining a body for them, we'll cause a link
// error should any API in this class attempt to use them.
AutomaticPointer();
AutomaticPointer(const AutomaticPointer &copy);
AutomaticPointer &operator=(const AutomaticPointer &copy);
public:
AutomaticPointer(T *obj);
~AutomaticPointer();
T &operator*() const;
T *operator->() const;
};
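The declaration above leaves out the member bodies. A minimal sketch of how they might be written follows, assuming the object was allocated with new (not new[]). Tracked and earlyReturn() are hypothetical and exist only to make the cleanup observable.

```cpp
template<typename T> class AutomaticPointer
{
    T *mObj;

    // Declared but intentionally not defined, as in the text.
    AutomaticPointer();
    AutomaticPointer(const AutomaticPointer &copy);
    AutomaticPointer &operator=(const AutomaticPointer &copy);
public:
    AutomaticPointer(T *obj) : mObj(obj) {}
    ~AutomaticPointer() { delete mObj; }   // runs on every exit path
    T &operator*() const { return *mObj; }
    T *operator->() const { return mObj; }
};

// Hypothetical type that counts live instances.
struct Tracked
{
    static int gLiveCount;
    Tracked() { gLiveCount++; }
    ~Tracked() { gLiveCount--; }
    int value() const { return 42; }
};
int Tracked::gLiveCount = 0;

inline int earlyReturn(const bool bail)
{
    AutomaticPointer<Tracked> x(new Tracked);
    if (bail)
    {
        return -1;   // the Tracked instance is still destroyed
    }
    return x->value();
}
```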
Now you can use AutomaticPointer to guarantee that your memory is always going to be returned
No Memory Leak
int someAPI(SomeObject *obj)
{
AutomaticPointer<SomeBigObject> x( new SomeBigObject );
...
if (!obj)
{
// No memory leak!
return( -1 );
}
...
x->someAPI(); // Syntax is still valid due to operator overloading
...
(*x).someOtherAPI(); // Same here
...
// No need to delete x!
return( 0 );
}
When "x" goes out of scope, the destructor for "x" is called
The destructor releases memory associated with SomeBigObject
The instance "x" looks like a pointer
Slight performance penalty because of one level of indirection through instance "x"
No need to write one on your own
C++ standard libraries support automatic pointers for you already
This class was just for demonstration purposes
Need one class for allocation of a single object, a different class for allocation of multiple objects
See std::unique_ptr<T> (C++11 and later); the older std::auto_ptr<T> was deprecated in C++11 and removed in C++17
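A short sketch of the standard facility follows, assuming a C++11 or later compiler. std::unique_ptr<T> covers the single-object case and std::unique_ptr<T[]> covers the multiple-object case. Widget, useSingle(), and useArray() are hypothetical names.

```cpp
#include <memory>

// Hypothetical type that counts live instances.
struct Widget
{
    static int gLiveCount;
    Widget() { gLiveCount++; }
    ~Widget() { gLiveCount--; }
};
int Widget::gLiveCount = 0;

// Single object: released with delete when "one" goes out of scope.
inline int useSingle(const bool bail)
{
    std::unique_ptr<Widget> one(new Widget);
    if (bail)
    {
        return -1;
    }
    return Widget::gLiveCount;
}

// Multiple objects: the T[] form releases them with delete [].
inline int useArray()
{
    std::unique_ptr<Widget[]> many(new Widget[3]);
    return Widget::gLiveCount;
}
```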
Some classes handle memory management for you transparently
C has no built-in string type, just an array of characters
C++ std::string takes care of allocation and deallocation of memory associated with strings
It may or may not use reference counters to share strings, depending on the implementation
Assigning one string to another may increment a reference counter, or it may make a copy of the string
Other classes may provide similar services
If you dynamically allocate objects of these types and fail to destroy them, they'll still end up leaking memory!
Summary
Many languages provide a means for automatic memory management
Scripting languages do by default
C++ does not provide automatic memory management as part of the language
Had to be compatible with C
High-performance and/or low-memory needs sometimes dictate that automatic memory management cannot be used
C++ classes can be used to help with memory management
External libraries (Boost) also provide facilities
Proposals have been written for garbage collection in C++ but have never become part of the language
For more information, see RAII (Resource Acquisition Is Initialization)
Namespaces
Definitions
Wikipedia: "In general, a namespace is a container for a set of identifiers (names), and allows the disambiguation of homonym identifiers
residing in different namespaces.[1][2] Namespaces usually group names based on their functionality."
cplusplus.com: "Namespaces allow [programmers] to group entities like classes, objects and functions under a name. This way the global
scope can be divided in "sub-scopes", each one with its own name."
Overview
Many languages provide namespace support
Namespaces provide another form of encapsulation
Encapsulation controls access to data and APIs
A namespace encapsulates classes, variables, and APIs by adding a named wrapper around these entities
To access a class, variable, or API, the name of the namespace must be prefixed to the name of the class, variable or API
C++ Namespaces
Namespaces are created using the keyword namespace
Namespaces may be named or unnamed
Unnamed
Equivalent to declaring variables, classes, and APIs "static" (in C terminology)
Only have file-level scope and cannot be accessed outside of the file the namespace is defined within
Anonymous namespaces are preferred to using static
Named
The namespace name follows normal naming conventions
Need not be contiguous
Namespaces can be nested
C++ standard components belong to the std namespace (like cin and cout)
Namespaces
namespace
{
// This is an anonymous namespace
// Anything defined in this namespace has file-level scope
int someVar;
}
namespace A
{
void apiInA()
{
...
}
namespace B
{
void apiInAB()
{
...
}
}
}
namespace B
{
// This is different from the nested namespace under "A"
void apiInB()
{
...
}
}
namespace A
{
// This adds to what was in A before
void anotherAPIInA()
{
...
}
namespace B
{
// This continues to add to A::B
...
}
}
Namespace Madness
void anotherAPI()
{
// This is in the global namespace
...
}
namespace A
{
void someAPI()
{
...
}
}
namespace B
{
void someAPI()
{
...
}
}
namespace
{
void someAPI()
{
...
}
void anotherAPI()
{
someAPI(); // Calls API in this anonymous namespace
::anotherAPI(); // Calls global namespace API
...
}
}
void namespaceTest()
{
someAPI(); // Calls API in anonymous namespace
A::someAPI(); // Calls API in A's namespace
B::someAPI(); // Calls API in B's namespace
}
Using Directive
The "using" directive tells C++ to treat names within the namespace as if they were in the current namespace
No need to explicitly provide the namespace as part of the variable, class, or API name
The directive is scoped in the same way that variables are scoped
Only valid within the containing open/close braces ({})
If done at top-level, scope is valid from that point in the file to the end of the file
Use with caution
Easy to confuse which namespace you're working within
Better off explicitly naming the namespace
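The scoping rule can be sketched as follows. Geometry, area(), scopedUsing(), and explicitName() are hypothetical names chosen for the example.

```cpp
namespace Geometry
{
    inline int area(const int width, const int height)
    {
        return width * height;
    }
}

inline int scopedUsing()
{
    int result = 0;
    {
        using namespace Geometry;   // valid only inside these braces
        result = area(2, 3);
    }
    // An unqualified area(2, 3) would not compile here.
    return result;
}

inline int explicitName()
{
    // Explicit qualification: always clear which area() is meant.
    return Geometry::area(4, 5);
}
```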
Summary
Namespaces provide another means of encapsulation
Found in many different programming languages
Even in formats like XML
Namespaces disambiguate names that might otherwise conflict
Namespaces can often be nested to provide additional conflict protection
Anonymous namespaces in C++ provide file-level scope similar to the static keyword in C
Anonymous namespaces are the preferred mechanism in C++
The static keyword applies only to variables and functions but not to user-defined types (class, struct)
Classes and structures can be defined within an anonymous namespace
In C++, "::" is used to separate names in namespaces
Prefixing a name with "::" refers to the name in the global namespace
No need to qualify a name if it's within the same namespace or is unambiguous in upper-level namespaces
The Basics
Defined vs Declared
There's a difference between the two words, though people often use them interchangeably
Sometimes the two happen simultaneously
A class API is always declared within the class
The definition of the API may be within the class or outside of the class
Distinction
The declaration, with documentation, tells the user what the method does
The definition (or implementation) tells the compiler how the method should do something
Conflicting goals possible
Putting the declaration in the header file, and the definition into a .cpp or .cxx source file, can speed compilation and reduce
dependencies between modules
Putting the definition into the header file (using inline) can reduce runtime
Easier to understand through example
You might want to come back and re-read this after going through the rest of the content of this page
It's introduced now because the terms "defined" and "declared" are used throughout the page
Defined vs Declared
class SomeClass
{
...
void someAPI()
{
// The definition of the API can be provided within the class itself
// as is done in Java and Python
...
}
void anotherAPI(); // Don't define the API within the class
};
void SomeClass::anotherAPI()
{
// The definition of the API is done outside of the class
// Note that the API is preceded by the name of the class and
// two colons: "SomeClass::anotherAPI()"
...
}
Comments
Comments may be multi-line comments or single-line comments
/* */ : Beginning and ending delimiters for multi-line comments
// : Single-line comment delimiters
Class Declarations
Classes are created using the keyword class
Classes are similar to a C struct in the way that they are declared
Not surprising since a struct is a class in C++
A semi-colon is used after the class definition, just as is done with a struct
Unlike a C struct, a class can include API declarations
A class can also contain only API declarations
A class can also be empty (yes, these are actually useful)
By default, all data and APIs in a class are private (can only be accessed by other APIs in the class)
The keyword public makes all data and APIs listed after the keyword accessible to anyone else from outside the class
The keyword protected limits access to the data and APIs after it to the class itself and any inheriting classes
The keywords private, protected, and public can occur anywhere within the class definition, and can be repeated as often as
desired
Data members are defined in the same way that they are defined in a C struct
APIs are defined using either just the fully-qualified API declaration or as the declaration and body
When the body is provided within the class, the API is a candidate for code inlining
A class definition may be nested within another class definition
Class Syntax Basics
class SomeClass
{
// All APIs, data, and classes below are private
class NestedClass
{
// This is a nested class within SomeClass
// It's private, so nobody can access it directly
// outside of SomeClass instances
...
};
int mSomeData;
protected:
// All data, APIs, and/or classes listed after the protected
// keyword are accessible to inheriting classes
int mProtectedData;
void someProtectedAPI();
public:
// All data, APIs, and/or classes listed after the public
// keyword are accessible to anybody
void doSomeOperation();
private:
// Back to private access again
void yetAnotherAPI();
public:
// Back to public
Copy Constructor
// This program generates a compiler error because the copy constructor
// is declared as private. The problem is that, when a() is called, a copy
// of the class is passed into it.
//
// If the copy constructor is commented out, the code will compile because the
// compiler will generate a copy constructor that does a bit copy of the
// class instance. BEWARE COMPILER-GENERATED CONSTRUCTORS!
class SomeClass
{
private:
SomeClass(const SomeClass &copy) {}
public:
SomeClass() {}
~SomeClass() {}
SomeClass &operator=(const SomeClass &copy) { return *this; }
};
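The function a() mentioned in the comment is not shown. To see why passing an instance by value needs the copy constructor, the sketch below counts copies instead of forbidding them. Counted, byValue(), and byReference() are hypothetical names.

```cpp
class Counted
{
public:
    static int gCopies;
    Counted() {}
    Counted(const Counted &copy) { (void) copy; gCopies++; }
};
int Counted::gCopies = 0;

// The parameter is a new instance, so the copy constructor runs
// every time this is called with an existing object.
inline void byValue(Counted c) { (void) c; }

// No new instance is created, so no copy is made.
inline void byReference(const Counted &c) { (void) c; }
```

If the copy constructor were private, the call to byValue() from outside the class is the line that would fail to compile.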
Class Members
Variable members of a class may be an instance of another class
Variable members are initialized (constructors are invoked) in the order in which they appear within the class
Could cause problems if one variable depends on the setting of another variable and the initialization order is incorrect for that
dependency
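A small sketch of the ordering rule follows. Logger and OrderDemo are hypothetical. Each Logger appends a tag to a string when it is constructed, so the resulting string records the order in which the members were initialized.

```cpp
#include <string>

// Hypothetical member type that records when it is constructed.
struct Logger
{
    Logger(std::string &log, const char tag) { log += tag; }
};

class OrderDemo
{
    Logger mA;   // declared first, so constructed first
    Logger mB;   // declared second, so constructed second
public:
    // Members are initialized in declaration order. Writing the
    // initializer list in a different order would not change that,
    // and many compilers warn when the two orders disagree.
    explicit OrderDemo(std::string &log) : mA(log, 'A'), mB(log, 'B') {}
};
```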
const
Variables, class data members, API parameters, and APIs may be designated as const
Variables outside of a class
The const keyword, when applied to a variable tells the compiler that the value of the variable (or variables in the instance that
the variable represents) won't change
Class data member (instance variable)
A const data member is assigned a value at construction (or at definition, in the case of a static data member (see below))
It can never change in the life of the class instance
API parameter
Its contents will not change during the scope of the API
API/method
Tells the compiler (and caller) that none of the instance's object variables will change within that API
The method has no side effects
Put another way, the user would be surprised if the method did have side effects
Instance variables that are const can only invoke const methods
Methods that are const can only invoke other const methods (once a const, always a const)
Methods that are not const can call any type of method
Location of const keyword impacts what is actually constant
Probably one of the most confusing aspects of dealing with const
Const examples
// Variables
const char *a; // The variable is going to point to something that cannot change
// However, "a" can be reassigned
char const *b; // Identical to "a"
char * const c; // The variable "c" cannot be changed (so it should be initialized
// when declared). What it points to can change.
const char * const d; // The variable "d" cannot be changed (so it should be initialized
// when declared). What it points to cannot change, either
// Parameters
void someAPI(const int x); // The variable "x" cannot change within the API
class SomeClass
{
const unsigned mValue; // The variable mValue must be set at construction
bool mTestVar;
...
public:
...
SomeClass(const int value);
...
void someAPI() const; // The API someAPI() cannot change a variable of the instance
// unless it is marked mutable (below)
};
mutable Variables
Mutable Example
class LockObject
{
...
public:
...
void lock(); // not const
void unlock(); // not const
...
};
class SomeClass
{
mutable LockObject mLock;
...
public:
...
int returnSomeValue() const;
...
};
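The body of returnSomeValue() is not shown above. A minimal sketch follows, assuming a LockObject that merely counts calls (it is a stand-in, not a real mutex). The mValue member, the constructor, and timesLocked() are assumptions added for the example.

```cpp
// Stand-in lock: counts how often it was taken.
class LockObject
{
    unsigned mLockCount;
public:
    LockObject() : mLockCount(0) {}
    void lock() { mLockCount++; }   // not const
    void unlock() {}                // not const
    unsigned lockCount() const { return mLockCount; }
};

class SomeClass
{
    mutable LockObject mLock;
    int mValue;
public:
    explicit SomeClass(const int value) : mValue(value) {}

    int returnSomeValue() const
    {
        // Calling the non-const lock()/unlock() is legal here only
        // because mLock is declared mutable.
        mLock.lock();
        const int result = mValue;
        mLock.unlock();
        return result;
    }

    unsigned timesLocked() const { return mLock.lockCount(); }
};
```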
static
Variables and APIs of a class can be static
Variables
When a variable is declared as static, there is only one variable for all instances of a class
It's effectively a global variable, scoped within the class
Variables marked as static may be private, protected, or public, just like non-static variables
In general, static variables should be initialized when they are defined and not in the constructor
Assigning in the constructor would cause each instance to reassign a value to the variable
Such variables may also be const
Declared within the class but defined outside of the class
There is no difference between declaration and definition for instance variables
APIs
Can also be marked as static
They can be called outside of any specific instance
Cannot be const
It makes no sense for them to be const since they can't access variables or APIs of an instance directly
They can call private and protected APIs of instances of their own class (in addition to the public APIs that any routine
can call)
There is no this pointer associated with a static API
If you want to operate upon a specific instance, you need to pass that instance to the static API
private:
void myPrivateAPI();
public:
...
static void someStaticAPI();
static void anotherStaticAPI(SomeClass *scInstance);
...
};
...
SomeClass sc;
...
SomeClass::someStaticAPI(); // Call a static API by fully qualifying the name
...
sc.someStaticAPI(); // Can use an instance to call the API
...
SomeClass::anotherStaticAPI( &sc );
...
sc.anotherStaticAPI( &sc );
...
Why would you want to use static? Compatibility with a C library, for example
Static APIs and Data Members
class SomeClass
{
...
static unsigned gInstanceCount; // Using "g" prefix to denote "global"
void *threadedAPI();
public:
SomeClass();
~SomeClass();
...
static void *threadedEntry(void *ptr);
};
SomeClass::SomeClass()
{
...
gInstanceCount++;
}
SomeClass::~SomeClass()
{
...
gInstanceCount--;
}
void *SomeClass::threadedAPI()
{
// This is the API that actually does the threading
...
}
...
pthread_t thread;
pthread_create( &thread, NULL, SomeClass::threadedEntry, new SomeClass );
...
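The definitions of threadedEntry() and gInstanceCount are not shown above. The sketch below is one plausible way to write them. To stay free of any threading library, runCallback() is a hypothetical stand-in for a C API such as pthread_create() that accepts a plain function pointer and a void * argument; mRuns, runs(), and instanceCount() are also assumptions.

```cpp
class SomeClass
{
    static unsigned gInstanceCount;   // declared here, defined below
    int mRuns;

    void *threadedAPI() { mRuns++; return this; }
public:
    SomeClass() : mRuns(0) { gInstanceCount++; }
    ~SomeClass() { gInstanceCount--; }

    int runs() const { return mRuns; }
    static unsigned instanceCount() { return gInstanceCount; }

    // A static API has no "this" pointer, so the instance is
    // recovered from the void * argument and the call is forwarded.
    static void *threadedEntry(void *ptr)
    {
        SomeClass *self = static_cast<SomeClass *>(ptr);
        return self->threadedAPI();
    }
};

// Static data members are defined (and initialized) outside the class.
unsigned SomeClass::gInstanceCount = 0;

// Hypothetical stand-in for a C library that takes a callback.
typedef void *(*EntryPoint)(void *);
inline void *runCallback(EntryPoint entry, void *arg)
{
    return entry(arg);
}
```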
Restricting Construction and Assignment
See "Defined vs Declared" above
There are some situations in which public constructors and/or assignment operators make no sense
They can be declared in the class, with no body provided
This keeps the compiler from auto-generating a default version for these constructors
Will cause a link error if a programmer attempts to use them by mistake
Example: TCP/IP communication pipe class
In the specification, it was decided that no instance can be created without a TCP/IP address
The default constructor is declared private so that nobody can create an instance without the address
The copy constructor and assignment operator are restricted because making a copy of the pipe instance doesn't make sense
for a communications class
How would communication work if a copy is allowed
What happens when the first copy goes out of scope or is explicitly destroyed?
TCP/IP Pipe Class Example
class TCPIPPipe
{
public:
// This is just an example to illustrate a nested class. Ordinarily
// you'd use another class and not a struct.
struct IPAddress
{
short mAddress[ 4 ];
IPAddress()
{
mAddress[ 0 ] = mAddress[ 1 ] = mAddress[ 2 ] = mAddress[ 3 ] = 0;
}
};
private:
/*
* Do not allow the default constructor, copy constructor, or assignment
* operator. By declaring them here and providing no body, we keep the
* compiler from auto-generating a default version. Furthermore, any attempt
* to use them will result in a compiler error, telling the programmer that
* these operations are prohibited
*/
TCPIPPipe();
TCPIPPipe(const TCPIPPipe &copy);
TCPIPPipe &operator=(const TCPIPPipe &copy);
public:
/*
* The only supported constructor will take a TCP/IP address argument
*/
TCPIPPipe(const IPAddress &address);
~TCPIPPipe();
...
};
...
TCPIPPipe::IPAddress addr;
addr.mAddress[ 0 ] = 192;
addr.mAddress[ 1 ] = 168;
addr.mAddress[ 2 ] = 1;
addr.mAddress[ 3 ] = 1;
TCPIPPipe pipe( addr );
TCPIPPipe pipe0; // Compiler error: private default constructor
TCPIPPipe pipe1( pipe ); // Compiler error: private copy constructor
TCPIPPipe pipe2( addr );
pipe2 = pipe; // Compiler error: private assignment operator
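The example above uses the private-and-undefined technique. As an alternative that is not part of the course example, C++11 and later allow the same restriction to be stated with "= delete". The sketch below is a reduced version of the class; octet() is a hypothetical accessor and IPAddress is simplified to a plain aggregate.

```cpp
#include <type_traits>

class TCPIPPipe
{
public:
    struct IPAddress
    {
        short mAddress[ 4 ];
    };

    explicit TCPIPPipe(const IPAddress &address) : mAddress(address) {}

    // Deleted operations: any use is rejected at compile time, even
    // from inside the class, so no link error is needed as a backstop.
    TCPIPPipe() = delete;
    TCPIPPipe(const TCPIPPipe &copy) = delete;
    TCPIPPipe &operator=(const TCPIPPipe &copy) = delete;

    short octet(const unsigned index) const { return mAddress.mAddress[ index ]; }
private:
    IPAddress mAddress;
};

// Compile-time checks that the restrictions are in effect.
static_assert(!std::is_default_constructible<TCPIPPipe>::value, "no default construction");
static_assert(!std::is_copy_constructible<TCPIPPipe>::value, "no copy construction");
static_assert(!std::is_copy_assignable<TCPIPPipe>::value, "no assignment");
```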
Construction
class C
{
...
public:
C();
C(const char *someText);
...
};
class B: public C
{
...
public:
B();
B(void *ptr);
...
};
class A: public B
{
...
public:
A();
...
};
A::A() : B( NULL )
{
...
}
class B
{
...
public:
B();
B(void *ptr);
...
};
B::B(void *ptr)
{
...
}
Virtual Methods
Preceded by the keyword virtual
Provides runtime binding to a method instead of compile time binding
If you invoke a method, ptr->method(), the code to be executed is selected at compile time for a non-virtual method based on
the apparent type of the pointer
The code to be executed is selected at runtime for a virtual method based on the actual type of the object that the pointer (ptr) refers to
This is called dynamic binding
Inheriting classes can override the definition provided in a superclass
Once a method is virtual, it's virtual for all inheriting classes
The most-derived class that overrides the method supplies the code that is invoked
Constructors cannot be virtual but destructors can be
By defining the base class destructor as virtual, you guarantee that deleting a pointer to a base class will invoke the proper
destructor instead of invoking only the base class destructor
A virtual destructor uses the keyword virtual before the destructor
Used in inheritance to guarantee that the correct destructor is applied
Example
Suppose class B inherits from class A
A pointer to an instance of B is passed to an API that expects a pointer to A
The API destroys the instance, but the pointer it holds now has the apparent type A, not B
If the destructor is virtual, B's destructor will be invoked (which will, in turn, invoke A's)
If the destructor is not virtual, only A's destructor is invoked
More on virtual destructors when talking about inheritance
In typical implementations, a class with even a single virtual API requires that all of its instances carry an additional (hidden) pointer in order to resolve virtual API calls
Pure, virtual APIs have no body associated with them
Use "=0" syntax when declaring the API within the class
Inheriting classes must provide the body for pure, virtual APIs
Only a class for which all pure, virtual APIs have been defined can be instantiated
Can also have virtual superclasses
To be covered later in multiple inheritance
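The virtual destructor and pure virtual rules above can be sketched as follows. This is a minimal illustration, not code from the course: the names Base, Derived, gDestructionLog, and useThroughBasePointer() are hypothetical.

```cpp
#include <string>

// Hypothetical log that records the order in which destructors run.
static std::string gDestructionLog;

class Base
{
public:
    virtual ~Base() { gDestructionLog += "Base "; }
    // Pure virtual API: Base cannot be instantiated on its own.
    virtual const char *name() const = 0;
};

class Derived : public Base
{
public:
    ~Derived() { gDestructionLog += "Derived "; }
    const char *name() const { return "Derived"; } // Provides the required body
};

// Creates a Derived, uses it through a Base pointer, and deletes it through
// that same Base pointer. Returns the name reported by the virtual call.
std::string useThroughBasePointer()
{
    Base *ptr = new Derived;
    const std::string reported = ptr->name(); // Bound at runtime to Derived::name()
    delete ptr; // Virtual destructor: ~Derived() runs first, then ~Base()
    return reported;
}
```

If ~Base() were not virtual, the delete through the Base pointer would not be guaranteed to run ~Derived() at all.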
Virtual Methods
class A
{
...
protected:
// Define these as protected in the event that inheriting classes define a
// copy constructor or assignment operator. We can't create an instance of
// A since we have a pure, virtual API, so making them public doesn't make
// any sense
A(const A &copy);
A();
A &operator=(const A &copy);
public:
virtual ~A();
class B : public A
{
...
public:
...
void draw(); // Override A's draw()
void drawIt(); // Provide a definition for drawIt()
...
};
class C : public B
{
...
public:
...
void draw(); // Override B's draw()
void drawIt(); // Override B's drawIt()
...
};
void B::draw()
{
...
A::draw();
}
void C::draw()
{
...
B::draw();
}
obj->draw();
obj->drawIt();
public:
SomeClass();
SomeClass(const int constValue);
...
};
SomeClass::SomeClass() : mConstValue( 0 )
{
mOtherValue = 10;
}
References
Similar to pointers, but with a different syntax
Safer in that you can't delete a reference like you can a pointer
A reference must be assigned when declared
A reference cannot be reassigned
References
class SomeClass
{
...
public:
...
someAPI();
...
};
...
int x;
SomeClass inst;
int &xReference = x;
SomeClass &instReference = inst;
...
inst.someAPI();
...
instReference.someAPI(); // same as inst.someAPI()
...
The variables xReference and x both refer to the same memory location
Any changes made using xReference are the same as using x and vice versa
The class instance variables inst and instReference refer to the same instance, just like x and xReference
So why would you use references?
Some people like the syntax better than dealing with pointers
A returned reference cannot be deleted while a returned pointer can be
Can't return a reference to a local variable since the memory associated with a variable will be invalid after the function
returns
Convention: passing in a const reference instead of a pointer
Passing a pointer could mean that you're going to allow the variable to be changed
Passing in a const reference means that the variable won't be changed
It all comes down to a matter of style
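A small sketch of the aliasing behavior and the const reference convention described above. The function names doubled(), increment(), and aliasDemo() are hypothetical and exist only for this illustration.

```cpp
// A const reference parameter: the caller's value is read but never changed,
// and no copy of the argument is made.
int doubled(const int &value)
{
    return value * 2;
}

// A non-const reference parameter: the function changes the caller's variable.
void increment(int &value)
{
    ++value;
}

// A reference is an alias: writes through it change the original variable.
int aliasDemo()
{
    int x = 5;
    int &xReference = x; // Must be bound when declared
    xReference = 7;      // Same as "x = 7"
    increment(x);        // x is now 8
    return x;
}
```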
Reference Examples
const int &min(const int &a, const int &b)
{
return( a < b ? a : b );
}
int temp = a;
a = b;
b = temp;
}
...
return( *this );
}
int &invalidReturn()
{
// This is invalid because we can't return a reference to a local variable.
// The memory associated with the variable is no longer valid once the
// function returns, so a reference makes no sense
int x;
...
return( x );
}
const int &findFromVector(const int *vector, const unsigned numValues, const int
findValue)
{
// This may appear to work with many compilers because a reference is typically
// implemented as a pointer under the covers. However, creating a reference from
// a NULL pointer is undefined behavior in C++, so it should not be relied upon
Variable Declarations
In traditional C89, variables must be declared at the top of a block
In C99, variables can be declared where they are first used
Adopted from C++ specification
Can define variables inside of a for-loop
bool Types
The bool type exists in ANSI C (C99) but not in older versions of C
Requires use of include file stdbool.h
Can have values of true and false (which are macros in C)
In C++, bool is a defined type, and true and false are keywords
They are values of type bool, not integers, although they convert to 1 and 0 when used as integers
...
double p1 = poly( 1.0, 5.0 ); // p1 = 5
double p2 = poly( 2.0, 3.14159, 5.75, 3.6 ); // p2 = 29.0416
...
class SomeClass
{
...
public:
SomeClass(const long x=0, const double y=1.4, const double z=3.14159);
SomeClass(const char *name=NULL); // Compiler error -- conflicts with prior
// constructor when invoked with no args
SomeClass(const char *name, const long x=0);
...
void someAPI(const double a=6.0, const double b=2.3) const;
...
};
Friends
A class can have a friend API or class
A friend can call private and protected APIs in a class, or access data directly in a class, just as if it was an API of the class
Should have very limited utility, though there are cases where it makes sense
It breaks abstraction if used liberally
Should never be used if there's a reasonable alternative!
Graph
#include <vector>
class Node;
class Edge;
class Graph;
class Node
{
// A node can have 0 or more edges attached to it
Graph *mOwner;
std::vector<Edge*> mEdges;
...
private:
// We aren't going to allow these because they don't
// make sense
Node();
Node(const Node &copy);
Node &operator=(const Node &copy);
private:
// Make the graph a friend so that it can create nodes
friend class Graph;
Node(Graph *owner);
~Node();
public:
unsigned numEdges() const;
Graph *owner() const;
...
};
class Edge
{
Graph *mOwner;
Node *mNodes[ 2 ]; // Pointers: Node has no public default or copy constructor
...
private:
Edge();
Edge(const Edge &copy);
Edge &operator=(const Edge &copy);
private:
// Make the Graph a friend so that it can create edges
friend class Graph;
public:
Node &operator[](const unsigned nodeIndex) const;
Graph *owner() const;
...
};
class Graph
{
std::vector<Node*> mNodes;
std::vector<Edge*> mEdges;
...
public:
Graph();
Graph(const Graph &copy);
~Graph();
...
Node &createNode();
Edge &createEdge(Node &n1, Node &n2);
...
Node &Graph::createNode()
{
Node *newNode = new Node( this );
mNodes.push_back( newNode );
return( *newNode );
}
In this example, we don't allow public constructors or assignment operators for Node and Edge
A Node instance and Edge instance must belong to a Graph instance, so creating one outside of a Graph doesn't make sense
We have APIs within Graph to create Node and Edge instances
They also take care of data management (adding those instances to the Graph's data structures for tracking)
Because a Graph must be able to create a Node and Edge, it's a friend to both classes
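The ownership pattern above can be reduced to a compact, compilable sketch. The classes Item and Registry are hypothetical stand-ins for Node and Graph; they are assumptions made for illustration and not the course's implementation.

```cpp
#include <cstddef>
#include <vector>

class Registry;

class Item
{
    int mId;
private:
    // Only a friend can construct or destroy an Item
    explicit Item(int id) : mId(id) {}
    ~Item() {}
    Item(const Item &copy);            // Declared but not defined: no copying
    Item &operator=(const Item &copy); // Declared but not defined: no assignment
    friend class Registry;
public:
    int id() const { return mId; }
};

class Registry
{
    std::vector<Item*> mItems;
private:
    Registry(const Registry &copy);
    Registry &operator=(const Registry &copy);
public:
    Registry() {}
    ~Registry()
    {
        for (std::size_t i = 0; i < mItems.size(); ++i)
            delete mItems[i]; // Allowed because Registry is a friend of Item
    }
    // Creates an Item and tracks it so that it can be released later
    Item &createItem()
    {
        Item *item = new Item(static_cast<int>(mItems.size()));
        mItems.push_back(item);
        return *item;
    }
    std::size_t numItems() const { return mItems.size(); }
};
```

Code outside Registry cannot write "Item item(3);" or "delete &item;" because both the constructor and the destructor are private.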
Header Files
The C standard library header files have been renamed in C++
Many have a "c" prepended to their name
They no longer use the ".h" in the name
Examples
#include <cstring>
#include <cstdio>
#include <cstdlib>
Namespaces
Namespaces prepend class and API names with the name of the namespace so as to avoid name collisions
C++ uses the std namespace for many of the system APIs
Even normal C APIs, like memcpy(), have been moved to the std namespace
To use an API or class in a namespace, prepend the class or API with the name of the namespace, separated by "::"
std::memcpy()
If within a namespace, and you want to use an API in the global namespace that conflicts with the name of an API in your current
namespace, prepend the API with "::"
The anonymous namespace defines things that can be used as if they were in the global namespace, but they are
visible only within the file in which they are defined
Doesn't make any sense to put them into a header file
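A short sketch of the lookup rules above: a name inside a namespace, the "::" prefix for the global version, and an anonymous namespace. All names here (answer(), MyLibrary, fileLocalHelper(), callHelper()) are hypothetical.

```cpp
// A function in the global namespace
int answer() { return 1; }

namespace MyLibrary
{
    // Same name, different namespace: no collision
    int answer() { return 2; }

    int callLocal()  { return answer(); }   // Finds MyLibrary::answer()
    int callGlobal() { return ::answer(); } // "::" selects the global version
}

namespace
{
    // Visible only within this source file
    int fileLocalHelper() { return 3; }
}

int callHelper() { return fileLocalHelper(); }
```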
Namespaces
#include <cstring>
namespace MyNameSpace
{
void *memcpy(void *dest, const void *src, size_t n)
{
...
}
void someAPI()
{
...
std::memcpy( to, from, size ); // Calls the "C" version of memcpy()
...
::memcpy( to, from, size ); // Calls the version at the top of the file
...
memcpy( to, from, size ); // Calls the version in my namespace
...
using qi::double_;
using qi::phrase_parse;
using qi::_1;
using qi::lit;
using phoenix::push_back;
Skipper<Iterator> skipper;
Because C was the output of cfront, it was easy to compile and link the two together
Example C++ source and cfront output below (from http://cpptips.com/c++_c_output)
//-------------------------------------------------------//
class Base {
private:
int Base__privateData ;
protected:
int Base__protectedData ;
public:
int Base__publicData ;
} ;
//-------------------------------------------------------//
public:
int Derived__publicData ;
//-------------------------------------------------------//
int main() {
int i ;
Derived d1 ;
Derived d2( 2 ) ;
i = d1.Base__getPrivateData() ;
printf( "%d\n", i ) ;
i = d1.Derived__getPrivateData() ;
printf( "%d\n", i ) ;
i = d1.F1() ;
printf( "%d\n", i ) ;
i = d1.F2() ;
printf( "%d\n", i ) ;
}
----------------------------------------------------------------------------------------------------
I2 cfront C output (formatted)
----------------------------------------------------------------------------------------------------
char *__vec_new() ;
char __vec_delete() ;
struct __mptr {
short d ;
short i ;
__vptp f ;
} ;
struct Base {
int Base__privateData__4Base ;
int Base__protectedData__4Base ;
int Base__publicData__4Base ;
struct __mptr *__vptr__4Base ;
} ;
return __0this ;
}
struct Derived {
int Base__privateData__4Base ;
int Base__protectedData__4Base ;
int Base__publicData__4Base ;
struct __mptr *__vptr__4Base ;
int Derived__privateData__7Derived ;
int Derived__publicData__7Derived ;
} ;
return __0this ;
}
int
main()
{
_main() ;
{
int __1i ;
struct Derived __1d1 ;
struct Derived __1d2 ;
__dt__7DerivedFv( &__1d2, 2 ) ;
__dt__7DerivedFv( &__1d1, 2 ) ;
}
exit( 0 ) ;
}
char __pure_virtual_called() ;
A struct is a class
A C++ class binds data and APIs into a single entity
From the output of cfront (above), it can be seen that a class is translated into a C struct
In C++, a struct is simply a class in which all APIs and data members are public by default
This maintains backward compatibility with C since existing code can be compiled without requiring a change
It also means that a C struct can be enhanced with APIs without breaking compatibility
One or more constructors to initialize a struct instance
An assignment operator to be used when a simple bit copy could lead to problems (for example, copying pointers)
A destructor to return memory to the heap when no longer needed
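As a minimal sketch of enhancing a C-style struct with APIs, the hypothetical Point struct below keeps its public data members while adding constructors and a member function. The name and its members are assumptions made for illustration.

```cpp
// Existing code can still read and write x and y directly, as it would in C.
struct Point
{
    int x;
    int y;

    Point() : x(0), y(0) {}
    Point(int xValue, int yValue) : x(xValue), y(yValue) {}

    // Sum of the absolute values of the coordinates
    int manhattanLength() const
    {
        return (x < 0 ? -x : x) + (y < 0 ? -y : y);
    }
};
```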
A C++ Struct
class ThisIsAStruct
{
public:
... // Data members only
};
Name Overloading
Note above that the cfront program expanded the names of methods to include the class type as well as the parameter types
Because of this expansion, C++ lets the user write APIs of the same name as long as the types and/or number of parameters differ
between the APIs
Works for both class methods as well as normal functions
Required changes to the C header files
Function Overloading
// These are all legal because the parameter types differ
// Note that the return types differ, though this has no
// bearing on function overloading
float someAPI(int a)
{
...
}
The mangled names for the above APIs can be seen by using the UNIX nm command
The c++filt command converts mangled names into C++ names
Mangled Names
$ g++ -c someAPI.cpp
$ nm someAPI.o
0000000000000038 s EH_frame0
0000000000000010 T __Z7someAPIi
0000000000000020 T __Z7someAPIid
0000000000000000 T __Z7someAPIii
$ nm someAPI.o | c++filt
0000000000000038 short EH_frame0
0000000000000010 T someAPI(int)
0000000000000020 T someAPI(int, double)
0000000000000000 T someAPI(int, int)
In C, the parameter types do not become part of the function name
Because C++ code had to link to C libraries, a new syntax was developed so that the compiler would recognize C APIs and not expand
the names
#if defined(__cplusplus)
extern "C" {
#endif
...
void *memcpy(void *dest, const void *src, const size_t size);
void *memset(void *dest, int val, size_t size);
...
#if defined(__cplusplus)
}
#endif
Summary
C++ isn't a pure object-oriented programming language
Its C heritage required tradeoffs in its implementation
The fact that it came from C isn't all bad
C++ benefits from the raw processing power of C
Fast adoption due to the sheer number of C programmers at the time it was developed
At the time, it could utilize all of the APIs already in place for C programs, especially in a UNIX environment
Despite the fact that C++ compilers natively compile C++ source instead of translating to C, its C heritage remains intact
Understanding the translation process benefits C++ programmers even today when trying to determine optimization
methodologies
Because the types are part of the expanded method names, APIs can be overloaded in C++ based on parameter types and const
modifiers, but not return types
What is this?
The this Pointer
How do you associate data with an API?
In examples shown so far, the class APIs access data without explicitly referring to a specific instance
public:
...
unsigned size() const;
...
};
...
StringClass someString;
...
unsigned stringLen = someString.size();
The translator added a hidden parameter, named "this", that points to the specific instance of the object that is to be operated upon
Even though cfront is no longer used, the "this" pointer still exists
It can be used in existing code to distinguish this specific instance
Why Use the this Pointer?
Prior examples in the class have shown how to override the assignment operator
The assignment operator copies the contents of the passed instance into the current instance
Assigning integer variable b to a copies the contents of b into a
It also returns a reference to a so that further copies can be made
The generic assignment operator shown in the class so far consists of several components
The if-statement checks to see if the passed copy is the same as the current copy (assigning a variable to itself)
Sometimes assigning a variable to itself is fine
Assigning an integer variable to itself is not a problem
In many classes, assignments have side effects
A class may first deallocate memory it currently holds in preparation for a new value (the destroy() API)
If memory is released or the state is destroyed, the instance value will be reset
The next step, to copy the contents into the current instance, will then copy reset values instead of original
values, assuming that there isn't a program crash
Validating that the passed instance isn't the current instance is a quick way to avoid unnecessary (and many times
problematic) assignments
There are many web discussions about this particular problem that you may want to review
After the current contents have been destroyed and the new contents copied into the instance, a reference to the current
instance is returned
The form that you choose for your assignment operator should match your class
You may not need a separate destroy() or assign()
You may not need to check to see if an instance is assigning to itself
However, you must always return a reference to the current instance
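The components of the assignment operator described above (self-assignment check, destroy, assign, return of *this) might look like the following sketch. TextBuffer is a hypothetical class, and its destroy() and assign() helpers only mirror the names used in the discussion; this is not the course's own code.

```cpp
#include <cstring>

class TextBuffer
{
    char *mText;

    // Copies the text owned by "copy" into freshly allocated memory
    void assign(const TextBuffer &copy)
    {
        mText = new char[std::strlen(copy.mText) + 1];
        std::strcpy(mText, copy.mText);
    }
    // Releases the memory currently held and resets the state
    void destroy()
    {
        delete[] mText;
        mText = NULL;
    }
public:
    explicit TextBuffer(const char *text)
        : mText(new char[std::strlen(text) + 1])
    {
        std::strcpy(mText, text);
    }
    TextBuffer(const TextBuffer &copy) : mText(NULL) { assign(copy); }
    ~TextBuffer() { destroy(); }

    TextBuffer &operator=(const TextBuffer &copy)
    {
        if (this != &copy) // Skip self-assignment: destroy() would wipe the source
        {
            destroy();
            assign(copy);
        }
        return *this; // Allows chained assignment such as a = b = c
    }

    const char *text() const { return mText; }
};
```

Without the comparison against this, assigning an instance to itself would call destroy() first and then try to copy from the text that was just released.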
struct MouseData;
struct WindowData;
enum WindowEvents
{
...
MOUSE_CLICK,
...
};
/*
* You have a C++ class that creates a GUI window in which users can draw
* shapes using their mouse (for example, a routing editor). You want your
* C++ class to interface to the window system, but it must do
* so via the C APIs.
*/
class MyGUICanvas
{
...
private:
void mouseWasClick(MouseData *data);
static void mouseClickCallback(MouseData *mouseData, void *instancePtr);
...
public:
...
void guiRegisterCallbacks(WindowData *window);
...
};
...
WindowRegisterCallback( window, MOUSE_CLICK, reinterpret_cast<MouseCallback>(
&mouseClickCallback ), this );
...
}
Summary
The this pointer is a hidden parameter to every non-static class API
It can be dereferenced just like any other pointer
You can't assign to it (the compiler will produce an error)
You can't declare a variable with the name "this"
When accessing a variable in an instance or calling an API, it is equivalent to using the this pointer to invoke the API
Because a struct is also a class, the this pointer also exists for APIs associated with struct instances
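One common use of the this pointer is handing an instance to a C-style callback API, as in the GUI example earlier. The sketch below is self-contained and makes several assumptions: ClickCallback, CallbackSlot, fireClick(), and Canvas are hypothetical names and do not belong to any real window system.

```cpp
// Hypothetical C-style callback API: it knows nothing about C++ classes and
// simply hands back the "void *" it was given at registration time.
typedef void (*ClickCallback)(int x, int y, void *userData);

struct CallbackSlot
{
    ClickCallback mCallback;
    void *mUserData;
};

void fireClick(const CallbackSlot &slot, int x, int y)
{
    slot.mCallback(x, y, slot.mUserData);
}

class Canvas
{
    int mClickCount;

    // Static member functions have no "this", so they match a plain C
    // function pointer. The instance arrives through userData instead.
    static void clickTrampoline(int x, int y, void *userData)
    {
        static_cast<Canvas*>(userData)->onClick(x, y);
    }
    void onClick(int, int) { ++mClickCount; } // Implicitly this->mClickCount
public:
    Canvas() : mClickCount(0) {}
    // Packages the static callback together with a pointer to this instance
    CallbackSlot makeSlot()
    {
        CallbackSlot slot = { &Canvas::clickTrampoline, this };
        return slot;
    }
    int clickCount() const { return this->mClickCount; } // Explicit "this"
};
```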
Casting
There are 2 types of casts: implicit and explicit
C supports both types, too
C Casts
int x;
float y;
...
y = x; // Implicitly convert an int into a float
...
x = y; // Implicitly convert float to int
...
char *str = (char*) malloc( 1024 ); // Explicitly convert "void*" to "char*"
class B
{
...
public:
B();
B(const B &other);
B(const A &other); // Implicit conversion from A to B
~B();
...
B &operator=(const B &other);
B &operator=(const A &other); // Implicit conversion from A to B
};
...
A a;
B b;
...
b = a; // Use assignment operator to convert
...
B bPrime( a ); // Use constructor to convert
Casting
class Animal { ... };
class Dog : public Animal { ... };
class Cat : public Animal { ... };
For each edit, please make sure that you start with the original file that I'm providing. You can get the original problem and the solutions by
clicking on the appropriate links below:
#include <cstdio>
/*
* This program is a very simple template to try experiments on. Compile
* and run it under the following scenarios to see the output.
*
* The program itself is pretty simple. When an instance is created, memory
* is allocated. When the instance goes out of scope, memory is deallocated.
* Messages are printed when memory is allocated and deallocated so that the
* address of this memory can be viewed. By changing the class and the code,
* we'll see different behavior.
*
* First, run the program "as is". You should see something to the following
* effect: (Note: Your memory addresses will be different than the example.)
* Memory allocated from constructor at 0x7f93e4003200
* Memory allocated from copy constructor at 0x7f93e4004600
* Memory deallocated in destructor at 0x7f93e4004600
* Memory deallocated in destructor at 0x7f93e4003200
* The first block of memory is allocated when instance "c" is created in
* main(). The main() body calls API someAPI() which takes a single parameter:
* an instance of SomeClass. C++ has to call the copy constructor for SomeClass
* to make a copy of the parameter "sc" which is used in the API. When the API
* terminates, the local instance goes out of scope, and the destructor is
* invoked. Finally, control returns to main(), and "c" goes out of scope,
* invoking the destructor.
*
* For the following assignments, always start from this base code, not the
* code that you modified.
*
* Assignments:
* o Change the parameter "sc" in someAPI() to be a const reference. You
* should see the following output:
* Memory allocated from constructor at 0x7fb03a003200
* Memory deallocated in destructor at 0x7fb03a003200
* Can you explain why?
* o Using the original code, comment out the copy constructor. Now you
* should see the following (or something close to it):
* Memory allocated from constructor at 0x7fe141003200
* Memory deallocated in destructor at 0x7fe141003200
* Memory deallocated in destructor at 0x7fe141003200
* *** glibc detected *** ./constructor1: double free or corruption ...
* Can you explain what's going on?
* o Using the original code, comment out the copy constructor. Also
* remove "const" from the member variable mSize. In main(), create a
* second variable of type SomeClass, d, and assign c to d ("d = c;"). You
* should see the following (or something close to it):
* Memory allocated from constructor at 0x7fa742003200
* Memory allocated from constructor at 0x7fa742004600
* Memory allocated from copy constructor at 0x7fa742004a00
* Memory deallocated in destructor at 0x7fa742004a00
* Memory deallocated in destructor at 0x7fa742003200
* Memory deallocated in destructor at 0x7fa742003200
* *** glibc detected *** ./constructor1: double free or corruption ...
* o Using the original code, comment out the assignment operator. In main(),
* add the line "SomeClass d = c;" and run. You should now see something
* like the following:
* Memory allocated from constructor at 0x7fa081803200
* Memory allocated from copy constructor at 0x7fa081804600
* Memory allocated from copy constructor at 0x7fa081804a00
* Memory deallocated in destructor at 0x7fa081804a00
* Memory deallocated in destructor at 0x7fa081804600
* Memory deallocated in destructor at 0x7fa081803200
* In C++, when you declare a variable and assign it at the same time,
* the copy constructor is invoked, whether you have an assignment operator
* or not (try it by uncommenting the assignment operator and compiling
* again). In other words, doing
* SomeClass d = c;
* is the same as doing this:
* SomeClass d( c );
* Give it a try to see.
*/
class SomeClass
{
char *mMemory;
const unsigned mSize;
public:
SomeClass();
SomeClass(const SomeClass &copy);
~SomeClass();
Operator Overloading
Available Operators
Most normal C operators can be redefined on a per-class basis
Cannot redefine operators for POD (int, long, short, etc.)
Redefining operators for POD could cause havoc in programs
Added C++ specific operators that can be overloaded
new and delete
Follows standard C operator precedence
In addition to using natural notation, you can also use explicit operator notation
Ex: "a += b" or "a.operator+=( b )"
Table from http://en.cppreference.com/w/cpp/language/operator_precedence
() Function call
[] Array subscripting
* Indirection (dereference)
& Address-of
sizeof Size-of
14 || Logical OR
17 , Comma Left-to-right
Restrictions
Cannot overload "::", ".", ".*" or "?:" operators
Cannot overload sizeof() or throw() (shown as operators above, but not normally thought of that way)
New operators cannot be created
Other than cast operators
Overloading ",", "&&", and "||" means that evaluations using those operators lose their special properties
Operator "," normally sequences operations
Operator "&&" will normally stop evaluating an expression if the left side of the operator is false
Operator "||" will normally stop evaluating if the left side of the expression is true
Binary operators must return a value, just like they do for POD in C
When defining a binary operator in a class, the instance on the left side of the operator is the instance called while the right
side is the parameter
Ex: "a == b" is equivalent to "a.operator==( b )"
Interesting points
Overloading the function call operator (operator()()) can be done multiple times by passing in different arguments
Example: people sometimes use it when accessing multi-dimensional arrays
Can redefine the way that memory is allocated and deallocated on a per-class basis by overloading the new and delete
operators
Can prohibit allocation of a vector of instances by hiding new[] and delete[] operators
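The loss of short-circuit evaluation mentioned under Restrictions can be observed with a small sketch. The Flag type, the counting helpers, and gEvaluations are hypothetical and exist only to count how many operands get evaluated.

```cpp
struct Flag
{
    bool mValue;
    explicit Flag(bool value) : mValue(value) {}
};

// Overloaded "&&" is an ordinary function call: both operands are evaluated
// before the function body runs, so there is no short-circuiting.
Flag operator&&(const Flag &lhs, const Flag &rhs)
{
    return Flag(lhs.mValue && rhs.mValue);
}

static int gEvaluations = 0;

Flag countedFlag(bool value)
{
    ++gEvaluations;
    return Flag(value);
}

bool countedBool(bool value)
{
    ++gEvaluations;
    return value;
}

// Returns how many operands were evaluated for the built-in operator
int builtInEvaluations()
{
    gEvaluations = 0;
    const bool result = countedBool(false) && countedBool(true);
    (void) result;
    return gEvaluations;
}

// Returns how many operands were evaluated for the overloaded operator
int overloadedEvaluations()
{
    gEvaluations = 0;
    const Flag result = countedFlag(false) && countedFlag(true);
    (void) result;
    return gEvaluations;
}
```

The built-in operator stops after the left side evaluates to false, while the overloaded version always evaluates both sides.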
Matrix Example
class Matrix
{
public:
typedef double BaseType;
private:
...
public:
...
BaseType &operator()(const unsigned row, const unsigned col);
const BaseType &operator()(const unsigned row, const unsigned col) const;
...
};
…
Matrix mat;
Matrix::BaseType value;
calculateValue( &value, ... );
mat( 4, 5 ) = value;
When dealing with combination logical/assignment or mathematical/assignment operators the value on the left-hand side is the object
operated upon while the value on the right-hand side is the parameter
...
SomeClass SomeClass::operator++(int)
{
/*
* In a post-increment operation, the increment is supposed to occur
* after the value is retrieved. For example, if "a" is 10, the
* following statement sets "b" to 10 and then increments "a" to 11:
* unsigned b = a++;
* The following scheme results in the same effect. Unfortunately, it can be
* expensive to do post increment because:
* (1) We have to make a copy of the class
* (2) The copy is returned on the stack
* A pre-increment operator can be much cheaper, depending on the class.
* Yes, the compiler can optimize away some things, but depending on your
* class, it can't optimize away everything.
*/
SomeClass &SomeClass::operator++()
{
/*
* In the pre-increment, we alter the value and then return a *reference*
* to the class (vs a copy of the class in post-increment). This can be
* a significant savings in runtime
*/
mSomeCounter++;
return( *this );
}
...
SomeClass x;
...
x++; // Potentially very expensive vs "++x"!
...
...
public:
...
SomeClass &operator+=(const SomeClass &rhs);
...
};
...
return( *this );
}
...
SomeClass sc;
...
if (!sc) // Implicit cast to bool
{
...
}
...
if (sc) // Implicit cast to bool
{
...
}
...
const bool scIsValid = static_cast<bool>( sc ); // Explicit cast to bool
...
public:
ComplexNumber();
ComplexNumber(const ComplexNumber &copy);
ComplexNumber(const double real);
ComplexNumber(const double real, const double imaginary);
...
ComplexNumber operator+(const ComplexNumber &other) const;
ComplexNumber &operator+=(const ComplexNumber &other);
ComplexNumber operator-(const ComplexNumber &other) const;
ComplexNumber &operator-=(const ComplexNumber &other);
ComplexNumber operator++(int); // Syntax to specify post-increment operator
ComplexNumber &operator++(); // Syntax to specify pre-increment operator
ComplexNumber operator--(int); // Syntax to specify post-decrement operator
ComplexNumber &operator--(); // Syntax to specify pre-decrement operator
...
};
ComplexNumber ComplexNumber::operator++(int)
{
// A post-increment operator returns the original value and increments *after*
// that value has been used in an expression. If creating a temporary instance
// is expensive, post-increment operations can be very expensive
ComplexNumber &ComplexNumber::operator++()
{
// A pre-increment operator does the increment and returns the new value
mReal++;
return( *this );
}
Default operator=()
If you don't write your own operator=(), the compiler will generate one for you that copies each data member of one instance into the other (for simple data such as pointers, this is a bit copy)
That's often not what you want to do
It's always safer to write your own than to let the compiler build one for you, even if it is just a bit copy
You can disable the operator=() from being used by making it private and not defining a body for it
May want to do that for very memory intensive classes that you don't want people to randomly create a copy of without realizing
the expense
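Both points above can be sketched briefly: what the compiler-generated operator=() does with a pointer member, and how a private, body-less operator=() disables assignment. ShallowHolder, NoAssign, and sharesBufferAfterAssignment() are hypothetical names used only for illustration.

```cpp
// No operator=() is written here, so the compiler generates one that copies
// each member. For a raw pointer member, that copies the address, not the data.
struct ShallowHolder
{
    int *mData;
};

// Assignment is disabled: the operator is private and has no body, so any
// attempt to assign fails to compile (or to link, from inside the class).
class NoAssign
{
    int mValue;
    NoAssign &operator=(const NoAssign &other);
public:
    explicit NoAssign(int value) : mValue(value) {}
    int value() const { return mValue; }
};

// Returns true when two holders share one buffer after a default assignment
bool sharesBufferAfterAssignment()
{
    int buffer = 42;
    ShallowHolder a = { &buffer };
    ShallowHolder b = { 0 };
    b = a;        // Compiler-generated operator=()
    *b.mData = 7; // Also changes what "a" sees
    return a.mData == b.mData && *a.mData == 7;
}
```

If the shared pointer referred to heap memory that both instances tried to free, the result would be the double free shown in the constructor exercises.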
class SomeObjectPointer
{
// This is the "smart pointer" to SomeObject. For now, we aren't going to
// define its contents because we just want to derive the actions that
// need to be performed on it
...
};
// Upon exiting the API, the destructor for ptr is invoked. This
// should decrement the reference count
}
...
// Create a smart pointer to an instance of SomeObject and use it
SomeObjectPointer sop( new SomeObject );
sop->someAPI();
...
// Reassign sop to point to something else. This should decrement the reference
// counter to indicate that only 1 thing still points to the original instance
// that was created above
sop = anotherPointer;
...
// After the next statement, you want the memory to be automatically destroyed
// without having to explicitly do it. That's because the reference counter for the
// original object went to zero
sop1 = sop;
Through this usage model, we've identified several different operators and APIs that we need to create
SomeObjectPointer Definition
#ifndef _SomeObjectPointer_h_
#define _SomeObjectPointer_h_
class SomeObject;
class SomeObjectPointer
{
unsigned *mReferenceCount;
SomeObject *mObject;
private:
void assign(const SomeObjectPointer &copy);
void destroy();
void init();
public:
// Define standard constructors and destructor
SomeObjectPointer();
SomeObjectPointer(SomeObject *object);
SomeObjectPointer(const SomeObjectPointer &copy);
~SomeObjectPointer();
// Define a way to get to the underlying pointer (since some APIs will still
// want the pointer instead of an instance of this class). Also define a way
// to reset the instance, back to its initial state
SomeObject &operator*();
const SomeObject &operator*() const;
SomeObject *operator->();
const SomeObject *operator->() const;
// Define casting operators that you would normally have for a pointer.
// There's the implicit cast to bool if the pointer is not NULL, and
// the test for a NULL value using !. For example:
// SomeObjectPointer sop;
// ...
// if (sop) ... // Use bool operator
// if (!sop) ... // Use ! operator
//////////////////////////////////////////////////////////////////////////////
// Private inline APIs below //
//////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////
// Public inline APIs below //
//////////////////////////////////////////////////////////////////////////////
inline SomeObjectPointer::SomeObjectPointer()
{
init();
}
inline SomeObjectPointer::SomeObjectPointer(SomeObject *object)
{
mObject = object;
mReferenceCount = new unsigned( 1 );
}
inline SomeObjectPointer::~SomeObjectPointer()
{
destroy();
init();
}
//////////////////////////////////////////////////////////////////////////////
// Comparison APIs below //
//////////////////////////////////////////////////////////////////////////////
#endif
C Matrix/Vector Multiplication
void matrixVectorMultiply(const float *A, const float *x, float *b,
const unsigned rows, const unsigned cols)
{
// Here's a traditional C implementation of the matrix/vector multiply
/*
* Here's the definition of the SSE class that can do the above code
*/
class SSEFloat
{
__m128 mValues;
private:
SSEFloat(const __m128 values);
public:
SSEFloat() {} // Does nothing
SSEFloat(const float value);
SSEFloat(const float *address);
~SSEFloat() {} // Does nothing
Summary
By using operator overloading, programmers can utilize the same syntax for objects that are available for POD
Programmers are already familiar with the standard C operators, so they can naturally understand what is going on with the
overloaded operator, often without having to look at programmer documentation
By utilizing operator overloading, templates can be written that work on objects and POD with the same syntax (more to come on
this in future classes)
POD operators cannot be redefined
Pre- and post-increment/decrement operators can have a dramatic impact on performance
Nothing says that a post-increment/decrement operator must be defined
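The cost difference between pre- and post-increment can be made visible by counting copies. The Counter class and its sCopies member below are hypothetical; the exact number of copies made by the post-increment depends on how much the compiler elides, so the sketch only relies on there being at least one.

```cpp
class Counter
{
    int mValue;
public:
    static int sCopies; // Number of copy constructor calls so far

    Counter() : mValue(0) {}
    Counter(const Counter &other) : mValue(other.mValue) { ++sCopies; }

    Counter &operator++()   // Pre-increment: no copy, returns a reference
    {
        ++mValue;
        return *this;
    }
    Counter operator++(int) // Post-increment: must keep a copy of the old value
    {
        Counter previous(*this);
        ++mValue;
        return previous;
    }
    int value() const { return mValue; }
};

int Counter::sCopies = 0;
```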
malloc/free
struct SomeStruct
{
...
};
...
SomeStruct *ptr = (SomeStruct *) malloc( sizeof( SomeStruct ) );
...
if (ptr)
{
free( ptr );
ptr = NULL;
}
...
// Deallocate memory en masse. This calls the destructor for each object and
// then returns the memory allocated to it back to the heap
delete sc1;
delete sc2;
delete sc3;
delete sc4;
delete x;
delete y;
delete a;
Vector Allocation
Just as in C, C++ can only allocate a vector of objects/data, not a multi-dimensional matrix of such data
You can create a multi-dimensional vector of fixed size, however
Allocation is similar to that of the single instance
Uses brackets ([]) to distinguish from single instance allocation
However, only the default constructor can be used when allocating an array of objects!
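A minimal sketch of array allocation with new[] and delete[], counting constructor and destructor calls. The Tracked class and allocateAndRelease() are hypothetical names introduced for this illustration.

```cpp
#include <cstddef>

class Tracked
{
public:
    static int sConstructed;
    static int sDestroyed;

    Tracked() { ++sConstructed; } // Used for every array element
    ~Tracked() { ++sDestroyed; }
};

int Tracked::sConstructed = 0;
int Tracked::sDestroyed = 0;

// Allocates and releases an array of "count" objects
void allocateAndRelease(const std::size_t count)
{
    Tracked *items = new Tracked[count]; // Default constructor runs "count" times
    delete[] items; // [] form: destructor runs "count" times, then memory is freed
}
```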
Vector Allocation and Deallocation
// Allocate multi-dimension vector of a fixed size
int someMatrix[ 4 ][ 4 ];
...
// Deallocate memory en masse. The calls to delete must match the calls
// to new (with the [] syntax)
delete[] sc1;
delete[] x;
delete[] y;
delete[] a;
...
Placement Operators
Multiple forms
Most common form
Instead of allocating memory from the heap, you can supply a memory address that you want an object to be allocated within
Examples: shared memory, stack memory, memory pools
The operator new() will simply place a new instance at the memory address, calling the constructor you provided
Cannot call delete on the pointer to an object allocated this way because you don't want the program to attempt to add that
memory back to the heap
Must explicitly invoke the destructor
Example: Allocating memory from the stack can be significantly faster than the heap for small operations
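The most common placement form, constructing an object at an address you supply, can be sketched with a stack buffer. Widget and constructOnStack() are hypothetical names, and the sketch assumes a C++11 compiler for alignas.

```cpp
#include <new> // Declares the standard placement form of operator new

class Widget
{
    int mId;
public:
    static int sAlive; // Number of instances currently constructed
    explicit Widget(int id) : mId(id) { ++sAlive; }
    ~Widget() { --sAlive; }
    int id() const { return mId; }
};

int Widget::sAlive = 0;

// Builds a Widget inside a stack buffer and returns its id
int constructOnStack(int id)
{
    // Raw storage on the stack, aligned suitably for a Widget
    alignas(Widget) unsigned char storage[sizeof(Widget)];

    Widget *widget = new (storage) Widget(id); // No heap allocation takes place
    const int result = widget->id();
    widget->~Widget(); // Explicit destructor call; never "delete widget"
    return result;
}
```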
class MemoryPool
{
...
public:
...
void *allocate(const size_t numBytes);
void deallocate(void *address);
...
};
...
MemoryPool pool;
SomeClass *sc = new( pool ) SomeClass();
...
sc->~SomeClass();
operator delete( sc, pool );
Class-based Operators
A class can define its own new and delete operator
By definition, they are static member functions (and, therefore, cannot be virtual) that will be used in lieu of any global new/delete
operator
Can overload them with additional parameters
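A class-specific new and delete might be sketched as follows. The Pooled class is hypothetical; it forwards to malloc() and free() and counts calls, which is an assumption chosen to keep the example small and is not a real memory pool.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

class Pooled
{
    int mValue;
public:
    static int sAllocations;
    static int sDeallocations;

    explicit Pooled(int value) : mValue(value) {}
    int value() const { return mValue; }

    // Implicitly static, even without the keyword
    void *operator new(std::size_t numBytes)
    {
        void *memory = std::malloc(numBytes);
        if (!memory)
            throw std::bad_alloc();
        ++sAllocations;
        return memory;
    }
    void operator delete(void *memory)
    {
        if (memory)
            ++sDeallocations;
        std::free(memory);
    }
};

int Pooled::sAllocations = 0;
int Pooled::sDeallocations = 0;
```

Only allocations of Pooled go through these operators; "new int" elsewhere in the program still uses the global operator.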
Class new and delete
class MyClass
{
...
public:
...
Exception Handling
There are actually other forms of new and delete that interact with the exception handling system of C++
There are some forms of new/delete that will throw an exception on an error condition
The forms that don't throw an exception will return 0 on an allocation that doesn't succeed
Fundamentally, they work the same as the operators above
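The two failure-reporting styles can be compared side by side. In this sketch the function names tryAllocate() and tryAllocateThrowing() are hypothetical; std::nothrow and std::bad_alloc are the standard library names from the header new.

```cpp
#include <cstddef>
#include <new>

// Uses the non-throwing form: a failed allocation yields a null pointer
bool tryAllocate(std::size_t numInts)
{
    int *p = new( std::nothrow ) int[ numInts ];
    if (p == 0)
    {
        return( false );
    }
    p[ 0 ] = 42;
    delete[] p;
    return( true );
}

// Uses the default form: a failed allocation throws std::bad_alloc
bool tryAllocateThrowing(std::size_t numInts)
{
    try
    {
        int *p = new int[ numInts ];
        delete[] p;
        return( true );
    }
    catch (const std::bad_alloc &)
    {
        return( false );
    }
}
```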
Out of Memory
In the event that there is not enough memory to fulfill a memory allocation request, the system will call a new handler
The purpose of the new handler is to try to free up memory so that a retry of the memory allocation request can be made
If the memory allocation request doesn't succeed after calling the new handler, the exception is thrown (or 0 is returned, depending on
the form of new used)
Programmers can set a specific new handler via a call to set_new_handler()
Note that attempting I/O in a new handler also requires a memory allocation, so generating error messages may be problematic
Setting the new handler
void outOfMemory()
{
// Try to free up any transient memory that I may not need anymore
...
}
...
std::set_new_handler( outOfMemory );
...
Summary
C++ builds upon C's raw memory management routines with object-specific routines
The syntax is more complicated, but it brings type safety and flexibility
Most people won't deviate from the standard new/delete operators
I/O
In C, two mechanisms were used for I/O
An integer file descriptor
A FILE pointer
A FILE pointer provided a higher level of abstraction, but both could be used
printf() and scanf() used FILE pointers
There were 3 I/O mechanisms defined by default in a C program
File descriptors
0: Used to retrieve input
1: Used for normal output
2: Wrote to a defined error mechanism
FILE pointers
stdin: Associated with file descriptor 0
stdout: Associated with file descriptor 1
stderr: Associated with file descriptor 2
Programmers had to include stdio.h for standard I/O APIs
C++ Streams
C++ uses I/O streams to perform input and output operations
std::cin: Input stream like stdin
std::cout: Output stream like stdout
std::cerr: Output (unbuffered) stream like stderr
std::clog: Output (buffered) stream for error messages like stderr
Streams redefine the >> and << operators for input and output, respectively
Streams are defined in iostream and reside in the namespace std
I/O Stream
#include <iostream>
// Other methods
ostream &put(const signed char c);
ostream &put(const unsigned char c);
ostream &write(const signed char *c, const size_t s);
ostream &write(const unsigned char *c, const size_t s);
Output streams return a reference so that you can chain things together
std::cout << "You entered " << pi << " as the value for PI" << std::endl;
Also chains because the return value is a reference to the stream after supplying data to the passed variable
std::cin Example
#include <iostream>
/*
* Chaining is possible for both input and output streams
*/
...
unsigned age, weight;
std::cout << "Enter your age and weight and hit <ENTER>: " << std::flush;
std::cin >> age >> weight;
...
Like std::ostream, the >> operator is not overloaded for single characters or character sequences
The user is responsible for providing enough memory to read into a string variable if using the >> operator on a character
sequence
When reading in data, data is assumed to be white-space delimited by default
As above, the istream is shown without inheritance, but only for the purpose of this discussion
Input Stream Methods
class istream
{
...
public:
...
// Input on numbers
istream &operator>>(short &s);
istream &operator>>(unsigned short &s);
istream &operator>>(int &i);
istream &operator>>(unsigned int &i);
istream &operator>>(long &l);
istream &operator>>(unsigned long &l);
istream &operator>>(float &f);
istream &operator>>(double &d);
istream &operator>>(long double &d);
// Input methods
istream &get(signed char &c);
istream &get(unsigned char &c);
istream &read(signed char *c, const size_t s);
istream &read(unsigned char *c, const size_t s);
/*
* The following code will result in an infinite loop if the user
* enters something other than an unsigned (like a character string)
* when prompted. When std::cin attempts to read an unsigned value, it
* will see that the character string does not match its anticipated value.
* The character will be left on the stream and an error bit will be set
* on the std::cin instance to indicate a problem reading. In the
* if-statement, the error will be cleared, and the code will loop.
* When the program gets to the point of reading in the age and weight,
* it will again see the character string and skip it. Thus the infinite
* loop.
*
* To solve the problem, the programmer must alter the code to use
* std::cin.ignore() to discard the offending input. Called with no arguments,
* it discards only one character at a time, but it's better than nothing.
*
* if (!std::cin)
* {
* tryAgain = true;
* std::cin.clear();
* std::cin.ignore();
* }
*/
Other APIs
In addition to input and output routines, streams provide other APIs to query and set the state of the stream
Boolean operators to test the validity of the stream
Precision APIs
Fill character APIs
others
Boolean Operator
#include <iostream>
...
/*
* Read data from the user while the input stream is valid. For example, if the
* user is sending data via a pipe to this program, we need to terminate when the
* end of the input is reached
*/
while (std::cin)
{
...
}
...
This page shows only a fraction of the APIs available for streams
See the following for more information
http://www.cplusplus.com/reference/ostream/
http://www.cplusplus.com/reference/istream/
Manipulators
Previous examples use std::endl
Causes a newline character ('\n') to be added to the stream, and forcibly flushes the stream
Just as in C, sending data for output doesn't necessarily cause it to be displayed immediately unless followed by a call to
fflush()
The stream flush() API will also force a flush
Can also use std::flush (below)
Other types of manipulators exist
std::noskipws: When used on an input stream, causes the stream to not skip over white space for all subsequent input
std::ws: After std::noskipws, will eat white space for that field (but does not change stream state)
std::skipws: Resets stream to default behavior of skipping white space when reading
std::showbase: Change state of the stream to show the numerical base prefix
std::scientific: Use scientific floating-point notation for floating-point values
std::hex: Output numbers in hexadecimal base
std::ends: When sending data to a string stream (below), adds a '\0' character to the stream
std::flush: Flushes the data sent to the stream
std::setw: Set the width of the field
May need to include iomanip (#include <iomanip>)
NOTE: Most manipulators that change the output format of a stream are permanent. The exception is std::setw(), which resets after
each output entity
I/O Manipulators
$ cat j2.cpp
#include <iostream>
#include <iomanip>
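A short sketch of the manipulators listed above, written against a string stream so the result can be inspected. The function name formatDemo() is hypothetical; it shows that std::setw() applies to one item only while std::hex and std::showbase persist.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

std::string formatDemo()
{
    std::ostringstream out;

    // std::setw applies only to the next item sent to the stream
    out << std::setw( 5 ) << 42 << '|' << 42 << '|';

    // std::hex and std::showbase stay in effect until changed
    out << std::hex << std::showbase << 255 << '|' << 16 << '|';

    // Restore decimal output, then use fixed notation with two decimals
    out << std::dec << std::noshowbase
        << std::fixed << std::setprecision( 2 ) << 3.14159;

    return( out.str() );
}
```

The returned string is "   42|42|0xff|0x10|3.14".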
Stream Direction
An input stream is of type std::istream
An output stream is of type std::ostream
A bidirectional stream is of type std::iostream
Inherits from both std::istream and std::ostream
A common element of both std::istream and std::ostream is shared via "virtual inheritance"
Only 1 copy of std::ios is stored in std::iostream, not 2
The class std::ios provides basic buffering, error detection, and other valuable functions that are common to both types of streams
Without virtual inheritance, the input and output side of the std::iostream wouldn't coordinate with one another for error detection,
buffering, or any of the other common functions
Virtual Inheritance
namespace std
{
class ostream : virtual public ios { ... };
class istream : virtual public ios { ... };
class iostream : public istream, public ostream { ... };
}
String Stream
$ cat j.cpp
#include <cassert>
#include <iostream>
#include <sstream>
Of note is that the str() API accepts a std::string as a parameter but is being passed a C character sequence
The std::string class has a constructor that accepts a C character sequence
The compiler automatically invokes this constructor to create a temporary object of type std::string that is then passed to ss
.str()
Upon return from the str() API, the temporary is destroyed
Highlights both the power of C++ and a potential issue
If the compiler can determine your intent via constructors, inheritance, or whatever, it will attempt to do so
The compiler can also choose the wrong implementation to do a conversion
Even when correct, the creation of a temporary variable can be a performance and/or memory hit
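The following sketch shows a string stream used for both output and input, including the implicit temporary std::string created when str() is handed a C character sequence. The function names describe() and parsePair() are hypothetical.

```cpp
#include <sstream>
#include <string>

// Builds a string from mixed types; the stream's buffer grows as needed
std::string describe(const std::string &name, unsigned count)
{
    std::ostringstream out;
    out << name << " has " << count << " items";
    return( out.str() );
}

// Parses two whitespace-delimited integers out of a C character sequence
bool parsePair(const char *text, int &first, int &second)
{
    std::stringstream ss;
    // A temporary std::string is constructed from "text" for this call
    ss.str( text );
    ss >> first >> second;
    return( !ss.fail() );
}
```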
File Streams
A file stream reads from and writes to files
Bidirectional file streams are also supported
An input file stream is std::ifstream
An output file stream is std::ofstream
A bidirectional file stream, std::fstream, inherits from both an input file stream and an output file stream
Opening a file stream
Pass the name in as a constructor parameter
Invoke the open() method
Closing a stream
Automatically occurs when the file stream goes out of scope
Can be done explicitly via the close() method
Mode bits can be passed to the open() API or constructor to define the way in which the stream is treated
Bits can be or'ed together to impact behavior
std::ios_base::ate The starting position for file operations is at the end of the file
std::ios_base::binary The file contains or will contain binary data, not text data
std::ios_base::ate positions the file "cursor" at the end of the file contents whereas std::ios_base::app appends to
the end of the file while the read cursor is at the beginning of the file
std::ifstream uses std::ios_base::in by default
std::ofstream uses std::ios_base::out by default
Equivalent C fopen() modes and stream mode bits
"w": std::ios_base::out, or std::ios_base::out | std::ios_base::trunc
Open a file for writing. If the file already exists, truncate it
"a": std::ios_base::out | std::ios_base::app
Create a new file or open an existing file. Any new content is placed at the end of any existing content
"r+": std::ios_base::in | std::ios_base::out
Open an existing file (the open fails if the file does not exist). Existing content can be read, and new content can be written and/or read
"w+": std::ios_base::in | std::ios_base::out | std::ios_base::trunc
Create a new file or open an existing file. Existing content will be discarded. New file content can be added and/or read
"a+": std::ios_base::in | std::ios_base::out | std::ios_base::app
Create a new file or open an existing file. New content will be written to the end of the file by default. Existing and new content can also be read
File Stream Examples
#include <fstream>
#include <iostream>
#include <ctime>
...
/*
* Open up a text file and read the contents
*/
/*
* Write a report, appending new contents to existing contents
*/
std::ofstream report;
report.open( "report.txt", std::ios_base::app ); // std::ios_base::out is implied for std::ofstream
if (!report)
{
std::cerr << "Error: Could not create \"report.txt\". Check file system permissions. Program terminating." << std::endl;
exit( 2 );
}
...
report.close();
...
/*
* Open a binary file. Go to the end of the file to read a file position value that
* indicates where a date has been stored into the file. Update that date value with
* a new value
*
* For reading, seekg() is used to move to a position. For writing, seekp() is used.
*/
/*
* Forward declare class and API
*/
class SomeClass;
std::ostream &operator<<(std::ostream &o, const SomeClass &sc);
/*
* Now define class and API
*/
class SomeClass
{
friend std::ostream &operator<<(std::ostream &o, const SomeClass &sc);
...
};
...
/*
* The API must return the modified stream so that chaining can occur
*/
return( o );
}
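A complete, minimal version of the friend-based output operator might look like the following. The class Point and the helper pointsToString() are hypothetical and exist only to make the pattern concrete.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class with private data that the friend operator may read
class Point
{
    int mX;
    int mY;
    friend std::ostream &operator<<(std::ostream &o, const Point &p);
public:
    Point(int x, int y) : mX( x ), mY( y ) {}
};

std::ostream &operator<<(std::ostream &o, const Point &p)
{
    o << '(' << p.mX << ", " << p.mY << ')';
    // Return the stream so that chaining can occur
    return( o );
}

// Chains two Point instances and a literal through the same stream
std::string pointsToString(const Point &a, const Point &b)
{
    std::ostringstream out;
    out << a << " -> " << b;
    return( out.str() );
}
```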
Using a Class API for Output
#include <iostream>
class SomeClass
{
...
public:
...
void write(std::ostream *s) const;
...
};
...
}
Thoughts
If you're adding an overloaded operator to a class that's already written, the only choice that you have is to write everything from
the external API instead of from within the class
Hopefully all required data for output is available via public API of the class
If the class can be modified, either approach will work
When using a friend, abstraction is broken unnecessarily since there is an alternative
You must remember to update the friend when the class changes
Conceptually easier to remember to update a member API that handles output
At the end of the day, it's a matter of taste
C: printf( "%5d %5d", 100, 200 );
C++: std::cout << std::setw( 5 ) << 100 << ' ' << std::setw( 5 ) << 200;
C: printf( "%9.2f", 100.0 );
C++: std::cout << std::fixed << std::showpoint << std::setw( 9 ) << std::setprecision( 2 ) << 100.0 << std::noshowpoint << std::setprecision( 0 );
Sometimes it's just easier to use C
...
char buffer[ 1024 ];
snprintf( buffer, sizeof( buffer ), ... );
std::cout << buffer;
...
Boost::format
#include <boost/format.hpp>
#include <iostream>
#include <sstream>
...
int month, day, year;
getDate( &month, &day, &year );
std::stringstream ss;
ss << format("%4i %4i %4i\n") % year % month % day;
...
Uses Unix98 Open Group printf() precise syntax, rather than the standard C printf()
Lots of extensibility beyond what's shown here
Overloads the % operator
The % operator has a higher precedence than the shift operator and is evaluated left to right
Thus, the above statement is the same as the following
ss << (((format("%4i %4i %4i\n") % year) % month) % day);
A class called "format" exists within Boost
In the code, "format("%4i %4i %4i\n")" creates a temporary, unnamed variable that has formatting information associated
with it
Each use of the % operator modifies the temporary variable, and the call returns a reference to the temporary variable for use in
chaining
When all % operator expressions have been evaluated, the temporary instance is sent to the stream
Means that the left-shift operator must be overloaded between an output stream and the format class
Conclusions
C++ I/O facilities are much more elaborate than C's
With operator overloading, you can send any type to a stream or read any type from a stream
It's not possible to do this with C because you can't modify the printf() types (i.e. "%d") to include new types
C++ makes it look like an entire class instance is sent to a stream, even though it decomposes into individual member variables sent, one
by one
Fundamentally, C provides all of the same basic functionality for I/O that C++ does
When dealing with file descriptors, you must deal with C and not C++
The ability to print to a self-expanding string (string stream) is a powerful feature not found in C
std::string
A Very Short Introduction to std::string
C doesn't provide a native string type
C++ provides a string type as part of the standard library: std::string
APIs are defined, but implementation details are left to the author of the class
Example: some implementations use reference counting to minimize copies while others don't
Memory is allocated and deallocated on the fly for a string as the size of the string changes
Strings are NOT null-terminated
Maintain a size parameter to indicate string length
must use c_str() to extract a null-terminated C string from the class
Provides obvious APIs for string handling
See http://www.cplusplus.com/reference/string/string/ for a good list of APIs available for a string
std::string Examples
std::string x( "Some" );
x += " string";
size_t firstS = x.find_first_of( 's' );
x.erase( 4, 1 ); // x = "Somestring"
...
/*
* While strings are powerful, they still aren't native. This will cause
* a compiler error because you can't add two types of "const char[2]"
*/
x = "a" + "b";
/*
* Because operator+() is defined for std::string, you can do this
*/
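A minimal sketch of concatenation that does compile, assuming only the standard std::string operators. The function name joinWords() is hypothetical.

```cpp
#include <string>

std::string joinWords()
{
    // At least one operand of each + must be a std::string
    std::string x = std::string( "a" ) + "b";

    // x is a std::string, so literals and characters can be appended
    x = x + "c" + 'd';

    // Adjacent string literals are concatenated by the compiler itself
    x += "e" "f";

    return( x );
}
```

The returned string is "abcdef".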
Nice properties
Dynamic expansion to grow as required
Automatically returns memory to the system when the string is cleared or when it goes out of scope
No explicit memory management required and no leaks
Can reserve bytes of memory so that each change to a string doesn't result in a call to new/delete
std::string::reserve()
Efficiency
Typically coded so that small strings don't allocate memory (if they can fit within the memory typically used to point to allocated
memory instead)
Sometimes reference counted so that assigning one std::string to another just makes them point to the same memory
instead of having 2 duplicate instances of the same memory
If reference counted, the data must be locked when working in a threaded environment (or at least atomic primitives
must be utilized to guarantee no issues)
Because they call new/delete under the covers, it can be a lot more expensive than what many people think
Memory allocation
Memory allocation for std::string typically follows the pattern used in the Standard Template Library (reviewed in a future class)
General rule (there are exceptions): when more memory is needed in order to store yet another byte, the existing allocation
is doubled
Without geometric growth such as doubling, the total cost of memory allocation and copies would be O(n^2)
Copy "abc"
Delete 3 bytes
Copy 'd'
Copy "abcd"
Delete 4 bytes
Copy 'e'
Copy "abcde"
Delete 5 bytes
Copy 'f'
Copy "abcdef"
Delete 6 bytes
Copy 'g'
Copy "abcdefg"
Delete 7 bytes
Copy 'h'
With linear memory growth, each addition of a character requires copying all prior characters into newly allocated space
Appending n characters one at a time results in 1 + 2 + 3 + … + (n-2) + (n-1) characters copied in total
This is O(n^2): appending 1,000,000 characters would result in roughly (1e6 * 1e6)/2 bytes copied
With geometric growth and a growth factor of 2, adding 1,000,000 characters 1 at a time results in significantly fewer bytes copied
in total
1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 + 1024 + … + 262144 + 524288 = 1048575
Java used a factor of 1.5 instead of 2 for its memory growth
STL started by using 2 but some implementations now use other factors (VC++ uses 1.5 since version 7)
Newer versions of std::string may also use a different growth factor
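The arithmetic above can be checked with a small simulation. The two functions below are hypothetical helpers that only count how many characters would be copied during reallocation. They do not measure a real std::string, whose growth policy is implementation-defined.

```cpp
#include <cstddef>

// Counts characters copied when capacity grows by exactly one each time
unsigned long long copiesWithLinearGrowth(std::size_t n)
{
    unsigned long long copies = 0;
    for (std::size_t size = 0; size < n; ++size)
    {
        // All existing characters move to the newly allocated block
        copies += size;
    }
    return( copies );
}

// Counts characters copied when capacity doubles whenever it is exhausted
unsigned long long copiesWithDoubling(std::size_t n)
{
    unsigned long long copies = 0;
    std::size_t capacity = 1;
    for (std::size_t size = 0; size < n; ++size)
    {
        if (size == capacity)
        {
            // Existing characters move once, then capacity doubles
            copies += size;
            capacity *= 2;
        }
    }
    return( copies );
}
```

For n = 1,000,000 the linear policy copies 499,999,500,000 characters while the doubling policy copies 1,048,575, matching the sum shown above.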
Casting
Operator overloading was presented in a prior class
In addition to overloading standard C operators, C++ programmers can overload casting operators
Casting to a std::string object can make it easy to view the contents of a class without having to overload the "<<" operator of a stream
class SomeObject
{
std::string mName;
unsigned mCount;
...
public:
SomeObject();
SomeObject(const SomeObject &other);
~SomeObject();
...
SomeObject x;
...
std::cout << static_cast<std::string>( x ) << std::flush;
...
Another approach would have been to create an operator for SomeObject and an output stream
Converting to a std::string is more generic in that a std::string can be used for other classes and APIs, too
Converting to a std::string is also slower and takes more memory since the entire representation must first be made in
memory for the instance prior to delivering it as a std::string
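A conversion operator to std::string can be sketched as follows. The class Counter is a hypothetical stand-in for SomeObject, and the "name=count" output format is an assumption chosen for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for SomeObject
class Counter
{
    std::string mName;
    unsigned mCount;
public:
    Counter(const std::string &name, unsigned count)
        : mName( name ), mCount( count ) {}

    // Conversion operator: no return type is written because the
    // target type is part of the operator's name
    operator std::string() const
    {
        std::ostringstream out;
        out << mName << '=' << mCount;
        return( out.str() );
    }
};
```

With this in place, static_cast<std::string>( c ) on a Counter instance c builds the whole representation in memory first, which is the cost noted above.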
Overloading operator<<() for a Stream
#include <iostream>
class SomeObject
{
std::string mName;
unsigned mCount;
...
public:
SomeObject();
SomeObject(const SomeObject &other);
~SomeObject();
...
SomeObject x;
...
std::cout << x << std::flush;
...
Overview
C++ templates are the mechanism used to create a single instance of code that can be used by more than one data type
Templates can be part of a class, class API, or a stand-alone function
Templates can accept a type or a value
NOTE: Templates are a very large and complex topic. A full discussion on templates is quite involved and beyond the scope of this class.
However, templates are presented in sufficient depth so that most common uses of templates can be understood and applied.
Multi-Parameter Templates
Templates can take multiple parameters
Parameters are separated by commas
Example: permute the values between two pointers
Transform Template
template<typename T, typename P> void permute(T &start, const T &end, const P &transform)
{
while (start != end)
{
transform( *start );
++start;
}
}
struct ToUpperStruct
{
void operator()(char &ch) const
{
ch = std::toupper( ch );
}
};
...
char *ch = getStringFromUser();
permute( ch, ch + strlen( ch ), ToUpperStruct() );
...
The permute() function is a template function that works on two different types
This syntax does not exclude the possibility that the two types could also be the same type
The first and second parameters should be of the same type (type T)
The third parameter (of type P) defines an instance that needs to have the operator()() defined for it
It iterates over all data, from start to end, and permutes the data by applying the 3rd parameter, transform
When invoking the permute() function, we pass a temporary instance of ToUpperStruct to the function
The struct ToUpperStruct doesn't have a default or copy constructor, so the compiler auto-generates one
The instance of ToUpperStruct is responsible for converting the string from lower-case to upper-case
In this example, the code locks and unlocks a mutex if operating in thread-safe mode
If not operating in thread-safe mode, lock and unlocking doesn't occur
Because the value is a compile-time constant, the compiler can evaluate the if-statement at compile time and determine what to do
The two if-statements will be removed entirely if not run in thread-safe mode
The if-statements won't be evaluated at all in thread-safe mode
The function will simply lock and unlock the mutex without having to do a test
The value is part of the instantiation
Can mix types and values in a template
In the last invocation of someOperation(), the template parameters and values did not need to be explicitly defined because the
defaults are used
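The points above about value parameters and defaults can be sketched with a hypothetical class. Accumulator and CountingMutex are assumed names. CountingMutex is not a real lock, it only counts calls so that the effect of the compile-time flag can be observed.

```cpp
// Hypothetical stand-in for a real mutex; it only counts lock() calls
struct CountingMutex
{
    unsigned mLocks;
    CountingMutex() : mLocks( 0 ) {}
    void lock() { ++mLocks; }
    void unlock() {}
};

// Mixes a type parameter and a value parameter, both with defaults
template<typename T = int, bool ThreadSafe = false>
class Accumulator
{
    T mTotal;
    CountingMutex mMutex;
public:
    Accumulator() : mTotal( T() ) {}

    void add(const T &value)
    {
        // ThreadSafe is a compile-time constant, so the compiler can
        // remove the untaken branch in each instantiation
        if (ThreadSafe) { mMutex.lock(); }
        mTotal += value;
        if (ThreadSafe) { mMutex.unlock(); }
    }

    T total() const { return( mTotal ); }
    unsigned lockCount() const { return( mMutex.mLocks ); }
};
```

Accumulator<> uses both defaults (int, not thread-safe), while Accumulator<double, true> supplies a type and a value explicitly.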
...
const char *response1 = getUserResponse();
const char *response2 = getUserResponse();
const char *least = lesser( response1, response2 );
...
int a = someValue();
int b = someValue();
int leastValue = lesser( a, b );
In this example, the standard less-than operator is used to calculate the lesser of two passed values
Works fine for mathematical values and classes for which operator<() is defined
An explicit version of lesser() is created for when the parameters are two "const char *" values
If the syntax of "char const * const &" is confusing, read it from right to left: a reference to a const pointer to a const char
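One way the generic lesser() and its explicit version for C strings could be written is shown below. This is a sketch under the assumption that the generic version takes and returns const references; the course's actual definition is not shown in this section.

```cpp
#include <cstring>

// Generic version: relies on operator<() being defined for T
template<typename T>
const T &lesser(const T &a, const T &b)
{
    return( b < a ? b : a );
}

// Explicit specialization for T = const char *. Comparing the pointers
// themselves would compare addresses, so compare the characters instead.
template<>
const char * const &lesser<const char *>(const char * const &a, const char * const &b)
{
    return( std::strcmp( b, a ) < 0 ? b : a );
}
```

With two "const char *" variables as arguments, T is deduced as "const char *" and the specialization is chosen. With two int values, the generic version is used.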
public:
SomeClass();
SomeClass(const SomeClass<T> &other);
~SomeClass();
...
void someVeryLargeAPI() const;
...
SomeClass &operator=(const SomeClass &other);
};
template<typename T>
inline SomeClass<T>::SomeClass()
{
...
}
template<typename T>
inline SomeClass<T>::SomeClass(const SomeClass &other)
{
...
}
template<typename T>
inline SomeClass<T>::~SomeClass()
{
...
}
...
template<typename T>
inline void SomeClass<T>::someVeryLargeAPI() const
{
...
}
...
template<typename T>
inline SomeClass<T> &SomeClass<T>::operator=(const SomeClass &other)
{
...
return( *this );
}
Template Definition in a .cpp File
/*
* This code shows the same template class as above (SomeClass), but
* all of the APIs are defined in a .cpp file. For this to work, all of
* the "inline" APIs in this .cpp file must be removed from the .h file.
*/
#include "SomeClass.h"
template<typename T>
SomeClass<T>::SomeClass()
{
...
}
template<typename T>
SomeClass<T>::SomeClass(const SomeClass &other)
{
...
}
template<typename T>
SomeClass<T>::~SomeClass()
{
...
}
...
template<typename T>
void SomeClass<T>::someVeryLargeAPI() const
{
...
}
...
template<typename T>
SomeClass<T> &SomeClass<T>::operator=(const SomeClass &other)
{
...
return( *this );
}
/*
* Note that all of the "inline" keywords have been removed. They are
* now going to be function calls instead. Furthermore, we now need
* to explicitly create all instantiations of this template class for
* every possible variation of "T" that the user can pass in that we'll
* consider legal.
*
* Note that any attempt by the user to use the class for a template
* parameter that we don't enumerate below will result in a link error
* (missing reference). The compiler would have done this for us
* automatically if everything was in the header file. By moving the
* code to the .cpp file, we now have to do it explicitly.
*
* The following syntax is used to explicitly create an instantiation
* of SomeClass for int, float, and double, without reserving any
* storage for an object. All member APIs are generated for them
* by the compiler.
*/
/*
* The following syntax explicitly instantiates just the constructor
* but not anything else
*/
/*
* It's possible to use the "extern" keyword to prevent automatic
* instantiation of a class or member API, too
*/
extern template class SomeOtherClass<bool>;
extern template SomeOtherClass<int>::SomeOtherClass();
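The explicit instantiation syntax described in the comments above can be sketched in a single file. The class Holder is a hypothetical stand-in for a template whose member definitions live in a .cpp file.

```cpp
// Hypothetical stand-in for the contents of a .h file
template<typename T>
class Holder
{
    T mData;
public:
    Holder();
    void set(const T &value);
    T get() const;
};

// Out-of-line definitions, as they would appear in the .cpp file
template<typename T>
Holder<T>::Holder() : mData( T() )
{
}

template<typename T>
void Holder<T>::set(const T &value)
{
    mData = value;
}

template<typename T>
T Holder<T>::get() const
{
    return( mData );
}

// Explicitly instantiate the whole class for each supported type.
// No object storage is reserved; all member APIs are generated.
template class Holder<int>;
template class Holder<double>;

// Explicitly instantiate just the constructor for another type
template Holder<char>::Holder();
```

In a real multi-file layout, code that uses Holder<int> or Holder<double> links against these instantiations, and any other type produces a missing-reference link error as described above.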
Differentiation
Template classes with different parameters are really two different classes
SomeClass<bool> and SomeClass<int> are different classes altogether
When doing a copy constructor, a class can normally directly access data members of another class of its type
This doesn't work across template classes unless a friend statement is explicitly used
Friends
template<typename T> class SomeClass
{
template<typename U> friend class SomeClass;
T mData;
...
public:
SomeClass();
SomeClass(const SomeClass &other);
template<typename U>
SomeClass(const SomeClass<U> &other);
~SomeClass();
...
};
...
template<typename T>
inline SomeClass<T>::SomeClass(const SomeClass &other)
{
/*
* This copy constructor is invoked when the passed type is equivalent
* to the type that we're creating
*/
mData = other.mData;
...
}
SomeClass<int> j;
SomeClass<int> k( j ); // Invokes first copy constructor
SomeClass<double> l( j ); // Invokes second copy constructor
Things to note
The template syntax had to be used in the friend statement
By using a different name for the typename, we're telling the compiler that the friend clause isn't specific to only
equivalent instances of SomeClass
We added another copy constructor
The second one takes instances of SomeClass that are not equivalent to the current instance
The assignment in the second copy constructor may fail because we may not be able to cast data of type "U" to type "T"
For example, converting a double to a std::string
We won't know until instantiation and compilation
Without the second copy constructor there's no way to copy across template classes of different type instantiations
You don't always need to use friend
If you can derive all of the required data from an API, you can skip the friend specification
The specification of the second copy constructor used 2 template specifications instead of 1
It's not legal to combine the two into "template<typename T, typename U>"
The first template specification applies to the class
The second could apply to a return value or to a parameter type, but not to the class
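The second copy constructor and its two template specifications can be sketched as follows. Box is a hypothetical class, and the static_cast is an assumption about how the conversion from U to T might be performed.

```cpp
// Hypothetical class mirroring the friend and constructor layout above
template<typename T>
class Box
{
    // Every Box<U> may access the private members of every Box<T>
    template<typename U> friend class Box;
    T mData;
public:
    explicit Box(const T &value) : mData( value ) {}
    Box(const Box &other) : mData( other.mData ) {}

    template<typename U>
    Box(const Box<U> &other);

    T get() const { return( mData ); }
};

// Two template specifications: the first applies to the class,
// the second applies to the constructor's parameter type
template<typename T>
template<typename U>
inline Box<T>::Box(const Box<U> &other)
    : mData( static_cast<T>( other.mData ) )
{
}
```

Copying a Box<int> into a Box<int> uses the first copy constructor. Constructing a Box<double> from a Box<int> uses the templated one, and fails to compile only if U cannot be converted to T.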
Example: Pixels
Consider a drawing package with an arbitrary bit depth for pixels
Want to support 8-bit and 16-bit pixels, but also want to convert between them seamlessly
Reading an 8-bit JPEG and putting it into a 16-bit canvas or vice versa
This is a code excerpt from a real program
Pixels
#include <cmath>
private:
void assign(const RGB &copy);
template<typename U>
void assign(const RGB<U> &copy);
void init();
public:
RGB();
RGB(const RGB &copy);
RGB(const T red, const T green, const T blue);
template<typename U>
RGB(const RGB<U> &copy);
~RGB();
T getBlue() const;
T getGreen() const;
T getRed() const;
/*
* Static APIs
*/
static T maxChannelValue();
};
/*****************************************************************************/
/* Private inline APIs below */
/*****************************************************************************/
template<typename T>
inline void RGB<T>::assign(const RGB &copy)
{
mRed = copy.mRed;
mBlue = copy.mBlue;
mGreen = copy.mGreen;
}
template<typename T>
inline void RGB<T>::init()
{
mRed = T();
mBlue = T();
mGreen = T();
}
/*****************************************************************************/
/* Public inline APIs below */
/*****************************************************************************/
template<typename T>
inline RGB<T>::RGB()
{
init();
}
template<typename T>
inline RGB<T>::RGB(const RGB &copy)
{
assign( copy );
}
template<typename T>
inline RGB<T>::~RGB()
{
init();
}
template<typename T>
T RGB<T>::maxChannelValue()
{
return( T( 0 ) - 1 );
}
template<typename T>
inline RGB<T> &RGB<T>::operator=(const RGB &copy)
{
if (&copy != this)
{
assign( copy );
}
return( *this );
}
Friends
In addition to the use of friend in copy constructors, normal friend rules apply
Classes can friend specific instances of a template class or all instances
A function can be a friend
The function may be a template function or a normal function
Friends
/*
* This example shows other ways in which "friends" can be defined
*/
class AnotherClass
{
template<typename T> friend class SomeClass; // Friend any type of SomeClass
friend class BogusClass<int>; // Only friend a specific type of BogusClass
...
};
/*
* The following friends also work
*/
void mySecondFriendAPI();
void myPrivateAPI();
...
public:
...
};
template<typename T>
void myFriendAPI(AThirdClass<T> *ptr)
{
...
ptr->myPrivateAPI();
}
void mySecondFriendAPI()
{
AThirdClass<int> atc;
atc.myPrivateAPI();
}
public:
EmbeddedClass();
EmbeddedClass(const EmbeddedClass &other);
~EmbeddedClass();
EmbeddedClass<int> m1;
EmbeddedClass<double> m2;
...
public:
SomeClass();
SomeClass(const SomeClass &other);
~SomeClass();
...
template<typename T>
inline SomeClass::EmbeddedClass<T>::EmbeddedClass()
{
...
}
template<typename T>
inline SomeClass::EmbeddedClass<T>::EmbeddedClass(const EmbeddedClass &other)
{
...
}
template<typename T>
inline SomeClass::EmbeddedClass<T>::~EmbeddedClass()
{
...
}
template<typename T>
inline SomeClass::EmbeddedClass<T> &SomeClass::EmbeddedClass<T>::operator=(const EmbeddedClass &other)
{
...
}
inline SomeClass::SomeClass()
{
...
}
inline SomeClass::~SomeClass()
{
...
}
EmbeddedClass<int> m1;
EmbeddedClass<double> m2;
...
public:
SomeClass();
SomeClass(const SomeClass &other);
~SomeClass();
...
template<typename U>
inline SomeClass<U>::SomeClass()
{
...
}
template<typename U>
inline SomeClass<U>::SomeClass(const SomeClass &other)
{
...
}
template<typename U>
inline SomeClass<U>::~SomeClass()
{
...
}
template<typename U>
inline SomeClass<U> &SomeClass<U>::operator=(const SomeClass &other)
{
...
return( *this );
}
private:
MyEnum mEnum;
public:
OtherClass();
OtherClass(const OtherClass &other);
~OtherClass();
...
MyEnum getEnum() const;
...
template<typename T>
inline OtherClass<T>::OtherClass()
{
...
}
template<typename T>
inline OtherClass<T>::OtherClass(const OtherClass &other)
{
...
}
template<typename T>
inline OtherClass<T>::~OtherClass()
{
...
}
template<typename T>
inline OtherClass<T> &OtherClass<T>::operator=(const OtherClass &other)
{
...
return( *this );
}
public:
SomeClass();
SomeClass(const SomeClass &other);
~SomeClass();
...
typename OtherClass<T>::MyEnum getDataEnum() const;
...
template<typename T>
inline SomeClass<T>::SomeClass()
{
...
}
template<typename T>
inline SomeClass<T>::SomeClass(const SomeClass &other)
{
...
}
template<typename T>
inline SomeClass<T>::~SomeClass()
{
...
}
template<typename T>
inline typename OtherClass<T>::MyEnum SomeClass<T>::getDataEnum() const
{
return( mData.getEnum() );
}
template<typename T>
inline SomeClass<T> &SomeClass<T>::operator=(const SomeClass &other)
{
...
return( *this );
}
Things to note
The typename qualifier is required when declaring the API getDataEnum()
The typename qualifier is also required in the definition of getDataEnum()
The typename keyword is required when referring to a qualified name that refers to a type and depends on a template
parameter
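A compact sketch of the typename rule, using hypothetical classes Inner and Outer in place of OtherClass and SomeClass. The size-based kind() logic is an assumption added only so that the result can be observed.

```cpp
// Hypothetical stand-in for OtherClass: declares a nested enum type
template<typename T>
class Inner
{
public:
    enum Kind { Small, Large };
    Kind kind() const { return( sizeof( T ) > 4 ? Large : Small ); }
};

// Hypothetical stand-in for SomeClass
template<typename T>
class Outer
{
    Inner<T> mData;
public:
    // Inner<T>::Kind depends on T, so "typename" is required to tell
    // the compiler that the qualified name refers to a type
    typename Inner<T>::Kind getKind() const;
};

// "typename" is required again in the out-of-class definition
template<typename T>
inline typename Inner<T>::Kind Outer<T>::getKind() const
{
    return( mData.kind() );
}
```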
private:
void assign(const SquareMatrix &other);
void init();
private:
/*
* This is a tricky way to speed up the process of copying data from
* one instance to another when the underlying type T is a known type
* (like float or int). Create specialty functions for known types with
* a generic function for everything else
*/
void copy(float *dest, const float *src) { memcpy( dest, src, mSize * sizeof( float ) ); }
void copy(int *dest, const int *src) { memcpy( dest, src, mSize * sizeof( int ) ); }
void copy(T *dest, const T *src)
{
for (unsigned i = 0; i < mSize; i++)
{
dest[ i ] = src[ i ];
}
}
/*
* See comment above for copy()
*/
void init(float *dest) { memset( dest, 0, mSize * sizeof( float ) ); }
void init(int *dest) { memset( dest, 0, mSize * sizeof( int ) ); }
void init(T *dest)
{
for (unsigned i = 0; i < mSize; i++)
{
dest[ i ] = T();
}
}
public:
SquareMatrix();
SquareMatrix(const SquareMatrix &copy);
~SquareMatrix();
...
/*
* To get away from LONG names, use #define statements to shorten things
*/
/****************************************************************************/
/* Private inline APIs below */
/****************************************************************************/
INLINE CLASS::SquareMatrix()
{
init();
}
INLINE CLASS::~SquareMatrix()
{
/*
* The reason for the re-initialization is so that any attempt to access
* the data after destroying the instance won't return seemingly valid data.
*/
init();
}
#undef INLINE
#undef CLASS
...
Nested Templates
A template class can be a parameter for another template class
If the closing ">" of a template is followed by the closing ">" of another template, there must be a space between the two
C++ confuses ">>" for the shift-right operator
C++11 removes this restriction
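As a minimal sketch of the spacing rule, the following uses a vector of vectors. The names Grid and makeGrid are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The space in "> >" is required before C++11; C++11 and later also accept ">>"
typedef std::vector<std::vector<int> > Grid;

// Build a rows x cols grid with every cell set to the given value
Grid makeGrid(std::size_t rows, std::size_t cols, int value)
{
    return( Grid( rows, std::vector<int>( cols, value ) ) );
}

void demoNestedTemplates()
{
    Grid grid = makeGrid( 2, 3, 7 );
    assert( grid.size() == 2 );
    assert( grid[ 1 ].size() == 3 );
    assert( grid[ 1 ][ 2 ] == 7 );
}
```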
Recursive Templates
Templates can be recursive
Example: binary search
Recursive Templates
template<typename T,unsigned S> struct TemplateSearch
{
inline unsigned operator()(const T *values, const T &wanted)
{
unsigned result;
if (wanted < values[ S / 2 ])
{
TemplateSearch<T,S/2> lower;
result = lower( values, wanted );
}
else
{
TemplateSearch<T,S-S/2> upper;
result = S / 2 + upper( values + S / 2, wanted );
}
return( result );
}
};
/*
* The code below is equivalent to the following code:
* unsigned bsearch6(values, wanted)
* {
* return( wanted < values[ 3 ] ?
* (wanted < values[ 1 ] ? 0 :
* (wanted < values[ 2 ] ? 1 : 2)) :
* (wanted < values[ 4 ] ? 3 :
* (wanted < values[ 5 ] ? 4 : 5)) );
* }
*/
TemplateSearch<float,6> search;
unsigned pos = search( array, wanted );
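The excerpt above does not show how the recursion ends. One way to stop it, shown here as an assumption rather than as the course's own code, is a partial specialization for a range of size 1. The sketch below repeats the primary template so that it is self-contained.

```cpp
#include <cassert>

template<typename T,unsigned S> struct TemplateSearch
{
    unsigned operator()(const T *values, const T &wanted) const
    {
        if (wanted < values[ S / 2 ])
        {
            return( TemplateSearch<T,S/2>()( values, wanted ) );
        }
        return( S / 2 + TemplateSearch<T,S-S/2>()( values + S / 2, wanted ) );
    }
};

// Assumed base case: a range of one element always resolves to offset 0
template<typename T> struct TemplateSearch<T,1>
{
    unsigned operator()(const T *, const T &) const
    {
        return( 0 );
    }
};

void demoTemplateSearch()
{
    const float array[ 6 ] = { 1.0f, 3.0f, 5.0f, 7.0f, 9.0f, 11.0f };
    TemplateSearch<float,6> search;
    assert( search( array, 7.0f ) == 3u );    // exact match
    assert( search( array, 0.0f ) == 0u );    // below the first element
    assert( search( array, 100.0f ) == 5u );  // above the last element
}
```

For S >= 2 both S/2 and S-S/2 are at least 1, so the size-1 specialization is the only base case this sketch needs.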
Fibonacci Numbers
$ cat j.cpp
#include <iostream>
/*
* Generic Fibonacci number calculation
*/
/*
* Special version of Fibonacci number for 0 and 1. This keeps the compiler
* from unrolling any further while also providing the correct values for
* both 0 and 1
*/
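The Fibonacci listing itself is not reproduced above. A minimal sketch of the technique the comments describe, a generic template plus specializations for 0 and 1, could look like the following. The struct name Fibonacci and its value member are assumptions.

```cpp
#include <cassert>

// Generic case: computed by the compiler from the two previous values
template<unsigned N> struct Fibonacci
{
    static const unsigned long value = Fibonacci<N - 1>::value + Fibonacci<N - 2>::value;
};

// Specializations for 0 and 1 end the recursion and supply the base values
template<> struct Fibonacci<0>
{
    static const unsigned long value = 0;
};

template<> struct Fibonacci<1>
{
    static const unsigned long value = 1;
};

unsigned long fibonacciOfTen()
{
    return( Fibonacci<10>::value );   // a compile-time constant
}

void demoFibonacci()
{
    assert( fibonacciOfTen() == 55ul );
}
```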
Inheritance
Just like any other class, templates can inherit from other classes
Base classes may be template classes or non-template classes
Template Specialization
template<typename T1, typename T2> class SomeObject { ... };
template<typename T> class SomeObject<T,T> { ... }; // Partial template specialization
template<> class SomeObject<int,float> { ... }; // Full template specialization
/*
* The following is an interesting template specialization because it specifically
* handles the case in which the template parameter used is a pointer. Even though the
* type has not been defined, it has been qualified as a pointer type. This is different
* from the 2nd version (above) which will now only deal with non-pointer types.
*/
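The pointer specialization that the comment refers to is not shown. The sketch below illustrates the idea with a hypothetical PairKind template whose which() function reports which version the compiler selected; none of these names come from the course code.

```cpp
#include <cassert>

// Primary template
template<typename T1, typename T2> struct PairKind
{
    static int which() { return( 0 ); }
};

// Partial specialization: both types are the same
template<typename T> struct PairKind<T,T>
{
    static int which() { return( 1 ); }
};

// Partial specialization: both types are the same pointer type
template<typename T> struct PairKind<T*,T*>
{
    static int which() { return( 2 ); }
};

// Full specialization
template<> struct PairKind<int,float>
{
    static int which() { return( 3 ); }
};

void demoSpecialization()
{
    assert( (PairKind<int,double>::which() == 0) );
    assert( (PairKind<char,char>::which() == 1) );
    assert( (PairKind<char*,char*>::which() == 2) );  // more specialized than <T,T>
    assert( (PairKind<int,float>::which() == 3) );
}
```

The extra parentheses inside assert() keep the comma in the template argument list from being read as a macro argument separator.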
Summary
Templates are extremely powerful
Both functions and classes can be templates
They eliminate the need for code duplication when the same operations are performed for different types
The compiler can optimize template usage for a given type or value
Template metaprogramming takes this to the extreme by letting the compiler precalculate an answer (or partial answer if the full
unroll is too difficult)
The compiler will generate unique code for each template instantiation based on the types used as template parameters
Like functions, template parameters can have defaults
Programmers can provide specific implementations for templates (template specialization) that are optimal for certain conditions
Next Section: Standard Libraries
To determine the specialization required for these two types, I created the following program.
The following steps can then be used to see what the template specialization should look like.
We now see what the specialization needs to look like, and can code it accordingly.
Final Specialization
template<typename T> const T lesser(const T &a, const T &b)
{
return( a < b ? a : b );
}
It can also be coded as follows (exchanging "char const *" for "const char *").
Final Specialization
template<typename T> const T lesser(const T &a, const T &b)
{
return( a < b ? a : b );
}
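The specialized version is not reproduced above. As a hedged sketch, a specialization of lesser() for C strings might be written as follows. Using std::strcmp for the comparison is an assumption about the intent; the generic template is repeated so the example is self-contained.

```cpp
#include <cassert>
#include <cstring>

template<typename T> const T lesser(const T &a, const T &b)
{
    return( a < b ? a : b );
}

// With T = const char *, "const T" becomes "const char *const" and
// "const T &" becomes "const char *const &"
template<> const char *const lesser<const char *>(const char *const &a, const char *const &b)
{
    return( std::strcmp( a, b ) < 0 ? a : b );   // compare text, not addresses
}

void demoLesser()
{
    const char *x = "pear";
    const char *y = "apple";
    assert( std::strcmp( lesser( x, y ), "apple" ) == 0 );
    assert( lesser( 3, 2 ) == 2 );   // generic version
}
```

The arguments are passed as const char * variables because a string literal passed directly would deduce T as an array type and miss the specialization.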
Standard Libraries
A Note About This Material
A discussion about C++ libraries could span days
Entire books have been written about different C++ class libraries or even subsections of a library
There is no way to provide a lot of detail about C++ class libraries within a 60-minute time frame
The goal of this page is multi-fold
To provide an overview of the wealth of material available to C++ programmers through the use of class libraries
To whet the reader's appetite for more information related to class libraries
To provide just a few examples of how some library components are used and why they are so powerful
C Library
A number of C language library header files have been replaced with C++ header files
The following is not an exhaustive list, but does list some of the most common header files that have been converted to support C++
C++ versions add "c" as a prefix and remove the ".h" suffix from their C counterparts
Many of the standard C APIs are replaced with C++ equivalents that take advantage of C++ templates and/or function overloading
Example: There are 3 differently named implementations of cos() in C, depending on the parameter type; C++ uses a single overloaded name
math.h                                   cmath
extern float cosf(float);                inline float cos(float __x) { return __builtin_cosf(__x); }
extern double cos(double);               inline double cos(double __x) { return __builtin_cos(__x); }
extern long double cosl(long double);    inline long double cos(long double __x) { return __builtin_cosl(__x); }
Whereas C required a unique name for APIs that accepted different parameter types, C++ can use the same generic name
When only the return type differs, C++ still requires a different name to distinguish between the APIs
New names are in the std namespace, not the global namespace the way that the C APIs are
Makes the code more readable
Potential side effect if people rely upon automatic casting
Passing a float to cos() in C would automatically convert the float to a double and return a double result
Passing a float to std::cos() in C++ deals only with floats, not doubles, because there's a float version of std::cos()
If using math.h in C++, and the global namespace, cos() in C++ is still the C cos() API
When compiling with C++, header files will also check the parameter types in a function invocation to validate the calling convention used
is correct
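The float/double side effect described above can be checked at compile time. This sketch assumes a C++11 compiler (for decltype and static_assert); the helper cosAsDouble is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// std::cos() is overloaded, so the result type follows the argument type
static_assert( std::is_same<decltype( std::cos( 1.0f ) ), float>::value,
               "float argument gives a float result" );
static_assert( std::is_same<decltype( std::cos( 1.0 ) ), double>::value,
               "double argument gives a double result" );
static_assert( std::is_same<decltype( std::cos( 1.0L ) ), long double>::value,
               "long double argument gives a long double result" );

// Hypothetical helper: widen explicitly when double precision is wanted
inline double cosAsDouble(float x)
{
    return( std::cos( static_cast<double>( x ) ) );
}

void demoCosOverloads()
{
    assert( cosAsDouble( 0.0f ) == 1.0 );
}
```

Code that relied on the C behavior of always computing in double needs an explicit conversion like the one in cosAsDouble().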
complex
new
The global versions of new and delete operators do not require this header to be included
Provides definitions for get_new_handler() and set_new_handler()
Should be included when catching exceptions specific to memory operations or when defining your own new and/or delete replacements
/*
* Set the new handler back again, then execute it. If it has an
* exception, catch the exception and rethrow it
*/
std::set_new_handler( handler );
try
{
handler();
}
catch (...)
{
/*
* NOTE: We do not provide an argument in the throw statement
* below. This rethrows the exception with the exact information
* that we received (as opposed to creating a new exception without
* the data).
*/
throw;
}
/*
* The memory handler succeeded. However, we don't know if it released
* enough memory or, for that matter, any memory at all. Allocate again.
* This time, throw an exception on a failure. There's not another
* chance at this.
*/
mem = std::malloc( sz );
if (!mem)
{
throw( std::bad_alloc() );
}
}
return( mem );
}
try
{
void *mem = ::operator new( sz );
return( mem );
}
catch (...)
{
}
return( NULL );
}
valarray
The valarray library can be used to perform mathematical operations on all values in an array, using a single command, instead of
having to iterate over the array, element by element
Also provides masking classes so that operations can be limited to those selected by the mask/slice
On SIMD machines (single-instruction, multi-data), valarray operations can be optimized
On SISD machines (single-instruction, single-data), the loop iterating over all elements is hidden
valarray
#include <valarray>
/*
* Create an array of 10 elements. Every operation performed on the array will be done
* to all 10 elements implicitly -- no need to iterate externally.
*/
...
typedef std::valarray<double> AnArray;
AnArray array1( 10 );
...
AnArray result = std::pow( 2.0, array1 ); // Returns 2**x for each value in array1
...
AnArray cosArray = std::cos( array1 ); // Returns cos(x) for each value in array1
...
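A complete, runnable variation of the valarray fragment, including the mask and slice selection mentioned in the bullets, is sketched below. The helper makeRamp and the sample values are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <valarray>

typedef std::valarray<double> AnArray;

// Fill an array with 0, 1, 2, ... count-1
AnArray makeRamp(std::size_t count)
{
    AnArray ramp( count );
    for (std::size_t i = 0; i < count; ++i)
    {
        ramp[ i ] = static_cast<double>( i );
    }
    return( ramp );
}

void demoValarray()
{
    AnArray ramp = makeRamp( 4 );                    // 0 1 2 3
    AnArray powers = std::pow( 2.0, ramp );          // 1 2 4 8
    assert( std::fabs( powers.sum() - 15.0 ) < 1e-9 );

    std::valarray<bool> mask = ( ramp > 1.0 );       // false false true true
    AnArray selected = ramp[ mask ];                 // 2 3
    assert( selected.size() == 2 );
    assert( selected.sum() == 5.0 );

    AnArray evens = ramp[ std::slice( 0, 2, 2 ) ];   // start 0, 2 elements, stride 2
    assert( evens[ 0 ] == 0.0 && evens[ 1 ] == 2.0 );
}
```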
tuple
A tuple is a collection of values, possibly of different types, held within a single class instance
Example: 2-D and 3-D coordinates, pixel values (RGB), name/birthday combinations, etc.
Elements of the tuple are not named
Access to elements is positional
Don't have to define a struct in advance to use tuple
tuple
#include <algorithm>
#include <string>
#include <tuple>
...
typedef std::tuple<std::string,unsigned> NameAgeTuple;
template<unsigned Position> struct CompareTuple
{
bool operator()(const NameAgeTuple &t1, const NameAgeTuple &t2) const
{
return( std::get<Position>( t1 ) < std::get<Position>( t2 ) );
}
};
...
NameAgeTuple children[ 3 ];
children[ 0 ] = std::make_tuple( "Nick", 23u );
children[ 1 ] = std::make_tuple( "Josh", 21u );
children[ 2 ] = std::make_tuple( "Brandon", 19u );
...
std::sort( children, children + 3, CompareTuple<0>() ); // Sort children based on name
...
std::sort( children, children + 3, CompareTuple<1>() ); // Sort children based on age
...
functional
functional
#include <algorithm>
#include <functional>
...
std::string names[ 3 ];
names[ 0 ] = "Nick";
names[ 1 ] = "Josh";
names[ 2 ] = "Brandon";
...
std::sort( names, names + 3 ); // By default, sort uses "<" operator
...
std::sort( names, names + 3, std::greater<std::string>() ); // Reverse sort
...
Containers
Data structures for the most common forms of containers found in programming
Utilizes a common set of APIs across all containers so that movement between containers is as easy as possible
Basic components
list
A doubly linked list that can contain generic elements
vector
A 1-D vector of elements that dynamically grows as elements are added to it
Elements are easily added and removed at one end of the container only (the back)
Elements can be added to or removed from the other end (the front), but doing so requires shifting all other elements in the vector
deque (double-ended queue)
Like a vector in that it's a 1-D collection of elements that dynamically grows (and shrinks) as elements are added (or
removed) to (from) it
Unlike a vector, elements can be added or removed from either end of the queue with equal ease
map
An associative array that pairs two values (key/value) to each other (like name & age)
Implemented as a red/black tree
Only 1 value per unique key
set
A collection of unique objects (no duplicates in the set)
Implemented as a red/black tree
multimap
Like a map, but allows multiple values for a given key
multiset
Allows duplicates in the set
unordered_xxxxx
"xxxxx" can be map, set, multimap, or multiset
Implemented as a hash table instead of a red/black tree
Where a red/black tree is ordered based on the value of the key, a hash is ordered based on the hash value
Some standard hashes are provided, but users may need to implement their own to use these variants
/*
* Here are some examples of how you can use std::set<T>
*/
/*
* Here's an example of a std::map<T1,T2>. It reads names and
* ages from some stream and then checks on contents
*/
/*
* Here's a quick example of a std::vector<T>
*/
/*
* And a std::deque<T> example reading and processing event times
*/
typedef std::deque<time_t> TimeDeque;
TimeDeque rodeoEvents;
while (rodeoData)
{
time_t bullRideLength;
rodeoData >> bullRideLength;
if (rodeoData)
{
rodeoEvents.push_back( bullRideLength );
}
}
...
while (rodeoEvents.size())
{
std::cout << "Bull riding time: " << rodeoEvents.front() << std::endl;
rodeoEvents.pop_front();
}
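The set, map, and vector examples referred to in the comments above are not reproduced. The following minimal sketch shows the behaviors listed in the container descriptions; the data values are made up.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

void demoContainers()
{
    // set: duplicates are ignored and elements are kept in key order
    std::set<int> unique;
    unique.insert( 3 );
    unique.insert( 1 );
    unique.insert( 3 );
    assert( unique.size() == 2 );
    assert( *unique.begin() == 1 );

    // map: only one value per unique key
    std::map<std::string,unsigned> ages;
    ages[ "Nick" ] = 23;
    ages[ "Josh" ] = 21;
    ages[ "Nick" ] = 24;                 // replaces the earlier value
    assert( ages.size() == 2 );
    assert( ages[ "Nick" ] == 24 );
    assert( ages.find( "Brandon" ) == ages.end() );

    // vector: cheap insertion and removal at the back
    std::vector<double> values;
    values.push_back( 1.5 );
    values.push_back( 2.5 );
    values.pop_back();
    assert( values.size() == 1 );
    assert( values.back() == 1.5 );
}
```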
An interesting thing to note about std::vector<T> and std::deque<T> is that pop_back() and pop_front() do not return a value; read the element with back() or front() before removing it
Also contains adapters that sit on top of these containers to provide added functionality
These adapters can utilize multiple container types in their implementation
stack: a last-in-first-out container
priority queue: the highest priority element is always at the top of the queue
queue: a first-in-first-out container
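A short sketch of the three adapters follows. The second template argument of std::stack shows how an adapter can be placed on top of a different underlying container; the values are arbitrary.

```cpp
#include <cassert>
#include <queue>
#include <stack>
#include <vector>

void demoAdapters()
{
    // stack: last in, first out (here backed by a vector instead of the default deque)
    std::stack<int, std::vector<int> > lifo;
    lifo.push( 1 );
    lifo.push( 2 );
    lifo.push( 3 );
    assert( lifo.top() == 3 );

    // queue: first in, first out
    std::queue<int> fifo;
    fifo.push( 1 );
    fifo.push( 2 );
    fifo.push( 3 );
    assert( fifo.front() == 1 );

    // priority_queue: the largest element (by operator<) is always on top
    std::priority_queue<int> byPriority;
    byPriority.push( 2 );
    byPriority.push( 9 );
    byPriority.push( 5 );
    assert( byPriority.top() == 9 );
    byPriority.pop();
    assert( byPriority.top() == 5 );
}
```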
The following table comes from C++ Programmer's Guide to the Standard Template Library by Mark Nelson
#include <map>
...
class SomeReallyBigClass
{
...
public:
...
bool operator<(const SomeReallyBigClass &other) const;
...
};
...
struct SRBCPointerCompare
{
bool operator()(const SomeReallyBigClass *a, const SomeReallyBigClass *b) const
{
return( *a < *b );
}
};
...
/*
* Because SomeReallyBigClass is so big, we don't want to create them on the fly
* when we insert elements into a map. This would waste CPU time and memory.
* Instead, we'll create a map that maps pointers to this class to a double-precision
* value. A pointer value will vary from run to run, and it doesn't guarantee that
* we won't have duplicate keys. Thus, we need to provide a comparison class
* that dereferences the pointer value when doing the comparison
*/
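The map declaration that the comment describes is not shown. A self-contained sketch of the same idea follows; BigThing stands in for SomeReallyBigClass, and all names here are hypothetical.

```cpp
#include <cassert>
#include <map>

// Small stand-in for a class that is expensive to copy
struct BigThing
{
    int mId;
    explicit BigThing(int id) : mId( id ) {}
    bool operator<(const BigThing &other) const { return( mId < other.mId ); }
};

// Compare what the pointers refer to, not the pointer values themselves
struct BigThingPointerCompare
{
    bool operator()(const BigThing *a, const BigThing *b) const
    {
        return( *a < *b );
    }
};

typedef std::map<const BigThing *, double, BigThingPointerCompare> BigThingMap;

void demoPointerKeyedMap()
{
    BigThing first( 1 ), second( 2 ), duplicateOfFirst( 1 );
    BigThingMap weights;
    weights[ &second ] = 2.5;
    weights[ &first ] = 1.5;
    weights[ &duplicateOfFirst ] = 9.0;    // equivalent key: updates the existing entry
    assert( weights.size() == 2 );
    assert( weights.begin()->first == &first );   // ordered by mId, not by address
    assert( weights[ &first ] == 9.0 );
}
```

The objects must outlive the map, since the map stores only their addresses.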
Comparison Classes
As shown above, comparison classes can be used to define map and set ordering
Comparison classes can also be used on algorithms, like std::sort()
While defining operator<() for a class provides a mechanism for ordering, it only defines a single ordering schema
If you have multiple orderings, you'll need comparison classes
Comparison Classes
#include <algorithm>
#include <deque>
#include <string>
/*
* In this example, we have a class that contains information about an employee.
* The operator<() is defined for the class to order employees based on employee
* number.
*/
class Employee
{
std::string mLastName;
std::string mFirstName;
unsigned mAge;
unsigned mEmployeeNumber;
...
public:
...
unsigned age() const
{
return( mAge );
}
const std::string &lastName() const
{
return( mLastName );
}
...
bool operator<(const Employee &other) const
{
return( mEmployeeNumber < other.mEmployeeNumber );
}
...
};
/*
* The following two classes let us sort employees based on last name and on age
*/
struct SortOnLastName
{
bool operator()(const Employee &a, const Employee &b) const
{
return( a.lastName() < b.lastName() );
}
};
struct SortOnAge
{
bool operator()(const Employee &a, const Employee &b) const
{
return( a.age() < b.age() );
}
};
...
/*
* Gather information about an employee and add it to the queue of all employees
*/
std::deque<Employee> employees;
Employee nextEmployee;
while (getNewEmployeeData( &nextEmployee ))
{
employees.push_back( nextEmployee );
}
...
/*
* Sort employees based on employee number
*/
std::sort( employees.begin(), employees.end() );
...
/*
* Sort employees based on last name using an unnamed temporary variable of
* type SortOnLastName
*/
std::sort( employees.begin(), employees.end(), SortOnLastName() );
...
/*
* Sort employees based on age
*/
std::sort( employees.begin(), employees.end(), SortOnAge() );
From an implementation perspective, a std::map<key,T> implements the red/black tree using a class called std::pair<key,T>
Conceptually std::pair<T1,T2> is just a tuple (not a C++ tuple class but simply a tuple)
A tuple is an ordered list of elements
A std::pair<T1,T2> is a tuple of size 2
Comparisons are made between keys when ordering in the red/black tree
A std::pair<T1,T2> is a struct with 2 variables: first and second
first is the value associated with the first type (T1)
second is the value associated with the second type (T2)
When accessing data through an iterator (i), one of two formats can be used (following the pointer convention)
(*i).first and (*i).second
i->first and i->second
In a map, key corresponds to T1 while the value for that key corresponds to T2
The first element of the std::pair<key,T> should never be changed because the ordering in the red/black tree will be
disrupted
Because the container interfaces have been abstracted, there is no way for the user of a container to get to the underlying mechanics
associated with the container
The only way to access or manipulate the container is through the publicly available APIs
Knowing that a map and set are red/black trees, for example, is now only useful in the sense that memory and performance
requirements can be estimated
There's no way to directly alter the underlying red/black tree
Also means that somebody could write a version based on another tree type or some other data structure, as long as
they maintained the behavior of the container
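The first/second access through an iterator described above can be sketched as follows. The function names and the name/age data are assumptions made for the illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string,unsigned> AgeMap;

// Sum all values; i->second is shorthand for (*i).second
unsigned totalAge(const AgeMap &ages)
{
    unsigned total = 0;
    for (AgeMap::const_iterator i = ages.begin(); i != ages.end(); ++i)
    {
        total += i->second;
    }
    return( total );
}

// Values may be changed through a non-const iterator; the key (first) is const
void addYear(AgeMap *ages)
{
    for (AgeMap::iterator i = ages->begin(); i != ages->end(); ++i)
    {
        (*i).second += 1;
    }
}

void demoPairAccess()
{
    AgeMap ages;
    ages[ "Nick" ] = 23;
    ages[ "Josh" ] = 21;
    assert( totalAge( ages ) == 44u );
    addYear( &ages );
    assert( totalAge( ages ) == 46u );
    assert( ages.begin()->first == "Josh" );   // keys are ordered
}
```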
Iterators
The STL authors recognized that they needed a standard mechanism that could access data in any generic container using the same
semantics
Such a mechanism had to be a separate class and not part of the container itself
In C, when iterating across a vector, you use an independent index variable, separate from the vector, to access data
Same concept had to apply in C++ with iterators
Used pointers as the guiding principle for iterators
In C++, a pointer is an iterator by definition
C Iteration
/*
* This illustrates how pointers can be used to iterate over elements.
* The standard form is to use an integer to iterate through contents
* for (unsigned i = 0; i < limit; i++)
* {
* double x = myValues[ i ];
* ...
* }
* The same thing can be done with pointers, as shown below
*/
...
const unsigned limit = 20;
double someVector[ limit ];
populateValues( someVector, limit ); // Fill someVector[] with values
...
double *end = someVector + limit;
double *begin = someVector;
for (double *i = begin; i != end; ++i)
{
double x = *i;
...
*i = someValue;
}
In the C example above, begin and end refer to the first element of the vector and to 1 past the last element of the vector
C++ uses the same concept and terminology for iterators
Vector with Iterators
/*
* Here's the equivalent of the code in the prior example. In this case,
* we'll use the STL vector for our container. We don't need to specify a
* size since the container will grow automatically to accommodate the
* contents we want to add to it.
*
* The code is a little verbose so that a line-to-line comparison can be
* made to the code above.
*/
#include <vector>
...
typedef std::vector<double> VecType;
VecType someVector;
populateValues( someVector );
...
VecType::iterator end = someVector.end();
VecType::iterator begin = someVector.begin();
for (VecType::iterator i = begin; i != end; ++i)
{
double x = *i;
...
*i = someValue;
}
In addition to supporting iteration in the other direction (rbegin() and rend()), containers provide both const and non-const versions of their iterators
A const iterator (named const_iterator) does not allow the container contents to be modified while a non-const iterator
(named iterator) does
Since the introduction of iterators in STL, iterators have propagated to other areas of the C++ language
Iterators have become more specialized depending on their use
Iterators on STL containers move bidirectionally while iterators on an input stream are unidirectional
Iterators are critical for use in standard algorithms (below) as well as when copying data from one container into another when the
containers are of different types
Copy From List to Vector
#include <vector>
#include <list>
/*
* Using iterators, it's very easy to copy from one container into another.
* The containers all supply constructors that take iterators as parameters.
* The container then copies the values from the iterators. Below is some
* pseudo-code to explain it
*
* class SomeContainer
* {
* ...
* public:
* template<typename Iterator>
* SomeContainer(Iterator first, Iterator last)
* {
* while (first != last)
* {
* push_back( *first );
* ++first; // Remember the discussion about post-increment
* }
* }
* ...
* };
*/
...
std::list<int> myData;
populateValues( &myData );
...
std::vector<int> vectorOfData( myData.begin(), myData.end() );
myData.clear();
...
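The const and reverse iterators mentioned earlier in this section can be sketched in the same style. The helper names sumOf and reversedCopy are hypothetical.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<double> VecType;

// A const_iterator can read elements but cannot modify them
double sumOf(const VecType &values)
{
    double total = 0.0;
    for (VecType::const_iterator i = values.begin(); i != values.end(); ++i)
    {
        total += *i;
    }
    return( total );
}

// rbegin()/rend() walk the container from back to front
VecType reversedCopy(const VecType &values)
{
    return( VecType( values.rbegin(), values.rend() ) );
}

void demoIterators()
{
    VecType values;
    values.push_back( 1.0 );
    values.push_back( 2.0 );
    values.push_back( 4.0 );
    assert( sumOf( values ) == 7.0 );

    VecType reversed = reversedCopy( values );
    assert( reversed.front() == 4.0 );
    assert( reversed.back() == 1.0 );
}
```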
Algorithms
Functors/Function Objects
namespace std
{
...
template<typename T> struct greater
{
bool operator()(const T &a, const T &b) const
{
return( a > b );
}
};
...
}
/*
* We can create an instance of the class or create a temporary
* object if we want to use it in a sort, for example, to do a
* reverse sort
*/
...
std::vector<int> myVector;
populateValues( &myVector );
std::sort( myVector.begin(), myVector.end(), std::greater<int>() );
...
perturb( &myVector );
std::greater<int> greaterCompare;
std::sort( myVector.begin(), myVector.end(), greaterCompare );
...
STL Allocators
Generic memory allocation is performed via global operators new and delete
Can also use class-specific operators
STL introduced a concept called an allocator
An allocator is a class that is designed to allocate memory and deallocate memory for a specific object type
Uses templates to define the type that it operates upon
Abstracts memory allocation
One of the fields found in STL containers is an allocator
Example: template<typename T, typename Compare=std::less<T>, typename Alloc=std::allocator<T> > class set;
By defining your own allocator and providing it as a parameter to std::set, you can define a new set without having to actually do all
of the underlying work
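To make the idea concrete, here is a minimal sketch of a custom allocator plugged into std::set. It is not the PUSH allocator shown next; CountingAllocator is a hypothetical class that only counts allocations, and it relies on the reduced allocator interface accepted by C++11 and later.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <set>

// Global counter so that rebound allocators (for the set's internal nodes) share it
static std::size_t gAllocationCount = 0;

template<typename T> struct CountingAllocator
{
    typedef T value_type;

    CountingAllocator() {}
    template<typename U> CountingAllocator(const CountingAllocator<U> &) {}

    T *allocate(std::size_t num)
    {
        ++gAllocationCount;
        return( static_cast<T *>( ::operator new( num * sizeof( T ) ) ) );
    }
    void deallocate(T *p, std::size_t)
    {
        ::operator delete( p );
    }
};

// All instances are interchangeable, so they always compare equal
template<typename T,typename U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) { return( true ); }
template<typename T,typename U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) { return( false ); }

typedef std::set<int, std::less<int>, CountingAllocator<int> > CountedSet;

void demoAllocator()
{
    const std::size_t before = gAllocationCount;
    CountedSet values;
    values.insert( 10 );
    values.insert( 20 );
    values.insert( 30 );
    assert( values.size() == 3 );
    assert( gAllocationCount - before >= 3 );   // at least one allocation per element
}
```

The set never allocates int objects directly; it rebinds the allocator to its internal node type, which is why the counter is global rather than per type.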
Example: PUSH stack-backed allocator and set working on that allocator
enum
{
ALIGNMENT=8u,
LSB=(ALIGNMENT-1),
MASK=(~0u-LSB),
T_BYTES=((sizeof(T)+LSB)&MASK),
BYTES=(NumElements*T_BYTES)
};
/*
* Variable dictionary
* o mBuffer: The space that the allocator allocates memory from.
* When the instance is on the stack, the memory comes from the
* stack
* o mNextAllocation: Points to the next spot at which memory should
* be allocated from
* o mNumAllocations: The number of allocations performed
* o mBytesAllocated: The number of bytes allocated so far
*/
private:
/*
* These inline functions provide a very simple, elegant way of
* specifying that the atomic types in C++ don't need a destructor.
* The last variant specifies that, for types that don't have a
* specialized destruct() API, the destructor of the type should be
* utilized.
*
* These APIs are used in the FixedAllocator<T> class below (see
* FixedAllocator<T>::destroy()).
*/
void destruct(char*) {}
void destruct(unsigned char*) {}
void destruct(short*) {}
void destruct(unsigned short*) {}
void destruct(int*) {}
void destruct(unsigned*) {}
void destruct(long*) {}
void destruct(unsigned long*) {}
void destruct(long long*) {}
void destruct(unsigned long long*) {}
void destruct(float*) {}
void destruct(double*) {}
void destruct(long double*) {}
template<typename U> void destruct(U *u) { u->~U(); }
private:
/*
* API to help in the initialization of a class instance
*/
void init();
public:
/*
* Standard types required for use in STL containers
*/
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef T *pointer;
typedef const T *const_pointer;
typedef T &reference;
typedef const T &const_reference;
typedef T value_type;
/*
* Convert a FixedAllocator<T> into FixedAllocator<U>
*/
template<typename U> struct rebind
{
typedef FixedAllocator<U,NumElements> other;
};
public:
/*
* Constructors just need to set up the pointer for the next
* allocation to the allocated buffer. Destructors don't need
* to do anything.
*/
FixedAllocator() throw();
FixedAllocator(const FixedAllocator &copy) throw();
~FixedAllocator() throw();
/*
* Template-based allocation that needs to be defined in this class
*/
template<typename U>
FixedAllocator(const FixedAllocator<U,NumElements> &other) throw()
{
init();
}
/*
* The following APIs are "boiler plate" APIs and don't change with
* the implementation of allocate() and deallocate() (below)
*/
pointer address(reference x) const;
const_pointer address(const_reference x) const;
void construct(pointer p, const T &val);
void destroy(pointer p);
size_type max_size() const throw();
/*
* The 2 APIs below are responsible for (1) acquiring memory that
* can be used to construct elements of type T within and (2)
* releasing the acquired memory. In this case, allocation just
* moves the pointer for memory and deallocation does nothing
*/
pointer allocate(size_type num, const void *hint=0);
void deallocate(pointer p, size_type num);
/**
* Reset the allocator. All allocations previously made via the
* allocator are invalid, and all memory is available for reuse.
*/
void reset();
};
/***********************************************************************/
/* Private inline APIs below */
/***********************************************************************/
#undef CLASS
#undef INLINE
#undef TEMPLATE
/***********************************************************************/
/* Required global functions */
/***********************************************************************/
#include <list>
#include <map>
#include <set>
#include <vector>
...
#include "FixedAllocator.h"
...
#endif
Using the new set
#include "FixedSTL.h"
...
/*
* Create a set that holds 20 elements at most. Elements are allocated
* from the stack, so allocation and deallocation is ***FAST***.
*/
Boost
From boost.org: "Boost provides free peer-reviewed portable C++ source libraries."
Amazing collection of both simple and complex class libraries that can be used to extend C++
Have already seen boost::format in the I/O class notes
All of the code is in the boost namespace, so there's no potential for name collision
A lot of the Boost concepts were integrated into the C++11 standard that just came out
However, Boost continues to evolve and add even more content
The documentation provides a comprehensive list of all of the class libraries and their functionality
Boost Variant Example
/*
* This example shows one of the more interesting Boost libraries: Boost Variant.
* Since C++ unions could not hold classes with non-trivial constructors before
* C++11, Boost added the Variant library to
* support them.
*/
#include <iostream>
#include <map>
#include <string>
#include "boost/variant.hpp"
/*
* In this example, we want to gather meta-data about pictures that we've shot
* with our camera. Meta-data can be a text string, an integer, or a floating-point
* number, depending on the data type. The data type is represented as a string.
* We want to map this string to an arbitrary value of string, integer, or floating-
* point value.
*/
...
PhotoAttributes jpegPhoto;
jpegPhoto[ "iso" ] = 200;
jpegPhoto[ "flash" ] = 0;
jpegPhoto[ "fstop" ] = 2.8;
jpegPhoto[ "camera" ] = "Canon 5D Mark III";
...
for (PhotoAttributes::const_iterator i = jpegPhoto.begin(), end = jpegPhoto.end(); i != end; ++i)
{
std::cout << i->first << ": " << i->second << '\n';
}
...
void someAPI()
{
/*
* This API collects information and processes it. If, during the processing,
* an error is detected, it returns.
*
* The API uses a deque<T> for memory management. Returning from this API will
* invoke the destructor for the deque<T>. The deque<T> will invoke the
* destructor automatically for all elements in the deque<T>. No explicit
* memory management is required because it's handled by the deque<T> for us,
* so there's no chance of a memory leak.
*/
...
Data d;
DataDeque dataCollection;
while (creatingData( &d ))
{
dataCollection.push_back( d );
}
...
if (someException())
{
return;
}
...
if (anotherException())
{
return;
}
...
}
Boost also provides smart pointer classes (which are now part of C++11)
Shared pointers use reference counting and share data between the pointers
Scoped pointers are more limited
Do not allow transfer of ownership
Automatically destroy the instance when the scoped pointer goes out of scope
Each has 2 variants: one for a single instance and one for a vector of instances
With automatic destruction, the potential for memory leaks is removed
Additional overhead associated with shared_ptr and shared_array since additional memory is allocated to hold the reference counter
Boost shared_ptr & shared_array
#include <boost/shared_ptr.hpp>
#include <boost/shared_array.hpp>
...
/*
* Creation of a single instance or a vector of instances simply uses the appropriate
* new operator. The returned pointer is passed as a parameter to the constructor of
* the shared_ptr or shared_array instance.
*/
/*
* When memory in one of the smart pointers is no longer accessible by any instance,
* the instance pointed to is automatically destroyed using the appropriate delete
* operator (vector or single instance).
*/
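The reference-counting behavior described in the comments can be sketched with std::shared_ptr, the C++11 counterpart of boost::shared_ptr. It is used here as an assumption so that the example builds without Boost; Tracked is a hypothetical class that counts live instances.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class that tracks how many instances are alive
struct Tracked
{
    static int sLive;
    Tracked() { ++sLive; }
    ~Tracked() { --sLive; }
};
int Tracked::sLive = 0;

void demoSharedPtr()
{
    {
        std::shared_ptr<Tracked> first( new Tracked );
        assert( first.use_count() == 1 );
        {
            std::shared_ptr<Tracked> second( first );   // shares the same instance
            assert( first.use_count() == 2 );
            assert( Tracked::sLive == 1 );
        }
        assert( first.use_count() == 1 );   // second went out of scope
    }
    assert( Tracked::sLive == 0 );          // last owner gone: instance destroyed
}
```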
The C++ standard library also provides template classes to avoid memory leaks
The class std::auto_ptr<T> acts as a smart pointer for a single instance of type T
Replaced in C++11 with std::unique_ptr<T>
The std::unique_ptr<T> also supports vectors of objects, unlike std::auto_ptr<T>
Unlike the Boost smart pointers, only one std::auto_ptr<T> can point to a single instance at any given
time
C++11 also supports fixed-size arrays with std::array<T,N>
Like a std::vector<T> but cannot grow or shrink
C++ auto_ptr<T>
#include <memory>
/*
* Only one instance of an auto_ptr can point to the same allocated object at
* any given time. In the following code, the instance "x" will be created and
* initialized with the address of a new instance of class SomeObject. When
* the instance "y" is created, it takes over ownership of the memory pointed
* to by "x" and leaves "x" as NULL.
*
* When "y" goes out of scope or is reset, the underlying memory is automatically
* deleted as a single instance.
*/
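The code for the comment above is not reproduced. Because std::auto_ptr was removed in C++17, the sketch below shows the same ownership transfer with std::unique_ptr, where the transfer must be requested explicitly with std::move. SomeObject is given a made-up member for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct SomeObject
{
    int mValue;
    explicit SomeObject(int value) : mValue( value ) {}
};

void demoOwnershipTransfer()
{
    std::unique_ptr<SomeObject> x( new SomeObject( 5 ) );
    std::unique_ptr<SomeObject> y( std::move( x ) );   // y takes over ownership

    assert( !x );                     // x no longer points at anything
    assert( y && y->mValue == 5 );

    y.reset();                        // deletes the single instance
    assert( !y );
}
```

With std::auto_ptr the same transfer happened silently on copy, which is one reason it was replaced.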
Importance of std::swap()
The std::swap() API simply swaps two values between each other
std::swap()
template<typename T> void std::swap(T &a, T &b)
{
T other( a );
a = b;
b = other;
}
Consider an API that collects a lot of data and, if the data ends up being valid, it is kept; if not valid, it's destroyed
Importance of std::swap
#include <vector>
class Data;
typedef std::vector<Data> DataVector;
...
void collectData(DataVector *validData)
{
/*
* This API is responsible for collecting vast amounts of data. At the end,
* if the data is deemed "good", it's saved to the caller; otherwise the
* data is discarded. Data is originally collected in sampleData.
*/
DataVector sampleData;
...
/*
* Check the validity of the data and save if valid
*/
if (dataIsValid( sampleData ))
{
*validData = sampleData;
}
}
Consider the last statement (*validData = sampleData). The following operations are performed:
The content of validData is discarded
The contents of sampleData are replicated within validData
Upon exit, sampleData is destroyed
Even if we kept pointers to data in the vector instead of the actual data, we could still end up doing a lot of work to make the copy
Another approach is to use swap()
std::vector.swap()
...
if (dataIsValid( sampleData ))
{
validData->swap( sampleData );
}
}
In this code, the internal contents of the two vectors are swapped between each other using std::swap()
No duplicate is ever created, saving runtime
swap() is an API found in a lot of library classes
Can be used to clear contents, too
DataVector().swap( myData );
Creates temporary instance of no name
Swaps internals of temporary instance with the internals of myData
Destroys temporary instance
Can use it in your own class for efficiency
class SomeObject
{
std::string mName;
unsigned mAge;
...
public:
...
void swap(SomeObject &other);
...
};
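One possible implementation of such a swap() member, together with the clearing idiom listed above, is sketched below. The class is named Person to avoid implying this is the course's SomeObject; the constructor and accessors are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class Person
{
    std::string mName;
    unsigned mAge;
public:
    Person(const std::string &name, unsigned age) : mName( name ), mAge( age ) {}
    const std::string &name() const { return( mName ); }
    unsigned age() const { return( mAge ); }

    // Swap member by member; the string exchanges its internals without copying text
    void swap(Person &other)
    {
        mName.swap( other.mName );
        std::swap( mAge, other.mAge );
    }
};

// Clearing idiom: swap with an unnamed temporary, which is then destroyed
void discardContents(std::vector<int> *data)
{
    std::vector<int>().swap( *data );
}

void demoSwap()
{
    Person a( "Nick", 23 );
    Person b( "Josh", 21 );
    a.swap( b );
    assert( a.name() == "Josh" && a.age() == 21 );
    assert( b.name() == "Nick" && b.age() == 23 );

    std::vector<int> data( 1000, 7 );
    discardContents( &data );
    assert( data.empty() );
}
```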
Summary
The C++ language has a large set of high-quality standard libraries available for use
Libraries often already exist to do complex tasks that you might otherwise struggle with
Utilizing these libraries provides a significant productivity boost for the programmer, allowing them to focus on the true problem as
opposed to implementation of support code
class SomeClass
{
/*
* An API used by the thread pool takes a single argument, of
* type void*, that cannot be a class member API (since the "this"
* pointer is not recognized). We have to put all arguments for
* threading into a separate structure that is used by the
* threaded API to actually do work.
*/
struct ThreadedData
{
...
};
struct ThreadedArguments
{
SomeClass *mThis;
ThreadedData *mData;
};
/*
* Variable dictionary:
* o mThreadPool: A first-in-first-out thread pool used to
* distribute jobs to various threads for parallel execution
*/
PUSH::Threading::FIFOThreadPool mThreadPool;
...
private:
/*
* The API threadedEntryPoint is the API that is called by the
* thread pool. It, in turn, invokes the API threadedAPI() that is
* part of the class
*/
private:
/*
* Don't allow the copy constructor or assignment operator.
* What would you do with respect to threads if this was
* allowed?
*/
public:
SomeClass();
~SomeClass();
...
void process();
...
};
//////////////////////////////////////////////////////////////////////////////
// Private inline APIs below //
//////////////////////////////////////////////////////////////////////////////
...
}
//////////////////////////////////////////////////////////////////////////////
// Public inline APIs below //
//////////////////////////////////////////////////////////////////////////////
inline SomeClass::SomeClass()
{
/*
* Initialize the class instance. This includes starting the thread pool
* with as many threads as there are cores on this computer.
*/
...
mThreadPool.start( mThreadPool.numCores() );
...
}
inline SomeClass::~SomeClass()
{
/*
* Wait for all queued jobs to finish in the thread pool. Then
* clean up all dynamic memory, close files, etc.
*/
mThreadPool.waitAllJobs();
...
}
inline void SomeClass::process()
{
/*
* This routine processes data for the class by spawning off threads to
* do the work in parallel
*/
...
ThreadedData *data = new ThreadedData;
...
ThreadedArguments *args = new ThreadedArguments;
args->mThis = this;
args->mData = data;
mThreadPool.add( threadedEntryPoint, args, false );
...
}
...
}
Traits
Important Note
Like templates, traits are an extremely large topic
There's no way to cover all of the concepts behind a trait in a simple class
Readers are encouraged to study traits to see how their code might benefit
This section simply provides a very gentle and simple introduction to the concept
Traits Overview
A "trait" is simply a C++ class, typically empty of member APIs, that can be used to make compile-time decisions
Takes advantage of 2 C++ compiler mechanisms
Unused code elimination
Compile-time template expansion
switch (Count)
{
case 10: internalCalculations( 10 );
case 9: internalCalculations( 9 );
case 8: internalCalculations( 8 );
case 7: internalCalculations( 7 );
case 6: internalCalculations( 6 );
case 5: internalCalculations( 5 );
case 4: internalCalculations( 4 );
case 3: internalCalculations( 3 );
case 2: internalCalculations( 2 );
case 1: internalCalculations( 1 );
case 0: internalCalculations( 0 ); break;
default: doLongCalculations(); break;
}
if (isParallel)
{
pthread_mutex_unlock( &mLock );
}
}
/*
* For an instance declared in the following manner:
* SomeObject<false,3> someInstance;
* We end up with a version of calculate() that looks like the following.
* Note that the 2 if-statements have been removed, and the switch is
* eliminated
*
* template<> inline void SomeObject<false,3>::calculate()
* {
* internalCalculations( 3 );
* internalCalculations( 2 );
* internalCalculations( 1 );
* internalCalculations( 0 );
* }
*/
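A compact, compilable variant of the same idea is sketched below. CalcSketch, mCalls, and mLockOps are hypothetical names used only here, and the counter stands in for the pthread lock and unlock calls. Whether the dead branches are physically removed depends on the compiler and optimization level, but the observable behavior is the same either way.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the class discussed above.
template <bool isParallel, unsigned Count>
class CalcSketch
{
    std::vector<unsigned> mCalls;   // records each internalCalculations() argument
    unsigned              mLockOps; // counts simulated lock/unlock operations

    void internalCalculations(unsigned n) { mCalls.push_back(n); }

public:
    CalcSketch() : mLockOps(0) {}

    void calculate()
    {
        // isParallel is a compile-time constant, so this test can be
        // resolved when the template is instantiated.
        if (isParallel) { ++mLockOps; }      // stands in for pthread_mutex_lock()
        switch (Count)                       // Count is also a compile-time constant
        {
            case 3: internalCalculations(3); // fall through
            case 2: internalCalculations(2); // fall through
            case 1: internalCalculations(1); // fall through
            case 0: internalCalculations(0); break;
            default: break;
        }
        if (isParallel) { ++mLockOps; }      // stands in for pthread_mutex_unlock()
    }

    const std::vector<unsigned> &calls() const { return mCalls; }
    unsigned lockOps() const { return mLockOps; }
};
```

Instantiating CalcSketch<false,3> yields a calculate() that performs four calculations and never touches the lock counter, mirroring the hand-expanded version in the comment above.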
Now consider that we want to qualify types based upon whether or not they can be assigned using memcpy() or if each instance must be
iterated over, using the operator=() API
We can define the following (as Boost does)
NOTE: In this particular example on the Boost web site, they are always dealing with pointers. The code shows one API that
uses iterators and another that uses pointers. This can't be done generically as shown, but this is just an example program to
explain the power of traits in as simple an example as possible
Boost has_trivial_assign
/*
* First define a template that's used in the generic case. By default, we have
* to assume that a generic type T cannot be trivially assigned using memcpy().
*/
/*
* Now we can use template specialization and partial template specialization
* to define specific things that we know can be trivially assigned.
*
* This is NOT an exhaustive list!
*/
/*
* If your own class can be trivially assigned, you can add your class' traits
* by expanding the template set
*/
/*
* The has_trivial_assign<> trait can now be used to optimize copies through
* the following mechanism (taken directly from the page
* http://www.boost.org/doc/libs/1_55_0/libs/type_traits/doc/html/boost_typetraits/examples/fill.html)
*/
//
// fill
// same as std::fill, but uses memset where appropriate
//
namespace detail{
template <typename I, typename T, bool b>
void do_fill(I first, I last, const T& val, const boost::integral_constant<bool, b>&)
{
while(first != last)
{
*first = val;
++first;
}
}
In the API fill(), the type truth_type will evaluate to either true_type or false_type
It will evaluate to true_type if has_trivial_assign<T>::value is true for type T AND the size of T is 1 byte
It evaluates to false_type under all other conditions
These evaluations are done at compile time
The fill() API next calls the API detail::do_fill()
The determination of which API is invoked is made based on the third parameter type
If the third parameter type is true_type, the second do_fill() API is invoked because it specifically calls out the third
parameter via partial template specialization
If the third parameter type is not true_type, the first do_fill() API is invoked
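The dispatch described above can be reproduced in a self-contained sketch. To avoid a Boost dependency, this version assumes C++11 and uses std::integral_constant, std::true_type, and std::false_type in place of the Boost equivalents. The local has_trivial_assign template is a stand-in for the Boost trait, and the FillPath return value is a hypothetical addition that reports which overload ran.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

namespace sketch {

// Generic case: assume T cannot be trivially assigned.
template <typename T> struct has_trivial_assign : std::false_type {};
// Specializations for types known to be trivially assignable (not exhaustive).
template <> struct has_trivial_assign<char> : std::true_type {};
template <> struct has_trivial_assign<unsigned char> : std::true_type {};
template <> struct has_trivial_assign<int> : std::true_type {};

enum FillPath { kLoop, kMemset };  // hypothetical: reports the chosen overload

// First overload: element-by-element assignment, valid for any iterator.
template <typename I, typename T, bool b>
FillPath do_fill(I first, I last, const T &val,
                 const std::integral_constant<bool, b> &)
{
    while (first != last)
    {
        *first = val;
        ++first;
    }
    return kLoop;
}

// Second overload: only viable for pointers when the tag is true_type.
template <typename T>
FillPath do_fill(T *first, T *last, const T &val, const std::true_type &)
{
    std::memset(first, val, static_cast<std::size_t>(last - first));
    return kMemset;
}

template <typename I, typename T>
FillPath fill(I first, I last, const T &val)
{
    // Evaluated at compile time: true only for trivially assignable,
    // single-byte types.
    typedef std::integral_constant<bool,
        has_trivial_assign<T>::value && (sizeof(T) == 1)> truth_type;
    return do_fill(first, last, val, truth_type());
}

} // namespace sketch
```

Filling a char buffer selects the memset() overload, while filling an int buffer falls back to the loop because sizeof(int) is not 1, even though int is marked as trivially assignable.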
Summary
Traits are a large and sometimes complex body of code
This section just skimmed the surface
Readers are encouraged to review traits on their own to see how to best optimize their own code with traits
Traits utilize C++ templates and partial template specialization in order to determine things at compile time instead of runtime
Significant savings in runtime and/or memory are possible
Even when these savings aren't present, traits provide additional type safety to guarantee that an operation cannot be performed
incorrectly upon an object instance
Iterators
Note: The format for this page is different from other pages in the class. This page is an adaptation of an article to be published on the Software
Reuse Portal of the Quality Initiative.
Introduction
Iterators are a fundamental part of many C++ class libraries, including STL and Boost. They provide a clean, consistent mechanism to examine,
and potentially modify, data within a data structure. What makes them so powerful is that the iterator concept is somewhat independent of the
underlying data structure that they are applied against, and that iterator APIs have been standardized since their introduction. Furthermore, many
generic containers and classes, like those in STL, provide iterators, allowing a certain degree of interoperability between very diverse data
structures.
Consider the following example: a class needs to store resistors and capacitors in an RC network, but these elements need to be available for
query to the user of the class. The following (partial) class definition might exist:
RC Network Class
class Capacitor { ... };
class Resistor { ... };
class RCNetwork
{
typedef std::vector<Capacitor> CapacitorContainer;
typedef std::vector<Resistor> ResistorContainer;
CapacitorContainer mCapacitors;
ResistorContainer mResistors;
...
public:
typedef CapacitorContainer::const_iterator CapacitorIterator;
typedef ResistorContainer::const_iterator ResistorIterator;
public:
...
CapacitorIterator beginCaps() const { return( mCapacitors.begin() ); }
CapacitorIterator endCaps() const { return( mCapacitors.end() ); }
ResistorIterator beginResistors() const { return( mResistors.begin() ); }
ResistorIterator endResistors() const { return( mResistors.end() ); }
...
};
...
RCNetwork network;
populateNetwork( &network );
...
double totalCap = 0.0;
RCNetwork::CapacitorIterator c = network.beginCaps();
RCNetwork::CapacitorIterator endC = network.endCaps();
while (c != endC)
{
totalCap += c->getValue();
++c; // Note: Pre-increment, not post-increment
}
...
double totalRes = 0.0;
RCNetwork::ResistorIterator r = network.beginResistors();
RCNetwork::ResistorIterator endR = network.endResistors();
while (r != endR)
{
totalRes += r->getValue();
++r;
}
...
To access the capacitors, the user simply uses the beginCaps() and endCaps() APIs
To access the resistors, the user simply uses beginResistors() and endResistors()
Furthermore, the author of the class can simply change the single typedef statement to utilize a deque or list instead of a vector for
the containers, without having to require changes in the user's code.
Iterator Hierarchy
As stated above, iterators are somewhat independent of the underlying data structure. In actuality, a hierarchy of iterators exists based on the
capabilities of the underlying data structure. For example, a singly-linked list can easily be iterated over in a forward direction (from the
beginning of the list to the end of the list), but not backwards. A vector, however, can be accessed randomly, without regard to the order in which its
contents are queried. A good description of the different iterator categories can be found at http://www.cplusplus.com/reference/iterator. Fundamentally, there
are 4 levels of hierarchy, and 5 different categories of iterators:
Level 1
An input iterator performs single-pass input. This type of iterator cannot change the underlying data in the data structure
An output iterator performs single-pass output. This type of iterator can alter the contents of the data structure it iterates over
Level 2
The forward iterator can sequentially iterate over a data structure or class in a single direction
Level 3
The bidirectional iterator can sequentially iterate over a data structure or class in both directions: forwards and backwards
Level 4
The random access iterator can access data in the underlying data structure or class in a non-sequential fashion
Each successive level of iterator is a superset of the functionality of the level below it. Thus, a random access iterator builds upon the
functionality of the bidirectional iterator. The APIs of the random access iterator are also a superset of the bidirectional iterator. The
following table from http://www.cplusplus.com/reference/iterator explains the capabilities of each iterator type. In the table, a and b are
iterators, X is the class that the iterator is applied against, and m is a member or API of X.
Property                                                             Valid expressions
default-constructible                                                X a;  X()
Supports inequality comparisons (<, >, <= and >=) between iterators  a < b;  a > b;  a <= b;  a >= b
The sort() API can be used to sort elements in a vector, deque, string, or even a range delimited by C pointers.
Sorting
std::vector<int> intVector;
std::deque<float> floatDeque;
std::string someString;
char *cString;
...
std::sort( intVector.begin(), intVector.end() );
std::sort( floatDeque.begin(), floatDeque.end() );
std::sort( someString.begin(), someString.end() );
std::sort( cString, cString + strlen( cString ) );
The last sort (above) might be somewhat surprising since a pointer is not an iterator class. However, by looking at the table above, it's
interesting to note that all of the actions associated with iterators are performed using standard C operators (++, --, *, ->, etc). All of these
operations are also supported by pointers. Thus, while a C pointer isn't an iterator class, it is, in fact, a random access iterator for all intents and
purposes.
Sorting is just one such use of an iterator. Many of the algorithms in the <algorithm> header file utilize iterators to perform their operations
(swap() doesn't, for example).
The value returned by end() should never be operated upon as it does not refer to a valid element within the container.
Some containers also provide rbegin() and rend() iterators. These iterators move in the reverse direction of the forward iterator. The value
returned by rbegin() references the last element in the container while rend() returns an iterator that references the element just before the
first (conceptually) in the container. Incrementing the iterator moves backwards while decrementing moves forwards through the container.
Note that the iterator to move backwards is named reverse_iterator and not iterator
Reverse Iteration
SomeContainer::reverse_iterator i = someContainerInstance.rbegin();
SomeContainer::reverse_iterator end = someContainerInstance.rend();
while (i != end)
{
... // Do something against *i or i->m
++i;
}
The following picture shows the relationship between begin(), end(), rbegin(), and rend() as well as the direction taken when the iterator is
incremented
iterator vs const_iterator
Iterators come in both const and non-const forms. In their const form, the iterator cannot be used to change the data that it references. In
other words, a const_iterator provides a read-only view of the underlying data; in this respect it behaves like an input iterator. In their non-const
forms, iterators can change the underlying data within a data structure; in this respect they also behave like output iterators. Reverse iterators also have a
const and non-const form: reverse_iterator is the non-const form while const_reverse_iterator is the const form.
In this regard, they continue to behave as a pointer (which they were modeled after)
const pointers cannot modify the contents that they point to while non-const pointers can
The type of iterator returned by a data structure is a function of the specific API invoked. When dealing with a const data structure or class
instance, a const-form iterator will be returned. When dealing with a non-const data structure or class instance, the non-const iterator will be
returned. While an iterator instance can be assigned to a const_iterator instance, the opposite cannot occur.
You can always obtain a const_iterator from a class instance, whether that instance is const or not
You can only obtain a const_iterator from a const class instance. Without casting the instance to its non-const form, you cannot
get a non-const iterator from a const instance
const vs non-const Iterators
typedef std::vector<int> IntVector;
IntVector x;
const IntVector *y = &x;
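The rules above can be checked with a short sketch built on the same IntVector typedef. The helper names sumThroughConst() and doubleAll() are hypothetical, and the static_assert checks assume a C++11 compiler.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

typedef std::vector<int> IntVector;

// Through a pointer to const, begin()/end() return const_iterator only.
inline int sumThroughConst(const IntVector *y)
{
    int total = 0;
    for (IntVector::const_iterator i = y->begin(), last = y->end(); i != last; ++i)
    {
        total += *i;    // reading is fine; "*i = 0" would not compile here
    }
    return total;
}

// On a non-const instance, begin()/end() return the modifiable iterator.
inline void doubleAll(IntVector &x)
{
    for (IntVector::iterator i = x.begin(), last = x.end(); i != last; ++i)
    {
        *i *= 2;
    }
}

// iterator converts to const_iterator, but not the other way around.
static_assert(std::is_convertible<IntVector::iterator,
                                  IntVector::const_iterator>::value,
              "iterator should convert to const_iterator");
static_assert(!std::is_convertible<IntVector::const_iterator,
                                   IntVector::iterator>::value,
              "const_iterator should not convert to iterator");
```

Assigning x.begin() to a const_iterator variable, as the first static_assert implies, is the usual way to get a read-only view of a non-const container.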
Looking at the APIs within a container, you often see the two different APIs provided for each iterator type.
Class Iterators
class SomeClass
{
...
public:
...
iterator begin();
const_iterator begin() const;
iterator end();
const_iterator end() const;
...
};
...
for (unsigned i = 0; i < 1000; i++)
{
...
}
...
for (unsigned j = 0; j < 2000; ++j)
{
...
}
In the above loops, it doesn't matter which form of increment is performed on variables i and j. They are equivalent in terms of both functionality
and performance. Iterators, however, do not share this trait. In general, pre-increment and pre-decrement operations are preferred for iterators
over post-increment and post-decrement operations due to performance considerations. This has to do with the way that prefix and postfix
operations are defined.
In a prefix operation, the operation is applied against the variable prior to the value being returned. In a postfix operation, the original value is
returned prior to the increment or decrement. Below is an example implementation of the two approaches within an iterator.
Example Iterator Implementation for Pre-/Post-Increment
class SomeClassIterator
{
SomeClass *mOwner;
unsigned mPosition;
private:
friend SomeClass;
public:
SomeClassIterator();
SomeClassIterator(const SomeClassIterator &copy);
...
SomeClassIterator &operator++();
SomeClassIterator operator++(int);
SomeClassIterator &operator=(const SomeClassIterator &copy);
...
};
/*
* Pre-increment and pre-decrement operators are parameterless.
*/
inline SomeClassIterator &SomeClassIterator::operator++()
{
// Prefix increment: update in place and return this instance
++mPosition;
return( *this );
}
/*
* In order to create a post-increment or post-decrement operator, C++
* requires that the operator is specified with an unused integer parameter.
* The parameter itself is meaningless. This is only used as a convention
* so that the compiler can distinguish the pre-increment/pre-decrement
* operator from the post-increment/post-decrement operator.
*/
inline SomeClassIterator SomeClassIterator::operator++(int)
{
// Postfix increment
SomeClassIterator returnIterator( *this );
mPosition++;
return( returnIterator );
}
The pre-increment operator simply increments an internal variable and returns a reference to the existing instance. However, the post-increment
operator must make a copy of the current instance prior to changing the value. The copy is then returned on the stack. In a for-loop, for example,
the only way to guarantee optimal performance is to explicitly use the prefix forms of increment and decrement. If the postfix form is utilized, it is
left to the compiler to optimize away unnecessary operations due to the presence of the temporary variable.
Another potential performance impact comes from invoking the end() API too often. For example, the code below may end up invoking end()
every loop iteration (depending on how good the compiler is at optimization).
A much better approach is to assign the end iterator prior to the loop or in the initialization portion of the for-loop.
An Optimized for-loop
for (SomeClass::iterator i = someInstance.begin(), last = someInstance.end(); i !=
last; ++i)
{
...
}
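The difference between the two loop styles can be made visible with a container whose end() counts how often it is called. CountingContainer, naiveLoop(), and hoistedLoop() are hypothetical names for this sketch; because end() has a side effect here, the compiler cannot optimize the repeated calls away.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container wrapper that counts calls to end().
class CountingContainer
{
    std::vector<int>  mData;
    mutable unsigned  mEndCalls;
public:
    typedef std::vector<int>::const_iterator const_iterator;

    explicit CountingContainer(unsigned n) : mData(n, 1), mEndCalls(0) {}

    const_iterator begin() const { return mData.begin(); }
    const_iterator end() const { ++mEndCalls; return mData.end(); }
    unsigned endCalls() const { return mEndCalls; }
};

// end() appears in the loop condition, so it runs on every test.
inline unsigned naiveLoop(const CountingContainer &c)
{
    const unsigned before = c.endCalls();
    int total = 0;
    for (CountingContainer::const_iterator i = c.begin(); i != c.end(); ++i)
    {
        total += *i;
    }
    (void)total;
    return c.endCalls() - before;
}

// end() is captured once in the initialization portion of the for-loop.
inline unsigned hoistedLoop(const CountingContainer &c)
{
    const unsigned before = c.endCalls();
    int total = 0;
    for (CountingContainer::const_iterator i = c.begin(), last = c.end(); i != last; ++i)
    {
        total += *i;
    }
    (void)total;
    return c.endCalls() - before;
}
```

For a container of five elements the first loop calls end() six times (once per condition test), while the second calls it once. Hoisting is only safe when the loop body does not invalidate the end iterator.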
class MyString
{
char *mContent;
unsigned mSize;
public:
MyString();
MyString(const MyString &copy);
~MyString();
...
MyString &operator=(const MyString &other);
};
For this type of class, a random access iterator makes sense since a user would probably want to be able to access any character in the string
in a non-sequential manner. By convention, the iterator must provide a standard set of APIs. Furthermore, it should inherit from std::iterator,
using the random_access_iterator_tag value (there are various "tag" values for the different types of iterators; see
http://www.cplusplus.com/reference/iterator/RandomAccessIterator). For a description of std::iterator, see http://www.cplusplus.com/reference/iterator/iterator.
MyString *mOwner;
unsigned mPosition;
/*
* Create a shorthand type for the base class. We're going to use this type
* over and over again in this class
*/
private:
/*
* We want MyString to be a friend of this class because it needs to be able
* to call a special constructor that we don't want to make public. This
* constructor is used when calling begin(), end(), and other APIs that
* return iterators
*/
friend MyString;
public:
/*
* Expected types. Typically defined in std::iterator, but required if you
* don't inherit from that class. In other words, if you don't inherit from
* std::iterator, you must define these. When inheriting from std::iterator,
* they're already defined for you within std::iterator
*/
MyStringIterator();
MyStringIterator(const MyStringIterator &copy);
~MyStringIterator();
MyStringIterator &operator++();
MyStringIterator operator++(int);
MyStringIterator &operator--();
MyStringIterator operator--(int);
MyStringIterator operator+(const difference_type n) const;
MyStringIterator &operator+=(const difference_type n);
MyStringIterator operator-(const difference_type n) const;
MyStringIterator &operator-=(const difference_type n);
/*
* Define binary operators. Note that some, but not all of these could also have
* been done within the class
*/
The definition for MyStringIterator can exist within the class MyString or outside of it.
public:
typedef MyStringIterator iterator;
...
};
---------- OR ----------
class MyString;
class MyString
{
public:
typedef MyStringIterator iterator;
...
};
The example above covers the case of a non-const iterator, but not a const iterator. It can be expanded upon to include both const and
non-const with additional code. Even more complex code is required in order to be able to assign a non-const iterator to a const iterator, but not
the opposite direction. Writing an iterator this way, in fact, requires more effort than writing the container that the iterator operates upon.
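For orientation, a much reduced, non-const-only version of such a hand-written iterator is sketched below. It is not the course's implementation: it stores the characters in a std::vector, makes the positional constructor public instead of using a friend declaration, and defines the five expected member types directly rather than inheriting from std::iterator (which is deprecated as of C++17). The at() accessor is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <vector>

class MyString;

class MyStringIterator
{
    MyString      *mOwner;
    std::ptrdiff_t mPosition;
public:
    // Member types expected of every iterator
    typedef std::random_access_iterator_tag iterator_category;
    typedef char           value_type;
    typedef std::ptrdiff_t difference_type;
    typedef char          *pointer;
    typedef char          &reference;

    MyStringIterator() : mOwner(0), mPosition(0) {}
    MyStringIterator(MyString *owner, difference_type pos) : mOwner(owner), mPosition(pos) {}

    reference operator*() const;    // defined after MyString
    reference operator[](difference_type n) const { return *(*this + n); }

    MyStringIterator &operator++() { ++mPosition; return *this; }
    MyStringIterator operator++(int) { MyStringIterator old(*this); ++mPosition; return old; }
    MyStringIterator &operator--() { --mPosition; return *this; }
    MyStringIterator operator--(int) { MyStringIterator old(*this); --mPosition; return old; }
    MyStringIterator &operator+=(difference_type n) { mPosition += n; return *this; }
    MyStringIterator &operator-=(difference_type n) { mPosition -= n; return *this; }
    MyStringIterator operator+(difference_type n) const { return MyStringIterator(mOwner, mPosition + n); }
    MyStringIterator operator-(difference_type n) const { return MyStringIterator(mOwner, mPosition - n); }
    difference_type operator-(const MyStringIterator &o) const { return mPosition - o.mPosition; }

    bool operator==(const MyStringIterator &o) const { return mPosition == o.mPosition; }
    bool operator!=(const MyStringIterator &o) const { return mPosition != o.mPosition; }
    bool operator<(const MyStringIterator &o) const { return mPosition < o.mPosition; }
    bool operator>(const MyStringIterator &o) const { return mPosition > o.mPosition; }
    bool operator<=(const MyStringIterator &o) const { return mPosition <= o.mPosition; }
    bool operator>=(const MyStringIterator &o) const { return mPosition >= o.mPosition; }
};

// "n + iterator" form, also expected of a random access iterator
inline MyStringIterator operator+(MyStringIterator::difference_type n, const MyStringIterator &i)
{
    return i + n;
}

class MyString
{
    std::vector<char> mContent;     // simplification: the notes use char* and a size
public:
    typedef MyStringIterator iterator;

    explicit MyString(const char *text) : mContent(text, text + std::strlen(text)) {}

    iterator begin() { return iterator(this, 0); }
    iterator end() { return iterator(this, static_cast<std::ptrdiff_t>(mContent.size())); }
    char &at(std::ptrdiff_t i) { return mContent[static_cast<std::size_t>(i)]; }
};

inline MyStringIterator::reference MyStringIterator::operator*() const
{
    return mOwner->at(mPosition);
}
```

With these operators in place the class works with standard algorithms, for example std::sort( s.begin(), s.end() ). Even this reduced form needs well over a dozen operators, which illustrates the effort noted above.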
class MyString
{
/*
* The following structs are used to help create both iterator and
* const_iterator from the same base class. The generic ValueTypeOf
* class is used for a non-const instance of MyString and sets
* value_type to be "char". However, the const-specific version of
* ValueTypeOf is used for a "const MyString", and it sets value_type
* to be a "const char" instead of "char".
*
* For this simple class, we can actually replace the string
* "typename T::value_type" with "char". However, this illustrates
* the use of the class for generic containers that have value_type
* defined (which most do or should).
*/
private:
/*
* Define a base iterator that will be used with the Boost
* iterator_facade package to create a real iterator of both types:
* iterator and const_iterator
* The iterator_facade takes up to 5 parameters. These are:
* o The derived type. This is the name of the iterator itself.
* In this case, the Iterator is defined as "Iterator<Q>"
* because it's a template.
* o The second parameter is the value. To get both iterator
* and const_iterator to work, we're going to use the
* ValueTypeOf, defined above, to define the value. For a
* const_iterator, this is "const char". For an iterator,
* this becomes simply "char"
* o The third parameter is the tag. This is a random access
* iterator
* o The fourth parameter defines a reference. The default is
* the second parameter as a reference, so we could have
* skipped entering it
* o The fifth parameter is the difference between two
* iterators. It defaults to std::ptrdiff_t, so we could have
* skipped entering it
*
* Note that this looks somewhat like the iterator defined earlier (above).
* It contains the owner pointer and a position. However, most of the
* similarity ends there
*/
Q *mString;
unsigned mPosition;
/*
* In order to use iterator_facade, we **MUST** make
* boost::iterator_core_access a friend of this class. That
* way, the APIs required to manipulate the iterator can
* be invoked, even though they're private.
*/
private:
/*
* These APIs are used by the Boost iterator_facade to create
* the actions for standard iterator APIs. For a random access
* iterator, we need to define these six APIs. A forward
* iterator or bidirectional iterator doesn't need to define
* all six (for example, a forward iterator can't decrement).
*
* The iterator_facade class creates all of the APIs
* required for an iterator and uses the APIs below to implement
* the core iterator operators that we all know and love
*/
public:
/*
* Standard constructors and destructor
*/
Iterator();
Iterator(Q *string, const unsigned position=0);
Iterator(const Iterator &copy);
~Iterator();
/*
* The constructor below is used when creating a
* const_iterator from an iterator. Note that it won't
* work the other way since the compiler will error
* when it tries to assign values between a passed
* const_iterator to the iterator instance
*/
/*
* APIs used to get to the contents of the iterator. Required
* in order to convert from iterator to const_iterator since the
* two are different classes and can't access each other's
* data. Look at the implementation below to see how these APIs
* are used to avoid a compiler conflict.
*/
/*
* Comparison and assignment operators. In this form, they allow
* the iterator to be assigned to a const_iterator. They also
* allow comparisons between const_iterator and iterator without
* generating a compiler error
*/
/*
* We're done defining the iterator. At this point, we're back to
* defining content for the class MyString
*/
public:
/*
* The typedef for value_type is required so that iterators can be
* implemented. The ValueTypeOf struct references this type
*/
/*
* Define standard names for the iterators
*/
private:
/*
* Variable dictionary:
* o mContent --> The characters in the string, not NULL terminated
* o mSize --> The number of valid characters in the string. Also equal
* to the amount of memory allocated for the string
*/
char *mContent;
unsigned mSize;
public:
...
iterator begin() { return( iterator( this, 0 ) ); }
const_iterator begin() const { return( const_iterator( this, 0 ) ); }
iterator end() { return( iterator( this, mSize ) ); }
const_iterator end() const { return( const_iterator( this, mSize ) ); }
...
};
/*
* Below is the implementation for each API in the iterator
*/
mPosition = other.position();
mString = other.string();
}
if (amount >= 0)
{
mPosition += unsigned( amount );
}
else
{
const unsigned amt = unsigned( -amount );
mPosition = amt > mPosition ? 0 : (mPosition - amt);
}
}
mPosition--;
}
mPosition++;
}
/*
* Used to assign and compare const_iterator and iterator, as appropriate
*/
mString = copy.string();
mPosition = copy.position();
return( *this );
}
The iterator APIs can be further enhanced with error checking to ensure that there are no problems. The implementation shown above is long
enough as it is!
Even though it may not look like it, the iterator_facade class dramatically simplifies the creation of iterators:
Standard APIs for an iterator are implemented using simple APIs like increment(), decrement(), advance(), etc.
Both const_iterator and iterator can be implemented with the same class definition by using a helper class, ValueTypeOf, to
set the reference type associated with the container (MyString in this example)
Validating six APIs is much easier than validating all of the APIs associated with a standard iterator
Summary
Iterators abstract data structure implementation from data access within a container
They provide a consistent abstraction that can be utilized across a variety of very different classes and data structures
Iterators are hierarchically organized into 4 different levels and 5 different categories
Each higher level is a superset of the lower level functionality
Iterators can be utilized in a variety of different algorithms to abstract out the data structure format from the algorithmic operation (like
std::sort())
The definition of an iterator can be a complex task in and of itself. However, the Boost iterator_facade package can greatly simplify
the creation of iterator and const_iterator classes
Explicit Construction
A constructor of a class X taking an argument of type Y is used by the compiler to perform an automatic conversion from Y to X in order to satisfy
a type requirement.
Sometimes, it is desirable to disable the implicit conversion provided by the constructor. This is done by prefixing the constructor with the keyword
"explicit" which marks it as strictly for construction and not for implicit conversion. We disable implicit conversion to avoid surprising the user when
an implicit conversion does not have reasonable semantics.
Example:
class Y ...
class X
{
public:
X(Y&);
explicit X(int);
...
};
X x;
Y y;
x = y; // Compiler generates: x = X(y);
Note that automatic conversion can still be used to convert the argument of an explicit constructor:
class A
{
public:
explicit A(double d)
{}
};
int
main(int argc, char* argv[])
{
A a(1); // Automatic conversion: A a(double(1))
A b(1.0);
a = 2; // Error
b = 2.0; // Error
}
Third example:
template <typename T>
class Vector
{
public:
Vector(); // Empty vector.
Vector(size_t n); // Vector of n items initialized to T() each.
~Vector();
...
};
If we make the constructor that takes size_t explicit then the "v1 = 7" line would not compile. That is a lot more reasonable than having it compile
to something with surprising semantics since the user is more likely to expect "v1 = 7" to assign 7 to all elements of v1 or to reset v1 to a vector of
one value which is 7, rather than to replace v1 with a vector of 7 zeros.
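The effect of the keyword can be verified at compile time. The sketch below assumes C++11 (for static_assert and std::is_convertible); ImplicitVector and ExplicitVector are hypothetical names for two otherwise identical variants of the Vector class, backed by std::vector for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Variant 1: the size constructor doubles as an implicit conversion.
template <typename T>
class ImplicitVector
{
    std::vector<T> mItems;
public:
    ImplicitVector() {}
    ImplicitVector(std::size_t n) : mItems(n) {}    // n items, each T()
    std::size_t size() const { return mItems.size(); }
};

// Variant 2: the same constructor, marked explicit.
template <typename T>
class ExplicitVector
{
    std::vector<T> mItems;
public:
    ExplicitVector() {}
    explicit ExplicitVector(std::size_t n) : mItems(n) {}
    std::size_t size() const { return mItems.size(); }
};

// An int silently converts to the first variant but not to the second.
static_assert(std::is_convertible<int, ImplicitVector<int> >::value,
              "implicit conversion from int is allowed");
static_assert(!std::is_convertible<int, ExplicitVector<int> >::value,
              "explicit constructor blocks implicit conversion");
```

With the first variant, "v1 = 7" compiles and replaces v1 with seven default-initialized items. With the second, the same statement is rejected, while ExplicitVector<int> v2(7) still works.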
Turning off automatic conversion is a practical way of making distinct integer subtypes. Below is an example of using integer indices to represent
vias and routing layers where it is possible to use the indices as integers (to index an array) but it is not possible to use integers as indices or mix
indices.
#include <iostream>
/// Subtype of an integer type Int. Can be used as an Int (via the conversion operator).
/// An Int cannot be converted to this subtype because the constructor is explicit.
/// The "int n" argument should be passed a unique value for each distinct subtype.
template <typename Int, int n>
class IntType
{
public:
explicit IntType(Int i)
: val_(i)
{}
~IntType()
{}
private:
Int val_;
};
int main()
{
ViaIndex vix0(0);
func(lix0, vix0);
lix0 = lix1;
// lix0 = vix0; // Compile time error.
// func(vix0, lix0); // Compile time error.
// func(0, 0); // Compile time error.
}
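A complete, compilable version of this idea might look like the sketch below. The conversion operator, the two typedefs, the tag values 0 and 1, and the body of func() are assumptions made for illustration; only the class skeleton and the names ViaIndex, func, lix0, lix1, and vix0 come from the text above. C++11 is assumed for static_assert.

```cpp
#include <cassert>
#include <type_traits>

template <typename Int, int n>
class IntType
{
public:
    explicit IntType(Int i)
        : val_(i)
    {}
    // Conversion operator: lets the subtype be used wherever an Int is expected.
    operator Int() const { return val_; }
private:
    Int val_;
};

// Assumed typedefs: each subtype gets a distinct tag value.
typedef IntType<int, 0> ViaIndex;
typedef IntType<int, 1> LayerIndex;

// Hypothetical function taking one index of each kind, in a fixed order.
inline int func(LayerIndex layer, ViaIndex via)
{
    return layer * 10 + via;    // both operands convert to int
}

// The three "compile time error" cases, checked without breaking the build.
static_assert(!std::is_convertible<int, ViaIndex>::value,
              "a plain int is not accepted as an index");
static_assert(!std::is_convertible<ViaIndex, LayerIndex>::value,
              "indices of different subtypes do not mix");
static_assert(std::is_convertible<ViaIndex, int>::value,
              "an index can still be used as an int");
```

One caveat of this design: direct initialization such as LayerIndex l(vix0) still compiles, because vix0 converts to int and the explicit constructor then accepts it. Only implicit conversions are blocked.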
Next Section: RTTI
RTTI
C++ allows limited runtime type checking and type comparison. Given a pointer to an object, it is possible to get the actual type of the object even
though the apparent type of the pointer is that of a base class provided that the base class is polymorphic (i.e. it has at least one virtual method).
Pointers and references to non-polymorphic types cannot be resolved to actual object type at run time using standard language features.
Run-time information of polymorphic types can be obtained through dynamic casting or the typeid operator (the RTTI API).
Dynamic Casting
Given a pointer/reference with a base class type, it is possible to check if it actually points/refers to an object of a derived class using a dynamic
cast. If the result of the cast is non-null then it does.
Example:
class Base
{
public:
Base() {};
virtual ~Base() {}
};
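Building on a polymorphic Base like the one above, the null check can be sketched as follows. The classes Derived and Other and the two helper functions are hypothetical. The pointer form of dynamic_cast yields a null pointer on failure, while the reference form throws std::bad_cast.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>

class Base
{
public:
    Base() {}
    virtual ~Base() {}  // at least one virtual method makes Base polymorphic
};

class Derived : public Base
{
public:
    int derivedOnly() const { return 42; }  // not available through Base
};

class Other : public Base
{
};

// Pointer form: a failed cast produces a null pointer.
inline int useIfDerived(Base *b)
{
    Derived *d = dynamic_cast<Derived *>(b);
    if (d != NULL)
    {
        return d->derivedOnly();
    }
    return -1;
}

// Reference form: a failed cast throws std::bad_cast.
inline bool refersToDerived(Base &b)
{
    try
    {
        (void)dynamic_cast<Derived &>(b);
        return true;
    }
    catch (const std::bad_cast &)
    {
        return false;
    }
}
```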
It is cleaner if all the methods of a polymorphic class hierarchy are virtual. This would obviate the need for dynamic casting and result in cleaner
code.
If dynamic casting is used, then one has to check all the cast locations associated with a base class whenever a new derived class is introduced.
Even so, dynamic casting is sometimes used:
To avoid paying the penalty for dynamic dispatch (e.g. same object and same method used in a loop)
To retrofit existing code (e.g. force dynamic selection of a non-polymorphic method)
Consider the following search function which compares a glob/regex pattern to a set of nets:
// Fill results vector with all the nets of the given cell matching the given search expression.
void findNets(Cell* cell, SearchExpression* expr, std::vector<Net*>& results)
{
results.clear();
for (Cell::NetIterator ni = cell->beginNets(); ni != cell->endNets(); ++ni)
if (expr->matches(ni->getName()))
results.push_back(*ni);
}
What if profiling points out that most of the time the search expression is a glob and the pattern is a literal:
// Fill results vector with all the nets of the given cell matching the given search expression.
void findNets(Cell* cell, SearchExpression* expr, std::vector<Net*>& results)
{
results.clear();
// 2. General case.
for (Cell::NetIterator ni = cell->beginNets(); ni != cell->endNets(); ++ni)
if (expr->matches(ni->getName()))
results.push_back(*ni);
}
Here's another example (not recommended) where we use dynamic cast to avoid dynamic dispatch:
void f(Base* b)
{
for ( ... )
b->method(...);
}
versus
void f(Base* b)
{
Derived* d = dynamic_cast<Derived*>(b);
if (d != NULL)
for ( ... )
d->Derived::method(...); // Static dispatch. Faster than dynamic.
else
for ( ... )
b->method() ; // Original dynamic dispatch
}
#include <typeinfo>
#include <iostream>
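class Base
{
public:
Base() {}
~Base() {}
};
// Non-polymorphic derived class used by main() below
class Derived : public Base
{
};
class PolyBase
{
public:
PolyBase() {}
virtual ~PolyBase() {}
};
// Polymorphic derived class used by main() below
class PolyDerived : public PolyBase
{
};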
int
main(int argc, char* argv[])
{
Base b;
Derived d;
Base* dp = &d;
PolyBase pb;
PolyDerived pd;
PolyBase* pdp = &pd;
if (typeid(b) == typeid(Base))
std::cout << "check1\n"; // Prints
if (typeid(d) == typeid(Derived))
std::cout << "check2\n"; // Prints
if (typeid(*dp) == typeid(Derived))
std::cout << "check3\n"; // Does not print.
if (typeid(*pdp) == typeid(PolyDerived))
std::cout << "check4\n"; // Prints
if (typeid(pdp) == typeid(&pd))
std::cout << "check5\n"; // Does not print
std::cout << "pdp type: " << typeid(pdp).name() << '\n'; // Prints P8PolyBase
std::cout << "&pd type: " << typeid(&pd).name() << '\n'; // Prints P11PolyDerived
}
Exceptions
To deal with run-time errors and exceptions C++ provides an exception mechanism whereby the code that encounters the error raises or throws
an exception. This causes the current function/method (i.e. stack frame) to terminate as well as any function/method in the call chain of the
current method until an active handler is found that matches the type of the raised exception.
Note that any terminated function in the call chain will go through the normal process of destruction of active local objects.
If no handler is found in the call chain then std::terminate() is called, which by default aborts the program (typically reported as an "unhandled exception" error).
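The destruction of local objects during unwinding can be observed with a small sketch. Guard, inner(), outer(), and run() are hypothetical names; each Guard increments a shared counter in its destructor, so the counter shows how many locals were destroyed before the handler ran.

```cpp
#include <cassert>
#include <stdexcept>

// Increments a counter when destroyed, making unwinding observable.
struct Guard
{
    int &count;
    explicit Guard(int &c) : count(c) {}
    ~Guard() { ++count; }
};

inline void inner(int &destroyed)
{
    Guard g(destroyed);
    throw std::runtime_error("failure in inner()");  // terminates inner()
}

inline void outer(int &destroyed)
{
    Guard g(destroyed);
    inner(destroyed);   // no handler here, so outer() is terminated as well
}

// Returns the number of Guard objects destroyed while the exception
// propagated from inner() up to the handler below.
inline int run()
{
    int destroyed = 0;
    try
    {
        outer(destroyed);
    }
    catch (const std::exception &)
    {
        // Both stack frames have been unwound by the time we get here.
    }
    return destroyed;
}
```

Calling run() returns 2: one Guard from inner() and one from outer() are destroyed before control reaches the catch block.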
Example
#include <iostream>
Notes:
1. An active handler is one of the catch statements associated with a try block that is in progress.
2. An active handler may be in the same function/method where the exception is thrown or any of the functions/methods in the call chain.
3. Handlers are tried starting with the innermost active try block; the first handler whose type matches is invoked.
4. More than one handler can be associated with a try block. The handlers associated with one block are examined in the order specified,
so more specific types must be listed before less specific ones. A handler matches the exact type of the thrown object or one of its
public base classes (pointer and qualification conversions also apply). Arithmetic conversions such as double to int are not attempted.
5. A handler with ellipsis (dot dot dot) as signature will match any thrown expression.
6. Standard exceptions are all derived from std::exception.
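The ordering rules in notes 3 to 6 can be sketched as follows. This is a minimal illustration, not part of the original course code; the helper name classify and its integer selector are hypothetical.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: throws an exception selected by `which` and reports
// which handler caught it.
std::string classify(int which)
{
    try
    {
        if (which == 0) throw std::out_of_range("index");    // derives from std::logic_error
        if (which == 1) throw std::runtime_error("failure"); // derives from std::exception
        if (which == 2) throw 42;                            // not a std::exception at all
        return "no exception";
    }
    catch (const std::logic_error&) // most specific handler is listed first
    {
        return "logic_error";
    }
    catch (const std::exception&)   // less specific: any other standard exception
    {
        return "exception";
    }
    catch (...)                     // ellipsis: matches whatever is left
    {
        return "unknown";
    }
}
```

If the std::exception handler were listed first, it would also catch std::out_of_range and the std::logic_error handler would never run.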
Example:
#include <iostream>
int
main(int argc, char* argv[])
{
try
{
throw 0;
}
catch (int a)
{
std::cerr << "Caught int exception\n"; // Will print.
}
catch (...)
{
std::cerr << "Caught generic exception\n"; // Will not print.
}
std::cerr << "End\n";
return 0;
}
Another example:
#include <iostream>
int
main(int argc, char* argv[])
{
try
{
throw 1.0; // Throws an expression of type double which does not match int
}
catch (int a)
{
std::cerr << "Caught int exception\n"; // Will not print.
}
catch (...)
{
std::cerr << "Caught generic exception\n"; // Will print.
}
std::cerr << "End\n";
return 0;
}
A catch statement receives a copy of what is thrown. Even if the catch signature specifies a reference, C++ arranges for the reference
to be bound to a copy of the thrown object that remains valid in the catch frame.
#include <iostream>
#include <exception>
class Exception : public std::exception
{
public:
Exception(const char *msg) : msg_ (msg)
{ std::cerr << "Constructed " << msg_ << ' ' << this << std::endl; }
// Copy constructor: invoked when the thrown object is copied
Exception(const Exception& other) : std::exception(other), msg_ (other.msg_)
{ std::cerr << "Copy constructed " << msg_ << ' ' << this << std::endl; }
~Exception() throw ()
{ std::cerr << "Destructed: " << msg_ << ' ' << this << std::endl; }
// Report the stored message
const char* what() const throw ()
{ return msg_; }
private:
const char* msg_;
};
void f2()
{
Exception e("f2"); throw e;
//throw Exception("f2");
}
void f1()
{
try {
f2();
}
catch (const Exception& e) {
std::cerr << "In f1. Caught: " << e.what() << ' ' << &e << std::endl;
throw;
}
}
int
main(int argc, char* argv[])
{
try {
f1();
}
catch (std::exception& e) {
std::cerr << "In main. Caught: " << e.what() << ' ' << &e << std::endl;
}
}
Output:
Constructed f2 0x7fff7cc0cc30
Copy constructed f2 0x8164080
Destructed: f2 0x7fff7cc0cc30
In f1. Caught: f2 0x8164080
In main. Caught: f2 0x8164080
Destructed: f2 0x8164080
However, the caller may not have enough context information to proceed, so it signals its caller. This requires most function/methods to return a
code signaling success or failure. Every call site has to be followed by a check of the code. The checking code is rarely exercised and becomes a
source of latent bugs and a challenge for tools such as code coverage. If an exceptional condition is considered after method/function interfaces
were first defined, many call/return sites would have to be updated.
Value InfixExpr::eval()
{
switch (operation)
{
case PLUS: return getLeft()->eval() + getRight()->eval();
case DIV:
{
// Braces are needed because rv is declared inside a case
Value rv = getRight()->eval();
if (rv == 0)
throw Exception("Divide by zero");
return getLeft()->eval() / rv;
}
...
}
}
Use smart pointers (unique_ptr, shared_ptr, ...) to automatically release locally allocated memory.
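A minimal sketch of why this matters when exceptions are in play: the unique_ptr destructor runs during stack unwinding, so the allocation is released even though the function never reaches its end. The Buffer type, its live counter, and the function names are hypothetical and exist only to make the effect observable.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource type that counts live instances.
struct Buffer
{
    static int live;
    Buffer()  { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Allocates a Buffer and then fails. No delete is written anywhere:
// the unique_ptr releases the Buffer while the stack is unwound.
void process(bool fail)
{
    std::unique_ptr<Buffer> buf(new Buffer);
    if (fail)
        throw std::runtime_error("processing failed");
}

// Returns the number of Buffers still alive after a failed call.
int liveAfterFailure()
{
    try { process(true); }
    catch (const std::runtime_error&) {}
    return Buffer::live;
}
```

With a raw pointer in process(), the same failure would leak the Buffer unless every exit path deleted it explicitly.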
Machine Model
It is strongly recommended that hardware-specific optimization be avoided. Still, it is sometimes useful to keep the underlying hardware
model in mind in order to fine-tune the code.
The processor (each core) is typically pipelined: every instruction takes a number of cycles to finish and the execution of adjacent instructions is
overlapped when possible. A conditional branch disrupts the regular pipeline flow until the condition is known. Branch prediction allows the
processor to immediately issue an instruction following a conditional branch by making an assumption about the outcome of the branch and
possibly recovering once the branch outcome is determined. The compiler will typically issue code assuming the body of an if statement is the
most likely to be executed and that the code in the corresponding else is least likely to be executed.
So, if you know that an error condition is rare, it would be beneficial to write the code as follows:
if (not error_condition)
block-a
else
block-b
as opposed to:
if (error_condition)
block-b
else
block-a
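When reordering the branches is awkward, GCC and Clang also accept an explicit hint through the __builtin_expect builtin. The sketch below assumes one of those compilers; the UNLIKELY macro and the sumNonNegative function are hypothetical names used for illustration.

```cpp
#include <cstddef>

// GCC/Clang builtin. Other compilers need a different spelling; C++20 adds
// the [[likely]] and [[unlikely]] attributes for the same purpose.
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

// Sums the values, treating a negative entry as a rare error.
// Returns -1 if any value is negative.
long sumNonNegative(const int* values, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (UNLIKELY(values[i] < 0))
            return -1;          // rare error path
        total += values[i];     // common path
    }
    return total;
}
```

The hint only affects code layout and prediction. The function returns the same results with or without it, so measure before relying on it.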
Memory Hierarchy
The machine storage is organized in a hierarchy of various sizes and access speeds. Typically higher levels are larger but have longer access
time. Typical organization for a modern (2013) machine:
The registers and the cache are not visible to the C++ program. Still it is possible to influence their behavior:
Following code illustrates how a stride change affects the sequential cache locality for memory access. Code walks the matrix 2000 times adding
1 to each element. If row and col loops are switched the run time goes from 14.4 to 122.9 seconds.
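#include <cstddef>
#include <iostream>
int matrix[1024][16*1024];
int main()
{
for (size_t row = 0; row < 1024; ++row)
for (size_t col = 0; col < 16*1024; ++col)
matrix[row][col] = 0;
for (int pass = 0; pass < 2000; ++pass) // walk the matrix 2000 times
for (size_t row = 0; row < 1024; ++row)
for (size_t col = 0; col < 16*1024; ++col)
matrix[row][col] += 1;
std::cout << matrix[0][0] << '\n';
}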
Cache access can also be influenced by using a more compact representation of the data, increasing the chance that critical sections will fit into
the cache and improving the spatial locality of the program. One way of doing this is through the use of bit fields.
Interfaces should use the most descriptive type: do not use a char for a Boolean.
class Sample
{
...
private:
bool mFlag1 : 1; // Use 1 bit for flag1
bool mFlag2 : 1; // and 1 bit for flag2 ...
bool mFlag3 : 1;
bool mFlag4 : 1;
} __attribute__((packed));
If not using bit fields, the 4 flags would have used 4 bytes. With bit-fields (and packing) they use 1 byte.
If range of an integer variable is known to be small, we can use less than 4 bytes to represent it:
class Example
{
...
public:
private:
DayOfWeek mDay : 3 ;
} __attribute__((packed));
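The size difference can be checked with sizeof. The sketch below uses two hypothetical structs, PlainFlags and PackedFlags. Bit-field layout is implementation-defined, so the exact sizes are an assumption about common ABIs rather than a guarantee.

```cpp
#include <cstddef>

// Four flags stored as ordinary bool members: one byte each on common ABIs.
struct PlainFlags
{
    bool flag1;
    bool flag2;
    bool flag3;
    bool flag4;
};

// The same four flags as 1-bit bit-fields share a single storage unit.
struct PackedFlags
{
    bool flag1 : 1;
    bool flag2 : 1;
    bool flag3 : 1;
    bool flag4 : 1;
};

std::size_t plainSize()  { return sizeof(PlainFlags); }
std::size_t packedSize() { return sizeof(PackedFlags); }
```

The compact form has a cost: reading or writing a bit-field needs extra mask and shift instructions, and the address of a bit-field cannot be taken.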
When writing synchronization primitives (e.g. your own mutex code), you need to understand the memory model and make sure to issue the
appropriate memory barrier instructions at the beginning of the lock and at the end of the unlock to ensure that access to shared memory is
properly synchronized. It is highly recommended not to write your own synchronization primitives and to use the language/library facilities instead.
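As a minimal sketch of the library route, the standard std::mutex and std::lock_guard (C++11) already issue the required barriers. The Counter class is a hypothetical example.

```cpp
#include <mutex>

// Hypothetical shared counter protected by a library mutex.
class Counter
{
    std::mutex mMutex;
    long mValue;
public:
    Counter() : mValue(0) {}
    void increment()
    {
        std::lock_guard<std::mutex> lock(mMutex); // locks here, unlocks at scope exit
        ++mValue;
    }
    long value()
    {
        std::lock_guard<std::mutex> lock(mMutex);
        return mValue;
    }
};
```

Any number of threads may call increment() on the same Counter. The lock_guard also releases the mutex if the protected code throws.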
Tools
Most modern processors include performance-related counters. With the right tools it is possible to inspect such counters and use the information
to restructure the code to avoid performance hot spots. Some of these tools include:
Cachegrind: part of the valgrind suite of tools. Simulates cache access of your application producing miss ratios for the different cache
levels (typically first and last). Program will run 20 to 100 times slower but you will get per-function cache statistics.
Linux perf tools: With a modified Linux kernel, it is possible to sample the performance counters and map the collected stats to a program
(or function within a program). Sample run:
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.956217 s, 535 MB/s
Design Patterns
A design pattern describes a programming problem that appears over and over again, together with a proven solution
to that problem
No need to reinvent the wheel
Leverage best-in-class solution to a problem
Three different categories of patterns
Creational Patterns
Deal with object creation such that the method chosen for creation matches the environment
Structural Patterns
Deal with relationships between classes
Behavioral Patterns
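Deal with communication between classes
Creational          Structural    Behavioral
Factory Method      Adapter       Chain of Responsibility
Abstract Factory    Bridge        Command
Builder             Composite     Interpreter
Prototype           Decorator     Iterator
Singleton           Facade        Mediator
                    Flyweight     Memento
                    Proxy         Observer
                                  State
                                  Strategy
                                  Visitor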
Creational Patterns
Factory Method
Graph
#include <vector>
/*
* This code is an example of a factory. The Graph class creates Node and Edge
* instances that are members of this Graph. That way the Graph instance can
* explicitly manage the memory for each and ensure that the user doesn't try
* to add the same Node instance or Edge instance to multiple Graph instances.
* Furthermore Node and Edge instances can point to one another without having
* to worry about managing the memory associated with them since they all belong
* to a Graph instance.
*/
// Pre-declare the 3 classes
class Node;
class Edge;
class Graph;
class Node
{
// A node can have 0 or more edges attached to it
Graph *mOwner;
std::vector<Edge*> mEdges;
...
private:
// We aren't going to allow these because they don't
// make sense
Node();
Node(const Node &copy);
Node &operator=(const Node &copy);
private:
// Make the graph a friend so that it can create nodes
friend class Graph;
Node(Graph *owner);
~Node();
public:
unsigned numEdges() const;
Graph *owner() const;
...
};
class Edge
{
Graph *mOwner;
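Node *mNodes[ 2 ]; // Nodes are owned by the Graph, so only point to them
...
private:
Edge();
Edge(const Edge &copy);
Edge &operator=(const Edge &copy);
private:
// Make the Graph a friend so that it can create edges
friend class Graph;
Edge(Graph *owner, Node &n1, Node &n2);
~Edge();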
public:
Node &operator[](const unsigned nodeIndex) const;
Graph *owner() const;
...
};
class Graph
{
std::vector<Node*> mNodes;
std::vector<Edge*> mEdges;
...
public:
Graph();
Graph(const Graph &copy);
~Graph();
...
Node &createNode();
Edge &createEdge(Node &n1, Node &n2);
...
};
Node &Graph::createNode()
{
Node *newNode = new Node( this );
mNodes.push_back( newNode );
return( *newNode );
}
Abstract Factory
Builder
As objects become more and more complex, the number of constructors can grow dramatically
A different constructor can be used based upon the way in which an object is going to be used
Selecting from all of the constructors and passing in a lot of parameters to the constructor can cause programming headaches
Constructor growth could be exponential in the worst case
A builder accepts data in piece-meal form and, when asked for an instance of an object, determines how to construct the object based on
passed/saved parameters
The complexity of construction is encoded once in the builder and used over and over again without the programmer having to worry
about meeting all of the requirements of the instance being created
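The bullets above can be sketched as follows. Window and WindowBuilder are hypothetical classes invented for this illustration; the defaults are arbitrary.

```cpp
#include <string>

// Hypothetical product with several optional settings.
class Window
{
    friend class WindowBuilder;   // only the builder may construct a Window
    std::string mTitle;
    int mWidth;
    int mHeight;
    bool mResizable;
    Window() : mWidth(0), mHeight(0), mResizable(false) {}
public:
    std::string describe() const
    {
        return mTitle + " " + std::to_string(mWidth) + "x" + std::to_string(mHeight)
               + (mResizable ? " resizable" : " fixed");
    }
};

// The builder accepts settings piecemeal and supplies defaults for anything
// the caller did not specify.
class WindowBuilder
{
    Window mWindow;
public:
    WindowBuilder()
    {
        mWindow.mTitle = "untitled";
        mWindow.mWidth = 640;
        mWindow.mHeight = 480;
    }
    WindowBuilder &title(const std::string &t) { mWindow.mTitle = t; return *this; }
    WindowBuilder &size(int w, int h) { mWindow.mWidth = w; mWindow.mHeight = h; return *this; }
    WindowBuilder &resizable(bool r) { mWindow.mResizable = r; return *this; }
    Window build() const { return mWindow; }
};
```

A caller writes WindowBuilder().title("main").resizable(true).build() and never has to choose among constructors or pass settings it does not care about.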
Prototype
Prototype Example
/*
* This code illustrates the Prototype pattern. An abstract base class, Transistor,
* is inherited by concrete types BJT, PNP, NPN, etc. The Transistor class defines
* the clone() API that has to be implemented by each concrete type.
*
* NOTE: Because Transistor is an abstract base class, we aren't going to allow
* a public constructor or copy constructor (the compiler wouldn't allow it
* anyhow). Furthermore, we aren't going to allow the assignment operator
*/
class Transistor
{
...
protected:
/*
* As a base class, we don't want to let somebody create an abstract
* transistor out of context
*/
Transistor();
Transistor(const Transistor &other);
Transistor &operator=(const Transistor &other);
public:
virtual ~Transistor();
virtual Transistor *clone() const=0;
};
/*
* Each concrete class has to implement the clone() API. C++ allows an
* overriding virtual function to return a pointer to a more derived type
* (a covariant return type). Thus, clone() for BJT can return BJT* rather
* than Transistor*, as can clone() for PNP, NPN, etc.
*/
class BJT : public Transistor
{
public:
// Replacement APIs for pure virtual APIs in base class
BJT *clone() const;
...
};
class PNP : public Transistor
{
public:
// Replacement APIs for pure virtual APIs in base class
PNP *clone() const;
...
};
/*
* The clone() API is useful primarily when a routine is operating with
* the base class type and needs to create a duplicate. No queries need to
* be made to determine the type prior to creating a new instance. Just
* ask the existing instance to create a duplicate.
*/
Singleton
Logger Example
#include <iostream>
#include <string>
/*
* This code illustrates the Singleton pattern. The program maintains a single
* logging facility to send messages to in the class Logger::LoggerBase. The
* Logger class simply provides a mechanism that avoids the use of a global
* variable by using a singleton under the covers. Yes, the singleton is still
* a global, but it's protected in scope by the class Logger.
*/
class Logger
{
/*
* The LoggerBase class is responsible for actually logging messages to
* a file. It is sent messages from the Logger class that it is contained
* within
*/
class LoggerBase
{
...
private:
LoggerBase(const LoggerBase &copy);
LoggerBase &operator=(const LoggerBase &other);
public:
LoggerBase ();
~LoggerBase();
...
LoggerBase &operator<<(const std::string &message);
...
};
/*
* The only variable for the class is the single instance of LoggerBase that
* all instances of Logger will use for message output
*/
static LoggerBase gBase;
public:
Logger();
Logger(const Logger &copy);
~Logger();
...
Logger &operator<<(const std::string &message);
Logger &operator<<(const char *message);
Logger &operator<<(const char ch);
Logger &operator<<(const int value);
...
Logger &operator=(const Logger &other);
};
...
inline Logger &Logger::operator<<(const std::string &message)
{
gBase << message;
return( *this );
}
...
/*
* In the code below, each routine can create their own local logging
* facility, whenever needed. All of the logging facilities are kept
* in sync because they all use the same underlying base facility to
* actually generate messages.
*/
void doMoreWork()
{
Logger log;
...
log << "Doing more work...\n";
...
log << "Done doing more work...\n";
}
void doWork()
{
Logger log;
...
log << "Starting work...\n";
...
doMoreWork();
...
log << "Done working...\n";
}
Structural Patterns
Adapter
Adapter Example
#include "boost/smart_ptr.hpp"
class Display;
/*
* This example shows how to wrap one class, Car, so that it looks like
* a generic shape that can be drawn on a display. The base class, Shape,
* is an abstract base class that cannot be instantiated by itself. Instead,
* it defines interfaces that must be provided by inheriting classes.
*/
class Shape
{
protected:
Shape();
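Shape(const Shape &copy);
Shape &operator=(const Shape &other);
public:
virtual ~Shape();
// Pure virtual APIs that every concrete shape must provide
virtual double area() const=0;
virtual void draw(Display *display)=0;
};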
/*
* The Car is a class entirely unrelated to a Shape. However, we want to be
* able to draw a Car instance onto a Display instance, and only Shape instances
* can be drawn.
*/
class Car
{
public:
Car();
Car(const Car &other);
~Car();
...
Car &operator=(const Car &other);
};
/*
* The solution is to create a wrapper class, CarShape, that is a type of Shape
* but that gathers information about it from the underlying Car instance. We're
* using the Boost shared_ptr<T> class to hold the pointer for the Car so that we
* don't have to worry about memory ownership. When nothing else points to the
* Car instance, it's automatically deleted.
*/
typedef boost::shared_ptr<Car> CarPointer;
class CarShape : public Shape
{
CarPointer mCar;
private:
CarShape();
public:
CarShape(const CarPointer car);
~CarShape();
public:
// Replacement for pure virtual APIs in Shape
double area() const;
void draw(Display *display);
};
Bridge
Applies composition for adaptation instead of simply changing the interface the way that the Adapter Pattern does
Instead of continuous inheritance and specification, refactor unrelated elements into different classes
Example: Cars
Composite
In this pattern, a group of objects is manipulated in the same way that a single object might be manipulated
Example from early in the C++ class
A Shape class may represent a single shape (circle, rectangle, triangle)
A collection of shapes makes a composite shape that can also be drawn, moved, filled, etc
The collection implements a composite pattern
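A minimal sketch of the shape example, assuming a hypothetical Shape interface with just area() and move(). The group does not own its children here; a real implementation would have to decide on ownership.

```cpp
#include <cstddef>
#include <vector>

// Common interface shared by single shapes and groups of shapes.
class Shape
{
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
    virtual void move(double dx, double dy) = 0;
};

class Rectangle : public Shape
{
    double mX, mY, mWidth, mHeight;
public:
    Rectangle(double x, double y, double w, double h)
        : mX(x), mY(y), mWidth(w), mHeight(h) {}
    double area() const { return mWidth * mHeight; }
    void move(double dx, double dy) { mX += dx; mY += dy; }
    double x() const { return mX; }
};

// The composite is itself a Shape and forwards each operation to its children.
class ShapeGroup : public Shape
{
    std::vector<Shape*> mChildren;
public:
    void add(Shape *shape) { mChildren.push_back(shape); }
    double area() const
    {
        double total = 0.0;
        for (std::size_t i = 0; i < mChildren.size(); ++i)
            total += mChildren[i]->area();
        return total;
    }
    void move(double dx, double dy)
    {
        for (std::size_t i = 0; i < mChildren.size(); ++i)
            mChildren[i]->move(dx, dy);
    }
};
```

Because ShapeGroup is a Shape, a group can be added to another group, and client code that moves or measures a Shape works unchanged on either.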
Decorator
/*
* The following shows how to implement a person with "decorations". We start with
* an empty base class that describes APIs that each class should redefine
*/
class PersonBase
{
public:
...
virtual std::string describe() const=0;
...
};
/*
* Normally for a person we might think of eliminating the base class altogether
* and making Person the base class. Our decorators, though, also inherit from
* the base class, so we don't want multiple copies of the same data in every
* decorator
*/
class Person : public PersonBase
{
...
public:
...
std::string describe() const
{
std::stringstream description( std::stringstream::in | std::stringstream::out );
description << mAge << " years old";
return( description.str() );
}
...
};
/*
* We create a "blonde-haired" decorator that also inherits from the base class
* that we want to decorate
*/
class BlondeHaired : public PersonBase
{
PersonBase *mBase; // The object being decorated
private:
BlondeHaired();
BlondeHaired(const BlondeHaired &copy);
BlondeHaired &operator=(const BlondeHaired &other);
public:
BlondeHaired(PersonBase *base)
{
mBase = base;
}
...
std::string describe() const
{
std::stringstream description( std::stringstream::in | std::stringstream::out );
description << "blonde-haired, " << mBase->describe();
return( description.str() );
}
...
};
/*
* Yet another decorator that inherits from the base class
*/
class BlueEyed : public PersonBase
{
PersonBase *mBase; // The object being decorated
private:
BlueEyed();
BlueEyed(const BlueEyed &copy);
BlueEyed &operator=(const BlueEyed &other);
public:
BlueEyed(PersonBase *base)
{
mBase = base;
}
...
std::string describe() const
{
std::stringstream description( std::stringstream::in | std::stringstream::out );
description << "blue-eyed, " << mBase->describe();
return( description.str() );
}
...
};
...
/*
* We can create a description of a person by dynamically adding decorators
* to each successive level of decoration. This is different from the bridge
* pattern because additions are dynamic, and we can layer on as many decorations
* as we want at runtime
*/
/*
* The following should print:
* blue-eyed, blonde-haired, 23 years old
*/
Facade
A facade pattern presents a simplified interface to a class or classes that has/have a more complicated interface
Provides a single interface, mainly to make life easier for the programmer
The facade talks to the underlying classes, but they are not aware of the facade (unidirectional control)
Example: starting a car
To start a car, you put the key into the ignition and turn the key
Underneath the covers, turning the key initiates a number of other actions to get the car started
Independent subsystems are coordinated in their action through the use of the facade
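The car example might be sketched like this. The subsystem classes (FuelPump, Starter, Ignition) and the Log typedef are hypothetical stand-ins; real subsystems would do actual work instead of writing to a log.

```cpp
#include <string>
#include <vector>

// Each hypothetical subsystem records what it did in a shared log.
typedef std::vector<std::string> Log;

class FuelPump { public: void pressurize(Log &log) { log.push_back("fuel pump on"); } };
class Starter  { public: void crank(Log &log)      { log.push_back("starter cranking"); } };
class Ignition { public: void spark(Log &log)      { log.push_back("ignition firing"); } };

// The facade exposes one simple operation and coordinates the subsystems.
// The subsystems know nothing about the facade.
class Car
{
    FuelPump mFuelPump;
    Starter mStarter;
    Ignition mIgnition;
public:
    Log turnKey()
    {
        Log log;
        mFuelPump.pressurize(log);
        mStarter.crank(log);
        mIgnition.spark(log);
        return log;
    }
};
```

The caller only ever sees turnKey(); the order in which the subsystems must be driven is encoded once, inside the facade.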
Flyweight
Proxy
Behavioral Patterns
Chain of Responsibility
Command
An object is used to encapsulate all information necessary to execute a command at some later point in time
Example: thread pool in the Cadence proprietary C++ class library
Each API that is to be executed in a thread, along with the parameters for that API invocation, is passed to the thread pool
The thread pool saves the information for later execution by an available thread
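The Cadence thread pool itself is not shown in these notes, so the following is only a single-threaded sketch of the idea using std::function and std::bind from the standard library. CommandQueue, submit, runAll, and addToTotal are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <queue>

// Hypothetical stand-in for a thread pool: commands are queued now and run
// later. A real pool would hand each command to an available worker thread.
class CommandQueue
{
    std::queue<std::function<void()> > mPending;
public:
    // Capture the callable together with its argument for later execution.
    template<typename Function, typename Arg>
    void submit(Function function, Arg arg)
    {
        mPending.push(std::bind(function, arg));
    }
    // Execute every saved command in submission order; returns how many ran.
    std::size_t runAll()
    {
        std::size_t executed = 0;
        while (!mPending.empty())
        {
            mPending.front()();
            mPending.pop();
            ++executed;
        }
        return executed;
    }
};

int gTotal = 0;                               // state changed by the example command
void addToTotal(int amount) { gTotal += amount; }
```

Nothing happens when submit() is called; the work is done only when runAll() executes the stored command objects.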
Interpreter
For a given language representation (grammar), an interpreter reads the grammar to perform an action, build data structures, etc
Iterator
An iterator is used to sequentially access elements in a container without regard to the form of the container
Example: The C++ standard template library containers implement iterators in both directions (front-to-back and back-to-front)
Mediator
In the mediator pattern, an object is used to facilitate communication between two or more other objects
Objects only communicate to each other through the mediator, not directly
Example: Timing simulator
Each cell object is responsible for calculating its timing based upon input transition time and capacitive load
The simulator is the mediator that is responsible for getting the timing results from one cell object and passing data downstream
to later cell objects
The picture below shows how the Interpreter, Mediator, Flyweight, and Factory patterns might work together in a Fast-SPICE electrical
simulator
Memento
In the memento design pattern, a state is saved so that it can be restored at a later time if desired
A class is used to hold the state and, if required, restore the state of a system
Examples
The "undo" command in an editor
Snapshots of file systems in ZFS or on the NetApp disk array
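A minimal sketch of the "undo" case, using a hypothetical Editor class. The nested Memento class holds the saved state and exposes nothing to anyone but the Editor.

```cpp
#include <string>

// Hypothetical text editor with a save/restore facility.
class Editor
{
    std::string mText;
public:
    // The memento holds a snapshot of the editor state and nothing else.
    class Memento
    {
        friend class Editor;          // only the Editor may create or read it
        std::string mSavedText;
        explicit Memento(const std::string &text) : mSavedText(text) {}
    };
    void type(const std::string &words) { mText += words; }
    const std::string &text() const { return mText; }
    Memento save() const { return Memento(mText); }
    void restore(const Memento &memento) { mText = memento.mSavedText; }
};
```

An undo stack would simply keep a container of Memento objects and restore the most recent one on request.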
Observer
An object (the subject) is created that registers other classes (the observers) and notifies them when a state change occurs
Example: event loops in a GUI (like Qt)
Programmers can register objects that should be invoked when various events are detected by the GUI (mouse movement,
mouse click, keyboard press, etc.)
Upon detecting the event, the Qt system invokes a method of the registered class instance
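This is not how Qt implements its signals; it is only a bare-bones sketch of the registration and notification steps, with hypothetical Observer, Subject, and EventRecorder classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Interface implemented by anything that wants to be notified.
class Observer
{
public:
    virtual ~Observer() {}
    virtual void onEvent(const std::string &event) = 0;
};

// The subject keeps a list of registered observers and notifies each of them.
class Subject
{
    std::vector<Observer*> mObservers;
public:
    void attach(Observer *observer) { mObservers.push_back(observer); }
    void notify(const std::string &event)
    {
        for (std::size_t i = 0; i < mObservers.size(); ++i)
            mObservers[i]->onEvent(event);
    }
};

// Hypothetical observer that records the events it receives.
class EventRecorder : public Observer
{
    std::vector<std::string> mEvents;
public:
    void onEvent(const std::string &event) { mEvents.push_back(event); }
    const std::vector<std::string> &events() const { return mEvents; }
};
```

The subject depends only on the Observer interface, so new kinds of observers can be added without touching it.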
State
For a given API in a class instance, the action associated with the API may vary, based on the internal state of the instance
Can be easily implemented in C++ through the use of virtual APIs in a base class
Example: Adobe Photoshop
The mouse is used to perform a number of different actions in Photoshop
A base class might support mouse button press, mouse button release, and mouse movement
Depending on the tool selected in Photoshop, the action associated with these events will be quite different from one another
Rectangle selection mechanics are quite different from text selection mechanics, for example
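A sketch of the tool example with virtual APIs in a base class. Tool, RectangleSelectTool, TextTool, and Canvas are hypothetical, and each action is reduced to a descriptive string.

```cpp
#include <string>

// Base class for the behavior associated with the currently selected tool.
class Tool
{
public:
    virtual ~Tool() {}
    virtual std::string mousePress() const = 0;
};

class RectangleSelectTool : public Tool
{
public:
    std::string mousePress() const { return "anchor selection rectangle"; }
};

class TextTool : public Tool
{
public:
    std::string mousePress() const { return "place text cursor"; }
};

// The canvas forwards the same event to whichever tool is its current state.
class Canvas
{
    const Tool *mTool;
public:
    explicit Canvas(const Tool *tool) : mTool(tool) {}
    void selectTool(const Tool *tool) { mTool = tool; }
    std::string mousePress() const { return mTool->mousePress(); }
};
```

The Canvas contains no switch on the tool type; changing state is just a matter of pointing at a different Tool.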
Strategy
Given a family of classes that are related to one another, the selection of which class to use for an operation is dynamically selectable
Example: Timing analysis
Given a netlist, transistor models, table-based models, and delay data, selection of which model to use for timing analysis is
determined at runtime based on the availability of the most accurate source of data and user selection
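A sketch of runtime model selection. The DelayModel interface, the two concrete models, and their delay formulas are all made up for illustration and do not represent real timing analysis.

```cpp
#include <cstddef>
#include <vector>

// Interface for a family of interchangeable delay models.
class DelayModel
{
public:
    virtual ~DelayModel() {}
    virtual bool available() const = 0;
    virtual double delay(double load) const = 0;
};

// Hypothetical table-based model that may or may not have data.
class TableModel : public DelayModel
{
    bool mHasTable;
public:
    explicit TableModel(bool hasTable) : mHasTable(hasTable) {}
    bool available() const { return mHasTable; }
    double delay(double load) const { return 2.0 * load; }
};

// Hypothetical fallback model that is always usable.
class FallbackModel : public DelayModel
{
public:
    bool available() const { return true; }
    double delay(double load) const { return 3.0 * load; }
};

// Picks the first available model from a list ordered by decreasing accuracy.
// Returns a null pointer if none is available.
const DelayModel *selectModel(const std::vector<const DelayModel*> &models)
{
    for (std::size_t i = 0; i < models.size(); ++i)
        if (models[i]->available())
            return models[i];
    return 0;
}
```

The code that computes delays works through the DelayModel interface only, so the choice of strategy can change from run to run.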
Visitor
Visitor Pattern
/*
* Forward declare the elements that a visitor base class should be able to process
*/
class Transistor;
class Resistor;
class Capacitor;
/*
* A visitor base class is created that has a visit() routine for each type of
* element that it should be able to process in the collection. Actual visitor
* implementations will need to supply bodies for these APIs
*/
class VisitorBase
{
protected:
VisitorBase();
VisitorBase(const VisitorBase &copy);
VisitorBase &operator=(const VisitorBase &other);
public:
virtual void visit(const Transistor &transistor)=0;
virtual void visit(const Resistor &resistor)=0;
virtual void visit(const Capacitor &capacitor)=0;
};
/*
* Define basic elements that can be found in a generic netlist: resistor,
* capacitor, and transistor
*/
class Transistor
{
...
public:
...
void accept(VisitorBase *visitor)
{
visitor->visit( *this );
}
...
};
class Resistor
{
...
public:
...
void accept(VisitorBase *visitor)
{
visitor->visit( *this );
}
...
};
class Capacitor
{
double mCapacitance;
...
public:
...
void accept(VisitorBase *visitor)
{
visitor->visit( *this );
}
...
double getCapacitance() const
{
return( mCapacitance );
}
...
};
/*
* A netlist is a collection of transistors, capacitors, and resistors
*/
class Netlist
{
std::deque<Transistor*> mTransistors;
std::deque<Capacitor*> mCapacitors;
std::deque<Resistor*> mResistors;
...
public:
...
void accept(VisitorBase *visitor)
{
for (unsigned i = 0, limit = mTransistors.size(); i < limit; i++)
{
mTransistors.at( i )->accept( visitor );
}
for (unsigned i = 0, limit = mCapacitors.size(); i < limit; i++)
{
mCapacitors.at( i )->accept( visitor );
}
for (unsigned i = 0, limit = mResistors.size(); i < limit; i++)
{
mResistors.at( i )->accept( visitor );
}
}
...
};
...
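/*
* Use a specialized visitor, TotalCapVisitor, to get the total capacitance
* associated with any netlist. Only capacitors contribute to the total, so
* the other visit() routines do nothing
*/
class TotalCapVisitor : public VisitorBase
{
double mTotalCap;
public:
TotalCapVisitor() : mTotalCap( 0.0 ) {}
void visit(const Transistor &transistor) {}
void visit(const Resistor &resistor) {}
void visit(const Capacitor &capacitor)
{
mTotalCap += capacitor.getCapacitance();
}
double getTotalCap() const
{
return( mTotalCap );
}
};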
/*
* Create and build a netlist of resistors, capacitors, and transistors
*/
Netlist netlist;
...
/*
* Now use the visitor to calculate the total capacitance of the netlist
*/
TotalCapVisitor totalCap;
netlist.accept( &totalCap );
std::cout << "Total cap for netlist is " << totalCap.getTotalCap() << std::endl;
...
Template Metaprogramming
Definition
Wikipedia.com: "Template metaprogramming (TMP) is a metaprogramming technique in which templates are used by a compiler to
generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of
these templates include compile-time constants, data structures, and complete functions. The use of templates can be thought of as
compile-time execution."
In layman's terms:
Letting the compiler compute an answer, or partial answer, instead of computing everything at runtime
Often takes advantage of recursive templates in which the compiler unrolls the recursion
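The classic small illustration is a compile-time factorial. This sketch is not part of the course examples; Factorial and table are hypothetical names.

```cpp
// General case: the compiler instantiates Factorial<N-1> recursively.
template<unsigned N>
struct Factorial
{
    static const unsigned long long value = N * Factorial<N - 1>::value;
};

// Full specialization that stops the compiler's recursion.
template<>
struct Factorial<0>
{
    static const unsigned long long value = 1;
};

// The result is a compile-time constant, so it can be used as an array size.
int table[Factorial<4>::value];
```

No multiplication is executed at runtime: Factorial<4>::value is simply the constant 24 in the generated code.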
Below is the assembly language code produced by compiling this with -O3 optimization
Binary Search Assembly Code
movss xmm1, DWORD PTR [rdi]
xor eax, eax
ucomiss xmm1, xmm0
ja .L2
lea eax, [rsi-2]
mov edx, eax
ucomiss xmm0, DWORD PTR [rdi+rdx*4]
jae .L2
sub esi, 1
xor edx, edx
jmp .L3
.L16:
mov edx, ecx
.L5:
ucomiss xmm1, xmm0
jbe .L15
cmp edx, eax
ja .L12
mov esi, eax
.L3:
lea eax, [rdx+rsi]
shr eax
mov ecx, eax
movss xmm1, DWORD PTR [rdi+rcx*4]
mov ecx, eax
ucomiss xmm0, xmm1
jbe .L5 # LOOP HERE
cmp esi, eax
jb .L7
.L6:
add eax, esi
shr eax
mov edx, eax
movss xmm1, DWORD PTR [rdi+rdx*4]
ucomiss xmm0, xmm1
jbe .L16 # LOOP HERE
cmp esi, eax
mov ecx, eax
jae .L6 # LOOP HERE
.L7:
lea eax, [rsi-1]
.L2:
rep
j .L17
.L12:
mov esi, eax
lea eax, [rsi-1]
jmp .L2
.L15:
sub eax, 1
j .L17
.L17:
Below is the template code to do the binary search
/*
* These partially specialized templates stop the compiler recursion.
* We define a binary search in terms of the number of values to search.
* 0: This should never happen, but is in place "just in case"
* 1: By definition of what we're looking for, we return 0 if there's
* only a single element
* 2: If there are 2 elements in the vector, we only need to compare
* what we're searching for against the second element in the vector.
* Again, refer to the definition of our return value to see why
* 3: This actually isn't needed since the compiler can derive 3 from
* the general case above. We only add it for optimization since
* we can easily define the values returned in the case of 3 elements
*/
Here's the assembly language code for a template-based search across 6 elements
Here's how you'd now use the binary search code in a program.
Using the Binary Search Code
...
const unsigned arraySize = 6;
float array[ arraySize ];
populateArray( array, arraySize );
...
float wanted = getWantedValue();
TemplateSearch<float,arraySize> search;
unsigned pos = search( array, wanted );
...
/*
* This is an optimized bubble sort that stops once there are no more
* swaps between elements of the vector
*/
if (!compare( v[ i - 1 ], v[ i ] ))
{
std::swap( v[ i - 1 ], v[ i ] );
swapped = true;
}
}
limit--;
} while (swapped);
}
public:
inline void operator()(T *values) const
{
sortUp( values );
BubbleSort<T,Size-1>()( values );
}
};
/*
* Here are specialized versions of the bubble sort that stop the
* recursion of the compiler. If there are 0 elements to sort, we do
* nothing. Same if there's only 1 element to sort (it's already
* sorted by definition).
*
* This is an example of partial template specialization
*/
public:
inline void operator()(T *values) const {}
};
Here's some example code that shows the performance of the standard quick sort against the template version of the bubble sort
Sorting Compared
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include "BubbleSort.h"
Here are runtimes for the two different sorts, as measured on a 3.4GHz Intel 3960X computer
Note: The checks after sorting are required because the compiler optimizes everything out otherwise
Further experiments determined that each sort took the following amount of CPU time
Thus, the bubble sort is 54% faster than std::sort() on 6 numbers, despite its O(n^2) execution time
Another experiment on the same computer showed identical performance on 10 numbers
Conclusion
In a performance-intensive application that sorts 10 numbers or fewer, a template-based bubble sort may be a better
choice
public:
inline void operator()(T *values) const
{
if (sortUp( values ))
{
BubbleSort<T,Size-1>()( values );
}
}
};
/*
* When working with a single element, there is no swap, so sortUp() can
* always return false
*/
public:
inline void operator()(T *values) const {}
};
With this new technique in place, the bubble sort and std::sort() achieve parity on a random data set at 12 elements
If the list is already sorted, the bubble sort never loses to std::sort()
Generalized Sorting
void sort(float *values, const unsigned size)
{
/*
* Determine which sort to invoke based upon the number of elements
* being sorted. Anything over 12 will be done using std::sort().
* 12 elements or less will be sorted with a bubble sort
*/
switch (size)
{
case 12: BubbleSort<float,12>()( values ); break;
case 11: BubbleSort<float,11>()( values ); break;
case 10: BubbleSort<float,10>()( values ); break;
case 9: BubbleSort<float,9>()( values ); break;
case 8: BubbleSort<float,8>()( values ); break;
case 7: BubbleSort<float,7>()( values ); break;
case 6: BubbleSort<float,6>()( values ); break;
case 5: BubbleSort<float,5>()( values ); break;
case 4: BubbleSort<float,4>()( values ); break;
case 3: BubbleSort<float,3>()( values ); break;
case 2: BubbleSort<float,2>()( values ); break;
case 1:
case 0: break;
default: std::sort( values, values + size ); break;
}
}
Example n-choose-k
Consider the case of calculating n-choose-k
Below is one way to calculate the value using Pascal's triangle, template metaprogramming, and template specialization
The following image is from http://www.mathsisfun.com/pascals-triangle.html
N-Choose-K
#include <cstring>
#include <iostream>
#include <sstream>
#include <stdexcept>
/*
* This code shows how to calculate n-choose-k using template metaprogramming.
* It also shows operator overloading.
*
* The n-choose-k problem is the problem of selecting k items from n items,
* when order doesn't matter. How many unique combinations of k items can
* be selected from n items? The generic formula for this calculation is
* n! / ((n - k)! * k!)
* This typically isn't computed directly because n! quickly overflows the
* variable, even for 64-bit systems. A loop can be used by multiplying partial
* products, but that involves floating-point calculations.
*
* A third way to calculate the value of n-choose-k is to use Pascal's
* triangle. An interesting property of Pascal's triangle is that the answer
* to the n-choose-k problem is found on row n, element k (where the first
* row and first element are 0-indexed).
*
* This code shows how to quickly calculate the answer to n-choose-k by
* calculating Pascal's triangle for that specific question.
*
* Memory is ((n+1)^2)/2 and runtime is O( n * min( k, n - k ) ).
*
* As an exercise, try removing the "static" qualifier on the variable
* priorRow in the NChooseK::operator[]() API to see what happens to runtime.
* Can you explain why the runtime is impacted?
*
* What happens if the specialization for NChooseK<25> is removed? Can you
* explain why the runtime is impacted?
*/
BaseType mResults[ N + 1 ];
if (N < k)
{
std::ostringstream msg;
msg << "The value of n (" << N
<< ") cannot be less than the value of k(" << k << ").";
throw( std::invalid_argument( msg.str() ) );
}
}
public:
NChooseK()
{
memset( mResults, 0, sizeof( mResults ) );
}
validate( k );
BaseType *result = mResults + k + 1;
if (!*result)
{
/*
* Recursively calculate the answer using results from the previous
* row. If there's an overflow situation, throw an exception.
* Otherwise store the result
*/
/*
* Specialize the operator for 0 and 1. After all, we already know the answer.
* This also serves as a way to stop recursion in the method above.
*/
template<> NChooseK<25>::NChooseK()
{
BaseType values[] = { 1, 25, 300, 2300, 12650, 53130, 177100,
480700, 1081575, 2042975, 3268760, 4457400,
5200300, 5200300, 4457400, 3268760, 2042975,
1081575, 480700, 177100, 53130, 12650, 2300,
300, 25, 1 };
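For comparison, Pascal's rule can also be evaluated entirely at compile time when both n and k are known constants. This is a different approach from the NChooseK class above, which takes k at runtime. The Choose template is a hypothetical name, and the sketch assumes k <= n.

```cpp
// Pascal's rule: C(n, k) = C(n-1, k-1) + C(n-1, k). Requires K <= N.
template<unsigned N, unsigned K>
struct Choose
{
    static const unsigned long long value =
        Choose<N - 1, K - 1>::value + Choose<N - 1, K>::value;
};

// Edges of the triangle stop the recursion: C(n, 0) = 1 and C(n, n) = 1.
template<unsigned N>
struct Choose<N, 0> { static const unsigned long long value = 1; };

template<unsigned N>
struct Choose<N, N> { static const unsigned long long value = 1; };

// C(0, 0) matches both partial specializations, so resolve the ambiguity.
template<>
struct Choose<0, 0> { static const unsigned long long value = 1; };
```

The compiler instantiates each Choose<N,K> only once, so it effectively builds the needed part of the triangle with memoization.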
Example: Maximum
Consider calculating the maximum of a set of values
A loop could take more time than the maximum calculations
Maximum Template
#include <algorithm>
/*
* Determine the maximum value in a fixed-size array without looping.
* The following code is equivalent to
* T maxValue = values[ 0 ];
* for (unsigned i = 1; i < Size; i++)
* {
* maxValue = std::max( maxValue, values[ i ] );
* }
*/
/*
* The two partially specialized templates below terminate recursion by the
* compiler. If there are two elements to compare, return the index of the maximum
* of the two. If there's only one element, it must be the maximum by definition.
*/
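The template listing described by these comments is not reproduced here. The following is a minimal sketch of one way to write it, assuming a divide-and-conquer split; it is not necessarily the code that produced the assembly in this section. MaximumIndex follows the name used in the headings, but its exact interface is an assumption.

```cpp
// General case: split the range in half, find the best index in each half,
// then compare the two winners. Size must be at least 1.
template<typename T, unsigned Size>
struct MaximumIndex
{
    unsigned operator()(const T *values) const
    {
        const unsigned half = Size / 2;
        const unsigned left = MaximumIndex<T, half>()(values);
        const unsigned right = half + MaximumIndex<T, Size - half>()(values + half);
        return (values[left] > values[right]) ? left : right;
    }
};

// Two elements: a single comparison.
template<typename T>
struct MaximumIndex<T, 2>
{
    unsigned operator()(const T *values) const
    {
        return (values[0] > values[1]) ? 0 : 1;
    }
};

// One element: it is the maximum by definition.
template<typename T>
struct MaximumIndex<T, 1>
{
    unsigned operator()(const T *) const { return 0; }
};
```

Since Size is a compile-time constant, the recursion is expanded by the compiler and no loop counter remains in the generated code.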
Here's the optimized (-O3) assembly generated by a call to calculate the maximum index for 6 values
MaximumIndex Assembly
movss 4(%rdi), %xmm0
movl $2, %ecx
ucomiss 8(%rdi), %xmm0
jbe LBB0_2
movl $1, %ecx
LBB0_2:
movss (%rdi), %xmm0
xorl %eax, %eax
ucomiss (%rdi,%rcx,4), %xmm0
ja LBB0_4
movl %ecx, %eax
LBB0_4:
movss 16(%rdi), %xmm0
movl $2, %ecx
ucomiss 20(%rdi), %xmm0
jbe LBB0_6
movl $1, %ecx
LBB0_6:
movss 12(%rdi), %xmm0
movl $3, %edx
ucomiss 12(%rdi,%rcx,4), %xmm0
ja LBB0_8
addl $3, %ecx
movl %ecx, %edx
LBB0_8:
movss (%rdi,%rax,4), %xmm0
ucomiss (%rdi,%rdx,4), %xmm0
ja LBB0_10
movl %edx, %eax
LBB0_10:
Here's the C++ loop equivalent with the optimized assembly shown
Maximum Index in C++
/* The following C++ code translates into this assembly language
*
* pushq %rbp
* movq %rsp, %rbp
* xorl %eax, %eax
* cmpl $2, %esi
* jb LBB0_5
* xorl %edx, %edx
* movl $1, %ecx
* LBB0_2:
* movl %edx, %eax
* ucomiss (%rdi,%rax,4), %xmm0
* movl %ecx, %eax
* ja LBB0_4
* movl %edx, %eax
* LBB0_4:
* incq %rcx
* cmpl %ecx, %esi
* movl %eax, %edx
* jne LBB0_2 # LOOP HERE
* LBB0_5:
* popq %rbp
* ret
*
* Find the index of the maximum value in the array and store it
* in the variable maxIndex.
*/
Units
Consider the case of writing a scientific program dealing with force, mass, acceleration, current, voltage, etc.
Want to make sure that units are consistent throughout the program
Boost Units: http://www.boost.org/doc/libs/1_55_0/doc/html/boost_units/Quick_Start.html
Dimensions are added to units through template metaprogramming techniques
Because dimensions are unique and known at compile time, bad assignments are compiler errors
Checks are made at compile time and there is no runtime penalty
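To make the idea concrete, here is a minimal sketch of dimension checking through template parameters. It is not how Boost Units is implemented; the Quantity template and the Time, Current, and Charge names are hypothetical and track only two dimensions.

```cpp
/*
 * A quantity tagged with the exponents of two base dimensions: time and
 * current. Quantity<1,0> is a time, Quantity<0,1> is a current, and
 * Quantity<1,1> is a charge (current multiplied by time).
 */
template<int TimeExp, int CurrentExp> struct Quantity
{
    double value;
    explicit Quantity(const double v) : value( v ) {}
};

typedef Quantity<1, 0> Time;
typedef Quantity<0, 1> Current;
typedef Quantity<1, 1> Charge;

/* Addition is only defined for operands with identical dimensions. */
template<int T, int C>
Quantity<T, C> operator+(const Quantity<T, C> &a, const Quantity<T, C> &b)
{
    return( Quantity<T, C>( a.value + b.value ) );
}

/* Multiplication adds the exponents of each dimension. */
template<int T1, int C1, int T2, int C2>
Quantity<T1 + T2, C1 + C2> operator*(const Quantity<T1, C1> &a,
                                     const Quantity<T2, C2> &b)
{
    return( Quantity<T1 + T2, C1 + C2>( a.value * b.value ) );
}
```

With these definitions, Charge q = Time( 2.0 ) * Current( 3.0 ); compiles, while Time( 2.0 ) + Current( 3.0 ) is rejected by the compiler because no operator+ matches operands of different dimensions. The dimension information exists only in the types, so the generated code operates on plain doubles.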
Boost Units
/*
* This program uses trapezoidal integration to calculate total charge from
* current/time measurements
*/
#include <iostream>
#include "boost/units/systems/si/current.hpp"
#include "boost/units/systems/si/electric_charge.hpp"
#include "boost/units/systems/si/io.hpp"
#include "boost/units/systems/si/time.hpp"
/*
* Use a namespace shortcut because the names get too long otherwise.
*/
/*
*
* The basic entity in Boost Units is "quantity". We're going to shortcut
* long Boost names with our own shorter names for clarity.
*/
/*
* These are the time/value pairs for measured current. Boost Units uses
* seconds and amps, so the measurements, which are in nanoseconds and
* milliamps, need to be scaled accordingly
*/
/*
* Iterate over all time/value pairs to calculate the charge
*/
Charge q = 0 * si::coulombs;
const unsigned limit = sizeof( t ) / sizeof( t[ 0 ] ) - 1;
for (unsigned i = 0; i < limit; i++)
{
q += (t[ i + 1 ] - t[ i ]) * ((j[ i ] + j[ i + 1 ]) / two);
}
/*
* Total charge is in Coulombs
*/
std::cout << "Total charge: " << q << std::endl;
}
Programming Constructs
The examples above show how a general loop looks via recursion
Using template metaprogramming, we can do loops and conditionals (if-then-else and switch statements)
This makes template metaprogramming Turing complete
In practice, what can be computed is limited by the compiler implementation (for example, its maximum template instantiation depth)
A loop can be abstracted as follows
Template Loop
/*
* This shows how a loop is converted into a recursive template expansion.
* This particular code obviously does nothing. However, the general form
* can be expanded upon where a loop is required
*/
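A minimal sketch of the general form, assuming a hypothetical Loop template and a hypothetical SumIndices functor as the loop body. The recursion counts down to an explicit specialization for zero, which is what stops the compiler's expansion.

```cpp
/*
 * Loop<Count>::run( body ) is equivalent to
 *   for (unsigned i = 0; i < Count; i++) { body( i ); }
 * The compiler expands the loop by recursively instantiating the template.
 */
template<unsigned Count> struct Loop
{
    template<typename Body> static void run(Body &body)
    {
        Loop<Count - 1>::run( body );   // iterations 0 .. Count-2
        body( Count - 1 );              // final iteration
    }
};

/* Zero iterations remaining: terminate the recursion. */
template<> struct Loop<0>
{
    template<typename Body> static void run(Body &)
    {
    }
};

/* A hypothetical loop body that sums the indices it is called with. */
struct SumIndices
{
    unsigned total;
    SumIndices() : total( 0 ) {}
    void operator()(const unsigned i) { total += i; }
};
```

For example, after SumIndices sum; Loop<5>::run( sum ); the member sum.total holds 0 + 1 + 2 + 3 + 4.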
If-Then-Else in Templates
/*
* Create a namespace named "IfDetail". The reason that we do this is because
* we need to create explicit specializations of some template classes, and
* that can only be done in a namespace, not in a class
*/
namespace IfDetail
{
/*
* Forward declare a class, SelectClause, that takes 3 parameters:
* Cond, Then, and Else. Then partially specialize SelectClause for the
* cases in which Cond is true and Cond is false. If Cond is true, we're
* going to define a type, Selected, to be equivalent to the type Then.
* If Cond is false, we'll define Selected to be the same type as Else.
*/
template<bool Cond,typename Then,typename Else> struct SelectClause;
template<typename Then,typename Else> struct SelectClause<true,Then,Else>
{
typedef Then Selected;
};
template<typename Then,typename Else> struct SelectClause<false,Then,Else>
{
typedef Else Selected;
};
}
/*
* Define the if-then-else statement in terms of the condition, a then type,
* and an else type. The value of the condition will determine which type is
* defined as "Result"
*/
/*
* Here's an example of how such a structure might be used
*/
#include <iostream>
struct TrueStruct
{
static inline void execute()
{
std::cout << "In true branch." << std::endl;
}
};
struct FalseStruct
{
static inline void execute()
{
std::cout << "In false branch." << std::endl;
}
};
/*
* This is a trivial example in which we've hardcoded the first parameter to
* be true. However, consider the case in which the first parameter is itself
* a value returned by another template construct
*/
...
If<true,TrueStruct,FalseStruct>::Result::execute();
...
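One way the If wrapper described in the comments might be written is sketched here. The SelectClause helper is repeated so that the sketch stands alone; ThenBranch, ElseBranch, and Chosen are hypothetical names used only to demonstrate the selection.

```cpp
namespace IfDetail
{
    template<bool Cond,typename Then,typename Else> struct SelectClause;
    template<typename Then,typename Else> struct SelectClause<true,Then,Else>
    {
        typedef Then Selected;
    };
    template<typename Then,typename Else> struct SelectClause<false,Then,Else>
    {
        typedef Else Selected;
    };
}

/* Result is the type Then when Cond is true and the type Else otherwise. */
template<bool Cond,typename Then,typename Else> struct If
{
    typedef typename IfDetail::SelectClause<Cond,Then,Else>::Selected Result;
};

/* Hypothetical branch types, each identifying itself by a number. */
struct ThenBranch { static int id() { return( 1 ); } };
struct ElseBranch { static int id() { return( 2 ); } };

/* The condition can be any compile-time constant expression. */
typedef If<(sizeof( char ) == 1),ThenBranch,ElseBranch>::Result Chosen;
```

Because sizeof( char ) is 1 by definition, Chosen is ThenBranch. No branch is evaluated at runtime; the unselected type is simply never used.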
Switch statements are also possible using the same technique
Switch in Templates
/*
* This code is similar to the code in the if-then-else example except that
* it allows selection from more than 2 elements.
*/
namespace SwitchDetail
{
/*
* Forward declare a switch selector. Note that there must be as many
* types (T0, T1, and T2) as there are selections in the switch statement.
* Then specialize the Select class for an Index value of 0, 1, and 2.
* The selected implementation will define the type associated with the
* typedef name "Selected".
*/
/*
* The switch defines a type, Result, from T0, T1, and T2 based upon the
* value of Index used in the invocation.
*/
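A minimal sketch matching the comments above, assuming a three-way switch. The names Select and Selected follow the comments; Switch, Case0, Case1, and Case2 are assumptions made for illustration.

```cpp
namespace SwitchDetail
{
    /* Forward declaration: only Index values 0, 1, and 2 are defined. */
    template<unsigned Index,typename T0,typename T1,typename T2> struct Select;
    template<typename T0,typename T1,typename T2> struct Select<0,T0,T1,T2>
    {
        typedef T0 Selected;
    };
    template<typename T0,typename T1,typename T2> struct Select<1,T0,T1,T2>
    {
        typedef T1 Selected;
    };
    template<typename T0,typename T1,typename T2> struct Select<2,T0,T1,T2>
    {
        typedef T2 Selected;
    };
}

/* Result is T0, T1, or T2 depending upon the value of Index. */
template<unsigned Index,typename T0,typename T1,typename T2> struct Switch
{
    typedef typename SwitchDetail::Select<Index,T0,T1,T2>::Selected Result;
};

/* Hypothetical case types, each identifying itself by a number. */
struct Case0 { static int id() { return( 0 ); } };
struct Case1 { static int id() { return( 1 ); } };
struct Case2 { static int id() { return( 2 ); } };
```

An Index outside the range 0 to 2 names a Select specialization that was never defined, so a mistaken selection is reported by the compiler rather than at runtime.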
Appendix
Contents
Public, Private, and Contract Programming
Templates and Iterators
Creating Classes - A Guide
Compiler-Generated Class Members
Some people may have a problem determining when an API should be private as opposed to public (or protected). A good rule of thumb
is that a public API can be called by a programmer at any time, without restriction. This is a guideline and not set in concrete. If, on the
other hand, an API has a restricted purpose and cannot be called in an unrestricted manner, it may be a good candidate to be a private API.
Consider a C++ class to parse a C source file. The parser may have a number of special-purpose APIs for various C constructs. For example, the
parser may have APIs like parseVariable(), parseWhileLoop(), parseSwitchStatement(), etc. When parsing a complete C source
file, each of these APIs may be called, depending on the content of the file. Making these APIs public makes no sense since the API cannot be
called irrespective of the parsing context. However, a top-level API, like parseFile(), makes sense as a public API because the entire file is
treated as an atomic entity that is parsed in full. This API can be called at will because it's context insensitive.
Contract programming also applies to private APIs. However, it's very possible that private API contracts are much more restrictive than public
APIs. This is most easily explained using examples. Consider a class that performs multi-threaded processing. The class supports three
different atomic functions: A, B, and AB (an atomic combination of A & B). Furthermore, all three actions are speed sensitive. The following class
might be defined:
Multi-threading and Private APIs
#include <pthread.h>
class Example
{
pthread_mutex_t mLock;
private:
void doA();
void doB();
public:
...
void executeA();
void executeB();
void executeAB();
...
};
void Example::doA()
{
// Implementation note: mLock MUST be locked before calling
// this API or the application may crash
...
}
void Example::doB()
{
// Implementation note: mLock MUST be locked before calling
// this API or the application may crash
...
}
void Example::executeA()
{
pthread_mutex_lock( &mLock );
doA();
pthread_mutex_unlock( &mLock );
}
void Example::executeB()
{
pthread_mutex_lock( &mLock );
doB();
pthread_mutex_unlock( &mLock );
}
void Example::executeAB()
{
pthread_mutex_lock( &mLock );
doA();
doB();
pthread_mutex_unlock( &mLock );
}
The APIs doA() and doB() require that mLock is locked. The lock could have been placed inside of each of these APIs, but then there would be
no guarantee that the API executeAB() would actually execute both A and B as an atomic operation. The APIs doA() and doB() could use a
call to pthread_mutex_trylock() (as shown below), but this would violate the speed requirement as pthread calls can be very time
consuming.
class Example
{
pthread_mutex_t mLock;
private:
void doA();
void doB();
public:
...
void executeA();
void executeB();
void executeAB();
...
};
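void Example::doA()
{
// The call to pthread_mutex_trylock() is slow. It returns 0 when the
// lock was acquired here, in which case it must also be released here.
const bool locked = (pthread_mutex_trylock( &mLock ) == 0);
...
if (locked)
{
pthread_mutex_unlock( &mLock );
}
}
void Example::doB()
{
// The call to pthread_mutex_trylock() is slow. It returns 0 when the
// lock was acquired here, in which case it must also be released here.
const bool locked = (pthread_mutex_trylock( &mLock ) == 0);
...
if (locked)
{
pthread_mutex_unlock( &mLock );
}
}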
void Example::executeA()
{
doA();
}
void Example::executeB()
{
doB();
}
void Example::executeAB()
{
// Lock here so that doA() and doB() won't lock or unlock
// mLock. Thus, A and B will be invoked atomically.
pthread_mutex_lock( &mLock );
doA();
doB();
pthread_mutex_unlock( &mLock );
}
The point of this example is to illustrate that constraints may be placed upon the use of private APIs that could not be enforced for public
APIs. Because private APIs can only be called by other APIs in the class, additional operational constraints may be placed upon them
that would be impractical or unenforceable if the APIs were public.
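The same contract can be sketched with the C++11 standard library, where std::lock_guard releases the lock automatically when it goes out of scope. The Counter class and its members are hypothetical; std::mutex and std::lock_guard are the only library names used.

```cpp
#include <mutex>

class Counter
{
        std::mutex mLock;
        int mA;
        int mB;
    private:
        // Contract: mLock MUST be held by the caller of doA() and doB()
        void doA() { ++mA; }
        void doB() { ++mB; }
    public:
        Counter() : mA( 0 ), mB( 0 ) {}
        void executeA()
        {
            std::lock_guard<std::mutex> guard( mLock );
            doA();
        }
        void executeB()
        {
            std::lock_guard<std::mutex> guard( mLock );
            doB();
        }
        void executeAB()
        {
            // One lock covers both calls, so A and B happen atomically
            std::lock_guard<std::mutex> guard( mLock );
            doA();
            doB();
        }
        int countA()
        {
            std::lock_guard<std::mutex> guard( mLock );
            return( mA );
        }
        int countB()
        {
            std::lock_guard<std::mutex> guard( mLock );
            return( mB );
        }
};
```

The private APIs stay free of locking overhead, and the public APIs are the only places in which the contract has to be honored.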
Another example of contract programming and explicit constraints on private APIs appears throughout this course: many of the classes
presented have the following structure.
Generic Class
class Generic
{
private:
void assign(const Generic &copy);
void destroy();
void init();
public:
Generic();
Generic(const Generic &copy);
~Generic();
/*****************************************************************************/
/* Private inline APIs below */
/*****************************************************************************/
/*****************************************************************************/
/* Public inline APIs below */
/*****************************************************************************/
inline Generic::Generic()
{
init();
}
inline Generic::~Generic()
{
// Destroy the instance and reset the instance so that any mistaken attempt
// to access the contents after destruction won't return seemingly valid
// data
destroy();
init();
}
Why create three separate APIs (assign(), destroy(), and init())? Factoring the code in this manner means that code need not be copied
multiple times, potentially introducing problems as the class is enhanced. For example, if more constructors are added, each one can simply call init() or
assign(). The price is a contract on the caller: if the class has many member variables that use dynamic memory, calling any single API in isolation may leave the class
instance in an undefined state. For example, failure to call destroy() before calling assign() may result in a memory leak if the dynamic
memory is not returned to the heap.
Part of the reason for this structure is that, before C++11, a constructor could not invoke another constructor of the same class. C++11 removes that restriction with delegating constructors. This helps
the aforementioned situation, but doesn't eliminate the general issue: private APIs may have explicit constraints upon their use that must be
followed to guarantee correct operation. It is not always possible to ensure that all private APIs are fully self-contained and that they can be
invoked at will. It is the responsibility of the programmer to ensure that such constraints are well documented and, if possible, cannot be violated.
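A minimal sketch of how the three private APIs might cooperate in a class that owns dynamic memory. The Name class, its mText member, and the duplicate() helper are hypothetical; the comments state the contract that each private API places upon its caller.

```cpp
#include <cstring>

class Name
{
        char *mText;
    private:
        // Contract: the instance must not own memory when this is called
        void assign(const Name &copy);
        // Contract: must be followed by a call to init() or assign()
        void destroy();
        // Contract: the instance must not own memory when this is called
        void init();
        // Contract: same as assign()
        void duplicate(const char *text);
    public:
        Name();
        explicit Name(const char *text);
        Name(const Name &copy);
        ~Name();
        Name &operator=(const Name &copy);
        const char *get() const;
};

inline void Name::init() { mText = 0; }
inline void Name::destroy() { delete [] mText; }
inline void Name::duplicate(const char *text)
{
    if (text)
    {
        mText = new char[ std::strlen( text ) + 1 ];
        std::strcpy( mText, text );
    }
    else
    {
        mText = 0;
    }
}
inline void Name::assign(const Name &copy) { duplicate( copy.mText ); }

inline Name::Name() { init(); }
inline Name::Name(const char *text) { duplicate( text ); }
inline Name::Name(const Name &copy) { assign( copy ); }
inline Name::~Name()
{
    destroy();
    init();
}
inline Name &Name::operator=(const Name &copy)
{
    if (this != &copy)
    {
        // Skipping destroy() here would leak the old text
        destroy();
        assign( copy );
    }
    return( *this );
}
inline const char *Name::get() const { return( mText ? mText : "" ); }
```

Every public API honors the contracts internally, so a user of Name never needs to know that they exist.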
Consider a class that prefetches the contents of a file in one thread (a producer thread) while another thread operates upon the data read (a
consumer thread). Such a class might look like the following:
Prefetch Buffer
class PrefetchBuffer
{
...
public:
/*
* Define the interface for a const iterator. There is no non-const
* iterator since this data represents data read from disk and, therefore,
* should be constant for a given PrefetchBuffer<Size> instance.
*/
class const_iterator
{
public:
/*
* Required iterator types in order to work with other standard
* containers and classes that work with iterators
*/
private:
/*
* Variable dictionary:
* o mOwner: The PrefetchBuffer<Size> instance that the iterator is
* iterating through
* o mPosition: The byte position within the file that the
* PrefetchBuffer<Size> instance is reading
* o mIsEndIterator: True if this iterator represents the "end"
* iterator of mOwner
*/
PrefetchBuffer *mOwner;
off_t mPosition;
bool mIsEndIterator;
public:
...
/*
* Equality/inequality comparisons
*/
bool operator==(const const_iterator &other) const;
bool operator!=(const const_iterator &other) const;
bool operator<(const const_iterator &other) const;
bool operator<=(const const_iterator &other) const;
bool operator>(const const_iterator &other) const;
bool operator>=(const const_iterator &other) const;
...
};
...
};
A random access iterator must provide 6 different comparison operators as shown. Here's how such operators might look.
...
Basically, these operators all follow the same formula, with the exception of the comparison itself. Such APIs are excellent candidates for
templates. Changing the code (as shown below) achieves the desired result.
Enhanced Comparison Operators
class PrefetchBuffer
{
...
public:
class const_iterator
{
template<typename Compare>
bool compare(const const_iterator &other, Compare comp) const
{
PUSH_ASSERT( mOwner == other.mOwner );
const off_t size = mOwner->fileSize();
const off_t thisPos = mIsEndIterator ? size : mPosition;
const off_t thatPos = other.mIsEndIterator ? size : other.mPosition;
return( comp( thisPos, thatPos ) );
}
...
};
...
};
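With compare() in place, each operator reduces to a single line that supplies a standard function object from <functional>. The following is a minimal sketch, assuming a hypothetical Cursor class as a stand-in for the iterator: it stores the file size directly instead of a pointer to its owner, and uses assert() in place of the course's PUSH_ASSERT macro.

```cpp
#include <cassert>
#include <functional>

class Cursor
{
        long mFileSize;
        long mPosition;
        bool mIsEndIterator;

        template<typename Compare>
        bool compare(const Cursor &other, Compare comp) const
        {
            // Both cursors must refer to the same file
            assert( mFileSize == other.mFileSize );
            const long thisPos = mIsEndIterator ? mFileSize : mPosition;
            const long thatPos =
                other.mIsEndIterator ? mFileSize : other.mPosition;
            return( comp( thisPos, thatPos ) );
        }
    public:
        Cursor(const long fileSize, const long position, const bool isEnd) :
            mFileSize( fileSize ),
            mPosition( position ),
            mIsEndIterator( isEnd )
        {
        }
        bool operator==(const Cursor &other) const
        {
            return( compare( other, std::equal_to<long>() ) );
        }
        bool operator!=(const Cursor &other) const
        {
            return( compare( other, std::not_equal_to<long>() ) );
        }
        bool operator<(const Cursor &other) const
        {
            return( compare( other, std::less<long>() ) );
        }
        bool operator<=(const Cursor &other) const
        {
            return( compare( other, std::less_equal<long>() ) );
        }
        bool operator>(const Cursor &other) const
        {
            return( compare( other, std::greater<long>() ) );
        }
        bool operator>=(const Cursor &other) const
        {
            return( compare( other, std::greater_equal<long>() ) );
        }
};
```

The handling of the "end" position now lives in exactly one place, so a change to that logic cannot be applied to some operators and forgotten in others.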
Creating A Class
$ create-class SomeObject
#ifndef _SomeObject_h_
#define _SomeObject_h_
/**
* \brief
* \details
*
* \author John F. Croix
* \copyright Copyright (C) 2014, Cadence Design Systems, Inc
*/
class SomeObject
{
/*
* Variable dictionary:
* o
*/
private:
void assign(const SomeObject &copy);
void destroy();
void init();
public:
SomeObject();
SomeObject(const SomeObject &copy);
~SomeObject();
/*****************************************************************************/
/* Private inline APIs below */
/*****************************************************************************/
/*****************************************************************************/
/* Public inline APIs below */
/*****************************************************************************/
inline SomeObject::SomeObject()
{
init();
}
inline SomeObject::~SomeObject()
{
destroy();
init();
}
#endif
namespace SomeNameSpace
{
/**
* \brief
* \details
*
* \author John F. Croix
* \copyright Copyright (C) 2014, Cadence Design Systems, Inc
*/
class SomeObject
{
/*
* Variable dictionary:
* o
*/
private:
void assign(const SomeObject &copy);
void destroy();
void init();
public:
SomeObject();
SomeObject(const SomeObject &copy);
~SomeObject();
/**************************************************************************/
/* Private inline APIs below */
/**************************************************************************/
/**************************************************************************/
/* Public inline APIs below */
/**************************************************************************/
inline SomeObject::SomeObject()
{
init();
}
inline SomeObject::~SomeObject()
{
destroy();
init();
}
#endif
Templates
For template classes, I start with one of the above and modify accordingly
I use C macros because there's a lot of typing otherwise that only gets in the way of the code
Templates
#ifndef _SomeObject_h_
#define _SomeObject_h_
/**
* \brief
* \details
*
* \author John F. Croix
* \copyright Copyright (C) 2014, Cadence Design Systems, Inc
*/
private:
void assign(const SomeObject &copy);
void destroy();
void init();
public:
SomeObject();
SomeObject(const SomeObject &copy);
~SomeObject();
/*****************************************************************************/
/* Private inline APIs below */
/*****************************************************************************/
/*****************************************************************************/
/* Public inline APIs below */
/*****************************************************************************/
INLINE CLASS::SomeObject()
{
init();
}
INLINE CLASS::~SomeObject()
{
destroy();
init();
}
#undef CLASS
#undef INLINE
#undef TEMPLATE
#endif
Inheritance
When dealing with inheritance, my base class is altered so that assign(), init(), and destroy() are protected instead of private
The destructor is usually virtual (but not always)
The constructors are typically protected
I eliminate the operator=() for the base class
The inheriting class is then altered to invoke the base class as required
Base Class
#ifndef _Base_h_
#define _Base_h_
/**
* \brief
* \details
*
* \author John F. Croix
* \copyright Copyright (C) 2014, Cadence Design Systems, Inc
*/
class Base
{
/*
* Variable dictionary:
* o
*/
protected:
void assign(const Base &copy);
void destroy();
void init();
private:
/*
* Don't allow the assignment operator if the base class cannot be
* instantiated on its own
*/
protected:
/*
* If the base class cannot be instantiated on its own, make sure that
* the default and copy constructor are not public
*/
Base();
Base(const Base &copy);
public:
virtual ~Base();
};
/*****************************************************************************/
/* Protected inline APIs below */
/*****************************************************************************/
inline Base::Base()
{
init();
}
/*****************************************************************************/
/* Public inline APIs below */
/*****************************************************************************/
inline Base::~Base()
{
destroy();
init();
}
#endif
The child class needs to access assign() and destroy() at a minimum, which is why they're now protected
It may need to access init() if you add an API that requires that the base class re-initialize itself (like a clear() API that calls the
destroy() and init() APIs)
Child Class
#ifndef _Child_h_
#define _Child_h_
#include "Base.h"
/**
* \brief
* \details
*
* \author John F. Croix
* \copyright Copyright (C) 2014, Cadence Design Systems, Inc
*/
private:
void assign(const Child &copy);
void destroy();
void init();
public:
Child();
Child(const Child &copy);
~Child();
void clear();
Child &operator=(const Child &copy);
};
/*****************************************************************************/
/* Private inline APIs below */
/*****************************************************************************/
/*****************************************************************************/
/* Public inline APIs below */
/*****************************************************************************/
inline Child::Child() :
Base()
{
init();
}
inline Child::~Child()
{
destroy();
init();
}
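inline void Child::clear()
{
destroy();
Base::destroy();
Base::init();
init();
}
inline Child &Child::operator=(const Child &copy)
{
if (this != &copy)
{
destroy();
Base::destroy();
Base::assign( copy );
assign( copy );
}
return( *this );
}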
#endif
class A
{
public:
typedef std::complex<double> Complex;
private:
int mX;
Complex mY;
public:
int getX() const;
void setX(const int x);
const Complex &getY() const;
void setY(const Complex &y);
};
class A
{
public:
typedef std::complex<double> Complex;
private:
int mX;
Complex mY;
public:
A(const int x);
class A
{
public:
typedef std::complex<double> Complex;
private:
int mX;
Complex mY;
private:
/*
* Declare a default constructor but do not define it. This keeps
* the compiler from auto-generating one and will cause a compilation
* error if anybody tries to use it
*/
A();
public:
A(const int x);
int getX() const;
void setX(const int x);
const Complex &getY() const;
void setY(const Complex &y);
};
Copy Constructors
If a copy constructor is not present for a class, the compiler will automatically generate one, even if other constructors are present
The only way to keep the compiler from auto-generating a copy constructor is to declare one for the class
In the auto-generated copy constructor, copy construction for all member variables is performed
If the member variable is a class instance, its copy constructor is invoked
The copy constructor for POD (plain old data) types is a bit copy
This may not be what you want in the case of pointers due to the potential of a double deletion occurring for allocated
memory
A programmer should always explicitly have a copy constructor declared in their class
If they don't want to actually have one, make it private and provide no body for it
Prevent Compiler from Generating Copy Constructor
/*
* If a copy constructor isn't wanted, declare a private copy constructor and do not
* define it. Otherwise the compiler will generate one for the class automatically
*/
class A
{
...
private:
A(const A &other);
...
public:
...
};
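Since C++11, the same intent can be stated directly by declaring the member as deleted. Any attempted copy, even from inside the class, then fails at compile time rather than at link time. A minimal sketch, assuming a hypothetical class named NoCopy:

```cpp
#include <type_traits>

class NoCopy
{
    public:
        NoCopy() {}
        // Deleted members are declared but can never be called
        NoCopy(const NoCopy &other) = delete;
        NoCopy &operator=(const NoCopy &other) = delete;
};

// The standard type traits confirm that copying is unavailable
static_assert( !std::is_copy_constructible<NoCopy>::value,
               "NoCopy must not be copy constructible" );
static_assert( !std::is_copy_assignable<NoCopy>::value,
               "NoCopy must not be copy assignable" );
```

The private, undefined declaration remains the appropriate technique for compilers that predate C++11.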
Assignment Operators
Like the copy constructor, the compiler will automatically generate an assignment operator if one is not declared
The default assignment operator invokes the assignment operator for all underlying data members
A programmer should always explicitly have an assignment operator declared in their class
If the assignment operator is not desired for the class, make it private and provide no definition for it
Destructors
If no destructor is declared for a class, the compiler generates one automatically
Each member's destructor is invoked in the compiler-generated destructor
The destructor for a pointer variable does nothing, so any allocated memory that the pointer points to may be lost
In general it's good style to always provide a destructor
Summary
The default constructor is auto-generated if there is no user-declared constructor (note declared vs defined)
The copy constructor is auto-generated if there is no user-declared copy constructor
The assignment operator is auto-generated if there is no user-declared assignment operator
The destructor is auto-generated if there is no destructor declared for the class
In all 4 cases above, the corresponding operation is performed for each member variable in the class
In an auto-generated default constructor, each member variable's default constructor is invoked
In an auto-generated copy constructor, each member variable's copy constructor is invoked
In an auto-generated assignment operator, each member variable's assignment operator is invoked
In an auto-generated destructor, each member variable's destructor is invoked
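The member-wise behavior summarized above can be observed with a member type that counts how often each of its own operations runs. This is a minimal sketch; Tracker and Holder are hypothetical names.

```cpp
/* Counts how often each of its special member functions is invoked. */
struct Tracker
{
    static int sDefault;
    static int sCopy;
    static int sAssign;
    static int sDestroy;

    Tracker() { ++sDefault; }
    Tracker(const Tracker &) { ++sCopy; }
    Tracker &operator=(const Tracker &) { ++sAssign; return( *this ); }
    ~Tracker() { ++sDestroy; }
};

int Tracker::sDefault = 0;
int Tracker::sCopy = 0;
int Tracker::sAssign = 0;
int Tracker::sDestroy = 0;

/*
 * Holder declares no constructor, copy constructor, assignment operator, or
 * destructor, so the compiler generates all four. Each generated operation
 * invokes the corresponding Tracker operation once per member.
 */
struct Holder
{
    Tracker mFirst;
    Tracker mSecond;
};
```

Default constructing one Holder, copy constructing a second from it, assigning one to the other, and then destroying both invokes each Tracker constructor and the assignment operator twice, and the destructor four times.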
Syllabus
This course covers object-oriented (OO) programming generically, and delves into the C++ language specifically. Generic OO concepts are useful
across a broad array of programming languages including Java, Python, Perl 5, C++, C#, Smalltalk, and many, many others. The C++ language
was designed by Bjarne Stroustrup during his time at Bell Labs in the late 1970's and early 1980's. He was attempting to marry some of the OO
constructs of the Simula programming language to C while maintaining backward compatibility with C. Backward compatibility was a requirement
due to the millions of lines of C code already in place, but it led to compromises in the language. As a result, there are many things that C++ does
well from an OO standpoint, but others that remain awkward, to put it mildly. Despite its limitations, though, C++ continues to be one of the most
popular programming languages in use today, especially in environments in which execution speed and tight integration with hardware are critical.
After the generic OO constructs have been covered, C++ features and libraries are presented. In this section of the course, C++
implementation specifics, and the relationship of the language to the hardware, are given greater emphasis. The goal of this portion of the course is
to give the student a sufficient understanding of C++ to analyze implementation tradeoffs and achieve the desired speed or
memory footprint for a given object or module.
NOTE: The syllabus is still under construction and may change as the course content is finalized.