The C++ Programmer's Mindset: Learn computational, algorithmic, and systems thinking to become a better C++ programmer

Product type Paperback

Published in Nov 2025

Publisher Packt

ISBN-13 9781835888421

Length 398 pages

Edition 1st Edition

Languages

C++

Concepts

Programming Language

Author (1):

Sam Morley

Table of Contents (19) Chapters

Preface

1. Thinking Computationally

2. Abstraction in Detail

3. Algorithmic Thinking and Complexity

4. Understanding the Machine

5. Data Structures

6. Reusing Your Code and Modularity

7. Outlining the Challenge

8. Building a Simple Command-Line Interface

9. Reading Data from Different Formats

10. Finding Information in Text

11. Clustering Data

12. Reflecting on What We Have Built

13. The Problems of Scale

14. Dealing with GPUs and Specialized Hardware

15. Profiling Your Code

16. Unlock Your Exclusive Benefits

17. Other Books You May Enjoy

18. Index

Understanding algorithms

An algorithm is a set of instructions for taking data, subject to some conditions called preconditions, and producing an output, satisfying some postconditions. Being able to formulate, articulate, and understand algorithms is an essential skill for anyone who writes software. Algorithms are generally written in a pseudocode language that describes the steps in a language-agnostic way that should be comprehensible to anyone familiar with basic programming constructions.

Reasoning about algorithms and understanding their computational complexity is a much larger topic that we will return to in Chapter 3. In this section, we focus on how to read algorithms and understand what they do.

Before we continue, we need to understand some basic theory of computation. Broadly speaking, there are two (equivalent) models of computation: sequential (Turing machine) and functional (lambda calculus). In the sequential model, one starts at the beginning and performs one step at a time until the task is complete, whereas in the functional model, one tackles parts of the problem by recursive calls to routines. Most programming languages favor one model or the other, though it is common to take aspects from both models. For instance, C++ is primarily sequential, though C++ templates are functional. On the other hand, Haskell is a purely functional language. Regardless of whether you explicitly make use of either model, it is important that you have knowledge of how both models operate.
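As a small illustration of the two styles living side by side in C++, the following sketch computes a factorial twice: once with a loop that updates state step by step, and once with recursive template instantiation that the compiler evaluates. The names factorial_loop and Factorial are hypothetical and chosen only for this example; it assumes a C++17 compiler.

```cpp
#include <cstdint>

// Sequential style: start at the beginning and update state one step at a time.
constexpr std::uint64_t factorial_loop(unsigned n) {
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// Functional style: the value is defined in terms of a smaller instance of the
// same problem. The compiler evaluates it by recursive template instantiation.
template <unsigned N>
struct Factorial {
    static constexpr std::uint64_t value = N * Factorial<N - 1>::value;
};

// Base case that stops the recursion.
template <>
struct Factorial<0> {
    static constexpr std::uint64_t value = 1;
};

// Both formulations agree, and both can be checked at compile time.
static_assert(factorial_loop(5) == 120);
static_assert(Factorial<5>::value == 120);
```

The template version has no mutable state and no loop: repetition is expressed entirely through recursion and a base case, which is the hallmark of the functional model.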

Discussing algorithms is best done by means of example. We will now look at a very simple algorithm that will serve as a good introduction to the terms, and learn how to read the pseudocode descriptions of the steps of an algorithm.

Finding the maximum value in a list

Suppose you have a list of numbers (for simplicity, let’s say these are all integers) and you wish to find the maximum value contained therein. A very simple way to accomplish this is as follows:

Take the first element and store this as the current maximum.
For each of the remaining elements, compare it to the current maximum and replace the current maximum if the element is larger.
Return the current maximum, which is now the maximum of the whole list.

Unless you know more, it is hard to do better than this. To know that you have the maximum value, you must have compared the proposed maximum to all of the elements of the list and checked that no other element exceeds this value.

This is an algorithm, though it is not presented in the pseudocode language mentioned above. To formalize the procedure, we should translate from the plain language above into pseudocode, which is more similar to how it would be written in code. An example of an algorithm that finds the maximum value in a list of numbers is given here.

INPUT: L is a list of numbers with at least one element
OUTPUT: Maximum value of L
max <- first element of L
WHILE not at end of L
  current <- next element of L
  IF current > max
    max <- current
  END
END
RETURN max

The uppercase words are keywords that denote common operations such as conditionals, loops, inputs, outputs, and returning a value. The INPUT statement declares the preconditions on the data, and the OUTPUT statement declares the postconditions on the value that is provided by the RETURN statement. The <- denotes assignment, which keeps it fully distinct from the equality operator =. Notice that this form doesn’t make any reference to specific means of accessing the data; that is for the implementation to define based on the form of the data that is provided.

Let’s see how this translates to standard C++. We can write this as a function template that takes a “container” that has begin and end member functions and a nested type called value_type that supports <.

#include <stdexcept>

template <typename Container>
typename Container::value_type max_element(const Container& container) {
    auto begin = container.begin();
    auto end = container.end();
    // Precondition: L has at least one element
    if (begin == end) {
        throw std::invalid_argument("container must be non-empty");
    }
    // max <- first element of L
    auto max = *begin;
    ++begin;
    // WHILE not at end of L
    for (; begin != end; ++begin) {
        // current <- next element of L
        const auto& current = *begin;
        // IF current > max
        if (max < current) {
            max = current;
        }
    }
    return max;
}

This is a very general implementation that makes almost no assumptions about the form of the container or the element type that it contains. We throw an exception if the container is empty, which is one “correct” way to handle this case. The maximum of an empty collection is ill-defined: the condition that no element exceeds the candidate is vacuously true for any value. Another option would be to change the return type to std::optional<...> and return an empty value in this case. This has the advantage of potentially allowing noexcept to be added to the function declaration (provided that copying the elements cannot throw), which can reduce the runtime cost of calling this function. Of course, this implementation is for demonstration only; in practice, you should use std::max_element from the algorithm header, or its constrained C++20 counterpart std::ranges::max_element.
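The optional-returning alternative might look something like the following sketch. The name try_max_element is hypothetical, and the function is deliberately not marked noexcept here because copying an arbitrary element type may throw.

```cpp
#include <optional>
#include <vector>

// Hypothetical variant: reports an empty container by returning std::nullopt
// instead of throwing an exception.
template <typename Container>
std::optional<typename Container::value_type>
try_max_element(const Container& container) {
    auto it = container.begin();
    const auto end = container.end();
    if (it == end) {
        return std::nullopt;  // no maximum exists for an empty container
    }
    auto max = *it;  // max <- first element
    for (++it; it != end; ++it) {
        if (max < *it) {
            max = *it;
        }
    }
    return max;
}
```

The caller is now forced to acknowledge the empty case, for example by testing has_value() before using the result, rather than relying on a try block.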

Notice that the general structure of the implementation is exactly as set out in the pseudocode. This is by design. You might even want to annotate parts of your code with comments to indicate exactly which part of the algorithm is being implemented. This helps other developers (including your future self) understand how you have implemented the algorithm, and what the specific parts are supposed to do.

We could generalize this implementation further by taking an optional comparison function to be used instead of <, but this is complicated because the ordering must satisfy certain conditions for the maximum to be a well-defined and unique value. For instance, in some orderings, not all values are comparable, which would demand special handling in the implementation. This is beyond what we can cover at the moment.
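For the well-behaved case only, a comparison parameter could be sketched as follows. This assumes the supplied comparison is a strict weak ordering (the same requirement the standard library places on comparators), so it does nothing to address orderings in which some values are incomparable. The name max_element_by is hypothetical.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical generalization: `less(a, b)` must return true when a is ordered
// before b, and is assumed (not checked) to be a strict weak ordering.
template <typename Container, typename Compare = std::less<>>
typename Container::value_type
max_element_by(const Container& container, Compare less = {}) {
    auto it = container.begin();
    const auto end = container.end();
    if (it == end) {
        throw std::invalid_argument("container must be non-empty");
    }
    auto max = *it;
    for (++it; it != end; ++it) {
        if (less(max, *it)) {
            max = *it;
        }
    }
    return max;
}
```

With the default std::less<> this behaves like the original function; passing a comparison on string length, say, returns the longest string instead. When several elements are equivalent under the ordering, this sketch returns the first of them.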

Characteristics of an algorithm

Not all lists of instructions are algorithms. To earn that distinction, they must satisfy some reasonable conditions:

Finiteness: An algorithm must terminate after a finite number of iterations. The actual number of iterations will usually depend on the inputs (and outputs), and the number of iterations might grow rapidly, but it must eventually terminate.
Definiteness: The steps of an algorithm should be described precisely and unambiguously. The objective is to translate an algorithm into computer code, so enough information must be present in order to reasonably do this.
Inputs: An algorithm should have zero or more inputs that belong to well-defined sets (defined by the preconditions mentioned above).
Outputs: An algorithm should have one or more outputs, derived from the inputs using the steps of the algorithm.
Effectiveness: An algorithm should be effective in producing the desired output from the input parameters. The individual steps should be sufficiently basic that the process can be carried out exactly using pen and paper.

One usually turns to the rigor of mathematical proof to show that an algorithm satisfies these properties. For instance, mathematical induction can be used to prove that an algorithm is effective. The number of steps required by an algorithm usually depends on the size and nature of the inputs (and possibly the outputs). This relationship is called the complexity of the algorithm, which we shall discuss in more detail later, in Chapter 3.

The preconditions on the inputs should usually be checked in a good implementation of the algorithm. This can be done implicitly by means of static types, such as those in C++, or explicitly by conditional statements. There are various ways to do this, of course, depending on how robustly these checks should be performed.
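Both kinds of check can appear in the same function. In the sketch below, which uses a hypothetical mean function and assumes C++17, the requirement that the elements are numbers is enforced by the type system at compile time, while the requirement that the list is non-empty can only be checked at runtime.

```cpp
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical example: the arithmetic mean of a non-empty list of numbers.
template <typename T>
double mean(const std::vector<T>& values) {
    // Implicit (static) check: a non-numeric element type fails to compile.
    static_assert(std::is_arithmetic_v<T>, "mean requires a numeric element type");

    // Explicit (runtime) check: emptiness is not known until the call is made.
    if (values.empty()) {
        throw std::invalid_argument("values must be non-empty");
    }

    double sum = 0.0;
    for (const T& v : values) {
        sum += static_cast<double>(v);
    }
    return sum / static_cast<double>(values.size());
}
```

Throwing is only one choice for the runtime check; an assertion or an optional return value trades robustness against cost in different ways.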

One additional consideration when writing algorithms out is clarity. An algorithm is only useful to you and others if it can be understood and implemented. When presenting the pseudocode for an algorithm, you should make some effort to make sure the steps are clearly presented and easy to follow. The same way that decomposition can help solve problems, it can also help articulate their solutions, especially when the decomposition is “obvious.”

Let’s examine our algorithm for finding the maximum value of a list of numbers for these properties. The algorithm “visits” each element of the list exactly once, so for a list that contains $n$ elements, the algorithm will terminate after exactly $n$ steps. Thus the finiteness condition is satisfied. The algorithm is written clearly and unambiguously. The actual mode of traversing the list is not specified exactly, but, as we see in the C++ implementation, this is necessary to accommodate the different forms of “list” that might be available in any given programming language. (Not all languages have a std::vector, and not all containers support index access.) The only input is a list of numbers, which must satisfy the precondition of being non-empty. The only output is a single number that satisfies the postcondition of being the maximum value from the list. (A number $m$ is the maximum value of a set $S$ of numbers if $m$ is a member of $S$ and if each $x$ taken from $S$ satisfies $x \le m$.) The final condition is effectiveness. The steps listed do indeed produce the valid maximum value, and each step specifies exactly (though not specifically) one operation that must be performed.
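The postcondition can itself be written as code, which is handy in tests. The following sketch uses a hypothetical helper, is_maximum_of, that checks the two parts of the definition separately: the candidate must be a member of the list, and no element may exceed it.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true when `candidate` is a member of `values` and no
// element of `values` exceeds it.
inline bool is_maximum_of(const std::vector<int>& values, int candidate) {
    const bool is_member =
        std::find(values.begin(), values.end(), candidate) != values.end();
    const bool is_upper_bound =
        std::all_of(values.begin(), values.end(),
                    [candidate](int x) { return x <= candidate; });
    return is_member && is_upper_bound;
}
```

Note that for an empty list the upper-bound condition holds for every candidate, but the membership condition never does, so the helper returns false.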

Recursive algorithms

Not all problems have an algorithm that is so easy to write down. Let’s look at a more complicated example. Consider a very simple “language” defined by the following grammar.

letter ::= 'a' | 'b'
word   ::= letter | '[' word ',' word ']'

This language consists of “words” that consist of either a single “letter” (taken from an alphabet of two letters, ‘a’ and ‘b’) or a pair of words surrounded by square brackets and separated by a comma. The characters that appear in quotation marks are literals that must appear exactly as written. The other terms are as defined by the grammar. For instance, all of the following are valid words in this language.

a
[a,a]
[a,[a,b]]
[[a,a],[a,[a,b]]]

Notice that this language is recursive in nature. A word might contain a pair of words, so it is natural that algorithms to work with this language might also be recursive in nature.
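To make the correspondence concrete, here is a minimal sketch of a recursive recognizer in which each branch of the function mirrors one alternative of the grammar. The names parse_word and is_word are hypothetical, and the sketch assumes C++17 for std::optional and std::string_view.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: if a valid word starts at s[pos], return the index one
// past its last character; otherwise return std::nullopt.
std::optional<std::size_t> parse_word(std::string_view s, std::size_t pos) {
    if (pos >= s.size()) {
        return std::nullopt;
    }
    // word ::= letter
    if (s[pos] == 'a' || s[pos] == 'b') {
        return pos + 1;
    }
    // word ::= '[' word ',' word ']'
    if (s[pos] != '[') {
        return std::nullopt;
    }
    const auto after_first = parse_word(s, pos + 1);
    if (!after_first || *after_first >= s.size() || s[*after_first] != ',') {
        return std::nullopt;
    }
    const auto after_second = parse_word(s, *after_first + 1);
    if (!after_second || *after_second >= s.size() || s[*after_second] != ']') {
        return std::nullopt;
    }
    return *after_second + 1;
}

// True when the whole string is exactly one word of the grammar.
bool is_word(std::string_view s) {
    const auto end = parse_word(s, 0);
    return end.has_value() && *end == s.size();
}
```

Under these definitions, each of the four example words above is accepted, whereas strings such as "[a,b" or "ab" are rejected.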

Suppose that we want to design an algorithm to find the position of the end of the first valid word in a string. This problem is complicated because we need to make sure that every open bracket is correctly matched with its closing partner. There are ways to do this without recursion (simply counting brackets might be sufficient), but the purpose of this example is to demonstrate recursive algorithms. Here’s how this algorithm might be defined.

INPUT: String s that starts with a valid word
OUTPUT: the position of the last character of the first valid word
character <- get first character from s
IF character = 'a' or character = 'b'
    RETURN 0
END
position <- 0
# s[0] is a '['
position <- position + 1
# find the first word after '['
a <- substring of s starting from index position
i <- end index of first word from a
position <- position + i
# s[position + 1] is ','
position <- position + 1
# skip the ',' to reach the start of the second word
position <- position + 1
# get the end of first word after ','
b <- substring of s starting at index position
j <- end index of first word from b
position <- position + j
# s[position + 1] is a ']' matching s[0]
position <- position + 1
RETURN position

The lines prefixed by a # are comments that are there for exposition only. Notice that this algorithm invokes itself twice when there is a word that is not a letter. Recursing in this way is a natural means of ensuring that every opening bracket is matched with its closing partner. This is how we might implement the preceding algorithm in C++.

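#include <cassert>
#include <cstddef>
#include <string_view>

// Precondition (not checked): s starts with a valid word.
std::size_t end_of_first_word(std::string_view s) noexcept {
    // A valid word that does not start with '[' is a single letter.
    if (!s.starts_with('[')) {  // starts_with requires C++20
        return 0;
    }
    std::size_t position = 0;
    assert(s[position] == '[');
    position += 1;
    // find the first word after '['
    auto a = s.substr(position);
    auto i = end_of_first_word(a);
    position += i;
    position += 1;
    assert(s[position] == ',');
    position += 1;
    // find the end of the first word after ','
    auto b = s.substr(position);
    auto j = end_of_first_word(b);
    position += j;
    position += 1;
    assert(s[position] == ']');
    return position;
}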

This is not an optimal implementation, but that doesn’t matter right now. This is an exact translation of the pseudocode set out in the algorithm into C++, including assertions for the comments that describe what should be the case if our algorithm is correct.

We’re making use of the string_view class, introduced in C++17, which is a better way to work with non-owning strings than using raw const char* C-style strings. Using string_view will help ensure we don’t access memory outside of the string, which would be easily done with a C-style string. Moreover, it provides many convenience methods such as substr and starts_with (the latter was added in C++20). An alternative would be to work directly with a pair of iterators defining the range of values, but this is not much better than working with C-style strings.

We haven’t included any error checking in this implementation beyond the assertions, and the function is marked noexcept. This means that calling this function on a string that doesn’t start with a valid word violates its contract: depending on the input and the build settings, an assertion fails, the program terminates, or the behavior is undefined. The precondition on the string is that the string starts with a valid word, so it is the responsibility of the caller to ensure that this condition holds. Omitting the checks might be necessary on the “hot path” of a program, where checking for invalid strings might be too costly.

Writing out computations recursively is often easier than writing them out in a sequential manner, but one must remember that languages like C++ are not designed to work in this way. Recursive implementations might perform worse than a sequential implementation, since calling functions in C++ can be an expensive operation. Modern optimizing compilers might be able to inline function calls or otherwise reduce the cost of invoking these functions, by tail recursion optimization or otherwise. However, if the number of recursions cannot be known at compile time, the options are limited. Here is an example of how one might implement an algorithm that does not rely on recursion.

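#include <cstddef>
#include <string_view>

std::size_t end_of_first_word(std::string_view s) noexcept {
    std::size_t position = 0;
    int depth = 0;  // number of '[' not yet matched by a ']'
    for (const char c : s) {
        switch (c) {
            case '[': ++depth; break;
            case ']': --depth; [[fallthrough]];  // a ']' may close the first word
            default:
                // at depth 0, this character is the last one of the first word
                if (depth == 0) {
                    return position;
                }
        }
        ++position;
    }
    // no complete word was found; position equals s.size() here
    return position;
}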

You may notice that this implementation is more difficult to reason about. Moreover, this implementation cannot so easily be generalized if some other operation needs to be performed, other than simply finding the position of the end of the first valid word.
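As one illustration of the kind of extension the recursive form accommodates, the sketch below also reports the maximum nesting depth of the first word. Each recursive call returns information about its own sub-word, and the caller combines the two results. The names WordInfo and analyze_first_word are hypothetical, and the same unchecked precondition applies: the string must start with a valid word.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical result type for this example.
struct WordInfo {
    std::size_t end;    // index of the last character of the word
    std::size_t depth;  // maximum bracket nesting (0 for a single letter)
};

// Precondition (not checked): s starts with a valid word.
WordInfo analyze_first_word(std::string_view s) {
    if (s.front() != '[') {
        return {0, 0};  // a single letter
    }
    std::size_t position = 1;  // skip the opening '['
    const WordInfo first = analyze_first_word(s.substr(position));
    position += first.end + 2;  // step past the first word and the ','
    const WordInfo second = analyze_first_word(s.substr(position));
    position += second.end + 1;  // now at the matching ']'
    return {position, 1 + std::max(first.depth, second.depth)};
}
```

The structure of the original recursive function is unchanged; only the return type and the final combining step differ, which is what makes this form straightforward to adapt.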

The trade-off between flexibility and performance is a common dilemma for programmers. It is crucial to understand the properties of your solutions and design algorithms according to what is required by the context. If flexibility is not a concern, then it’s fine to optimize further and cut off the pathway to adding capabilities. However, one should never optimize until the performance is measured and the algorithm is known to underperform; measure twice, cut once.

This concludes the four components of computational thinking. Now we can see how modern features of C++ and good software engineering practices can help solve problems too.