0% found this document useful (0 votes)
33 views

Chapter 4 - Solving Problems With Pointers and Dynamic Memory - Think Like A Programmer

Uploaded by

İlkin
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views

Chapter 4 - Solving Problems With Pointers and Dynamic Memory - Think Like A Programmer

Uploaded by

İlkin
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

As with topics covered in previous chapters, you should have had some

exposure to basic pointer use, but to make sure we’re on the same page,

4 here’s a quick review.

SOLVING PROBLEMS WITH POINTERS AND Pointers in C++ are indicated with an asterisk (*). Depending on the
context, the asterisk indicates either that a pointer is being declared or
DYNAMIC MEMORY
that we mean the pointed-to memory, not the pointer itself. To declare a
pointer, we place the asterisk between the type name and the identifier:

int * intPointer;

This declares the variable intPointer as a pointer to an int. Note that


the asterisk binds with the identifier, not the type. In the following,
In this chapter, we’ll learn to solve problems using pointers and dynamic variable1 is a pointer to an int, but variable2 is just an int:
memory, which will allow us to write flexible programs that can accom-
modate data sizes that are unknown until the program runs. Pointers and int * variable1, variable2;
dynamic memory allocation are “hard-core” programming. When you
can write programs that grab blocks of memory on the fly, link them into An ampersand in front of a variable acts as the address-of operator. So

useful structures, and clean up everything at the end so there is no we could assign the address of variable2 to variable1 with:

residue, you’re not just someone who can do a little coding—you’re a


programmer. variable1 = &variable2;

Because pointers are tricky, and because many popular languages, We can also assign the value of one pointer variable to another

such as Java, appear to forgo the use of pointers, some fledgling program- directly:

mers will convince themselves that they can skip this subject entirely.
intPointer = variable1;
This is a mistake. Pointers and indirect memory access will always be
used in advanced programming, even though they may be hidden by the
Perhaps most importantly, we can allocate memory during runtime
mechanisms of a high-level language. Therefore, to truly think like a pro-
that can be accessed only through a pointer. This is accomplished with
grammer, you have to be able to think your way through pointers and
the new operator:
pointer-based problems.

double * doublePointer = new double;


Before we get down to solving pointer problems, though, we’re going
to carefully examine all aspects of how pointers work, both on the sur-
Accessing the memory at the other end of the pointer is known as
face and behind the scenes. This study provides two benefits. First, this
dereferencing and is accomplished with an asterisk to the left of a pointer
knowledge will allow us to make the most effective use of pointers.
identifier. Again, this is the same placement we would use for a pointer
Second, by dispelling the mysteries of pointers, we can employ them with
declaration. The context makes the meaning different. Here’s an example:
confidence.

Review of Pointer Fundamentals


➊ *doublePointer = 35.4; Resizable Data Structures

➋ double localDouble = *doublePointer;


We can also make pointer-based data structures that grow or shrink dur-
ing runtime as needed. The most basic resizable data structure is the
We assign a value to the double allocated by the previous code ➊ be-
linked list, which you may have already seen. Although the data in the
fore copying the value from this memory location to the variable local-
structure can be accessed only in sequential order, the linked list always
Double ➋.
has just as many places for data as it has data itself, with no wasted space.
Other, more elaborate pointer-based data structures, as you will see later,
To deallocate memory allocated with new, once we no longer need it,
have orderings and “shapes” that can reflect the relationship of the un-
we use the keyword delete:
derlying data better than an array can. Because of this, even though an

delete doublePointer; array offers full random-access that no pointer-based structure offers, the
retrieval operation (where we find the element in the structure that best

The mechanics of this process are described in detail in “Memory meets a certain criterion) can be much faster with a pointer-based struc-

Matters” on page 85. ture. We’ll use this benefit later in this chapter to create a data structure
for student records that grows as needed.
Benefits of Pointers
Memory Sharing
Pointers give us abilities not available with static memory allocation and
also provide new opportunities for efficient use of memory. The three Pointers can improve program efficiency by allowing memory blocks to

main benefits of using pointers are: be shared. For example, when we call a function, we can pass a pointer to
a block of memory instead of passing a copy of the block using reference
• Runtime-sized data structures parameters. You’ve most likely seen these before; they are parameters in
which an ampersand (&) appears between the type and the name in the
• Resizable data structures formal parameter list:

• Memory sharing void refParamFunction (int ➊& x) {


➋x = 10;
Let’s take a look at each of these in a bit more detail.
}

Runtime-Sized Data Structures


int number = 5;

By using pointers, we can make an array with a size determined at run- refParamFunction(➌number);

time, rather than having to choose the size before building our applica- cout << ➍number << "\n";

tion. This saves us from having to choose between potentially running out
of space in the array and making the array as large as could possibly be NOTE

needed, thereby wasting much of the array space in the average case. We
The spaces shown before and after the ampersand symbol are
first saw runtime data sizing in “Deciding When to Use Arrays” on page
not required—I just include them here for aesthetic reasons. In
74. We’ll use this concept later in this chapter, in “Variable-Length
other developers’ code, you may see int& x, int &x, or perhaps
Strings” on page 91.
even int&x.
Given pointers’ notorious reputation as one of the most difficult C++
In this code, the formal parameter x ➊ is not a copy of the argument features, you might think that no programmer would ever try to use a
number ➌; rather, it is a reference to the memory where number is stored. pointer when it isn’t necessary. I have been surprised many times, how-
Therefore, when x is changed ➋, the memory space for number is changed, ever, to find otherwise. Sometimes programmers simply trick themselves
and the output at the end of the code snippet is 10 ➍. Reference parame- into thinking a pointer is required. Suppose you are making a call to a
ters can be used as a mechanism to send values out of a function, as function written by someone else, from a library or application program-
shown in this example. More broadly, reference parameters allow the ming interface, perhaps, with the following prototype:
called function and the calling function to share the same memory, thus
lowering overhead. If a variable being passed as a parameter occupies a void compute(int input, int* output);
kilobyte of memory, passing the variable as a reference means copying a
32- or 64-bit pointer instead of the kilobyte. We can signal that we are us- We might imagine that this function is written in C, not C++, and that is

ing a reference parameter for performance, not its output, by using the why it uses a pointer rather than a reference (&) to make an “outgoing”

const keyword: parameter. In calling this function, a programmer might carelessly do


something like this:
int anotherFunction(const int & x);
int num1 = 10;

By prefixing the word const in the declaration of the reference param- int* num2 = new int;

eter x, anotherFunction will receive a reference to the argument passed in compute(num1, num2);

the call but will be unable to modify the value in that argument, just like
any other const parameter. This code is inefficient in space because it creates a pointer where
none is needed. Instead of the space for two integers, it uses the space for
In general, we can use pointers in this way to allow different parts of a two integers and a pointer. The code is also inefficient in time because the
program, or different data structures within the program, to have access unnecessary memory allocation takes time (as explained in the next sec-
to the same data without the overhead of copying. tion). Lastly, the programmer now has to remember to delete the allo-
cated memory. All of this could’ve been avoided by using the other aspect
When to Use Pointers of the & operator, which allows you to get the address of a statically allo-
cated variable, like this:
As we discussed with arrays, pointers have potential drawbacks and
should be used only when appropriate. How do we know when pointer int num1 = 10;
use is appropriate? Having just listed the benefits of pointers, we can say int num2;
that pointers should be used only when we require one or more of their compute(num1, &num2);
benefits. If your program needs a structure to hold an aggregate of data,
but you can’t accurately estimate how much data ahead of runtime; if you Strictly speaking, we’re still using a pointer in the second version, but
need a structure that can grow and shrink during execution; or if you we’re using it implicitly, without a pointer variable or dynamic memory
have large objects or other blocks of data being passed around your pro- allocation.
gram, pointers may be the way to go. In the absence of any of these situa-
tions, though, you should be wary of pointers and dynamic memory Memory Matters
allocation.
To understand how dynamic memory allocation gives us runtime sizing int functionB(int inputValue) {
and memory sharing, we have to understand a little bit about how mem- ➊return inputValue - 10;
ory allocation works in general. This is one of the areas where I think it }
benefits new programmers to learn C++. All programmers must eventu- int functionA(int num) {
ally understand how memory systems work in a modern computer, and int localVariable = functionB(num * 10);
C++ forces you to face this issue head-on. Other languages hide enough of return localVariable;
the dirty details of memory systems that new programmers convince }
themselves that these details are of no concern, which is simply not the int main()
case. Rather, the details are of no concern so long as everything is work- {
ing. As soon as there is a problem, however, ignorance of the underlying int x = 12;
memory models creates an insurmountable obstacle between the pro- int y = functionA(x);
grammer and the solution. return 0;
}
The Stack and the Heap

In this code, the main function calls functionA, which in turn calls func-
C++ allocates memory in two places: the stack and the heap. As the names
tionB. Figure 4-1 (b) shows a simplified version of how the runtime stack
imply, the stack is organized and neat, and the heap is disjointed and
would be arranged at the point right before we execute the return state-
messy. The name stack is especially descriptive because it helps you visu-
ment of functionB ➊. The activation records for all three functions would
alize the contiguous nature of the memory allocation. Think of a stack of
be arranged in a stack of contiguous memory, with the main function at
crates, as in Figure 4-1 (a). When you have a crate to store, you place it on
the bottom of the stack. (Just to make things extra confusing, it’s possible
the top of the stack. To remove a particular crate from the stack, you have
that the stack begins at the highest possible point in memory and is built
to first remove all the crates that are on top of it. In practical program-
downward to lower memory addresses rather than upward to higher
ming terms, this means that once you have allocated a block of memory
memory addresses. You do yourself no harm, though, by ignoring the pos-
(a crate) on the stack, there’s no way to resize it because at any time you
sibility.) Logically, the main function activation record is on the bottom of
may have other memory blocks immediately following it (other crates on
the stack, with the functionA activation record on top of it and the func-
top of it).
tionB activation record on top of functionA. Neither of the lower two acti-
vation records can be removed before functionB’s activation record is
In C++, you might explicitly create your own stack for use in a particu-
lar algorithm, but regardless, there is one stack your program will always removed.

be using, known as the program’s runtime stack. Every time a function is


called (and this includes the main function), a block of memory is allocated
on the top of the runtime stack. This block of memory is called an activa-
tion record. A full discussion of its contents is beyond the scope of this
text, but for your understanding as a problem-solver, the main content of
the activation record is the storage space for variables. Memory for all the
local variables, including the function’s parameters, is allocated within
the activation record. Let’s take a look at an example:
tions, we’ll end up with lots of gaps between the remaining allocated
memory blocks. This problem is known as memory fragmentation.

Every program has its own heap, from which memory is dynamically
allocated. In C++, this usually means an invocation of the new keyword,
but you will also see calls to the old C functions for memory allocation,
such as malloc. Each call to new (or malloc) sets aside a chunk of memory in
the heap and returns a pointer to the chunk, while each call to delete (or
free if the memory was allocated with malloc) returns the chunk to the
pool of available heap memory. Because of fragmentation, not all of the
memory in the pool is equally useful. If our program begins by allocating
variables A, B, and C in heap memory, we might expect those blocks to be
contiguous. If we deallocate B, the gap it leaves behind can be filled only
by another request that is of B’s size or smaller, until either A or C is also
deallocated.

Figure 4-2 clarifies the situation. In part (a), we see the floor of our
room littered with crates. At one point the room was probably well orga-

Figure 4-1: A stack of crates and a stack of function calls nized, but over time, the arrangement became haphazard. Now there is a
small crate (b) that cannot fit in any open space on the floor, even though
While a stack is highly organized, a heap, by contrast, has little organi- the overall unused floor area greatly exceeds the footprint of the crate. In
zation. Suppose you’re storing things in crates again, but these crates are part (c), we represent a small heap. The dashed-line squares are the
fragile and you can’t stack them on top of each other. You’ve got a big, ini- smallest (indivisible) chunks of memory, which might be a single byte, a
tially empty room to store the crates, and you can put them anywhere memory word, or something larger, depending on the heap manager. The
you want on the floor. The crates are heavy, however, so once you put shaded areas represent allocations of contiguous memory; for clarity, one
one down, you’d rather just leave it where it is until you’re ready to take allocation has some of its chunks numbered. As with the fragmented
it out of the room. This system has advantages and disadvantages com- floor, the fragmented heap has the unallocated memory chunks sepa-
pared to the stack. On the one hand, this storage system is flexible and al- rated, which reduces their usability. There are a total of 85 unused
lows you to get to the contents of any crate at any time. On the other chunks of memory, but the largest contiguous range of unused memory,
hand, the room is going to quickly become a mess. If the crates are all dif- as indicated by the arrow, is only 17 chunks long. In other words, if each
ferent sizes, it’s going to be especially difficult to make use of all of the chunk were a byte, this heap could not fulfill any request from an invoca-
available space on the floor. You’ll end up with a lot of gaps between tion of new for more than 17 bytes, even though the heap has 85 bytes free.
crates that are too small to fill with another crate. Because the crates can’t
be easily moved, removing several crates just creates several hard-to-fill
gaps rather than providing the wide-open storage of our original empty
floor. In practical programming terms, our heap is like the floor of that
room. A block of memory is a contiguous series of addresses; thus, over
the lifetime of a program with many memory allocations and dealloca-
of your work first, just to be safe. What should happen is that the pro-
gram halts and your operating system complains that the code generated
but did not handle a bad_alloc exception. This exception is thrown by new
when no block of unallocated memory in the heap is large enough to ful-
fill the request. Running out of heap memory is called a heap overflow. On
some systems, a heap overflow can be common, while on other systems, a
program will cause thrashing long before it produces a bad_alloc (on my
system, the new call didn’t fail until I had allocated two gigabytes in previ-
ous calls).

Figure 4-2: A fragmented floor, a crate that cannot be placed, and frag- A similar situation exists with the runtime stack. Each function call al-
mented memory locates space on the stack, and there is some fixed overhead for each acti-
vation record, even for a function with no parameters or local variables.
Memory Size The easiest way to demonstrate this is with a runaway recursive function:

The first practical issue with memory is limiting its use to what is neces-
➊ int count = 0;
sary. Modern computer systems have so much memory that it’s easy to
void stackOverflow() {
think of it as an infinite resource, but in fact each program has a limited
➋count++;
amount of memory. Also, programs need to use memory efficiently to
➌stackOverflow();
avoid overall system slowdown. In a multitasking operating system
}
(which means just about every modern operating system), every byte of
int main()
memory wasted by one program pushes the system as a whole toward the
{
point where the set of currently running programs doesn’t have enough
➍stackOverflow();
memory to run. At that point, the operating system constantly swaps out
return 0;
chunks of one program for another and thus grinds to a crawl. This condi-
}
tion is known as thrashing.

This code has a global variable ➊, which in most cases is bad style, but
Note that, beyond the desire to keep the overall program memory foot-
here I need a value that persists throughout all of the recursive calls. As
print as small as possible, the stack and the heap have maximum sizes. To
this variable is declared outside of the function, no memory is allocated
prove this, let’s allocate memory from the heap a kilobyte at a time, until
for it in the function’s activation record, nor are there any other local
something blows up:
variables or parameters. All the function does is increment count ➋ and
make a recursive call ➌. Recursion is discussed extensively in Chapter 6
const int intsPerKilobyte = 1024 / sizeof(int);
but is used here simply to make the chain of function calls as long as pos-
while (true) {
sible. The activation record of a function remains on the stack until that
int *oneKilobyteArray = new int[intsPerKilobyte];
function ends. So when the first call is made to stackOverflow from main ➍,
}
an activation record is placed on the runtime stack that cannot be re-

Let me emphasize that this is horrible code written purely to demon- moved until that first function call ends. This will never happen because

strate a point. If you try this code out on your system, you should save all the function makes a second call to stackOverflow, placing another activa-
tion record on the stack, which then makes a third call, and so on. These call delete on any of those variables, we have effectively deallocated the
activation records stack up until the stack runs out of room. On my sys- memory for all of the variables. If we don’t explicitly clear the variables
tem, count is around 4,900 when the program bombs. My development en- to NULL, they will be known as dangling references, and calling delete on
vironment, Visual Studio, defaults to a 1MB stack allocation, which means any of them will produce a runtime error.
that each of these function calls, even without any local variables or pa-
rameters, creates an activation record of over 200 bytes. Solving Pointer Problems

Lifetime By this point, you’re probably ready for some problems, so let’s look at a
couple and see how we can use pointers and dynamic memory allocation
The lifetime of a variable is the time span between allocation and deallo- to solve them. First we’ll work with some dynamically allocated arrays,
cation. With a stack-based variable, meaning either a local variable or a which will demonstrate how to keep track of heap memory through all of
parameter, the lifetime is handled implicitly. The variable is allocated our manipulations. Then we’ll get our feet wet with a truly dynamic
when the function is called and deallocated when the function ends. With structure.
a heap-based variable, meaning a variable dynamically allocated using
new, the lifetime is in our hands. Managing the lifetime of dynamically al- Variable-Length Strings
located variables is the bane of every C++ programmer. The most obvious
In this first problem, we’re going to create functions to manipulate
issue is the dreaded memory leak, a situation in which memory is allo-
strings. Here, we’re using the term in its most general sense: a sequence
cated from the heap but never deallocated and not referenced by any
of characters, regardless of how those characters are stored. Suppose we
pointer. Here’s a simple example:
need to support three functions on our string type.

➊ int *intPtr = new int;


➋ intPtr = NULL; PROBLEM: VARIABLE-LENGTH STRING MANIPULATION

In this code, we declare a pointer to an integer ➊, initializing it by allo- Write heap-based implementations for three required string functions:
cating an integer from the heap. Then in the second line, we set our inte-
ger pointer to NULL ➋ (which is simply an alias for the number zero). The append This function takes a string and a character and appends the char-
integer we allocated with new still exists, however. It sits, lonely and for- acter to the end of the string.
lorn, in its place in the heap, awaiting a deallocation that can never come.
We cannot deallocate the integer because to deallocate a block of mem- concatenate This function takes two strings and appends the characters of

ory, we use delete followed by a pointer to the block, and we no longer the second string onto the first.

have a pointer to the block. If we tried to follow the code above with
characterAt This function takes a string and a number and returns the
delete intPtr, we would get an error because intPtr is zero.
character at that position in the string (with the first character in the

Sometimes, instead of memory that never gets deallocated, we have string numbered zero).

the opposite problem, attempting to deallocate the same memory twice,


Write the code with the assumption that characterAt will be called fre-
which produces a runtime error. This might seem like an easy problem to
quently, while the other two functions will be called relatively seldom.
avoid: Just don’t call delete twice on the same variable. What makes this
The relative efficiency of the operations should reflect the calling
situation tricky is that we may have multiple variables pointing to the
frequency.
same memory. If multiple variables point to the same memory and we
case while also double-checking how each step transforms your sample to
make sure that you reach the desired output state. This technique is espe-
In this case, we want to choose a representation for our string that al-
cially helpful when dealing with pointers and dynamically allocated
lows for a fast characterAt function, which means we need a fast way to
memory, because so much of what happens in the program is outside of
locate a particular character. As you probably recall from the previous direct view. Following through a case on paper forces you to track all the
chapter, this is what an array does best: random access. So let’s solve this changing values in memory—not just those directly represented by vari-
problem using arrays of char. The append and concatenate functions change ables but also those in the heap.
the size of the string, which means we run into all the array problems we
discussed earlier. Because there’s no built-in limitation to the size of the Suppose we start with the string test, which is to say we have an array
string in this problem, we can’t pick a large initial size for our arrays and of characters in the heap with t, e, s, and t, in that order, and we want to
hope for the best. Instead, we’ll need to resize our arrays during runtime. append, using our function, an exclamation point. Figure 4-3 shows the
state of memory before (a) and after (b) this operation. In these diagrams,
To start off, let’s create a typedef for our string type. We know we’re
anything to the left of the dashed vertical line is stack memory (local vari-
going to be dynamically creating our arrays, so we need to make our ables or parameters) and anything to the right is heap memory, dynami-
string type a pointer to char. cally allocated using new.

typedef char * arrayString;

With that in place, let’s start on the functions. Using the principle of
starting with what we already know how to do, we can quickly write the
characterAt function.

char characterAt(arrayString s, int position) {


➊return s[position]; Figure 4-3: Proposed “before” (a) and “after” (b) states for append function
}
Looking at this figure, right away I’m seeing a potential issue for our
Recall from Chapter 3 that if a pointer is assigned the address of an ar- function. Based on our implementation approach for the strings, the func-
ray, we can access elements in the array using normal array notation ➊. tion is going to create a new array that is one element larger than the
Note, however, that bad things can happen if position is not actually a original array and copy all the characters from the first array to the sec-
valid element number for the array s, and this code places the responsi- ond. But how are we to know how large the first array is? From the previ-
bility of validating the second parameter on the caller. We’ll consider al- ous chapter, we know that we have to track the size of our arrays our-
ternatives to this situation in the exercises. For now, let’s move onto the selves. So something is missing.
append function. We can imagine what this function will do generally, but
to get the specifics right, we should consider an example. This is a tech- If we’ve had experience working with strings in the standard C/C++ li-

nique I call solving by sample case. brary, we will already know the missing ingredient, but if we don’t, we
can quickly reason it out. Remember that one of our problem-solving
Start with a nontrivial sample input for the function or program. techniques is looking for analogies. Perhaps we should think about other
Write down all the details of that input along with all the details of the problems in which the length of something was unknown. Back in
output. Then when you write your code, you’ll be writing for the general Chapter 2, we processed identification codes with an arbitrary number of
digits for the “Luhn Checksum Validation” problem. In that problem, we wrote the shell, I realized that the arrayString parameter had to be a ref-
didn’t know how many digits the user would enter. In the end, we wrote a erence (&) ➊ because the function is going to create a new array in the
while loop that continued until the last character read was the end-of-line. heap. That’s the whole point, after all, of using dynamic memory here: to
create a new array whenever the string is resized. Therefore, the value
Unfortunately, there is no end-of-line character waiting for us at the that the variable a has when passed to append is not the same value it
end of our arrays. But what if we put an end-of-line character in the last should have when the function is through, because it needs to point to a
element of all our string arrays? Then we could discover the length of our new array. Note that because our arrays use the null-character termina-
arrays the same way we discovered how many digits were in the identifi- tion expected by the standard libraries, we can send the array referenced
cation codes. The only downside to this approach is that we could no by the pointer a directly to the output stream to check the value ➎.
longer use the end-of-line character in our strings, except as the string
terminator. That’s not necessarily a huge restriction, depending on how Figure 4-4 shows our new understanding of what the function will do
the strings will be used, but for maximum flexibility, it would be best to with our test case. The array terminators are in place, shown as NULL for
choose a value that cannot be confused with any character anyone might clarity. In the after (b) state, it’s clear that s is pointing at a new allocation
actually want to use. Therefore, we’ll use a zero to terminate our arrays of memory. The previous array is now in a shaded box; in these diagrams,
because a zero represents a null character in ASCII and other character I’m using shaded boxes to indicate memory that has been deallocated.
code systems. This is exactly the method used by the standard C/C++ Including the deallocated memory in our diagrams helps remind us to ac-
library. tually perform the deallocation.

With that issue cleared up, let’s get more specific about what append
will do with our sample data. We know our function is going to have two
parameters, the first being an arrayString, a pointer to an array of charac-
ters in the heap, and the second being the char to be appended. To keep
things straight, let’s go ahead and write the outline of the append function
and the code to test it.

void append(➊arrayString& s, char c) {


}
void appendTester() {
➋arrayString a = new char[5];
➌a[0] = 't'; a[1] = 'e'; a[2] = 's'; a[3] = 't'; a[4] = 0; Figure 4-4: Updated and elaborated memory states before (a) and after (b)
➍append(a, '!'); the append function

➎cout << a << "\n";


With everything properly visualized, we can write this function:
}

void append(arrayString& s, char c) {


The appendTester function allocates our string in the heap ➋. Note that
int oldLength = 0;
the size of the array is five, which is necessary so that we can assign all
➊while (s[oldLength] != 0) {
four letters of the word test along with our terminating null character ➌.
oldLength++;
Then we call append ➍, which at this point is just an empty shell. When I
}
➋arrayString newS = new char[oldLength + 2];
➌for (int i = 0; i < oldLength; i++) {
newS[i] = s[i];
}
➍newS[oldLength] = c;
➎newS[oldLength + 1] = 0;
➏delete[] s;
➐s = newS;
}

There’s a lot going on in this code, so let’s check it out piece by piece.
At the beginning of the function, we have a loop to locate the null charac-
ter that terminates our array ➊. When the loop completes, oldLength will
be the number of legitimate characters in the array (that is, not including
the terminating null character). We allocate the new array from the heap Figure 4-5: Showing the relationship of a local variable, parameters, and al-
with a size of oldLength + 2 ➋. This is one of those details that is tricky to located memory before and after the append function
keep straight if you’re figuring it all out in your head but easy to get right
if you have a diagram. Following the code through our example in Figure The last three lines in the append function are all about that shaded box

4-5, we see that oldLength would be four in this case. We know that in part (b) of the figure. To avoid a memory leak, we have to deallocate

oldLength would be four because test has four characters and that the new the array in the heap that our parameter s originally pointed to ➏. Finally,

array in part (b) requires six characters because we need space for the we leave our function with s pointing to the new, longer array ➐.
appended character and the null terminator. Unfortunately, one of the reasons memory leaks are so common in C++
programming is that until the total amount of memory leaks is large, the
With the new array allocated, we copy all of the legitimate characters program and overall system will display no ill effects. Thus, the leaks can
from the old array to the new ➌, and we then assign the appended char- go totally unnoticed by programmers during testing. As programmers,
acter ➍ and the null character terminator ➎ to their appropriate locations therefore, we must be diligent and always consider the lifetime of our
in the new array. Again, our diagram helps us keep things straight. To heap memory allocations. Every time you use the keyword new, think
make things even clearer, Figure 4-5 shows how the value of oldLength about where and when the corresponding delete will occur.
was computed and what position that value would indicate in the new ar-
ray. With that visual reminder, it’s easy to get the subscripts correct in Notice how everything in this function follows directly from our dia-

those two assignment statements. grams. Tricky programming becomes so much less tricky with good dia-
grams, and I wish more new programmers would take the time to draw
before they code. This goes back to our most fundamental problem-solv-
ing principle: Always have a plan. A well-drawn diagram for a problem
example is like having a mapped-out route to your destination before
starting on a long vacation drive. It’s a little bit of extra effort at the start
to potentially avoid much more effort and frustration at the end.
CREATING DIAGRAMS Let’s look at append again, checking for special cases—in other words,
making sure we don’t have any oddball situations among the possible
All you need to draw a diagram is a pencil and paper. If you’ve got the good input values. The most common culprits for special cases are at the
time, though, I would recommend using a drawing program. There are extremes, such as the smallest or largest possible input. With append,
drawing tools with templates specifically for programming problems, but there’s no maximum size for our string array, but there is a minimum
any general vector-based drawing program will get you started (the term size. If the string has no legitimate characters, it would actually corre-
vector here means the program works with lines and curves and isn’t a spond to an array of one character (the one character being the null ter-
paintbox program like Photoshop). I made the original illustrations for minating character). As before, let’s make a diagram to keep things
this book using a program called Inkscape, which is freely available. straight. Suppose we appended the exclamation point to a null string, as
Creating the diagrams on your computer allows you to keep them orga- shown in Figure 4-6.
nized in the same place where you store the code that the diagrams illus-
trate. The diagrams are also likely to be neater and therefore more easily
understood if you come back to them after an absence. Finally, it’s easy to
copy and modify a computer-created diagram, as I did when I created
Figure 4-5 from Figure 4-4, and if you want to make some quick tempo-
rary notations, you can always print out a copy to doodle on.

Getting back to our append function, the code looks solid, but remem-
ber that we based this code on a particular sample case. Thus, we
shouldn’t get cocky and assume that the code will work for all valid cases.
In particular, we need to check for special cases. In programming, a spe-
cial case is a situation in which valid data will cause the normal flow of
code to produce erroneous results.

Note that this problem is distinct from that of bad data, such as out-of- Figure 4-6: Testing the smallest case for the append function
range data. In the code for this book, we’ve made the assumption of good
input data for programs and individual functions. For example, if the pro- When we look at the diagram, this doesn’t appear to be a special case,
gram is expecting a series of integers separated by commas, we’ve as- but we should run the case through our function to check. Let’s add the
sumed that’s what the program is getting, not extraneous characters, non- following to our appendTester code:
numbers, and so on. Such an assumption is necessary to keep code length
reasonable and to avoid repeating the same data-checking code over and arrayString b = new char[1];

over. In the real world, however, we should take reasonable precautions b[0] = 0;

against bad input. This is known as robustness. A robust program per- append(b, '!');

forms well even with bad input. For example, such a program could dis- cout << b << "\n";

play an error message to the user instead of crashing.


That works, too. Now that we’re reasonably sure that the append func-
Checking for Special Cases tion is correct, do we like it? The code seemed straightforward, and I’m
not getting any “bad smells,” but it does seem a little long for a simple op- Copying Dynamically Allocated Strings
eration. As I think ahead to the concatenate function, it occurs to me that,
like append, the concatenate function will need to determine the length of a Now it’s time to tackle that concatenate function. We’ll take the same ap-
proach we did with append. First, we’ll write an empty shell version of the
string array—or maybe the lengths of two string arrays. Because both op-
erations will need a loop that finds the null character that terminates the function to get the parameters and their types straight in our heads. Then,

string, we could put that code in its own function, which is then called we’ll make a diagram of a test case, and finally, we’ll write code to match

from append and concatenate as needed. Let’s go ahead and do that and our diagram. Here is the shell of the function, along with additional test-

modify append accordingly. ing code:

void concatenate(➊arrayString& s1, ➋arrayString s2) {


int length(arrayString s) {
}
➊int count = 0;
void concatenateTester() {
while (s[count] != 0) {
arrayString a = new char[5];
count++;
a[0] = 't'; a[1] = 'e'; a[2] = 's'; a[3] = 't'; a[4] = 0;
}
arrayString b = new char[4];
return count;
b[0] = 'b'; b[1] = 'e'; b[2] = 'd'; b[3] = 0;
}
concatenate(a, b);
void append(arrayString& s, char c) {
}
➋int oldLength = length(s);
arrayString newS = new char[oldLength + 2];
Remember that the description of this function says that the charac-
for (int i = 0; i < oldLength; i++) {
ters in the second string (the second parameter) are appended to the end
newS[i] = s[i];
of the first string. Therefore, the first parameter to concatenate will be a
}
reference parameter ➊, for the same reason as the first parameter of ap-
newS[oldLength] = c;
pend. The second parameter ➋, though, should not be changed by the func-
newS[oldLength + 1] = 0;
tion, so it will be a value parameter. Now for our sample case: We’re con-
delete[] s;
catenating the strings test and bed. The before-and-after diagram is
s = newS;
shown in Figure 4-7.
}

The details of the diagram should be familiar from the append function.
The code in the length function ➊ is essentially the same code that pre-
Here, for concatenate, we start with two dynamically allocated arrays in
viously began the append function. In the append function itself, we’ve re-
the heap, pointed to by our two parameters, s1 and s2. When the function
placed that code with a call to length ➋. The length function is what’s
is complete, s1 will point to a new array in the heap that’s nine characters
known as a helper function, a function that encapsulates an operation
long. The array that s1 previously pointed to has been deallocated; s2 and
common to several other functions. Besides reducing the length of our
its array are unchanged. While it might seem pointless to include s2 and
code, the elimination of redundant code means our code is more reliable
the bed array on our diagram, when trying to avoid coding errors, keeping
and easier to modify. It also helps our problem solving because helper
track of what doesn’t change is as important as keeping track of what
functions divide our code into smaller chunks, making it easier for us to
does. I’ve also numbered the elements of the old and new arrays, as that
recognize opportunities for code reuse.
came in handy with the append function. Everything is in place now, so
let’s write this function.
terminator. Then we copy the characters from the two original strings to
the new string ➌. The first loop is straightforward, but notice the compu-
tation of the subscript in the second loop ➍. We’re copying from the be-
ginning of s2 into the middle of newS; this is yet another example of trans-
lating from one range of values to another range of values, which we’ve
been doing in this text since Chapter 2. By looking at the element num-
bers on my diagram, I’m able to see what variables I need to put together
to compute the right destination subscript. The remainder of the function
puts the null terminator in place at the end of the new string ➎. As with
append, we deallocate the original heap memory pointed to by our first pa-
rameter ➏ and repoint the first parameter at the newly allocated string ➐.

This code appears to work, but as before, we want to make sure that
Figure 4-7: Showing the “before” (a) and “after” (b) states for the concate- we haven’t inadvertently made a function that succeeds for our test case
nate method but not all cases. The most likely trouble cases would be when either or
both of the parameters are zero-length strings (just the null terminator).
void concatenate(arrayString& s1, arrayString s2) { We should check these cases explicitly before moving on. Note that when
➊int s1_OldLength = length(s1); you are checking for correctness in code that uses pointers, you should
int s2_Length = length(s2); take care to look at the pointers themselves and not just the values in the
int s1_NewLength = s1_OldLength + s2_Length; heap that they reference. Here is one test case:
➋arrayString newS = new char[s1_NewLength + 1];
➌for(int i = 0; i < s1_OldLength; i++) { arrayString a = new char[5];

newS[i] = s1[i]; a[0] = 't'; a[1] = 'e'; a[2] = 's'; a[3] = 't'; a[4] = 0;

} arrayString c = new char[1];

for(int i = 0; i < s2_Length; i++) { c[0] = 0;

newS[➍s1_OldLength + i] = s2[i]; concatenate(c, a);


cout << a << "\n" << c << "\n";
}
➊ cout << (void *) a << "\n" << (void *) c << "\n";
➎newS[s1_NewLength] = 0;
➏delete[] s1;
I wanted to be sure that the call to concatenate results in a and c both
➐s1 = newS;
pointing to the string test—that is, that they point to arrays with identical
}
values. Equally important, though, is that they point to different strings, as
shown in Figure 4-8 (a). I check this in the second output statement by
First, we determine the lengths of both of the strings we’re concatenat-
changing the types of the variables to void *, which forces the output
ing ➊, and then we sum those values to get the length the concatenated
stream to display the raw value of the pointers ➊. If the pointers them-
string will have when we are done. Remember that all of these lengths
selves had the same value, then we would say that the pointers had be-
are for the number of legitimate characters, not including the null termi-
come cross-linked, as shown in Figure 4-8 (b). When pointers have un-
nator. Thus, when we create the array in the heap to store the new string
knowingly become cross-linked, subtle problems occur because changing
➋, we allocate one more than the combined length to have a space for the
the contents of one variable in the heap mysteriously changes another addRecord This function takes a pointer to a collection of student records, a
variable—really the same variable, but in a large program, that can be student number, and a grade, and it adds a new record with this data to
hard to see. Also, remember that if two pointers are cross-linked, when the collection.
one of them is deallocated via delete, the remaining pointer becomes a
dangling reference. Therefore, we have to be diligent when we review averageRecord This function takes a pointer to a collection of student
our code and always check potential cross-linking. records and returns the simple average of student grades in the collection
as a double.

The collection can be of any size. The addRecord operation is expected to


be called frequently, so it must be implemented efficiently.

A number of approaches would meet the specifications, but we’re go-


ing to choose a method that helps us practice our pointer-based problem-
solving techniques: linked lists. You may have already seen a linked list
before, but if not, know that the introduction of linked lists represents a
kind of sea change from what we have discussed so far in this text. A
good problem-solver could have developed any of the previous solutions

Figure 4-8: concatenate should result in two distinct strings (a), not two given enough time and careful thought. Most programmers, however,
wouldn’t come up with the linked list concept without help. Once you see
cross-linked pointers (b).
it and master the basics, though, other linked structures will come to

With all three functions implemented—characterAt, append, and con- mind, and then you are off and running. A linked list is truly a dynamic

catenate—we’ve completed the problem. structure. Our string arrays were stored in dynamically allocated mem-
ory, but once created, they were static structures, never getting any larger
Linked Lists or smaller, just being replaced. A linked list, in contrast, grows piece by
piece over time like a daisy chain.
Now we’re going to try something trickier. The pointer manipulations will
be more complicated, but we’ll keep everything straight now that we Building a List of Nodes
know how to crank out the diagrams.
Let’s construct a sample linked list of student records. To make a linked
list, you need a struct that contains a pointer to the same struct, in addi-
PROBLEM: TRACKING AN UNKNOWN QUANTITY OF STUDENT
tion to whatever data you want to store in the collection represented by
RECORDS
the linked list. For our problem, the struct will contain a student number
and grade.
In this problem, you will write functions to store and manipulate a collec-
tion of student records. A student record contains a student number and
struct ➊listNode {
a grade, both integers. The following functions are to be implemented:
➋int studentNum;
int grade;
➌listNode * next;
}; tion before, it’s used to indicate the field of a pointed-to struct (or class).
➍ typedef listNode * studentCollection; So node1->studentNum means “the studentNum field in the struct pointed to
by node1” and is equivalent to (*node1).studentNum. We then repeat the
The name of our struct is listNode ➊. A struct used to create a linked same process for node2 and node3. After assigning the field values to the
list is always referred to as a node. Presumably the name is an analogy to last node, the state of memory is as shown in Figure 4-9. In these dia-
the botanical term, meaning a point on a stem from which a new branch grams, we’ll use the divided-box notation we previously used for arrays
grows. The node contains the student number ➋ and grade that make up to show the node struct.
the real “payload” of the node. The node also contains a pointer to the
very type of struct we are defining ➌. The first time most programmers
see this, it looks confusing and perhaps even a syntactical impossibility:
How can we define a structure in terms of itself? But this is legal, and the
meaning will become clear shortly. Note that the self-referring pointer in
a node is typically given a name like next, nextPtr, or the like. Lastly, this
code declares a typedef for a pointer to our node type ➍. This will help the
readability of our functions. Now let’s build our sample linked list using
these types: Figure 4-9: Halfway through building a sample linked list

➊ studentCollection sc; Now that we have all of our nodes, we can string them together to
➋ listNode * node1 = new listNode; form a linked list. That’s what the rest of the previous code listing does.
➌ node1->studentNum = 1001; node1->grade = 78; First, we point our studentCollection variable to the first node ➎, then we
listNode * node2 = new listNode; point the next field of the first node to the second node ➏, and then we
node2->studentNum = 1012; node2->grade = 93; point the next field of the second node to the third node ➐. In the next
listNode * node3 = new listNode; step, we assign NULL (again, this is just a synonym for zero) to the next
➍ node3->studentNum = 1076; node3->grade = 85; field of the third node ➑. We do this for the same reason we put a null
➎ sc = node1; character at the end of our arrays in the previous problem: to terminate
➏ node1->next = node2; the structure. Just as we needed a special character to show us the end of
➐ node2->next = node3; the array, we need a zero in the next field of the last node in our linked
➑ node3->next = NULL; list so that we know it is the last node. Finally, to clean things up and
➒ node1 = node2 = node3 = NULL; avoid potential cross-linking problems, we assign NULL to each of the indi-
vidual node pointers ➒. The resulting state of memory is shown in Figure
We begin by declaring a studentCollection, sc ➊, which will eventually 4-10.
become the name for our linked list. Then we declare node1 ➋, a pointer to
a listNode. Again, studentCollection is synonymous with listNode *, but for
readability I’m using the studentCollection type only for variables that
will refer to the whole list structure. After declaring node1 and pointing it
to a newly allocated listNode in the heap ➋, we assign values to the stu-
dentNum and grade fields in that node ➌. At this point, the next field is unas-
signed. This is not a book on syntax, but if you haven’t seen the -> nota-
void addRecord(studentCollection& sc, int stuNum, int gr) {
}
➊ addRecord(sc, 1274, 91);

Again, the ➊ call would come at the end of the previous listing. With
the function shell outlining the parameters, we can diagram the “before”
state of this call, as shown in Figure 4-11.

Figure 4-10: The completed sample linked list

With this visual in front of us, it’s clear why the structure is called a
linked list: Each node in the list is linked to the next. You’ll often see
linked lists drawn linearly, but I actually prefer the scattered-in-memory
look of this diagram because it emphasizes that these nodes have no rela-
tionship to each other besides the links; each of them could be anywhere
inside the heap. Make sure you trace through the code until you are confi-
dent you agree with the diagram. Figure 4-11: The “before” state for the addRecord function

Notice that, in the concluding state, only one stack-based pointer re- Regarding the “after” state, though, we have a choice. We can guess
mains in use, our studentCollection variable sc, which points to the first that we’re going to create a new node in the heap and copy the values
node. A pointer external to the list (that is, not the next field of a node in from the parameters stuNum and gr into the studentNum and grade fields of
the list) that points to the first node in a linked list is known as a head the new node. The question is where this node is going to go, logically, in
pointer. On a symbolic level, this variable represents the list as a whole, our linked list. The most obvious choice would be at the end; there’s a
but of course it directly references only the first node. To get to the sec- NULL value in a next field just asking to be pointed to a new node. That
ond node, we have to go through the first, and to get to the third node, we would correspond to Figure 4-12.
have to go through the first two, and so on. This means that linked lists of-
fer only sequential access, as opposed to the random access provided by
arrays. Sequential access is the weakness of linked-list structures. The
strength of linked-list structures, as previously alluded to, is our ability to
grow or shrink the size of the structure by adding or removing nodes,
without having to create an entirely new structure and copy the data
over, as we’ve done with arrays.

Adding Nodes to a List


Figure 4-12: Proposed “after” state for addRecord function

Now let’s implement the addRecord function. This function is going to cre-
But if we can assume that the order of the records doesn’t matter (that
ate a new node and connect it into an existing linked list. We’ll use the
we don’t need to keep the records in the same order they were added to
same techniques we used in the previous problem. First up: a function
the collection), then this is the wrong choice. To see why, consider a col-
shell and a sample call. For testing, we’ll add code to the previous listing,
lection, not of three student records, but of 3,000. To reach the last record
so sc already exists as the head pointer to the list of three nodes.
in our linked list in order to modify its next field would require traveling dent number and grade from the parameters ➋. Then we link the new
through all 3,000 nodes. That’s unacceptably inefficient because we can node into the list, first by pointing the next field of the new node to the
get the new node into the list without traveling through any of the exist- former first node (by assigning it the value of sc) ➌ and then by pointing
ing nodes. sc itself at the new node ➍. Note that the last two steps have to happen in
that order; we need to use the original value of sc before we change it.
Figure 4-13 shows how. After the new node is created, it is linked into Also note that because we change sc, it must be a reference parameter.
the list at the beginning, not at the end. In the “after” state, our head
pointer sc points to the new node, while the next field of the new node As always, when we build code from a sample case, we have to check
points to what was previously the first node in the list, the one with stu- potential special cases. Here, that means checking to see that the function
dent number 1001. Note that while we assign a value to that next field of works with an empty list. With our string arrays, an empty string was still
the new node, the only existing pointer that changes is sc, and none of the a valid pointer because we still had an array to point to, an array with
values in the existing nodes are altered or even inspected. Working from just the null terminating character. Here, though, the number of nodes is
our diagram, here’s the code: the same as the number of records, and an empty list would be a NULL
head pointer. Will our code still hold up if we try to insert our sample
void addRecord(studentCollection& sc, int stuNum, int gr) { data when the incoming head pointer is NULL? Figure 4-14 shows the “be-
➊listNode * newNode = new listNode; fore” state and the desired “after” state.
➋newNode->studentNum = stuNum;
newNode->grade = gr; Walking this example through our code, we see that it handles this
➌newNode->next = sc; case fine. The new node is created just as before. Because sc is NULL in the
➍sc = newNode; “before” state, when ➌ this value is copied into the next field of our new
} node, that’s exactly what we want, and our one-node list is properly ter-
minated. Note that if we had continued with the other implementation
idea—adding the new node at the end of the linked list rather than at the
beginning—an initially empty list would be a special case because it
would then be the only case in which sc is modified.

Figure 4-13: Acceptable “after” state for addRecord function. The dashed ar-
row indicates the previous value of the pointer stored in sc.

Again, let me emphasize that translating a diagram and that code is a


lot easier than trying to keep things straight in your head. The code comes
directly from the illustration. We create a new node ➊ and assign the stu-
Figure 4-14: The “before” and “after” states for the smallest addRecord case ➌listNode * loopPtr = sc;
➍while (loopPtr != NULL) {
List Traversal
➎sum += loopPtr->grade;
➏count++;
Now it’s time to figure out the averageRecord function. As before, we’ll start
➐loopPtr = loopPtr->next;
with a shell and a diagram. Here’s the function shell and sample call.
}
Assume the sample call ➊ occurs after the creation of our original sample
➑double average = sum / count;
list, as shown in Figure 4-10.
return average;
}
double averageRecord(studentCollection sc) {
}
We start by declaring a variable count to store the number of nodes we
➊ int avg = averageRecord(sc);
encounter in the list ➊; this will also be the number of values in the col-
lection, which we’ll use to compute the average. Next we declare a vari-
As you can see, I’ve chosen to compute the average as an int, as we did
able sum to store the running total of grade values in the list ➋. Then we
with arrays in the previous chapter. Depending on the problem, however,
it might be better to compute it as a floating point value. Now we need a declare a listNode * called loopPtr, which we’ll use to traverse the list ➌.

diagram, but we pretty much already have a “before” state with Figure 4- This is the equivalent of our integer loop variable in an array-processing

9. We don’t need a diagram for the “after” state because this function isn’t for loop; it keeps track of where we are in the linked list, not with the po-

going to change our dynamic structure, just report on it. We just need to sition number but by storing a pointer to the node we are processing

know the expected result, which in this case is about 85.3333. currently.

So how do we actually compute the average? From our experience At this point, the traversal itself begins. The traversal loop continues

computing the average of all values in an array, we know the general con- until our loop-tracking pointer reaches our terminating NULL ➍. Inside the

cept. We need to add up every value in the collection and then divide that loop, we add the value of the grade field in the currently referenced node

sum by the number of values. With our array averaging code, we in- to sum ➎. We increment the count ➏, and then we assign the next field of

spected every value using a for loop from 0 to one less than the size of the the current node to our loop-tracking pointer ➐. This has the effect of

array, using the loop counter as the array subscript. We can’t use a for moving our traversal one node ahead. This is the tricky part of the code,
loop here because we don’t know ahead of time how many numbers are so let’s make sure we have this straight. In Figure 4-15, I’m showing how
in the linked list; we have to keep going until we reach the NULL value in a the node variable changes over time. The letters (a) through (d) mark dif-
   
ferent points during the execution of the code on our sample data, show-
node’s next field indicating list termination. This suggests a while loop, Copy Highlight Add Note Get Link
something like what we used earlier in this chapter to process our arrays ing different points during the lifetime of loopPtr and the locations from

of unknown length. Running through a linked list like this, from begin- which loopPtr’s value has been obtained. Point (a) is just as the loop be-

ning to terminus, is known as a list traversal. This is one of the basic oper- gins; loopPtr has just been initialized with the value of sc. Therefore,
ations on a linked list. Let’s put the traversal idea to work to solve this loopPtr points to the first node in the list, just as sc does. During the first
problem: iteration of the loop, then, the first node’s grade value of 78 is added to
sum. The first node’s next value is copied to loopPtr so that now loopPtr
double averageRecord(studentCollection sc) { points to the second node of the list; this is point (b). During the second
➊int count = 0; iteration, we add 93 to sum and copy the next field of the second node to
➋double sum = 0; loopPtr; this is point (c). Finally, during the third and last iteration of the
loop, we add 85 to sum and assign the NULL of the next field in the third if (sc == NULL) return 0;
node to loopPtr; this is point (d). When we reach the top of the while loop
again, the loop ends because loopPtr is NULL. Because we incremented As this example shows, handling special cases is usually pretty simple.
count each time we iterated, count is three. We just have to make sure we take the time to identify them.

Conclusion and Next Steps

This chapter has just scratched the surface of problem solving using
pointers and dynamic memory. You’ll see pointers and heap allocations
throughout the rest of this text. For example, object-oriented program-
ming techniques, which we’ll discuss in Chapter 5, are especially helpful
when dealing with pointers. They allow us to encapsulate pointers in
such a way that we don’t have to worry about memory leaks, dangling
pointers, or any of the other common pointer pitfalls.

Even though there is much more to learn about problem solving in


this area, you’ll be able to develop your skills with pointer-based struc-
Figure 4-15: How the local variable loopPtr changes during loop iterations
tures of increasing complexity if you follow the basic ideas in this chap-
in the averageRecord function
ter: First, apply the general rules of problem solving. Then, apply specific
rules for pointers, and use a diagram or similar tool to visualize each so-
Once the loop is all done, we just divide the sum by the count and re-
lution before you start coding.
turn the result ➑.

Exercises
The code works on our sample case, but as always, we need to check
for potential special cases. Again, with lists, the most obvious special case I’m not kidding about doing the exercises. You’re not just reading the
is an empty list. What happens with our code if sc is NULL when the func- chapters and moving on, are you?
tion begins?
4-1. Design your own: Take a problem that you already know how to solve us-
Guess what? The code blows up. (I had to make one of these special ing an array but that is limited by the size of the array. Rewrite the code
cases turn out badly; otherwise, you wouldn’t take me seriously.) There’s to remove that limitation using a dynamically allocated array.
nothing wrong with the loop for the processing of the linked list itself. If
sc is NULL, then loopPtr is initialized to NULL, the loop ends as soon as it be- 4-2. For our dynamically allocated strings, create a function substring that
gins, and sum is left at zero, which seems reasonable enough. The problem takes three parameters: an arrayString, a starting position integer, and an
is when we perform the division to compute the average ➑, count is also integer length of characters. The function returns a pointer to a new dy-
zero, which means we are dividing by zero and which will result in either namically allocated string array. This string array contains the characters
a program crash or a garbage result. To handle this special case, we could in the original string, starting at the specified position for the specified
check count against zero at the end of the function, but why not handle length. The original string is unaffected by the operation. So if the original
the situation up front and check sc? Let’s add the following as the new string was abcdefg, the position was 3, and the length was 4, then the new
first line in our averageRecord function: string would contain cdef.
4-3. For our dynamically allocated strings, create a function replaceString that 4-9. Imagine a linked list where instead of the node storing a character, the
takes three parameters, each of type arrayString: source, target, and re- node stores a digit: an int in the range 0–9. We could represent positive
placeText. The function replaces every occurrence of target in source with numbers of any size using such a linked list; the number 149, for exam-
replaceText. For example, if source points to an array containing abcdabee, ple, would be a linked list in which the first node stores a 1, the second a
target points to ab, and replaceText points to xyz, then when the function 4, and the third and last a 9. Write a function intToList that takes an inte-
ends, source should point to an array containing xyzcdxyzee. ger value and produces a linked list of this sort. Hint: You may find it eas-
ier to build the linked list backward, so if the value were 149, you would
4-4. Change the implementation of our strings such that location[0] in the ar- create the 9 node first.
ray stores the size of the array (and therefore location[1] stores the first
actual character in the string), rather than using a null-character termina- 4-10. For the digit list of the previous exercise, write a function that takes two
tor. Implement each of the three functions, append, concatenate, and char- such lists and produces a new list representing their sum.
actertAt, taking advantage of the stored size information whenever possi-
ble. Because we’ll no longer be using the null-termination convention ex-
pected by the standard output stream, you’ll need to write your own out-
put function that loops through its string parameter, displaying the
characters.

4-5. Write a function removeRecord that takes a pointer to a studentCollection Support Sign Out

and a student number and that removes the record with that student
number from the collection. ©2022 O'REILLY MEDIA, INC. TERMS OF SERVICE PRIVACY POLICY

4-6. Let’s create an implementation for strings that uses a linked list of charac-
ters instead of dynamically allocated arrays. So we’ll have a linked list
where the data payload is a single char; this will allow strings to grow
without having to recreate the entire string. We’ll start by implementing
the append and characterAt functions.

4-7. Following up on the previous exercise, implement the concatenate func-


tion. Note that if we make a call concatenate(s1, s2), where both parame-
ters are pointers to the first nodes of their respective linked lists, the func-
tion should create a copy of each of the nodes in s2 and append them to
the end of s1. That is, the function should not simply point the next field of
the last node in s1’s list to the first node of s2’s list.

4-8. Add a function to the linked-list string implementation called removeChars


to remove a section of characters from a string based on the position and
length. For example, removeChars(s1, 5, 3) would remove the three char-
acters starting at the fifth character in the string. Make sure the removed
nodes are properly deallocated.

You might also like