Computer Science C++

The document discusses character strings in C++. It defines character strings as arrays of characters with a null terminator byte at the end. It explains that strings in C++ are usually null-terminated. It also discusses defining and initializing string variables, inputting strings using extraction and get/getline functions, and two common methods for handling strings of fixed or variable lengths in input data.

Uploaded by

sellary
Copyright
© Attribution Non-Commercial (BY-NC)

Strings


Chapter 11 Strings

Section A: Basic Theory


Defining Character Strings
A character string in C++, such as "Hello World", is stored in an array of characters with an extra byte on the end marking the end of the string. This extra end-of-string marker is called the null terminator and consists of a byte whose value is zero, that is, all the bits are zero. In C++, nearly all strings ever used are null-terminated. The language provides for non-null-terminated strings as well, but in C++, unlike C, these are seldom used.

Suppose, for example, you used the string literal "Sam". We know that if we had written
    cout << "Sam";
then the output stream displays
    Sam
But how does the computer know how long the literal string is and where it ends? The answer is that the literal "Sam" is a null-terminated string consisting of four bytes containing 'S', 'a', 'm', and 0. This null terminator can be represented by a numerical 0 or by the escape sequence '\0'.

Nearly all of the C++ functions that take a character string as an argument expect that string to be null-terminated. The null terminator marks the end of the characters in the variable. For example, suppose that a variable is defined to hold a person's name as follows
    char name[21];
This definition is saying that the maximum number of characters that can be stored is twenty plus one for the null terminator. This maximum length is different from the number of characters actually stored when a person's name is entered. For example, assume that the program has inputted the name as follows
    cin >> name;
Assume that the user has entered Sam from the keyboard. In this instance, only four of the possible twenty-one bytes are in use, with the null terminator in the fourth byte (subscript 3). Make sure you understand the distinction between the maximum size of a string and the actual size in a specific instance.
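The role of the null terminator can be made visible with a few lines of code. The following is only a sketch: CountChars() and BytesInSamLiteral are names invented for this illustration, and CountChars() is a hand-written stand-in for a library function, not how the library itself is written.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the characters stored in a string by
// walking the array until the null terminator (a zero byte) is found.
// The terminator itself is not counted.
std::size_t CountChars (const char text[]) {
    assert (text != nullptr);
    std::size_t count = 0;
    while (text[count] != '\0')
        count++;
    return count;
}

// The literal "Sam" occupies four bytes: 'S', 'a', 'm' and the zero byte.
const std::size_t BytesInSamLiteral = sizeof ("Sam");
```

Calling CountChars ("Sam") yields 3, while BytesInSamLiteral is 4; the difference of one is the null terminator.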

When defining a string variable, it makes good programming sense not to hard-code the array bounds but to use a const int, just as is done with other kinds of arrays. Thus, the name variable ought to have been coded like this
    const int NAMELENGTH = 21;
    int main () {
     char name[NAMELENGTH];
If at a later date you decide that twenty-character names are too short, it is a simple matter of changing the constant value and recompiling.

When character string variables are defined, they can also be initialized. However, two forms are possible. Assume that when the name variable is defined, it should be given the value of Sam. Following the initialization syntax for any other kind of array, one could code
    char name[NAMELENGTH] = {'S', 'a', 'm', '\0'};
Here each specific character is assigned its starting value; do not fail to include the null terminator. However, the compiler allows a string to be initialized with a literal string as follows
    char name[NAMELENGTH] = "Sam";
Clearly this second form is much more convenient.

With all forms of arrays, when defining and initializing an array, it is permissible to omit the array bounds and let the compiler determine how many elements the array must have based on the number of initial values you provide. Thus, the following is valid.
    char name[] = "Sam";
However, in general, this approach is lousy programming style and error prone. Why? In the above case, the compiler allocates an array just large enough to hold the literal "Sam". That is, the name array is only four characters long. What would happen if later on one attempted to input another name that needed more characters? Disaster. Always provide the array bounds whenever possible.
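The three forms of initialization can be compared side by side. This is a small sketch; the function name SameText() and the local variable names are made up for the illustration, and strcmp() (a library function that compares the contents of two strings) is used only to check the result.

```cpp
#include <cassert>
#include <cstring>

const int NAMELENGTH = 21;

// Hypothetical check: defines the same name three ways and reports
// whether all three arrays hold the same text.
bool SameText () {
    char byChars[NAMELENGTH] = {'S', 'a', 'm', '\0'};
    char byLiteral[NAMELENGTH] = "Sam";
    char sizedByCompiler[] = "Sam";   // the compiler picks the bounds

    // With explicit bounds, the array is 21 bytes no matter what it holds.
    static_assert (sizeof (byLiteral) == 21, "bounds come from NAMELENGTH");
    // Without bounds, there is room for "Sam" and its terminator only.
    static_assert (sizeof (sizedByCompiler) == 4, "three letters plus terminator");

    // Elements with no initial value supplied are set to zero.
    assert (byChars[NAMELENGTH - 1] == '\0');

    return std::strcmp (byChars, byLiteral) == 0
        && std::strcmp (byLiteral, sizedByCompiler) == 0;
}
```

The static_assert lines show why the third form is risky: the array has no spare room for a longer name.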

Inputting Character Strings


Using the Extraction Operator
The extraction operator can be used to input character strings. The specific rules of string extraction follow those for the other data types we have covered. It skips over whitespace to the first non-whitespace character, then inputs successive characters, storing them into successive bytes in the array, until it encounters whitespace or the end of file. Lastly, it stores the null terminator.

There are two aspects of this input operation that frequently make the use of the extraction operator useless. First, notice that the extraction operator does not permit a blank to be in the string. Suppose that you prompted the user to input their name and age and then used cin to input them as follows
    cin >> name >> age;
What results if the user enters the following data?
    Sam Spade 25
The input stream goes into the fail state. It inputs the characters Sam and stores them along with the trailing null terminator into the name field. It skips over the blank, attempts to input the character S of Spade into the age integer, and goes immediately into the fail state. If you reflect upon all the different kinds of strings that you might encounter in the real world of programming (names, product descriptions, addresses, cities), the vast majority may have embedded blanks in them. This rules out the extraction operator as a method of inputting them.

The other part of the extraction operator rules is quite destructive, especially if you are running on the Windows 95/98 platform. It inputs all characters until it finds whitespace or EOF. Now suppose that the field name is defined to be an array of 21 characters. What happens if, in response to the prompt to enter a name, the user enters the following name?
    Rumplestillskinchevskikov
The computer attempts to store 26 bytes (25 characters plus the null terminator) into an array that is only 21 bytes long. Five bytes of memory are now overlaid. What happens next is unpredictable. If another variable in your program occupies that overlaid memory, its contents are trashed. If that memory is not even part of your program, but is part of some other program, such as a Windows system dll, it is overlaid; even wilder things can happen! Under Windows NT/2000, if you attempt to overlay memory that is not part of your data segment, the program is aborted instead. This is one reason for so many system crashes under Windows 95/98.

One way to get around the extraction operator's disadvantages is to use either the get() or getline() function. The get() function can be used in one of two ways. Note: while I am using cin in these examples, any ifstream instance can be used as well.
    cin.get (string variable, sizeof (string variable));
    cin.get (string variable, sizeof (string variable), delimiter character);
These input all characters from the current position in the stream until the array is full (the size given, less one byte reserved for the null terminator), or EOF is reached, or the delimiter is found. By default the delimiter is the new line code. The delimiter is not extracted but remains in the input stream.
    cin.getline (string variable, sizeof (string variable));
    cin.getline (string variable, sizeof (string variable), delimiter character);
This function works the same way except that the delimiter is removed from the input stream but never stored in the string variable. It also defaults to the new line code.
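The difference between get() and getline(), and the size limit both enforce, can be observed without a keyboard by reading from a string stream. This is a sketch under assumptions: the three function names are invented for the illustration, and an istringstream stands in for cin or an ifstream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical demo: reads one line with get() and returns the next
// character still waiting in the stream.
char NextAfterGet (const char* text) {
    assert (text != nullptr);
    std::istringstream in (text);
    char field[21];
    in.get (field, sizeof (field));       // stops in front of the new line code
    return static_cast<char> (in.peek ());
}

// Same idea with getline(), which removes the delimiter from the stream.
char NextAfterGetline (const char* text) {
    assert (text != nullptr);
    std::istringstream in (text);
    char field[21];
    in.getline (field, sizeof (field));   // consumes the new line code
    return static_cast<char> (in.peek ());
}

// Returns how many characters get() actually stored in a 21-byte array.
std::size_t LengthStored (const char* text) {
    assert (text != nullptr);
    std::istringstream in (text);
    char field[21];
    in.get (field, sizeof (field));       // never stores more than 20 + terminator
    return std::strlen (field);
}
```

Given the text "Sam Spade\n25", NextAfterGet() reports the new line code while NextAfterGetline() reports the '2'. Given the 25-character name above, LengthStored() reports 20: the array is not overrun, and the remaining characters stay in the stream.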

Method A: All Strings Have the Same Length


This is a common situation. In the input set of data or file, all character strings are the same length, the maximum. Shorter strings have blanks added onto the end of the character series to fill out the maximum length. Assume that a cost record input set of data contains the item number, quantity, description and cost fields. The program defines the input fields as follows.
    const int DescrLimit = 21;
    long itemnumber;
    long quantity;
    char description[DescrLimit];
    double cost;
The description field can hold up to twenty characters plus one for the null terminator. The input set of data would appear as
    12345  10 Pots and Pans        14.99
    34567 101 Cups                  5.99
    45667   3 Silverware, Finished 10.42
Notice how the shorter strings are padded with blanks so that in all circumstances the description field is 20 characters long. The data is then input this way.
    infile >> itemnumber >> quantity >> ws;
    infile.get (description, sizeof (description));
    infile >> cost;
Observe that the first line ends by skipping over whitespace to position the input stream to the first character of the description field. sizeof() always returns the number of bytes the variable occupies. In the case of the description field, it yields twenty-one. If one used sizeof(quantity), it would return four bytes, since longs occupy four bytes on this platform. One could also use the constant integer DescrLimit instead of the sizeof(); this subtle difference will be important shortly. Many company input data files are set up in this manner.

What is input and stored in the description field when the second line of data above is input? The description contains
    "Cups                "
that is, the characters C-u-p-s followed by sixteen blanks and then the null terminator.

There is one drawback to this method. The blanks are stored. Shortly we will see how character strings can be compared to see if two contain the same values. Clearly, if we compared this description to the literal "Cups", the two would not be equal. Can you spot why? The inputted description contains sixteen blanks that the literal does not contain! Thus, if the trailing blanks are going to present a problem to the processing logic of the program, they need to be removed. On the other hand, if the description field is only going to be displayed, the presence of the blanks is harmless.

With a few lines of coding, the blanks can be removed. The idea is to begin at the end of the string and, if that byte contains a blank, back up another byte until a byte that is non-blank is found. Then place a null terminator just after that non-blank byte, over the first of the trailing blanks. Since the length of all strings must be twenty characters (after the get() function is done, the null terminator is in the twenty-first position), the location of the last byte that contains real data must be subscript 19. The null terminator must be at subscript 20. The following coding can be used to remove the blanks at the end, if any.
    int index = DescrLimit - 2; // or 19
    while (index >= 0 && description[index] == ' ')
     index--;
    // here index = subscript of the last non-blank char, or -1 if none
    index++;
    description[index] = 0; // insert a null terminator over the first trailing blank
If the description contains all blanks or if the string contains a non-blank character in the 20th position, this coding still works well.

The main problem to consider when inputting strings with the get() function is handling the detection of the end of file properly. We are used to seeing coding such as
    while (cin >> itemnumber >> quantity) {
But in this case, the input operation cannot be done with one chained series of extraction operators. Rather, it is broken into three separate statements. Consider replacing the three lines of coding with a new user helper function.
    while (GetData (infile, itemnumber, quantity, description, cost, DescrLimit)) {
The function would be
    istream& GetData (istream& infile, long& itemnumber, long& quantity,
                      char description[], double& cost, int descrLimit) {
     infile >> itemnumber >> quantity >> ws;
     if (!infile)
      return infile;
     infile.get (description, descrLimit);
     if (!infile)
      return infile;
     infile >> cost;
     if (!infile)
      return infile;
     int index = descrLimit - 2;
     while (index >= 0 && description[index] == ' ')
      index--;
     index++;
     description[index] = 0;
     return infile;
    }
Vitally important is that the number of bytes to use in the get() function this time is not sizeof(description). Why? Within the function, description is the memory address of where the first element of the array of characters is located. Memory addresses are always four bytes in size on a 32-bit platform (eight on a 64-bit platform). Thus, had we used sizeof(description), then 4 bytes would have been the limit!
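Method A can be tried out without a data file by reading from a string stream. The sketch below follows the GetData() function of the text with one deliberate variation, which is an assumption of this example rather than the text's coding: the helper TrimTrailingBlanks() (a made-up name) starts from the actual end of the string as reported by strlen(), so it also copes with a field that came up short of twenty characters.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

const int DescrLimit = 21;

// Hypothetical helper: backs up over trailing blanks and places the
// null terminator just after the last non-blank character.
void TrimTrailingBlanks (char text[]) {
    assert (text != nullptr);
    int index = static_cast<int> (std::strlen (text)) - 1;
    while (index >= 0 && text[index] == ' ')
        index--;
    text[index + 1] = '\0';
}

// Inputs one cost record whose description is a fixed-width field.
// The limit is passed in because sizeof(description) would give the
// size of a memory address here, not the size of the caller's array.
std::istream& GetData (std::istream& infile, long& itemnumber, long& quantity,
                       char description[], double& cost, int descrLimit) {
    infile >> itemnumber >> quantity >> std::ws;
    if (!infile)
        return infile;
    infile.get (description, descrLimit);
    if (!infile)
        return infile;
    infile >> cost;
    if (!infile)
        return infile;
    TrimTrailingBlanks (description);
    return infile;
}
```

A caller would loop with while (GetData (infile, itemnumber, quantity, description, cost, DescrLimit)) exactly as shown in the text; an istringstream holding the sample lines can be passed in place of infile for testing.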

Method A, where all strings are the same length, also applies to data files that have more than one string in a line of data. Consider a customer data line, which contains the customer number, name, address, city, state and zip code. Here three strings potentially contain blanks, assuming the state is a two-letter abbreviation. Thus, Method A is commonly used.

Method B: String Contains Only the Needed Characters, But Is the Last Field on a Line
In certain circumstances, the string data field is the last item on the input data line. If so, it can contain just the number of characters it needs. Assume that the cost record data were reorganized as shown (<CRLF> indicates the enter key).
    12345 10 14.99 Pots and Pans<CRLF>
    34567 101 5.99 Cups<CRLF>
    45667 3 10.42 Silverware, Finished<CRLF>
This data can be input more easily as follows.
    infile >> itemnumber >> quantity >> cost >> ws;
    infile.get (description, sizeof (description));
Alternatively, the getline() function could also be used. There are no excess blanks on the end of the descriptions to be removed. It is simpler. However, its use is limited because many data entry lines contain more than one string and it is often impossible to reorganize a company's data files just to put the string at the end of the data entry lines.

Method B works well when prompting the user to enter a single string. Consider the action of asking the user to enter a filename for the program to use for input. Note that on the open function call for input, we can use the ios::in flag and for output we use the ios::out flag.
    char filename[_MAX_PATH];
    cin.getline (filename, sizeof(filename));
    ifstream infile;
    infile.open (filename, ios::in);
When dealing with filenames, one common problem to face is just how many characters long the filename array should actually be. The Microsoft compiler provides a #define of _MAX_PATH (in the header file <stdlib.h>, also reachable as <cstdlib>) that contains the platform-specific maximum length a complete path could be. Under Windows, that number is 260 bytes.
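A Method B reader can be sketched in the same style as the earlier helper function. The name GetDataB() is invented for this illustration, and getline() is used rather than get() so that the new line code is removed along with the description.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: inputs one cost record in which the description
// is the last field on the line and holds only the characters it needs.
std::istream& GetDataB (std::istream& infile, long& itemnumber, long& quantity,
                        double& cost, char description[], int descrLimit) {
    assert (description != nullptr && descrLimit > 0);
    infile >> itemnumber >> quantity >> cost >> std::ws;
    if (!infile)
        return infile;
    // Reads the rest of the line. The new line code is taken out of the
    // stream but not stored, so no trailing blanks need to be removed.
    infile.getline (description, descrLimit);
    return infile;
}
```

One point to keep in mind with this sketch: if a description is longer than the array allows, getline() puts the stream into the fail state, so the calling loop ends at that record.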

Method C: All Strings Are Delimited


The problem that we are facing is knowing where a string actually ends because a blank is not usually a good delimiter. Sometimes quote marks are used to surround the string data. Here a " mark begins and ends a string. Suppose that the input data appeared as follows.
    12345 10 "Pots and Pans" 14.99
    34567 101 "Cups" 5.99
    45667 3 "Silverware, Finished" 10.42
When a string is delimited, the data can be input rather easily if we use the alternate form of the get() function, supplying the delimiter '\"'.
    char junk;
    infile >> itemnumber >> quantity >> junk;
    infile.get (description, sizeof (description), '\"');
    infile >> junk >> cost;
Notice that we must input the beginning quote mark. The get() function leaves the delimiter in the input stream, so we must extract it before continuing on with the next field, cost. On the other hand, the getline() function removes the delimiter. Coding becomes simpler.
    char junk;
    infile >> itemnumber >> quantity >> junk;
    infile.getline (description, DescrLimit, '\"');
    infile >> cost;
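The getline() version of Method C can be wrapped in a helper function and exercised from a string stream. This is a sketch only; GetDataC() is a name made up for the illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: inputs one cost record whose description is
// surrounded by quote marks.
std::istream& GetDataC (std::istream& infile, long& itemnumber, long& quantity,
                        char description[], double& cost, int descrLimit) {
    assert (description != nullptr && descrLimit > 0);
    char junk = 0;
    infile >> itemnumber >> quantity >> junk;   // junk receives the opening "
    if (!infile)
        return infile;
    // Reads up to the closing quote mark; getline() removes that
    // delimiter from the stream without storing it.
    infile.getline (description, descrLimit, '\"');
    if (!infile)
        return infile;
    infile >> cost;
    return infile;
}
```

Because the quote marks define where the string ends, the description may appear anywhere on the line and needs no padding.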

Outputting Character Strings


Outputting strings presents a different set of problems, ones of spacing and alignment. In nearly all cases, the insertion operator handles the output of strings quite well. In the most basic form one might output a line of the cost record as follows
    cout << setw (10) << itemnumber
         << setw (10) << quantity
         << description
         << setw (10) << cost << endl;
If the entire program output consisted of one line, the above is fine. Usually, the output consists of many lines, aligned in columns. If so, the above fails utterly. With a string, the insertion operator outputs all of the characters up to the null terminator. It does not output the null terminator. With strings of varying length, there is going to be an unacceptable jagged right edge in the description column.

On the other hand, if Method A was used to input the strings and all strings are of the same length, all is well until the setw() function is used to define the total field width. Suppose that the description field should be displayed within a width of thirty columns. One might be tempted to code
    cout << setw (10) << itemnumber
         << setw (10) << quantity
         << setw (30) << description
         << setw (10) << cost << endl;
The default field alignment of an ostream is right alignment. All of our numeric fields display perfectly this way. But when right alignment is used on character strings, the results are usually not acceptable, as shown below
         12345        10                 Pots and Pans     14.99
         34567       101                          Cups      5.99
         45667         3          Silverware, Finished     10.42
Left alignment must be used when displaying strings. Right alignment must be used when displaying numerical data. The alignment is easily changed by using the setf() function.
    cout << setw (10) << itemnumber
         << setw (10) << quantity;
    cout.setf (ios::left, ios::adjustfield);
    cout << setw (30) << description;
    cout.setf (ios::right, ios::adjustfield);
    cout << setw (10) << cost << endl;
In the call to setf(), the second parameter ios::adjustfield clears all the justification flags, that is, turns them off. Then left justification is turned on. Once the string is output, the second call to setf() turns right justification back on for the other numerical data.

It is vital to use the ios::adjustfield second parameter. The Microsoft implementation of the ostream contains two flags, one for left and one for right justification. If the left justification flag is on, then left justification occurs. Since there are two separate flags, when setting justification, failure to clear all the flags can lead to the weird circumstance in which both left and right justification flags are on. Now you have left-right justification (a joke), and from then on the output is hopelessly messed up justification-wise.

Alternatively, one can use the much more convenient manipulator functions: left and right.
    cout << setw (10) << itemnumber
         << setw (10) << quantity
         << left << setw (30) << description
         << right << setw (10) << cost << endl;
Finally, the insertion operator displays all characters in a string until it encounters the null terminator. What happens if by accident a string is missing its null terminator? Simple, the insertion operator displays all bytes until it finds a null terminator. I often refer to this action as a light show. Yes, one sees the contents of the string appear, but garbage characters follow that. If a line gets full, DOS line wraps and continues on the next line. If the screen fills, DOS scrolls. All of this occurs at a blazing speed. Sit back and relax; don't panic if this happens to you. Strictly speaking, reading past the end of the array is undefined behavior, but in practice the display is usually harmless. Enjoy the show. It will stop eventually when it finds a byte with a zero in it.
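The effect of the left and right manipulators can be checked by building a line in a string stream instead of sending it to cout. This is a sketch with some additions that are not from the text: the function name FormatLine() is invented, two blanks are inserted after the quantity so the left-aligned description does not touch the number in front of it, and fixed with setprecision(2) is assumed for the cost.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one cost record with right-aligned
// numbers and a left-aligned description in a 30-column field.
std::string FormatLine (long itemnumber, long quantity,
                        const char* description, double cost) {
    assert (description != nullptr);
    std::ostringstream out;
    out << std::fixed << std::setprecision (2)
        << std::setw (10) << itemnumber
        << std::setw (10) << quantity
        << "  "                                    // separator, an addition
        << std::left << std::setw (30) << description
        << std::right << std::setw (10) << cost;
    return out.str ();
}
```

Note that setw() applies only to the next item output, while left and right stay in effect until changed, which is why right must be requested again before the cost.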

Passing a String to a Function


When passing a string to a function, the prototype of the string is just like that of any other array. Suppose that we have a PrintRecord() function whose purpose is to display one cost record. The description string must be passed. The prototype of the PrintRecord() function is
    void PrintRecord (const char description[],...
and the main() function could invoke it as
    PrintRecord (description,...
Recall that the name of an array is always the memory address of the first element, or a pointer. Sometimes you may see the prototype for a string using pointer notation instead of array notation.
    void PrintRecord (const char* description, ...
These are entirely equivalent notations when passing a string to a function. Remember, if a function is not going to alter the caller's character string, it should have the const qualifier.
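The two notations can be seen side by side in a pair of small functions. Both are invented for this illustration (the text's PrintRecord() is not shown in full, so it is not reproduced here); each receives the caller's string as read-only data.

```cpp
#include <cassert>
#include <cctype>

// Array notation: counts the blanks in the caller's string.
int CountBlanks (const char text[]) {
    assert (text != nullptr);
    int count = 0;
    for (int i = 0; text[i] != '\0'; i++)
        if (text[i] == ' ')
            count++;
    return count;
}

// Pointer notation: counts the uppercase letters. The parameter means
// exactly the same thing as the array notation above.
int CountUppercase (const char* text) {
    assert (text != nullptr);
    int count = 0;
    for (const char* p = text; *p != '\0'; p++)
        if (std::isupper (static_cast<unsigned char> (*p)))
            count++;
    return count;
}
```

Either function can be called with a character array or with a literal, for example CountBlanks (description) or CountUppercase ("Pots and Pans"). Because of the const qualifier, an attempt to store into text inside either function is a compile-time error.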

Working with Strings


Working with character string fields presents some new problems that we have not encountered before. Suppose that we have the following fields defined and have inputted some data into them.
    const int NameLen = 21;
    char previousName[NameLen];
    char currentName[NameLen];
Suppose that we needed to compare the two names to see if they were equal or not, that is, whether they contain the same series of characters. Further, suppose that if they are not the same, we needed to copy the current name into the previous name field. One might be tempted to code the following.
    if (previousName != currentName) {
     previousName = currentName;
Coding the above cannot possibly work. Why? Remember that the name of an array is the memory address where that array begins in memory. For the sake of illustration, assume that the previousName array begins at memory address 5000 and that the currentName array begins at memory location 8000. If you substitute these values for the variable array names in the above coding as the compiler does, you end up with this
    if (5000 != 8000) {
     5000 = 8000;
In all cases, the test condition is always true, for 5000 is not 8000, ever. But look at the assignment; it is ludicrous. Although the test condition compiles with no errors, the assignment line generates an error message.

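To our rescue comes the library of string functions. The prototypes of all of these string functions are in the header file <cstring> (the older name is <string.h>). Do not confuse it with <string>, which defines the string class instead.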

Comparing Strings
Here is where the new changes Microsoft has made in Visual Studio 2005 (.NET 2005) come to the forefront. Older code now recompiled using .NET 2005 will produce a large number of warning messages about function calls now being deprecated, that is, obsolete. First, let's examine the older versions and then see why Microsoft has made unilateral changes that are not yet in the C++ Standard.

The Old Way: To compare two strings, use either strcmp() or stricmp(). strcmp() is a case-sensitive string compare function. stricmp() is a case-insensitive string compare function. Both functions return an integer indicating the result of the comparison operation. The prototype of the string comparison function is this.
    int strcmp (const char* string1, const char* string2);
It is showing that we pass it the two strings to be compared. However, the notation const char* also indicates that the string's contents are constant. That is, the comparison function cannot alter the contents of either string. If the parameters were just char* string1, then potentially the contents of the string we passed could be altered in some way. The const char* notation is our guarantee that the function cannot alter the contents of the string we pass. It is rather like making the string read-only.

The integer return code indicates the result:
    0 => the two strings are the same
    positive => the first string is larger
    negative => the first string is smaller

The New Way: To compare two strings, use either strcmp() or _stricmp(). strcmp() is a case-sensitive string compare function. _stricmp() is a case-insensitive string compare function. Both functions return an integer indicating the result of the comparison operation. The prototypes of the string comparison functions are these.
    int strcmp (const char* string1, const char* string2);
    int _stricmp (const char* string1, const char* string2);
It is showing that we pass it the two strings to be compared. Neither function may be passed a memory address that is zero or NULL. With strcmp() the result is undefined and is usually a crash; _stricmp() checks its parameters and, by default, terminates the program.

While the meaning of the results phrase, the two strings are the same, is obvious, the other two results might not be so clear. Character data is stored in an encoding scheme, often ASCII, the American Standard Code for Information Interchange. In this scheme, the decimal number 65 represents the letter A. The letter B is a 66; C, a 67, and so on. If the first string begins with the letter A and the second string begins with the letter B, then the first string is said to be smaller than the second string because the 65 is smaller than the 66. The comparison function returns a value such as that given by 'A' - 'B' or (65 - 66), a negative number indicating that the first string is smaller than the second string. Only the sign of the result should be relied upon, not its exact value.

When comparing strings, one is more often testing for the equal or not equal situation. Applications that involve sorting or merging two sets of strings would make use of the smaller/larger possibilities.

To fix up the previous example in which we wanted to find out if the previousName was not equal to the currentName, we should code the following, assuming that case was important.
    if (strcmp (previousName, currentName) != 0) {
If we wanted to ignore case sensitivity issues, then code this.
    if (_stricmp (previousName, currentName) != 0) {
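The comparison rules can be illustrated in a form that compiles on any platform. Since _stricmp() is specific to the Microsoft compiler, the sketch below substitutes a hand-written function, CompareNoCase(), which is a made-up name and only an approximation of what _stricmp() does; NameChanged() is likewise invented for the illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical stand-in for a case-insensitive compare: walks both
// strings together, comparing lowercase versions of each character.
// Returns 0 if equal, negative if s1 is smaller, positive if larger.
int CompareNoCase (const char* s1, const char* s2) {
    assert (s1 != nullptr && s2 != nullptr);
    while (*s1 != '\0' &&
           std::tolower (static_cast<unsigned char> (*s1)) ==
           std::tolower (static_cast<unsigned char> (*s2))) {
        s1++;
        s2++;
    }
    return std::tolower (static_cast<unsigned char> (*s1)) -
           std::tolower (static_cast<unsigned char> (*s2));
}

// Compares the contents of the two arrays, not their memory addresses.
bool NameChanged (const char previousName[], const char currentName[]) {
    return std::strcmp (previousName, currentName) != 0;
}
```

With these, NameChanged ("Sam", "sam") is true because case matters to strcmp(), while CompareNoCase ("Sam", "sam") returns 0.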

Copying Strings
The older function to copy a string is strcpy(). Its prototype is
    char* strcpy (char* destination, const char* source);
It copies all characters including the null terminator of the source string, placing them in the destination string. In the previous example where we wanted to copy the currentName into the previousName field, we code
    strcpy (previousName, currentName);
Of course, the destination string should have sufficient characters in its array to store all the characters contained in the source string. If not, a memory overlay occurs. For example, suppose one has defined the following two strings
    char source[20] = "Hello World";
    char dest[5];
If one copies the source string to the destination string, memory is overlain.
    strcpy (dest, source);
Seven bytes of memory are clobbered in this case and contain a blank, the characters World and the null terminator.

This clobbering of memory, the core overlay, or more politically correct, buffer overrun, has taken its toll on not only Microsoft coding but many other applications. Hackers and virus writers often take advantage of this inherently insecure function to overwrite memory with malicious machine instructions. Hence, Microsoft has unilaterally decided to rewrite the standard C libraries to prevent such from occurring. As of this publication, Microsoft's changes are not in the ANSI C++ standard.

The new string copy function looks like this.
    errno_t strcpy_s (char* destination, size_t destSize, const char* source);
It copies all characters including the null terminator of the source string, placing them in the destination string, subject to not exceeding the maximum number of bytes of the destination string. It returns zero when the copy succeeds. In all cases, the destination string will be null terminated. However, if the source or destination memory address is 0 or if the destination string is too small to hold the result, the program is basically terminated at run time. In a later course, a program can prevent this abnormal termination and do something about the problem.

In the previous example where we wanted to copy the currentName into the previousName field, we now code
    strcpy_s (previousName, sizeof (previousName), currentName);
This text will consistently use these new Microsoft changes. If you are using another compiler, either use the samples provided in the 2002-3 samples folder or remove the sizeof parameter along with the _s in the function names.
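For readers whose compiler lacks strcpy_s(), the idea of a size-checked copy can be sketched by hand. The function below, CopyString(), is a made-up name and is not Microsoft's implementation; in particular, it reports failure with a bool and leaves an empty string behind instead of terminating the program, which is a design choice of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical size-checked copy. Returns true if the source, with its
// null terminator, fit into the destination; otherwise stores an empty
// string and returns false. Nothing is ever written past destSize bytes.
bool CopyString (char destination[], std::size_t destSize, const char* source) {
    assert (destination != nullptr && source != nullptr && destSize > 0);
    std::size_t needed = std::strlen (source) + 1;   // characters + terminator
    if (needed > destSize) {
        destination[0] = '\0';
        return false;
    }
    std::memcpy (destination, source, needed);
    return true;
}
```

The earlier overlay example becomes harmless: CopyString (dest, sizeof (dest), source) with the 5-byte dest and "Hello World" returns false and leaves the memory beyond dest untouched.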

Getting the Actual Number of Characters Currently in a String


The next most frequently used string function is strlen(), which returns the number of bytes that the string currently contains. Suppose that we had defined
    char name[21] = "Sam";
If we code the following
    int len = strlen (name); // returns 3 bytes
    int size = sizeof (name); // returns 21 bytes
then the strlen(name) function would return 3. Notice that strlen() does NOT count the null terminator. The sizeof(name) gives the defined number of bytes that the variable contains, or 21 in this case. Notice the significant difference between these two operations.
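One practical use of the two values together is to find out how much room a string has left. The function RoomLeft() below is invented for this illustration; the capacity is passed as a parameter because, inside a function, sizeof applied to the array parameter would give the size of a memory address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns how many more characters can be added
// to a string, given the defined size of its array. One byte is always
// held back for the null terminator.
std::size_t RoomLeft (const char text[], std::size_t capacity) {
    assert (text != nullptr);
    std::size_t used = std::strlen (text);   // characters currently stored
    assert (used < capacity);                // the terminator must fit as well
    return capacity - 1 - used;
}
```

For the name array above, RoomLeft (name, sizeof (name)) gives 17: twenty-one bytes defined, three in use, one reserved for the terminator.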

Concatenating or Joining Two Strings into One Larger String


Again, there is a new version of this function in .NET 2005. The older function is the strcat() function, which appends one string onto the end of another string, forming a concatenation of the two strings. Suppose that we had defined
    char drive[3] = "C:";
    char path[_MAX_PATH] = "\\Programming\\Samples";
    char name[_MAX_PATH] = "test.txt";
    char fullfilename[_MAX_PATH];
In reality, when users install an application, they can place it on nearly any drive and nearly any path. However, the application does know the filename and then has to join the pieces together. The objective here is to join the filename components into a complete file specification so that the fullfilename field can then be passed to the ifstream open() function. The sequence would be
    strcpy (fullfilename, drive); // copy the drive string
    strcat (fullfilename, path); // append the path
    strcat (fullfilename, "\\"); // append the \
    strcat (fullfilename, name); // append filename
    infile.open (fullfilename, ios::in); // open the file
The new version is strcat_s(), which now takes the destination maximum number of bytes as the second parameter, before the source string. The above sequence using the newer functions is this.
    strcpy_s (fullfilename, _MAX_PATH, drive); // copy the drive
    strcat_s (fullfilename, _MAX_PATH, path); // append the path
    strcat_s (fullfilename, _MAX_PATH, "\\"); // append the \
    strcat_s (fullfilename, _MAX_PATH, name); // append filename
    infile.open (fullfilename, ios::in); // open the file
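On compilers without strcat_s(), one portable way to join the pieces with a size check is the standard snprintf() function. This is a different technique from the one in the text, shown only as a sketch; BuildFullName() is a made-up name, and checking the total length first is a choice of this example so that a result that does not fit leaves an empty string rather than a truncated path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: joins drive, path and name into one complete
// file specification, with a backslash between the path and the name.
// Returns false, storing an empty string, if the result cannot fit.
bool BuildFullName (char fullfilename[], std::size_t size,
                    const char* drive, const char* path, const char* name) {
    assert (fullfilename != nullptr && size > 0);
    assert (drive != nullptr && path != nullptr && name != nullptr);
    // All characters, plus one backslash, plus the null terminator.
    std::size_t needed = std::strlen (drive) + std::strlen (path)
                       + 1 + std::strlen (name) + 1;
    if (needed > size) {
        fullfilename[0] = '\0';
        return false;
    }
    std::snprintf (fullfilename, size, "%s%s\\%s", drive, path, name);
    return true;
}
```

Called with the fields defined above, BuildFullName (fullfilename, sizeof (fullfilename), drive, path, name) leaves C:\Programming\Samples\test.txt in fullfilename, ready to be passed to open().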

The String Functions


There are a number of other string functions that are available. The next table lists some of these and their use. The prototypes of all of these are in <cstring>. The data type size_t is really an unsigned integer type.

Name: strlen
Meaning: string length function
Prototype: size_t strlen (const char* string);
Action done: returns the current length of the string. size_t is another name for an unsigned integer type.
Example:
    char s1[10] = "Sam";
    char s2[10] = "";
    strlen (s1) yields 3
    strlen (s2) yields 0

Name-old: strcmp and stricmp
Meaning: string compare, case sensitive and case insensitive
Prototype: int strcmp (const char* string1, const char* string2);
           int stricmp (const char* string1, const char* string2);
Action done: strcmp does a case-sensitive comparison of the two strings, beginning with the first character of each string. It returns 0 if all characters in both strings are the same. It returns a negative value if the first differing character in string1 is less than that in string2. It returns a positive value if it is larger.
Example:
    char s1[10] = "Bcd";
    char s2[10] = "Bcd";
    char s3[10] = "Abc";
    char s4[10] = "Cde";
    char s5[10] = "bcd";
    strcmp (s1, s2) yields 0 - strings are equal
    stricmp (s1, s5) yields 0 - strings are equal
    strcmp (s1, s3) yields a + value - s1 > s3
    strcmp (s1, s4) yields a - value - s1 < s4

Name-new: strcmp and _stricmp
Meaning: string compare, case sensitive and case insensitive
Prototype: int strcmp (const char* string1, const char* string2);
           int _stricmp (const char* string1, const char* string2);
Action done: strcmp does a case-sensitive comparison of the two strings, beginning with the first character of each string. It returns 0 if all characters in both strings are the same. It returns a negative value if the first differing character in string1 is less than that in string2. It returns a positive value if it is larger. Neither function may be passed a memory address that is null or 0; the program is normally aborted.
Example:
    char s1[10] = "Bcd";
    char s2[10] = "Bcd";
    char s3[10] = "Abc";
    char s4[10] = "Cde";
    char s5[10] = "bcd";
    strcmp (s1, s2) yields 0 - strings are equal
    _stricmp (s1, s5) yields 0 - strings are equal
    strcmp (s1, s3) yields a + value - s1 > s3
    strcmp (s1, s4) yields a - value - s1 < s4

Name-old: strcat
Meaning: string concatenation
Prototype: char* strcat (char* desString, const char* srcString);
Action done: The srcString is appended onto the end of the desString. Returns the desString address.
Example:
    char s1[20] = "Hello";
    char s2[10] = " World";
    strcat (s1, s2); yields "Hello World" in s1.

Name-new: strcat_s
Meaning: string concatenation
Prototype: errno_t strcat_s (char* desString, size_t maxDestSize, const char* srcString);
Action done: The srcString is appended onto the end of the desString. Returns zero if successful. Aborts the program if the destination is too small.
Example:
    char s1[20] = "Hello";
    char s2[10] = " World";
    strcat_s (s1, sizeof(s1), s2); yields "Hello World" in s1.


Name-old: strcpy
Meaning: string copy
Prototype: char* strcpy (char* desString, const char* srcString);
Action done: All bytes of the srcString are copied into the destination string, including the null terminator. The function returns the desString memory address.
Example:
    char s1[10];
    char s2[10] = "Sam";
    strcpy (s1, s2);
When done, s1 now contains "Sam".

Name-new: strcpy_s
Meaning: string copy
Prototype: errno_t strcpy_s (char* desString, size_t maxDestSize, const char* srcString);
Action done: All bytes of the srcString are copied into the destination string, including the null terminator. The function returns 0 if successful. It aborts the program if the destination is too small.
Example:
    char s1[10];
    char s2[10] = "Sam";
    strcpy_s (s1, sizeof (s1), s2);
When done, s1 now contains "Sam".

Name: strchr
Meaning: search string for first occurrence of the character
Prototype: char* strchr (const char* srcString, int findChar);
Action done: returns the memory address or char* of the first occurrence of the findChar in the srcString. If findChar is not in the srcString, it returns NULL or 0.
Example:
    char s1[10] = "Burr";
    char* found = strchr (s1, 'r');
returns the memory address of the first letter r character, so that found[0] would give you that 'r'.

Name: strstr
Meaning: search string1 for the first occurrence of find string
Prototype: char* strstr (const char* string1, const char* findThisString);
Action done: returns the memory address (char*) of the first occurrence of findThisString in string1 or NULL (0) if it is not present.
Example:
    char s1[10] = "abcabc";
    char s2[10] = "abcdef";
    char* firstOccurrence = strstr (s1, "abc");
It finds the first abc in s1 and firstOccurrence has the same memory address as s1, so that s1[0] and firstOccurrence[0] both contain the first letter 'a' of the string.
    char* where = strstr (s2, "def");
Here where contains the memory address of the 'd' in s2.

Name-old: strlwr
Meaning: string to lowercase
Prototype: char* strlwr (char* string);
Action done: All uppercase letters in the string are converted to lowercase letters. All others are left untouched.
Example:
    char s1[10] = "Hello 123";
    strlwr (s1);
Yields "hello 123" in s1 when done.

Name-new: _strlwr_s
Meaning: string to lowercase
Prototype: errno_t _strlwr_s (char* string, size_t maxSizeOfString);
Action done: All uppercase letters in the string are converted to lowercase letters. All others are left untouched.
Example:
    char s1[10] = "Hello 123";
    _strlwr_s (s1, sizeof (s1));
Yields "hello 123" in s1 when done.

Name-old: strupr
Meaning: convert a string to uppercase
Prototype: char* strupr (char* string);
Action done: Any lowercase letters in the string are converted to uppercase; all others are untouched.
Example:
    char s1[10] = "Hello 123";
    strupr (s1);
When done, s1 contains "HELLO 123"

Name-new: _strupr_s
Meaning: convert a string to uppercase
Prototype: errno_t _strupr_s (char* string, size_t maxSizeOfString);
Action done: Any lowercase letters in the string are converted to uppercase; all others are untouched.
Example:
    char s1[10] = "Hello 123";
    _strupr_s (s1, sizeof(s1));
When done, s1 contains "HELLO 123"

Name-old: strrev
Meaning: string reverse
Prototype: char* strrev (char* string);
Action done: Reverses the characters in a string.
Example:
    char s1[10] = "Hello";
    strrev (s1);
When done, s1 contains "olleH"

Name-new: _strrev
Meaning: string reverse
Prototype: char* _strrev (char* string);
Action done: Reverses the characters in a string. It aborts the program if the memory address passed is null or 0.
Example:
    char s1[10] = "Hello";
    _strrev (s1);
When done, s1 contains "olleH"
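Several entries in the table are specific to one compiler. As a minimal sketch, assuming only the standard functions in <cstring> and <cctype>, the same operations can be wrapped as shown here. The helper names compareSign, toLowerInPlace and indexOfChar are hypothetical and are not library functions; toLowerInPlace is only a portable stand-in for the strlwr family.

```cpp
#include <cctype>
#include <cstring>

// Hypothetical helper: reduces the result of strcmp to -1, 0 or +1,
// since the standard only promises the sign of the returned value.
int compareSign (const char* string1, const char* string2) {
  int result = std::strcmp (string1, string2);
  return (result > 0) - (result < 0);
}

// Hypothetical helper: portable lowercase conversion in place,
// standing in for the compiler-specific strlwr functions.
char* toLowerInPlace (char* string) {
  for (std::size_t i = 0; string[i] != 0; i++)
    string[i] = (char) std::tolower ((unsigned char) string[i]);
  return string;
}

// Hypothetical helper: turns the address returned by strchr into a
// subscript, or -1 when the character is not present.
int indexOfChar (const char* srcString, char findChar) {
  const char* found = std::strchr (srcString, findChar);
  return found ? (int) (found - srcString) : -1;
}
```

For instance, compareSign ("Bcd", "Abc") gives +1 and indexOfChar ("Burr", 'r') gives 2, the subscript of the first r.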

How Could String Functions Be Implemented?


Next, let's examine how the strcpy() and strcmp() functions could be implemented using array notation. The strcpy() function must copy all bytes from the srcString into the desString, including the null-terminator. It could be done as follows.

    char* strcpy (char* desString, const char* srcString) {
     int i = 0;
     while (desString[i] = srcString[i])
      i++;
     return desString;
    }

The while clause first copies a character from the source into the destination string. Then it tests the character it just copied. If that character was not equal to zero, the body of the loop is executed; i is incremented for the next character. If the character just copied was the null terminator, the test condition is false and the loop ends.

Here is how the strcmp() function might be implemented using array notation.

    int strcmp (const char* string1, const char* string2) {
     int i = 0;
     while (string1[i] && string1[i] == string2[i])
      i++;
     return string1[i] - string2[i];
    }

The first test in the while clause is checking to see if we are at the null terminator of string1. If so, the loop ends. If not, then the corresponding characters of string1 and string2 are compared. If those two characters are equal, the loop body is executed and i is incremented. If the two characters are different, the loop also ends. To create the return integer, the current characters are subtracted. If the two strings are indeed equal, then both bytes must be the null terminators of the respective strings; the return value is then 0. Otherwise, the return value depends on the ASCII numerical values of the corresponding characters.
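The two array-notation implementations can be tried out directly if they are given names that do not clash with the library versions. The following is a minimal sketch under that assumption; copyString and compareStrings are hypothetical names, and the comparison is done on unsigned char values, which is how the library strcmp orders characters.

```cpp
#include <cstddef>

// Hypothetical name: same logic as the strcpy sketch in the text.
char* copyString (char* desString, const char* srcString) {
  std::size_t i = 0;
  // the assignment copies one byte; the loop ends once the 0 byte is copied
  while ((desString[i] = srcString[i]) != 0)
    i++;
  return desString;
}

// Hypothetical name: same logic as the strcmp sketch in the text.
int compareStrings (const char* string1, const char* string2) {
  std::size_t i = 0;
  // stop at the end of string1 or at the first differing character
  while (string1[i] && string1[i] == string2[i])
    i++;
  // subtract as unsigned char so characters above 127 order correctly
  return (unsigned char) string1[i] - (unsigned char) string2[i];
}
```

Comparing "ab" with "abc" ends the loop at the null terminator of the first string, so the result is 0 minus 'c', a negative value.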

Section B: A Computer Science Example


Cs11a Character String Manipulation - Customer Names
One task frequently encountered when applications work with people's names is that of conversion from "firstname lastname" into "lastname, firstname". This problem explores some techniques to handle the conversion.

There are many ways to accomplish splitting a name apart. Since the use of pointer variables (variables that contain the memory addresses of things) has not yet been discussed, the approach here is to use subscripting to accomplish it. Indeed, a programmer does need to be able to manipulate the contents of a string as well as utilize the higher level string functions. This example illustrates low-level character manipulation within strings as well as utilizing some commonly used string functions.

The problem is to take a customer name, such as "John Jones", and extract the first and last names ("John" and "Jones") and then to turn it into the alternate comma delimited form, "Jones, John". Alternatively, take the comma form and extract the first and last names.

At first glance, the approach to take seems simple enough. When extracting the first and last names from the full name, look for a blank delimiter and take what's to the left of it as the first name and what's to the right as the last name. But what about names like "Mr. and Mrs. John J. Jones"? To find the last name portion, begin on the right or at the end of the string and move through the string in reverse direction looking for the first blank. That works fine until one encounters "John J. Jones, Jr.". So we must make a further qualification on that first blank, and that is, there must not be a comma immediately in front of it. If there is, ignore that blank and keep moving toward the beginning of the string.

When extracting the first and last names from the comma version (such as "Jones, John J."), we can look for the comma followed by a blank pair. However, what about this one, "Jones, Jr., John J."? Clearly we need to start at the end of the string and work toward the beginning of the string in the search for the comma-blank pair.

Once we know the subscript of the blank or the comma-blank pair, how can the pieces be copied into the first and last name strings? This is done by copying byte by byte from some starting subscript through some ending subscript, appending a null terminator when finished. In this problem, I have made a helper function, CopyPartialString() to do just that.

The function NameToParts() takes a full name and breaks it into first and last name strings. The original passed full name string is not altered and is declared constant.

The function NameToCommaForm() takes the first and last names and converts them into the comma-formatted name, "last, first". Since the first and last names are not altered, those parameters are also declared constant.

The function CommaFormToNames() converts a comma-formatted name into first and last names. Since the comma-formatted name is not altered, it is also declared constant.

Let's begin by examining the output of the program to see what is needed. Here is the test run of Cs11a.
Cs11a Character String Manipulation - Sample Execution

    Original Name: |John J. Jones|
       First Name: |John J.|
        Last Name: |Jones|
       Comma Form: |Jones, John J.|
     First and Last from comma form test ok

    Original Name: |Betsy Smith|
       First Name: |Betsy|
        Last Name: |Smith|
       Comma Form: |Smith, Betsy|
     First and Last from comma form test ok

    Original Name: |Mr. and Mrs. R. J. Smith|
       First Name: |Mr. and Mrs. R. J.|
        Last Name: |Smith|
       Comma Form: |Smith, Mr. and Mrs. R. J.|
     First and Last from comma form test ok

    Original Name: |Prof. William Q. Jones|
       First Name: |Prof. William Q.|
        Last Name: |Jones|
       Comma Form: |Jones, Prof. William Q.|
     First and Last from comma form test ok

    Original Name: |J. J. Jones|
       First Name: |J. J.|
        Last Name: |Jones|
       Comma Form: |Jones, J. J.|
     First and Last from comma form test ok

    Original Name: |Jones|
       First Name: ||
        Last Name: |Jones|
       Comma Form: |Jones|
     First and Last from comma form test ok

    Original Name: |Mr. John J. Jones, Jr.|
       First Name: |Mr. John J.|
        Last Name: |Jones, Jr.|
       Comma Form: |Jones, Jr., Mr. John J.|
     First and Last from comma form test ok

    Original Name: |Mr. John J. Jones, II|
       First Name: |Mr. John J.|
        Last Name: |Jones, II|
       Comma Form: |Jones, II, Mr. John J.|
     First and Last from comma form test ok

    Original Name: |Mr. John J. Jones, MD.|
       First Name: |Mr. John J.|
        Last Name: |Jones, MD.|
       Comma Form: |Jones, MD., Mr. John J.|
     First and Last from comma form test ok

    Original Name: |The Honorable Betsy Smith|
       First Name: |The Honorable Betsy|
        Last Name: |Smith|
       Comma Form: |Smith, The Honorable Betsy|
     First and Last from comma form test ok

    Original Name: |Betsy O'Neill|
       First Name: |Betsy|
        Last Name: |O'Neill|
       Comma Form: |O'Neill, Betsy|
     First and Last from comma form test ok

Program Cs11a is going to input a file of customer names. For each name, it first converts that full name into first and last names. Then, it forms the comma-blank version of the full name. And finally, it extracts the first and last names from the comma-blank string. If the first and last names from the two approaches do not agree, an error is written. If they agree, an Okay message is displayed. Since blanks are important to this problem and since a blank is hard to spot, the | character is printed before and after each string, making errant blanks quite visible. The Top-Down Design is shown in Figure 11.1.


Figure 11.1 Top-Down Design of Name Program

The main() function defines the arrays as shown in Figure 11.2. The sequence of processing steps for main() is as follows.

    open the input file; if it fails, display an error message and quit
    while we have successfully inputted a line into fullName do the following
        call NameToParts (fullName, firstName, lastName, MaxNameLen);
        call NameToCommaForm (commaName, firstName, lastName);
        call CommaFormToNames (commaName, firstFromCommaForm, lastFromCommaForm);
        output the results which include fullName, firstName, lastName and commaName
        if firstName and firstFromCommaForm are the same as well as lastName and lastFromCommaForm then
            output an Ok message
        else
            display an error message and the firstFromCommaForm and lastFromCommaForm
    end the while clause
    close the input file

Figure 11.2 Main Storage for main()


NameToParts() must break a full name into first and last names and is passed four parameters: fullName, firstName, lastName, and limit. As we work out the sequence of coding, let's work with a specific example. Suppose that fullName contains the following, where the 0 indicates the null terminator. I have written the subscripts below the corresponding characters.

    Mr. John J. Jones, MD.0
    00000000001111111111222
    01234567890123456789012

The strlen(fullName) yields 22 characters as the current length and the subscript for the last character in the string is thus 21. So working from the end of the string, look for a blank that does not have a comma immediately in front of it.

    i = strlen (fullName) - 1;
    while (i >= 0) do the following
        does fullName[i] == ' '? If so do the following
            if there is a previous character, that is, is i > 0,
            and if that previous character is not a comma, fullName[i - 1] != ',', then
                // we have found the spot so we need to break out of the loop
                break; with i on the blank
            end the if test
        end the does clause
        back up to the previous character, i--;
    end the while clause

Now split out the two names. Notice we pass i+1, which is the first non-blank character in the last name.

    CopyParitalString (lastName, fullName, i+1, strlen (fullName));
    CopyParitalString (firstName, fullName, 0, i);

The CopyParitalString() function's purpose is to copy a series of characters in a source string from some beginning subscript up to, but not including, an ending subscript and then insert a null terminator. It is passed the dest string, the src string, startAt and endAt.

    is startAt >= endAt? If so, we are starting at or after the ending point and there is
    nothing to copy, so just make the dest string a properly null-terminated string.
        dest[0] = 0 and return
    end is

To copy the characters, we need a subscript variable for each string, isrc and ides.

    let isrc = startAt
    let ides = 0;

Now copy all characters from startAt up to endAt.

    while isrc < endAt do the following
        dest[ides] = src[isrc];
        increment both isrc and ides
    end while

Finally, insert the null terminator.

    dest[ides] = 0;
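The right-to-left search described above can be isolated into a small function and checked against the worked example. This is only a sketch; findSplitBlank is a hypothetical helper name, not part of the program, and it returns -1 when no qualifying blank exists (the single-name case).

```cpp
#include <cstring>

// Hypothetical helper: returns the subscript of the blank that separates
// the first-name portion from the last-name portion, or -1 if none exists.
int findSplitBlank (const char fullName[]) {
  int i = (int) std::strlen (fullName) - 1;
  while (i >= 0) {
    // a usable blank needs a previous character that is not a comma
    if (fullName[i] == ' ' && i > 0 && fullName[i - 1] != ',')
      break;
    i--;
  }
  return i;
}
```

For "Mr. John J. Jones, MD." the blank at subscript 18 is skipped because a comma precedes it, and the search stops on the blank at subscript 11.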

The NameToCommaForm() function is comparatively simple. From two strings containing the first and last names, make one combined new string of the form "last name, first name". However, in some cases, there might not be any first name. In that case, the result should just be a copy of the last name string. NameToCommaForm() is passed three strings: the answer string to fill up, commaName, and the two source strings firstName and lastName. The sequence is as follows.

    strcpy (commaName, lastName);
    if a first name exists, that is, does strlen (firstName) != 0, if so do
        append a comma and a blank - strcat (commaName, ", ")
        append the first name - strcat (commaName, firstName)
    end if

The CommaFormToNames() function must convert a single string with the form of "last name, first name" into first and last name strings. It is passed commaName to convert and the two strings to fill up - firstName and lastName. This time, we again begin at the end of the string looking for the first comma followed by a blank. Consider these two cases.

    Jones, Jr., Mr. John J.0
    Jones, Prof. William Q.0

Clearly, we want to stop at the first ", " occurrence found from the right to avoid problems with "Jr.".

    let len = strlen (commaName)
    let commaAt = len - 2
    while commaAt > 0 do the following
        if the current character at commaAt is a ',' and the character at commaAt + 1 is a blank, then
            break out of the loop
        back up commaAt
    end the while clause

However, this could be compacted a bit more by using ! (not) logic in the while test condition.

    while (commaAt > 0 &&
           !(commaName[commaAt] == ',' && commaName[commaAt+1] == ' ')) {

When the loop ends, we must guard against no comma and blank found.

    if (commaAt <= 0) then there is no comma so do the following
        strcpy (lastName, commaName)
        firstName[0] = 0 and return
    end the if

Finally, at this point, we have found the ", " portion; copy the two portions as follows.

    CopyParitalString (lastName, commaName, 0, commaAt)
    CopyParitalString (firstName, commaName, commaAt+2, len)

As you study the coding, draw some pictures of some test data and trace what is occurring if you have any doubts about what is going on. Here is the complete program.

Cs11a Character String Manipulation

/***************************************************************/
/*                                                             */
/* Cs11a Character String Manipulation - Customer Names        */
/*                                                             */
/***************************************************************/

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <cstring>  // strlen, strcmp, strcpy_s, strcat_s
using namespace std;

const int MaxNameLen = 51; // the maximum length of names

void NameToParts (const char fullName[], // converts a full name
                  char firstName[],      // to first & last names
                  char lastName[],
                  int limit);

void NameToCommaForm (char commaName[],       // converts a first and
                      const char firstName[], // last name into a
                      const char lastName[]); // full name string

void CommaFormToNames (const char commaName[],     // converts a comma
                       char firstFromCommaForm[],  // form of name into
                       char lastFromCommaForm[]);  // first & last names

void CopyParitalString (char dest[],      // copies a part of the src
                        const char src[], // string into the dest
                        int startAt,      // beginning at startAt and
                        int endAt);       // ending at endAt

int main () {
 char fullName[MaxNameLen];           // original full name as input
 char firstName[MaxNameLen];          // first name from full name
 char lastName[MaxNameLen];           // last name from full name
 char commaName[MaxNameLen];          // full name in comma form
 char firstFromCommaForm[MaxNameLen]; // first name from comma form
 char lastFromCommaForm[MaxNameLen];  // last name from comma form

 ifstream infile ("Cs11a-Names.txt");
 if (!infile) {
  cerr << "Error: cannot find the names file\n";
  return 1;
 }
 ofstream out ("results.txt");

 while (infile.getline (fullName, sizeof (fullName))) {
  // break full name inputted into first and last names
  NameToParts (fullName, firstName, lastName, MaxNameLen);

  // turn first and last names into a comma form of full name
  NameToCommaForm (commaName, firstName, lastName);

  // break comma form of full name into first and last names
  CommaFormToNames (commaName, firstFromCommaForm,
                    lastFromCommaForm);

  // output results
  out << "Original Name: |" << fullName << '|' << endl;
  out << "   First Name: |" << firstName << '|' << endl;
  out << "    Last Name: |" << lastName << '|' << endl;
  out << "   Comma Form: |" << commaName << '|' << endl;

  // test that first and last names agree from both forms
  // of extraction
  if (strcmp (firstName, firstFromCommaForm) == 0 &&
      strcmp (lastName, lastFromCommaForm) == 0)
   out << " First and Last from comma form test ok" << endl;
  else {
   out << " Error from comma form - does not match\n";
   out << "   First Name: |" << firstFromCommaForm << '|' << endl;
   out << "    Last Name: |" << lastFromCommaForm << '|' << endl;
  }
  out << endl;
 }
 infile.close ();
 out.close ();
 return 0;
}

/***************************************************************/
/*                                                             */
/* CopyParitalString: copies src from startAt up to endAt      */
/*                                                             */
/***************************************************************/

void CopyParitalString (char dest[], const char src[],
                        int startAt, int endAt) {
 if (startAt >= endAt) { // avoid starting after ending
  dest[0] = 0;           // just set dest string to a null string
  return;
 }

 int isrc = startAt;
 int ides = 0;
 // copy all needed chars from startAt up to, not including, endAt
 for (; isrc<endAt; isrc++, ides++) {
  dest[ides] = src[isrc];
 }
 dest[ides] = 0; // insert null terminator
}

/***************************************************************/
/*                                                             */
/* NameToParts: break a full name into first and last name     */
/*                                                             */
/***************************************************************/

void NameToParts (const char fullName[], char firstName[],
                  char lastName[], int limit) {
 // working from the end of the string, look for blank separator
 // that does not have a , immediately in front of it
 int i = (int) strlen (fullName) - 1;
 while (i >= 0) {
  if (fullName[i] == ' ') {               // found a blank and
   if (i>0 && fullName[i-1] != ',') {     // earlier char is not a ,
    break;                                // end with i on the blank
   }
  }
  i--;
 }
 CopyParitalString (lastName, fullName, i+1, (int)strlen(fullName));
 CopyParitalString (firstName, fullName, 0, i);
}

/***************************************************************/
/*                                                             */
/* NameToCommaForm: from first & last names, make last, first  */
/*                                                             */
/***************************************************************/

void NameToCommaForm (char commaName[], const char firstName[],
                      const char lastName[]) {
 strcpy_s (commaName, MaxNameLen, lastName);
 if (strlen (firstName)) {                      // if a first name exists,
  strcat_s (commaName, MaxNameLen, ", ");       // add a , and blank
  strcat_s (commaName, MaxNameLen, firstName);  // add first name
 }
}

/***************************************************************/
/*                                                             */
/* CommaFormToNames: convert a last, first name to first & last*/
/*                                                             */
/***************************************************************/

void CommaFormToNames (const char commaName[], char firstName[],
                       char lastName[]) {
 // begin at the end and look for a ,blank
 int len = (int) strlen (commaName);
 int commaAt = len - 2;
 while (commaAt > 0 &&
      !(commaName[commaAt] == ',' && commaName[commaAt+1] == ' ')) {
  commaAt--;
 }
 if (commaAt <= 0) {  // here there is no comma so
  strcpy_s (lastName, MaxNameLen, commaName);
  firstName[0] = 0;   // set first name to null string
  return;
 }
 CopyParitalString (lastName, commaName, 0, commaAt);
 CopyParitalString (firstName, commaName, commaAt+2, len);
}
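The program relies on the Microsoft-specific strcpy_s and strcat_s. For comparison only, here is a sketch of the same split and join written with the standard std::string class instead of character arrays, which is a different technique from the one this chapter teaches. The names NameParts, splitFullName and toCommaForm are hypothetical.

```cpp
#include <string>

// Hypothetical container for the two pieces of a name.
struct NameParts {
  std::string first;
  std::string last;
};

// Hypothetical std::string counterpart of NameToParts.
NameParts splitFullName (const std::string& fullName) {
  // search right to left for a blank that is not preceded by a comma
  std::string::size_type pos = fullName.rfind (' ');
  while (pos != std::string::npos) {
    if (pos > 0 && fullName[pos - 1] != ',')
      break;                        // found the separating blank
    if (pos == 0) {                 // a leading blank cannot separate names
      pos = std::string::npos;
      break;
    }
    pos = fullName.rfind (' ', pos - 1);
  }
  NameParts parts;
  if (pos == std::string::npos)
    parts.last = fullName;          // no first name present
  else {
    parts.first = fullName.substr (0, pos);
    parts.last = fullName.substr (pos + 1);
  }
  return parts;
}

// Hypothetical std::string counterpart of NameToCommaForm.
std::string toCommaForm (const NameParts& parts) {
  return parts.first.empty () ? parts.last
                              : parts.last + ", " + parts.first;
}
```

With std::string there is no fixed array size to exceed, so the size arguments disappear; the subscript logic for locating the blank is unchanged.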

Section C: An Engineering Example


Engineering problems primarily make use of strings as labels or identifiers associated with a set of numerical values.

Engr11a Weather Statistics Revisited


On a daily basis, weather statistics for cities scattered around the state are collected, summarized and forwarded to our center for processing. Our company maintains an Internet web page that lists the unusual weather occurrences within the last 24-hour period. Write a program that inputs the daily weather file and displays those cities with unusual weather in a nicely formatted report.

An input line consists of the city surrounded by double quote marks, such as "Peoria". Next come the high and low temperatures, the rainfall amount, the snowfall amount, and wind speed. Unusual weather is defined to be a high temperature above 95, a low temperature below 0, a rainfall amount in excess of two inches, snowfall accumulations in excess of six inches or a wind speed greater than 45 mph.

Since each day's data is stored in a different file, the program first should prompt the user to enter the filename to be used for the input. Also prompt the user for the output file to which the report is to be written. An output line might appear as

    City          Hi    Low   Rain   Snow   Winds

    Peoria        85    55    0      0      55*
    Washington    99*   75    0      0      10

A * character is placed after the weather statistic that is unusual.
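Reading a quote-delimited city name followed by five numbers can be sketched on a single line of text using an input string stream. This is an illustration under stated assumptions, not the program's own code: WeatherLine and parseWeatherLine are hypothetical names, and the 21-byte city array matches the 20-character limit used in the listing that follows.

```cpp
#include <sstream>
#include <string>

const int MaxCityLen = 21;  // assumed: 20 characters plus the null terminator

// Hypothetical record holding one input line.
struct WeatherLine {
  char city[MaxCityLen];
  float high, low, rainfall, snowfall, windspeed;
};

// Hypothetical helper: returns true if the whole line was read correctly.
bool parseWeatherLine (const std::string& text, WeatherLine& rec) {
  std::istringstream in (text);
  char junk;
  // skip leading whitespace and input the leading " of the city name
  if (!(in >> junk) || junk != '"')
    return false;
  in.get (rec.city, sizeof (rec.city), '"');  // stops in front of the closing "
  in.get (junk);                              // remove the closing "
  in >> rec.high >> rec.low >> rec.rainfall >> rec.snowfall >> rec.windspeed;
  return !in.fail ();
}
```

Because get() leaves the delimiter in the stream, the extra in.get (junk) is needed before the numbers can be extracted; a line with missing numbers leaves the stream in the fail state and the function reports false.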


Since this problem is quite basic, I have not included the coding sketch. By now, the logic should be obvious. Here are the program listing and the sample output. Make sure you examine the instructions that process the new string variables.
Listing for Program Engr11a - Unusual Weather Statistics

/***************************************************************/
/*                                                             */
/* Engr11a: Unusual Weather Statistics report                  */
/*                                                             */
/***************************************************************/

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <cstdlib>  // _MAX_PATH (Microsoft-specific)
using namespace std;

const int MaxCityLen = 21; // city name length is 20 chars

int main () {

 char infilename[_MAX_PATH];
 char reportname[_MAX_PATH];
 cout << "Enter the filename with today's weather data\n";
 cin.getline (infilename, sizeof (infilename));
 cout << "\nEnter the report filename\n";
 cin.getline (reportname, sizeof(reportname));

 ifstream infile;
 infile.open (infilename);
 if (!infile) {
  cerr << "Error: cannot open file: " << infilename << endl;
  return 1;
 }

 ofstream outfile;
 outfile.open (reportname, ios::out);
 if (!outfile) {
  cerr << "Error: cannot open file: " << reportname << endl;
  return 1;
 }
 // setup floating point output format
 outfile << fixed << setprecision (1);

 outfile << "Unusual Weather Report\n\n";
 outfile<<"City                     High     Low    Rain    Snow"
          "    Wind\n";
 outfile<<"                                         Fall    Fall"
          "   Speed\n\n";

 char  city [MaxCityLen]; // string to hold city name
 float high;              // high temperature of the day - F
 float low;               // low temperature of the day - F
 float rainfall;          // rainfall in inches
 float snowfall;          // snowfall in inches
 float windspeed;         // wind speed in mph

 char junk;               // to hold the " around city names
 int line = 0;            // line count for error processing

 while (infile >> junk) { // input the leading " of city
  line++;                 // count this line for error messages
  infile.get (city, sizeof (city), '\"');
  infile.get (junk);
  infile >> high >> low >> rainfall >> snowfall >> windspeed;
  // abort if there is incomplete or bad data
  if (!infile) {
   cerr << "Error: incomplete city data on line " << line << endl;
   infile.close ();
   outfile.close ();
   return 2;
  }
  if (high > 95 || low < 0 || rainfall > 2 || snowfall > 6 ||
      windspeed > 45) {
   // unusual weather - display this city data
   outfile << left << setw (22) << city << right
           << setw (7) << high;
   if (high > 95)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << low;
   if (low < 0)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << rainfall;
   if (rainfall > 2)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << snowfall;
   if (snowfall > 6)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << windspeed;
   if (windspeed > 45)
    outfile << '*';
   else
    outfile << ' ';
   outfile << endl;
  }
 }
 infile.close ();
 outfile.close ();
 return 0;
}

Engr11a - Unusual Weather Report Output

Unusual Weather Report

City                     High     Low    Rain    Snow    Wind
                                         Fall    Fall   Speed

Washington               99.0*   70.0     0.0     0.0    20.0
Morton                   85.0    65.0     5.0*    0.0    40.0
Chicago                  32.0    -5.0*    0.0     8.0*   25.0
Joliet                   88.0    70.0     2.0     0.0    60.0*
Springfield              99.0*   75.0     3.0*    0.0    55.0*
New Salem                 0.0    -3.0*    0.0     9.0*   55.0*
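The listing repeats the same if-else five times to place a * or a blank after each statistic. One possible way to factor that out, offered only as a sketch, is a small helper that builds the 8-column field as a string. The name flaggedField is hypothetical; the width of 7 and the single decimal place are taken from the listing.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one statistic in a 7-column field with one
// decimal place, followed by '*' when unusual or a blank otherwise.
std::string flaggedField (float value, bool unusual) {
  std::ostringstream os;
  os << std::fixed << std::setprecision (1) << std::setw (7) << value
     << (unusual ? '*' : ' ');
  return os.str ();
}
```

With such a helper, the body of the report loop could be written as outfile << flaggedField (high, high > 95) << flaggedField (low, low < 0) and so on for the remaining three statistics.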

New Syntax Summary


A string is an array of char, so it is defined the same way as any other array.

    const int MAX = 10;
    char string [MAX];

Inputting a String:

Extraction:
    cin >> string;
Blanks end the extraction; thus the data cannot contain any embedded blanks. Further, if more than 9 characters are entered, memory is over-written or the program is aborted, depending upon the operating system platform and what is being clobbered. Extraction of a string can only be done in totally controlled situations.

All strings are the same length, padded with blanks to the max length:
Typically, any leading whitespace must be skipped so that the current position in the input stream is the first character of the string to be input.
    cin.get (string, sizeof(string));
    cin.getline (string, sizeof(string));

    istream& get (char* string, size_t maxlength, char delimiterCharacter);
    istream& getline (char* string, size_t maxlength, char delimiterCharacter);

Both functions input and store successive characters until
    a. eof is reached
    b. the maximum number of characters minus one for the null is input
    c. the delimiter code is found
By default, if not coded, the delimiter character is a new line code, \n. The only difference between the two functions is that, if the delimiter code is found, getline removes it from the input stream while get does not.

The string is the last item on an input line:
Typically, any leading whitespace must be skipped so that the current position in the input stream is the first character of the string to be input.
    cin.get (string, sizeof(string));
    cin.getline (string, sizeof(string));
Here each string is only as long as it needs to be; they are not padded with blanks.

The string ends with a delimiter code:
Typical delimiter codes are a " and a , (comma). Again, any leading whitespace must be skipped so that the current position in the input stream is the first character of the string to be input and the leading " must be inputted first.
    cin.get (string, sizeof(string), '\"');
    cin.getline (string, sizeof(string), '\"');
Here each string is only as long as it needs to be. If get is used, remember to next extract the trailing delimiter byte. If a comma ended the string, then use
    cin.get (string, sizeof(string), ',');
    cin.getline (string, sizeof(string), ',');

Output of a String:
Strings are normally displayed left-justified rather than with the default right justification. Hence, use the left and right manipulator functions.
    cout << left << setw (sizeof(string)+2) << string << right <<...

To work with strings, use the built-in string functions. Be alert for the version of the compiler you are using. .NET 2005 changed the string functions significantly.
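Two points of the summary can be checked with a short sketch that reads from a string stream instead of cin. The helper names nextCharAfterRead and boundedExtract are hypothetical. The second helper uses setw in front of the extraction, a standard way to limit how many characters the extraction operator stores into a char array; it is shown here as an aside and is not part of the summary above.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: reads text up to the delimiter with get() or
// getline() and returns the character that is next in the stream.
char nextCharAfterRead (const std::string& input, char delimiter,
                        bool useGetline) {
  std::istringstream in (input);
  char text[20];
  if (useGetline)
    in.getline (text, sizeof (text), delimiter);  // removes the delimiter
  else
    in.get (text, sizeof (text), delimiter);      // leaves the delimiter
  return (char) in.peek ();
}

// Hypothetical helper: extraction limited by setw, so at most
// sizeof (word) - 1 characters are stored, plus the null terminator.
std::string boundedExtract (const std::string& input) {
  std::istringstream in (input);
  char word[10];
  in >> std::setw ((int) sizeof (word)) >> word;
  return word;
}
```

After get() the next character is still the delimiter, while after getline() it is whatever follows the delimiter. With setw, an overly long word is cut off at nine characters rather than overwriting memory.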


Design Exercises
1. Design a Grade Book Program
The Grade Book Program inputs a set of students' grades for a semester. First, design the layout of the data file to be used for input and then design the program to produce the Grade Report shown below. The data consists of a student id number which is their social security number, their name which can be up to 20 characters long, the course name which can be up to 10 characters in length, the course number and finally the letter grade earned. Design how the input lines must be entered. Include in what order they are entered; pay particular attention to specifically how the student names are going to be entered on your lines.

The Grade Report produced by the program that is to input your data file appears as follows.

                     Student Grade Report

    Student      Student              ----Course----
    Id           Name                 Name    Number   Grade

    111111111    Sam J. Jones         Cmpsc   125      A
    ...

2. Design the Merge Conference Roster Program


Two sections of a conference course have been merged into one larger section. Each original section has a file of the attendee names. You are to write a program that merges the two into one new file. Each original file contains, in alphabetical order, the attendee names which can be up to 30 characters long, one name per line. The new file this program creates must also be in alphabetical order.


Stop! Do These Exercises Before Programming


1. A programmer needs to input the day of the week as a character string. The following coding failed to run properly. Why? What must be done to fix it up?
    char dayName[9];
    cin >> dayName;

2. A program needs to input the chemical compound names of two substances and then compare to see if the names are the same. The following was coded and compiles without errors but when run always produces the wrong results. Why? How can it be fixed?
    char compound1[40];
    char compound2[40];
    infile1.get (compound1, sizeof (compound1));
    infile2.get (compound2, sizeof (compound2));
    if (compound1 == compound2) {
     cout << "These compounds match\n";
    }
    else
     cout << "These compounds do not match\n";

3. The programmer input a compound name and its cost and then wanted to check to see if the name was equal to Sodium Chloride. The following coding compiles with no errors but when it runs, it fails to find Sodium Chloride when that is input. The input line is
    Sodium Chloride 4.99
What is wrong and how can it be fixed?
    char compound[20];
    double cost;
    cin.get (compound, sizeof (compound));
    cin >> cost;
    if (stricmp (compound, "Sodium Chloride") == 0) {
     cout << "Found\n";
    }

4. The input file consists of a long student id number followed by a blank and then the student's name. The following coding does not input the data properly. Why? What specifically is input when the user enters a line like this?
    1234567 Sam Spade<cr>
How can it be fixed so that it correctly inputs the data?
    long id;
    char name[20];
    while (cin >> id) {
     cin.get (name, sizeof (name));
     ...
    }


5. A file of student names and their grades is to be input. The programmer wrote a GetNextStudent() function. It does not work. How can it be fixed so that it does work properly?
    char name[20];
    char grade;
    while (GetNextStudent (infile, name, grade, 20)) {
     ...

    istream GetNextStudent (istream infile, char name[], char grade, int maxLen) {
     infile.get (name, sizeof (name));
     infile.get (grade);
     return infile;
    }

6. The proposed Acme Data Records consist of the following.
    12345 Pots and Pans 42 10.99
    23455 Coffee #10 can 18 5.99
    32453 Peanuts 20 1.25
The first entry is the item number, the second is the product description, the third is the quantity on hand, and the fourth is the unit cost. Assume that no description can exceed 20 characters. The programmer wrote the following code to input the data.
    int main () {
     long id;
     char description[21];
     int quantity;
     double cost;
     ifstream infile ("master.txt", ios::in | ios::nocreate);
     while (infile >> id >> description >> quantity >> cost) {
      ...
However, it did not run at all right. What is wrong with it? Is it possible to fix the program so that it would read in that data file? What would you recommend?


Programming Problems
Problem Cs11-1 Life Insurance Problem
Acme Life Insurance has asked you to write a program to produce their Customers Premium Paid Report. The report lists the person's name, age and yearly premium paid. Yearly premiums are based upon the age when the person first became a customer. The table of rates is stored in the file Cs11-1-rates.txt on disk. The file contains the age and the corresponding premium on a line. Since these rates are subject to change, your program should read these values from the file. In other words, do not hard code them in the program. Currently, the data appears as follows (column headings have been added for clarity).

    Age      Premium
    Limit    Dollars
    25       277.00
    35       287.50
    45       307.75
    55       327.25
    65       357.00
    70       455.00

The ages listed are the upper limits for the corresponding premium. In other words, if a person took out a policy at any age up to and including 25, the premium would be $277.00. If they were 26 through 35, then their premium would be $287.50. If they were above 70, use the age 70 rate of $455.00.

Your program should begin by inputting the two parallel arrays, age and premium. Allow for a maximum of 20 in each array. Load these arrays from a function called LoadArrays() that is passed the two arrays and the limit of 20. It returns the number of elements in the parallel arrays.

After calling LoadArrays(), the main() function inputs the customers' data from the Cs11-1-policy.txt file. Each line in this file contains the policy number, name and age fields. The policy number should be a long and the name can be up to 20 characters long. The customer names contain the last name only with no embedded blanks. For each customer, print out their name, their age and their premium. The report should have an appropriate title and column headings.


Problem Cs11-2 Acme Personnel Report


Write a program to produce the Acme Personnel Report from the Cs11-2-personnel.txt file. In the file are the following fields in this order: employee name (20 characters maximum), integer years employed, the department (15 characters maximum) and the year-to-date pay. The report should look like this.

                        Acme Personnel Report

    Employee Name          Years Emp.   Department        Year to Date Pay

    xxxxxxxxxxxxxxxxxxxx       99       xxxxxxxxxxxxxxx       $99999.99
    xxxxxxxxxxxxxxxxxxxx       99       xxxxxxxxxxxxxxx       $99999.99

The employee name and the department should be left aligned while the numeric fields should be right aligned.

Problem Cs11-3 Palindrome Analysis


A palindrome is a string that is the same whether read forward or backward. For example, level, Nod Don and 123454321 are all palindromes. For this problem, case is not important. Write a function IsPalindrome() that takes a constant string as its only argument and returns a bool, true if the string is a palindrome or false if it is not. Then write a main() function that inputs the file Cs11-3-words.txt. A line in this file cannot exceed 80 characters. For each line input, print out a single line as follows
    Yes--Nod Don
    No---Nod Jim

Problem Cs11-4 Merging Customer Files


Write a Merge Files program to merge two separate customer data files into one file. Each file contains the following fields: the customer's number (up to 7 digits), the customer's last name (20 characters maximum), the customer's first name (15 characters maximum), the address (20 characters maximum), the city (15 characters maximum), the state code (2 characters) and the zip code (5 digits). The resulting file should be in order by customer last names (a through z). If there are two identical last names, then check the first names to decide which to insert into the new master file first. Name comparisons should be case insensitive.


Normally, the only output of the merge program is the new master file called newMaster.txt. However, for debugging purposes, also echo print to the screen the customer last and first names as they are written to the new master file. The two input files are called Cs11-4-mast1.txt and Cs11-4-mast2.txt.

Problem Engr11-1 Liquids and Gases in Coexistence (Chemical Engineering)


The chemical and physical interactions between gases and liquids are commonly encountered in chemical engineering. For a specific substance, the mathematical description of the transition from gas to liquid is vital. The basic ideal gas equation for one mole of gas is
    P = RT / V
where
    P = pressure in N/m^2
    V = volume of one mole in m^3
    T = temperature in degrees K
    R = ideal gas constant of 8.314 J/mol-K
This ideal gas equation assumes low pressures and high temperatures such that the liquid state is not present at all. However, this assumption often is not a valid one; many situations exist where a substance is present in both its gaseous and liquid states. This situation is called an imperfect gas. Empirical formulas have been discovered that model this behavior. One of these is van der Waals' equation of state for an imperfect gas. If the formula is simplified, it is
    p = 8t / (3v - 1) - 3 / v^2
where p, v and t are scaled versions of the pressure, volume and temperature. The scaling is done by dividing the measurement by a known, published critical value of that measurement. These scaled equations are
    p = P/Pc
    v = V/Vc
    t = T/Tc
These critical measurements correspond to that point where equal masses of the gas and liquid phase have the same density. The critical values are tabulated for many substances. See for example the Handbook of Chemistry and Physics Critical Constants for Gases section.

Since there are actually three variables, v, p and t, the objective for this problem is to see how this equation behaves at that boundary where gas is turning into a liquid. To do so, plot p versus v versus t. An easy way that this can be accomplished is to choose a specific t value and calculate a set of p versus v values. Then change t and make another set of p versus v values. All told, there are to be three sets of p versus v values. The three t values to use are 1.1, 1.0, and 0.9. For all three cases, the v values range from 0.4 through 3.0; divide this range into 100 uniformly spaced intervals. Then for each of the 100 v values, calculate the corresponding p value.

This means that you should define a v array that holds 100 elements. Define three p arrays, one for each of the three t values, each p array to hold 100 elements. One of the p arrays represents the t = 1.1 results; another, the t = 1.0 results; the third, the t = 0.9 results. Create one for loop that calculates all of these values. It is most convenient to define also a function p (v, t) to handle the actual calculation of one specific pressure at a specific volume and temperature.

Since these results are scaled values, they can then be applied to any specific substance. Prepare an input data file for the substances listed below. Enter the four fields in this order: substance, Tc, Pc, Vc. Your program should input each of these lines. For each line, in other words each substance, the four arrays are printed in a columnar format, with the scaled t, v, p values converted into T, V and P. In the table below, Tc is in degrees Kelvin; Pc is in atmospheres; Vc is in cubic meters per mole.

    Substance         Tc       Pc       Vc
    Water             647.56   217.72   0.00000721
    Nitrogen          126.06   33.5     0.00000436
    Carbon dioxide    304.26   73.0     0.0000202

The report for a specific substance should appear similar to the following

    Substance: Carbon dioxide
    Critical Volume       Critical Pressures for 3 temps
    cubic meters/mole     T = 334.69   T = 304.26   T = 273.83
    0.00000808            1551.24      1843.23      1259.24
    ...

If you have access to a plotter, for each substance, plot all three sets of p versus v curves on the same graph.

Problem Engr11-2 Chemical Formula


Each line of the E11-2-formula.txt file contains the chemical formula for a compound. A blank separates the formula from the compound name. For example, one line could be
    NaClO3 Sodium Chlorate
In the formula, there can be no blanks; allow for a maximum of 40 characters in the formula and another 40 in the compound name. Further, in the formula, case is significant. The atom identification is one or two characters long, the first of which must be uppercase and the second, if any, must be lowercase. That is, the start of each atom is identified by an uppercase letter. Any trailing numbers represent the number of those atoms at that point in the formula. In the above example, there is one Na (Sodium), one Cl (Chlorine) and three O (Oxygen) atoms in the compound. For each compound, print a line detailing its component atoms such as this.
    Sodium Chlorate 1 Na 1 Cl 3 O
Sum all like atoms into a single total. For example, if we had Methanol CH3OH, the totals would be
    1 C 4 H 1 O
