Computer Science C++
Chapter 11 Strings
When defining a string variable, it makes good programming sense not to hard-code the array bounds but to use a const int, just as is done with other kinds of arrays. Thus, the name variable ought to have been coded like this.
    const int NAMELENGTH = 21;
    int main () {
     char name[NAMELENGTH];
If at a later date you decide that twenty-character names are too short, it is a simple matter of changing the constant value and recompiling.
When character string variables are defined, they can also be initialized. However, two forms are possible. Assume that when the name variable is defined, it should be given the value of Sam. Following the initialization syntax for any other kind of array, one could code
    char name[NAMELENGTH] = {'S', 'a', 'm', '\0'};
Here each specific character is assigned its starting value; do not fail to include the null terminator. However, the compiler allows a string to be initialized with a literal string as follows.
    char name[NAMELENGTH] = "Sam";
Clearly this second form is much more convenient.
With all forms of arrays, when defining and initializing an array, it is permissible to omit the array bounds and let the compiler determine how many elements the array must have based on the number of initial values you provide. Thus, the following is valid.
    char name[] = "Sam";
However, in general, this approach is lousy programming style and error prone. Why? In the above case, the compiler allocates an array just large enough to hold the literal Sam and its null terminator. That is, the name array is only four characters long. What would happen if later on one attempted to input another name that needed more characters? Disaster. Always provide the array bounds whenever possible.
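The size difference between the two forms of definition can be checked directly with sizeof. The following is a minimal sketch; the NameSizes structure and the ShowNameSizes() function are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstring>

const int NAMELENGTH = 21;  // room for 20 characters plus the null terminator

// Hypothetical helper type: records the array sizes the compiler chose.
struct NameSizes {
  std::size_t bounded;    // sizeof an array with explicit bounds
  std::size_t unbounded;  // sizeof an array sized from its initializer
  std::size_t length;     // strlen of the stored text
};

// Hypothetical helper: defines both kinds of arrays and reports their sizes.
NameSizes ShowNameSizes () {
  char name[NAMELENGTH] = "Sam";  // 21 bytes; unused elements are zero filled
  char shortName[] = "Sam";       // only 4 bytes: 'S', 'a', 'm', '\0'
  NameSizes sizes = { sizeof (name), sizeof (shortName), std::strlen (name) };
  return sizes;
}
```

Calling ShowNameSizes() reports 21 bytes for the bounded array, 4 bytes for the unbounded one, and a string length of 3 in both cases, which is why the unbounded form leaves no room for a longer name later.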
follows
    cin >> name >> age;
What results if the user enters the following data?
    Sam Spade 25
The input stream goes into the fail state. The extraction inputs the characters Sam and stores them along with the trailing null terminator in the name field. It skips over the blank, attempts to input the character S of Spade into the age integer, and goes immediately into the fail state. If you reflect upon all the different kinds of strings that you might encounter in the real world of programming (names, product descriptions, addresses, cities), the vast majority may have embedded blanks in them. This rules out the extraction operation as a method of inputting them.
The other part of the extraction operator rules is quite destructive, especially if you are running on the Windows 95/98 platform. It inputs all characters until it finds whitespace or EOF. Now suppose that the field name is defined to be an array of 21 characters. What happens if, in response to the prompt to enter a name, the user enters the following name?
    Rumplestillskinchevskikov
The computer attempts to store 26 characters (25 letters plus the null terminator) into an array that is only 21 characters long. Five bytes of memory are now overlaid. What happens next is unpredictable. If another variable in your program occupies that overlaid memory, its contents are trashed. If that memory is not even part of your program, but is part of some other program, such as a Windows system dll, it is overlaid; even wilder things can happen! Under Windows NT/2000, if you attempt to overlay memory that is not part of your data segment, the program is aborted instead. This is one reason for so many system crashes under Windows 95/98.
One way to get around the extraction operator's disadvantages is to use either the get() or getline() function. The get() function can be used in one of two ways. Note: while I am using cin in these examples, any ifstream instance can be used as well.
    cin.get (string variable, sizeof (string variable));
    cin.get (string variable, sizeof (string variable), delimiter character);
These input all characters from the current position in the stream until either the maximum number of characters including the null terminator has been read or EOF or the delimiter is found. By default the delimiter is a new line code. The delimiter is not extracted but remains in the input stream.
    cin.getline (string variable, sizeof (string variable));
    cin.getline (string variable, sizeof (string variable), delimiter character);
This function works the same way except the delimiter is removed from the input stream but never stored in the string variable. It also defaults to the new line code.
found. Then place a null terminator in the last blank position. Since the length of all strings must be twenty characters (after the get() function is done, the null terminator is in the twenty-first position), the location of the last byte that contains real data must be subscript 19. The null terminator must be at subscript 20. The following coding can be used to remove the blanks at the end, if any.
    int index = DescrLimit - 2; // or 19
    while (index >= 0 && description[index] == ' ')
     index--;
    // here index = subscript of the last non-blank char, or -1 if all blanks
    index++;
    description[index] = 0; // insert a null terminator over the first trailing blank
If the description contains all blanks or if the string contains a non-blank character in the 20th position, this coding still works well.
The main problem to consider when inputting strings with the get() function is handling the detection of the end of file properly. We are used to seeing coding such as
    while (cin >> itemnumber >> quantity) {
But in this case, the input operation cannot be done with one chained series of extraction operators. Rather, it is broken into three separate statements. Consider replacing the three lines of coding with a new user helper function.
    while (GetData (infile, itemnumber, quantity, description, cost, DescrLimit)) {
The function would be
    istream& GetData (istream& infile, long& itemnumber, long& quantity,
                      char description[], double& cost, int descrLimit) {
     infile >> itemnumber >> quantity >> ws;
     if (!infile)
      return infile;
     infile.get (description, descrLimit);
     if (!infile)
      return infile;
     infile >> cost;
     if (!infile)
      return infile;
     int index = descrLimit - 2;
     while (index >= 0 && description[index] == ' ')
      index--;
     index++;
     description[index] = 0;
     return infile;
    }
Vitally important is that the number of bytes to use in the get() function this time is not sizeof(description). Why? Within the function, description is only the memory address of where the first element of the array of characters is located. Memory addresses are always four bytes in size on a 32-bit platform (and eight bytes on a 64-bit platform). Thus, had we used sizeof(description) on a 32-bit platform, then 4 bytes would have been the limit!
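The GetData() approach can be tried out without a data file by reading from a std::istringstream. The sketch below follows the function above with one labeled change: the hypothetical helper TrimTrailingBlanks() starts from strlen() rather than from descrLimit - 2, an assumption made here so that a description shorter than twenty characters is also handled.

```cpp
#include <cstring>
#include <istream>
#include <sstream>

const int DescrLimit = 21;  // 20 characters plus the null terminator

// Hypothetical helper: removes blanks from the end of a null-terminated string.
// Assumption: starts at the actual end of the text instead of descrLimit - 2.
void TrimTrailingBlanks (char text[]) {
  int index = static_cast<int> (std::strlen (text)) - 1;
  while (index >= 0 && text[index] == ' ')
    index--;
  text[index + 1] = 0;  // null terminator over the first trailing blank
}

// The array size is passed in as descrLimit because sizeof (description)
// inside this function would only give the size of a memory address.
std::istream& GetData (std::istream& infile, long& itemnumber, long& quantity,
                       char description[], double& cost, int descrLimit) {
  infile >> itemnumber >> quantity >> std::ws;
  if (!infile)
    return infile;
  infile.get (description, descrLimit);
  if (!infile)
    return infile;
  infile >> cost;
  if (!infile)
    return infile;
  TrimTrailingBlanks (description);
  return infile;
}
```

Because GetData() returns the stream, it can be used directly as the while loop condition, and the loop ends on end of file or on bad data exactly as a chained extraction would.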
Method A, where all strings are the same length, also applies to data files that have more than one string in a line of data. Consider a customer data line, which contains the customer number, name, address, city, state and zip code. Here three strings potentially contain blanks, assuming the state is a two-letter abbreviation. Thus, Method A is commonly used.
Method B: String Contains Only the Needed Characters, But Is the Last Field on a Line
In certain circumstances, the string data field is the last item on the input data line. If so, it can contain just the number of characters it needs. Assume that the cost record data were reorganized as shown (<CRLF> indicates the enter key).
    12345 10 14.99 Pots and Pans<CRLF>
    34567 101 5.99 Cups<CRLF>
    45667 3 10.42 Silverware, Finished<CRLF>
This data can be input more easily as follows.
    infile >> itemnumber >> quantity >> cost >> ws;
    infile.get (description, sizeof (description));
Alternately, the getline() function could also be used. There are no excess blanks on the end of the descriptions to be removed. It is simpler. However, its use is limited because many data entry lines contain more than one string and it is often impossible to reorganize a company's data files just to put the string at the end of the data entry lines.
Method B works well when prompting the user to enter a single string. Consider the action of asking the user to enter a filename for the program to use for input. Note that on the open function call for input we can use the ios::in flag, and for output we use the ios::out flag.
    char filename[_MAX_PATH];
    cin.getline (filename, sizeof(filename));
    ifstream infile;
    infile.open (filename, ios::in);
When dealing with filenames, one common problem to face is just how many characters long the filename array should actually be. The Microsoft compiler provides a #define of _MAX_PATH (in the header file <stdlib.h>) that contains the platform-specific maximum length a complete path could be. For Windows 95, that number is 260 bytes.
         << setw (10) << quantity << setw (30) << description << setw (10) << cost << endl;
The default field alignment of an ostream is right alignment. All of our numeric fields display perfectly this way. But when right alignment is used on character strings, the results are usually not acceptable, as shown below.
         12345        10                 Pots and Pans     14.99
         34567       101                          Cups      5.99
         45667         3          Silverware, Finished     10.42
Left alignment must be used when displaying strings. Right alignment must be used when displaying numerical data. The alignment is easily changed by using the setf() function.
    cout << setw (10) << itemnumber << setw (10) << quantity;
    cout.setf (ios::left, ios::adjustfield);
    cout << setw (30) << description;
    cout.setf (ios::right, ios::adjustfield);
    cout << setw (10) << cost << endl;
In the call to setf(), the second parameter ios::adjustfield clears all the justification flags, that is, turns them off. Then left justification is turned on. Once the string is output, the second call to setf() turns right justification back on for the other numerical data.
It is vital to use the ios::adjustfield second parameter. The Microsoft implementation of the ostream contains two flags, one for left and one for right justification. If the left justification flag is on, then left justification occurs. Since there are two separate flags, when setting justification, failure to clear all the flags can lead to the weird circumstance in which both left and right justification flags are on. Now you have left-right justification (a joke); from now on, the output is hopelessly messed up justification-wise.
Alternatively, one can use the much more convenient manipulator functions, left and right.
    cout << setw (10) << itemnumber << setw (10) << quantity
         << left << setw (30) << description
         << right << setw (10) << cost << endl;
Finally, the insertion operator displays all characters in a string until it encounters the null terminator. What happens if by accident a string is missing its null terminator? Simple, the insertion operator displays all bytes until it finds a null terminator. I often refer to this action as a light show. Yes, one sees the contents of the string appear, but garbage characters follow that. If a line gets full, DOS line wraps and continues on the next line. If the screen fills, DOS scrolls. All of this occurs at a blazing speed. Sit back and relax; don't panic if this happens to you. It is usually harmless, although reading past the end of an array is formally undefined behavior. Enjoy the show. It will stop eventually when it finds a byte with a zero in it.
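The effect of the left and right manipulators can be verified by formatting one line into a string. The following is a minimal sketch: FormatLine() is a hypothetical helper, the fixed two-decimal formatting is an assumption, and the two-blank gap before the description is an addition made here so that the left-aligned text does not touch the quantity.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds one report line with the numbers
// right-aligned and the description left-aligned.
std::string FormatLine (long itemnumber, long quantity,
                        const char description[], double cost) {
  std::ostringstream out;
  out << std::fixed << std::setprecision (2);     // assumption: 2 decimals
  out << std::setw (10) << itemnumber
      << std::setw (10) << quantity
      << "  "                                      // gap added in this sketch
      << std::left << std::setw (30) << description
      << std::right << std::setw (10) << cost;
  return out.str ();
}
```

Every line produced this way is 62 characters wide, with the description always starting in the same column regardless of its length.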
To our rescue comes the library of string functions. The prototypes of all of these string functions are in the header file <cstring> (the C header <string.h>).
Comparing Strings
Here is where the new changes Microsoft has made in .NET 2005 come to the forefront. Older code now recompiled using .NET 2005 will produce a large number of warning messages about function calls now being deprecated, that is, obsolete. First, let's examine the older versions and then see why Microsoft has made unilateral changes that are not yet in the C++ Standard.
The Old Way: To compare two strings, use either strcmp() or stricmp(). strcmp() is a case sensitive string compare function. stricmp() is a case insensitive string compare function. Both functions return an integer indicating the result of the comparison operation. The prototype of the string comparison function is this.
    int strcmp (const char* string1, const char* string2);
It is showing that we pass it the two strings to be compared. However, the notation const char* also indicates that the strings' contents are constant. That is, the comparison function cannot alter the contents of either string. If the parameters were just char* string1, then potentially the contents of the string we passed could be altered in some way. The const char* notation is our guarantee that the function cannot alter the contents of the string we pass. It is rather like making the string read-only.
The integer return code indicates the result:
    0 => the two strings are the same
    positive => the first string is larger
    negative => the first string is smaller
The New Way: To compare two strings, use either strcmp() or _stricmp(). strcmp() is a case sensitive string compare function. _stricmp() is a case insensitive string compare function. Both functions return an integer indicating the result of the comparison operation. The prototypes of the string comparison functions are these.
    int strcmp (const char* string1, const char* string2);
    int _stricmp (const char* string1, const char* string2);
They show that we pass the two strings to be compared.
Both functions abort the program if either of the two passed memory addresses is zero or NULL.
While the meaning of the result "the two strings are the same" is obvious, the other two results might not be so clear. Character data is stored in an encoding scheme, often ASCII, American Standard Code for Information Interchange. In this scheme, the decimal number 65 represents the letter A. The letter B is a 66; C, a 67, and so on. If the first string begins with the letter A and the second string begins with the letter B, then the first string is said to be smaller than the second string because the 65 is smaller than the 66. The comparison function in effect returns the value given by 'A' - 'B' or (65 - 66), a negative number indicating that the first string is smaller than the second string. Only the sign of the result should be relied upon.
When comparing strings, one is more often testing for the equal or not equal situation. Applications that involve sorting or merging two sets of strings would make use of the smaller/larger possibilities.
To fix up the previous example in which we wanted to find out if the previousName was not equal to the currentName, we should code the following, assuming that case was important.
    if (strcmp (previousName, currentName) != 0) {
If we wanted to ignore case sensitivity issues, then code this.
    if (_stricmp (previousName, currentName) != 0) {
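The two kinds of comparison can be tried on any compiler with the sketch below. Since _stricmp() is a Microsoft function, the case insensitive test here uses CompareNoCase(), a hypothetical portable stand-in written for this illustration; NameChanged() is likewise a hypothetical wrapper around the strcmp() test shown above.

```cpp
#include <cctype>
#include <cstring>

// Hypothetical portable stand-in for a case insensitive compare.
// Returns 0, a negative value, or a positive value like strcmp().
int CompareNoCase (const char* string1, const char* string2) {
  while (*string1 != 0 &&
         std::tolower (static_cast<unsigned char> (*string1)) ==
         std::tolower (static_cast<unsigned char> (*string2))) {
    string1++;
    string2++;
  }
  return std::tolower (static_cast<unsigned char> (*string1)) -
         std::tolower (static_cast<unsigned char> (*string2));
}

// Hypothetical wrapper: true when the two names differ, case sensitive.
bool NameChanged (const char previousName[], const char currentName[]) {
  return std::strcmp (previousName, currentName) != 0;
}
```

With these, NameChanged ("Sam", "sam") is true because case matters to strcmp(), while CompareNoCase ("Sam", "sam") returns 0.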
Copying Strings
The older function to copy a string is strcpy(). Its prototype is
    char* strcpy (char* destination, const char* source);
It copies all characters including the null terminator of the source string, placing them in the destination string. In the previous example where we wanted to copy the currentName into the previousName field, we code
    strcpy (previousName, currentName);
Of course, the destination string should have sufficient characters in its array to store all the characters contained in the source string. If not, a memory overlay occurs. For example, suppose one has defined the following two strings.
    char source[20] = "Hello World";
    char dest[5];
If one copies the source string to the destination string, memory is overlain.
    strcpy (dest, source);
Seven bytes of memory are clobbered in this case and contain a blank, the characters World and the null terminator.
This clobbering of memory, the core overlay, or more politically correct, buffer overrun, has taken its toll on not only Microsoft coding but many other applications. Hackers and virus writers often take advantage of this inherently insecure function to overwrite memory with malicious machine instructions. Hence, Microsoft has unilaterally decided to rewrite the standard C Libraries to prevent such from occurring. As of this publication, Microsoft's changes are not in the ANSI C++ standard.
The new string copy function looks like this.
    errno_t strcpy_s (char* destination, size_t destSize, const char* source);
It copies all characters including the null terminator of the source string, placing them in the destination string, subject to not exceeding the maximum number of bytes of the destination string. In all cases, the destination string will be null terminated. However, if the source or destination memory address is 0 or if the destination string is too small to hold the result, the program is basically terminated at run time. In a later course, you will see how a program can prevent this abnormal termination and do something about the problem.
In the previous example where we wanted to copy the currentName into the previousName field, we now code
    strcpy_s (previousName, sizeof (previousName), currentName);
This text will consistently use these new Microsoft changes. If you are using another compiler, either use the samples provided in the 2002-3 samples folder or remove the sizeof parameter along with the _s in the function names.
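For readers on a compiler without strcpy_s(), the idea of a size-checked copy can be sketched portably. CopyBounded() below is a hypothetical helper, not the Microsoft function: instead of terminating the program, this sketch assumes it is acceptable to return false and leave an empty string in the destination.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical size-checked copy. Returns true if source, including its
// null terminator, fit into the destSize bytes of destination.
bool CopyBounded (char destination[], std::size_t destSize,
                  const char* source) {
  if (destination == 0 || source == 0 || destSize == 0)
    return false;
  std::size_t needed = std::strlen (source) + 1;  // count the terminator
  if (needed > destSize) {
    destination[0] = 0;  // leave a valid, empty string behind
    return false;
  }
  std::memcpy (destination, source, needed);
  return true;
}
```

Copying "Hello World" into a five-byte array now fails cleanly instead of clobbering seven bytes of memory, and the caller can test the return value to decide what to do.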
    strcat (fullfilename, "\\");          // append the \
    strcat (fullfilename, name);          // append filename
    infile.open (fullfilename, ios::in);  // open the file
The new version is strcat_s(), which now takes the destination maximum number of bytes as the second parameter before the source string. The above sequence using the newer functions is this.
    strcpy_s (fullfilename, _MAX_PATH, drive);  // copy the drive
    strcat_s (fullfilename, _MAX_PATH, path);   // append the path
    strcat_s (fullfilename, _MAX_PATH, "\\");   // append the \
    strcat_s (fullfilename, _MAX_PATH, name);   // append filename
    infile.open (fullfilename, ios::in);        // open the file
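The same filename-building sequence can be sketched with a portable, size-checked append. Both AppendBounded() and BuildFullFilename() are hypothetical helpers for this illustration, and the drive and path values used with them (such as "C:" and "\\Data") are assumptions, since any strings could be supplied.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical size-checked append: adds source to the end of destination
// only if the result, with its null terminator, fits in destSize bytes.
bool AppendBounded (char destination[], std::size_t destSize,
                    const char* source) {
  std::size_t used = std::strlen (destination);
  std::size_t needed = std::strlen (source) + 1;
  if (used >= destSize || needed > destSize - used)
    return false;
  std::memcpy (destination + used, source, needed);
  return true;
}

// Hypothetical helper: builds drive + path + \ + name, stopping at the
// first piece that does not fit.
bool BuildFullFilename (char fullfilename[], std::size_t size,
                        const char* drive, const char* path,
                        const char* name) {
  if (size == 0)
    return false;
  fullfilename[0] = 0;                          // start with an empty string
  return AppendBounded (fullfilename, size, drive)   // copy the drive
      && AppendBounded (fullfilename, size, path)    // append the path
      && AppendBounded (fullfilename, size, "\\")    // append the backslash
      && AppendBounded (fullfilename, size, name);   // append filename
}
```

A false return tells the caller that the pieces did not fit, so the program can report the problem before attempting to open the file.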
    char s5[10] = "bcd";
    strcmp (s1, s2) yields 0 - strings are equal
    stricmp (s1, s5) yields 0 - strings are equal
    strcmp (s1, s3) yields a + value, s1 > s3
    strcmp (s1, s4) yields a - value, s1 < s4
Name-new: strcmp and _stricmp
Meaning: string compare, case sensitive and case insensitive
Prototype: int strcmp (const char* string1, const char* string2);
           int _stricmp (const char* string1, const char* string2);
Action done: strcmp does a case sensitive comparison of the two strings, beginning with the first character of each string. It returns 0 if all characters in both strings are the same. It returns a negative value if the first differing character in string1 is less than that in string2. It returns a positive value if it is larger. Both functions abort the program if the memory address is null or 0.
Example: char s1[10] = "Bcd";
         char s2[10] = "Bcd";
         char s3[10] = "Abc";
         char s4[10] = "Cde";
         char s5[10] = "bcd";
         strcmp (s1, s2) yields 0 - strings are equal
         _stricmp (s1, s5) yields 0 - strings are equal
         strcmp (s1, s3) yields a + value, s1 > s3
         strcmp (s1, s4) yields a - value, s1 < s4

Name-old: strcat
Meaning: string concatenation
Prototype: char* strcat (char* desString, const char* srcString);
Action done: The srcString is appended onto the end of the desString. Returns the desString address.
Example: char s1[20] = "Hello";
         char s2[10] = " World";
         strcat (s1, s2); yields "Hello World" in s1.

Name-new: strcat_s
Meaning: string concatenation
Prototype: errno_t strcat_s (char* desString, size_t maxDestSize, const char* srcString);
Action done: The srcString is appended onto the end of the desString. Aborts the program if dest is too small.
Example: char s1[20] = "Hello";
         char s2[10] = " World";
         strcat_s (s1, sizeof(s1), s2); yields "Hello World" in s1.
Name-old: strcpy
Meaning: string copy
Prototype: char* strcpy (char* desString, const char* srcString);
Action done: All bytes of the srcString are copied into the destination string, including the null terminator. The function returns the desString memory address.
Example: char s1[10];
         char s2[10] = "Sam";
         strcpy (s1, s2);
         When done, s1 now contains "Sam".

Name-new: strcpy_s
Meaning: string copy
Prototype: errno_t strcpy_s (char* desString, size_t maxDestSize, const char* srcString);
Action done: All bytes of the srcString are copied into the destination string, including the null terminator. The function returns zero on success. It aborts the program if destination is too small.
Example: char s1[10];
         char s2[10] = "Sam";
         strcpy_s (s1, sizeof (s1), s2);
         When done, s1 now contains "Sam".

Name: strchr
Meaning: search string for first occurrence of the character
Prototype: char* strchr (const char* srcString, int findChar);
Action done: returns the memory address or char* of the first occurrence of the findChar in the srcString. If findChar is not in the srcString, it returns NULL or 0.
Example: char s1[10] = "Burr";
         char* found = strchr (s1, 'r');
         returns the memory address of the first letter r character, so that found[0] would give you that 'r'.

Name: strstr
Meaning: search string1 for the first occurrence of find string
Prototype: char* strstr (const char* string1, const char* findThisString);
Action done: returns the memory address (char*) of the first occurrence of findThisString in string1 or NULL (0) if it is not present.
Example: char s1[10] = "abcabc";
         char s2[10] = "abcdef";
         char* firstOccurrence = strstr (s1, "abc");
         It finds the first abc in s1 and firstOccurrence has the same memory address as s1, so that s1[0] and firstOccurrence[0] both contain the first letter 'a' of the string.
         char* where = strstr (s2, "def");
         Here where contains the memory address of the 'd' in s2.

Name-old: strlwr
Meaning: string to lowercase
Prototype: char* strlwr (char* string);
Action done: All uppercase letters in the string are converted to lowercase letters. All others are left untouched.
Example: char s1[10] = "Hello 123";
         strlwr (s1);
         Yields "hello 123" in s1 when done.

Name-new: _strlwr_s
Meaning: string to lowercase
Prototype: errno_t _strlwr_s (char* string, size_t maxSizeOfString);
Action done: All uppercase letters in the string are converted to lowercase letters. All others are left untouched.
Example: char s1[10] = "Hello 123";
         _strlwr_s (s1, sizeof (s1));
         Yields "hello 123" in s1 when done.

Name-old: strupr
Meaning: convert a string to uppercase
Prototype: char* strupr (char* string);
Action done: Any lowercase letters in the string are converted to uppercase; all others are untouched.
Example: char s1[10] = "Hello 123";
         strupr (s1);
         When done, s1 contains "HELLO 123"

Name-new: _strupr_s
Meaning: convert a string to uppercase
Prototype: errno_t _strupr_s (char* string, size_t maxSizeOfString);
Action done: Any lowercase letters in the string are converted to uppercase; all others are untouched.
Example: char s1[10] = "Hello 123";
         _strupr_s (s1, sizeof(s1));
         When done, s1 contains "HELLO 123"

Name-old: strrev
Meaning: string reverse
Prototype: char* strrev (char* string);
Action done: Reverses the characters in a string.
Example: char s1[10] = "Hello";
         strrev (s1);
         When done, s1 contains "olleH"

Name-new: _strrev
Meaning: string reverse
Prototype: char* _strrev (char* string);
Action done: Reverses the characters in a string. It aborts the program if the memory address passed is null or 0.
Example: char s1[10] = "Hello";
         _strrev (s1);
         When done, s1 contains "olleH"
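The search functions strchr() and strstr() from the list above return memory addresses, which are easier to inspect when converted into subscripts. The following is a minimal sketch; FindCharOffset(), FindStringOffset() and ToLowerInPlace() are hypothetical helpers, the last being a portable stand-in for the Microsoft-specific lowercase functions.

```cpp
#include <cctype>
#include <cstring>

// Hypothetical helper: subscript of the first findChar, or -1 if absent.
int FindCharOffset (const char* srcString, char findChar) {
  const char* found = std::strchr (srcString, findChar);
  return found ? static_cast<int> (found - srcString) : -1;
}

// Hypothetical helper: subscript where findThisString first occurs in
// string1, or -1 if it is not present.
int FindStringOffset (const char* string1, const char* findThisString) {
  const char* found = std::strstr (string1, findThisString);
  return found ? static_cast<int> (found - string1) : -1;
}

// Hypothetical portable stand-in for the lowercase conversion functions.
void ToLowerInPlace (char text[]) {
  for (int i = 0; text[i] != 0; i++)
    text[i] = static_cast<char> (
        std::tolower (static_cast<unsigned char> (text[i])));
}
```

Subtracting the start address from the returned address gives the subscript, so the 'r' in "Burr" is found at subscript 2 and "def" in "abcdef" at subscript 3. Always test the returned address for NULL before using it.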
characters are different, the loop also ends. To create the return integer, the current characters are subtracted. If the two strings are indeed equal, then both bytes must be the null terminators of the respective strings; the return value is then 0. Otherwise, the return value depends on the ASCII numerical values of the corresponding characters.
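The comparison loop just described can be sketched as follows. This is one possible reading of the description, not necessarily the exact coding of the text; the name MyStringCompare() is hypothetical, and the characters are treated as unsigned values before subtracting, which is an assumption of this sketch.

```cpp
// Hypothetical hand-written string compare: the loop ends at the null
// terminator of string1 or at the first pair of different characters.
int MyStringCompare (const char string1[], const char string2[]) {
  int i = 0;
  while (string1[i] != 0 && string1[i] == string2[i])
    i++;
  // Subtract the current characters; 0 only if both are null terminators.
  return static_cast<unsigned char> (string1[i]) -
         static_cast<unsigned char> (string2[i]);
}
```

Note that a shorter string compares as smaller than a longer string that begins with it, because the null terminator (value 0) is subtracted from a real character or vice versa.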
The function NameToParts() takes a full name and breaks it into first and last name strings. The original passed full name string is not altered and is declared constant.
The function NameToCommaForm() takes the first and last names and converts them into the comma-formatted name, last, first. Since the first and last names are not altered, those parameters are also declared constant. The function CommaFormToNames() converts a comma-formatted name into first and last names. Since the comma-formatted name is not altered, it is also declared constant. Let's begin by examining the output of the program to see what is needed. Here is the test run of Cs11a.
Cs11a Character String Manipulation - Sample Execution

    Original Name: |John J. Jones|
    First Name:    |John J.|
    Last Name:     |Jones|
    Comma Form:    |Jones, John J.|
    First and Last from comma form test ok

    Original Name: |Betsy Smith|
    First Name:    |Betsy|
    Last Name:     |Smith|
    Comma Form:    |Smith, Betsy|
    First and Last from comma form test ok

    Original Name: |Mr. and Mrs. R. J. Smith|
    First Name:    |Mr. and Mrs. R. J.|
    Last Name:     |Smith|
    Comma Form:    |Smith, Mr. and Mrs. R. J.|
    First and Last from comma form test ok

    Original Name: |Prof. William Q. Jones|
    First Name:    |Prof. William Q.|
    Last Name:     |Jones|
    Comma Form:    |Jones, Prof. William Q.|
    First and Last from comma form test ok

    Original Name: |J. J. Jones|
    First Name:    |J. J.|
    Last Name:     |Jones|
    Comma Form:    |Jones, J. J.|
    First and Last from comma form test ok

    Original Name: |Jones|
    First Name:    ||
    Last Name:     |Jones|
    Comma Form:    |Jones|
    First and Last from comma form test ok

    Original Name: |Mr. John J. Jones, Jr.|
    First Name:    |Mr. John J.|
    Last Name:     |Jones, Jr.|
    Comma Form:    |Jones, Jr., Mr. John J.|
    First and Last from comma form test ok

    Original Name: |Mr. John J. Jones, II|
    First Name:    |Mr. John J.|
    Last Name:     |Jones, II|
    Comma Form:    |Jones, II, Mr. John J.|
    First and Last from comma form test ok

    Original Name: |Mr. John J. Jones, MD.|
    First Name:    |Mr. John J.|
    Last Name:     |Jones, MD.|
    Comma Form:    |Jones, MD., Mr. John J.|
    First and Last from comma form test ok

    Original Name: |The Honorable Betsy Smith|
    First Name:    |The Honorable Betsy|
    Last Name:     |Smith|
    Comma Form:    |Smith, The Honorable Betsy|
    First and Last from comma form test ok

    Original Name: |Betsy O'Neill|
    First Name:    |Betsy|
    Last Name:     |O'Neill|
    Comma Form:    |O'Neill, Betsy|
    First and Last from comma form test ok
Program Cs11a is going to input a file of customer names. For each name, it first converts that full name into first and last names. Then, it forms the comma-blank version of the full name. And finally, it extracts the first and last names from the comma-blank string. If the first and last names from the two approaches do not agree, an error is written. If they agree, an Okay message is displayed. Since blanks are important to this problem and since a blank is hard to spot, the | character is printed before and after each string, making errant blanks quite visible. The Top-Down Design is shown in Figure 11.1.
Figure 11.1 Top-Down Design of Name Program
The main() function defines the arrays as shown in Figure 11.2. The sequence of processing steps for main() is as follows.
    open the input file; if it fails, display an error message and quit
    while we have successfully inputted a line into fullName do the following
     call NameToParts (fullName, firstName, lastName, MaxNameLen);
     call NameToCommaForm (commaName, firstName, lastName);
     call CommaFormToNames (commaName, firstFromCommaForm, lastFromCommaForm);
     output the results which include fullName, firstName, lastName and commaName
     if firstName and firstFromCommaForm are the same as well as lastName and lastFromCommaForm then
      output an Ok message
     else
      display an error message and the firstFromCommaForm and lastFromCommaForm
    end the while clause
    close the input file
NameToParts() must break a full name into first and last names and is passed four parameters: fullName, firstName, lastName, and limit. As we work out the sequence of coding, let's work with a specific example. Suppose that fullName contains the following, where the 0 indicates the null terminator. I have written the subscripts below the corresponding characters.
    Mr. John J. Jones, MD.0
    00000000001111111111222
    01234567890123456789012
The strlen(fullName) yields 22 characters as the current length and the subscript for the last character in the string is thus 21. So working from the end of the string, look for a blank that does not have a comma immediately in front of it.
    i = strlen (fullName) - 1;
    while (i >= 0) do the following
     does fullName[i] == ' '? If so do the following
      if there is a previous character, that is, is i > 0, and if that previous character is not a comma, fullName[i - 1] != ',', then
       // we have found the spot so we need to break out of the loop
       break; with i on the blank
      end the if test
     end the does clause
     back up to the previous character, i--;
    end the while clause
Now split out the two names. Notice we pass i+1, which is the first non-blank character in the last name.
    CopyParitalString (lastName, fullName, i+1, strlen (fullName));
    CopyParitalString (firstName, fullName, 0, i);
The CopyParitalString() function's purpose is to copy a series of characters in a source string from some beginning subscript up to, but not including, an ending subscript and then insert a null terminator. It is passed the dest string, the src string, startAt and endAt.
    is startAt >= endAt? If so, we are starting at the ending point and there is nothing to copy, so just make the dest string a properly null-terminated string.
     dest[0] = 0 and return
    end is
To copy the characters, we need a subscript variable for each string, isrc and ides.
    let isrc = startAt
    let ides = 0;
Now copy all characters from startAt to endAt
    while isrc < endAt do the following
     dest[ides] = src[isrc];
     increment both isrc and ides
    end while
Finally, insert the null terminator
    dest[ides] = 0;
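The pseudocode for NameToParts() and CopyParitalString() can be turned into C++ fairly directly. What follows is a minimal sketch written from the steps above, not the program's own listing. Since the pseudocode does not describe any use of the limit parameter, this sketch accepts it but does not check it, so the caller is assumed to supply arrays at least as large as fullName.

```cpp
#include <cstring>

// Copies src[startAt] up to, but not including, src[endAt] into dest
// and inserts a null terminator.
void CopyParitalString (char dest[], const char src[],
                        int startAt, int endAt) {
  if (startAt >= endAt) {  // nothing to copy
    dest[0] = 0;
    return;
  }
  int isrc = startAt;
  int ides = 0;
  while (isrc < endAt)
    dest[ides++] = src[isrc++];
  dest[ides] = 0;
}

// Splits fullName at the last blank that is not preceded by a comma.
void NameToParts (const char fullName[], char firstName[],
                  char lastName[], int limit) {
  (void) limit;  // not used in this sketch; see the assumption above
  int len = static_cast<int> (std::strlen (fullName));
  int i = len - 1;
  while (i >= 0) {
    if (fullName[i] == ' ' && i > 0 && fullName[i - 1] != ',')
      break;  // i is on the blank that separates the two names
    i--;
  }
  CopyParitalString (lastName, fullName, i + 1, len);
  CopyParitalString (firstName, fullName, 0, i);
}
```

When no qualifying blank exists, as with the single name Jones, the loop ends with i equal to -1. The whole string then becomes the last name and the first name becomes an empty string, which agrees with the sample execution.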
The NameToCommaForm() function is comparatively simple. From two strings containing the first and last names, it makes one combined new string of the form last name, first name. However, in some cases, there might not be any first name. In that case, the result should just be a copy of the last name string. NameToCommaForm() is passed three strings: the answer string to fill up, commaName, and the two source strings, firstName and lastName. The sequence is as follows.

    strcpy (commaName, lastName)
    if a first name exists, that is, if strlen (firstName) != 0, then do
        append a comma and a blank: strcat (commaName, ", ")
        append the first name: strcat (commaName, firstName)
    end if

The CommaFormToNames() function must convert a single string with the form of last name, first name into first and last name strings. It is passed commaName to convert and the two strings to fill up, firstName and lastName. This time, we again begin at the end of the string, looking for the first comma followed by a blank. Consider these two cases.

    Jones, Jr., Mr. John J.\0
    Jones, Prof. William Q.\0

Clearly, we want to stop at the first ", " occurrence that is found when scanning backward from the end, to avoid problems with the comma that precedes Jr.

    let len = strlen (commaName)
    let commaAt = len - 2
    while commaAt > 0 do the following
        if the current character at commaAt is a , and the character at commaAt + 1 is a blank, then break out of the loop
        back up commaAt
    end the while clause

However, this could be compacted a bit more by using ! (not) logic in the while test condition.

    while (commaAt > 0 && !(commaName[commaAt] == ',' && commaName[commaAt+1] == ' ')) {

When the loop ends, we must guard against no comma and blank being found.

    if (commaAt <= 0) then there is no comma so do the following
        strcpy (lastName, commaName)
        firstName[0] = 0
        and return
    end the if

Finally, at this point, we have found the ", " portion; copy the two portions as follows.

    CopyParitalString (lastName, commaName, 0, commaAt)
    CopyParitalString (firstName, commaName, commaAt+2, len)

As you study the coding, draw some pictures of some test data and trace what is occurring if you have any doubts about what is going on. Here is the complete program.
Cs11a Character String Manipulation

/***************************************************************/
/*                                                             */
/* Cs11a Character String Manipulation - Customer Names        */
/*                                                             */
/***************************************************************/

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <cstring>   // strlen, strcmp and the other C string functions
using namespace std;

const int MaxNameLen = 51; // the maximum length of names

void NameToParts (const char fullName[], // converts a full name
                  char firstName[],      // to first & last names
                  char lastName[],
                  int limit);

void NameToCommaForm (char commaName[],       // converts a first and
                      const char firstName[], // last name into a
                      const char lastName[]); // full name string

void CommaFormToNames (const char commaName[],    // converts a comma
                       char firstFromCommaForm[], // form of name into
                       char lastFromCommaForm[]); // first & last names

void CopyParitalString (char dest[],      // copies a part of the src
                        const char src[], // string into the dest
                        int startAt,      // beginning at startAt and
                        int endAt);       // ending at endAt

int main () {
 char fullName[MaxNameLen];           // original full name as input
 char firstName[MaxNameLen];          // first name from full name
 char lastName[MaxNameLen];           // last name from full name
 char commaName[MaxNameLen];          // full name in comma form
 char firstFromCommaForm[MaxNameLen]; // first name from comma form
 char lastFromCommaForm[MaxNameLen];  // last name from comma form

 ifstream infile ("Cs11a-Names.txt");
 if (!infile) {
  cerr << "Error: cannot find the names file\n";
  return 1;
 }
 ofstream out ("results.txt");

 while (infile.getline (fullName, sizeof (fullName))) {
  // break full name inputted into first and last names
  NameToParts (fullName, firstName, lastName, MaxNameLen);

  // turn first and last names into a comma form of full name
  NameToCommaForm (commaName, firstName, lastName);

  // break comma form of full name into first and last names
  CommaFormToNames (commaName, firstFromCommaForm,
                    lastFromCommaForm);

  // output results
  out << "Original Name: |" << fullName << '|' << endl;
  out << "   First Name: |" << firstName << '|' << endl;
  out << "    Last Name: |" << lastName << '|' << endl;
  out << "   Comma Form: |" << commaName << '|' << endl;

  // test that first and last names agree from both forms
  // of extraction
  if (strcmp (firstName, firstFromCommaForm) == 0 &&
      strcmp (lastName, lastFromCommaForm) == 0)
   out << "  First and Last from comma form test ok" << endl;
  else {
   out << "  Error from comma form - does not match\n";
   out << "   First Name: |" << firstFromCommaForm << '|' << endl;
   out << "    Last Name: |" << lastFromCommaForm << '|' << endl;
  }
  out << endl;
 }
 infile.close ();
 out.close ();
 return 0;
}

/***************************************************************/
/*                                                             */
/* CopyParitalString: copies src from startAt through endAt    */
/*                                                             */
/***************************************************************/

void CopyParitalString (char dest[], const char src[],
                        int startAt, int endAt) {
 if (startAt >= endAt) { // avoid starting after ending
  dest[0] = 0;           // just set dest string to a null string
  return;
 }

 int isrc = startAt;
 int ides = 0;
 // copy all needed chars from startAt up to, not including, endAt
 for (; isrc<endAt; isrc++, ides++) {
  dest[ides] = src[isrc];
 }
 dest[ides] = 0; // insert null terminator
}

/***************************************************************/
/*                                                             */
/* NameToParts: break a full name into first and last name     */
/*                                                             */
/***************************************************************/

void NameToParts (const char fullName[], char firstName[],
                  char lastName[], int limit) {
 // working from the end of the string, look for blank separator
 // that does not have a , immediately in front of it
 int i = (int) strlen (fullName) - 1;
 while (i >= 0) {
  if (fullName[i] == ' ') {             // found a blank and
   if (i>0 && fullName[i-1] != ',') {   // earlier char is not a ,
    break;                              // end with i on the blank
   }
  }
  i--;
 }
 CopyParitalString (lastName, fullName, i+1, (int) strlen (fullName));
 CopyParitalString (firstName, fullName, 0, i);
}

/***************************************************************/
/*                                                             */
/* NameToCommaForm: from first & last names, make last, first  */
/*                                                             */
/***************************************************************/

void NameToCommaForm (char commaName[], const char firstName[],
                      const char lastName[]) {
 strcpy_s (commaName, MaxNameLen, lastName);
 if (strlen (firstName)) {                      // if a first name exists,
  strcat_s (commaName, MaxNameLen, ", ");       // add a , and blank
  strcat_s (commaName, MaxNameLen, firstName);  // add first name
 }
}

/***************************************************************/
/*                                                             */
/* CommaFormToNames: convert a last, first name to first & last*/
/*                                                             */
/***************************************************************/

void CommaFormToNames (const char commaName[], char firstName[],
                       char lastName[]) {
 // begin at the end and look for a ,blank
 int len = (int) strlen (commaName);
 int commaAt = len - 2;
 while (commaAt > 0 &&
     !(commaName[commaAt] == ',' && commaName[commaAt+1] == ' ')) {
  commaAt--;
 }
 if (commaAt <= 0) { // here there is no comma so
  strcpy_s (lastName, MaxNameLen, commaName);
  firstName[0] = 0;  // set first name to null string
  return;
 }
 CopyParitalString (lastName, commaName, 0, commaAt);
 CopyParitalString (firstName, commaName, commaAt+2, len);
}
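To trace the backward search for the comma and blank on the two sample names, a minimal standalone sketch may help. The helper name FindLastCommaBlank is hypothetical and is not part of Cs11a; it only isolates the loop used by CommaFormToNames() and returns -1 where that function would take its "no comma" branch.

```cpp
#include <cstring>

// Hypothetical helper, not part of Cs11a: returns the index of the ", "
// pair that the backward scan stops on, or -1 when the scan reaches
// index 0 without finding one (the "no comma" case in CommaFormToNames).
int FindLastCommaBlank (const char s[]) {
    int len = (int) std::strlen (s);
    int commaAt = len - 2;          // last spot a ", " pair could start
    while (commaAt > 0 &&
           !(s[commaAt] == ',' && s[commaAt + 1] == ' ')) {
        commaAt--;                  // back up one character
    }
    return commaAt > 0 ? commaAt : -1;
}
```

Calling FindLastCommaBlank ("Jones, Jr., Mr. John J.") stops on the comma after Jr., so the last name portion becomes Jones, Jr. and the first name portion becomes Mr. John J.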
City          High   Low   Rain   Snow   Winds
Peoria         85     55     0      0     55*
Washington     99*    75     0      0     10

A * character is placed after the weather statistic that is unusual.
Since this problem is quite basic, I have not included the coding sketch. By now, the logic should be obvious. Here are the program listing and the sample output. Make sure you examine the instructions that process the new string variables.
Listing for Program Engr11a - Unusual Weather Statistics

/***************************************************************/
/*                                                             */
/* Engr11a: Unusual Weather Statistics report                  */
/*                                                             */
/***************************************************************/

#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <cstdlib>   // _MAX_PATH (Microsoft-specific)
using namespace std;

const int MaxCityLen = 21; // city name length is 20 chars

int main () {

 char infilename[_MAX_PATH];
 char reportname[_MAX_PATH];
 cout << "Enter the filename with today's weather data\n";
 cin.getline (infilename, sizeof (infilename));
 cout << "\nEnter the report filename\n";
 cin.getline (reportname, sizeof(reportname));

 ifstream infile;
 infile.open (infilename);
 if (!infile) {
  cerr << "Error: cannot open file: " << infilename << endl;
  return 1;
 }

 ofstream outfile;
 outfile.open (reportname, ios::out);
 if (!outfile) {
  cerr << "Error: cannot open file: " << reportname << endl;
  return 1;
 }
 // setup floating point output format
 outfile << fixed << setprecision (1);

 outfile << "Unusual Weather Report\n\n";
 outfile<<"City                     High     Low    Rain    Snow"
          "    Wind\n";
 outfile<<"                                         Fall    Fall"
          "   Speed\n\n";

 char city [MaxCityLen]; // string to hold city name
 float high;             // high temperature of the day - F
 float low;              // low temperature of the day - F
 float rainfall;         // rainfall in inches
 float snowfall;         // snowfall in inches
 float windspeed;        // wind speed in mph

 char junk;              // to hold the " around city names
 int line = 0;           // line count for error processing

 while (infile >> junk) { // input the leading " of city
  line++;                 // count this line so errors report it
  infile.get (city, sizeof (city), '\"');
  infile.get (junk);
  infile >> high >> low >> rainfall >> snowfall >> windspeed;
  // abort if there is incomplete or bad data
  if (!infile) {
   cerr << "Error: incomplete city data on line " << line << endl;
   infile.close ();
   outfile.close ();
   return 2;
  }
  if (high > 95 || low < 0 || rainfall > 2 || snowfall > 6 ||
      windspeed > 45) {
   // unusual weather - display this city data
   outfile << left << setw (22) << city << right
           << setw (7) << high;
   if (high > 95)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << low;
   if (low < 0)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << rainfall;
   if (rainfall > 2)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << snowfall;
   if (snowfall > 6)
    outfile << '*';
   else
    outfile << ' ';
   outfile << setw (7) << windspeed;
   if (windspeed > 45)
    outfile << '*';
   else
    outfile << ' ';
   outfile << endl;
  }
 }
 infile.close ();
 outfile.close ();
 return 0;
}

Engr11a - Unusual Weather Report Output

Unusual Weather Report

City                     High     Low    Rain    Snow    Wind
                                         Fall    Fall   Speed

Washington               99.0*   70.0     0.0     0.0    20.0
Morton                   85.0    65.0     5.0*    0.0    40.0
Chicago                  32.0    -5.0*    0.0     8.0*   25.0
Joliet                   88.0    70.0     2.0     0.0    60.0*
Springfield              99.0*   75.0     3.0*    0.0    55.0*
New Salem                 0.0    -3.0*    0.0     9.0*   55.0*
a. eof is reached
b. the maximum number of characters minus one for the null is input
c. the delimiter code is found

By default, if not coded, the delimiter character is a new line code, \n. The only difference between the two functions is that, if the delimiter code is found, getline removes it from the input stream while get does not.

The string is the last item on an input line: Typically, any leading whitespace must be skipped so that the current position in the input stream is the first character of the string to be input.

    cin.get (string, sizeof(string));
    cin.getline (string, sizeof(string));

Here each string is only as long as it needs to be; they are not padded with blanks.

The string ends with a delimiter code: Typical delimiter codes are a " and a , (comma). Again, any leading whitespace must be skipped so that the current position in the input stream is the first character of the string to be input, and the leading " must be inputted first.

    cin.get (string, sizeof(string), '\"');
    cin.getline (string, sizeof(string), '\"');

Here each string is only as long as it needs to be. If get is used, remember to extract the trailing delimiter byte next. If a comma ended the string, then use

    cin.get (string, sizeof(string), ',');
    cin.getline (string, sizeof(string), ',');

Output of a String: Strings are normally displayed left justified, not with the default right justification. Hence, use the left and right manipulators.

    cout << left << setw (sizeof(string)+2) << string << right <<...

To work with strings, use the built-in string functions. Be alert for the version of the compiler you are using. .NET 2005 changed the string functions significantly; the Cs11a listing, for example, calls strcpy_s and strcat_s, which take the size of the destination string as an additional argument.
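The difference between get and getline with a delimiter can be sketched as follows. The two helper names and the record layout (a quoted city followed by one integer) are assumptions made only for this illustration; they follow the pattern used in Engr11a but are not taken from it.

```cpp
#include <istream>
#include <sstream>
#include <string>

const int MaxCityLen = 21;

// Hypothetical helper: reads  "City Name" 85  using get().
// get() stops in front of the trailing ", so it must be removed by hand.
bool ReadQuotedCity (std::istream& in, char city[], int maxLen, int& high) {
    char quote;
    if (!(in >> quote)) return false;  // >> skips whitespace, takes leading "
    in.get (city, maxLen, '\"');       // stops before the trailing "
    in.get (quote);                    // extract the trailing " ourselves
    return static_cast<bool> (in >> high);
}

// Hypothetical helper: same record, but getline() removes the trailing "
// from the stream on its own, so no extra get() is needed.
bool ReadQuotedCityGetline (std::istream& in, char city[], int maxLen,
                            int& high) {
    char quote;
    if (!(in >> quote)) return false;  // leading "
    in.getline (city, maxLen, '\"');   // reads the name, discards trailing "
    return static_cast<bool> (in >> high);
}
```

Both helpers take any istream, so they can be tried with cin, an ifstream, or an istringstream holding a line such as "New Salem" 55.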
Design Exercises
1. Design a Grade Book Program
The Grade Book Program inputs a set of students' grades for a semester. First, design the layout of the data file to be used for input and then design the program to produce the Grade Report shown below. The data consists of a student id number, which is their social security number; their name, which can be up to 20 characters long; the course name, which can be up to 10 characters in length; the course number; and finally the letter grade earned. Design how the input lines must be entered. Include in what order they are entered; pay particular attention to specifically how the student names are going to be entered on your lines. The Grade Report produced by the program that is to input your data file appears as follows.

                 Student Grade Report

    Student      Student           ----Course----
    Id           Name              Name    Number   Grade

    111111111    Sam J. Jones      Cmpsc   125      A
    ...
2. A program needs to input the chemical compound names of two substances and then compare them to see if the names are the same. The following was coded and compiles without errors, but when run it always produces the wrong results. Why? How can it be fixed?

    char compound1[40];
    char compound2[40];
    infile1.get (compound1, sizeof (compound1));
    infile2.get (compound2, sizeof (compound2));
    if (compound1 == compound2) {
     cout << "These compounds match\n";
    }
    else
     cout << "These compounds do not match\n";

3. The programmer inputted a compound name and its cost and then wanted to check to see if it was equal to Sodium Chloride. The following coding compiles with no errors, but when it runs, it fails to find Sodium Chloride when that is input. The input line is

    Sodium Chloride 4.99

What is wrong and how can it be fixed?

    char compound[20];
    double cost;
    cin.get (compound, sizeof (compound));
    cin >> cost;
    if (stricmp (compound, "Sodium Chloride") == 0) {
     cout << "Found\n";
    }

4. The input file consists of a long student id number followed by a blank and then the student's name. The following coding does not input the data properly. Why? What specifically is input when the user enters a line like this?

    1234567 Sam Spade<cr>

How can it be fixed so that it correctly inputs the data?

    long id;
    char name[20];
    while (cin >> id) {
     cin.get (name, sizeof (name));
     ...
    }
5. A file of student names and their grades is to be input. The programmer wrote a GetNextStudent() function. It does not work. How can it be fixed so that it does work properly?

    char name[20];
    char grade;
    while (GetNextStudent (infile, name, grade, 20)) {
     ...

    istream GetNextStudent (istream infile, char name[],
                            char grade, int maxLen) {
     infile.get (name, sizeof (name));
     infile.get (grade);
     return infile;
    }

6. The proposed Acme Data Records consist of the following.

    12345 Pots and Pans     42 10.99
    23455 Coffee #10 can    18  5.99
    32453 Peanuts           20  1.25

The first entry is the item number, the second is the product description, the third is the quantity on hand, and the fourth is the unit cost. Assume that no description can exceed 20 characters. The programmer wrote the following code to input the data.

    int main () {
     long id;
     char description[21];
     int quantity;
     double cost;
     ifstream infile ("master.txt", ios::in | ios::nocreate);
     while (infile >> id >> description >> quantity >> cost) {
     ...

However, it did not run at all right. What is wrong with it? Is it possible to fix the program so that it would read in that data file? What would you recommend?
Programming Problems
Problem Cs11-1 Life Insurance Problem
Acme Life Insurance has asked you to write a program to produce their Customers' Premium Paid Report. The report lists the person's name, age and yearly premium paid. Yearly premiums are based upon the age when the person first became a customer. The table of rates is stored in the file Cs11-1-rates.txt on disk. The file contains the age and the corresponding premium on a line. Since these rates are subject to change, your program should read these values from the file. In other words, do not hard code them in the program. Currently, the data appears as follows (column headings have been added for clarity).

    Age       Premium
    Limit     Dollars
     25       277.00
     35       287.50
     45       307.75
     55       327.25
     65       357.00
     70       455.00

The ages listed are the upper limits for the corresponding premium. In other words, if a person took out a policy at any age up to and including 25, the premium would be $277.00. If they were 26 through 35, then their premium would be $287.50. If they were above 70, use the age 70 rate of $455.00.

Your program should begin by inputting the two parallel arrays, age and premium. Allow for a maximum of 20 in each array. Load these arrays from a function called LoadArrays() that is passed the two arrays and the limit of 20. It returns the number of elements in the parallel arrays.

After calling LoadArrays(), the main() function inputs the customers' data from the Cs11-1-policy.txt file. Each line in this file contains the policy number, name and age fields. The policy number should be a long and the name can be up to 20 characters long. The customer names contain the last name only with no embedded blanks. For each customer, print out their name, their age and their premium. The report should have an appropriate title and column headings.
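The upper-limit rule for the rate table can be sketched as a small lookup over the two parallel arrays. The function name FindPremium and its parameter list are assumptions for illustration only; the problem itself names only LoadArrays(). The sketch also assumes the ages are stored in ascending order, as they appear in the rates file.

```cpp
// Hypothetical helper: ages[] holds the upper age limits in ascending
// order and premiums[] the matching yearly premiums (parallel arrays).
// Returns the premium of the first limit that is >= age; anyone older
// than the last limit gets the last premium in the table.
double FindPremium (const int ages[], const double premiums[],
                    int count, int age) {
    for (int i = 0; i < count; i++) {
        if (age <= ages[i]) {
            return premiums[i];
        }
    }
    // above the highest limit (or empty table, which yields 0)
    return count > 0 ? premiums[count - 1] : 0.0;
}
```

With the current table, an age of 25 yields 277.00, an age of 26 yields 287.50, and an age of 71 falls through the loop and yields the age 70 rate of 455.00.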
    xxxxxxxxxxxxxxxxxxxx 99 xxxxxxxxxxxxxxx $99999.99
    xxxxxxxxxxxxxxxxxxxx 99 xxxxxxxxxxxxxxx $99999.99

The employee name and the department should be left aligned while the numeric fields should be right aligned.
Normally, the only output of the merge program is the new master file called newMaster.txt. However, for debugging purposes, also echo print to the screen the customer last and first names as they are written to the new master file. The two input files are called Cs11-4-mast1.txt and Cs11-4-mast2.txt.
where p, v and t are scaled versions of the pressure, volume and temperature. The scaling is done by dividing the measurement by a known, published critical value of that measurement. These scaled equations are

    p = P / Pc
    v = V / Vc
    t = T / Tc

These critical measurements correspond to that point where equal masses of the gas and liquid phase have the same density. The critical values are tabulated for many substances. See, for example, the Critical Constants for Gases section of the Handbook of Chemistry and Physics. Since there are actually three variables, v, p and t, the objective for this problem is to see how this equation behaves at that boundary where gas is turning into a liquid. To do so, plot p versus v versus t. An easy way that this can be accomplished is to choose a specific t value and
calculate a set of p versus v values. Then change t and make another set of p versus v values. All told, there are to be three sets of p versus v values. The three t values to use are 1.1, 1.0, and 0.9. For all three cases, the v values range from 0.4 through 3.0; divide this range into 100 uniformly spaced intervals. Then for each of the 100 v values, calculate the corresponding p value. This means that you should define a v array that holds 100 elements. Define three p arrays, one for each of the three t values, each p array to hold 100 elements. One of the p arrays represents the t = 1.1 results; another, the t = 1.0 results; the third, the t = 0.9 results. Create one for loop that calculates all of these values. It is most convenient to define also a function p (v, t) to handle the actual calculation of one specific pressure at a specific volume and temperature.

Since these results are scaled values, they can then be applied to any specific substance. Prepare an input data file for the substances listed below. Enter the four fields in this order: substance, Tc, Pc, Vc. Your program should input each of these lines. For each line, in other words each substance, the four arrays are printed in a columnar format, with the scaled t, v, p values converted into T, V and P. In the table below, Tc is in degrees Kelvin; Pc is in atmospheres; Vc is in cubic meters per mole.

    Substance         Tc       Pc       Vc
    Water             647.56   217.72   0.00000721
    Nitrogen          126.06    33.5    0.00000436
    Carbon dioxide    304.26    73.0    0.0000202

The report for a specific substance should appear similar to the following.

    Substance: Carbon dioxide

    Critical Volume          Critical Pressures for 3 temps
    cubic meters/mole     T = 334.69   T = 304.26   T = 273.83
    0.00000808              1551.24      1843.23      1259.24
    ...
If you have access to a plotter, for each substance, plot all three sets of p versus v curves on the same graph.
formula. In the above example, there are one Na (Sodium), one Cl (Chlorine) and three O (Oxygen) atoms in the compound. For each compound, print a line detailing its component atoms such as this.

    Sodium Chlorate 1 Na 1 Cl 3 O

Sum all like atoms into a single total. For example, if we had Methanol CH3OH, the totals would be

    1 C 4 H 1 O
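One way to total the like atoms in a formula is sketched below. The function name CountAtoms and the AtomCounts type are hypothetical. The sketch assumes that each atom symbol is one uppercase letter optionally followed by lowercase letters, that an optional count follows the symbol, and that formulas contain no parentheses; it keeps the atoms in order of first appearance so that CH3OH yields C, H, O.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: (symbol, total) pairs in order of first appearance.
typedef std::vector<std::pair<std::string, int> > AtomCounts;

// Hypothetical helper: totals like atoms in a formula such as "CH3OH".
AtomCounts CountAtoms (const std::string& formula) {
    AtomCounts counts;
    std::size_t i = 0;
    while (i < formula.size ()) {
        if (!std::isupper ((unsigned char) formula[i])) {
            i++;                       // skip anything unexpected
            continue;
        }
        // symbol: one uppercase letter plus any lowercase letters
        std::string symbol (1, formula[i++]);
        while (i < formula.size () &&
               std::islower ((unsigned char) formula[i])) {
            symbol += formula[i++];
        }
        // optional count; a missing count means one atom
        int n = 0;
        bool hasDigits = false;
        while (i < formula.size () &&
               std::isdigit ((unsigned char) formula[i])) {
            n = n * 10 + (formula[i++] - '0');
            hasDigits = true;
        }
        if (!hasDigits) n = 1;
        // add to an existing total or start a new one
        bool found = false;
        for (std::size_t k = 0; k < counts.size (); k++) {
            if (counts[k].first == symbol) {
                counts[k].second += n;
                found = true;
                break;
            }
        }
        if (!found) counts.push_back (std::make_pair (symbol, n));
    }
    return counts;
}
```

Printing each pair as the total followed by the symbol reproduces the line format shown above.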