Microsoft PowerPoint - Lect - 10
BİL 214 – System Programming
TOBB ETU
Fall 2022
Lecture 10
C Programming
Characters and Strings

• In this lecture, we introduce the C Standard Library functions that facilitate string and character processing.
• These functions enable programs (editors, word processors, page layout software, computerized typesetting systems) to process characters, strings, lines of text and blocks of memory.
• The text manipulations performed by formatted input/output functions like printf and scanf can be implemented using the functions discussed in this chapter.

• Characters are the fundamental building blocks of source programs.
• Every program is composed of a sequence of characters that, when grouped together meaningfully, is interpreted by the computer as a series of instructions used to accomplish a task.
• A program may contain character constants.
• A character constant is an int value represented as a character in single quotes.
• The value of a character constant is the integer value of the character in the machine’s character set.
Fundamentals of strings and characters
• For example, 'z' represents the integer value of z, and '\n' the integer value of newline (122 and 10 in ASCII, respectively).
• A string is a series of characters treated as a single unit.
• A string may include letters, digits and various special characters such as +, -, *, / and $.
• String literals, or string constants, in C are written in double quotation marks (" ").

• A string in C is an array of characters ending in the null character ('\0').
• A string is accessed via a pointer to the first character in the string.
• The value of a string is the address of its first character.
• Thus, in C, it is appropriate to say that a string is a pointer, namely a pointer to the string’s first character.
• In this sense, strings are like arrays, because an array name used in an expression is also converted to a pointer to the array's first element.

• A character array or a variable of type char * can be initialized with a string in a definition.
• The definitions
    char color[] = "blue";
    const char *colorPtr = "blue";
  each initialize a variable to the string "blue".
• The first definition creates a 5-element array color containing the characters 'b', 'l', 'u', 'e' and '\0'.
• The second definition creates pointer variable colorPtr that points to the string "blue" somewhere in memory.
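The difference between the two definitions of "blue" can be checked with a small sketch. The function name describe_color is a hypothetical helper introduced only for this illustration, and the printed character codes assume an ASCII system.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints a few facts about the two definitions
   and returns the number of bytes occupied by the array. */
size_t describe_color(void)
{
    char color[] = "blue";          /* 5-element array: 'b', 'l', 'u', 'e', '\0' */
    const char *colorPtr = "blue";  /* pointer to a string literal (read-only) */

    color[0] = 'B';                 /* allowed: the array is our own copy */
    /* colorPtr[0] = 'B'; would not compile, because the target is const */

    printf("%s %s\n", color, colorPtr);
    printf("array bytes: %zu, length: %zu\n", sizeof color, strlen(color));
    printf("'z' = %d, newline = %d\n", 'z', '\n');  /* 122 and 10 on ASCII systems */
    return sizeof color;
}
```

The array occupies one byte more than the string length because of the terminating null character.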
• The preceding definition automatically determines the size of the array based on the number of initializers in the initializer list.

• EOF normally has the value -1, and some hardware architectures do not allow negative values to be stored in char variables, so the character-handling functions manipulate characters as integers.
ctype.h
• Another set of useful functions are isspace, iscntrl, ispunct, isprint
and isgraph.
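As a minimal sketch of how these classification functions are typically called, the hypothetical helper count_matching below counts the characters of a string that satisfy one of them. The cast to unsigned char is there because the ctype.h functions expect a value representable as unsigned char (or EOF).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters in s for which the given
   ctype.h predicate (isspace, ispunct, ...) returns a nonzero value. */
size_t count_matching(const char *s, int (*predicate)(int))
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* Cast first: plain char may be negative on some platforms. */
        if (predicate((unsigned char)*s)) {
            ++count;
        }
    }
    return count;
}
```

For instance, count_matching("a b\tc!", isspace) counts the space and the tab, while isgraph counts only the visible characters.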
strtod
• Function strtod converts a sequence of characters representing a floating-point value to double.
• The function returns 0 if it’s unable to convert any portion of its first argument to double.
• The function receives two arguments: a string (char *) and a pointer to a string (char **).
• The string argument contains the character sequence to be converted to double; any whitespace characters at the beginning of the string are ignored.
• The function uses the char ** argument to modify a char * in the calling function (stringPtr) so that it points to the location of the first character after the converted portion of the string or to the entire string if no portion can be converted.

strtod
    const char *string = "51.2% are admitted"; // initialize string
    char *stringPtr; // create char pointer
    double d = strtod(string, &stringPtr);
• d is assigned the double value converted from string, and stringPtr is assigned the location of the first character after the converted value (51.2) in string.

strtol
• Function strtol converts to long int a sequence of characters representing an integer.
• The function returns 0 if it’s unable to convert any portion of its first argument to long int.
• The function’s three arguments are a string (char *), a pointer to a string and an integer.
• The string contains the character sequence to be converted to long; any whitespace characters at the beginning of the string are ignored.
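Because strtod returns 0 both for the text "0" and for text it cannot convert, the end pointer is the reliable way to tell the two cases apart. The sketch below illustrates this; parse_double_prefix is a hypothetical wrapper, not part of the standard library.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper around strtod: returns false when no characters
   could be converted, otherwise stores the value and the remainder. */
bool parse_double_prefix(const char *text, double *value, const char **rest)
{
    char *stringPtr;                       /* set by strtod */
    double d = strtod(text, &stringPtr);

    if (stringPtr == text) {               /* nothing was converted */
        return false;
    }
    *value = d;
    *rest = stringPtr;                     /* first character after the number */
    return true;
}
```

Called with "51.2% are admitted", the wrapper stores 51.2 and leaves rest pointing at "% are admitted".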
• The function uses the char ** argument to modify a char * in the calling function (remainderPtr) so that it points to the location of the first character after the converted portion of the string or to the entire string if no portion can be converted.
• The integer specifies the base of the value being converted.

• strtoul works identically to function strtol, except that it converts the character sequence to unsigned long int.

• x is assigned the unsigned long int value converted from string.
• The second argument, &remainderPtr, is assigned the remainder of string after the conversion.
• The third argument, 0, indicates that the value to be converted can be in octal, decimal or hexadecimal format.
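The role of the base argument can be sketched with two thin wrappers. The names parse_long and parse_ulong are hypothetical; they only forward to strtol and strtoul and expose the remainder pointer.

```c
#include <stdlib.h>

/* Hypothetical wrapper: converts text in the given base and reports
   where conversion stopped. Base 0 lets the prefix decide the base
   (0x for hexadecimal, leading 0 for octal, otherwise decimal). */
long parse_long(const char *text, int base, const char **rest)
{
    char *remainderPtr;
    long value = strtol(text, &remainderPtr, base);
    *rest = remainderPtr;
    return value;
}

/* Same idea for unsigned long values, using strtoul. */
unsigned long parse_ulong(const char *text, int base, const char **rest)
{
    char *remainderPtr;
    unsigned long value = strtoul(text, &remainderPtr, base);
    *rest = remainderPtr;
    return value;
}
```

With base 0, the inputs "0x1F", "017" and "31" all yield 31, 15 and 31 respectively, while an explicit base such as 2 interprets "101" as 5.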
• The maximum number of characters is one fewer than the value specified in fgets’s
second argument.
• The third argument specifies the stream from which to read characters—in this case, we
use the standard input stream (stdin).
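A minimal sketch of the fgets behaviour described above: the hypothetical helper read_line reads at most size - 1 characters from any stream (stdin in the lecture's case) and strips the trailing newline if one was stored.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line (or the part of it that fits)
   into buffer and removes the newline. Returns false at end of file
   or on a read error. */
bool read_line(FILE *stream, char *buffer, int size)
{
    if (fgets(buffer, size, stream) == NULL) {
        return false;
    }
    buffer[strcspn(buffer, "\n")] = '\0';  /* cut at the newline, if present */
    return true;
}
```

A typical call would be read_line(stdin, sentence, sizeof sentence), where sentence is a character array. A line longer than the buffer is returned in pieces over successive calls.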
putchar
• In the example, putchar is used recursively to output the characters of the line in reverse order.
• putchar returns the character written as an unsigned char cast to an int or EOF on error.

getchar
• In the example, getchar reads characters from the standard input into the character array sentence.
• getchar reads a character from the standard input and returns the character as an integer; recall that an integer is returned to support the end-of-file indicator.

puts
• puts displays characters as a string.
• puts takes a string as an argument and displays the string followed by a newline character.
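The following sketch shows one way the three functions might be used. Both function names are hypothetical: print_reversed writes a string backwards with recursive putchar calls, and count_input_characters stores the result of getchar in an int so that EOF can be distinguished from ordinary characters.

```c
#include <stdio.h>

/* Hypothetical helper: prints s in reverse order by recursing to the
   end of the string first. Returns the number of characters written. */
int print_reversed(const char *s)
{
    if (*s == '\0') {
        return 0;                          /* base case: end of string */
    }
    int written = print_reversed(s + 1);   /* print the tail first */
    if (putchar((unsigned char)*s) != EOF) {
        ++written;
    }
    return written;
}

/* Hypothetical helper: counts characters on standard input until EOF. */
long count_input_characters(void)
{
    long count = 0;
    int c;                                 /* int, not char, so EOF is representable */
    while ((c = getchar()) != EOF) {
        ++count;
    }
    return count;
}
```

After print_reversed("abc"), a call such as puts("") would finish the output line, since puts always appends a newline.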
• sprintf uses the same conversion specifiers as printf.

• sscanf uses the same conversion specifiers as scanf.

• The string-handling library (string.h) provides many useful functions for
  • manipulating string data (copying strings and concatenating strings),
  • comparing strings,
  • searching strings for characters and other strings,
  • tokenizing strings (separating strings into logical pieces) and
  • determining the length of strings.
• Every function, except for strncpy, appends the null character to its result.
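The strncpy exception matters in practice: when the source is at least as long as the count, no null character is written. A common remedy, sketched here with the hypothetical helper copy_truncated, is to terminate the destination explicitly.

```c
#include <string.h>

/* Hypothetical helper: copies as much of src as fits into dest
   (an array of size bytes) and always null-terminates the result. */
void copy_truncated(char *dest, size_t size, const char *src)
{
    if (size == 0) {
        return;                      /* no room even for the terminator */
    }
    strncpy(dest, src, size - 1);    /* may leave dest unterminated */
    dest[size - 1] = '\0';           /* so terminate it ourselves */
}
```

With a 5-byte destination and the source "Happy Birthday", the result is the string "Happ".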
strcat and strncat
• strcat appends its second argument (a string) to its first argument (a character array containing a string).
• The first character of the second argument replaces the null ('\0') that terminates the string in the first argument.
• You must ensure that the array used to store the first string is large enough to store the first string, the second string and the terminating null character copied from the second string.

string.h
• Functions strncpy and strncat specify a parameter of type size_t, which is a type defined by the C standard as the integral type of the value returned by operator sizeof.

Comparison functions
• Next, we look at the string-handling library’s string-comparison functions:
  • strcmp
  • strncmp
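The size requirement for strcat can be made explicit in code. The two helpers below are hypothetical: append_checked refuses to concatenate when the result would not fit, and append_truncated uses strncat to copy only as many characters as the remaining space allows.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: appends src to dest only if dest (an array of
   dest_size bytes) can hold both strings plus the null character. */
bool append_checked(char *dest, size_t dest_size, const char *src)
{
    if (strlen(dest) + strlen(src) + 1 > dest_size) {
        return false;                /* would overflow: leave dest unchanged */
    }
    strcat(dest, src);
    return true;
}

/* Hypothetical helper: appends as much of src as fits. strncat always
   writes a terminating null after the appended characters. */
void append_truncated(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    if (used + 1 < dest_size) {
        strncat(dest, src, dest_size - used - 1);
    }
}
```

For a 12-byte array holding "Happy ", appending "New" succeeds, while appending " Year" afterwards is rejected because 9 + 5 + 1 bytes would be needed.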
• strcmp returns 0 if the strings are equal, a negative value if the first string is less than the second string and a positive value if the first string is greater than the second string.
• strncmp is equivalent to strcmp, except that strncmp compares up to a specified number of characters.
• strncmp does not compare characters following a null character in a string.

• Both functions return 0 (strangely, the equivalent of C's false value) for equality.
• Therefore, when comparing two strings for equality, the result of function strcmp or strncmp should be compared with 0 to determine whether the strings are equal.
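Comparing the result with 0 is easy to forget, so it is often wrapped in small helpers. The names strings_equal and has_prefix are hypothetical and serve only as a sketch of the idiom.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when both strings have identical contents. */
bool strings_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;        /* 0 means "equal", not "false" */
}

/* Hypothetical helper: true when s begins with prefix. strncmp looks
   at no more than strlen(prefix) characters. */
bool has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

Writing if (strcmp(a, b)) by mistake would run the branch when the strings differ, which is the opposite of what is usually intended.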
strtok
• Function strtok modifies the input string by placing '\0' at the end of each token.
• Therefore, a copy of the string should be made if the string will be used after the calls to strtok.

Character encodings
• In an effort to standardize character representations, most computer manufacturers have designed their machines to utilize one of two popular coding schemes, ASCII or EBCDIC:
  • ASCII stands for “American Standard Code for Information Interchange.”
  • EBCDIC (developed by IBM) stands for “Extended Binary Coded Decimal Interchange Code.”

Character encodings
• There are other coding schemes, but these two (ASCII and EBCDIC) are the most popular.
• The Unicode standard outlines a specification to produce consistent encoding of the vast majority of the world’s characters and symbols.
• ASCII, EBCDIC and Unicode are called character sets.
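Returning to strtok: the advice to tokenize a copy can be sketched as follows. The hypothetical helper count_tokens duplicates its input with malloc and strcpy, tokenizes the copy, and leaves the caller's string untouched.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in text without modifying it.
   Returns 0 if memory for the working copy cannot be allocated. */
size_t count_tokens(const char *text, const char *delimiters)
{
    size_t count = 0;
    char *copy = malloc(strlen(text) + 1);   /* +1 for the null character */
    if (copy == NULL) {
        return 0;
    }
    strcpy(copy, text);

    /* The first call passes the string; later calls pass NULL to continue. */
    for (char *token = strtok(copy, delimiters);
         token != NULL;
         token = strtok(NULL, delimiters)) {
        ++count;
    }

    free(copy);
    return count;
}
```

Because strtok overwrites delimiters with '\0', running it directly on text would leave only the first token visible through the original pointer.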
• The functions treat blocks of memory as character arrays and can manipulate any block of data.
• Note: Each of these functions has a more secure version described in optional Annex K of the C11 standard.

• Recall from last lecture that any pointer can be assigned directly to a pointer of type void *, and a pointer of type void * can be assigned directly to a pointer of any other type.
• The memory manipulation functions do not check for terminating null characters, because they manipulate blocks of memory that are not necessarily strings.
    char s1[17]; // create char array s1
    char s2[] = "Copy this string"; // initialize char array s2
    memcpy(s1, s2, 17);
• The function can receive a pointer to any type of object.
• The result of this function is undefined if the two objects overlap in memory (i.e., if they are parts of the same object); in such cases, use memmove.

• With memmove, copying is performed as if the bytes were copied from the second argument into a temporary array, then copied from the temporary array into the first argument.
• This allows bytes from one part of a string to be copied into another part of the same string, even if the two portions overlap.
• String-manipulation functions other than memmove that copy characters have undefined results when copying takes place between parts of the same string.
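The contrast between the two functions can be sketched with two hypothetical helpers: copy_ints uses memcpy on separate arrays of a non-character type, and remove_prefix uses memmove because source and destination lie inside the same string.

```c
#include <string.h>

/* Hypothetical helper: copies count ints between two distinct arrays.
   memcpy works on bytes, so the element size is multiplied in. */
void copy_ints(int *dest, const int *src, size_t count)
{
    memcpy(dest, src, count * sizeof *src);
}

/* Hypothetical helper: deletes the first n characters of s in place.
   The regions overlap, so memmove is required here. */
void remove_prefix(char *s, size_t n)
{
    size_t length = strlen(s);
    if (n > length) {
        n = length;
    }
    memmove(s, s + n, length - n + 1);   /* +1 moves the null character too */
}
```

Applied to "Home Sweet Home" with n equal to 5, remove_prefix leaves "Sweet Home" in the array.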
memcmp
• memcmp compares the specified number of bytes of its first argument with the corresponding bytes of its second argument.
• The function returns
  • a value greater than 0 if the first argument is greater than the second,
  • 0 if the arguments are equal, and
  • a value less than 0 if the first argument is less than the second.

memchr
• memchr searches for the first occurrence of a byte, represented as unsigned char, in the specified number of bytes of an object.
• If the byte is found, a pointer to the byte in the object is returned; otherwise, a NULL pointer is returned.
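A short sketch of both functions follows. The helper names are hypothetical: find_byte converts the pointer returned by memchr into an index, and blocks_equal applies the usual "compare with 0" idiom to memcmp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of byte within
   the first count bytes of buffer, or -1 if it does not occur. */
ptrdiff_t find_byte(const void *buffer, size_t count, unsigned char byte)
{
    const unsigned char *start = buffer;
    const unsigned char *hit = memchr(buffer, byte, count);
    return hit != NULL ? hit - start : -1;
}

/* Hypothetical helper: true when the first count bytes are identical. */
bool blocks_equal(const void *a, const void *b, size_t count)
{
    return memcmp(a, b, count) == 0;
}
```

For "ABCDEFG" and "ABCDXYZ", the first 4 bytes compare equal, while comparing 7 bytes gives a negative result because 'E' is less than 'X'.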
strerror
• strerror takes an error number and creates an error message string.
• A pointer to the string is returned.

strlen
• strlen takes a string as an argument and returns the number of characters in the string.
• The terminating null character is not included in the length.
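Both functions can be combined in a brief sketch. The helper report_error is hypothetical; it prints the message for an error number such as ERANGE (from errno.h) and returns the message length. The wording of the message is implementation-defined.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the system's message for errnum and
   returns its length, not counting the terminating null character. */
size_t report_error(int errnum)
{
    const char *message = strerror(errnum);  /* do not modify or free this */
    printf("error %d: %s\n", errnum, message);
    return strlen(message);
}
```

Storing a copy of a string needs strlen(s) + 1 bytes, since strlen leaves out the terminating null character.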
• In this chapter, we presented functions sprintf, strcpy, strncpy, strcat, strncat, strtok, strlen,
memcpy, memmove and memset.
• More secure versions of these and many other string-processing and input/output functions are
described by the C11 standard’s optional Annex K.
• If your C compiler supports Annex K, you should use the secure versions of these functions.
• Among other things, the more secure versions help prevent buffer overflows by requiring
additional parameters that specify the number of elements in a target array and by ensuring that
pointer arguments are non-NULL.
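Since Annex K is optional and many compilers do not ship it, code that wants the checked functions usually has to test for them. The sketch below assumes the standard feature macros __STDC_WANT_LIB_EXT1__ and __STDC_LIB_EXT1__; the function copy_string_checked and its fallback branch are hypothetical and only approximate what strcpy_s checks.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* must precede the include to request Annex K */
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: copies src into dest (an array of dest_size
   bytes) and reports whether the copy was performed. */
bool copy_string_checked(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL || dest_size == 0) {
        return false;
    }
#ifdef __STDC_LIB_EXT1__
    /* Annex K is available: strcpy_s receives the destination size.
       What happens on a violation depends on the installed
       runtime-constraint handler. */
    return strcpy_s(dest, dest_size, src) == 0;
#else
    /* Fallback when Annex K is missing: check the size by hand. */
    size_t needed = strlen(src) + 1;
    if (needed > dest_size) {
        return false;
    }
    memcpy(dest, src, needed);
    return true;
#endif
}
```

Either way, the extra size parameter is what lets the function refuse a copy that would overflow the target array.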