The document provides an overview of C strings, detailing their characteristics, manipulation functions, and safety considerations. It covers string literals, functions like strcpy, strcat, strlen, strcmp, and strtok, as well as memory operations related to strings. Additionally, it emphasizes the importance of handling NUL terminators and the risks associated with string operations in C programming.
Strings
C strings
(Reek, Ch. 9)

CS 3090: Safety Critical Programming in C

Review of strings
Sequence of zero or more characters, terminated by NUL (literally, the integer value 0)
NUL terminates a string, but isn’t part of it
  important for strlen(): length doesn’t include the NUL
Strings are accessed through pointers/array names
string.h contains prototypes of many useful functions

String literals
Evaluating "dog" results in memory allocated for three characters 'd', 'o', 'g', plus terminating NUL
    char *m = "dog";
Note: If m is an array name, subtle difference:
    char m[10] = "dog";
  10 bytes are allocated for this array
  This is not a string literal; it’s an array initializer in disguise!
  Equivalent to {'d', 'o', 'g', '\0'}
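
The difference between the array and the literal can be checked with sizeof and strlen. This is a minimal sketch; the three function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: size in bytes of an array initialized from "dog". */
size_t array_bytes(void)
{
    char m[10] = "dog";      /* array initializer: 10 bytes, unused ones are 0 */
    return sizeof m;         /* size of the whole array, not of the string */
}

/* Hypothetical helper: length of the string stored in that array. */
size_t array_strlen(void)
{
    char m[10] = "dog";
    return strlen(m);        /* counts 'd', 'o', 'g'; the NUL is not counted */
}

/* Hypothetical helper: bytes occupied by the literal itself. */
size_t literal_bytes(void)
{
    return sizeof "dog";     /* three characters plus the terminating NUL */
}
```

Here array_bytes() gives 10, array_strlen() gives 3, and literal_bytes() gives 4.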

String manipulation functions
Read some “source” string(s), possibly write to some “destination” location
    char *strcpy(char *dst, char const *src);
    char *strcat(char *dst, char const *src);
Programmer’s responsibility to ensure that:
  destination region is large enough to hold the result
  source and destination regions don’t overlap
    “undefined” behavior in this case: according to the C spec, anything could happen!
Example:
    char m[10] = "dog";
    strcpy(m+1, m);
  Assuming that the implementation of strcpy starts copying left-to-right without checking for the presence of a terminating NUL first, what will happen?
  (Under that assumption, each copied 'd' overwrites the next source byte, including the NUL, so the copy never finds a terminator and runs past the end of the array.)
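
One way to meet both responsibilities is to check the size before calling strcpy, and to use memmove when the regions overlap. This is a sketch only; copy_if_fits and shift_right_one are hypothetical helpers, not library functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, NUL included.
   Returns 0 on success, -1 if dst is too small (dst is left untouched).
   Assumes dst and src do not overlap. */
int copy_if_fits(char *dst, size_t dst_size, char const *src)
{
    size_t needed = strlen(src) + 1;   /* +1 for the terminating NUL */
    if (needed > dst_size)
        return -1;
    strcpy(dst, src);                  /* size has been checked above */
    return 0;
}

/* Hypothetical helper: shift the string in buf one position to the right.
   Source and destination overlap, so memmove is used instead of strcpy.
   Returns 0 on success, -1 if the shifted string would not fit. */
int shift_right_one(char *buf, size_t buf_size)
{
    size_t len = strlen(buf);
    if (len + 2 > buf_size)            /* len chars + 1 extra slot + NUL */
        return -1;
    memmove(buf + 1, buf, len + 1);    /* overlap is allowed for memmove */
    return 0;
}
```

With char m[10] = "dog", calling shift_right_one(m, sizeof m) leaves "ddog" in m, which is presumably what strcpy(m+1, m) was meant to do.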

strlen() and size_t
    size_t strlen(char const *string);  /* returns length of string */
size_t is an unsigned integer type, used to define sizes of strings and (other) memory blocks
Reasonable to think of “size” as unsigned...
But beware! Expressions involving strlen() may be unsigned (perhaps unexpectedly)
    if (strlen(x) - strlen(y) >= 0) ...
  always true!
  avoid by casting:
    ((int) (strlen(x) - strlen(y)) >= 0)
  Problem: what if x or y is a very large string?
  a better alternative:
    (strlen(x) >= strlen(y))
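
The wraparound can be seen directly by returning the difference as a size_t. A small sketch, with hypothetical function names:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the raw unsigned difference of the two lengths.
   When y is longer than x the result wraps around to a huge value. */
size_t length_difference(char const *x, char const *y)
{
    return strlen(x) - strlen(y);
}

/* Hypothetical helper: compare the lengths directly, with no subtraction. */
int at_least_as_long(char const *x, char const *y)
{
    return strlen(x) >= strlen(y);
}
```

For x = "ab" and y = "abcd", length_difference returns (size_t)-2, a very large positive number, while at_least_as_long correctly returns 0.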

strcmp()
“string comparison”
    int strcmp(char const *s1, char const *s2);
  returns a value less than zero if s1 precedes s2 in lexicographical order;
  returns zero if s1 and s2 are equal;
  returns a value greater than zero if s1 follows s2.
Source of a common mistake:
  seems reasonable to assume that strcmp returns “true” (nonzero) if s1 and s2 are equal; “false” (zero) otherwise
  In fact, exactly the opposite is the case!
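
Comparing the result against zero explicitly avoids the mistake. A minimal sketch; strings_equal and sorts_before are hypothetical wrappers:

```c
#include <string.h>

/* Hypothetical helper: 1 if the strings are equal, 0 otherwise. */
int strings_equal(char const *s1, char const *s2)
{
    return strcmp(s1, s2) == 0;    /* zero means "equal" */
}

/* Hypothetical helper: 1 if s1 precedes s2, 0 otherwise. */
int sorts_before(char const *s1, char const *s2)
{
    return strcmp(s1, s2) < 0;     /* test the sign, not a specific value */
}
```

Writing if (strcmp(a, b)) ... therefore tests for "different", not "equal".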

Restricted vs. unrestricted string functions
Restricted versions: require an extra integer argument that bounds the operation
    char *strncpy(char *dst, char const *src, size_t len);
    char *strncat(char *dst, char const *src, size_t len);
    int strncmp(char const *s1, char const *s2, size_t len);
“safer” in that they avoid problems with missing NUL terminators
safety concern with strncpy:
  If bound isn’t large enough, terminating NUL won’t be written
  Safe alternative:
    strncpy(buffer, name, BSIZE);
    buffer[BSIZE-1] = '\0';
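
The safe alternative can be wrapped in a function that also reports truncation. This is a sketch; bounded_copy is a hypothetical name, and it assumes size is at least 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most size-1 characters of name into buffer
   and always terminate the result.
   Returns 1 if name was truncated, 0 if it fit. Assumes size > 0. */
int bounded_copy(char *buffer, size_t size, char const *name)
{
    strncpy(buffer, name, size);     /* may leave buffer unterminated */
    buffer[size - 1] = '\0';         /* force a terminator in the last slot */
    return strlen(name) >= size;     /* nonzero when characters were dropped */
}
```

Copying "doghouse" into a 4-byte buffer leaves "dog" and returns 1; whether silent truncation is acceptable depends on the application.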

String searching
    char *strpbrk(char const *str, char const *group);
    /* return a pointer to the first character in str that matches *any*
       character in group; return NULL if there is no match */

    size_t strspn(char const *str, char const *group);
    /* return number of characters at beginning of str that match *any*
       character in group */
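
A short sketch of both functions in use; leading_blanks and first_vowel are hypothetical helpers chosen for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of leading blanks or tabs in str. */
size_t leading_blanks(char const *str)
{
    return strspn(str, " \t");     /* length of the prefix made of group chars */
}

/* Hypothetical helper: pointer to the first vowel in str, or NULL if none. */
char const *first_vowel(char const *str)
{
    return strpbrk(str, "aeiou");  /* first char of str that is in the group */
}
```

For "  \tdog", leading_blanks returns 3; for "strings", first_vowel points at the 'i'.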

strtok
“string tokenizer”
    char *strtok(char *s, char const *delim);
    /* delim contains all possible delimiters: characters that separate "tokens".
       if s is non-NULL: return ptr to beginning of first token in s,
       and terminate token with NUL.
       if s is NULL: use remainder of untokenized string from the last
       call to strtok */

strtok in action
    for ( token = strtok(line, whitespace);
          token != NULL;
          token = strtok(NULL, whitespace))
        printf("Next token is %s\n", token);

[Figure: contents of line during tokenization (d o g NUL c a t NUL), with the pointers line and token marked]
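
The loop above can be packaged so that the tokens are collected rather than printed. This is a sketch under stated assumptions: split_words is a hypothetical helper, whitespace is assumed to be blank, tab and newline, and line must be a writable array because strtok overwrites delimiters with NUL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line on whitespace, store up to max token
   pointers in tokens[], and return how many were stored.
   line is modified in place, so it must not be a string literal. */
size_t split_words(char *line, char *tokens[], size_t max)
{
    static char const whitespace[] = " \t\n";   /* assumed delimiter set */
    size_t count = 0;
    char *token;

    for (token = strtok(line, whitespace);
         token != NULL && count < max;
         token = strtok(NULL, whitespace))
        tokens[count++] = token;                /* points into line itself */

    return count;
}
```

Given char line[] = "dog cat  bird", the call stores pointers to "dog", "cat" and "bird" and returns 3; afterwards line[3] is NUL where the first space used to be.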

An implementation of strtok
    char* strtok(char *s, const char *delim) {
        /* old contains the remains of an earlier s value
           (note use of static) */
        static char *old = NULL;
        char *token;

        if (! s) {
            /* NULL has been passed in for s, so consult old */
            s = old;
            if (! s)
                return NULL;
        }
        if (s) {
            /* strspn returns number of delimiters at beginning of s;
               skip past these characters */
            s += strspn(s, delim);
            if (*s == 0) {
                old = NULL;
                return NULL;
            }
        }
        token = s;
        /* strpbrk gives the position of the next delimiter.
           s is updated to this position, but token still points
           to the token to return. */
        s = strpbrk(s, delim);
        if (s == NULL)
            old = NULL;
        else {
            *s = 0;
            old = s + 1;
        }
        return token;
    }

Memory operations
Like string operations, work on sequences of bytes
  but do not terminate when NUL encountered
    void *memcpy(void *dst, void const *src, size_t length);
    int memcmp(void const *a, void const *b, size_t length);
Note: memmove works like memcpy, but allows overlapping source, destination regions
Remember, these operations work on bytes
  If you want to copy N items of type T, get the length right:
    memcpy(to, from, N * sizeof(T))
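
A brief sketch of the byte-count rule and of comparing past a NUL; copy_ints and bytes_equal are hypothetical helpers:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n ints from one array to another.
   Assumes the two arrays do not overlap. */
void copy_ints(int *to, int const *from, size_t n)
{
    memcpy(to, from, n * sizeof *from);   /* length in bytes, not elements */
}

/* Hypothetical helper: 1 if the first n bytes match, NUL bytes included. */
int bytes_equal(void const *a, void const *b, size_t n)
{
    return memcmp(a, b, n) == 0;          /* does not stop at NUL */
}
```

bytes_equal("a\0b", "a\0c", 3) returns 0 because the third byte differs, whereas strcmp on the same arguments reports equality since it stops at the first NUL.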