CStrings

The document discusses strings and characters in C programming. It covers:
- The ASCII character set and how characters are stored as small integers (char data type)
- Common functions for manipulating individual characters from the ctype.h library
- How strings are represented as arrays of characters terminated by a null character ('\0')
- Functions for manipulating strings from the string.h library like strlen(), strcpy(), strcmp()
- Safe and unsafe methods for inputting strings from the user, like using a field width with scanf()
- Storing arrays of strings by initializing char arrays of fixed sizes
Programming in C

Characters and Strings


ASCII
• The American Standard Code for Information
Interchange (ASCII) character set has 128 characters
designed to encode the Roman alphabet as used in English.
• C was designed to work with ASCII and we will only use
the ASCII character set in this course. The char data
type is used to store ASCII characters in C.
• ASCII can represent 128 characters and is encoded in
one eight-bit byte with a leading 0. Seven bits can encode
the numbers 0 to 127. A char always occupies exactly 1 byte,
so sizeof(char) is 1 by definition.
• The characters 0 through 31 represent control characters
(e.g., line feed, backspace), 32-126 are printable
characters, and 127 is delete.
char type
• C supports the char data type for storing a single
character.
• char uses one byte of memory
• char constants are enclosed in single quotes
▫ char myGrade = 'A';
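Because a char holds a small integer, character constants can be used in ordinary arithmetic. The sketch below assumes an ASCII system; the helper names letter_position and show_char are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: position of an uppercase letter in the alphabet
   (0 for 'A', 25 for 'Z'), or -1 for any other character.
   Assumes ASCII, where 'A'..'Z' have consecutive codes. */
int letter_position(char c)
{
    if (c >= 'A' && c <= 'Z')
        return c - 'A';
    return -1;
}

/* Hypothetical helper: prints a char as a character and as its integer code. */
void show_char(char c)
{
    printf("%c has code %d\n", c, c);
}
```

On an ASCII system, show_char('A') prints "A has code 65" and letter_position('C') returns 2.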
ASCII Character Chart
Special Characters
• The backslash character, \, is used to indicate that
the char that follows has special meaning, e.g. for
unprintable characters and special characters.
• For example
▫ \n is the newline character
▫ \t is the tab character
▫ \" is the double quote (necessary since double quotes
are used to enclose strings)
▫ \' is the single quote (necessary since single quotes are
used to enclose chars)
▫ \\ is the backslash (necessary since \ now has special
meaning)
▫ \a is the alert (beep) character, which is unprintable
Special Char Example Code

printf("\t\tMove over\n\nWorld, here I come\n");

                Move over

World, here I come

printf("I've written \"Hello World\"\n\t many times\n\a");

I've written "Hello World"
         many times
<beep>
Character Library
• There are many functions to handle characters.
▫ #include <ctype.h> - library of functions
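• Note that the function parameter type is int, not
char. Why is this ok? A char argument is converted to int
automatically, and an int parameter also lets these functions accept EOF.
• Note that the return type for some functions is int
since ANSI C (C89) does not support the bool data type.
Recall that zero is “false”, non-zero is “true”.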
• A few of the commonly used functions are listed on
the next slide. For a full list of ctype.h functions,
type man ctype.h at the unix prompt.
ctype.h
• int isdigit (int c);
▫ Determines if c is a decimal digit ('0' - '9')
• int isxdigit(int c);
▫ Determines if c is a hexadecimal digit ('0' - '9', 'a' - 'f', or 'A' - 'F')
• int isalpha (int c);
▫ Determines if c is an alphabetic character ('a' - 'z' or 'A' - 'Z')
• int isspace (int c);
▫ Determines if c is a whitespace character (space, tab, etc)
• int isprint (int c);
▫ Determines if c is a printable character
• int tolower (int c);
• int toupper (int c);
▫ Returns c changed to lower- or upper-case respectively, if
possible
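As a rough illustration of how these functions combine, the sketch below classifies a single character and swaps its case. The names describe_char and swap_case are hypothetical; islower( ) and isupper( ) are also part of ctype.h although they are not listed above. The argument is assumed to be a value that fits in an unsigned char (or EOF), which is what the ctype.h functions expect.

```c
#include <ctype.h>

/* Hypothetical helper: returns a short description of c.
   c must be representable as an unsigned char, or be EOF. */
const char *describe_char(int c)
{
    if (isdigit(c)) return "digit";
    if (isalpha(c)) return "letter";
    if (isspace(c)) return "whitespace";
    if (isprint(c)) return "other printable";
    return "control";
}

/* Hypothetical helper: lower case becomes upper case and vice versa;
   any other character is returned unchanged. */
int swap_case(int c)
{
    if (islower(c)) return toupper(c);
    if (isupper(c)) return tolower(c);
    return c;
}
```

When the character comes from a plain char variable, casting it first, as in describe_char((unsigned char) ch), keeps the argument in the accepted range.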
Character Input/Output
• Use %c in printf( ) and fprintf( ) to output a single
character.
▫ char yourGrade = 'A';
▫ printf( "Your grade is %c\n", yourGrade);

• Input char(s) using %c with scanf( ) or fscanf( )
▫ char grade, scores[3];
▫ %c inputs the next character, which may be
whitespace
▫ scanf("%c", &grade);
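Because %c does not skip whitespace, a newline left over from earlier input is often read in place of the intended character. A common remedy, sketched below, is to put a space before %c in the format, which tells scanf( ) or fscanf( ) to skip any whitespace first. The function name read_grade and its FILE parameter are assumptions, chosen so the idea works on any input stream.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next non-whitespace character from in.
   Returns 1 on success, 0 if no character could be read. */
int read_grade(FILE *in, char *grade)
{
    /* The leading space in " %c" skips spaces, tabs and newlines. */
    return fscanf(in, " %c", grade) == 1;
}
```

For keyboard input this would be called as read_grade(stdin, &grade).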
Array of char
• An array of chars may be (partially) initialized.
This declaration reserves 20 chars (bytes) of
memory, but only the first 5 are given explicit values;
the remaining 15 are set to 0
▫ char name2 [ 20 ] = { 'B', 'o', 'b', 'b', 'y' };
• You can let the compiler count the chars for you.
This declaration allocates and initializes exactly
5 chars (bytes) of memory
▫ char name3 [ ] = { 'B', 'o', 'b', 'b', 'y' };

• An array of chars is NOT automatically a string
(name3 has no terminating null character)


Strings in C
• In C, a string is an array of characters terminated with the
“null” character ('\0', value = 0, see ASCII chart).
• A string may be defined as a char array by initializing the last
char to '\0'
▫ char name4[ 20 ] = {'B', 'o', 'b', 'b', 'y', '\0' };
• Char arrays are permitted a special initialization using a
string constant. Note that the size of the array must account
for the '\0' character.
▫ char name5[6] = "Bobby"; // this is NOT assignment
• Or let the compiler count the chars and allocate the
appropriate array size
▫ char name6[ ] = "Bobby";
• All string constants are enclosed in double quotes and include
the terminating '\0' character
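To make the role of the terminating '\0' concrete, here is a minimal sketch of how the length of a string can be found by walking the array until the null character is reached. The name count_chars is hypothetical; the library function strlen( ) does the same job.

```c
#include <stddef.h>

/* Sketch of length counting: walk the array until '\0' is found.
   Calling this on a char array with no terminator reads past the array. */
size_t count_chars(const char s[])
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}
```

For name6 above, sizeof(name6) is 6 (five letters plus the '\0') while count_chars(name6) returns 5.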
String Output
• Use %s in printf( ) or fprintf( ) to print a string. All chars
will be output until the '\0' character is seen.
▫ char name[ ] = "Bobby Smith";
▫ printf( "My name is %s\n", name);

• As with all conversion specifications, a minimum field
width and justification may be specified
▫ char book1[ ] = "Flatland";
▫ char book2[ ] = "Brave New World";
▫ printf ("My favorite books are %12s and %12s\n", book1, book2);
▫ printf ("My favorite books are %-12s and %-12s\n", book1, book2);
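The effect of the field width is easier to see when the padding is made visible. The sketch below uses snprintf( ), which accepts the same conversion specifications as printf( ) but writes into a char array, so the result can be inspected. The helper names bracket_right and bracket_left are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes s into out, right-justified in a field
   at least 12 wide, inside brackets so the padding is visible.
   Returns the length of the formatted text. */
int bracket_right(char *out, size_t size, const char *s)
{
    return snprintf(out, size, "[%12s]", s);
}

/* Same idea, but left-justified (note the '-' flag). */
int bracket_left(char *out, size_t size, const char *s)
{
    return snprintf(out, size, "[%-12s]", s);
}
```

With "Flatland" (8 chars) the two helpers produce "[    Flatland]" and "[Flatland    ]". With "Brave New World" (15 chars) both produce "[Brave New World]", because 12 is only a minimum width and longer strings are never cut off.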
Dangerous String Input
• The most common and most dangerous method to get
string input from the user is to use %s with scanf( ) or
fscanf( )
• This method interprets the next set of consecutive non-
whitespace characters as a string, stores it in the
specified char array, and appends a terminating '\0'
character.
▫ char name[22];
▫ printf(" Enter your name: ");
▫ scanf( "%s", name);

• Why is this dangerous?


• See scanfString.c and fscanfStrings.c
Safer String Input
• A safer method of string input is to use %ns with
scanf( ) or fscanf( ) where n is an integer
• This will interpret the next set of consecutive
non-whitespace characters up to a maximum of
n characters as a string, store it in the specified
char array, and append a terminating '\0'
character.
▫ char name[ 22 ];
▫ printf( "Enter your name: ");
▫ scanf("%21s", name); // note 21, not 22
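The width-limited read can be wrapped in a small function, as sketched here. The name read_name and the FILE parameter are assumptions; the format "%21s" matches the 22-char array used above.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word of at most
   21 characters from in into name, which must hold at least 22 chars.
   Returns 1 if a word was read, 0 otherwise. */
int read_name(FILE *in, char name[22])
{
    return fscanf(in, "%21s", name) == 1;
}
```

If the user types a word longer than 21 characters, only the first 21 are stored. The rest of the word is not discarded; it stays in the input and is picked up by the next read.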
C String Library
• C provides a library of string functions.
• To use the string functions, include <string.h>.
• Some of the more common functions are listed
on the next slides.
• To see all the string functions, type
man string.h at the unix prompt.
C String Library (2)
• Commonly used string functions
• These functions look for the '\0' character to determine
the end and size of the string
▫ strlen( const char string[ ] )
   ▪ Returns the number of characters in the string, not including
     the “null” character
▫ strcpy( char s1[ ], const char s2[ ] )
   ▪ Copies s2 on top of s1.
   ▪ The order of the parameters mimics the assignment operator
▫ strcmp ( const char s1[ ] , const char s2[ ] )
   ▪ Returns < 0, 0, > 0 if s1 < s2, s1 == s2 or s1 > s2 lexicographically
▫ strcat( char s1[ ] , const char s2[ ])
   ▪ Appends (concatenates) s2 to s1
C String Library (3)
• Some functions in the C String library have an
additional size parameter.
▫ strncpy( char s1[ ], const char s2[ ], size_t n )
   ▪ Copies at most n characters of s2 on top of s1.
   ▪ If s2 has n or more characters, s1 is NOT null-terminated
   ▪ The order of the parameters mimics the assignment
     operator
▫ strncmp ( const char s1[ ] , const char s2[ ], size_t n )
   ▪ Compares up to n characters of s1 with s2
   ▪ Returns < 0, 0, > 0 if s1 < s2, s1 == s2 or s1 > s2
     lexicographically
▫ strncat( char s1[ ], const char s2[ ] , size_t n)
   ▪ Appends at most n characters of s2 to s1
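Since strncpy( ) stops after n characters without necessarily writing a '\0', it is usually paired with an explicit terminator. The following is a minimal sketch of that pattern; the name copy_bounded is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity size bytes),
   truncating if necessary, and always leaves dst null-terminated.
   Does nothing when size is 0. */
void copy_bounded(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return;
    strncpy(dst, src, size - 1);  /* may leave dst without a '\0' */
    dst[size - 1] = '\0';         /* so terminate explicitly */
}
```

For example, with char small[6], the call copy_bounded(small, sizeof(small), "smith and jones") leaves small holding "smith".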
String Code
char first[10] = "bobby";
char last[15] = "smith";
char name[30];
char you[ ] = "bobo";

strcpy( name, first );
strcat( name, last );
printf( "%zu, %s\n", strlen(name), name );  /* strlen returns size_t, so use %zu */

strncpy( name, last, 2 );
printf( "%zu, %s\n", strlen(name), name );

int result = strcmp( you, first );
result = strncmp( you, first, 3 );
strcat( first, last );  /* overflow: the result needs 11 bytes, first has only 10 */
Simple Encryption
char c, msg[] = "this is a secret message";
int i = 0;
char code[26] = /* Initialize our encryption code */
    {'t','f','h','x','q','j','e','m','u','p','i','d','c',
     'k','v','b','a','o','l','r','z','w','g','n','s','y'} ;

/* Print the original phrase */
printf ("Original phrase: %s\n", msg);

/* Encrypt */
while( msg[i] != '\0' ){
    if( isalpha( msg[ i ] ) ) {
        c = tolower( msg[ i ] ) ;
        msg[ i ] = code[ c - 'a' ] ;
    }
    ++i;
}
printf("Encrypted: %s\n", msg ) ;
Arrays of Strings
• Since strings are arrays themselves, using an array
of strings can be a little tricky
• An initialized array of string constants
    char months[ ][ 10 ] = {
        "Jan", "Feb", "March", "April", "May", "June",
        "July", "Aug", "Sept", "Oct", "Nov", "Dec"
    };
    int m;
    for ( m = 0; m < 12; m++ )
        printf( "%s\n", months[ m ] );
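Each row of such an array is itself a string, so it can be passed to the string library functions. As a sketch, the hypothetical function month_index below searches the same table with strcmp( ).

```c
#include <string.h>

/* Same table as above; each row holds one string of up to 9 chars. */
static const char months[][10] = {
    "Jan", "Feb", "March", "April", "May", "June",
    "July", "Aug", "Sept", "Oct", "Nov", "Dec"
};

/* Hypothetical helper: returns the 0-based position of name in months,
   or -1 if it is not there. The comparison is case-sensitive. */
int month_index(const char *name)
{
    int count = (int)(sizeof months / sizeof months[0]);
    int m;
    for (m = 0; m < count; m++) {
        if (strcmp(months[m], name) == 0)
            return m;
    }
    return -1;
}
```

Here month_index("Sept") returns 8, while month_index("jan") returns -1 because strcmp( ) distinguishes upper and lower case.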
Arrays of Strings (2)
• An array of 12 string variables, each able to hold up
to 20 chars plus the terminating '\0'
    char names[ 12 ] [ 21 ];
    int n;
    for( n = 0; n < 12; ++n )
    {
        printf( "Please enter your name: " );
        scanf( "%20s", names[ n ] );
    }
gets( ) to read a line
• The gets( ) function is used to read a line of input
(including the whitespace) from stdin until the \n
character is encountered. The \n character is
replaced with the terminating \0 character. Note that
gets( ) was removed from the C standard in C11.
▫ #include <stdio.h>
▫ char myString[ 101 ];
▫ gets( myString );

• Why is this dangerous?


• See gets.c
fgets( ) to read a line
• The fgets( ) function is used to read a line of
input (including the whitespace) from the
specified FILE until the \n character is
encountered or until one less than the specified
number of chars has been read. Unlike gets( ),
fgets( ) stores the \n in the array before the
terminating \0.

• See fgets.c
fgets( )
#include <stdio.h>
#include <stdlib.h> /* exit */
int main ( )
{
    FILE *ifp ;
    char myLine[42 ]; /* 41 chars plus the terminating \0 */

    ifp = fopen("test_data.dat", "r");
    if (ifp == NULL) {
        printf ("Error opening test_data.dat\n");
        exit (-1);
    }

    if (fgets(myLine, 42, ifp ) == NULL) /* read up to 41 chars */
        myLine[0] = '\0'; /* nothing was read: make myLine an empty string */

    fclose(ifp); /* close the file when finished */

    /* check to see what you read */
    printf("myLine = %s\n", myLine);
    return 0;
}
Detecting EOF with fgets( )
• fgets( ) returns the memory address in which the line was
stored (the char array provided). However, when fgets( )
encounters EOF before reading any characters (or a read error
occurs), the special value NULL is returned.

FILE *inFile;
inFile = fopen( "myfile", "r" );

/* check that the file was opened */

char string[120];
while ( fgets(string, 120, inFile ) != NULL )
    printf( "%s", string ); /* each line read already ends with \n */

fclose( inFile );
Using fgets( ) instead of gets( )
• Since fgets( ) can read from any FILE, including stdin, it can
be used in place of gets( ) to get input from the user

▫ #include <stdio.h>
▫ char myString[ 101 ];

• Instead of
▫ gets( myString );

• Use
▫ fgets( myString, 101, stdin );
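One difference to keep in mind when switching from gets( ) is that fgets( ) leaves the \n in the array. The sketch below shows one common way to remove it, using strcspn( ) from string.h; the function name read_line is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (capacity size)
   and removes the trailing '\n' that fgets( ) keeps.
   Returns 1 on success, 0 on end of file or error. */
int read_line(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';  /* cut at the first newline, if any */
    return 1;
}
```

With the array above, the call would be read_line(stdin, myString, sizeof(myString)).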
“Big Enough”
• The “owner” of a string is responsible for allocating
array space which is “big enough” to store the string
(including the null character).
▫ scanf( ), fscanf( ), and gets( ) assume the char array
argument is “big enough”
• String functions that do not provide a parameter for
the length rely on the '\0' character to determine the
end of the string.
• Most string library functions do not check the size of
the string memory. E.g. strcpy

• See strings.c

What can happen?


#include <stdio.h>
#include <string.h>

int main( )
{
    char first[10] = "bobby";
    char last[15] = "smith";

    printf("first contains %zu chars: %s\n", strlen(first), first);
    printf("last contains %zu chars: %s\n", strlen(last), last);

    strcpy(first, "1234567890123"); /* too big */

    printf("first contains %zu chars: %s\n", strlen(first), first);
    printf("last contains %zu chars: %s\n", strlen(last), last);

    return 0;
}

/* output (one possible result; the overflow is undefined behavior) */
first contains 5 chars: bobby
last contains 5 chars: smith
first contains 13 chars: 1234567890123
last contains 5 chars: smith
Segmentation fault
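One way to avoid this kind of overflow is to check that the destination is “big enough” before copying or appending. The following is only a sketch of that idea; append_checked is a hypothetical name, not a library function.

```c
#include <string.h>

/* Hypothetical helper: appends src to the string in dst only if the
   result, including the terminating '\0', fits in size bytes.
   Returns 1 if the append was done, 0 if it would overflow. */
int append_checked(char *dst, size_t size, const char *src)
{
    size_t used = strlen(dst);
    size_t extra = strlen(src);
    if (used >= size || extra >= size - used)
        return 0;            /* not big enough: leave dst unchanged */
    strcat(dst, src);
    return 1;
}
```

With char first[10] = "bobby", the call append_checked(first, sizeof(first), "smith") returns 0 and leaves first untouched, because "bobbysmith" needs 11 bytes.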
The Lesson
• Avoid scanf( "%s", buffer);
• Use scanf("%100s", buffer); instead, with a field width
one less than the size of buffer

• Avoid gets( );
• Use fgets(..., ..., stdin); instead
sprintf( )
• Sometimes it's necessary to format a string in an
array of chars. Something akin to toString( ) in
Java.
• sprintf( ) works just like printf( ) or fprintf( ), but
puts its “output” into the specified character array.
• As always, the character array must be big enough.
• See sprintf.c

char message[ 100 ];
int myAge = 4;
sprintf( message, "I am %d years old\n", myAge);
printf( "%s\n", message);
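Where C99 is available, snprintf( ) is a bounded alternative: it takes the size of the array and never writes past it. The sketch below assumes a hypothetical helper named format_age built around the same message.

```c
#include <stdio.h>

/* Hypothetical helper: formats the message into out (capacity size).
   snprintf( ) never writes more than size bytes, including the '\0'.
   Returns 1 if the whole message fit, 0 if it was truncated or failed. */
int format_age(char *out, size_t size, int age)
{
    int needed = snprintf(out, size, "I am %d years old\n", age);
    return needed >= 0 && (size_t)needed < size;
}
```

With the 100-char message array above, format_age(message, sizeof(message), myAge) returns 1. With an array of only 8 chars it returns 0 and the array holds the truncated, still null-terminated text "I am 4 ".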
