Lecture 09: Text Processing in Java
This document covers text processing in Java, focusing on the String and Character classes, as well as the StringBuilder class for mutable strings. It details methods for string manipulation, comparison, and regular expressions for pattern matching. The document also explains the immutability of strings and the use of type-wrapper classes for primitive types.
Objectives
• Review char, the Character class and the String class:
• What does it mean for the String class to be immutable?
• Use the StringBuilder class to deal with mutable strings
• Learn about regular expressions
• Use regular expressions for matching and splitting strings
Education, Inc. All Rights Reserved.
Strings
Class String represents immutable strings.
▪ String literals (stored in memory as String objects) are written as a sequence of characters in double quotation marks.
Class StringBuilder represents mutable strings.
Both classes are in the java.lang package.
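The difference between immutable and mutable strings can be illustrated with a minimal sketch. The class and method names here (ImmutabilityDemo, afterStringCalls, afterBuilderCalls) are hypothetical and chosen only for this illustration; the String and StringBuilder calls are standard java.lang methods.

```java
final class ImmutabilityDemo {
    // String methods return new String objects; the original is never changed.
    static String afterStringCalls() {
        String s = "hello";
        s.toUpperCase();     // result discarded, s still refers to "hello"
        s.concat(" world");  // result discarded as well
        return s;
    }

    // StringBuilder methods modify the builder's own character buffer.
    static String afterBuilderCalls() {
        StringBuilder sb = new StringBuilder("hello");
        sb.append(" world"); // changes sb in place
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(afterStringCalls());
        System.out.println(afterBuilderCalls());
    }
}
```

To keep the result of a String method, assign the returned value, for example s = s.toUpperCase();.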
14.3.1 String Constructors
The no-argument constructor creates a String that contains no characters (i.e., the empty string, which can also be represented as "") and has a length of 0. The constructor that takes a String object copies the argument into the new String. The constructor that takes a char array creates a String containing a copy of the characters in the array. The constructor that takes a char array and two integers (a starting offset and a character count) creates a String containing the specified portion of the array.
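The four constructors described above can be sketched as follows. The class name StringConstructorsDemo, the method build and the sample data are assumptions made for this illustration only.

```java
final class StringConstructorsDemo {
    // Builds one String with each of the four constructors discussed above.
    static String[] build() {
        char[] charArray = {'b', 'i', 'r', 't', 'h', ' ', 'd', 'a', 'y'};
        String original = "hello";

        String s1 = new String();                // empty string, length 0
        String s2 = new String(original);        // copy of an existing String
        String s3 = new String(charArray);       // copy of the whole array
        String s4 = new String(charArray, 6, 3); // 3 characters starting at index 6

        return new String[] {s1, s2, s3, s4};
    }

    public static void main(String[] args) {
        for (String s : build()) {
            System.out.println("[" + s + "] length " + s.length());
        }
    }
}
```

In everyday code a string literal such as "hello" is usually preferred over new String("hello"), since strings are immutable and an explicit copy is rarely needed.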
14.3.3 Comparing Strings (cont.)
String methods startsWith and endsWith determine whether a string starts with or ends with a particular sequence of characters.
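A short sketch of startsWith and endsWith follows. The class StartsEndsDemo, its helper methods and the word list are hypothetical; the example also uses the two-argument overload startsWith(prefix, offset), which begins the comparison at the given index.

```java
final class StartsEndsDemo {
    // Counts the words that begin with the given prefix.
    static int countStartingWith(String[] words, String prefix) {
        int count = 0;
        for (String word : words) {
            if (word.startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    // Counts the words that end with the given suffix.
    static int countEndingWith(String[] words, String suffix) {
        int count = 0;
        for (String word : words) {
            if (word.endsWith(suffix)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] words = {"started", "starting", "ended", "ending"};
        System.out.println(countStartingWith(words, "st"));
        System.out.println(countEndingWith(words, "ed"));
        // Two-argument overload: does "art" appear starting at index 2?
        System.out.println("started".startsWith("art", 2));
    }
}
```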
14.4.1 StringBuilder Constructors
The no-argument constructor creates a StringBuilder with no characters in it and an initial capacity of 16 characters. The constructor that takes an integer argument creates a StringBuilder with no characters in it and the initial capacity specified by the integer argument. The constructor that takes a String argument creates a StringBuilder containing the characters in the String argument. The initial capacity is the number of characters in the String argument plus 16. Method toString of class StringBuilder returns the StringBuilder contents as a String.
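The initial capacities described above can be checked with a small sketch. The class BuilderConstructorsDemo and its methods are hypothetical names used only here.

```java
final class BuilderConstructorsDemo {
    // Returns the initial capacity produced by each of the three constructors.
    static int[] capacities() {
        StringBuilder b1 = new StringBuilder();        // capacity 16
        StringBuilder b2 = new StringBuilder(10);      // capacity 10
        StringBuilder b3 = new StringBuilder("hello"); // capacity 5 + 16
        return new int[] {b1.capacity(), b2.capacity(), b3.capacity()};
    }

    // toString returns the builder's current contents as a String.
    static String contents() {
        return new StringBuilder("hello").toString();
    }

    public static void main(String[] args) {
        for (int capacity : capacities()) {
            System.out.println(capacity);
        }
        System.out.println(contents());
    }
}
```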
14.4.2 StringBuilder Methods length, capacity, setLength and ensureCapacity
Methods length and capacity return the number of characters currently in a StringBuilder and the number of characters that can be stored in a StringBuilder without allocating more memory, respectively. Method ensureCapacity guarantees that a StringBuilder has at least the specified capacity. Method setLength increases or decreases the length of a StringBuilder.
▪ If the specified length is less than the current number of characters, the buffer is truncated to the specified length.
▪ If the specified length is greater than the number of characters, null characters are appended until the total number of characters in the StringBuilder is equal to the specified length.
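The four methods above can be tried out with the following sketch. The class BuilderLengthDemo, its helper methods and the sample text are assumptions for illustration. The exact capacity after ensureCapacity depends on the library's growth policy, so the example only relies on it being at least the requested minimum.

```java
final class BuilderLengthDemo {
    // Hypothetical sample text used only for this illustration (19 characters).
    static final String TEXT = "Hello, how are you?";

    // Returns {length, capacity} of a builder created from TEXT.
    static int[] lengthAndCapacity() {
        StringBuilder sb = new StringBuilder(TEXT);
        return new int[] {sb.length(), sb.capacity()};
    }

    // Returns the capacity after requesting at least the given minimum.
    static int capacityAfterEnsure(int minimum) {
        StringBuilder sb = new StringBuilder(TEXT);
        sb.ensureCapacity(minimum);
        return sb.capacity();
    }

    // Returns the contents after truncating or padding to newLength.
    static String afterSetLength(int newLength) {
        StringBuilder sb = new StringBuilder(TEXT);
        sb.setLength(newLength);
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] lc = lengthAndCapacity();
        System.out.println("length = " + lc[0] + ", capacity = " + lc[1]);
        System.out.println("capacity after ensureCapacity(75) = " + capacityAfterEnsure(75));
        System.out.println("after setLength(10): " + afterSetLength(10));
    }
}
```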
Escape Sequences in Strings
Some special characters, such as tab, newline and the double quotation mark, cannot easily be typed directly into a string literal. Escape sequences (a backslash followed by a character) are used to include them in strings.
▪ System.out.println("She said:\n\t\"Hello!\"\n to me.");
This statement prints three lines; the second line is indented by a tab:
She said:
	"Hello!"
 to me.
14.6 Tokenizing Strings
When you read a sentence, your mind breaks it into tokens: individual words and punctuation marks that convey meaning. Compilers also perform tokenization. String method split breaks a String into its component tokens and returns an array of Strings. Tokens are separated by delimiters.
▪ Typically white-space characters such as space, tab, newline and carriage return.
▪ Other characters can also be used as delimiters to separate tokens.
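Tokenizing with split can be sketched as follows. The class TokenizeDemo and its methods are hypothetical. The argument to split is a regular expression, so "\\s+" is used here to treat any run of white-space characters as one delimiter; trimming first avoids an empty leading token when the input starts with white space.

```java
final class TokenizeDemo {
    // Splits on runs of white space (space, tab, newline, carriage return).
    static String[] tokens(String sentence) {
        return sentence.trim().split("\\s+");
    }

    // Splits on a different delimiter: a comma with optional surrounding spaces.
    static String[] commaSeparated(String line) {
        return line.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        for (String token : tokens("This is a sentence\twith seven\ntokens")) {
            System.out.println(token);
        }
        for (String field : commaSeparated("red, green ,blue")) {
            System.out.println(field);
        }
    }
}
```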
14.7 Regular Expressions, Class Pattern and Class Matcher (cont.)
To match a set of characters, use the special characters [], - and ^.
▪ "[aeiou]" matches any single lowercase vowel.
▪ "[A-Y]" matches any single uppercase letter except for Z.
▪ "[A-z]" matches any single character with an integer value between uppercase A and lowercase z, which includes nonletters such as [ and \.
▪ "[A-Za-z]" matches any single uppercase or lowercase letter.
▪ If the first character in the brackets is "^", the expression accepts any character other than those indicated.
● "[^Z]" matches any single character other than capital Z, including lowercase letters and nonletters such as newline (\n).
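Each character class above can be tested against a one-character string, as in this sketch. The class CharClassDemo and the helper matchesClass are hypothetical; the helper simply delegates to String method matches.

```java
final class CharClassDemo {
    // Returns true if the one-character string is accepted by the character class.
    static boolean matchesClass(String character, String characterClass) {
        return character.matches(characterClass);
    }

    public static void main(String[] args) {
        System.out.println(matchesClass("e", "[aeiou]"));   // a vowel
        System.out.println(matchesClass("Z", "[A-Y]"));     // Z is outside A-Y
        System.out.println(matchesClass("[", "[A-z]"));     // [ lies between Z and a
        System.out.println(matchesClass("[", "[A-Za-z]"));  // letters only
        System.out.println(matchesClass("\n", "[^Z]"));     // anything but Z
    }
}
```

The range "[A-z]" is usually a mistake when only letters are intended, which is why "[A-Za-z]" is the safer choice.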
14.7 Regular Expressions, Class Pattern and Class Matcher (cont.)
String method matches receives a String that specifies the regular expression and matches the entire contents of the String object on which it's called to the regular expression.
▪ The method returns a boolean indicating whether the match succeeded.
A regular expression consists of literal characters and special symbols.
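A minimal sketch of method matches follows, assuming a hypothetical validation rule: a first name is one uppercase letter followed by any number of letters. The class MatchesDemo, its methods and the rule itself are illustrative assumptions. Because matches must match the whole string, the sketch also contrasts it with Matcher method find from java.util.regex, which looks for the pattern anywhere in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchesDemo {
    // Hypothetical rule: one uppercase letter, then zero or more letters.
    static final String NAME_REGEX = "[A-Z][a-zA-Z]*";

    // True only if the whole string satisfies the rule.
    static boolean isValidFirstName(String text) {
        return text.matches(NAME_REGEX);
    }

    // True if any part of the string satisfies the rule.
    static boolean containsName(String text) {
        Matcher matcher = Pattern.compile(NAME_REGEX).matcher(text);
        return matcher.find();
    }

    public static void main(String[] args) {
        System.out.println(isValidFirstName("Jane"));
        System.out.println(isValidFirstName("jane"));
        System.out.println(isValidFirstName("Jane Doe"));
        System.out.println(containsName("Jane Doe"));
    }
}
```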