String Tokens
Parsing is the division of text into a set of discrete parts, or tokens, which in a certain sequence can convey a semantic meaning. The StringTokenizer class provides the first step in this parsing process, often called the lexer (lexical analyzer) or scanner. StringTokenizer implements the Enumeration interface. Therefore, given an input string, you can enumerate the individual tokens contained in it using StringTokenizer.

To use StringTokenizer, you specify an input string and a string that contains delimiters. Delimiters are characters that separate tokens. Each character in the delimiters string is considered a valid delimiter; for example, ",;:" sets the delimiters to a comma, semicolon, and colon. The default set of delimiters consists of the whitespace characters: space, tab, newline, carriage return, and form feed.

The StringTokenizer constructors are shown here:

StringTokenizer(String str)
StringTokenizer(String str, String delimiters)
StringTokenizer(String str, String delimiters, boolean delimAsToken)

In all versions, str is the string that will be tokenized. In the first version, the default delimiters are used. In the second and third versions, delimiters is a string that specifies the delimiters. In the third version, if delimAsToken is true, then the delimiters are also returned as tokens when the string is parsed. Otherwise, the delimiters are not returned. Delimiters are not returned as tokens by the first two forms.

Once you have created a StringTokenizer object, the nextToken( ) method is used to extract consecutive tokens. The hasMoreTokens( ) method returns true while there are more tokens to be extracted. Since StringTokenizer implements Enumeration, the hasMoreElements( ) and nextElement( ) methods are also implemented, and they act the same as hasMoreTokens( ) and nextToken( ), respectively.

Here is an example that creates a StringTokenizer to parse "key=value" pairs. Consecutive sets of "key=value" pairs are separated by a semicolon.
    // Demonstrate StringTokenizer.
    import java.util.StringTokenizer;

    class STDemo {
        // Input made of "key=value" pairs, each terminated by a semicolon.
        static String in = "title=Java-Samples;" +
                           "author=Emiley J;" +
                           "publisher=java-samples.com;" +
                           "copyright=2007;";

        public static void main(String[] args) {
            // Both '=' and ';' act as delimiters, so keys and values alternate.
            StringTokenizer st = new StringTokenizer(in, "=;");
            while (st.hasMoreTokens()) {
                String key = st.nextToken();
                String val = st.nextToken();
                System.out.println(key + "\t" + val);
            }
        }
    }

The output from this program is shown here:

    title   Java-Samples
    author  Emiley J
    publisher       java-samples.com
    copyright       2007
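The effect of the third constructor argument is easiest to see side by side. The following is a minimal sketch, not part of the original example: the class name DelimTokenDemo and the helper tokens( ) are hypothetical names chosen for illustration. It collects the tokens of the same kind of input, first without and then with the delimiters returned as tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the name and helper method are illustrative only.
class DelimTokenDemo {
    // Collects every token; when returnDelims is true, each delimiter
    // character is also returned as its own one-character token.
    static List<String> tokens(String input, String delims, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(input, delims, returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "title=Java-Samples;copyright=2007;";
        // prints [title, Java-Samples, copyright, 2007]
        System.out.println(tokens(input, "=;", false));
        // prints [title, =, Java-Samples, ;, copyright, =, 2007, ;]
        System.out.println(tokens(input, "=;", true));
    }
}
```

Returning the delimiters is useful when the parser needs to know which separator ended a token, for instance to tell a key (followed by "=") from a value (followed by ";").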
Introduction
The java.util.StringTokenizer class allows an application to break a string into tokens. It is a legacy class that is retained for compatibility reasons, although its use is discouraged in new code. Its methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
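For new code, the class documentation suggests the split method of String or the java.util.regex package instead. As a rough sketch of what that can look like for "key=value" input, the following uses String.split; the class name SplitDemo and the helper parsePairs( ) are hypothetical, and the handling of a missing value (stored as an empty string) is an assumption made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class showing String.split as an alternative.
class SplitDemo {
    // Parses "key=value;key=value;" into an insertion-ordered map.
    static Map<String, String> parsePairs(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String pair : input.split(";")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as ";;"
            }
            // Limit of 2 keeps any further '=' characters inside the value.
            String[] kv = pair.split("=", 2);
            pairs.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints {title=Java-Samples, author=Emiley J}
        System.out.println(parsePairs("title=Java-Samples;author=Emiley J;"));
    }
}
```

Unlike a tokenizer built with "=;" as delimiters, this version keeps each pair together, so a value that is missing or that contains "=" does not shift the remaining keys and values out of step.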
Class declaration
Following is the declaration for the java.util.StringTokenizer class:

public class StringTokenizer extends Object implements Enumeration<Object>
Class constructors
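S.N. Constructor & Description

1 StringTokenizer(String str)
  This constructor constructs a string tokenizer for the specified string, using the default delimiters.

2 StringTokenizer(String str, String delim)
  This constructor constructs a string tokenizer for the specified string, using the characters in delim as delimiters.

3 StringTokenizer(String str, String delim, boolean returnDelims)
  This constructor constructs a string tokenizer for the specified string, using the characters in delim as delimiters. If returnDelims is true, the delimiter characters are also returned as tokens.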
Class methods
S.N. Method & Description

1 int countTokens()
  This method calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception.

2 boolean hasMoreElements()
  This method returns the same value as the hasMoreTokens method.

3 boolean hasMoreTokens()
  This method tests if there are more tokens available from this tokenizer's string.

4 Object nextElement()
  This method returns the same value as the nextToken method, except that its declared return value is Object rather than String.

5 String nextToken()
  This method returns the next token from this string tokenizer.

6 String nextToken(String delim)
  This method returns the next token in this string tokenizer's string, after first changing the set of delimiters to the characters in delim.
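The two less obvious methods in this table are countTokens( ) and nextToken(String delim). The sketch below illustrates both under simple assumptions: the class name TokenizerMethodsDemo and its helper methods are hypothetical, and the input is assumed to be well formed (a command word followed by at least one comma-separated argument).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; names are illustrative only.
class TokenizerMethodsDemo {
    // Returns the remaining-token count before and after consuming one token.
    // Assumes the input contains at least one token.
    static int[] countBeforeAndAfter(String input) {
        StringTokenizer st = new StringTokenizer(input);
        int before = st.countTokens(); // does not advance the tokenizer
        st.nextToken();
        int after = st.countTokens();
        return new int[] { before, after };
    }

    // Reads the command word with the default whitespace delimiters, then
    // switches to comma and space for the arguments. The new delimiter set
    // stays in effect for the later nextToken() calls.
    static List<String> commandAndArgs(String line) {
        StringTokenizer st = new StringTokenizer(line);
        List<String> parts = new ArrayList<>();
        if (!st.hasMoreTokens()) {
            return parts;
        }
        parts.add(st.nextToken());
        if (st.hasMoreTokens()) {
            parts.add(st.nextToken(", ")); // changes the delimiters, then reads
        }
        while (st.hasMoreTokens()) {
            parts.add(st.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        int[] counts = countBeforeAndAfter("one two three");
        // prints 3 then 2
        System.out.println(counts[0] + " then " + counts[1]);
        // prints [copy, a.txt, b.txt, c.txt]
        System.out.println(commandAndArgs("copy a.txt,b.txt,c.txt"));
    }
}
```

Note that the new delimiter set includes the space as well as the comma. After the first token is read, the tokenizer is positioned on the space that followed it, so a set containing only "," would leave that space at the front of the next token.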
Methods inherited
This class inherits methods from the following class: java.lang.Object
You can think of the StringTokenizer as a specialized form of StreamTokenizer, where each token that is returned is a String. Which one to use depends on what you want to do with the resulting tokens and how complex a parsing job you are dealing with. The StringTokenizer parses a string into a series of tokens. It is then your responsibility to figure out what each returned token is, beyond its being a run of characters separated by delimiters. Usually, you just treat the tokens as words when using StringTokenizer. A StreamTokenizer, on the other hand, lets you ask whether the next token is a number, word, quoted string, end-of-file, end-of-line, comment, or whitespace. In this respect the StreamTokenizer is smarter, though it can be considerably harder to use.
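To make the comparison concrete, here is a minimal sketch of a StreamTokenizer classifying tokens with its default settings. The class name StreamTokenizerDemo, the helper classify( ), and the labels it produces are hypothetical choices for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names and labels are illustrative only.
class StreamTokenizerDemo {
    // Labels each token by the type that StreamTokenizer reports for it.
    static List<String> classify(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER " + st.nval); // nval is a double
                        break;
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD " + st.sval);
                        break;
                    case '"':
                    case '\'':
                        // For quoted strings, ttype is the quote character.
                        out.add("QUOTED " + st.sval);
                        break;
                    default:
                        // Any other single character, such as '='.
                        out.add("CHAR " + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [WORD width, CHAR =, NUMBER 42.0, QUOTED hello world]
        System.out.println(classify("width = 42 \"hello world\""));
    }
}
```

A StringTokenizer given the same input would return only strings, leaving it to the caller to work out that 42 is a number and that the two quoted words belong together.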