In the previous sections we learned how to use the split() method of the String class to decompose a string. In this section we will learn how to use a StringTokenizer object to do the same job. Unlike the split() method, StringTokenizer does not use a regular expression as the separator; it treats each character you supply as a literal delimiter.
First of all, we need to introduce a concept: the token (sometimes translated as "language symbol"). When we analyze a string, we decompose it into words that can be used independently, and these words are called tokens.
For example, for the string "You are welcome", if the space is used as the delimiter, the string has three words, that is, three tokens. For the string "You,are,welcome", if the comma is used as the delimiter, the string also has three tokens.
To decompose a string into tokens, we can use the StringTokenizer class in the java.util package, which has two commonly used constructors:
StringTokenizer(String s): Constructs an analyzer for the string s using the default delimiters, that is, the space, line feed, carriage return, tab, and form feed characters (several consecutive spaces are regarded as one separator).
StringTokenizer(String s, String delim): Constructs an analyzer for the string s. Each character in the parameter delim is used as a delimiter.
Note: Any run of consecutive delimiter characters, in any combination, is still treated as a single delimiter.
For example:
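// Default delimiters (whitespace)
StringTokenizer fenxi = new StringTokenizer("you are welcome");
// Alternatively, comma and semicolon as delimiters
StringTokenizer fenxi = new StringTokenizer("you,are;welcome", ",;");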
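The following is a minimal sketch of how the two constructors behave, under a few assumptions: the class name DelimiterDemo, the helper tokens(), and the sample strings are made up for illustration and are not part of the original tutorial. The helper drains an analyzer with hasMoreTokens() and nextToken(), which are described next. The last two lines contrast a literal "." delimiter with split("."), where "." is read as a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterDemo {
    // Hypothetical helper: collects every remaining token of an analyzer into a list.
    static List<String> tokens(StringTokenizer st) {
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Default delimiters: the run of three spaces acts as one separator.
        System.out.println(tokens(new StringTokenizer("you   are welcome")));
        // prints [you, are, welcome]

        // Custom delimiters: ',' or ';'. The run ",,;" still separates only once.
        System.out.println(tokens(new StringTokenizer("you,are;welcome,,;done", ",;")));
        // prints [you, are, welcome, done]

        // Delimiters are literal characters, so "." needs no escaping.
        System.out.println(tokens(new StringTokenizer("a.b.c", ".")));
        // prints [a, b, c]

        // split(".") treats "." as a regex that matches every character,
        // so only empty strings remain and all of them are discarded.
        System.out.println("a.b.c".split(".").length);
        // prints 0
    }
}
```

Because a run of delimiters counts as one separator, StringTokenizer never returns an empty token. If empty fields matter (for example in comma-separated data), split() is usually the better fit.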
We call a StringTokenizer object a string analyzer. An analyzer can use the nextToken() method to obtain the tokens in the string one by one: each call to nextToken() returns the next token in the string. Each time a token is obtained, the count of remaining tokens kept by the analyzer decreases by 1. The initial value of this count is equal to the number of words in the string.
Usually a while loop is used to obtain the tokens one by one. To control the loop, you can use the hasMoreTokens() method of the StringTokenizer class. As long as there are tokens left in the string, that is, as long as the count is greater than 0, this method returns true; otherwise it returns false. In addition, you can call the analyzer's countTokens() method at any time to get the current value of the count.
For example:
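import java.util.*;

public class Main {
    public static void main(String[] args) {
        String s = "welcome to dotcpp(thank you),nice to meet you";
        // Delimiters: '(', ')', space, and ','
        StringTokenizer fenxi = new StringTokenizer(s, "() ,");
        // Read the count before the loop; it shrinks as tokens are consumed
        int number = fenxi.countTokens();
        while (fenxi.hasMoreTokens()) {
            String str = fenxi.nextToken();
            System.out.print(str + " ");
        }
        System.out.println("Total words: " + number);
    }
}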
The running results are as follows:
welcome to dotcpp thank you nice to meet you Total words: 9
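To see how the count changes while the string is being analyzed, the sketch below records countTokens() before the first call to nextToken() and again after every call. The class name CountDemo, the helper remainingCounts(), and the sample string are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

public class CountDemo {
    // Hypothetical helper: returns the value of countTokens() before any token
    // is read, followed by its value after each call to nextToken().
    static int[] remainingCounts(String s, String delim) {
        StringTokenizer st = new StringTokenizer(s, delim);
        int[] counts = new int[st.countTokens() + 1];
        int i = 0;
        counts[i++] = st.countTokens();      // initial count = number of words
        while (st.hasMoreTokens()) {
            st.nextToken();                  // consume one token
            counts[i++] = st.countTokens();  // one fewer token remains
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(remainingCounts("nice to meet you", " ")));
        // prints [4, 3, 2, 1, 0]
    }
}
```

This is why the example above stores countTokens() in the variable number before entering the loop: calling it after the loop would return 0.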