

Tip of the Day
Language: Java
Expertise: Beginner
Apr 5, 1999


StringTokenizer: Multiple Delimiter Characters

Question:
I am writing a text search applet that searches through a text file loaded from the server. I am using StringTokenizer to isolate the individual words. The problem is that the words in the file are not necessarily separated by spaces or just one specific delimiter. How do I make StringTokenizer ignore extra characters such as quotes and just tokenize the single word?

Answer:
The documentation for StringTokenizer can lead you to believe that it recognizes only a single delimiter at a time. But if you read it carefully, you will find that StringTokenizer can recognize any number of delimiters. The delimiter argument of the StringTokenizer constructor is a string in which every character is interpreted as a delimiter. The string as a whole is not the delimiter; each of its constituent characters is. For example, to use spaces, commas, and colons as delimiters, you would create a StringTokenizer with:

StringTokenizer tokenizer = new StringTokenizer(input, " ,:");
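
Applied to the word-search problem in the question, the same idea might look like the sketch below. The class name WordTokenizer, the method words, and the particular delimiter set are assumptions made for illustration; adjust the delimiter string to match the punctuation that actually appears in your file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class WordTokenizer {
    // Hypothetical delimiter set: whitespace plus quotes and common punctuation.
    // Each character in this string is treated as a separate delimiter.
    static final String DELIMITERS = " \t\r\n\"'.,;:!?()";

    // Returns the words in text, with all delimiter characters discarded.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, DELIMITERS);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [He, said, hello, world, then, left]
        System.out.println(words("He said, \"hello world\"; then left."));
    }
}
```

Because the apostrophe is in the delimiter set, a contraction such as "don't" would come back as two tokens, "don" and "t". Remove that character from the set if contractions should stay whole.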

Using StringTokenizer to parse a file is generally inefficient if you read the file a line at a time, because you have to create a new tokenizer for each line. In addition, the parsing ability of StringTokenizer is minimal. For instance, you cannot use a multicharacter delimiter, because every character of the delimiter string is treated as a delimiter on its own. For more complicated tokenization, you may want to look into a regular expression library or a lexer generator.
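
To illustrate the multicharacter case, here is a minimal sketch that assumes a JDK with the java.util.regex package (added in Java 1.4, after this tip was written). The class name MultiCharSplit and the method splitOn are hypothetical. It splits on the literal two-character sequence "::" and shows how StringTokenizer treats the same delimiter string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

public class MultiCharSplit {
    // Splits input on the whole delimiter string, taken literally.
    // Pattern.quote keeps regex metacharacters in the delimiter from being interpreted.
    static List<String> splitOn(String input, String delimiter) {
        return Arrays.asList(input.split(Pattern.quote(delimiter)));
    }

    // For comparison: StringTokenizer treats each character of the
    // delimiter string as its own delimiter.
    static List<String> tokenize(String input, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitOn("a:b::c", "::"));   // prints [a:b, c]
        System.out.println(tokenize("a:b::c", "::"));  // prints [a, b, c]
    }
}
```

Note that String.split discards trailing empty strings by default, so input ending in the delimiter does not produce an empty final element.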
