
Tip of the Day
Language: Java
Expertise: Intermediate
Sep 25, 2009

Using String.split(String) vs. Using a StringTokenizer

Most programmers use the String.split(String) method to convert a String into a String array by specifying a delimiter. However, I feel it's unsafe to rely on split() in some cases, because its argument is interpreted as a regular expression rather than as a literal delimiter, so the result isn't always what you expect. For example, the first array element sometimes comes back as an empty string even though the input contains no leading whitespace. Here's an example where split() appears to fail:

public class StringTest {
   public static void main(String[] args) {
      // "^" is passed to split() as a regular expression, not as a literal.
      final String SPLIT_STR = "^";
      final String mainStr = "Token-1^Token-2^Token-3";
      final String[] splitStr = mainStr.split(SPLIT_STR);

      // Position of the first literal caret in the input string.
      System.out.println("First Index Of ^ : " + mainStr.indexOf("^"));

      for(int index=0; index < splitStr.length; index++) {
         System.out.println("Split : " + splitStr[index]);
      }
   }
}

This program outputs:

  First Index Of ^ : 7
  Split : Token-1^Token-2^Token-3

But the expected output would be:

  First Index Of ^ : 7
  Split : Token-1
  Split : Token-2
  Split : Token-3

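In this case, the split doesn't work because split() treats its argument as a regular expression, in which an unescaped caret is the start-of-input anchor rather than a literal character, so the delimiter never matches a "^" inside the string. The workaround is to escape the caret by declaring SPLIT_STR = "\\^". With that change, the output matches the expected output.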
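
As a minimal sketch of that workaround, the class below splits the same string two ways: with the hand-escaped "\\^" pattern, and with Pattern.quote(), which escapes any delimiter for you. The class name SplitLiteralDemo and the helper splitLiteral() are hypothetical names chosen for illustration; they are not part of the original tip.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitLiteralDemo {
    // Hypothetical helper: splits on a delimiter taken literally,
    // by quoting it before it is compiled as a regular expression.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        final String mainStr = "Token-1^Token-2^Token-3";

        // Hand-escaped caret: the regex \^ matches a literal '^'.
        System.out.println(Arrays.toString(mainStr.split("\\^")));

        // Same result without having to know which characters need escaping.
        System.out.println(Arrays.toString(splitLiteral(mainStr, "^")));
    }
}
```

Both lines print [Token-1, Token-2, Token-3]. Quoting is probably the more robust choice when the delimiter comes from a variable or from user input, since other characters such as ".", "|", and "*" are regex metacharacters too.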

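A safer way to split the string is to use the StringTokenizer API, which treats its delimiter argument as a set of literal characters rather than as a regular expression, so no escaping is needed. Here's an example: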

import java.util.StringTokenizer;

public class StringTest {
   public static void main(String[] args) {
      final String SPLIT_STR = "^";
      final String mainStr = "Token-1^Token-2^Token-3";
      // Each character of SPLIT_STR is treated as a literal delimiter.
      final StringTokenizer stToken = new StringTokenizer(
         mainStr, SPLIT_STR);
      final String[] splitStr = new String[stToken.countTokens()];

      int index = 0;
      while(stToken.hasMoreTokens()) {
         splitStr[index++] = stToken.nextToken();
      }

      for(index=0; index < splitStr.length; index++) {
         System.out.println("Tokenizer : " + splitStr[index]);
      }
   }
}

The output of the preceding program is:

Tokenizer : Token-1
Tokenizer : Token-2
Tokenizer : Token-3
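
One difference is worth checking before switching between the two approaches: StringTokenizer never returns empty tokens, while split() keeps empty strings that fall between two adjacent delimiters. The sketch below illustrates this on an input with a doubled caret. The class name TokenizerVsSplitDemo and both helper methods are hypothetical, introduced only for this comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

public class TokenizerVsSplitDemo {
    // Hypothetical helper: collects every token produced by a StringTokenizer.
    static List<String> tokenize(String input, String delimiters) {
        final StringTokenizer st = new StringTokenizer(input, delimiters);
        final List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Hypothetical helper: split() on a delimiter taken literally.
    static List<String> splitLiteral(String input, String delimiter) {
        return Arrays.asList(input.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        final String withGap = "Token-1^^Token-3"; // empty field in the middle

        // The tokenizer skips the empty field between the two carets.
        System.out.println("Tokenizer : " + tokenize(withGap, "^"));

        // split() keeps it as an empty string.
        System.out.println("Split     : " + splitLiteral(withGap, "^"));
    }
}
```

Here the tokenizer yields two tokens (Token-1 and Token-3), while split() yields three elements with an empty string in the middle. If the position of each field matters, as in delimited records, the split() behavior is likely the one you want.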

Hrudananda Pattanaik