Using String.split(String) vs. Using a StringTokenizer

Most programmers use the String.split(String) method to convert a String to a String array by specifying a delimiter. However, I feel it’s unsafe to rely on the split() method in some cases, because it doesn’t always behave the way you might expect: its argument is interpreted as a regular expression, not as a literal delimiter. For example, sometimes after calling split() the first array index holds a space character even though the string contains no leading space. Here’s an example where split() fails:

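public class StringTest {
   public static void main(String[] args) {
      // split() reads "^" as a regex metacharacter, not as a literal caret
      final String SPLIT_STR = "^";
      final String mainStr = "Token-1^Token-2^Token-3";
      final String[] splitStr = mainStr.split(SPLIT_STR);
      System.out.println("First Index Of ^ : " +
         mainStr.indexOf(SPLIT_STR));
      for(int index=0; index < splitStr.length; index++) {
         System.out.println("Split : " + splitStr[index]);
      }
   }
}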

This program outputs:

  First Index Of ^ : 7
  Split : Token-1^Token-2^Token-3

But the expected output would be:

  First Index Of ^ : 7
  Split : Token-1
  Split : Token-2
  Split : Token-3

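In this case, the split doesn't work because split() treats its argument as a regular expression, and in a regex the caret is a metacharacter that anchors the match to the start of the input. To match a literal caret, the delimiter needs to be escaped. The workaround is to pass "\\^" to split() (two backslashes in Java source, which produce the regex \^). Note that indexOf() takes a literal string, so it should still be called with the unescaped "^". With that change, the output matches the expected output.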
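Hand-escaping works for a known delimiter, but it is easy to miss a metacharacter when the delimiter comes from elsewhere. As a minimal sketch, assuming a hypothetical class LiteralSplitDemo and helper splitLiteral (neither is part of the original tip), the standard java.util.regex.Pattern.quote method can wrap any delimiter so that split() treats it as literal text:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the class and method names are illustrative only.
final class LiteralSplitDemo {

    // Splits text on delimiter, treating the delimiter as literal text
    // rather than as a regular expression.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        final String mainStr = "Token-1^Token-2^Token-3";

        // Hand-escaped caret: "\\^" in Java source is the regex \^
        for (String token : mainStr.split("\\^")) {
            System.out.println("Escaped : " + token);
        }

        // Pattern.quote performs the escaping for any delimiter
        for (String token : splitLiteral(mainStr, "^")) {
            System.out.println("Quoted : " + token);
        }
    }
}
```

Both loops should print the three tokens. The quoted form may be preferable when the delimiter is not a compile-time constant, since no one has to remember which characters are special in a regex.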

A safer way to split the string is to use the StringTokenizer class, which treats its delimiter characters literally rather than as a regular expression. Here's an example:

import java.util.StringTokenizer;

public class StringTest {
   public static void main(String[] args) {
      final String SPLIT_STR = "^";
      final String mainStr = "Token-1^Token-2^Token-3";
      // The delimiter is taken literally, so no regex escaping is needed
      final StringTokenizer stToken = new StringTokenizer(
         mainStr, SPLIT_STR);
      final String[] splitStr = new String[stToken.countTokens()];
      int index = 0;
      while(stToken.hasMoreElements()) {
         splitStr[index++] = stToken.nextToken();
      }
      for(index=0; index < splitStr.length; index++) {
         System.out.println("Tokenizer : " + splitStr[index]);
      }
   }
}

The output of the preceding program is:

  Tokenizer : Token-1
  Tokenizer : Token-2
  Tokenizer : Token-3
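One caveat when choosing between the two approaches: StringTokenizer treats every character of its delimiter string as a separate delimiter and never returns empty tokens, whereas split() keeps the empty strings that fall between adjacent delimiters. The sketch below illustrates the difference; the class name TokenizerVsSplit, the helper tokenize, and the sample input "A^^B" are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names and sample input are illustrative only.
final class TokenizerVsSplit {

    // Collects every token produced by a StringTokenizer into a list.
    static List<String> tokenize(String text, String delimiters) {
        StringTokenizer st = new StringTokenizer(text, delimiters);
        List<String> tokens = new ArrayList<>(st.countTokens());
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        final String input = "A^^B"; // two adjacent delimiters

        // StringTokenizer skips the empty field between the two carets
        System.out.println("Tokenizer count : " + tokenize(input, "^").size());

        // split() with an escaped caret keeps the empty field
        System.out.println("Split count : " + input.split("\\^").length);
    }
}
```

Under these assumptions the tokenizer reports 2 tokens while split() reports 3, so StringTokenizer is a reasonable fit only when empty fields can be ignored.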

devx-admin
