Suppose we have a comma-delimited string such as "token0,token1,token2,,token4" in which one or more tokens are missing, and we want to use java.util.StringTokenizer to place each token in a slot of an array A such that:
A[0]="token0"; A[1]="token1"; A[2]="token2"; A[3]=""; A[4]="token4";
we might write code like:
    String s = "token0,token1,token2,,token4";
    java.util.StringTokenizer stringTokenizer = new java.util.StringTokenizer(s, ",");
    int index = 0;
    int numTokens = stringTokenizer.countTokens();
    System.out.println("num tokens: " + numTokens);
    String[] A = new String[numTokens];
    if (numTokens > 0) {
        while (stringTokenizer.hasMoreTokens()) {
            String token = stringTokenizer.nextToken();
            A[index++] = token;
        }
    }
Notice that array A contains:
"token0","token1","token2","token4";
We get no indication of the missing token. In fact, the variable numTokens has the value 4: StringTokenizer treats a run of consecutive delimiters as a single separator and never returns an empty token, so countTokens() counts only the non-empty tokens in the string.
If we want an empty string to mark each missing token, the way around this behavior lies in one of the constructors of the java.util.StringTokenizer class, namely:
public StringTokenizer(String str, String delim, boolean returnTokens)
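Here, the boolean returnTokens (named returnDelims in current JDK documentation) tells the java.util.StringTokenizer instance to return the delimiters as tokens in addition to the actual tokens. Each delimiter character comes back as its own token, so two adjacent delimiters show up as two consecutive delimiter tokens, and that is what lets us detect a missing token.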
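To see what this constructor actually hands back, here is a minimal sketch that collects every token, delimiters included, into a list. The class name ReturnDelimsDemo and the helper rawTokens are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ReturnDelimsDemo {

    // Hypothetical helper: collects everything the tokenizer returns,
    // delimiters included, in the order it returns them.
    static List<String> rawTokens(String input, String delims) {
        List<String> out = new ArrayList<>();
        // Third argument true: delimiter characters are returned as tokens too.
        StringTokenizer st = new StringTokenizer(input, delims, true);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = rawTokens("token0,token1,token2,,token4", ",");
        // Eight tokens in total: four words and four commas.
        System.out.println("count: " + tokens.size());
        for (String t : tokens) {
            System.out.println("[" + t + "]");
        }
    }
}
```

The two commas between token2 and token4 arrive as two separate "," tokens in a row. A delimiter that immediately follows another delimiter is therefore the signal that a token is missing.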
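The following method uses the above-mentioned StringTokenizer constructor and returns a java.util.Vector whose elements are the tokens of the tokenized string, with an empty string in place of each missing token. It assumes that delimiter is a single character, because each returned token is compared against the whole delimiter string. Note also that a delimiter at the very end of the input does not produce a trailing empty string.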
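    import java.util.StringTokenizer;
    import java.util.Vector;

    Vector<String> split(String input, String delimiter) {
        // Start as true so that a leading delimiter yields an empty first token.
        boolean wasDelimiter = true;
        String token = null;
        Vector<String> v = new Vector<String>();
        // true: the tokenizer returns delimiters as tokens as well.
        StringTokenizer st = new StringTokenizer(input, delimiter, true);
        while (st.hasMoreTokens()) {
            token = st.nextToken();
            if (token.equals(delimiter)) {
                if (wasDelimiter) {
                    // Two delimiters in a row: a token is missing between them.
                    token = "";
                } else {
                    // A delimiter that follows a real token: nothing to add.
                    token = null;
                }
                wasDelimiter = true;
            } else {
                wasDelimiter = false;
            }
            if (token != null) {
                v.addElement(token);
            }
        }
        return v;
    }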
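The split method above adds nothing for a delimiter at the very end of the input, so "a,b," yields only "a" and "b". If a trailing missing token should be reported as well, one possible variation is sketched below. The names FieldSplitSketch and splitFields are hypothetical. Two things here are assumptions rather than part of the original method: an empty input is treated as a single empty field, and a delimiter is recognized as any one-character token found in delims.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class FieldSplitSketch {

    // Hypothetical variant of split: also reports a missing last token.
    static List<String> splitFields(String input, String delims) {
        List<String> fields = new ArrayList<>();
        // Start as true so that a leading delimiter yields an empty first field.
        boolean wasDelimiter = true;
        StringTokenizer st = new StringTokenizer(input, delims, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            // Assumption: a delimiter is a one-character token found in delims.
            boolean isDelimiter = token.length() == 1 && delims.indexOf(token) >= 0;
            if (isDelimiter) {
                if (wasDelimiter) {
                    fields.add(""); // nothing between this delimiter and the previous one
                }
                wasDelimiter = true;
            } else {
                fields.add(token);
                wasDelimiter = false;
            }
        }
        // Input ended on a delimiter (or was empty): the last field is empty.
        if (wasDelimiter) {
            fields.add("");
        }
        return fields;
    }

    public static void main(String[] args) {
        // Five fields; the fourth is the empty string.
        System.out.println(splitFields("token0,token1,token2,,token4", ",").size());
        // Three fields; the last is the empty string.
        System.out.println(splitFields("a,b,", ",").size());
    }
}
```

Whether an empty input should give one empty field or no fields at all depends on the caller. The choice made here is only one reasonable convention.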