Is StringTokenizer more efficient than String.split() for splitting strings in Java?


I have been solving the Anti-Blot System problem from SPOJ.

First I tried splitting the input string using String's split method, and I got TLE (time limit exceeded) after submission.

My code using the split method:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.util.regex.Pattern;
import java.util.regex.Matcher;


class ABSYS {
    public static void main(String[] args) throws IOException {
        int t;
        String[] numArray = new String[2];
        String[] numArray2 = new String[2];
        BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
        t = Integer.parseInt(reader.readLine());
        while(t > 0) {
            String input = reader.readLine();
            if(input.isEmpty()) {
                continue;
            }
            numArray = input.split("\\s{1}=\\s{1}");
            numArray2 = numArray[0].split("\\s{1}\\+\\s{1}");
            Pattern pattern = Pattern.compile("machula");
            Matcher matcher = pattern.matcher(numArray[1]);
            if(matcher.find()) {
                System.out.println(numArray[0] + " = " + (Integer.parseInt(numArray2[0]) + Integer.parseInt(numArray2[1])));
            }
            else {
                matcher = pattern.matcher(numArray2[0]);
                if(matcher.find()) {
                    System.out.println((Integer.parseInt(numArray[1]) - Integer.parseInt(numArray2[1])) + " + " + numArray2[1] + " = " + numArray[1]);
                }
                else {
                    System.out.println(numArray2[0] + " + " + (Integer.parseInt(numArray[1]) - Integer.parseInt(numArray2[0])) + " = " + numArray[1]);
                }
            }
            t--;
        }
    }
}

After many tries I was unable to make my code any more time-efficient.

Then today I read about StringTokenizer, used it in my code, and the solution was accepted on SPOJ right away.

My code using StringTokenizer:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.util.regex.Pattern;
import java.util.StringTokenizer;


class ABSYS {
    public static void main(String[] args) throws IOException {
        int t, a = 0, b = 0, c = 0, matchula = 0;
        BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
        Pattern pattern = Pattern.compile("^(\\d)+$");
        t = Integer.parseInt(reader.readLine());
        while(t > 0) {
            String input = reader.readLine();
            if(input.isEmpty()) {
                continue;
            }
            StringTokenizer tokenizer = new StringTokenizer(input);
            String token = tokenizer.nextToken();
            if(pattern.matcher(token).matches()) {
                a = Integer.parseInt(token);
            }
            else
                matchula = 1;

            tokenizer.nextToken();
            token = tokenizer.nextToken();
            if(pattern.matcher(token).matches()) {
                b = Integer.parseInt(token);
            }
            else
                matchula = 2;

            tokenizer.nextToken();
            token = tokenizer.nextToken();
            if(pattern.matcher(token).matches()) {
                c = Integer.parseInt(token);
            }
            else
                matchula = 3;
            switch(matchula) {
                case 1: System.out.println((c-b) + " + " + b + " = " + c);
                        break;
                case 2: System.out.println(a + " + " + (c-a) + " = " + c);
                        break;
                case 3: System.out.println(a + " + " + b + " = " + (a+b));
                        break;
            }
            t--;
        }
    }
}

The Java docs discourage the use of StringTokenizer:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

As mentioned in Jason S's answer here:

if I wanted to tokenize a string with more complex logic than single characters (e.g. split on \r\n), I can't use StringTokenizer but I can use String.split().

My Doubts

  1. Why is StringTokenizer discouraged, even though I found it more time-efficient?
  2. What is the reason behind discouraging the use of StringTokenizer?
  3. If one only needs a simple regex, as in my problem, is StringTokenizer better than String.split()?

There are 2 answers

Oyebisi (best answer)

String.split() is more flexible and easier to use than StringTokenizer. StringTokenizer predates Java's support for regular expressions, whereas String.split() accepts a regular expression, which makes it a good deal more powerful. Also, the result of String.split() is a string array, which is usually the form we want. StringTokenizer is indeed faster than String.split(), but for most practical purposes String.split() is fast enough.
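As a small illustration of the flexibility point, the sketch below (the class and method names are made up for this example) shows that StringTokenizer treats its delimiter argument as a set of individual characters, while String.split() treats it as a regular expression that can match a multi-character sequence. The same difference applies to a separator such as \r\n.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterDemo {
    // Collect every token StringTokenizer produces for the given delimiter characters.
    static List<String> tokenize(String text, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delimiterChars);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "a, b c, d";

        // ", " means "comma OR space" to StringTokenizer, so "b c" is split too.
        System.out.println(tokenize(text, ", "));              // [a, b, c, d]

        // As a regex, ", " matches only a comma followed by a space.
        System.out.println(Arrays.asList(text.split(", ")));   // [a, b c, d]
    }
}
```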

Check the answers to this question for more details: Scanner vs. StringTokenizer vs. String.Split

MadConan

While it is technically true that StringTokenizer is faster than String.split() overall, when you narrow the scope to single-character delimiters the two are almost the same in terms of performance.
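A rough way to check this on your own machine is sketched below. It is not a rigorous benchmark (there is no warm-up control, and a harness such as JMH would be more trustworthy), the class and method names are invented for the example, and the timings will vary between JVMs and runs.

```java
import java.util.StringTokenizer;

class SplitTiming {
    // Count tokens using String.split() with a single-character delimiter.
    static int countWithSplit(String line) {
        return line.split(" ").length;
    }

    // Count tokens by walking a StringTokenizer, extracting each token.
    static int countWithTokenizer(String line) {
        StringTokenizer tokenizer = new StringTokenizer(line, " ");
        int count = 0;
        while (tokenizer.hasMoreTokens()) {
            tokenizer.nextToken();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "23 + 47 = machula";
        int iterations = 1_000_000;
        long sink = 0; // accumulate results so the loops are not optimized away

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += countWithSplit(line);
        }
        long splitNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += countWithTokenizer(line);
        }
        long tokenizerNanos = System.nanoTime() - start;

        System.out.println("split:     " + splitNanos / 1_000_000 + " ms");
        System.out.println("tokenizer: " + tokenizerNanos / 1_000_000 + " ms");
        System.out.println("(checksum " + sink + ")");
    }
}
```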

Looking at the String.split() source code shows that it checks whether the regex is a simple delimiter and, if so, uses an old-fashioned while loop to search the String. In a simple test I whipped up, I saw almost no difference in timings when parsing strings on a single char, which is the typical use case for StringTokenizer. Therefore, it really isn't worth all of the extra code for such a tiny performance boost.
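To relate this to the question: as far as I can tell from the OpenJDK source, the simple-delimiter path applies when the argument is a single character that is not a regex metacharacter (or a backslash-escaped one). A pattern such as "\\s{1}=\\s{1}" does not qualify, so it is compiled on every call. The sketch below, with hypothetical class and method names, shows two ways around that: splitting on a single literal space, or compiling the regex once and reusing it.

```java
import java.util.regex.Pattern;

class SplitSketch {
    // Compiled once and reused, instead of being recompiled on every split() call.
    private static final Pattern EQUALS_SEPARATOR = Pattern.compile("\\s=\\s");

    // Solve one line of the form "a + b = c" where one operand contains "machula".
    static String solve(String line) {
        // A single literal space is eligible for split()'s non-regex path.
        String[] parts = line.split(" ");   // a, "+", b, "=", c
        String a = parts[0], b = parts[2], c = parts[4];

        if (a.contains("machula")) {
            return (Integer.parseInt(c) - Integer.parseInt(b)) + " + " + b + " = " + c;
        }
        if (b.contains("machula")) {
            return a + " + " + (Integer.parseInt(c) - Integer.parseInt(a)) + " = " + c;
        }
        return a + " + " + b + " = " + (Integer.parseInt(a) + Integer.parseInt(b));
    }

    // Alternative when a real regex is needed: reuse the precompiled Pattern.
    static String[] splitOnEquals(String line) {
        return EQUALS_SEPARATOR.split(line);
    }

    public static void main(String[] args) {
        System.out.println(solve("23 + 47 = machula"));   // 23 + 47 = 70
        System.out.println(solve("1 + machula = 3"));     // 1 + 2 = 3
    }
}
```

Using String.contains() also avoids building a Matcher just to look for a fixed substring.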