String Tokenizer code not Reading File Correctly


We're learning the uses of a HashMap data structure in class and I've been working on an assignment where we read in a file with 3 columns and a set number of rows. The first column is the full name of a user, the second is their username, and the third is their password.

import java.io.*;
import java.util.HashMap;
import java.util.ArrayList;
import java.util.Scanner;
import java.util.StringTokenizer;
public class Question2Client {
    public static void main(String[] args) throws IOException{
        Scanner in = new Scanner(System.in);
        System.out.println("Read the number file to read from.");
        ArrayList <String> list = new ArrayList<>();
        String filename = in.nextLine();
        File processes = new File(filename);
        Scanner inputFile = new Scanner(processes);
        String line, word;
        StringTokenizer token;
        HashMap <String, String> userDatabase = new HashMap<>();
        HashMap <String, String> fullName = new HashMap<>();

The problem comes in reading the file. I added the ArrayList above because I was running into issues with the StringTokenizer before. The logic for the code below is that every string delimited by a tab is added to the list (the code reads left to right, and the file separates its entries with a tab). Note: after debugging, this is where I've identified the problem to be.

    while (inputFile.hasNext()){
        line = inputFile.nextLine();
        token = new StringTokenizer(line, "\t");
        while(token.hasMoreTokens()){
            word = token.nextToken();
            list.add(word);
        }
    }

From there I take the first items in the list and assign them to their appropriate places in the hashmap. The user's full name is the value for the second HashMap, the username is the key for both HashMaps, and the password is the value for the first HashMap. Later on I will be adding code to request an input from the user and if the password matches their username, it displays their information (i.e. access through userDatabase, display info from fullName).

for (int i=0; i<list.size(); i++){
        String name = list.remove(0);
        String uname = list.remove(0);
        String pass = list.remove(0);
        userDatabase.put(uname, pass);
        fullName.put(uname, name);
    }

The problem lies in the while loop: the StringTokenizer is not splitting on the delimiter and I'm not sure why. The code for the HashMaps is fine (I've used it and variants of it in a different application), but the StringTokenizer effectively assigns the whole line to the variable 'word' and then adds it to the list. The output is as follows:

run:
Read the number file to read from.
MapTest.txt
[Ichabod Crane   icrane  qwerty123, Brom Bones  bbones  pass456!, Emboar Pokemon  epokemon    password123, Rayquaza Pokemon    rpokemon    drow456, Cool Dude   cdude   gh456!32, Trend Chaser    tchaser xpxo567!, Chuck Norris    cnorris power332*, Drum Dude   ddude   jflajdljfped]
[Trend Chaser    tchaser xpxo567!, Emboar Pokemon  epokemon    password123]
[Rayquaza Pokemon    rpokemon    drow456, Ichabod Crane   icrane  qwerty123]
BUILD SUCCESSFUL (total time: 11 seconds)

Can someone explain to me where my code is wrong with the StringTokenizer?

EDIT: Here's the text file, formatted only with spaces, tabs, and new lines:

Ichabod Crane   icrane  qwerty123
Brom Bones  bbones  pass456!
Emboar Pokemon  epokemon    password123
Rayquaza Pokemon    rpokemon    drow456
Cool Dude   cdude   gh456!32
Trend Chaser    tchaser xpxo567!
Chuck Norris    cnorris power332*
Drum Dude   ddude   jflajdljfped

For easier reading, think of it as organized in the following columns:

Ichabod Crane       icrane      qwerty123
Brom Bones          bbones      pass456!
Emboar Pokemon      epokemon    password123
Rayquaza Pokemon    rpokemon    drow456
Cool Dude           cdude       gh456!32
Trend Chaser        tchaser     xpxo567!
Chuck Norris        cnorris     power332*
Drum Dude           ddude       jflajdljfped

There is 1 answer

Joop Eggen On BEST ANSWER

Pass every delimiter character to StringTokenizer, or omit the second argument to get the default set: " \t\n\r\f".

new StringTokenizer(line, " \t\n\r\f,;.?!");
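To see why the delimiter set matters, here is a minimal sketch. It assumes the columns in the file are padded with spaces rather than real tabs, which is what the posted output suggests; the class and method names (TokenizerDemo, tokens) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Collects every token StringTokenizer produces for the given delimiter set.
    static List<String> tokens(String line, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delimiters);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: this line uses runs of spaces between columns, no tab characters.
        String spaced = "Ichabod Crane   icrane  qwerty123";

        // Only "\t" is a delimiter and the line has none, so the whole line is one token.
        System.out.println(tokens(spaced, "\t").size());        // prints 1

        // The default set includes the space, so the full name is split as well.
        System.out.println(tokens(spaced, " \t\n\r\f").size()); // prints 4
    }
}
```

Note that with the default delimiters the full name comes back as two tokens, so a fixed "three tokens per row" assumption no longer holds.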

After comments:

Do it without StringTokenizer, using String.split:

    line = inputFile.nextLine();
    String[] lineWords = line.split("(\t|\\s\\s+)", 3);
    Collections.addAll(list, lineWords); // java.util.Collections (plural), not Collection

As you can see, I also doubt that the "tab" characters are real tabs, so the pattern treats a run of two or more whitespace characters as a delimiter too.
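Putting it together, the following is a rough sketch of how the split could feed the two maps directly, one line at a time. The class and helper names (SplitDemo, columns) and the in-memory sample lines are assumptions for illustration; in the real program the lines would come from the Scanner on the file.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitDemo {
    // Splits one record into at most three columns: full name, username, password.
    // A single tab, or a run of two or more whitespace characters, ends a column.
    static String[] columns(String line) {
        return line.trim().split("\\t|\\s\\s+", 3);
    }

    public static void main(String[] args) {
        // Hypothetical sample rows: one padded with spaces, one using real tabs.
        List<String> lines = Arrays.asList(
                "Ichabod Crane   icrane  qwerty123",
                "Brom Bones\tbbones\tpass456!");

        Map<String, String> userDatabase = new HashMap<>(); // username -> password
        Map<String, String> fullName = new HashMap<>();     // username -> full name

        for (String line : lines) {
            String[] cols = columns(line);
            if (cols.length != 3) {
                continue; // skip blank or malformed rows instead of guessing
            }
            userDatabase.put(cols[1], cols[2]);
            fullName.put(cols[1], cols[0]);
        }

        System.out.println(fullName.get("icrane"));     // prints Ichabod Crane
        System.out.println(userDatabase.get("bbones")); // prints pass456!
    }
}
```

Filling the maps per line also avoids the intermediate list and the loop that removes three items while comparing i against list.size(): that loop stops early, because the size shrinks by three on every pass while i keeps growing.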