Let's say there is a pool of data taken from a CSV file, where we have key-value pairs but the keys are not unique. The requirement is to sift through every line and convert the CSV data into something useful. I'll use a game log as an example, with the format:
player, pointChange, timestamp
What I would like to do (which seems like a common operation) is to create a summary: how many points each player has over time. My idea was to create an inner class that represents a single entry:
private class GameFrame {
    private String player;
    private int points;
    private ArrayList<String> timeline = new ArrayList<String>();
    private ArrayList<Integer> pointHistory = new ArrayList<Integer>(); // generics need the wrapper type Integer, not int
    GameFrame(String player, int points, String time) {
        this.player = player;
        this.points = points;
        this.timeline.add(time);
    }
    public String getName() { return this.player; }
    public void increment(int change) {
        this.pointHistory.add(this.points);
        this.points += change; // also works with negatives to decrement points
    }
    public void timeProgress(String time) { this.timeline.add(time); }
}
The actual challenge: the original data is of unknown size and is read line by line. Is there a good practice or recommended method for processing such data? I was thinking about making a list of all GameFrame objects and nesting a second loop, something like this:
pseudocode:
for (each line in the input) {
    load up line data;
    for (each entry in gameFrame list) {
        compare names;
        if names match: update the entry with the data and continue with the next input line;
    }
    fell out of the inner loop, so it's a new player:
    create an entry for the new player and add it to gameFrame list;
}
Is this a good approach, or is there a better way of doing it (perhaps by sorting the data first, or by using a library I don't know about)?
UPDATE: I will try to do this using a HashMap instead of an ArrayList, as suggested by Luke.
Heavy solution: Database
More appropriate if you're going to have lots of records, if you want to do the parsing/inserting once in one session and the processing later or multiple times, or if you're going to be appending data constantly. Databases make it really easy to work with sets of data.
Create a table named frames, with fields player (varchar), point_change (int) and timestamp (datetime), or similar. In the parsing step, simply insert the rows. Then you can
select distinct player from frames;
to get all players. Or
select player, sum(point_change) from frames group by player;
to get the total points for each player. Or include the timestamp in a where clause to get points over a particular window of time.
Light solution: HashMap
More appropriate if you're going to do this one time, or if there are so few records that it can trivially be run many times. It avoids the whole 'setting up a database' step.
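A minimal sketch of the HashMap approach, under some assumptions: the input has no header row, fields never contain quoted commas, and the timestamp is kept as a plain string. The names GameLogSummary, PlayerHistory and summarize are made up for illustration. The map is keyed by player name, so each line costs one lookup instead of a scan through a list of all players.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GameLogSummary {

    /** Hypothetical per-player record: running total plus the score after each event. */
    static class PlayerHistory {
        int points = 0;
        final List<String> timeline = new ArrayList<>();
        final List<Integer> pointHistory = new ArrayList<>();

        void apply(int change, String time) {
            points += change;          // negative changes decrement the score
            timeline.add(time);
            pointHistory.add(points);  // score after this event
        }
    }

    /** Reads "player, pointChange, timestamp" lines one at a time and groups them by player. */
    static Map<String, PlayerHistory> summarize(BufferedReader reader) throws IOException {
        // LinkedHashMap keeps players in order of first appearance; a plain HashMap also works.
        Map<String, PlayerHistory> byPlayer = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Assumption: simple comma-separated fields, no quoting or escaping.
            String[] fields = line.split(",");
            String player = fields[0].trim();
            int change = Integer.parseInt(fields[1].trim());
            String time = fields[2].trim();
            // Create the entry the first time a player is seen, then update it.
            byPlayer.computeIfAbsent(player, k -> new PlayerHistory()).apply(change, time);
        }
        return byPlayer;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data; a real run would wrap a FileReader instead.
        String sample = "alice, 5, 10:00\nbob, 3, 10:01\nalice, -2, 10:02\n";
        Map<String, PlayerHistory> result =
                summarize(new BufferedReader(new StringReader(sample)));
        for (Map.Entry<String, PlayerHistory> e : result.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().points
                    + " " + e.getValue().pointHistory);
        }
    }
}
```

Because the file is consumed line by line, only the per-player summaries are held in memory, not the raw lines, so the unknown input size matters less than the number of distinct players and events you choose to keep.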