GC overhead limit exceeded when reading large file


I want to read my .csv file line by line so that I don't load everything into RAM at once. I thought this was the way to do it. I also wrote the code so that no variables are declared inside the loop, to save the JVM from constantly creating new objects and running the garbage collector.

However, I keep running into the "GC overhead limit exceeded" error. My CPU also runs at nearly 100%.

In a similar question, the problem was caused by a HashMap storing millions of String objects, but mine should "only" store about 20,000 of my Node objects.

Please help me find the problematic part of my code. The error is reported at the line marked in the source code below.

This is my code:

    HashMap<String,TweetNode> allNodes = new HashMap<String,TweetNode>();
    // read file
    try {
        BufferedReader br = new BufferedReader(new FileReader(graphFile));
        noOfNodes = 0;
        String line = br.readLine();
        String firstNode;
        String[] lineContent;
        while (line != null) {
            lineContent = line.split("\t"); // error occurs here!
            // always look at the first node
            firstNode = lineContent[0];
            if (! allNodes.containsKey(firstNode)) {
                allNodes.put(firstNode, new TweetNode(noOfNodes, firstNode));
                noOfNodes++;
            }
            allNodes.get(firstNode).addNeighbour(lineContent[1], Double.valueOf(lineContent[2]));
            line = br.readLine();
        }
        br.close();
    }
    // ... catch stuff ...
    return allNodes;
}

There is 1 answer

Bhargav Kumar R

The only problem I can see here is your map, which is filling up the heap. You are probably running your application with a small heap. Check the current values of the arguments below and set them to a reasonably high value.

The -Xmx flag specifies the maximum size of the memory allocation pool (the heap) for a Java Virtual Machine (JVM), while -Xms specifies its initial size.
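To check which heap limits the JVM actually received, something like the following minimal sketch may help. The class name HeapInfo and its helper methods are made up for illustration and are not part of the question's code; the only JDK calls used are the standard Runtime methods maxMemory(), totalMemory() and freeMemory().

```java
import java.util.Locale;

// Hypothetical helper (not from the original post): prints the heap limits
// that the running JVM was started with.
final class HeapInfo {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Upper bound the heap is allowed to grow to; this is what -Xmx controls.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently reserved by the JVM; it starts out near the -Xms value.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Part of the reserved heap that is occupied right now
    // (this includes garbage that has not been collected yet).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "max heap (-Xmx):  %d MiB%n", toMiB(maxHeapBytes()));
        System.out.printf(Locale.ROOT, "committed heap:   %d MiB%n", toMiB(committedHeapBytes()));
        System.out.printf(Locale.ROOT, "used heap:        %d MiB%n", toMiB(usedHeapBytes()));
    }
}
```

Running it as, for example, `java -Xms512m -Xmx2g HeapInfo` should report a maximum close to 2048 MiB. The sizes 512m and 2g are placeholder values, not a recommendation; an appropriate setting depends on how much memory the map of nodes and their neighbour lists really needs.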