Method doesn't work for large data set

158 views Asked by At

I am trying to find the most central character in a data set that contains every Marvel character and every book they've been in. The code I have written below works for a small test file that we created ourselves to test the method more quickly but when I run the code on the Marvel file, the code breaks from the very beginning. I put print statements through the whole code to find where it stopped working and I thought it would be something to do with iterating through so many characters but it doesn't work right from the start. In the very first while() loop I add startVertex to the group and I wrote a System.out.println(group) statement right after I added startVertex and when I run the test, the print statement gives "[]" (which I'm pretty sure means that the group isn't getting any anything from startVertex) and then gets stuck in an infinite loop (but for a small list of characters/books the code works perfectly fine)... Any suggestions on how to get it to work for the larger file?

EDIT: Here's links to the files. The large file had to be in raw form because github couldn't open it. They are both formatted the exact same and both files parse correctly from a tsv file into a multigraph.

Large file: https://raw.github.com/EECE-210/2013-L1A1/master/mp5/labeled_edges.tsv?token=5408881__eyJzY29wZSI6IlJhd0Jsb2I6RUVDRS0yMTAvMjAxMy1MMUExL21hc3Rlci9tcDUvbGFiZWxlZF9lZGdlcy50c3YiLCJleHBpcmVzIjoxMzg2NzAyNDczfQ%3D%3D--acf1694845215e7a40aca1d6c456769cd825ebcf

Small File: https://github.com/EECE-210/2013-L1A1/blob/master/mp5/testTSVfile.tsv

   /**
     * First find the largest connected set of characters and then 
     * find the most central character of all characters in this set.
     * 
     * @param none
     * @return the name of the character most central to the graph
     */
    public String findMostCentral() {

            Set<String> vertexSet = new LinkedHashSet<String>();
            vertexSet = vertexMap.keySet();
            Iterator<String> iterator = vertexSet.iterator();

            List<String> group = new ArrayList<String>();
            List<String> largestGroup = new ArrayList<String>();

            List<String> Path = new ArrayList<String>();
            Map<String, Integer> longestPathMap = new HashMap<String, Integer>();

            /*
             * This first while loop sets the starting vertex (ie the character that will be checked
             * with every other character to identify if there is/isn't a path between them.
             * We add the character to a group list to later identify the largest group of 
             * connected characters.
             */
            while(iterator.hasNext()){
                    String startVertex = iterator.next();
                    group.add(startVertex);

                    /*
                     * This second while loop sets the destination/end vertex (ie the character that is the 
                     * destination when compared to the starting character) to see if there is a path between
                     * the two characters. If there is, we add the end vertex to the group with the starting 
                     * vertex.
                     */
                    for(String key : vertexSet){
                            String endVertex = key;

                            if( findShortestPath(startVertex, endVertex) != null )
                                    group.add(endVertex);
                    }

                    /*
                     * If the group of connected characters is larger than the largest group, the largest
                     * group is cleared and replaced with the new largest group.
                     * After the group is copied to largest group, clear group.
                     */
                    if(group.size() > largestGroup.size()){
                            largestGroup.clear();
                            for(int i = 0; i < group.size(); i++){
                                    largestGroup.add(group.get(i));
                            }
                    }
                    group.clear();
            }

            /*
             * Iterate through the largest group to find the longest path each character has 
             * to any other character.
             */
            for(String LG : largestGroup){
                    String startingVertex = LG;
                    int longestPath = 0;

                    for(String LG2 : largestGroup){
                            String endingVertex = LG2;

                            Path = findShortestPath(startingVertex, endingVertex);

                            /*
                             * If the path size from startingVertex to endingVertex is longer than any other
                             * path that startingVertex is connected to, set it as the longest path for that
                             * startingVertex.
                             */
                            if(Path.size() > longestPath){
                                    longestPath = Path.size();
                            }
                    }
                    //save the starting vertex and it's longest path to a map
                    longestPathMap.put(startingVertex, longestPath);
            }

            /*
             * Iterates through the longestPathMap and finds the shortest longest path and assigns
             * the character with the shortest longest path to mostCentralCharacter.
             */
            int shortestLongestPath =  Integer.MAX_VALUE;
            String mostCentralCharacter = new String();

            for(Map.Entry<String, Integer> entry : longestPathMap.entrySet()){

                    if((Integer) entry.getValue() < shortestLongestPath){
                            shortestLongestPath = (Integer) entry.getValue();
                            mostCentralCharacter = (String) entry.getKey();
                    }        
            }

            return mostCentralCharacter;
    }
1

There are 1 answers

0
vastopa On

Thanks for the quick replies! I found the issue when printing vertexSet before any for-in loops started. The vertexSet first string was "" (ie nothing) so it would store the first string of "" in startVertex, then get the endVertex and then get stuck in an infinite loop trying to findShortestPath between nothing and a character.... Thanks for your help!