how to keep 10 biggest integer while reading a list in java?

550 views Asked by At

I have a list of integer in a list and want to keep the 10 biggest while I'm iterating over the list and keep them in a TreeSet. I'm using this code :

for(Integer i : list) {                
            if (treeSet.size() < 10) {                    
                treeSet.add(i);
            } else if (i > treeSet.last()) {
                treeSet.pollLast();
                treeSet.add(i);
            }
        }

but results is defferent when I change order of integers in list before running this code. Is my code true?

4

There are 4 answers

0
user3707125 On BEST ANSWER

treeSet is holding its data in ascending order (first is the smallest), so you should be replacing first element, not last:

for(Integer i : list) {                
    if (treeSet.size() < 10) {                    
        treeSet.add(i);
    } else if (i > treeSet.first()) {
        treeSet.pollFirst();
        treeSet.add(i);
    }
}
1
Jordi Castilla On

Use Collection.sort() and get the desired number of results with List.subList().

// Sort the arraylist
Collections.sort(list); 
//gets the last 10 item, last is the largest for ascending default sort
list.subList(list.size()-10, list.size());
0
griFlo On

Try it with streams (if you can use Java 8):

final Comparator<Integer> order = Integer::compare;
List<Integer> collect = list.stream().sorted(order.reversed()).limit(10).sorted().collect(Collectors.toList());

you could also use parallelStream() instead of stream() which should be faster.

http://blog.takipi.com/new-parallelism-apis-in-java-8-behind-the-glitz-and-glamour/

0
Marco13 On

Warning: Note that your approach is flawed in general (and this flaw is also present in the originally accepted answer): This may deliver wrong results when there are duplicate elements in the list. A random example where I noticed this is an input list like

List<Integer> list = Arrays.asList(
     0, 8, 9, 7, 15, 13, 11, 1, 19, 14, 
     17, 17, 13, 2, 15, 4, 4, 15, 1, 0); 

for which it returns

[1, 7, 8, 9, 11, 13, 14, 15, 17, 19]

instead of

[4, 7, 8, 9, 11, 13, 14, 15, 17, 19]

A simple, straightforward solution would be the following:

private static <T> Collection<T> keepLargest(
    List<? extends T> list, Comparator<T> comparator, int n)
{
    TreeSet<T> treeSet = new TreeSet<T>(comparator);
    for(T t : list) 
    {                
        treeSet.add(t);
        while (treeSet.size() > n)
        {
            treeSet.pollFirst();
        }
    }
    return treeSet;
}

which can be called as

Collection<Integer> kept = 
    keepLargest(list, Comparator.naturalOrder(), 10);

One could now argue about the repeated insertions and removals (that could be avoided), but since the tree set is small, the log(n) for the insertion should be negligible. If this performance is highly relevant, one could consider using a similar approach with a PriorityQueue as well, or try to optimize this solution, but in its current form, it's very simple and not error-prone due to (premature?) optimizations.