Sorting algorithm that is stable in sorting ten million objects

751 views Asked by At

I am trying to sort 10 Million Account objects in an array or array list. The Account class implements the comparable interface. with some variables such as age, acct number, etc. I need to sort this array or array list by age, and I need to keep the relative ordering of the accounts with the same age.

I am thinking that I would use a Mergesort in this application, because 1) Mergesort is a stable comparable sort that will keep the relative ordering, and it has the best worst case time of n log n. However a Binary Tree Sort would have similar effects with the same time complexity with this amount of objects. What do you think?

5

There are 5 answers

1
tj-recess On

If you really wanna sort by 'age', how about using Counting Sort (http://en.wikipedia.org/wiki/Counting_sort)? You can maintain same relative order as original in at most 2 iterations or 2n lookups.

2
Pavan Kumar K On

I prefer using merge sort as it does not add space complexcity.

Quickost would also be considered providing space & memory allocation is not a constraint

0
Cyclotron3x3 On

It depends on the parameters like how optimal of solution does the problem require, what are your main operations after sorting, is it 32 bit or 64-bit numbers . i.e What are your project requirements.

Look at the difference between internal sorting and external sorting. Your approach requires external sorting mechanism.

For example, if they want to count the ages of the employees, you probably use the Counting sort, It can sort the data in the memory.

But for fairly random data, you need external sorting.

0
Peter Pan On

I think you can do it by serial step:

step 1: split 10 Million objects into 2^N slices, and sort for each slice;

step 2: use selectsort for the head objects from 2 slices and merge into new slice;

step 3: again and again do step 2, util just only 1 slice.

0
JB Nizet On

From the javadoc of Collections.sort():

This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort.

So don't reinvent the wheel, and just use the standard sort algorithm that the JDK provides: Collections.sort() or, better if using Java 8: List.sort(). Without any warmup that would allow the JIT to optimize the code, sorting 10M accounts with an age between 0 and 30 takes 1.4 seconds on my machine.