I want to shuffle a list like this one:-
to_shuffle = [ a, b, b, b, b, a, c, b, a, b ]
to minimise the number of repeating elements. Initially, I
thought about popping elements off the top of to_shuffle
and either
pushing them onto a another list shuffled
if the element is
different from the previously pushed element, or else push it on to
the bottom of to_shuffle
and try another element. This would result
in:-
shuffled = [ a, b, a, c, b, a, b, b, b, b ]
which, in this example, isn't any better - there's still 4 b in a row (although this method sometimes reduces repeating elements).
What I then thought was to start by making a bucket for each class of element:-
buckets = [ (a, [a, a, a]), (b, [b, b, b, b, b, b]), (c, [c]) ]
sort the buckets by size, descending
buckets = [ (b, [b, b, b, b, b, b]), (a, [a, a, a]), (c, [c]) ]
keeping track of the last element shuffled
last = None
cycle through the buckets, starting with the biggest, and pop off an
element if not equal to last
, resort the buckets and do it again:-
sorted = [ b ]
buckets = [ (b, [b, b, b, b, b]), (a, [a, a, a]), (c, [c]) ]
last = b
sorted = [ b, a ]
buckets = [ (b, [b, b, b, b, b]), (a, [a, a]), (c, [c]) ]
last = a
sorted = [ b, a, b ]
buckets = [ (b, [b, b, b, b]), (a, [a, a]), (c, [c]) ]
last = b
sorted = [ b, a, b, a ]
buckets = [ (b, [b, b, b, b]), (a, [a]), (c, [c]) ]
.
.
.
sorted = [ b, a, b, a, b, a, b, c, b, b ]
which is a much better result.
Is there a name for this algorithm and if so is there a python (2.7) implementation of it?
Here's some rather shoddy code:-
test = [ 'a', 'b', 'b', 'b', 'b', 'a', 'c', 'b', 'a', 'b' ]
expected = [ 'b', 'a', 'b', 'a', 'b', 'a', 'b', 'c', 'b', 'b' ]
def sort_buckets(buckets):
return sorted(buckets, key=lambda x: len(x[1]), reverse=True)
def make_buckets(to_shuffle):
h = {}
buckets = []
for e in to_shuffle:
if e not in h:
h[e] = []
h[e].append(e)
for k, elems in h.iteritems():
buckets.append((k, elems))
return buckets
def shuffle(to_shuffle):
buckets = make_buckets(to_shuffle)
shuffled = []
last = ''
while len(buckets) > 1:
buckets = sort_buckets(buckets)
for i in range(len(buckets)):
candidate = buckets[i][0]
if candidate == last:
continue
t = buckets.pop(i)
last = candidate
shuffled.append(t[1][-1])
if len(t[1]) > 1:
buckets.append((t[0], t[1][:-1]))
break
t = buckets.pop()
shuffled += t[1]
return shuffled
print expected
print shuffle(test)
Sort them in order of frequency, then put them into a new array at alternating positions, first working from left to right then from right to left.
So in the example you gave, sorted is
{b,b,b,b,b,b,a,a,a,c}
, then the "every other place" operation takes it to{b,_,b,_,b,_,b,_,b,_}
, then we continue to fill in the blanks from the right (or from the left if that results in fewer identical adjacents):{b,c,b,a,b,a,b,a,b,b}
. There's only one identical adjacent pair in that answer. That ensures that the most frequent item shows up at the beginning (and possibly end) of the list where it can't be beside itself on one side, at least. Then you'll only have identical items beside each other if more than half are of one type.Here's a case where you would want to fill in the second pass starting at the left:
{a,a,b,b,c,c}
=>{a,_,a,_,b,_}
, now if we fill in from the right we'll get a double b so we start again from the left:{a,b,a,c,b,c}
.