Whilst reading Programming in Lua book, I came across an exercise which I'm not sure if I solved right, hence I wanted to ask you all.
Problem statement:
Exercise 6.4: Write a function to shuffle a given list. Make sure that all permutations are equally probable.
Background: I've very basic mathematical and programmings knowledge
I researched a bit online, read a bit about combinations vs permutations, how does that come into play in this problem, and then I finally came across Fisher-Yates Shuffle which I think led me to a "working" solution.
My ask:
- Is my solution right? If not, what am I doing wrong? If yes, perhaps some details entailing why because I'm confused as to how would I be otherwise shuffling the list with unequal probabilities?
- One thing I noticed is if I add
math.randomseed(1)
(or, any other fixed number) in thegetShuffledList
function, I always get the same results, no matter how many times I try to shuffle the list. Why does that happen?
My solution:
local printList = function (a)
for _, value in ipairs(a) do
io.write(value, " ")
end
print()
end
local interchangeElements = function(list, i , j)
local temp
temp = list[i]
list[i] = list[j]
list[j] = temp
return list
end
local getShuffledList = function(list)
-- math.randomseed(1) --> gives constant shuffled list on each call
local j
for i = #list, 1, -1 do
j = math.random(1, #list)
interchangeElements(list, i, j)
end
return list
end
printList(getShuffledList({1, 2, 3, 4}))
printList(getShuffledList({1, 2, 3, 4}))
printList(getShuffledList({1, 2, 3, 4}))
printList(getShuffledList({1, 2, 3, 4}))
Output:
3 2 4 1
3 4 2 1
4 3 1 2
1 2 3 4
[Process exited 0]
This is not a correct Fisher-Yates shuffle, and the results are not equally probable. This is not obvious, but easy to show with some well-known math.
Let's call the length of the list
n
. In each iteration of your loop, you choose a number between1
andn
. There aren
iterations, so there aren^n
equally likely ways that all the numbers can be chosen. The number of permutations of the list isn
factorial. (Except in trivial small cases), there is no way to dividen
factorial permutations evenly into then^n
outcomes, thus it is impossible for all the permutations to be equally likely. For example withn=4
, there are 24 permutations, but 256 ways the random numbers can be chosen. 24 doesn't divide into 256 evenly, so there is no way to assign an permutation to each of the 256 random outcomes so that they all appear the same number of times.To get the proper Fisher-Yates algorithm, change
j = math.random(1, #list)
toj = math.random(1, i)
. This means that the previous counterargument no longer applies: the first random number is from1
ton
, then1
ton - 1
, etc. So, the number of ways the random numbers can be chosen is alson
factorial. This doesn't automatically prove that the algorithm works properly; you need to think about it and realize that each permutation can be reached by exactly one way of choosing the random numbers (or you only have to prove that each permutation is possible since the numbers are the same).math.random
is a pseudo-random number generator. This means that it has some hidden internal state, and each time you call it, it does some mysterious but deterministic operation to modify its internal state and return a mysterious number. But, it is not actually random (because common computer operations are inherently always deterministic) (hence "pseudo-random").math.randomseed
sets the hidden internal state. Because everything is deterministic, if you start with the same state, you will get the same result. This feature can be useful, e.g. to reproduce the same result later, but it is not useful if you just want "random" numbers.In Lua 5.4, Lua tries to automatically choose a different starting seed value each time you run the program, so there is not any great need to manually provide a seed when the program starts (but also a no-argument version of
math.randomseed()
is added to allow this behavior to be invoked explicitly). In earlier versions of Lua it is common to do something likemath.randomseed(os.time())
so that you get a different seed each time (well, it will be different if at least a second of real time has passed).