I'm trying to apply GA to solve a problem and having couple questions.
First question is about selection - I've seen in many implementations that population is sorted according to score/fitness prior selection. Is it really necessary? For example in Tournament selection individuals should be picked uniformly so sorting seems to have no meaning. Or lets consider Roulette Wheel selection. With the following implementation sorting again seems to be unnecessary (pseudo code):
totalFitness = sum individuals i.fitness
rouletteValue = totalFitness * random 0.0 1.0
selected = null
i = 0;
while rouletteValue >= 0.0
selected = individuals[i++]
rouletteValue -= selected.fitness
return selected
So does it really necessary to sort population for selection?
Another question is quick convergence: I'm trying out Simple GA with crossover probability 0.9, mutation probability 0.01, population size 30 and initial population contains 'very-good-solution' (known). Crossover produces two offsprings always, so one iteration is as follows (pseudo code):
for i = 0 to population-size
mother = select-one population
while father != mother
father = select-one population
if should-crossover crossover-probability
(sister, brother) = apply-crossover mother father
else
sister = copy mother
brother = copy father
if should-mutate mutation-probability
apply-mutation sister
if should-mutate mutation-probability
apply-mutation brother
insert-at new-population i sister
insert-at new-population i+1 brother
i = i + 2
swap new-population population
Also because of nature of the problem crossover almost never provides better results then parents had.
So, roughly speaking, what happens is because 'good solution' is much better than any other solution in initial population it is selected for reproduction more often then other individual (lets say with probability 0.1 for 30 individuals). If crossover is not applied (probability 0.1) then 'good solution' is carried over into new population. So as population size is 30 on average crossover is skipped for 1.5 pair. So after 15 iterations 22 crossovers are skipped. Now as 'good solutions' selection probability is 0.1 then after 22 skipped crossovers I'll have 2 'good solutions' inserted into new population. After this happens 'good solution' proliferates to fast.
Is there any way to overcome this problem?
On the convergence problem - judging from the parameters you described, you are converging to your seeded known-good solution because your population is nowhere near large enough.
Assuming that the remainder of your population is randomly generated, the odds that these 29 individuals will have any components that immediately generate a more-fit offspring is extremely low. So combined with the 10% no-crossover chance, the known-good individual will consistently dominate.
To generate a more-fit individual via GA than your known-good, you will need to significantly increase your population, and probably increase the crossover probability as well. You probably will also need to weaken the selection algorithm to prevent any single strong individual from dominating the reproduction cycle in any given generation.