I'm rather new to GAs, but thought I'd give one a go. I've programmed a GA (using the DEAP library) to replace back-propagation in my multilayer perceptron. The goal was to find the best weights to solve the XOR operator. The code seems to produce reliable results, close to convergence.
I've experimented with different crossover/mutation probabilities. My only conclusion so far is that the higher the probabilities, the slower the convergence. Is this accurate?
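For reference, here is a minimal sketch of the kind of setup I mean. The 2-2-1 network layout, the operators, and all parameter values below are illustrative choices, not necessarily what the actual code uses:

```python
import math
import random
from deap import algorithms, base, creator, tools

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def mlp(w, x):
    # 2-2-1 perceptron: 9 weights in total (6 hidden incl. biases, 3 output).
    h1 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h2 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return math.tanh(w[6] * h1 + w[7] * h2 + w[8])

def evaluate(ind):
    # Mean squared error over the four XOR cases (to be minimized).
    return (sum((mlp(ind, x) - y) ** 2 for x, y in XOR) / len(XOR),)

creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", list, fitness=creator.FitnessMin)

toolbox = base.Toolbox()
toolbox.register("attr", random.uniform, -5, 5)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr, 9)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", evaluate)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=0.5, indpb=0.2)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=50)
hof = tools.HallOfFame(1)
algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=200,
                    halloffame=hof, verbose=False)
print("best MSE:", evaluate(hof[0])[0])
```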
This question is a bit more on the theoretical side, and you may want to seek expanded guidance on the Artificial Intelligence or Cross Validated Stack Exchange sites.
But the brief answer is this: you are comparing setups that generate many divergent individuals in each generation (a., and the extreme case c.) with a very stable setup in which roughly one individual changes per generation (1% + 1% out of 50 members, case b.). This means that in cases a. and particularly c., your individuals jump around the solution space and probe it for the optimal solution, while approach b. amounts to a gradual, systematic exploration in small steps.
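To make the turnover argument concrete, here is a rough back-of-the-envelope calculation in the style of DEAP's `varAnd` variation step, where each offspring is crossed with probability `cxpb` and mutated with probability `mutpb`. Only the 1%/1% values for b. come from the question; the values for a. and c. are placeholder assumptions:

```python
# Expected fraction of the population altered per generation, treating
# crossover (cxpb) and mutation (mutpb) as independent events, as in
# DEAP's algorithms.varAnd: P(changed) = 1 - (1 - cxpb) * (1 - mutpb).
def expected_turnover(cxpb, mutpb):
    return cxpb + mutpb - cxpb * mutpb

# b. uses the 1% + 1% from the question; a. and c. are assumed examples.
for label, cxpb, mutpb in [("a. high", 0.8, 0.4),
                           ("b. low", 0.01, 0.01),
                           ("c. extreme", 1.0, 1.0)]:
    print(f"{label}: ~{50 * expected_turnover(cxpb, mutpb):.1f} of 50 changed")
```

For b. this works out to roughly one altered individual per generation, which is why it behaves like a sequence of small, careful steps.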
Depending on the problem at hand, and yours seems quite simple (so the solution space is small), strategy b. is better because the population quickly homes in on the solution. For a very complex solution space, a. is probably a sensible parameter setup. Approach c. seems overly explorative in either case: there has to be a balance between keeping promising individuals intact across many generations and exploring with new ones. With c. it is almost impossible to retain those promising candidates over time, because almost the entire population is likely to change in almost every generation.
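If you want to keep the explorative probabilities of a. or c. but still protect promising candidates, one common remedy is simple elitism: copy the best few individuals unchanged into each new generation. Here is a minimal sketch, assuming a `toolbox` configured as usual for DEAP (with `evaluate`, `mate`, `mutate`, and `select` registered); it is an illustration, not a drop-in replacement for your code:

```python
from deap import algorithms, tools

def ea_with_elitism(pop, toolbox, cxpb, mutpb, ngen, k_elite=2):
    # Evaluate the initial population.
    for ind, fit in zip(pop, toolbox.map(toolbox.evaluate, pop)):
        ind.fitness.values = fit
    for gen in range(ngen):
        # Clone the k best individuals so variation cannot touch them.
        elites = [toolbox.clone(ind) for ind in tools.selBest(pop, k_elite)]
        # Select and vary the rest of the population as usual.
        offspring = toolbox.select(pop, len(pop) - k_elite)
        offspring = algorithms.varAnd(offspring, toolbox, cxpb, mutpb)
        # Re-evaluate only the individuals whose fitness was invalidated.
        invalid = [ind for ind in offspring if not ind.fitness.valid]
        for ind, fit in zip(invalid, toolbox.map(toolbox.evaluate, invalid)):
            ind.fitness.values = fit
        pop[:] = elites + offspring
    return pop
```

With even a small `k_elite`, high crossover/mutation rates can explore aggressively without ever losing the best solution found so far.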