I am using SciPy's differential evolution (DE) optimizer and am seeing some behaviour that I cannot explain, so I am looking for someone with a bit of DE expertise.
A little background: I am looking at the objective function value of every member of the population as a function of iteration, using the "best1exp" mutation strategy.
The best1exp strategy determines new population candidates by taking
b' = b_0 + F (m^i_{r1} - m^i_{r2})
where b_0 is the best known solution, F is the mutation constant, and m^i_{r1}, m^i_{r2} are two distinct random members of the i^th population iteration.
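The best1 mutation step (the best member plus a scaled difference of two random members) can be sketched in a few lines of NumPy. This is only an illustration, not SciPy's internal code: the function name `best1_mutant` is hypothetical, and the choice to draw the two random members from the whole population is a simplifying assumption.

```python
import numpy as np

def best1_mutant(population, energies, F, rng):
    """Return one mutant b_0 + F * (m_r1 - m_r2) (illustrative sketch)."""
    # b_0: the member with the lowest objective value so far
    best = population[int(np.argmin(energies))]
    # two distinct random members (assumption: drawn from the whole population)
    r1, r2 = rng.choice(len(population), size=2, replace=False)
    return best + F * (population[r1] - population[r2])

rng = np.random.default_rng(0)
population = rng.uniform(-5.0, 5.0, size=(10, 2))   # 10 members, 2 parameters
energies = np.sum(population**2, axis=1)            # toy objective: sum of squares
mutant = best1_mutant(population, energies, F=0.8, rng=rng)
print(mutant)
```

Every mutant is an offset from the same point b_0, which is why one would expect the trial vectors to scatter around a single solution.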
The "exp" part of best1exp is the crossover strategy, which is independent of the effect I am concerned with (it is present even with CR=1, where bin and exp behave identically).
It looks like the population is supporting two solutions, which should not be possible since all the members are randomized around the "best" solution at each iteration. (I know there is crossover, but I am ignoring that detail since the effect does not depend on CR. The same type of structure appears for different CR and F, with and without dithering, and for both exp and bin.)
So the question is: how can DE support a structure like this? To be clear, I am not asking why "my" code is generating this, but rather how it is even possible.
This could easily be caused by your objective function (assuming you are plotting every evaluation).
Some perturbations might trigger a fitness change of exactly 500, and your optimum might be just one step away from triggering it. Or maybe your objective function is stochastic, and evaluating it will randomly award 500 points or not.
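A minimal sketch of the first scenario, assuming a made-up objective: a smooth bowl plus a fixed 500-point penalty that switches on when the first parameter goes negative, so the optimum sits right next to the step. The names `step_objective` and `history` are hypothetical. The wrapper records every evaluation, including trial vectors that are later rejected.

```python
import numpy as np
from scipy.optimize import differential_evolution

history = []  # every objective value, in evaluation order

def step_objective(x):
    # Hypothetical objective: sum of squares plus a 500-point penalty
    # whenever x[0] < 0, so the optimum at the origin borders the step.
    value = float(np.sum(x**2)) + (500.0 if x[0] < 0 else 0.0)
    history.append(value)
    return value

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
result = differential_evolution(
    step_objective, bounds, strategy="best1exp",
    maxiter=30, polish=False, seed=0,
)

values = np.array(history)
low = values[values < 250.0]    # evaluations without the penalty
high = values[values >= 250.0]  # evaluations that triggered the penalty
print(f"{low.size} evaluations in the low band, {high.size} in the high band")
```

Plotting `history` against evaluation index should show two bands separated by 500 rather than two coexisting solutions: trial vectors generated around the best member keep stepping across the discontinuity, get evaluated, and are then rejected.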