I am trying to understand how evolution and fitness function work. I implemented my own Knapsack problem, which tries to find 3 most valuable items from a given set. It is my own cost function:
static int counter=1;
static double MyCostFunction(boolean[] chromosome, double[] data){
System.out.println("COST FUNCTION: "+counter++);
double sum = 0;
int count = 0;
for (int i =0; i<chromosome.length; i++){
if (chromosome[i]){
sum += data[i];
count ++;
}
}
return count<=3 ? sum : 0;
}
I’m printing out just to see how many times MyCostFunction is performed. Right now I am working with a small set of 8 items. Here it is:
public static void main(String[] args) {
double[] data = {-15,-7, -1, -1,7,100,200,300};
Integer[] items = new Integer[data.length];
int i = 0;
for (double d : data){
items[i] = i;
i++;
}
ISeq<Integer> zbiorDlaGA = ISeq.of(items);
final ISProblem knapsack= new ISProblem(zbiorDlaGA,
chromosome -> ISProblem.MyCostFunction( chromosome, data)
);
This is the Engine that I’m using:
final Engine<BitGene,Double> engine= Engine.builder(knapsack)
.executor( Runnable::run)
.populationSize(5)
.survivorsSelector(new TournamentSelector<>(3))
.offspringSelector(new RouletteWheelSelector<>())
.alterers(
new Mutator<>(1.0),
new SinglePointCrossover<>(0.0)
).build();
And that’s how I’m obtaining statistics:
final EvolutionStatistics<Double,?> statistics=EvolutionStatistics.ofNumber();
final Phenotype<BitGene,Double> best=engine.stream()
// .limit(bySteadyFitness(15))
.limit(5)
.peek(r-> System.out.println("########CURRENT GEN: "
+r.generation()+
": "+ r.totalGenerations()+
": "+r.bestPhenotype()+
" ALTERED: "+r.alterCount()+
" INVALID: "+r.invalidCount()+
" GENOTYPE: "+r.genotypes()))
.peek(statistics)
.collect(toBestPhenotype());
System.out.println(statistics);
System.out.println(best);
I've got a few questions concerning some behaviours of your library that I don't really understand: Even when I set mutation probability to 1.0 (it should change every bit, I think so) it gives me this result:
COST FUNCTION: 1
COST FUNCTION: 2
COST FUNCTION: 3
COST FUNCTION: 4
COST FUNCTION: 5
COST FUNCTION: 6
COST FUNCTION: 7
COST FUNCTION: 8
########CURRENT GEN: 1: 1: [01100000] -> 300.0 ALTERED: 24 INVALID: 0 GENOTYPE: [[10110100],[00011011],[01011101],[00111001],[01100000]]
COST FUNCTION: 9
COST FUNCTION: 10
COST FUNCTION: 11
########CURRENT GEN: 2: 2: [00011011] -> 0.0 ALTERED: 24 INVALID: 0 GENOTYPE: [[00011011],[01011101],[10100101],[01111010],[00001011]]
COST FUNCTION: 12
COST FUNCTION: 13
COST FUNCTION: 14
########CURRENT GEN: 3: 3: [10100101] -> 0.0 ALTERED: 24 INVALID: 0 GENOTYPE: [[10100101],[01011101],[01111011],[01010011],[10010101]]
COST FUNCTION: 15
COST FUNCTION: 16
COST FUNCTION: 17
########CURRENT GEN: 4: 4: [10000010] -> 293.0 ALTERED: 24 INVALID: 0 GENOTYPE: [[01011101],[01010011],[01111011],[10000010],[00001001]]
COST FUNCTION: 18
COST FUNCTION: 19
COST FUNCTION: 20
########CURRENT GEN: 5: 5: [10000010] -> 293.0 ALTERED: 24 INVALID: 0 GENOTYPE: [[10000010],[01011101],[01000100],[10111100],[10011001]]
+---------------------------------------------------------------------------+
| Time statistics |
+---------------------------------------------------------------------------+
| Selection: sum=0,003352000000 s; mean=0,000670400000 s |
| Altering: sum=0,010647600000 s; mean=0,002129520000 s |
| Fitness calculation: sum=0,011403300000 s; mean=0,002280660000 s |
| Overall execution: sum=0,033579600000 s; mean=0,006715920000 s |
+---------------------------------------------------------------------------+
| Evolution statistics |
+---------------------------------------------------------------------------+
| Generations: 5 |
| Altered: sum=120; mean=24,000000000 |
| Killed: sum=0; mean=0,000000000 |
| Invalids: sum=0; mean=0,000000000 |
+---------------------------------------------------------------------------+
| Population statistics |
+---------------------------------------------------------------------------+
| Age: max=4; mean=0,560000; var=1,090000 |
| Fitness: |
| min = -23,000000000000 |
| max = 300,000000000000 |
| mean = 41,840000000000 |
| var = 10763,306666666665 |
| std = 103,746357365773 |
+---------------------------------------------------------------------------+
[01100000] -> 300.0
Why does it calculate fitness for three individuals only? Is it somehow caching the values already calculated for the same chromosome?
And why in the first generation it is calculated 8 times?
The number of altered individuals is always 3*8, what implies that only 3 chromosomes were modified during the evolution. Why? I thought it has something to do with TournamentSelector sampleSize, but it doesn’t change number of changed chromosomes.
With the probability of mutation equal to 100% and crossover prob.=100% it should change every bit in chromosome, so there will only be 2 versions of chromosome in each generation, but it doesn’t work like that. Why? Does it randomly select value of bit or sets the opposite?
Does the number of bits set to true have to be a constant one(or approximately constant) in population/generation?
I am using these strange values for crossover and mutation probabilities, because I was wondering earlier why it didn’t give the number of fitness calculations performed equal to populationSize*NumberOfGenerations. So I started to experiment.
The described behavior is as expected :)
The following facts leads to the observed outcome:
The outcome:
I think this explains the output, doesn't it?
Control the number of initially set bits
If you want to control the number of initial bits of the
BitChromosome
, you can use the originalCodec
to create your own variation.