I'm trying to make a simple Q-learning AI in GMS2, but I'm terrible at working with grids and I always get the same problem when I try to update the QTable:
index out of bounds
The project is simple: the AI can move in any direction and needs to touch the "obj_goal" while avoiding the "obj_obstacles".
Create event:
QTableWidth = 100;  // adjust the size as needed
QTableHeight = 100; // adjust the size as needed
QTable = ds_grid_create(QTableWidth, QTableHeight);
learningRate = 0.1;
discountFactor = 0.9;
explorationRate = 0.7; // chance of taking a random action instead of the best known one
reward = 0;
currentX = x;
currentY = y;
iniX = x; // starting position, used to reset after hitting the goal or an obstacle
iniY = y;
spd = 2;
Step event:
function getCurrentState() {
    return string(currentX) + "_" + string(currentY);
}
id_action = choose(0, 1, 2, 3);
currentX = x;
currentY = y;
currentState = getCurrentState();
#region explore randomly or act based on learned values
if (random(1) < explorationRate) {
    // Exploration (random action)
    action = choose("up", "down", "left", "right");
} else {
    var bestActionIndex = 0;
    var bestActionValue = QTable[# currentState, 0];
    for (var i = 1; i < QTableHeight; i++) {
        var value = QTable[# currentState, i];
        if (value > bestActionValue) {
            bestActionValue = value;
            bestActionIndex = i;
        }
    }
    action = bestActionIndex;
}
#endregion
#region actions
switch (action) {
    case "up":
        y -= spd;
        break;
    case "down":
        y += spd;
        break;
    case "left":
        x -= spd;
        break;
    case "right":
        x += spd;
        break;
}
#endregion

if (place_meeting(x, y, obj_objetivo)) {
    reward += 10;
    x = iniX;
    y = iniY;
} else if (place_meeting(x, y, obj_Obstaculo)) {
    reward -= 10;
    x = iniX;
    y = iniY;
}

newState = getCurrentState();
var currentStateIndex = floor(currentState);
var idActionIndex = floor(id_action);
if (currentState >= 0 && currentState < QTableWidth && id_action >= 0 && id_action < QTableHeight) {
    QTable[# currentState, id_action] += learningRate * (reward + discountFactor * QTable[# newState, id_action] - QTable[# currentState, id_action]);
} else {
    show_debug_message("index out of bounds.");
}
currentX = x;
currentY = y;
It was supposed to update the QTable so the AI learns from the rewards it is given.
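For reference, the update I'm trying to implement is the standard Q-learning rule; with proper numeric indices it would look something like this (stateIndex, actionIndex and newStateIndex are placeholders, not variables that exist in my code yet):

// Q(s, a) <- Q(s, a) + learningRate * (reward + discountFactor * max over a' of Q(s', a') - Q(s, a))
var bestNext = QTable[# newStateIndex, 0];
for (var i = 1; i < 4; i++) { // 4 actions: up, down, left, right
    bestNext = max(bestNext, QTable[# newStateIndex, i]);
}
QTable[# stateIndex, actionIndex] += learningRate * (reward + discountFactor * bestNext - QTable[# stateIndex, actionIndex]);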
Since you don't seem to be using any grid-specific functions anyway, you could swap it for a 2D array - that will throw a proper error on out-of-bounds access. Initialization would have to be done like so:
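A rough sketch of that initialization (assuming the GMS 2.3+ array-of-arrays syntax and the QTableWidth/QTableHeight values from the Create event):

// QTable[state][action], every entry starts at 0
QTable = array_create(QTableWidth);
for (var i = 0; i < QTableWidth; i++) {
    QTable[i] = array_create(QTableHeight, 0);
}

Reads and writes then go through QTable[stateIndex][actionIndex] (both numeric) instead of the [# x, y] grid accessor.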
You can run the game in debug mode to inspect the local/instance variables at the time of error.
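It can also help to log the values right before the grid write, so you can see exactly which index goes out of range, e.g. (using the variables from the code above):

show_debug_message("state: " + string(currentState)
    + " | action: " + string(id_action)
    + " | grid size: " + string(QTableWidth) + "x" + string(QTableHeight));

That should show whether currentState and id_action are really numeric indices inside the 100x100 grid.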