Could someone be so kind to help me understand how to use the alpha-beta pruning algorithm? I'm making a game similar to connect four. The only differences is there is no diagonal win, and a player can mark an square at any given time (unless it is already occupied, of course). I think I understand how to code the algorithm, I just think I am using it wrong. What I have been doing is having a for loop that looks something like this
for(i=0; i<size; i++)
for(j=0; j<size; j++)
val = alphabeta();
if(val > max)
max = val;
move = set(i,j);
setBoard(move); //sets the to the returned value from alphabeta()
the problem I am having is that the first run of alphabeta returns the max val, so therefore none of the next values are greater, and the board will just be set at board[0][0]. Does anyone know what I am doing wrong?
public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{
Placement p = null;
if(depth==0 || board.isWinner())
{
return evaluate(place, maxPlayer);
}
if(maxPlayer)
{
int i=0, j=0;
for(i=0; i<board.size; i++)
{
for(j=0; j<board.size; j++)
{
if(board.validMove(i,j)&&(board.canGetFour(i,j, opponent)&&board.canGetFour(i,j,player)))
{
board.board[i][j] = opponent;
p = new Placement(i, j);
alpha = Math.max(alpha, alphabeta(p, depth-1, alpha, beta, false));
board.board[i][j] = 0;
}
if(beta<=alpha)
break;
}
if(beta<=alpha)
break;
}
return alpha;
}
else
{
int i=0, j=0;
for(i=0; i<board.size; i++)
{
for(j=0; j<board.size; j++)
{
if(board.validMove(i,j)&&(board.canGetFour(i,j,opponent)&&board.canGetFour(i,j,player)))
{
board.board[i][j] = player;
p = new Placement(i, j);
beta = Math.min(beta, alphabeta(p, depth-1, alpha, beta, true));
System.out.println(board);
board.board[i][j] = 0;
}
if(beta<=alpha)
break;
}
if(beta<=alpha)
break;
}
return beta;
}
}
This is the function that makes the move
public void makeMove()
{
int max = -1;
Placement p = null;
int val = -1;
for(int i=0; i<size; i++)
for(int j=0; j<size; j++)
{
if(board.validMove(i, j))
{
if(board.canGetFour(i, j, opponent)||(board.canGetFour(i,j,player)&&board.canGetFour(i,j,opponent)))
{
board.board[i][j] = player;
val = alphabeta(new Placement(i,j), 5, -5000, 5000, true);
board.board[i][j] = 0;
if(val > max)
{
max = val;
p = new Placement(i, j);
}
}
}
}
board.board[p.row][p.col] = player;
board.moves++;
}
So here's my updated code, still not working
public Placement alphabeta(Placement p)
{
int v = max(p,6,-500000, 500000);
return successors(v);
}
public int max(Placement p, int depth, int alpha, int beta)
{
if(depth == 0 || board.isWinner())
{
return evaluateMax(p,player);
}
int v = -500000;
for(int i=0; i<successors.size(); i++)
{
Placement place = new Placement(successors.get(i));
board.board[place.row][place.col] = player;
v = Math.max(v, min(place, depth-1, alpha,beta));
board.board[place.row][place.col] = 0;
if(v>= beta)
return v;
alpha = Math.max(alpha, v);
}
return v;
}
public int min(Placement p, int depth, int alpha, int beta)
{
if(depth == 0||board.isWinner())
{
return evaluateMax(p,opponent);
}
int v = 500000;
for(int i=0; i<successors.size(); i++)
{
Placement place = new Placement(successors.get(i));
board.board[place.row][place.col] = opponent;
v = Math.min(v, max(place,depth-1, alpha,beta));
board.board[place.row][place.col] = 0;
if(v<= alpha)
return v;
beta = Math.min(alpha, v);
}
return v;
}
public void makeMove()
{
Placement p = null;
for(int i=0; i<successors.size(); i++)
{
Placement temp = successors.get(i);
//board.board[temp.row][temp.col] = player;
p = alphabeta(temp);
//board.board[temp.row][temp.col] = 0;
}
System.out.println("My move is "+p.row + p.col);
board.board[p.row][p.col] = player;
successors.remove(p);
}
I changed the algorithm slightly just so I can clearly see what going on with min and max, however, it still does not play correctly
Ok, took some time but I think I have it.
In your evaluate function, you should be returning how good the state is for the actual player. If placement is a
canGetFour
for the "otherPlayer", that is a bad state (the worst state). So you return a small number. However, if placement is acanGetFour
for the "actualPlayer" you return a large number (its a good state).Then in your makeMove, you are just checking whether the state is the best state possible. Note, that using a 2d array for this is just about the least efficient way of storing the "child nodes". It would make a lot more sense to have a placement.getPossibleMoves() which returns an array of all the empty squares (both real and temporary), and iterate over that. Otherwise your algorithm is going to be exponential time in the order of the board size.