Implementing Alpha Beta into Minimax

2.4k views Asked by At

I'm trying to add Alpha Beta pruning into my minimax, but I can't understand where I'm going wrong.

At the moment I'm going through 5,000 iterations, where I should be going through approximately 16,000 according to a friend. When choosing the first position, it is returning -1 (a loss) whereas it should be able to definitely return a 0 at this point (a draw) as it should be able to draw from an empty board, however I can't see where I'm going wrong as I follow my code it seems to be fine

Strangely if I switch returning Alpha and Beta inside my checks (to achieve returning 0) the computer will attempt to draw but never initiate any winning moves, only blocks

My logical flow

If we are looking for alpha: If the score > alpha, change alpha. if alpha and beta are overlapping, return alpha

If we are looking for beta: If the score < beta, change beta. if alpha and beta are overlapping, return beta

Here is my Recursive call

int MinimaxAB(TGameBoard* GameBoard, int iPlayer, bool _bFindAlpha, int _iAlpha, int _iBeta) 
{

    //How is the position like for player (their turn) on iGameBoard?
    int iWinner = CheckForWin(GameBoard);
    bool bFull = CheckForFullBoard(GameBoard);

    //If the board is full or there is a winner on this board, return the winner
    if(iWinner != NONE || bFull == true) 
    {
        //Will return 1 or -1 depending on winner
        return iWinner*iPlayer;
    }

    //Initial invalid move (just follows i in for loop)
    int iMove = -1;
    //Set the score to be instantly beaten
    int iScore = INVALID_SCORE;

    for(int i = 0; i < 9; ++i)
    {
        //Check if the move is possible
        if(GameBoard->iBoard[i] == 0) 
        {
            //Put the move in
            GameBoard->iBoard[i] = iPlayer;

            //Recall function
            int iBestPositionSoFar = -MinimaxAB(GameBoard, Switch(iPlayer), !_bFindAlpha, _iAlpha, _iBeta);

            //Replace Alpha and Beta variables if they fit the conditions - stops checking for situations that will never happen
            if (_bFindAlpha == false)
            {
                if (iBestPositionSoFar < _iBeta)
                {
                    //If the beta is larger, make the beta smaller
                    _iBeta = iBestPositionSoFar;
                    iMove = i;

                    if (_iAlpha >= _iBeta)
                    {
                        GameBoard->iBoard[i] = EMPTY;

                        //If alpha and beta are overlapping, exit the loop
                        ++g_iIterations;
                        return _iBeta;

                    }
                }
            }
            else
            {
                if (iBestPositionSoFar > _iAlpha)
                {
                    //If the alpha is smaller, make the alpha bigger
                    _iAlpha = iBestPositionSoFar;
                    iMove = i;

                    if (_iAlpha >= _iBeta)
                    {
                        GameBoard->iBoard[i] = EMPTY;

                        //If alpha and beta are overlapping, exit the loop
                        ++g_iIterations;
                        return _iAlpha;
                    }
                }
            }

            //Remove the move you just placed
            GameBoard->iBoard[i] = EMPTY;
        }
    }


    ++g_iIterations;

    if (_bFindAlpha == true)
    {
        return _iAlpha;
    }
    else
    {
        return _iBeta;
    }
}

Initial call (when computer should choose a position)

int iMove = -1; //Invalid
int iScore = INVALID_SCORE;

for(int i = 0; i < 9; ++i) 
{
    if(GameBoard->iBoard[i] == EMPTY) 
    {
        GameBoard->iBoard[i] = CROSS;
        int tempScore = -MinimaxAB(GameBoard, NAUGHT, true, -1000000, 1000000);
        GameBoard->iBoard[i] = EMPTY;

        //Choosing best value here
        if (tempScore > iScore)
        {
            iScore = tempScore;
            iMove = i;
        }
    }
}
//returns a score based on Minimax tree at a given node.
GameBoard->iBoard[iMove] = CROSS;

Any help regarding my logical flow that would make the computer return the correct results and make intelligent moves would be appreciated

1

There are 1 answers

0
Zong On

Does your algorithm work perfectly without alpha-beta pruning? Your initial call should be given with false for _bFindAlpha as the root node behaves like an alpha node, but it doesn't look like this will make a difference:

int tempScore = -MinimaxAB(GameBoard, NAUGHT, false, -1000000, 1000000);

Thus I will recommend for you to abandon this _bFindAlpha nonsense and convert your algorithm to negamax. It behaves identically to minimax but makes your code shorter and clearer. Instead of checking whether to maximize alpha or minimize beta, you can just swap and negate when recursively invoking (this is the same reason you can return the negated value of the function right now). Here's a slightly edited version of the Wikipedia pseudocode:

function negamax(node, α, β, player)
    if node is a terminal node
        return color * the heuristic value of node
    else
        foreach child of node
            val := -negamax(child, -β, -α, -player)
            if val ≥ β
                return val
            if val > α
                α := val
        return α

Unless you love stepping through search trees, I think that you will find it easier to just write a clean, correct version of negamax than debug your current implementation.