Minimax algorithm not running correctly

53 views Asked by At

I'm currently trying to implement the Minimax algorithm into a Tic-Tac-Toe game. So far, the algorithm doesn't behave in the way I need it to, as the AI does not choose the most optimal move in the game.

import math


board = [[0, 0, 0],[0, 0, 0],[0, 0, 0]]
def displayTable():
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[0][i] == 0:
            print('|         ', end = '')
        else:
            print('|   ', board[0][i], '   ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[1][i] == 0:
            print('|         ', end = '')
        else:
            print('|   ', board[1][i], '   ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')
    for i in range(4):
        if i == 3:
            print('|')
            break
        if board[2][i] == 0:
            print('|         ', end = '')
        else:
            print('|   ', board[2][i], '   ', end = '')
    for i in range(30):
        print('-', end = '')
        if i == 29:
            print('-')

def getUserMove():
    print("Enter the x, y coordinates of the space you want to use (Hint: coordinates are 0 indexed):")
    usermove = input("Separate your coordinates with a comma, no spaces: ")
    userArray = usermove.split(',')
    while True:
        if int(userArray[0]) > 2 or int(userArray[1]) > 2:
            print('That spot is out of range! Try to pick again!')
            usermove = input("Separate your coordinates with a comma, no spaces: ")
            userArray = usermove.split(',')
        elif board[int(userArray[0])][int(userArray[1])] != 0:
            print('That spot is already taken! Try to pick again!')
            usermove = input("Separate your coordinates with a comma, no spaces: ")
            userArray = usermove.split(',')
        else:
            board[int(userArray[0])][int(userArray[1])] = 'X'
            break

def terminalState(gameboard):
    for i in range(3):
        if gameboard[i][0] == 'X' and gameboard[i][1] == 'X' and gameboard[i][2] == 'X':
            return 1
        if gameboard[i][0] == 'O' and gameboard[i][1] == 'O' and gameboard[i][2] == 'O':
            return -1
        if gameboard[0][i] == 'X' and gameboard[1][i] == 'X' and gameboard[2][i] == 'X':
            return 1
        if gameboard[0][i] == 'O' and gameboard[1][i] == 'O' and gameboard[2][i] == 'O':
            return -1
    
    if gameboard[0][0] == 'X' and gameboard[1][1] == 'X' and gameboard[2][2] == 'X':
        return 1
    if gameboard[0][0] == 'O' and gameboard[1][1] == 'O' and gameboard[2][2] == 'O':
        return -1
    if gameboard[2][0] == 'X' and gameboard[1][1] == 'X' and gameboard[0][2] == 'X':
        return 1
    if gameboard[2][0] == 'O' and gameboard[1][1] == 'O' and gameboard[0][2] == 'O':
        return -1
    
    return 0



def aiMove(board):
    aiscore = -math.inf
    aimovex = -1
    aimovey = -1
    
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                newscore = minimax(board, 0, False)
                board[i][j] = 0
                if newscore > aiscore:
                    aiscore = newscore
                    aimovex = i
                    aimovey = j
    board[aimovex][aimovey] = 'O'

def maxValue(board, depth):
    m = -math.inf
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'O'
                score = minimax(board, depth + 1, False)
                board[i][j] = 0
                m = max(m, score)
    return m

def minValue(board, depth):
    m = math.inf
    for i in range(3):
        for j in range(3):
            if board[i][j] == 0:
                board[i][j] = 'X'
                score = minimax(board, depth + 1, True)
                board[i][j] = 0
                m = min(m, score)
    return m

def minimax(board, curDepth, isMaximizing):
    if terminalState(board) != 0:
        return terminalState(board)
    if isMaximizing:
        return maxValue(board, curDepth)
    else:
        return minValue(board, curDepth)

displayTable()
while terminalState(board) == 0:
    getUserMove()
    if terminalState(board) != 0:
        displayTable()
        break
    aiMove(board)
    displayTable()
if terminalState(board) == 1:
    print('Congratulations! You won!')
elif terminalState(board) == -1:
    print('You tried your best. Thank you for playing')
else:
    print('Tie game. Thank you for playing')

This is the code of the game that I'm trying to make. Everything runs with no errors, except the logical error of the Minimax algorithm. I've tried changing the both the terminalState and aiMove functions, but I keep getting the same result.

1

There are 1 answers

0
trincot On

The main problem is that you don't deal with the terminal state of a tie, and this leads to scores of infinity being returned in minimax (as no valid move is found).

Note that your terminalState is capable to return only three values -1, 0 and 1, yet there are four states possible: X-wins, O-wins, tie, and not-terminated-yet! Where you call terminalState, you consider the value 0 as indicating the game is not over yet, yet in the main code, you have code that outputs "Tie game", but there really is no possibility to get there: if the state were 0, the main loop wouldn't have ended.

Secondly, as the value returned by terminalState is used by minimax as a score, it should have a sign that corresponds to the maximizing principle. Everywhere in your code you have chosen "O" to be the maximizing player, except that in terminalState you return -1 when "O" wins, and then minimax uses that as the score, which obviously is wrong. It should be the opposite.

To tackle both problems, toggle the sign on the returned values in terminalState, and add code in it to detect a tie. Distinguish between a tie and a not-terminated game: return 0 when it is a tie, but return None when the game is not over yet.

So:

def terminalState(gameboard):
    for i in range(3):
        if gameboard[i][0] == 'X' and gameboard[i][1] == 'X' and gameboard[i][2] == 'X':
            return -1  # Altered sign, here and below
        if gameboard[i][0] == 'O' and gameboard[i][1] == 'O' and gameboard[i][2] == 'O':
            return 1
        if gameboard[0][i] == 'X' and gameboard[1][i] == 'X' and gameboard[2][i] == 'X':
            return -1
        if gameboard[0][i] == 'O' and gameboard[1][i] == 'O' and gameboard[2][i] == 'O':
            return 1

    if gameboard[0][0] == 'X' and gameboard[1][1] == 'X' and gameboard[2][2] == 'X':
        return -1
    if gameboard[0][0] == 'O' and gameboard[1][1] == 'O' and gameboard[2][2] == 'O':
        return 1
    if gameboard[2][0] == 'X' and gameboard[1][1] == 'X' and gameboard[0][2] == 'X':
        return -1
    if gameboard[2][0] == 'O' and gameboard[1][1] == 'O' and gameboard[0][2] == 'O':
        return 1

    if any(0 in row for row in gameboard):  # Is there a free cell?
        return None  # not terminal!

    return 0  # It's a tie

Then there are three places where you compare the value returned by terminalState with 0. Change those so that the comparison is with None.

With those changes your code will work.