I am trying to write an AI opponent for checkers using minimax with alpha-beta pruning. However, it doesn't move the correct pieces: it seems to move pieces at random, even ones that cannot legally move.

I've rewritten both the minimax and undo functions multiple times, because I believe the problem is that the state is not being undone correctly after each recursive call, but I still have the same problem.

    def undo(self, state, oldRow, oldCol, newRow, newCol):
        if oldRow + 1 == newRow:
            if state[oldRow][oldCol] == 'b' and state[newRow][newCol] == 'B':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'b'
                state[newRow][newCol] = temp
            else:
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = state[newRow][newCol]
                state[newRow][newCol] = temp
        elif oldRow - 1 == newRow:
            if state[oldRow][oldCol] == 'w' and state[newRow][newCol] == 'W':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'w'
                state[newRow][newCol] = temp
            else:
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = state[newRow][newCol]
                state[newRow][newCol] = temp
        elif oldRow + 2 == newRow:
            if state[oldRow][oldCol] == 'b' and state[newRow][newCol] == 'B':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'b'
                state[newRow][newCol] = temp
                state[oldRow + 1][int((oldCol + newCol) / 2)] = 'w'
            else:
                if state[newRow][newCol] == 'b' or state[newRow][newCol] == 'B':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow + 1][int((oldCol + newCol) / 2)] = 'w'
                elif state[newRow][newCol] == 'W':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow + 1][int((oldCol + newCol) / 2)] = 'b'
        elif oldRow - 2 == newRow:
            if state[oldRow][oldCol] == 'w' and state[newRow][newCol] == 'W':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'w'
                state[newRow][newCol] = temp
                state[oldRow - 1][int((oldCol + newCol) / 2)] = 'b'
            else:
                if state[newRow][newCol] == 'w' or state[newRow][newCol] == 'W':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow - 1][int((oldCol + newCol) / 2)] = 'b'
                elif state[newRow][newCol] == 'B':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow - 1][int((oldCol + newCol) / 2)] = 'w'
        return state

    def minimaxAB(self, state, row, col, player, depth, alpha, beta):
        if depth == 0 or self.terminal_test(state):
            return None, self.utility(state)

        if player == HUMAN_PLAYER:  # maximizing player
            best = -math.inf
            bestRow = None
            bestCol = None
            for move in self.actions(state, row, col, player):
                newRow = move[0]
                newCol = move[1]
                _, val = self.minimaxAB(state, newRow, newCol, self.getEnemyPlayer(HUMAN_PLAYER), depth - 1, alpha, beta)

                # undo the move

                state = self.undo(state, row, col, newRow, newCol)
                if val > best:
                    bestRow, bestCol, best = newRow, newCol, val
                alpha = max(alpha, val)
                if alpha >= beta:
                    break
            next = bestRow, bestCol
            return next, best

        else:  # minimizing player
            best = math.inf
            bestRow = None
            bestCol = None
            for move in self.actions(state, row, col, player):
                newRow = move[0]
                newCol = move[1]
                _, val = self.minimaxAB(state, newRow, newCol, self.getEnemyPlayer(AI_PLAYER), depth - 1, alpha, beta)

                # undo the move
                state = self.undo(state, row, col, newRow, newCol)
                if val < best:
                    bestRow, bestCol, best = newRow, newCol, val
                beta = min(beta, val)
                if alpha >= beta:
                    break
            next = bestRow, bestCol
            return next, best
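
For comparison, here is a minimal sketch of an apply/undo pair that records what a move changed instead of inferring it from the board afterwards, as the `undo` above does. Everything in it is an assumption made for illustration: the names `MoveRecord`, `apply_move`, `undo_move` and the `EMPTY` marker are hypothetical, and the promotion rule assumes that `b` moves toward the last row and `w` toward row 0, which is what the row checks in `undo` suggest. It handles a single step or a single jump only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

EMPTY = "."  # assumption: marker for an empty square


@dataclass
class MoveRecord:
    src: Tuple[int, int]
    dst: Tuple[int, int]
    moved: str                                # piece as it stood on src before the move
    captured: Optional[Tuple[int, int, str]]  # (row, col, piece) of a jumped piece, if any


def apply_move(state: List[List[str]], src, dst) -> MoveRecord:
    """Apply one step or jump in place and return what is needed to revert it."""
    (r0, c0), (r1, c1) = src, dst
    piece = state[r0][c0]
    captured = None
    if abs(r1 - r0) == 2:  # a jump removes the piece on the midpoint square
        mr, mc = (r0 + r1) // 2, (c0 + c1) // 2
        captured = (mr, mc, state[mr][mc])
        state[mr][mc] = EMPTY
    state[r0][c0] = EMPTY
    # Promotion: 'b' is assumed to move toward the last row, 'w' toward row 0.
    if piece == "b" and r1 == len(state) - 1:
        state[r1][c1] = "B"
    elif piece == "w" and r1 == 0:
        state[r1][c1] = "W"
    else:
        state[r1][c1] = piece
    return MoveRecord(src, dst, piece, captured)


def undo_move(state: List[List[str]], record: MoveRecord) -> None:
    """Revert exactly the move described by record."""
    (r0, c0), (r1, c1) = record.src, record.dst
    state[r1][c1] = EMPTY
    state[r0][c0] = record.moved  # also reverts a promotion
    if record.captured is not None:
        mr, mc, piece = record.captured
        state[mr][mc] = piece
```

With a record like this, undoing needs no branching on row differences or piece colours, and it only makes sense to call `undo_move` after a matching `apply_move`. The `minimaxAB` listing above calls `undo` after each recursive call, but it does not show where the move is applied beforehand.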

Here is how it is called. Because any piece could be moved, the code loops over every position on the board and, for each white piece (w or W), calls minimax with that position. It then checks whether the move minimax returns is valid; if it is valid and has the best utility score so far, it is kept. After both loops end, the best move is played:

    max = -math.inf
    for r in range(NUMBER_OF_ROWS):
        for c in range(NUMBER_OF_COLS):
            if state[r][c] == 'w' or state[r][c] == 'W':
                move = self.minimaxAB(state, r, c, player, 2, -math.inf, math.inf)
                tempMove = move[0]
                tempRow = tempMove[0]
                tempCol = tempMove[1]
                valid = self.is_valid_location(state, r, c, tempRow, tempCol, player)
                if move[1] > max and valid:
                    nextM = move[0]
                    max = move[1]
                    maxRow = r
                    maxCol = c
    newRow = nextM[0]
    newCol = nextM[1]
    valid = self.is_valid_location(state, maxRow, maxCol, newRow, newCol, player)
    if valid:
        state = self.result(state, maxRow, maxCol, newRow, newCol)
        self.display(state)
        player = HUMAN_PLAYER
    if self.winning_state(state, AI_PLAYER):
        print("Player 2 wins!")
        game_over = True

It should make an intelligent, valid move. Instead, it makes two moves at once, and neither of them is valid.
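
To make the expected behaviour concrete, here is a minimal sketch of alpha-beta search in which every move is applied before the recursive call and reverted right after it, so the board is unchanged once the search returns. It is not the code from this question: `CountingGame` is a hypothetical toy game invented only so the example runs on its own, and the method names `legal_moves`, `apply`, `undo`, `evaluate` and `other` are assumptions. In checkers, a move would presumably be a (source, destination) pair and `legal_moves` would return the legal moves of every piece of that player, so a single call at the root could replace the loop over board positions.

```python
import math


class CountingGame:
    """Hypothetical toy game, used only to exercise the search below.

    The state is a one-element list holding a running total. MAX adds
    1, 2 or 3 to the total on its turn; MIN subtracts 1, 2 or 3.
    """

    MAX_PLAYER, MIN_PLAYER = "max", "min"

    def other(self, player):
        return self.MIN_PLAYER if player == self.MAX_PLAYER else self.MAX_PLAYER

    def legal_moves(self, state, player):
        return [1, 2, 3]

    def apply(self, state, move, player):
        delta = move if player == self.MAX_PLAYER else -move
        state[0] += delta
        return delta  # the undo record: exactly what was changed

    def undo(self, state, record):
        state[0] -= record

    def evaluate(self, state):
        return state[0]


def alphabeta(game, state, player, depth, alpha=-math.inf, beta=math.inf):
    """Return (best_move, value), with value seen from MAX's point of view."""
    moves = game.legal_moves(state, player)
    if depth == 0 or not moves:
        return None, game.evaluate(state)
    maximizing = player == game.MAX_PLAYER
    best_move = None
    best_val = -math.inf if maximizing else math.inf
    for move in moves:
        record = game.apply(state, move, player)  # 1. make the move
        _, val = alphabeta(game, state, game.other(player),
                           depth - 1, alpha, beta)  # 2. search the reply
        game.undo(state, record)                   # 3. revert that same move
        if maximizing and val > best_val:
            best_move, best_val = move, val
        elif not maximizing and val < best_val:
            best_move, best_val = move, val
        if maximizing:
            alpha = max(alpha, best_val)
        else:
            beta = min(beta, best_val)
        if alpha >= beta:  # the other side would never allow this line
            break
    return best_move, best_val
```

Calling `alphabeta(CountingGame(), [0], CountingGame.MAX_PLAYER, 2)` on a state of `[0]` leaves the state at `[0]` afterwards. Comparing the board before and after the search in the same way is a cheap check on whether apply and undo are balanced.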
