Tic-Tac-Toe - Part 2.1 - MiniMax

1

Attached Files

The following files have been attached to this tutorial:

.zip

bht-tictactoe-ai-v2p1.zip

Download now 6.91 KB
.capx

tictactoe-part2p1.capx

Download now 886.45 KB

Stats

3,216 visits, 4,763 views

Tools

Translations

This tutorial hasn't been translated.

License

This tutorial is licensed under CC BY 4.0. Please refer to the license text if you wish to reuse, share or remix the content contained within this tutorial.

This is an extension of my Tic-Tac-Toe series. See:

Part 1

Part 2

Part 3

In Part 2 I show how to write an AI plug-in to play against the user. To my shame, I recently discovered that there is a major bug and the algorithm doesn't work. If you place an X in a corner, the computer places an O in the center and if you place an X in the opposite corner, the computer incorrectly places it's O such that player X can win. This should be impossible! If Tic-Tac-Toe is played correctly, nobody should win.

Let's Try Again

So, what is Minimax? Minimax is an AI algorithm that plays out game scenarios, and from all of the possible combinations, picks a path that maximizes the score for player A, while minimizing the score for player B. There are tons of explanations on the web that explain Minimax, so I'm not going to bother explaining it here. An excellent article, for Tic-Tac-Toe, as it happens is here: http://neverstopbuilding.com/minimax. I adapted my mechanism to use the basics described in this article.

Rather than just replace the plugin, I also decided a bit of an update was in order. The included CAPX now lets you pick if you want to be X or O. The key here is that X always goes first, so if you pick O, the computer goes first. I also cleaned up some things. I'm not going to go into the details of the CAPX code itself, since there are three examples already based on the same general code. Instead, I'll look at what changed in the plugin, to show how (relatively simple) the switch over to Minimax was, from my faulty attempt at using an alternate approach.

One major change I did make, that affects the CAPX is to move all of the win-detection into the plugin. Previously I had the winner array duplicated in the CAPX and the plugin. This is bad design, although it followed directly from the first tutorial into the second. There are a couple of new calls available now to get the winner data.

The "Check for Winner" action calls the code that determines if there is a winner, who the winner is, and which line in the winners-array produced a win. To get the result of this, use the expressions "Get winner ID", and "Get winner index" to get the new data.

The Code

onCreate()

I've added four new variables. Two for Minimax and two for winner data.

    // There is always a maximizing and a minimizing player. These identify which is which.
    this.maximizingPlayer = 0;
    this.minimizingPlayer = 0;
            		
    // Since we need two values to get winner details, we cache the winner ID and row (in the winnerLines array, so we can mark the correct winning 'line').
    this.cachedWinnerID = -1;
    this.cachedWinnerRow = -1;
  

The following are scattered around to invalidate the calculated winner data, as any change invalidates the "CheckForWinner" call. See SetMoveX, SetMoveO, CheckForWinner, GetMoveO, and GetMoveX.

    this.cachedWinnerID = this.cachedWinnerIndex = -1;	// Reset the cached values as they are now invalid.

The new GetMoveO. Here we reset our winner data, set our maximizing player as O, and the minimizing player as X. Then start Minimax with our current player board, and the maximizing player (O).

    // Perform the AI to get the next best move for player O.
    exps.GetMoveO = function(ret)
    {
        this.cachedWinnerID = this.cachedWinnerIndex = -1;	// Reset the cached values as they are now invalid.
        this.maximizingPlayer = CONST_PlayerO;
        this.minimizingPlayer = CONST_PlayerX;
        ret.set_int(this.RunMiniMax(CurrentBoard, this.maximizingPlayer));
    }

GetMoveX just inverts the data:

    // Perform the AI to get the next best move for player X.
    exps.GetMoveX = function(ret)
    {
        this.cachedWinnerID = this.cachedWinnerIndex = -1;	// Reset the cached values as they are now invalid.
        
        this.maximizingPlayer = CONST_PlayerX;
        this.minimizingPlayer = CONST_PlayerO;
        ret.set_int(this.RunMiniMax(CurrentBoard, this.maximizingPlayer));
    }

These just return the winning data that was cached by the CheckForWinner() call:

    // Perform the AI to get the next best move for player X.
    exps.GetWinner = function (ret)
    {
        ret.set_int(this.cachedWinnerID);
    }
    // Perform the AI to get the next best move for player O.
    exps.GetWinnerIndex = function (ret)
    {
        ret.set_int(this.cachedWinnerRow);
    }

RunMinimax just sets up the call to the recursive Minimax call. Minimax needs a depth value passed in, and returns an array, with the best-score and best-move in it. This call deals with these extra details.

    // Given a board, and a player, maximize the score for the Maximizing player.
    instanceProto.RunMinimax = function(board, currentPlayer)
    {
        // [0] is the score
        // [1] is the move
        var bestMove = this.Minimax(board, currentPlayer, 0);
    	
        return bestMove[1];
    }

And finally, the big kahuna:

    // Given a current board, and current player, run the Mini-max algorithm.
    instanceProto.Minimax = function(board, currentPlayer, depth)
    {
        // Reset the score to 0, and the best-move to -1.
        var result = [];
        result[0] = 0;        // score
        result[1] = -1;        // best move
        
        var children = [];
        
        // Cycle through every possible new move, and build a new board with that move, for the current player.
        for(var i=0; i<9; i++)
        {
            // If this position is blank, add a move here.
            if(board[i] == 0)
            {
                // Copy board and add new move
                var childBoard = board.slice();    // copy
                childBoard[i] = currentPlayer;    // new move
                children.push(new BoardMove(childBoard, i));    // store on an array
            }
        }
        
        // if the current player is the Maximizing player, start the best-score at -100, so that any higher score will replace this value.
        // Otherwise the current player is the minimizing player so start the best-score at 100 so lower scores will replace this value.
        var bestScore = currentPlayer == this.maximizingPlayer ? -100 : 100;
        var currentScore = 0;
        var bestMove = -1;
        var currentResult = [];
        
        // Reached the bottom, so we have no score (for either player).
        if(children.length == 0)
        {
            bestScore = 0;
        }
        else
        {
            // Increase our depth counter;
            depth++;
            
            // Cycle through all of the possible board moves
            for(var n = 0; n< children.length; n++)
            {
                // If we are maximizing
                if(currentPlayer == this.maximizingPlayer)
                {
                    // Is this board a winner for the [b][i]maximizing[/b][/i] player
                    if(this.IsWinner(children[n].Board, this.maximizingPlayer))
                    {
                        currentScore = 10 - depth;    // Give a positive score.
                    }
                    else
                    {
                        // No, so run the algorithm for the [b][i]minimizing[/b][/i] player
                        currentResult = this.Minimax(children[n].Board, this.minimizingPlayer, depth);
                        currentScore = currentResult[0];    // Store the score
                    }
                    
                    // If our new score is higher than the one we had
                    if(currentScore > bestScore)
                    {
                        // Store the score and the move that got this score.
                        bestScore = currentScore;
                        bestMove = children[n].Move;
                    }
                }
                else    // Current player is minimizing
                {
                    // Is this board a winner for the [b][i]minimizing[/b][/i] player                    
                    if(this.IsWinner(children[n].Board, this.minimizingPlayer))
                    {                        
                        currentScore = depth - 10;    // Give a negative score.
                    }
                    else
                    {
                        // No, so run the algorithm for the [b][i]maximizing[/b][/i] player
                        currentResult = this.Minimax(children[n].Board, this.maximizingPlayer, depth);
                        currentScore = currentResult[0];    // Store the score
                    }
                    
                    // If our new score is lower than the one we had
                    if(currentScore < bestScore)
                    {
                        // Store the score and the move that got this score.
                        bestScore = currentScore;
                        bestMove = children[n].Move;
                    }
                }
            }
        }
        
        // Return our best-score and best-move.
        result[0] = bestScore;
        result[1] = bestMove;
    
        return result;
    }

Since this is fully commented I won't go through every line, but in order:

*clear out the result

*make an array of all possible moves the current-player can make

*setup a score based on whether we want to maximize or minimize the score

*if we have no moves at all, stop with a score = 0

*increase our depth

*look at each of the possible new moves in our array

**if we are maximizing, see if this produces a winner - calculate our score

**else call the Minimax recursively for the minimizing player

**if our current score is better than the previous best-score, store the score and the move that got us that score

*do the opposite for minimizing the current player

*finally store our result so we can return the two values from our function

Although we already have a GetWinner() call, we need a simpler/faster call for Minimax to use that only cares about checking if the current player is a winner. We cycle through the winners-array to see if any winning combinations are found.

    // Does this board produce a win for the given player
    instanceProto.IsWinner = function(board, player)
    {
        // Run through the winning-lines array to see if this player matches the positions of a winning combination.
        for (var i=0; i<winnerLines.length; i++)
        {
            if(board[winnerLines[i][0]] == player && board[url=https://www.scirra.com/tutorials/320/tic-tac-toe-part-1]winnerLines[i[/url]] == player && board[url=https://www.scirra.com/tutorials/326/tic-tac-toe-part-2]winnerLines[i[/url]] == player)
            {
                return true;
            }
        }
        return false;
    }

And finally, we need a small class to store a board and a move, so we can add that data to the children-array. This just provides the data structure.

    // When we build a new possible board, we need to store the board and the move that was added. This just stores those two values for us so we can place it in an array.
    function BoardMove(board, move)
    {
        this.Board = board;
        this.Move = move;	
    }

So, now we have a fully functional, optimized, working Minimax AI to play Tic-Tac-Toe, or now that the basics are plumbed in, the code could be copied and reused for other strategy type games. Just the "make a possible move", and the "win-score" code would need to be changed.

.ZIP

bht-tictactoe-ai-v2p1.zip

Download now 6.91 KB
.CAPX

tictactoe-part2p1.capx

Download now 886.45 KB
  • 0 Comments

  • Order by
Want to leave a comment? Login or Register an account!