Ultimate Tic Tac Toe [Puzzle discussion]

Yes, you must create nodes at the first turn with a do nothing constructor.

1 Like

My Node class is like this:

class Node {
public:
    Node* childrenBegin;
    Node* childrenEnd;
    unsigned char grid;
    unsigned char action;
    float visits;
    float wins;
};

And like I said, at the first turn, I create 5000000 nodes. The creation is as simple as this: Node pool[5000000];

I have a little problem with mi AI because I can’t know who begins the game… Can someone help me for this?

and in consequences, I don’t know what is the first move of my opponent…

If I remember correctly you do have the last move of the opponent as an input. If the first last move is -1 it means you are the first player, otherwise you are the second

1 Like

Sometimes this is a programmers first experience with using Bitboards and MCTS. To this end I created a small example of how to implement Bitboards for plain TicTacToe. Instead of MCTS, this example has just Monte Carlo (also called MC, FlatMC or 1ply MC) origional and updated with feedback from darkhorse64

The example shows how to effectively use bitboarding to get 500,000 simulations within 99 milliseconds, and how something as simple as plain Monte Carlo can achieve good informed results. I know of 3 CGers LazyMammal, BugKiller_328, and KP56 who used Bitboarding and plain MC to achieve Gold, Gold, and Silver respectively.

Plain MC is very straight forward to implement. You effectively just play random games and track which starting position resulted in the most wins. It can also be entirely reused to make MCTS when you are ready.

9 Likes

@MSmits and many other CG regulars also helped and shared code for how to calculate all 8 win conditions in 3 operations by using C++ intrinsic. I’ve added the code to a tech io snippet that you can play with here The free knowledge-sharing platform for technology . This is only slightly faster than having 8 in an array and iterating over them, but every nano second counts in UTTT!

2 Likes

A few improvements for your snippet.

  • Use explicitely player ids 0 and 1. Now, switching player is just
    player ^= 1;
    and playing a move becomes
    players[player] |= 1 << move;
  • The status is useless. Let your playout return the score
    moveScores[i] += simulateBoard.randomPlayout(playerId, moves[i]);
2 Likes

Thanks darkhorse64. I’ve made the suggested changes and updated the post to have the updated code. I also deleted a bunch of dead code.

I’ve try that, but I’ve noticed the processing is far slower while it is the opponent turn. I guess that CodinGame simulator is limiting the CPU ressource during this time. At the end, this method allows an increase around 20% instead of the 100% I expected.

edit: I’ve used just one thread to do use the CPU while waiting for Console.ReadLine().

Multithreading is useless for your bot. CG allows only one thread for running a game.

I am trying to optimize my C# MCTS-solution and can’t understand how do you reach so large numbers of rollouts.

What I have done already:

  1. Board is two bitmasks: for X and for O.
  2. GetMoves is something like (X | O).GetBitsSet() implemented with TrailingZeroCount technique.
  3. Win is determined by comparing with 8 win-sequence bit-masks.

It gives me ~2k playouts on the second turn with max tree depth ~8. And 140-100 place in the Gold Legue.

Profiler shows me, that

  1. 30% of time I spend in GetMoves function
  2. 20% of time I spend in the UCT function (log and sqrt calculation I think)
  3. 10% — in HasWinSequence function.

I have no idea how to do 25k rollouts instead of 2k :frowning:
I will be grateful for any hints.

Do you have bitboard?
Do you have table for every possible moves for every possible states?
Do you have table for wins for every possible state?

Additional:
Do you reuse tree from previous turn?

1 Like

Your implementation seems on the right track

  • For the win check, you can do much better by using the bitboard as an index into a 512 sized lookup table.
  • Your GetMoves is using lots of time for checking bits in a mask. There is something to dig here.
  • C# is a nice language but it’s not fast enough. You should really consider using C++ or Rust
1 Like

I would like to know how the smaller board is chosen each turn.

I use an empty list in my code, and append each valid_action_count to this list each turn.
Then I use the debug print to determine how many and what actions are available each turn.
What I see is that on turn 1, I have 91 ( 9x9) actions to choose from. Excellent.
However on my second turn, I have only 9 actions in the valid_action_count list.

Am I correct in observing that after the first turn I am forced to place my mark in a random smaller board rather than one of my own choosing?

From the rules:

When a player plays on a small board, he also decides where the next player will be allowed to play: for example, if a player has played in the bottom left square of one of the small boards, the next player will play on the small board located at the bottom left square of the main board.

If a player is sent to a board that is either already won, or full, then that player is allowed to play in any empty square.

1 Like

Thanks, for some reason when I read that the first time I misinterpreted it to mean that the next player has to play on the same small board. I’m not sure why I understood it better with your copy paste, but I appreciate it, thanks.

I implemented minimax for wood league, but it always times out. Any suggestions?

import sys

PLAYER = 1
OPP = -1
EMPTY = 0
def findBestMove(board): 
    bestMove = [-1, -1]
    bestVal = -1000
    for i in range(len(board)):
        for j in range(len(board[0])):
            board[i][j] = PLAYER
            moveVal = minimax(board, 0, False)
            board[i][j] = EMPTY
            if moveVal > bestVal:
                bestMove = [i, j]
                bestVal = moveVal
    return bestMove

def minimax(board, depth, isMax):
    score = evaluate(board)
    if score == 10 or score == -10:
        return score
    if (isMovesLeft(board) == False) :
        return 0
    if isMax:
        best = -1000
        for i in range(len(board)):
            for j in range(len(board[0])):
                if board[i][j] == EMPTY:
                    board[i][j] = PLAYER
                    best = max( best, minimax(board, depth + 1, not isMax) )
                    board[i][j] = EMPTY
        return best
    else:
        best = 1000
        for i in range(len(board)):
            for j in range(len(board[0])):
                if board[i][j] == EMPTY:
                    board[i][j] = OPP
                    best = min( best, minimax(board, depth + 1, not isMax))
                    board[i][j] = EMPTY
        return best

def evaluate(board) :
    for row in range(3) :    
        if (board[row][0] == board[row][1] and board[row][1] == board[row][2]) :       
            if (board[row][0] == PLAYER) :
                return 10
            elif (board[row][0] == OPP) :
                return -10
    for col in range(3) :
        if (board[0][col] == board[1][col] and board[1][col] == board[2][col]) :
            if (board[0][col] == PLAYER) :
                return 10
            elif (board[0][col] == OPP) :
                return -10
    if (board[0][0] == board[1][1] and board[1][1] == board[2][2]) :
        if (board[0][0] == PLAYER) :
            return 10
        elif (board[0][0] == OPP) :
            return -10
    if (board[0][2] == board[1][1] and board[1][1] == board[2][0]) :
        if (board[0][2] == PLAYER) :
            return 10
        elif (board[0][2] == OPP) :
            return -10
    return 0

def isMovesLeft(board):
    for i in range(len(board)):
        for j in range(len(board[0])):
            if board[i][j] == EMPTY:
                return True
    return False

# game loop
target = [-1, -1]
boardState = [[EMPTY, EMPTY, EMPTY], [EMPTY, EMPTY, EMPTY], [EMPTY, EMPTY, EMPTY]]
while True:
    print(boardState, file=sys.stderr, flush=True)
    opponent_row, opponent_col = [int(i) for i in input().split()]
    if opponent_row != -1:
        boardState[opponent_row][opponent_col] = -1
    valid_action_count = int(input())
    target = findBestMove(boardState)
    print(target[0], target[1], sep = ' ')

no max depth …
maybe not a good idea :wink:

4 Likes

How is the time measured?
Because sometimes I get a timeout, and sometimes, if I rerun it with same settings, it works.