Ultimate Tic Tac Toe [Puzzle discussion]

Yeees, I did it :slight_smile:
That was quite a challenge!!! After fixing all the bugs I could find Iā€™ve reached 7th in the gold league after that I decided to play with the exploration parameter (sqrt(2.0) - 1.0) looked like working the best, submit -> 1st in gold league for a couple of days I could not think of anything else then I saw that I havenā€™t almost any battle ended in a draw. I updated my algorithm with 0.5 reward for draw(0 for lost, 1 for win), submit -> again 1st in goldā€¦, but I saw less losses and more draws. After playing for several hours with the exploration parameter 0.5 value really balanced the search, submit -> 80th in Legend!!! And I felt really really happy.

So in the end it turned out, as @darkhorse64 said, that the exploration parameter must be adjusted properly, also the pruning proposed by @accorp guided the search in the correct direction.

From this point on, I think the main improvement for my strategy would be to optimize the C++ code, I have an idea how to make it about 2 times faster (but maybe in another parallel universe) or improve the selection phase (I read a ton of literature on the subject).

Thanks to everyone for the suggestions. I wanted to learn MCTS with this game and I think I did.
Great game, great platform, great community and it feels great to see the result of the efforts made!

Wonā€™t stop till weā€™re legends

6 Likes

Apart from optimising your code, there are a couple of simple and generic (I mean valid for any MCTS) things that you can do to improve your bot:

  • Smarter rollouts: If you have a winning move, play it. If you can prevent your opponent to play a winning move, do it
  • Solver: when a game nears the end, while building the tree, you reach terminal nodes. If you have a win, it means the previous move was losing. If all possible moves lead to a loss, it means the previous move was winning. This way, you can back-propagate proven results and avoid exploring again solved nodes. This is quite effective: usually you can predict the result of a game 10 or more moves before the end. There are two useful papers to read: ā€œMonte-Carlo Tree Search Solverā€ and ā€œScore Bounded Monte-Carlo Tree Searchā€
14 Likes

I get into gold league with simple heuristics.
No simulations, no complex algorithms - just scoring function for each valid action.
With defs and classes, the code is below a 100 lines.

Here is the core:

score = f1 * win + f2 * defend - f3 * closed + f4 * same_board - f5 * middle + f6 * position.get(board ) + f7 * position.get(self.next_board[y, x]) * [1, 0][closed]

Just figure out the factors.

2 Likes

:wave: Magus, do you mind sharing a bit how your Node looks like?
Because I tried to create the same amount, first in Java then in C++ and I could not reach more than ~500k nodes on first turn.

My current node looks like that:

Node {
  int[9] subBoards;
  int[81] childrenIndexes;
  // 2 double, 5 int
}

For testing purpose I tried to remove both arrays but without result, I always hit timeout at ~600k.

Here is how I create nodes:

for (int i = 0; i < nodes.length; i++) {
      nodes[i] = Node.createEmptyNode();
}

I think there are two major issues with your code:

  • Your node includes the game state. This is taking way too much space. Your node should only contain the played move. Every time you enter a node, you should instead update a Game variable with the node move. Replaying a game will not be less efficient that copying a game state.
  • I guess you are initializing your nodes at startup. Donā€™t do that. Allocate your data and initialize a node once you need it.
4 Likes

Ok so the idea would be to recompute each Game states from the current state when finding a new node, instead of storing computed Game states for reuse. Iā€™ll try that thank you.

For the second point, I confirm you thatā€™s what Iā€™m doing. In Java, If Iā€™m not wrong
Node[] nodes = new Node[5_000_000];
will allocate memory for 5M references, which should be cheap somehow.

What I thought was expensive is when creating the actual node using its constructor

Node() {
  //set default values
}

Thatā€™s why I tried to create nodes at the first turn.
And from what Magus said, he seems to do that as well?

(Btw Iā€™m familiar with Java but a total beginner with C++)

Yes, you must create nodes at the first turn with a do nothing constructor.

1 Like

My Node class is like this:

class Node {
public:
    Node* childrenBegin;
    Node* childrenEnd;
    unsigned char grid;
    unsigned char action;
    float visits;
    float wins;
};

And like I said, at the first turn, I create 5000000 nodes. The creation is as simple as this: Node pool[5000000];

I have a little problem with mi AI because I canā€™t know who begins the gameā€¦ Can someone help me for this?

and in consequences, I donā€™t know what is the first move of my opponentā€¦

If I remember correctly you do have the last move of the opponent as an input. If the first last move is -1 it means you are the first player, otherwise you are the second

1 Like

Sometimes this is a programmers first experience with using Bitboards and MCTS. To this end I created a small example of how to implement Bitboards for plain TicTacToe. Instead of MCTS, this example has just Monte Carlo (also called MC, FlatMC or 1ply MC) origional and updated with feedback from darkhorse64

The example shows how to effectively use bitboarding to get 500,000 simulations within 99 milliseconds, and how something as simple as plain Monte Carlo can achieve good informed results. I know of 3 CGers LazyMammal, BugKiller_328, and KP56 who used Bitboarding and plain MC to achieve Gold, Gold, and Silver respectively.

Plain MC is very straight forward to implement. You effectively just play random games and track which starting position resulted in the most wins. It can also be entirely reused to make MCTS when you are ready.

9 Likes

@MSmits and many other CG regulars also helped and shared code for how to calculate all 8 win conditions in 3 operations by using C++ intrinsic. Iā€™ve added the code to a tech io snippet that you can play with here The free knowledge-sharing platform for technology . This is only slightly faster than having 8 in an array and iterating over them, but every nano second counts in UTTT!

3 Likes

A few improvements for your snippet.

  • Use explicitely player ids 0 and 1. Now, switching player is just
    player ^= 1;
    and playing a move becomes
    players[player] |= 1 << move;
  • The status is useless. Let your playout return the score
    moveScores[i] += simulateBoard.randomPlayout(playerId, moves[i]);
2 Likes

Thanks darkhorse64. Iā€™ve made the suggested changes and updated the post to have the updated code. I also deleted a bunch of dead code.

Iā€™ve try that, but Iā€™ve noticed the processing is far slower while it is the opponent turn. I guess that CodinGame simulator is limiting the CPU ressource during this time. At the end, this method allows an increase around 20% instead of the 100% I expected.

edit: Iā€™ve used just one thread to do use the CPU while waiting for Console.ReadLine().

Multithreading is useless for your bot. CG allows only one thread for running a game.

I am trying to optimize my C# MCTS-solution and canā€™t understand how do you reach so large numbers of rollouts.

What I have done already:

  1. Board is two bitmasks: for X and for O.
  2. GetMoves is something like (X | O).GetBitsSet() implemented with TrailingZeroCount technique.
  3. Win is determined by comparing with 8 win-sequence bit-masks.

It gives me ~2k playouts on the second turn with max tree depth ~8. And 140-100 place in the Gold Legue.

Profiler shows me, that

  1. 30% of time I spend in GetMoves function
  2. 20% of time I spend in the UCT function (log and sqrt calculation I think)
  3. 10% ā€” in HasWinSequence function.

I have no idea how to do 25k rollouts instead of 2k :frowning:
I will be grateful for any hints.

Do you have bitboard?
Do you have table for every possible moves for every possible states?
Do you have table for wins for every possible state?

Additional:
Do you reuse tree from previous turn?

1 Like

Your implementation seems on the right track

  • For the win check, you can do much better by using the bitboard as an index into a 512 sized lookup table.
  • Your GetMoves is using lots of time for checking bits in a mask. There is something to dig here.
  • C# is a nice language but itā€™s not fast enough. You should really consider using C++ or Rust
1 Like