Ultimate Tic Tac Toe [Puzzle discussion]

berthy424 · March 18, 2018, 2:07pm

Yes you’re right. Thanks for your answer.

dbf · March 24, 2018, 10:53am

I got ‘Invalid output’ error today while there were no any output from my bot, and in the chat
eulerscheZahl told me, that for timeout in this game there is ‘Invalid output’ error also. If it is true, is it possible to fix it, because it is not obvious, wjile we have Timeout error in other games?

BrunoFelthes · March 24, 2018, 12:29pm

The victory conditions was updated:

You’ve won on 3 aligned smaller tic-tac-toe boards. If nobody managed to get 3 marks aligned, the player that won the most smaller tic-tac-toe boards wins.

So, the rules will not change anymore?

pb4 · March 25, 2018, 12:49pm

At the moment, the game can still be a draw if both players get as many marks on the big grid.

It is possible to introduce a new rule, stating the following : "If the grid is full and the game is drawn according to current rules, the winner is the player who has the most marks on all undecided small grids."

This new rule would guarantee that no game can be drawn.

I’m curious to have your thoughts about it ?

reCurse · March 25, 2018, 3:08pm

Why is a draw something to correct? You’re giving a huge advantage to the first player that way.

JFB · March 26, 2018, 7:57pm

Just small story about how I tried to win UTTT with Neuron Network and fail.

I just decided to write this story for community (as Thibaud said – community is important ) – maybe somebody find something interesting for him.

Currently I started learning a bit about NN so I want to try to check my knowledge on some case. I just written very simple MCTS bot for UTTT which moved to middle of Gold – so I decided to test NN (specifically speaking – Convolution Neural Network - CNN) on this game.

What I wanted to do?

I planned to implement following steps (I name it for myself PMAG – Poor Man AlphaGo ):

Gather games replays from my bot and top players’ bots
Extract moves from these replays
Using moves – calculate states before every move
Create 2 sets of pairs: (game state, move) – one for winner, second for loser
Use pairs from winner to train CNN
Implement trained CNN in the bot and run it on CG
Repeat from step 1

I expected that, after each iteration, my bot will be better and better.

What I done?

I written small script in Python and gather around 2000 UTTT replays
Using another Python script I extracted lists of moves with information which player win
3-4) Using simulator from my current C++ bot I calculated states and generated *.png for each state and put them in directory {win/lost}/{next_move}
I written script in Python, using #Keras and #TensorFlow to train small CNN (very basic – just to start – I planned to improve it letter) and train this CNN.

As you see I just went according to plan. It was Friday evening, I have had all weekend – I was in good mood.

And in this moment I realized that it is source size limit on CG. I know about it, of course, but never before my code rich the limit – so my knowledge was only theoretical. Now I saw that my very basic NN has around 50 k floats – what mean around 200 kB – and even more because I have to code bytes using base64 or base83.

But maybe source code limit is not so strict?

I tested and even file 100.3 kB was too long. So limit is 100 kB and is very strict. So instead of improving my basic NN – I have to simplify it . So I returned to point 5) and trained even simplest NN. So from now all my decision for implementing bot was dictated by source size limit:

I decided to implement bot in Python (not in current C++ bot) because:
a. for golf Python is better – code is smaller
b. in Python they are function for base64 and base83 – another decrease of source code
c. in Python it is float16 (in C++ I will have to implement myself such library)

When I finally have my program working on CG I run it – and have had 0% win rate in Gold. Checked and of course found big bug (I wrongly implemented conversion from NN output to moves).

After bugfix – bot start winning. Sometime. Very rarely  It finish as last in Gold. With rank around 14.0. So probably it was good enough to first half of Silver. But no more.

So I fail.

I still have few ideas what I can change – but I’m afraid that source code limit will not give me possibility to even write NN program better than my current, basic MCTS. Maybe it is intentional from CG – to protect good programmers from lazy ones and his NN bots

If not – maybe it is possible to increase source size limit? Or implement another solution – give possibility to upload data (for example 10 MB for each login) and let bots to use data from owner account?

And if somebody is curious do my NN was realy basic it was:

conv -> batchnorm -> relu -> conv -> batchnorm -> relu -> conv -> relu -> dense -> relu -> drop -> dense -> softmax.

Really basic, no inception, no ResNet, no deep.

dbf · April 1, 2018, 8:10am

Are we going to have ‘Legend’ league opened for UTTT?

_CG_Maxime · April 1, 2018, 2:03pm

Yes, just waiting for more participants.

snowfrogdev · May 2, 2018, 3:30pm

So I ran one million random playouts of Ultimate Tic Tac Toe with the CodinGame winning rules and I found that for the first player to play, there is a 50.9% chance to win, a 7.2% chance to draw and a 41.9% chance to lose. To your knowledge, are these odds correct?

DollarAkshay · May 6, 2018, 12:16am

In gold league, the MightyCarlo boss always plays a different game even when you click Replay in Same Conditions.

This actually confused me and I thought it was my bot that was doing some randomization, turns out it was MightCarlo. I understand MontaCarlo Tree Search is a randomized algorithm, but it would be nice if the bot used the RandomSeed given by the environment.

inoryy · May 6, 2018, 7:13am

That’s not possible. What Maxime could do is lock the bots seed to some constant, but even then the behavior would wildly change due to varying number of instructions it’ll be able to fit in his time limit (30ms I believe?).

What helped me get out of gold was focus on other players in top10, ensuring my bot could sustain at least 50% winrate vs everybody. High winrate vs MightyCarlo fell out naturally from that goal.

perseverent · May 15, 2018, 5:42am

I am trying to learn MCTS with this bot contest. How many simulations per turn would be necessary to be able to beat MightyCarlo boss, considering that he is doing probably the same thing by looking at the name.

Magus · May 15, 2018, 6:11am

If MightyCarlo is the gold boss, i reached legend with 25k rollouts at the second turn.

perseverent · May 15, 2018, 7:47am

Thanks @Magus. Yes I was talking about the gold boss. I have only 5k rollouts in the second turn, but I noticed the tree has 13k nodes. I was thinking I save time if I expand a leaf with all possible moves. It turns out, most of my leaf nodes will never benefit from a rollout. That or I have a bug.

perseverent · May 15, 2018, 7:54am

Does anybody have any idea what to do with the time in the first turn? In the moment I do not use it. I feel this is such a waste but at the same time it’s like getting 1h for the first move in chess, and after that 30sec for all other moves. In chess you play fast in the beginning to save time for later.

Magus · May 15, 2018, 8:04am

In the first turn i create 5000000 nodes. I don’t want to use new ou delete later in the code because it’s slow.

Then if i’m the first player, i hardcode the (4,4) move and i let the MCTS do its job. I keep my tree between turns so when i’m playing at turn 2 my tree is already filled with the first turn calculation.

If i’m player 2, i also hardcode a move but only if my opponent played in (4,4).

perseverent · May 15, 2018, 8:24am

Thanks @Magus. Now I have a clear target.

perseverent · May 25, 2018, 5:57am

I can confirm @Magus claim. Once you can get 25k rollouts at second turn in a consistent manner, you will beat the gold boss MightyCarlo.

perseverent · May 25, 2018, 6:02am

If anybody wants to try MCTS, you can get to legend just with the vanilla algo. All you have to do is to optimize the hell out of the code.

perseverent · May 25, 2018, 6:42am

Ideas for optimization:
1.calculating the table where next player has to move, once you know the last move. There are 9 TTT tables and 9 possible moves (if you always consider the move relative to the table). So if last move was move no.7 (in what table was that it’s not important) than the next move has to be made in table no. 7.
2.using bitboards for memory representation of the tables. I found a very interesting idea here for testing a win if you use bitboards: https://www.youtube.com/watch?v=5t5jzkO0t7w (it’s in German but you will get the point)
3.if you go the bitboard route I can recommend the following site: https://graphics.stanford.edu/~seander/bithacks.html
4.try to have lookup tables for things you use often.