Spring Challenge 2021 - Feedbacks & strategies

It’s not specific to DUCT; it’s something I implemented for this game (I’m not an expert in MCTS/UCT, so I have no idea whether it is something classical or not). The idea is that in this game the winner is determined by score difference and not by specific conditions (as in UTTT, for example). So to make moves less risky, it makes sense to aim for the maximum score difference. Otherwise the opponent could come up with a solution that I pruned, or with a specific sequence of actions that the MCTS doesn’t anticipate (which did happen against Beam Search opponents who found an optimal way to maximize their score in the late game), and if the MCTS was only targeting a +1 score margin, that would be enough to lose.

For mid-game evals, it also makes sense to give an evaluation between 0 and 1, because you don’t know the winner at this point. The exploration parameter c of UCT needs to be adjusted accordingly.
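
To make the difference between a win/loss reward and a score-difference reward concrete, here is a minimal sketch. It assumes a logistic squash of the score difference into the 0 to 1 range; the function names and the `scale` value are hypothetical and chosen only for illustration, not taken from the actual bot.

```python
import math


def win_loss_reward(my_score, opp_score):
    """Classic MCTS reward: only the winner matters, not the margin."""
    if my_score > opp_score:
        return 1.0
    if my_score < opp_score:
        return 0.0
    return 0.5


def score_diff_reward(my_score, opp_score, scale=20.0):
    """Reward in (0, 1) that keeps growing with the score margin.

    `scale` is an assumed tuning constant: a smaller value saturates
    faster, a larger value keeps the reward closer to 0.5.
    """
    diff = my_score - opp_score
    return 1.0 / (1.0 + math.exp(-diff / scale))


# A +1 win and a +20 win are identical for the win/loss reward,
# but the score-difference reward prefers the safer +20 margin.
narrow = score_diff_reward(101, 100)
wide = score_diff_reward(120, 100)
```

With rewards like these, values cluster closer to 0.5 than pure 0/1 outcomes, so a c tuned for win/loss rewards will probably explore too much and need retuning.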

Not completely: I also sometimes had 20 possible seeds (but it was unusual). But an important part of the pruning is also the fact that I only allow seeding if there are no seeds, so even if I might have 400 children at some point when both players have no seed in play, it will only be at a specific depth and not for all nodes. Also remember that with DUCT you don’t have to explore those 400 children: once each player has tried each of his 20 actions, he picks one of his actions again based on the UCT formula, so some of the 400 children get explored often while others will never be visited.
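
The selection step described above can be sketched as follows. This is a simplified illustration of decoupled UCT under stated assumptions, not the actual bot code: the `PlayerStats` class, `select_joint`, and the value `c=0.5` are all hypothetical. Each player keeps statistics for his own actions only, and a child node is created lazily for the pair of actions that was actually selected.

```python
import math


class PlayerStats:
    """Per-player action statistics stored at one DUCT node."""

    def __init__(self, n_actions):
        self.visits = [0] * n_actions
        self.total = [0.0] * n_actions

    def select(self, c=0.5):
        # First try every action of this player once.
        for action, n in enumerate(self.visits):
            if n == 0:
                return action
        parent_visits = sum(self.visits)

        def uct(action):
            n = self.visits[action]
            mean = self.total[action] / n
            return mean + c * math.sqrt(math.log(parent_visits) / n)

        return max(range(len(self.visits)), key=uct)

    def update(self, action, reward):
        self.visits[action] += 1
        self.total[action] += reward


def select_joint(me, opp, children, c=0.5):
    """Each player picks independently; the child for the pair is lazy."""
    a, b = me.select(c), opp.select(c)
    children.setdefault((a, b), {})  # placeholder for a real child node
    return a, b
```

With 20 actions per player, the first 20 iterations create only 20 of the 400 possible children (one per pair of first tries), and later iterations mostly revisit the pairs made of each player's best-looking actions.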
