I ended up #2
Algorithm
I used a simultaneous MCTS variant (DUCT), where both players share the same tree and have their decisions decoupled at each depth. I chose this algorithm because I wanted to take the opponent into account precisely (something a Beam Search can’t really do), while still being able to search deep enough to measure the impact of a seed or a grow.
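The core of DUCT is that each node keeps independent statistics per player, and each player picks its UCB1-best action on its own; the joint pair of actions selects the child. A minimal sketch (the node layout, field names, and exploration constant are my own illustration, not the author's code):

```python
import math

def ucb1(total_value, visits, parent_visits, c=1.4):
    """Standard UCB1 score; unexplored actions are forced first."""
    if visits == 0:
        return float("inf")
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

def duct_select(node):
    """Decoupled selection: node['stats'][player][action] = {'n': visits,
    'w': summed value}. Each player maximizes UCB1 independently; the
    resulting joint action indexes the child node."""
    joint = []
    for player in (0, 1):
        stats = node['stats'][player]
        best = max(stats, key=lambda a: ucb1(stats[a]['w'], stats[a]['n'], node['n']))
        joint.append(best)
    return tuple(joint)
```

Backpropagation then updates both players' statistics for their own chosen action, so neither player's averages are conditioned on the opponent's simultaneous choice.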
Action pruning
A problem with MCTS is that the search space can become really huge, and with all the seeding possibilities I had to do some pruning.
I pruned the seed actions by computing a shadow map for the next 6 sun orientations, in which I only took my own trees into account, as I really don’t want them to shadow each other. I also wanted to consider the opponent’s trees, to cast shadow on them when they are at their highest stage, but didn’t manage to. I then restrict seeding to the best cells according to this shadow map (also taking richness into consideration).
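This shadow map can be sketched on an axial-coordinate hex grid. Everything below is an illustrative assumption (the coordinate scheme, the sun-rotation indexing, and the tie-breaking on richness), not the author's exact implementation; in the game a tree casts a shadow of length equal to its size in the sun's direction:

```python
# Six axial hex directions; the sun is assumed to rotate through them daily.
HEX_DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def shadow_map(my_trees, day, horizon=6):
    """my_trees: {(q, r): size}. Returns {cell: number of times that cell is
    shadowed by my own trees over the next `horizon` sun orientations}."""
    shadows = {}
    for step in range(1, horizon + 1):
        dq, dr = HEX_DIRS[(day + step) % 6]
        for (q, r), size in my_trees.items():
            for dist in range(1, size + 1):  # shadow length == tree size
                cell = (q + dq * dist, r + dr * dist)
                shadows[cell] = shadows.get(cell, 0) + 1
    return shadows

def best_seed_cells(candidates, my_trees, richness, day, keep=3):
    """Keep the `keep` candidate cells least shadowed by my own trees,
    breaking ties toward higher richness."""
    shadows = shadow_map(my_trees, day)
    score = lambda c: (shadows.get(c, 0), -richness.get(c, 0))
    return sorted(candidates, key=score)[:keep]
```

Only the surviving cells are expanded as SEED actions in the tree, which is what keeps the branching factor manageable.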
To further reduce the search space, the actions are ordered COMPLETE -> GROW rank 3 -> 2 -> 1 -> SEED, so I can’t take a COMPLETE action after a GROW one, for example.
I also forbade taking a SEED action if I already had a seed on the board.
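Both constraints amount to a simple filter over candidate actions. A hedged sketch, assuming each action is tagged with a type and each type has a fixed rank (the tuple encoding and names are mine):

```python
# Fixed ordering: once an action of some rank is taken within a day,
# only actions of equal or later rank remain legal.
ORDER = {'COMPLETE': 0, 'GROW3': 1, 'GROW2': 2, 'GROW1': 3, 'SEED': 4, 'WAIT': 5}

def legal_ordered(actions, last_rank, have_seed):
    """actions: list of tuples like ('GROW2', tree_id).
    last_rank: ORDER[...] of the last action taken this day."""
    out = []
    for a in actions:
        kind = a[0]
        if ORDER[kind] < last_rank:
            continue  # e.g. no COMPLETE after a GROW
        if kind == 'SEED' and have_seed:
            continue  # at most one seed on the board at a time
        out.append(a)
    return out
```

The ordering removes permutations of the same action set, so each combination of moves within a day is explored only once.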
Rollout
I am doing semi-random rollouts: the action type is determined with some heuristics which are almost identical to what @Magus did, and is then applied to a random awake tree:
- Complete a tree after day 11 if I have more than 3 rank 3 trees and the opponent has at least 1 (if he has no rank 3 trees there is no need to rush). All the remaining trees are completed on day 23 if they are worth at least one point.
- Grow a tree (with a larger probability to select a bigger tree). This action is disabled if day > 21.
- Seed if I have no seeds.
- Wait otherwise.
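The priority list above can be sketched as a single policy function. The state fields, the grow-bias weights, and the simplified day-23 completion rule are my own assumptions, not the author's exact heuristics:

```python
import random

def rollout_action(state, rng=random.Random(0)):
    """state: dict with 'day', 'my_rank3', 'opp_rank3', 'have_seed',
    'awake_trees' (list of (tree_id, size))."""
    trees = state['awake_trees']
    day = state['day']
    # Complete after day 11 when I hold > 3 rank-3 trees and the opponent
    # has at least one (no need to rush otherwise); complete on day 23.
    rank3 = [t for t in trees if t[1] == 3]
    if rank3 and ((day > 11 and state['my_rank3'] > 3 and state['opp_rank3'] >= 1)
                  or day == 23):
        return ('COMPLETE', rng.choice(rank3)[0])
    # Grow, biased toward bigger trees; disabled if day > 21.
    growable = [t for t in trees if t[1] < 3]
    if growable and day <= 21:
        weights = [1 + size for _, size in growable]  # bigger => more likely
        return ('GROW', rng.choices(growable, weights=weights)[0][0])
    if not state['have_seed'] and trees:
        return ('SEED', rng.choice(trees)[0])
    return ('WAIT',)
```

Keeping the type choice heuristic but the target tree random preserves enough variance for the averages to be meaningful while steering rollouts away from obviously bad lines.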
Evaluation
Instead of doing full rollouts I had better results by stopping them at day + 12 (so they still go to the end for the last half of the game) and then evaluating them:
- A mix of trees, current score and sun if day < 23
- The final score for day 23
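A minimal sketch of that cutoff evaluation; the weights below are placeholders of my own, not the author's tuned values:

```python
def evaluate(state):
    """state: dict with 'day', 'score', 'sun', 'tree_sizes' (my trees)."""
    if state['day'] >= 23:
        return state['score']  # game over: final score only
    # Before day 23: a mix of trees, current score and sun.
    tree_value = sum(state['tree_sizes'])
    return 2.0 * tree_value + 1.0 * state['score'] + 0.5 * state['sun']
```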
The back-propagated values are not 0 and 1 but (my_eval - opponent_eval) / 40 + 0.5, so the algorithm will still fight when the game is probably lost, and will try to improve the score difference when it is winning.
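As a one-liner this is the formula above mapped onto the usual [0, 1] reward range; the clamp is my own addition to guard against eval gaps larger than 40:

```python
def backprop_value(my_eval, opp_eval):
    """Score-difference reward: 0.5 is a draw, clamped to [0, 1]."""
    v = (my_eval - opp_eval) / 40 + 0.5
    return max(0.0, min(1.0, v))
```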
Performances
It has been a long time since we had a game where raw performance directly leads to a better ranking, but it was definitely the case with MCTS. So I spent some time optimizing to reach on average 15k rollouts during the first half of the game (most of the time ended up being spent choosing the semi-random actions).
Conclusion
I’m quite happy with this second place considering that I couldn’t really compete with the NNs.
I really liked the game, especially the fact that there was no randomness at all after the initial positions.