Spring Challenge 2021 - Feedbacks & strategies

Fluxor · May 17, 2021, 2:53pm

I did the exact same thing, but I struggled to incorporate this information properly in my evaluation:
In my final version (ranked #33 in legend)

I considered the waiting opponent to get sun points at the end of the day in my simulation
I used the growing opponent to compute guaranteed (pessimistic) sun production and potential shadow cast on opponent (optimistic) as part of my heuristic.

But your 2/3 + 1/3 trick in the simulation is much more elegant, and if I switch to it (which is basically 3 lines of diff), I get 60% win against my version. Too bad I could not think of it…

MichaelPu · May 17, 2021, 2:58pm

I also think that training NN to output a value of a state could be a good idea in this game. Will probably try it myself sometime.

The_Auditor · May 17, 2021, 2:59pm

This is similar to my experience, the jump from if else heuristics to true AI is very steep. Making an AI solution that beats the if else heuristics is not easy.

Numby · May 17, 2021, 3:14pm

Rank: 698/1608 Gold (925/6867 Global)
Language: Java (~1700 LOC)
Strategy: MCTS (simulteneous), tree reuse, pruning, eval
Time Spent: 50~60h (much of it debugging)

Overview

I’ve competed on CodingGame before but this was the first time I really went for it. I’m happy that I implemented a simultaneous MCTS with storing the state between turns and managed to code up most of what I intended. However, due to poor performance of my sims, taking too much time to calculate possible actions and memory issues my code couldn’t really make good use of tree search.

Overall despite the struggles I had a blast, learned a lot and I’m sure to come back for future contests.

Initial Strategy

I went for simulating the game pretty early. My initial strategy was, starting with the current board state, to go through each possible action and (as many times as possible) play through the game from that action using random moves. After move time runs out I pick the action with the highest winrate. Code snippet:

List<ActionWithScore> possibleActionsWithScore = board.getPossibleActionsWithScore;
while (hasTimeRemaining()) {
    for (ActionWithScore actionWithScore: actionsWithScore) {
        actionWithScore.playThroughAndUpdateScore();
    }
}

// Sort by win rate
Collections.sort(possibleActionsWithScore);
return possibleActionsWithScore.get(0).getAction();

After ironing out my game simulation bugs I was surprised this got me as far as ~700 gold on gold opening. At best I was running ~2k sims a turn and unfortunately this number didn’t really improve after many tweaks due to performance issues.
I was certain the initial strategy was suboptimal because of purely random move choice so I wanted to try MCTS next.

Board Representation

I stored most of the information about the board as int[37] where each int holds all the information of the cell. Later, I found out about bitboarding and the possibility to store the entire board in just few ints, and even compute the shadows with bit shifts with a hex bitboard. I chose not to pursue this representation because at all times my main bottleneck appeared to be generating the list of possible moves at each state. In hindsight, implementing the leaner bitboard board representation would’ve allowed me to more easily compute shadows and do more interesting things with pruning the possible actions using heuristics.

MCTS (simultaneous)

This was my first time implementing any game tree search algorithm (haven’t even done minimax before) so it was definitely a struggle. My initial MCTS used alternating nodes for me and opponent until I realized this gives opponent nodes information about what action I had taken.
I re-wrote the MCTS to be simultenous by storing both actions in a node, and chosing actions independently for each player based on the UCB. I believe this is MCTS DUCT although I found about the paper later and haven’t read it carefuly.

I also implemented storing the tree between moves and changing the new root to the current node whenever it existed (i.e. my opponent move was one of the expanded children of the prior root). I thought this was really cool but it didn’t end up mattering much due to general performance issues with my code.

Pruning

For pruning the possible moves I’ve used fairly conservative rules and probably should’ve spend more time to make pruning more aggressive:

Seed: use preferred seeding indices (where the seeding tree won’t shadow itself) in the first 18 days. Also in the first 14 days don’t seed at all in a cell that would shadow with any of my trees (assume tree size 3 for the seeded cell to make this easy)
Grow: don’t grow beyond 5 trees of the next size
Complete: don’t complete in the first 11 days. Also don’t complete if I have less than 3 trees of size 3 and less than 5 trees of size 2 in the first 20 days (to avoid sun starving).

Scoring

I spend a lot of time tuning my scoring function. My intuition is that if the scoring function is a perfect predictor of the end game score from that state (assuming optimal moves) then you’ve basically solved the game. This is also why a NN might be a good fit (it just has to map a board state to score).
My scoring as usual was fairly complex. It took into account current scores, nutrients, sun, current day, the number of trees of each size and the number of directions from which those trees can be shadowed (i.e. across all the 6 directions). After tuning it in the web IDE it looked to be a pretty decent predictor of who’s winning.

Lessons Learned

What went well

Unit testing and ensuring that simulating the game works correctly
Asserts (pre-submission) catching most of my bugs
Scoring function seemed to work well
Created a Timer utility class which was handy in diagnosing issues and could easily by turned off

What went poorly

Code bloat and trying to code in 1 file for the first few days (after which I split up code files and started a python script to stitch)
A lot of time diagnosing random timeout spikes which turned out to be GC
Expensive MCTS Node representations (containing a map, multiple lists and few other variables each)

Where I got lucky

Adding a line to access all my static fields (to force my static initializer blocks to run) and adding a System.gc() at the start of main solved most of my timeout issues

PS (edit after reading posts above): Thanks for the great PMs, you guys are truly inspiring and I’ll revisit the strategies without the time pressure over the next few months. Also thanks to game makers and GC for hosting; this was a great game/competition!

skotz · May 17, 2021, 3:36pm

Legend 42 with C#

I used minimax with alpha-beta pruning and iterative deepening, and my heuristic was based on score, accumulated sun, tree counts, and tree values. Weights were trained offline by exhaustively running through ranges of values for each variable.

My game state used bitboards (ulong) for tree-related stuff and ints for everything else, so cloning states was a decently fast operation.

At the heart of my search was good move filtering and ordering. Since the majority of possible moves in each position were seed moves, I filtered out any that would end up casting shadows on my others trees within 2 tiles. This allowed me to search much deeper while also spacing out my trees so they could collect more sunlight. I also filtered out moves based on a few dozen expert-system-like observations to help nudge the search away from bad openings or from running out of momentum in the middle game.

For move ordering, I prioritized COMPLETE move first, then GROW moves, then SEED moves, then WAIT moves. Weights for richness and existing tree counts for each tree level were also included. This move ordering greatly improved my cutoff rate and consequently my search depth.

Late in the week I switched from simulating moves to simulating states, which helped a lot since prior to this I wasn’t fully implementing the game rules. At the end of the competition I was pretty consistently searching 14 moves (7 turns) deep.

By far my biggest jump in rankings was when I fixed a bug that was charging all my opponent’s actions to my sun bank instead of theirs, and that took me from 200th in gold to 40th in legend. Probably the thing I wasted the most time on was fixing bugs for the way I was incrementally updating my shadow map during playouts.

Additionally I built tools for training weights, automatically combining my code from multiple files into one, generating images for game states, serializing inputs from real matches so I could easily debug specific states locally, and parsing games from top players so I could analyze patterns.

Oh yeah, and on my first move I precalculated lookup tables for seeds and shadows for every day (mod 6), tile, and tree size.

Thank you for running this competition! It was a lot of fun.

MSmits · May 17, 2021, 3:38pm

I’ll keep it (relatively) short. I promised to share some bitboard stuff, but Kodle already did so. I think I’ll write a separate article at some point with some sims for popular board games so that people can learn how to simulate them efficiently.

My search was beamsearch. In hindsight I probably should have gone with MCTS. I am better at that and I spent way too much time fixing bugs in my beamsearch (no idea why, it ought to be simple). I coded a single player MCTS on sunday, in just 2 hrs, bug free. It just wasn’t good. Should have tried a 2 player one, but not enough time left.

For Beam Search eval parameters are the most important part. I had:

Richness (separate parameter for each tree type)
Suns
Future suns (similar to eulers)
GameScore
Shadow region overlap penalty (if a tree is within 3 hexes on a straight line to another tree it adds a penalty)

Search depth in the end roughly 6-7 days I think (30 turns, but waiting on the slowest state). Beam width 1k was best. I could simulate 250k states (actions) per turn. So performance was really not an issue. This game was all about clever pruning and eval.

I’m of two minds when it comes to my result. If you had told me I’d be lower legend at the start of the contest, I’d be disappointed, but after I was doing so poorly on wednesday (still in silver) I became very relieved when I finally beat gold boss on saturday. Tried more things on sunday, but not of them gave me improvements. So my final code is the code that beat gold boss.

I had a good time. Thanks to struct for helping me setup brutaltester and congratulations to the winners, especially recurse, with his monster of a NN.

AllYourTrees · May 17, 2021, 3:50pm

Finished 150 legend with rules based bot.

Main approach landed me in gold. The thing that got me to legend was actually watching @reCurse games and synthesizing his strategy into rules. Had no idea it was a NN at all.

Here was the simple bot I ended up with:

seed on diagonals only, or distance 3 in a straight line. break ties by richness. nothing else for seeding
don’t complete before day 12, after day 12 complete one tree a day. prefer the tree that is blocked next turn or blocks your own tree next turn
only grow to size 3 if either it would block an opponents tree next turn or if it would unblock the tree you’re growing from your own tree next turn.
if you have any grows < size 2 that would block or save your own tree from getting blocked next turn, do those
otherwise grow the tallest tree
some other stuff for endgame for day >= 20 too to make sure it’s not wasting energy

there were obviously a lot of other differences and behaviors that I didn’t capture from watching his strategy, but thought it was pretty decent for just being rules. For example sometimes reCurse bot would complete more than 1 tree in a turn but I didn’t figure out the condition for that. I think it also prioritized seeds very differently from what I did, and it’s endgame was different as well

Saelyos · May 17, 2021, 4:16pm

I ended up #2

Algorithm

I used a simultaneous MCTS version (DUCT), where both players share the same tree and have their decision decoupled at each depth. I chose this algorithm because I wanted to take precisely into account the opponent (what a Beam Search can’t really do), while being able to go deep to measure the impact of a seeding or grow.

Action pruning

A problem with MCTS is the fact that the search space can become really huge, and with all the seeding possibilities I had to do some pruning.

I pruned the seed actions by computing a shadow map for the next 6 sun orientations, for which I only
took into account my own trees as I really don’t want them to shadow themselves. I wanted to also
consider the opponent trees to make shadow on them when they are at their higher stage, but didn’t manage to. I then reduce the seeding to the best cells according to this shadow map (taking also richness into consideration).

To further reduce the search space the action are ordered COMPLETE -> GROW rank 3 -> 2 -> 1 -> SEED, so I can’t take COMPLETE action after a GROW one for exemple.

I also had the limitation to not take a SEED action if I already had a seed.

Rollout

I am doing semi-random rollouts : the action type is determined with some heuristics which are almost
identical to what @Magus did, and then applied to a random awake tree :

Complete a tree after day 11 if I have more than 3 rank 3 trees, and if the opponent has at least 1 (if he has no rank 3 trees there is no need to rush). All the remaining trees are completed at day 23 if they are worth at least one point.
Grow a tree (with a larger probability to select a bigger tree). This action is disabled if day > 21.
Seed if I have no seeds.
Wait otherwise.

Evaluation

Instead of doing full rollouts I had better results by stopping them at day + 12 (so it will still go to the end
for the last half of the game), and then evaluate them :

A mix of trees, current score and sun if day < 23
The final score for day 23

The values back-propagated are not 0 and 1 but (my_eval - opponent_eval) / 40 + 0.5, so the algorithm will still fight when the game is probably lost, and will try to improve the score difference if it is winning.

Performances

It has been a long time since we had a game where having more performances directly leads to a better ranking, but it was definitely the case with MCTS. So I spent some time to improve the performances to reach on average 15k rollout for the first half of the game (most of the time ended up being spent choosing semi-random actions).

Conclusion

I’m quite happy with this second place considering that I couldn’t really compete with NN.
I really like the game, especially the fact that there was no random at all after initial positions.

MichaelPu · May 17, 2021, 4:16pm

55th total, 3rd python

Board representation
A uint64 numpy array:
Each bit in a number represents the location on the board.
Each ‘row’ was 9 bits wide, to allow 3 binary shifts without landing on other legal cells.
Each number represented different type of info about the cell.

Search
Modified beam search, with max beam width of 100.
The states were trimmed after every action.
All the possible actions were calculated “simultaneously” for all states as a single big numpy array.
The way to overcome variable possible actions per turn - marking (virtual) ‘WAIT’ as the only possible action after the (real) ‘WAIT’.
Whenever both players can only wait, the analysis of a single turn is over.

Seeding, growing, completing
An interesting trick was to do all those simultaneously on all the cells by:
new_state[relevant_cells] = np.roll(old_state[relevant_cells]) & grow_mask
Because the states were ordered as:
MY_0, MY_1, MY_2, MY_3, EMPTY, OP_0, OP_1, OP_2, OP_3
So for action on my trees:
relevant_cells = MY_0, MY_1, MY_2, MY_3, EMPTY
For opponent action:
relevant_cells = EMPTY, OP_0, OP_1, OP_2, OP_3

Heuristic
Estimated the sun production for the rest of the game by casting shadows in all directions and calculating the average produce of the trees.

During the game, I estimated exponential growth in sun production (by similarity to interest in finance)
During the endgame (last ~6 turns), the sun at endgame was calculated linearly, because significant tree upgrades are unlikely.

opponent prediction
Used the same code that calculates my own actions, just instead of expending the search, it just chooses the best action per state according to the same heuristic.

Hardcoding
Around gold league, the bots surpassed human level, and it was virtually impossible to understand what’s going on in the midgame.
I intentionally didn’t hardcode any part of the game - beginning or end.
This allowed me to easily test the bot and see if the combination of the search and heuristics led to the correct actions.
This helped with finding a few hidden (but nasty) bugs…

Feedback
Just like @kovi, I had a feeling this would be simple optimization, and was glad to be proven wrong. The problem was more complex than it looked like!
Hope the next contest would be with bots that don’t surpass human level so quickly though.

Thanks for just another great contest!

TwoSteps · May 17, 2021, 4:39pm

710th total, 120 lines of Java

First time in a while I don’t spend time writing a proper AI. I hadn’t much time nor motivation so I just wanted to see how high I could get with really simple code (the kind of code you would write to get out of Wood league): no classes, just a few ifs, no shadow/distance computation and ignoring the opponent.

Main “strategy”:

don’t seed near your trees (actually hard-coded part of the 2-tile distances)
start harvesting trees mid-game
grow closer to the center

Pretty sure the top 500 is reachable with a bit of parameter tweaking (and code golfing to get under 100 loc)

SlyB · May 17, 2021, 4:46pm

Thanks a lot for sharing your MCTS details.
And congratulations.
I also tried the D-UCT, but my pruning was may-be not enough for SEED.
I could have around 20-25 SEED actions. So, 20*20 makes regularly nodes with 400-500 childs.
This was greatly limiting the depth, and spending most of the MCTS in exploration.
Did you manage to get this branching factor down with the shadow maps for seed?

kgautron · May 17, 2021, 4:51pm

I enjoy these bitboard stuff very much but i don’t get how to use this one.

Does each bit represent the existence of a tree at the given cell ? In that case do you have one separate integer for each treesize/owner combination ? How do the 1/8/9 shifts you mentioned work here ?

Would love any further explanation on this

SlyB · May 17, 2021, 5:04pm

Hello, I ended up 137, with a MCTS (using the same kind of opponent heuristic already described), for days <19.
For the days >=19, I used a DUCT. I wanted to use a DUCT for the complete 24days, but my SEED pruning was not enough.

I’ll describe here, something that was not yet done. May-be this could be interesting.
In order to unit test my engine, I was guessing the opponent actions, playing them (with my engine) on the previous State, and comparing the State coming from CG and the one from my engine. If any difference, I was just timeouting.
Some of you already guessed the opponent’s action when there’s only 1 action, but I did it even when the opponent had several actions.
The problem here is to play all the actions combinations (=N! ).
But, it appears that you don’t need to do it in fact. If you sort correctly the actions: Complete, Grow3, Grow2 Grow1, SEEDs and WAIT, you don’t need to check all the combinations.
In fact, quite all the time it works on the first combination (GrowXs can be in any combination). I have very rare cases, where I need to check other combinations. And I limited then to 25 max, to save time.

I enjoyed this contest, the ranking is clean, and it allows many ways to reach Legend.
So, thanks a lot for this Contest, and congratulations for the top players, especially reCurse, Chapeau bas.

kurnevsky · May 17, 2021, 5:44pm

Hi, thanks for the PM.

The values back-propagated are not 0 and 1 but (my_eval - opponent_eval) / 40 + 0.5

How does this work? Is it a specific of DUCT or this game? Because with plain UCT if you just add score to the win amounts it will end up choosing riskier moves that might give higher score.

Wontonimo · May 17, 2021, 5:51pm

Finished 212th Legend. My first contest on CG and loved the community chat and help.

I was just going to submit an if/else bot with hard coded heuristics and got to mid Gold. With 4 days left I got into it and started on a MonteCarlo. On the journey to full MCTS I first implemented simple 1ply MonteCarlo (no tree) and that got me into Legend. It was able to perform about 3K minimally-informed-rollouts using a stripped down version of my if/else bot. It was hard coded to run 100 simulations per 1st move and sometimes timed out when there were too many moves.

I then started on DUCT MCTS but ran out of time to address performance issues. Great event. Thanks to all the regulars who showed up on world chat, you made this a super fun event. I can’t mention you all, but I hope you know who you are.

BrainSolver · May 17, 2021, 5:55pm

Does each bit represent the existence of a tree at the given cell ?
Yes, and you need other bitBoards for the calculation of actions, shadows and rewards. (one per size)

In that case do you have one separate integer for each treesize/owner combination ?
I have an isMine bitBoard, but you can also separate, it’s a choice to make.

How do the 1/8/9 shifts you mentioned work here ?
Here is for example the projected shadow of 4 trees of size 1:
bitboard

kgautron · May 17, 2021, 6:26pm

Ok i see, so the shift allows to compute all the neighbors towards a given direction in a single operation (one direction per type of shift).
Nice.

Thanks for sharing and the explanation.

struct · May 17, 2021, 6:52pm

Hi,
Finished on rank 44.

Search:
I used chokudai search, not sure if it was any improvement over beam search.
The beam width started at 1200 and increased by 300 every time it completed, I was searching 6 days deep.
I tried adding enemy dummy but failed.

Heuristics:
My only heuristics were to allow completes only after turn 12 and only allow 1 seed at a time on the board.

Bitboard:
I used a bitboard that had a minimum padding of 3 bits (will be explained bellow).

constexpr uint64_t inside = 0b0001111000001111100001111110001111111000111111000011111000001111;

I had an array of masks to get the seed actions.

constexpr array<array<uint64_t, 64>, 3> seedMasks

Then I could do

const int cost = __builtin_popcountll(trees[player][0]);
for (int i = 1; i < 4; ++i) {
	uint64_t available_trees = trees[player][i] & ~dormant;
	while (available_trees) {
		const int source = _bit_scan_forward(available_trees);
		const uint64_t mask = seedMasks[i - 1][source];
		uint64_t available_cells = mask & usable & ~all_trees;

		while (available_cells) {
			const int target = _bit_scan_forward(available_cells);
			generated_actions.emplace_back(ActionType::SEED, source, target, cost); // type, source, target, cost
			available_cells &= ~(1ULL << target);
		}
		available_trees &= ~(1ULL << source);
	}
}

The reason I left 3 bits padding was to prevent shadows from overlapping, without having to worry about overlaps I can simply do the following to get all the shadowed cells:

constexpr int amount[6] = { 1, 9, 10, 1, 9, 10 };
uint64_t shadow = 0;
const int mod = day % 6;
const int value = amount[mod];

if (mod >= 1 && mod <= 3) { // LEFT SHIFT
	shadow |= ((merged_trees[1] << value) & ~(merged_trees[2] | merged_trees[3])); // Height 1
	shadow |= ((merged_trees[2] << value | merged_trees[2] << (value * 2)) & ~(merged_trees[3])); // Height 2
	shadow |= (merged_trees[3] << value | merged_trees[3] << (value * 2) | merged_trees[3] << (value * 3)); // Height 3
}
else { // RIGHT SHIFT
	shadow |= ((merged_trees[1] >> value) & ~(merged_trees[2] | merged_trees[3])); // Height 1
	shadow |= ((merged_trees[2] >> value | merged_trees[2] >> (value * 2)) & ~(merged_trees[3])); // Height 2
	shadow |= (merged_trees[3] >> value | merged_trees[3] >> (value * 2) | merged_trees[3] >> (value * 3)); // Height 3
}

BrainSolver · May 17, 2021, 7:15pm

gg , your minimum padding of 3 avoids some masks compared to my minimum padding of 2.

rafaelSorel1 · May 17, 2021, 7:19pm

Thanks for the feedback though, what was the search depth per day and the overall average of processed nodes for the 6 days ?