The topic where you can tell how you liked the challenge, & how you solved it.
Thanks to Codingame and Fiverr for organizing it, and congratulations to @MSz, @reCurse and @Nanaeda for the top 3!
I ended up around 56th place in Legend. This was an odyssey for me: I lost a lot of time to factors completely out of my control, and I was still stuck at the top of Gold in the final hours of the contest, but I reached Legend for the first time and I am very happy with my bot!
I believe part of the reason I got this far is that I wrote my bot in Rust. Historically I have used C++, but all the modern features and crates that come with Rust make everything much faster and easier for me to deal with.
The first thing I worked on when the contest started was simulating the game. For this I created a bunch of structs and methods for all of the minigames and the entire state of the game. When a minigame ended, I generated its random replacement using the logic from the game's source code on GitHub.
For example, to generate a new set of wind speeds for the archery game, I copied over the list of probabilities for each wind speed from the game's source code and randomly sampled from that list in my simulation. It didn't matter much if the code was suboptimal at this point; I could always optimize it later.
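Roughly, the sampling looked something like this (a sketch; the weight table and names are placeholders, not the referee's actual values):

```rust
use rand::Rng;

// weights[s] is the relative probability of wind speed s (illustrative values).
fn sample_wind(weights: &[u32], rng: &mut impl Rng) -> usize {
    let total: u32 = weights.iter().sum();
    let mut pick = rng.gen_range(0..total); // uniform in [0, total)
    for (speed, &w) in weights.iter().enumerate() {
        if pick < w {
            return speed; // cumulative weight reached: this speed is chosen
        }
        pick -= w;
    }
    unreachable!("pick is always below the total weight")
}
```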
The great thing about how I organized my code was that I could set up a local referee that used this simulation code, while my bot used the exact same code. That way I didn't have to worry about duplicated code introducing a mistake in one of them but not the other.
But what if both the referee and the bot are incorrectly simulating the game? For that I decided to set up some unit tests.
I played a game in the web browser and manually collected the inputs given to my bot for all 100 turns by printing them to the standard error stream with eprintln!(), and I also recorded the outputs of all three bots at every turn. I then created a big unit test locally to read in these inputs and outputs.
Next, for every turn I fed the recorded input string and each player's outputs into my simulation and compared the input string it produced for the next turn with the one I had recorded, ignoring the GPU string whenever a new minigame started, since I obviously couldn't predict that. (I did consider trying to crack the random seed, but CodinGame recently switched to a cryptographically secure random algorithm, so that was out of the question.)
To my surprise, the simulation was almost perfect out of the box; there were a few small mistakes, but I fixed them and then had accurate simulations!
Once I had the simulations working, the next step was to implement a framework that could take in an algorithm and use that to find the best action for me to take during every turn of the game.
I like making my code as general as possible because it allows me to reuse parts, so I created traits (you can think of them like interfaces if you have never used Rust) for game states, game actions and players. This way I could write my algorithms independently from the game code and avoid having to focus on specifics.
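The traits looked roughly like this (a simplified sketch; the real ones had more methods):

```rust
// Simplified sketch of the abstraction layer between algorithms and the game.
trait GameState: Clone {
    type Action: Copy;
    fn legal_actions(&self, player: usize) -> Vec<Self::Action>;
    fn apply(&mut self, actions: &[Self::Action]); // one action per player, applied simultaneously
    fn is_terminal(&self) -> bool;
    fn score(&self, player: usize) -> f64;
}

// An algorithm only needs to implement this to plug into the framework.
trait Player<S: GameState> {
    fn choose_action(&mut self, state: &S, player: usize) -> S::Action;
}
```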
Also, I decided to allow my algorithms to simulate until the end of the game even if the states were heavily impacted by randomness because algorithms such as Monte Carlo tree search should be able to adapt and learn from the random variation of these states.
Once my algorithm framework was set up, the first thing I tried was beam search. It failed miserably, but after debugging locally with VSCode and CodeLLDB I realized that the issue wasn't the algorithm but my scoring: I was using the product of the minigame scores, and that product was essentially zero for most of the game, so the beam search was picking moves more or less at random. To fix this, I wrote a heuristic scoring function based on different aspects of the games (how far ahead I was in hurdles, how close I was to the centre in archery, etc.), and this shot me straight up to the top of Silver.
Because I had a local referee, I could also write code to test different hyperparameters, which helped a lot; but after messing around with it for a while it became apparent that beam search wasn't going to get me higher than the very bottom of Gold.
The next thing I tried was Monte Carlo tree search with a branching factor of 4 moves ^ 3 players = 64 children per node. At first I implemented it with a plain UCT score, but then I read about Decoupled UCT (DUCT) and consulted a couple of papers to implement it in my code. I then optimized my hyperparameters, and surprisingly my heuristic wasn't helping anymore, so I got rid of it.
This sent me up to around the middle of gold, but yet again I couldn’t find anything else to improve in my code until I realized that I was only doing about 2000 simulations at every game turn. This was very surprising to me because I thought my code was efficient.
I was about to focus on optimizing my code when I noticed something. I saw MSmits in first place and a light bulb turned on in my head.
Everything was falling into place. I had read about Smitsimax before (MSmits has a very good write-up on the algorithm), and the main advantage I see in it is that there is a tree per player instead of one tree for the entire game: instead of num_moves ^ num_players child nodes at every level, you only get num_moves * num_players nodes (64 vs 12 for this challenge), greatly reducing the number of nodes that need to be stored in memory.
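The control flow is easy to sketch: each iteration, every player descends its own tree, the joint moves drive one shared simulation, and each tree is updated with its own player's reward. A condensed skeleton, with node selection and the game step stubbed out:

```rust
struct Tree {} // per-player UCT tree (details omitted)

impl Tree {
    fn select(&mut self, _depth: usize) -> usize { 0 } // UCB pick, stubbed
    fn backpropagate(&mut self, _reward: f64) {}
}

#[derive(Clone)]
struct Sim {} // full game state (details omitted)

impl Sim {
    fn step(&mut self, _moves: &[usize; 3]) {}
    fn score(&self, _player: usize) -> f64 { 0.0 }
}

fn smitsimax(root: &Sim, trees: &mut [Tree; 3], iterations: u32, max_depth: usize) {
    for _ in 0..iterations {
        let mut sim = root.clone();
        for depth in 0..max_depth {
            // Each player picks a move from its own tree.
            let moves = [
                trees[0].select(depth),
                trees[1].select(depth),
                trees[2].select(depth),
            ];
            sim.step(&moves);
        }
        // Each tree is updated with its own player's result only.
        for (p, tree) in trees.iter_mut().enumerate() {
            tree.backpropagate(sim.score(p));
        }
    }
}
```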
I strongly suspected (although this might not be the case) that MSmits was using Smitsimax, and since it's not too different from Monte Carlo tree search, it naturally felt like the next thing to try.
I finished working on it and it shot me up to around 100th place in gold. I knew there were more things I could optimize, but I was slowly running out of ideas and time was ticking.
There are a few major things that helped me clear the home stretch into legend:
I talked about this earlier, but I was achieving around 2,000 simulations per turn, and therealbeef mentioned on Discord that he was getting around 250k in C++, so I knew I was doing something wrong. I learnt that RustRover has a profiler (although I really dislike it), and using it I realized that the rand crate was eating up a LOT of time.
I decided to get rid of the crate and write my own random number generator with fancy bit magic, and that got me to around 30k simulations per turn.
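For the curious, a generator in the spirit of what I mean is xorshift64*, a few shifts and one multiply per number (the constants are the standard published ones; my exact variant may differ):

```rust
// xorshift64*: fast and random enough for rollouts.
struct XorShift64 {
    state: u64, // must be non-zero
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        Self { state: seed.max(1) }
    }

    fn next_u64(&mut self) -> u64 {
        self.state ^= self.state >> 12;
        self.state ^= self.state << 25;
        self.state ^= self.state >> 27;
        self.state.wrapping_mul(0x2545F4914F6CDD1D)
    }

    // One of the 4 moves, taken from the high (best-mixed) bits.
    fn next_move(&mut self) -> usize {
        (self.next_u64() >> 62) as usize
    }
}
```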
I also realized that I was allocating a lot of structures on the heap, so I replaced them with my own stack-friendly implementations written from scratch.
Additionally, I precomputed some of the GPU strings to save even more time. For skating there were only 24 possible strings, and for hurdles only 251; but for diving there were over 1 billion possible strings, and for archery almost a quadrillion, with different weights. So I wrote a script to simulate a whole bunch of strings for those two games and kept around 200 of the most frequently occurring ones for the precomputed arrays. These optimizations got me to around 43k simulations per turn.
I am sure that I could have squeezed out a lot more simulations out of my code if I had dedicated more time towards this, but for now this was going to have to do.
Looking on Discord I saw a comment by wontonimo:
A good eval has a smooth(-ish) gradient, and is an underestimate.
Followed by this comment by LazyMammal:
I’m guessing he’s talking about any mini-game that rewards medals. If you only count “whole” medals the reward function will be in discrete steps. It only gets worse if you are evaluating the combined score (product of medal points). Getting a reward of “zero” is much worse than a fractional reward.
This made a lot of sense, because at that point my score was just the product of the medal scores, which was still zero in most cases.
My old heuristic based on individual goals for each minigame didn't work, but adding the number of gold medals I had to the score made the evaluation function smoother and helped Smitsimax choose paths with more certainty. This really helped and shot me up to around 25th place in the Gold league.
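In sketch form, the evaluation became something like this (the gold-medal weighting here is illustrative, not my tuned value):

```rust
// Product of per-minigame medal scores, plus a gold-medal term to keep
// the gradient non-zero while the product is still stuck at zero.
fn evaluate(minigame_scores: &[u32; 4], gold_medals: u32) -> f64 {
    let product: u32 = minigame_scores.iter().product();
    product as f64 + gold_medals as f64
}
```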
There were only a few hours left and I was getting desperate. I couldn't find any more ways to improve my code and was facing a big refactor to raise my simulation count. I decided to re-write a few things such as Smitsimax, and while looking for copper I struck gold: there was a big mistake in my implementation of Smitsimax where I wasn't expanding the trees properly and was dropping the moves of some of the players.
I was honestly surprised that the algorithm even worked without some of the players' inputs, but I rushed to fix the bug and finally had an improvement. I submitted my code, and to my dismay I was losing a lot of games to timeouts. I tried lowering the time limit for Smitsimax from 50ms to 40ms, adding more places for the code to stop, and so on, but nothing worked. Then one of the re-submissions didn't have as many timeouts, and I ended up at around 5th place in the Gold league.
There was nothing left to do but wait anxiously to see whether I would climb to Legend: less than 12 hours remained and I didn't know why I was suddenly getting these timeouts. Luckily, after a few more hours I got the notification that I was in Legend!
I must say this has been a wild ride: a massive storm blew right past my city, leaving us without power and water for several days, and the transformer outside my house decided to blow up a few days before that, so I wasn't able to dedicate as much time to the contest as I had hoped.
Nevertheless this has been extremely fun and I learnt a lot by participating in this contest. Thank you to CodinGame and CGjupoulton for creating this contest, Fiverr for sponsoring it, and the rest of the community for your support and suggestions!
Hello everyone!
Thanks Codingame and Fiverr for this challenge, and GG to everyone!
I finished 170th globally with a heuristic bot based on the intersection of the best moves for every minigame.
More information here if you want: SummerChallenge2024 - Olymbits - Apofils PostMortem - Gold 100th - Global 150th · GitHub
Always a pleasure to participate in these challenges!
Hello!
I finished 172nd.
Here is my postmortem if you're interested: My post mortem for the codingame Summer Challenge 2024 | Yann Moisan
9th in Legend
I used a UCT Forest, with one tree per player.
I liked the game: discrete states and full observability. The only downside was the randomness of the games and the resulting noise in the ratings.
#7 in legend
I am using Smitsimax - so 3xMCTS; one per player. I do rollouts up to depth 30, replacing minigames as they finish. I keep a pool of random minigames generated in turn 0, so I can load them up quickly. If I am finishing the search before turn 100, I add a small amount of bonus points to all minigames, which represent the expected points players will get in the future minigames. I decay this bonus linearly as the game progresses.
I have a heuristic to rate all 4 moves given the game state: an array of size 4 in which the three minigames (hurdles, diving, archery) vote for moves by adding a heuristically chosen number of points to each move (U, L, D, R). For example, in some imaginary game state, the hurdles minigame might vote [0, 4, 4, 10], diving [0, 8, 0, 0] and archery [4, 5, -4, -5]; skating is ignored in this heuristic. The votes are then summed to [4, 17, 0, 5], and I perform the move with the maximum value.
I use this voting in two places: in random rollouts, where I add random noise to the votes (to increase playout diversity), and in progressive bias: an additional term in the UCB1 formula that encourages heuristically good moves while the number of visits in a node is low (without the random noise there).
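In miniature, the voting looks like this (vote values as in the example above; the noise closure returns 0 outside rollouts):

```rust
// Sum per-minigame vote arrays and play the argmax move.
fn vote_move(votes_per_game: &[[i32; 4]], mut noise: impl FnMut() -> i32) -> usize {
    let mut total = [0i32; 4]; // points for U, L, D, R
    for votes in votes_per_game {
        for (t, &v) in total.iter_mut().zip(votes.iter()) {
            *t += v + noise();
        }
    }
    (0..4usize).max_by_key(|&m| total[m]).unwrap()
}

// With the example votes above and no noise, totals are [4, 17, 0, 5]
// and the chosen move is L (index 1):
// vote_move(&[[0, 4, 4, 10], [0, 8, 0, 0], [4, 5, -4, -5]], || 0)
```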
For scoring I use just 0, 1 or 2 depending on the final result, plus a small bonus in the form of the difference of squared points between me and the opponents. The bonus encourages moves that close the point gap even when they don't yet bring victory.
I use an additional tweak to the random rollouts that gives a slight improvement: every odd iteration I do a fully random rollout without the voting mechanism. Not sure why it improves things, but it does. Perhaps my rollouts were too deterministic.
From a performance standpoint, I get about 17k rollouts (to depth 30) on the faster machines and 9k on the slower ones (there are various VMs on CG, and they have different speeds).
PS. Used psyleague, helped a lot!
BlitzProg: 798th (PHP - 1 turn heuristic)
Hi there. It’s been a while!
A game with only 4 commands made it look easy to build a simulation and rush to the Gold league fast. So I made one. Then I had no idea how to use it.
Eventually, I reached my goal without it.
Wood 2
Run until next to the hurdle, then jump.
Wood 1
Same thing, just combine all hurdles from all games into one GPU! (I don't know why the bronze boss loses to this.)
Bottom Silver:
Turns out you can still get half-decent performance with simple 1-turn heuristics. I force-win two medals in the diving game, then weight each move using wind strength x relative direction and how close the next hurdle is to my player.
Rank 2 silver to gold promotion
Wind x direction weighting is very naive, so a basic MC does it instead: random archery rollouts are performed for each move, and the best ending distance is then used to weight the moves.
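As a sketch (in Rust rather than my PHP, with the cursor bounds taken from the statement as I remember them):

```rust
// Tiny archery sim: the cursor moves by the wind speed in the chosen
// direction and is clamped to the arena.
#[derive(Clone)]
struct Archery {
    x: i32,
    y: i32,
    wind: Vec<i32>, // remaining wind speeds
}

impl Archery {
    fn step(&mut self, dir: usize) {
        let w = self.wind.remove(0);
        match dir {
            0 => self.y -= w, // UP
            1 => self.x -= w, // LEFT
            2 => self.y += w, // DOWN
            _ => self.x += w, // RIGHT
        }
        self.x = self.x.clamp(-20, 20);
        self.y = self.y.clamp(-20, 20);
    }
    fn done(&self) -> bool {
        self.wind.is_empty()
    }
    fn dist2(&self) -> i32 {
        self.x * self.x + self.y * self.y
    }
}

// Weight a candidate first move by the best (squared) ending distance
// found over a batch of random rollouts.
fn score_move(start: &Archery, first: usize, rollouts: u32, mut rand4: impl FnMut() -> usize) -> i32 {
    let mut best = i32::MAX;
    for _ in 0..rollouts {
        let mut s = start.clone();
        s.step(first);
        while !s.done() {
            s.step(rand4());
        }
        best = best.min(s.dist2());
    }
    best
}
```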
That’s pretty much it!
I developed an algorithm capable of emulating the four possible moves. For each move, I independently analyze each mini-game to evaluate the game itself and the potential medals.
Then I compare the emulation of each move by adding the four mini-game evals and the new total medal score.
If you want more details on my evaluations, on how I avoid 0-medal scores, and how I prioritized short-term over long-term medals: SummerChallenge2024 - Olymbits - jBigman PostMortem
85th, 23rd in gold
I didn't use any search, just trained a NN to pick the move it thinks is the best.
On the first day I trained directly against the referee using evolution strategies to see if a policy NN was feasible. It was, and that method alone was enough to reach mid Gold. But most of the time I used deep Q-learning to train the network.
Most of my time went into tweaking hyperparameters and the net architecture/inputs. I did several runs from scratch, which gave me several differently trained bots to test against. The bot was easily in the top 100 the whole time, but unfortunately that wasn't enough for Legend. I desperately tried several things at the end, even some search, but it didn't work, because I had never done serious simultaneous-move search before, let alone with 3 players plus randomness. All I can say with pride is that my bot doesn't run any simulation. It was weird throughout the runs to see my bot among the strongest, yet it would lose with 0 points against two default AIs in the IDE because it earned no gold or silver medals in hurdles; since it got all golds in the other minigames, it still thought it was winning (its move scores stayed high).
Given how easy the game was to simulate, I wonder why I didn't try that from the start; 2 weeks should be enough to learn about tree search for such games. Meh, maybe next time.
I really liked the design of the game, the pixel art and the simple rules, though the statement contained some errors and was not 100% clear in places. The viewer could show more detailed information, at least in debug mode; people on Discord were constantly asking why they got 0 points when they had the most gold medals :). It seems using AI for cover images is a tradition now.
Appendix:
The NN has 2 hidden layers with 200 nodes each, 214 inputs and 4 outputs: 214x200 + 200x200 + 200x4 weights. The inputs are from the perspective of me vs. the others. I experimented with various inputs; one tip from Lysk was to encode hurdles relative to the players, which visibly sped up the training (in my plot, the y-axis was win rate against test bots and green was the updated inputs).
Deep Q-learning params:
54th in Legend, 2nd in C.
My first legend finish in a contest!
Reproducible Tests
I did not create an offline tournament this time, as I just end up tuning my code to beat myself, which doesn't always translate into stronger bots.
Instead I found strong, deterministic bots (jack, sullyper) from among the other players and tested against those with a specific list of seeds.
Simulate Accurately
Try to avoid artificial weights and heuristics.
Fixed-depth look-ahead: compute all 4^depth move combinations and outcomes for my player, with depth = 8 (9 if I had time). Compare the player's simulated results to the best/worst possible opponent results, e.g. definitely 1st, possibly 2nd, etc.
For Skating, depth = 1. I considered all 16 combinations of opponent moves, but not equally: opponents are much more likely to move in the direction Diving wants or to jump a Hurdle.
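A sketch of the enumeration in Rust, encoding each of the 4^depth sequences as a base-4 integer; the closure stands in for my real simulation and outcome comparison:

```rust
// Try every move sequence of the given depth and keep the best one.
fn best_sequence(eval: impl Fn(&[usize]) -> i64, depth: u32) -> (i64, Vec<usize>) {
    let mut best = (i64::MIN, Vec::new());
    for code in 0..4u64.pow(depth) {
        // Decode the base-4 integer into one move per turn.
        let seq: Vec<usize> = (0..depth)
            .map(|d| ((code >> (2 * d)) & 3) as usize)
            .collect();
        let score = eval(&seq);
        if score > best.0 {
            best = (score, seq);
        }
    }
    best
}
```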
Combine Results
Weight games where I am behind in medals more heavily.
Games that are in GAME_OVER, won't finish before turn 100, or whose outcome can't be changed (1st/2nd/3rd already locked) can be ignored.
Hi @therealbeef, interesting, thanks! Can you explain the process to gather data and train the NN for each minigame?
#103 Global
Postmortem: GitHub - cegprakash/cg_olymbits_post_mortem
Feedback:
Legend 5th
Thank you CG and all the participants for this contest!
Base search algorithm: Decoupled PUCT without chance nodes.
Decoupled MCTS, an MCTS variant for simultaneous games: XMash Rush (CC07) - Feedback & Strategies - #38 by jolindien
No chance nodes, sometimes referred to as “open-loop”: monte carlo tree search - MCTS for non-deterministic games with very high branching factor for chance nodes - Artificial Intelligence Stack Exchange
Selection: PUCT instead of UCB.
I started with a heuristic similar to @gaha's, where each mini-game votes for its preferred move, to guide the search and prefer promising moves for each player.
Finally I trained a very small network to replicate (via supervised learning) the UCB policy for a single player, by self-playing lots of games with a high number of sims.
Simulation: rollouts until the end of the mini-games, with moves sampled from the same policy network.
Reward is my_score / total_score (with an epsilon-sized free silver medal for all players).
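For reference, the PUCT term is the usual AlphaZero-style one, with the prior coming from the policy network (the constant c is a tuning parameter):

```rust
// PUCT value for one child: exploitation + prior-weighted exploration.
// The child with the highest value is selected.
fn puct(q: f64, prior: f64, parent_visits: u32, child_visits: u32, c: f64) -> f64 {
    q + c * prior * (parent_visits as f64).sqrt() / (1.0 + child_visits as f64)
}
```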
Things that did not work:
Java #Legend.
First time I've done an MCTS during a contest (it always seemed a bit suicidal in Java).
I did a “DUCT”:
But I only get a few thousand iterations each round. If the contest had lasted longer, I would have tried to translate my code to C++.
That was a really fun contest. Really nice idea. Thanks Codingame.
Thanks to the organizers for the contest. I made rank 8, my best contest rank on CG in a long time, as I've mostly been focused on optimization contests on other platforms and boardgame arenas on CG. I'm quite happy with the result and hope to join again next time.
Here’s my post mortem
It’s a bit short, but if I need to explain more, just ask me a question. I’ll answer and change the post mortem also.
Thanks to CG for continuing the good work of setting up regular bot challenges. I reached bottom Legend with a lucky push.
I really liked the very idea of this game, where one action controls 4 different games. I originally thought I wouldn't have time for it, but the concept was so fascinating I had to participate. Although the game was frustrating in many aspects at the high level, I really enjoyed the first week of the challenge and the community discussions with all the familiar faces, like in old times.
Strategy-wise I tried ideas very similar to those mentioned by other players, trying to make Smitsimax work, although I had less luck in my chosen directions. I also think I created way more hyperparameters than needed and couldn't resist tweaking them, which was absolutely pointless considering the randomness of submits.
A few things that actually worked:
A result I'm very proud of! The contest was fun.
I've written a pretty standard DUCT algorithm, nothing too fancy, based on the MCTS algorithm:
In every node of the tree, each player chooses one of the 4 possible moves via the UCB formula. I stop expanding the tree when all 4 games are over. In practice, Rollers is only played at the root node, because beyond that its future states are not known.
The score I back-propagate from the leaves to the root is a weighted difference of the total scores (real or expected) after the 4 games are finished, with the same weights throughout the tree.
For instance, if A is currently winning, he wants to widen the gap with the second player, say B, so the moves A chose are rewarded with the score SCORE[A] - SCORE[B]. If A is right between B and C, I reward his moves with SCORE[A] - 0.5 * SCORE[B] - 0.5 * SCORE[C], and so forth. Locally, on my computer, I found that this score difference gives much better results than the raw score.
In the UCB formula I divide the sum of the rewards by a “normal” score, e.g. an average of the expected scores, so the values land roughly in [-1, 1]; it does not matter much if they don't.
It makes around 40k iterations per turn, and at the end of the search I choose the most visited move.
I use big arrays to get the expected final score of each game in O(1). I assume all players play randomly (like a rollout). For example, in Hurdles, if A, B and C are respectively on cells 13, 22 and 24, with no hurdles in sight, I compute that they have expected scores of 0, 1.58 and 2.92 respectively (under random play).
I couldn't store every full state as that would be too big, so I came up with this approximation: instead of doing A vs B vs C, I look at A vs B and A vs C separately. In hurdles, for instance, I have an array P where P[xA][xB][stunA][stunB] is the probability that A wins (or ties) against B alone under pure random play; the same array gives the probability against C. If the former is denoted pAB and the latter pAC, then the expected score of A in this game is:
E = 3*pAB*pAC + 1*(1-pAB)*pAC + 1*(1-pAC)*pAB
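As code, still assuming the two matchups are independent:

```rust
fn expected_score(p_ab: f64, p_ac: f64) -> f64 {
    3.0 * p_ab * p_ac             // gold: A beats both B and C
        + (1.0 - p_ab) * p_ac     // silver: A beats only C
        + (1.0 - p_ac) * p_ab     // silver: A beats only B
}
// e.g. expected_score(0.5, 0.5) == 1.25 and expected_score(1.0, 1.0) == 3.0
```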
Thanks to Codingame for this challenge!
438th global / 376th gold
I played only 3 of the games; the roller didn't exist for my bot.
My solution works in two parts.
Part 1: select paths to explore
I did it like a Monte Carlo, but I carefully selected the paths to explore instead of picking them randomly.
I started from the ideal path for swimming, in which I sacrificed the last 3 turns of archery to test the 64 possibilities.
Part 2: simulate
I simulated the 3 games (hurdles, archery and swimming), and every time I fell because of a hurdle, I added the path that would have avoided the fall to my list of paths to simulate. At the end of each simulated path, I compute a score and select the path with the best score.
In the last 2 days I added a feature to ignore the swimming game when it couldn't change the outcome (I had already won or already lost it, regardless of the actions). In that case, I simulated all possible paths to depth 6. This moved me up 50 to 100 places in the ranking.
Thanks CG for the game, it was really interesting; it was the first time I used simulation in a contest.
Thanks for the challenge. This was a very good game.
Ended 12th and 1st in C#, missing the frog.
Key features:
```csharp
5 + scores[gameId] / maxScore * (Program.Turn > 30 ? 20 : 5);
g.GetScore(playerId) * Priority[player][gameId] * (!done ? 0.8 * g.TimeLeft() : 1.0)
```
Please let us know about the t-shirt prior to the contest next time; I would have skipped sleep for a top-10 t-shirt.
And thanks for the screenshot prior to the contest. It made me create this interface plus Diving, Archery and an initial version of the search (I didn't get all the details correct in the sims, but it saved me some time once the contest started):
```csharp
interface IGame
{
    void Reset();
    void Save();
    void Move(int[] moves);
    void GenerateNew();
    double GetScore(int player);
    double GetScore(int player, int move);
    int GetState(int player);
    void SetFromState(int player, int state);
    bool IsDone();
    void ProgressTime();
}
```
I decided early on that I didn't want to spend much effort on the contest, so I would just focus on heuristics and not try anything fancy. I was pleasantly surprised to make it into the Gold league; in the end my solution was about 280 lines of Python.
I used heuristics to score each move in each minigame independently. To do this I structured each minigame into its own class with a method that returned a MovePrefs object (for example MovePrefs(UP=2, DOWN=2, RIGHT=3, LEFT=1)). The main loop would then take the weighted sum of these preferences to pick the best move.
Each minigame's preferences were weighted by 1/(current_score_from_minigame + 0.1), so the games where I had scored the least counted the most.
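In sketch form (Rust rather than my Python, with made-up names), the combining step looks like:

```rust
// Weighted sum of per-minigame move preferences; each game's weight is
// 1 / (its current score + 0.1), so low-scoring games get more influence.
fn pick_move(prefs_and_scores: &[([f64; 4], f64)]) -> usize {
    let mut total = [0.0f64; 4]; // UP, DOWN, RIGHT, LEFT
    for (prefs, score) in prefs_and_scores {
        let w = 1.0 / (score + 0.1);
        for i in 0..4 {
            total[i] += w * prefs[i];
        }
    }
    (0..4usize)
        .max_by(|&a, &b| total[a].partial_cmp(&total[b]).unwrap())
        .unwrap()
}
```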
I really liked the concept of having multiple simple minigames played at once. Compared to some of the previous challenges it was really easy to understand how the game works and get started with coding (I don't think it was even necessary to start with only one minigame). I think always having the same 4 minigames was a huge missed opportunity: a pool of about 8 minigames out of which 4 would be randomly chosen (with repeats!) for each match would have made the challenge much more interesting.