Spring Challenge 2021 - Feedbacks & strategies

ValGrowth · May 19, 2021, 3:59pm

Legend 4th.

【Before the Legend League is released】

[Algorithms]

6 days Chokudai Search (It is a variant of Beam Search)
Opponent is WAIT
1 Depth = 1 action
In order to make the progress of the day uniform between different nodes of the same depth, limit the number of actions per day to 6 times, and when WAIT, jump to the next mod 6 == 0 depth.

I banned how to plant trees in which my trees are in shadow of each other. (When calculating shadows, considered all the trees as GrowState3.) However, if there is no place to put it with this rule, up to two trees are allowed to be planted in the shadow.
In order to reduce the number of states, each cell I limited the seed source tree to one most highest tree.

[Evaluation function]

Maximize actual score
Then, maximize evaluation of trees (1,2,4,8 for each GrowState) for tie break.

With this evaluation, immediately COMPLETE all trees in order to maximize the score, But that isn’t good. So I put a limit of COMPLETE that only after the 12th day, and once in a day (if did not COMPLETE, it will be carried over to the next day).

By the above method, I was in 20th place just before the Legend League was opened.

【After the Legend League is released】

Below, to prevent confusion, I will describe my bot as Player1 and the opponent’s as Player2.

I realized that it is important to maximize the amount of SunPoints earned, not the number of trees. However, it varies depending on the opponent moves, so it is necessary to do opponent move prediction. I implemented it in the following way:

Do Player2 perspective ChokudaiSearch (Player1 is WAIT) in the first 20ms.
Next, do Player1 perspective search while playing back the actions of the Player2.

The problem with this method is the Player2’s invalid seed action. When simulating Player1, Player2 only executes a predetermined action, so Player2 may plant the seed later where Player1 already planted the seed.

To solve this, I allowed two trees in one cell. There is also a solution to cancel the action of Player2, but I did not adopt it because doing so would lead to overvaluation of the move that disturb opponent.

Also when simulating Player1, it’s not good to try to plant seeds in an empty cell that Player2 COMPLETEed in the simulation. If Player2 does not COMPLETE in a actual game, Player1 will be waiting until Player2 do COMPLETE action.

To avoid this, I introduce the concept of GhostTree.

GhostTree is the tree that visible only by Player1. It appears when Player2 executes COMPLETE, and it does not exist for Player2 and does not produce SunPoints, but for Player1 it exists as a tree of GrowState 3 and make shadows, and cannot plant a seed there. As a result, I can lead the Player1’s action that does not collapse even if Player2 does not do COMPLETE in the actual game.

[Evaluation function]

By released the opponent from WAIT, now I can use symmetrical evaluation values.

When simulating for Player1, evaluation = [Player1 evaluation value]-[Player2 evaluation value]
When simulating for Player2, evaluation = [Player2 evaluation value] (Player1 is WAIT, so disturbing to Player1 is not evaluated)

Second version of evaluation function is follows

Maximize actual score
Then, maximize total SunPoints Income for tie break.

The total SunPoints Income is the cumulative total of SunPoints gained during the simulation + the SunPoints to be gain the next day (calculate in the current game state). The used SunPoint is not considered.
The COMPLETE limitation remains the same, only once a day after the 12th day.

With the contents so far, it was ranked 10th in the just after the opening of Legend League.

【To the Contest finished】

It is not optimal that “COMPLETE once a day after the 12th day”, I improved this first. I removed this limitation and changed the evaluation function as follows

Maximize A*[Actual Score]+B*[Average SunPoint Income]

Average SunPoint Income is simply the Total SunPoint Income divided by the number of days.
Coefficients A and B are different parameters for each day (24 days x 2 for a total of 48)
This was adjusted as shown in the table below

As a result, the behavior was as follows.

Do not COMPLETE until day 10
On days 10-11, COMPLETE trees that makes very small SunPoint Income.
For days 12 to 15, COMPLETE trees that makes small ～ medium SunPoint Income.
After 16 days, COMPLETE trees without almost considering SunPoint Income.

In addition, I did the following improvements.

Subtract the wasted SunPoint from the SunPoint Income. “wasted” means the minus amount due to the number of trees when growing a tree.
Removing same game state by using Zobrist Hashing. Along with this, the depth of Chokudai Search has been increased to 8 days.
Allow the case where Player1 will plant a seed later at the cell where Player2 already planted during the simulation. This is because predicting the seed coordinates of the opponent player is mostly fail.
Only on the 12th day, allow seeds to be planted in the central green cell even if it is in the shadow of my own tree. If it is plant on the 12th day, I guess that we will be able to COMPLETE it before the end of the game.

After doing these improvements, I finally got the result of 4th place.

Thank you for this great contest, I’m also looking forward to the next one.

VincentBab · May 19, 2021, 9:18pm

Finished 28th (legend)

Simulation

I started to build a simulation of the game (generatePossibleMoves(), simulate(myMove, oppMove) using bitboards very much like @Kodle described.
This article explain how to iterate bitsets quickly: https://lemire.me/blog/2018/02/21/iterating-over-set-bits-quickly/
To add another tips to what @Kodle said I used this “trick” to get the size of a tree without branching:

    int getSize(int cell) const {
            return __builtin_popcountl(size1Trees & (1UL<<cell)) + __builtin_popcountl(size2Trees & (1UL<<cell))*2 + __builtin_popcountl(size3Trees & (1UL<<cell))*3;
        }

MC

I started with a Monte Carlo algorithm which was perfect to test the simulation. I selected the move which gave me the highest average score (using score difference with opponent or number of wins did worse).

MCTS DUCT

I had the feelings that DUCT should be very adapted to this game because the 2 players play simultaneously but I failed to understand how I could implement it. But that’s something I definitly want to try.

MCTS sequencial

I reused my code from UTTT but it took me a while to implement it correctly for this game.
Selection and Backpropagation was very standard.

Expansion

Almost the same as @Magus without “Never WAIT if a SEED is possible.”
but with “Never COMPLETE after GROW”
For SEED I precompute the list of possible seed target for every cell and tree size, ordered by :

“Diagonal” (will never have shadow from this cell)
Further away
Richness
Then I only allow the 3 best possible seeds per tree

Rollout

I only rollout after both players have made a move to make it fair (one in two nodes).
Moves pruning is the same than Expansion with the addition of "Allow wait only if ((day > 16 and day < 22) or day >= 22 and COMPLETE is not possible) or no other action is possible

Evaluation

Simply myScore - oppScore
This might be a good spot for improvement

Conclusion

I really need to setup cg-brutal-tester to test new ideas more quickly. This might be the way to the Top 20.
This challenge was lot of fun and I liked the fact that multiple strategies was possible to reach legend (Beam, MCTS, Heuristics, NN, …).
I’m also very proud of the result of our small startup considering I was the only one who had experience with Codingame Challenge. Congratulations on reaching gold league for everyone

And congratulations to @reCurse for his amazing work and very well written PM

AntiSquid · May 19, 2021, 10:24pm

For anyone who’s interested.

anon35323069 · May 20, 2021, 6:45am

"… learn from your peers …"

Gilde · May 20, 2021, 9:48am

Finished 161st Legend… 2 days after the end of the contest
…so 421th officialy

Search

I use a Beam Search of 60 nodes. My simulation is not that fast, I only able to evaluate around 7k Nodes a round. For each my possible actions (pruned like other players…), I find best opponent action. Calculated nodes for a depth (only starting on depth 4) are then pruned : I just keep best 60. For the first time I used cancel functions instead of coppying gameState. Better for perfs, harder to debug…

Evaluation

Starting on day 12, my score - opponent score, then I evaluate each tree, taking account potential immediate and future shadows (decreasing future shadows). I think that this evaluation made my simulation slower, but I couldn’t find better

Typescript Brutal Tester

I used Brutal Tester to test my magic numbers and new features. But I had to customize my typescript code in order to make it work. I think that might help TS deloppers (maybe @Magus you can add it to your doc) :

Prerequisites

Install Node

Customize TS Code

Init a new node project npm init
Add following dependencies in you package.json:

  "dependencies": {
    "@types/node": "^14.14.37",
    "typescript": "4.2.4",
    "node": "16.1.0"
  }

Create two files master.ts and control.ts. For me Control is my code in the Arena, Master is my code in the IDE, and my goal is to make Master defeat Control
Put the following code in those two files, and copy-paste your code in function play
WARNING : Replace all standard readline() by await readline()

const rl = require("readline");

const intRl = rl.createInterface({
  input: process.stdin,
  output: process.stdout,
});

// Max time between two lines
const READ_TIMEOUT=10000;
// Max time that node serveur should be up
const GAME_TIMEOUT=10000;

intRl.on("line", (input: string) => {
  lines.push(input);
});

function delay(ms: number) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
const lines: string[] = [];
let wait = 0;
async function readline(): Promise<string> {
  const line = lines.shift();
  if (line) {
    wait = 0;
    return Promise.resolve(line);
  } else {
    // Read every ms if there is a new line
    if (wait < READ_TIMEOUT) {
      wait++;
      await delay(1);
      return await readline();
    } else process.exit();
  }
}
async function play() {
// COPY PASTE HERE YOUR TS COD IN IDE, REPLACING ALL `readline()` by `await readline()`
}

play();
setTimeout(function () {
  return process.exit();
}, GAME_TIMEOUT);

Build and start players

In project directory, build master and control with command line :

tsc master.ts
tsc control.ts

Then you can start master and control with node control.js

You can build and start using npm scripts :

    "build": "tsc master.ts && tsc control.ts",
    "master": "node master.js",
    "control": "node control.js"

And then npm run build, npm run master, npm run control…

With brutal tester, after build, this will look like :

java -jar cg-brutaltester.jar -r "java -jar SpringChallenge2021/target/spring-2021-1.0-SNAPSHOT.jar " -p1 "node control.js" -p2 "node master.js" -n 100 -s -t 2

Or

java -jar cg-brutaltester.jar -r "java -jar SpringChallenge2021/target/spring-2021-1.0-SNAPSHOT.jar " -p1 "npm run control" -p2 "npm run master" -n 100 -s -t 2

Limitation

For now, this code is a bit slow, so I couldn’t customize my beam search (because of turn time out). But it helped when I was in heuristic mode (to improve evaluation)

Thanks CG for the contest, and for all PM that helped me reach Legend after all !

Thyl · May 20, 2021, 1:42pm

A wonderful PM !
Congratulations !

Fantomat75 · May 21, 2021, 12:59pm

I’m reading this late but congrats, its exceptional ! It’s been one year that I have been wondering if it’s feasible to apply the alphazero framework to codingame, I’m happy it is! I plan on doing it myself when I have time and the hardware.

I don’t understand a part of your story : Have you successfuly applied your pipeline to UltimateTTT ? It’s the first game I want to try with this framework when I will get to it, but if you’ve found that it’s not working really well I might just take another multiplayer game instead.

JFB · May 21, 2021, 3:21pm

#114 in Gold (#371 overall) so not very good.

Thank CG for the contest - it was pleasure to participate

I have had very simple approach - 1 level deep MinMax with very simple scoring function

if COMPLETE or GROW possible: difference of final scores with only WAIT to the end and COMPLETE in last round
if it is no SEED: put SEED in place with best difference of final scores calculated after growing all trees to max in round + 3

Additionally I do not GROW or SEED if the tree can not reach max size before the end of the game.

It is all.

First I implemented it in Python but it was timeout sometimes in SEED’s calculation. Rewritten in C++ it last around 0.3 ms max.

Thank for all PMs - they are all very informative and the eye openers.

And especially kudos to:
@reCurse - congratulations for 1st place. I think that more people will start using NN in CG now - when it was proven it is possible to receive 1st place. And using utf-8 to put more that 100 kB (you and @Agade written about it) is so simple and great idea.
@BrainSolver - thank you for bitboard for hex - maybe it was common knowledge but it is new for me.

reCurse · May 21, 2021, 4:02pm

Yes, TotoZero is working with UTTT, my current bot in the arena uses its results, though it’s an older version and there’s some bugs I have not addressed (yet?). It’s a difficult and not-so-straightforward game to use Alpha Go/Zero on though, I would recommend simpler ones like Oware, Breakthrough, etc.

Fantomat75 · May 21, 2021, 4:47pm

Why do you think it’s not straightforward? Any particular logical reason or it’s from your experience with trying to apply the concept to it ? Because in my imagination I was thinking that it was quite easy for a start : very simple input encoding, 2 player perfect information deterministic turn-based board game.

I’ve just realize that Oware is in fact Awalé in french, I love that game, did not know it was on cg! Thx for the discovery, I think it will be my next bot.

reCurse · May 21, 2021, 4:54pm

You are right, however there are complications.

It is extremely difficult for a human to read and understand the state of the game.
- For this reason, it will be difficult to understand if the bot is playing properly or not, and what to focus on to improve it.
- For this reason, it is also arguably harder for a NN to evaluate it properly, or at least easily. It took me many different attempts before getting something doing any better than vanilla MCTS.
It is difficult to reason on from a spatial perspective. In this spring challenge, it is easy to have intuition about the importance of neighborhood cells of distance 1, 2, etc. But what does it even mean in UTTT?
There is an overwhelming advantage for player 1. In my experience, it is much easier to learn from situations of balance than imbalance.
It’s just a bad game.

Fantomat75 · May 21, 2021, 5:23pm

That’s true, and as you said it’s a real problem for debugging but I’m not sure to agree on the fact that it is a problem for NNs. Maybe its harder for a human to find the right architecture.

I’ve think several times about that. A 3x3 Conv is very good when its superposed on a little grid, but it does not make a lot of sense when it’s in the middle of several grids… And you need to see the greater picture with the big grid that you’re indirectly filling. Maybe a CNN is just not the rigth architecture for that. Are you using a CNN for it ?

I’ve notice that first player had an advantage but I thought it was maybe because I was not in legend. I agree that imbalance is a big problem for learning. Especially with self-play.

Haha

Thx for your answers, it’s very insightful!

jacek1 · May 21, 2021, 5:45pm

I also have NN in UTTT now, but to get right inputs for the NN is quite tricky for this game (I don’t use CNN). I’ve also had many failed attempts before the bot started to get stronger than simple MCTS. As reCurse said, I’d suggest simpler game like oware or breakthrough. Oware has little inputs and at most only 6 moves per ply.

Fantomat75 · May 21, 2021, 6:02pm

An MLP then ? Or a secret special architecture ? I think I will follow your advices and start with Oware, I’ve already studied this game before and I think it’s a better game too!

jacek1 · May 21, 2021, 6:25pm

Good old MLP (ponies ). Not so secret, but I’d tell you there are 81 squares in UTTT and each square can hold about 30 different states (one-hot encoding). I also exploit the fact that during move only handful of squares are changed.

JFB · May 21, 2021, 8:20pm

Interesting, naively I assumed 4 states in every square. After a while I can see 6 possible states. But around 30? Interesting.

I have to return to my naive CNN bot for UTTT …

Edit: After 10 min I can see why adding 5 more states can be useful - so maybe after 1 h I can see 30 states Thank you for interesting idea …

Fantomat75 · May 21, 2021, 8:57pm

I am also searching the 30 haha. So far I have : empty, cross and nought as a base. But if we take the big square we can have, finish or not, finish can be tie, or cross winner or nought winner. For not finishing we can differentiate playable and non playable squares. With a one-hot vector I have in total 3 * (3 + 2) = 15.
It’s only half the count Or maybe its totally a different encoding

AntiSquid · May 21, 2021, 11:25pm

After thinking it through carefully, I realized i miscounted, recounted and ended up with 42. I am just wondering which ones are the wrong ones, if you list your 30 states I’ll be able to identify the wrong ones and discard them. Thanks in advance.

aCat · May 22, 2021, 7:53am

Individual ranking: #68
School ranking: #1 University of Wrocław
Language: C#

I know the contest is long gone, but it costed me quite a lot, and I’m still busy fixing real-life things, so postmortem is a little bit late.

I aimed top 50, so I am a little disappointed, as I have a feeling that this was possible, but I screw some things organizationally.

Acknowledgments

Thanks for many hours talking and sharing ideas, and great feeling of being a member of the wonderful community.
Special thanks to @MSz, and also people from our Polish Slack Community: @Zylo, @DomiKo, @Nerchio. Thanks to @eulerscheZahl for sharing some bit-magic knowledge - I learned a lot.

Congratulations and many thanks to all participants from our University. The ones that happened to be in the representation are @MSz, @DomiKo, @KeNaj712, @Pawix.

Thanks to CG for organizing the event @TwoSteps.

And last, but definitely not least, congratulations for all “proper” winners, and TheOne[tm] far above them ;].

Game

I am really happy that my first impression was correct, and the game allowed heuristics to be competitive. Even better, there is also a diversity of search-based algorithms at the top, which is wonderful. This is a huge advancement over the Fall Challenge and definitely a step in the right direction!

This does not fully cover the disappointment from the fact that the rules are copied from the existing boardgame. I really felt for a few days like CG cheated with this.

I really like the idea of putting all legal moves as input. Mostly time-wasting for top programs, but a huge help for lower leagues, heuristic approaches, etc. A good thing, so please keep if possible (i.e., not for ‘infinite’ number of moves ^^’) in the next contests.

Visualization was hmm… theoretically nice to look at. But. ‘Normal’ one did not have days(!) nor nutrients, so it was unusable from the start. If not for that, it was good enough, as the ‘debug’ one was actually worse and less readable. Dormant and sun gathered info was OK, but player messages were harder to read; there was no info about the quantities of the trees each size, and: no info about the player moves! That is ridiculous. “Great debug visualization,” but you still have to go back and forth with frames to see what actually happened. Please add an option to just print player moves in visualizations in future games CG.

Algorithm

Synchronized Beam Search. I started with some Flat Monte Carlo to test the engine. But at an early stage I was convinced that efficiency all the way and opponent is not so important, so I focused on beam, and later didn’t have time to implement some opponent prediction or something smarter.

I spent quite a lot of time encoding the engine with a bit magic, which was a great experience, but as it was something rather new to me - also time-consuming. And I spent some time on visualization - which helped me a lot.

What did not work for me was beam search, where I allow in the same layer states with different days - it breaks the evaluation. I had to synchronize this: inserting waiting states for a new layer, and counting states ready to change day - if all are ready, then apply ‘next day’ to them.

What I really put the accent on and I’m proud of is move pruning. I did not have hashtables, so I put some effort into reducing the branching factor during the move generation.

Completes at the beginning or never. Grows in proper size order but additionally without permutations in the same size. Seed only when free. In the first beam search day it can go before any grow, in deeper search goes only as last action (wait was illegal if seed was legal).
Seed is pruned by the target (used for seeding from size 3 first search day, any size tree later). So when generating legal actions, I try to generate only one action seeding to a certain hex - that prunes a lot.

Apart from this, I added later some priority-based stage-based pruning like avoiding shadows early and avoid poor soil later.

Evaluation

This was really simple. I calculated optimal tree completes for both players for day 23. For all the other days, it was just calculating gathered sun in the future (my suns - opponent suns) with the current state of the board plus current score times 3. Discounted times 0.99 each day in the future. And that’s all.

I had many more concrete ideas ready to code (or even started), but without time to actually apply them.

General Proposition

I want to seize the opportunity and write a few words about something I have in mind, and discussed already with a few people on several occasions.

Please add additional league @TwoSteps!

Leagues are the best invention of CG in years. They shift the challenge from PvP to PvE, encouraging learning and knowledge sharing. They add a sense of clear goal, and accomplishment. They make rankings more stable. During the contests, they create additional thrill in the moments of the openings.

But the number of users participating in contests increased like twofold at least, and the number of leagues is the same. Which is horrible both with a computational point of view, a logical one, and the user experience.

So please consider adding some league (at least one!), probably at the bottom, as Legend is known to be the top and it will be a mess to change this. But adding some ‘Copper’ that will work as current bronze (first league with full rules), but will be below bronze might help a lot and at least reduce the issues we currently have with this, generally great, system.

5DN1L · February 5, 2024, 2:54pm

A post was merged into an existing topic: Spring Challenge 2021 - Puzzle discussion