Spring Challenge 2021 - Feedbacks & strategies

I do not think you would learn a lot from reading “ready-made” code.
You will learn a lot by trial and error, by writing and, above all, by debugging code yourself.

You want to learn minimax?
Then:

  1. Learn the basics on the topic: https://en.wikipedia.org/wiki/Minimax
  2. Practice with basic use case: https://www.codingame.com/learn/minimax
  3. Read about more advanced stuff: principal variation search, transposition tables…etc
  4. Experiment more: for example, write an AI that plays checkers, Connect 4, or any other simple game, and use this framework to test everything you read about

You will learn much more this way than by reading someone else’s minimax code.

If you want to use RL+NN, it would be pointless to read @reCurse’s training pipeline code.
What you need to do is:

  1. Learn the basics:
    For NNs there are tons of resources on the net (including Wikipedia). For RL there is the excellent Sutton book in PDF: https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf
  2. Practice with basic examples
    https://www.codingame.com/training/hard/binary-neural-network---part-1
    https://www.codingame.com/training/expert/binary-neural-network---part-2
  3. Experiment: create learning pipelines to learn “simple” things using public labeled dataset, like https://en.wikipedia.org/wiki/MNIST_database
  4. Read more (the AlphaZero papers) and experiment more: implement the DQN, double DQN, and A3C algorithms. Test everything on easy games…

The multiplayer arena is a perfect place for experiments, and top players’ PMs always give you ideas for the next thing to learn.
For example, I want to try @miklla’s DP-based solution, and I will learn much more by figuring it out myself than by reading his code. Of course, a prerequisite is a deep understanding of DP concepts, but here the CG platform can help again: https://www.codingame.com/learn/dynamic-programming

9 Likes

Sharing the full code is not an option, but we can ask and patiently wait for someone to create a playground with some snippets on a topic of interest. We already have such nice playgrounds as smitsimax, optimizing-breadth-first-search, genetic-algorithms, contest-tools-and-workflow, sse-avx-vectorization, etc., and I feel we need more of them.

2 Likes

The general layout / skeleton of an algo is more than enough. As a starter bot that’s good, but even then it should be as minimalistic as possible.

I don’t think an entire multi needs to be sacrificed.

Will add my ugly code later in a playground for “peer review”. :smiley:

2 Likes

While I mostly agree with all your points answering joelthelion’s question, I did learn a lot by reading random code that has been released over the years.

Random examples from my early days on CG:

  • hashing & beam search implementation preserving variety of first moves by kimiyuki in Hypersonic
  • reusable classes from one contest to another (Jeff06 on TGE)
  • slowness of mt19937 random number generator when comparing with Agade and Magus on PCR
  • Psyho’s xorRand() code

Sure, given enough time I could have figured most of this out by myself.

Since the Fantastic Bits contest, I do not share code publicly anymore. However, I do sometimes share some code privately when there is legitimate interest and demonstrated motivation. Generally not the full or latest code, but enough to answer that person’s question.

13 Likes

I should have said that I am not a beginner: I reached Legend in two contests and Gold in many more. I know most of these algorithms and have implemented them several times. I even implemented an RL bot for CSB a few years ago :slight_smile:

I still think it would be very valuable to be able to read other people’s code. Of course trial and error is important, but reading other people’s code can also be a valuable learning experience that no amount of coding on your own will replace.

6 Likes

Hey! Finished 100th in Legend, here is my PM:
https://github.com/sgalasso42/cg_spring_challenge_2021
Good reading :slight_smile:

4 Likes

For me, “sharing” code is simple: omit some critical parts, with TODOs, and just make sure it won’t work right out of the box. The obvious parts are eval()/score(). You can give some ideas about what data is available, hints about what you did, etc., but not a working eval.

People will learn from it and, with enough interest, reconstruct a working version. The problem is with ready-to-submit code: anyone can just copy-paste it into the CG IDE and press submit (Sayonaraaaaa!)

1 Like

Legend 4th.


【Before the Legend League is released】

[Algorithms]

  • Chokudai Search with a 6-day horizon (a variant of Beam Search; see the sketch below)
  • The opponent always WAITs
  • 1 depth = 1 action
  • To keep the day progression uniform between nodes at the same depth, I limit the number of actions per day to 6, and on WAIT, jump to the next depth where depth mod 6 == 0.

I forbade plantings where my own trees would shadow each other (when calculating shadows, I treated every tree as GrowState 3). However, if this rule leaves no legal cell, up to two trees are allowed to be planted in shadow.
To reduce the number of states, for each cell I limited the seed source to the single tallest tree that can reach it.
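For readers unfamiliar with it, here is a minimal, hypothetical sketch of Chokudai Search in C++ (Node, expand and all other names are illustrative, not the author’s code). Unlike plain beam search with a fixed width, it sweeps the depths over and over, expanding the best unexpanded node at each depth, so the beam effectively widens for as long as time remains:

    #include <chrono>
    #include <queue>
    #include <vector>

    struct Node {
        double eval = 0;  // heuristic value of this state
        // ... game state and the first action of the line would go here ...
        bool operator<(const Node& o) const { return eval < o.eval; }
    };

    // One child per legal action; game-specific, stubbed here.
    std::vector<Node> expand(const Node&) { return {}; }

    Node chokudaiSearch(const Node& root, int maxDepth, double timeLimitMs) {
        const auto start = std::chrono::steady_clock::now();
        auto elapsedMs = [&] {
            return std::chrono::duration<double, std::milli>(
                       std::chrono::steady_clock::now() - start).count();
        };
        std::vector<std::priority_queue<Node>> beams(maxDepth + 1);
        beams[0].push(root);
        Node best = root;
        while (elapsedMs() < timeLimitMs) {
            bool expandedAny = false;
            for (int d = 0; d < maxDepth && elapsedMs() < timeLimitMs; ++d) {
                if (beams[d].empty()) continue;
                Node n = beams[d].top();  // best unexpanded node at this depth
                beams[d].pop();
                for (Node& child : expand(n)) {
                    if (child.eval > best.eval) best = child;
                    beams[d + 1].push(std::move(child));
                }
                expandedAny = true;
            }
            if (!expandedAny) break;  // nothing left to expand
        }
        return best;  // a real bot would return the first action of the best line
    }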

[Evaluation function]

  1. Maximize the actual score
  2. Then, as a tiebreak, maximize the evaluation of the trees (1, 2, 4, 8 for GrowStates 0-3).

With this evaluation the bot immediately COMPLETEs all trees to maximize the score, but that isn’t good. So I restricted COMPLETE to after day 12, and to at most once per day (if no COMPLETE happened, the allowance carried over to the next day).

With the above method, I was in 20th place just before the Legend League opened.


【After the Legend League is released】

Below, to prevent confusion, I will refer to my bot as Player1 and the opponent’s as Player2.

I realized that it is important to maximize the amount of SunPoints earned, not the number of trees. However, this varies depending on the opponent’s moves, so opponent move prediction is necessary. I implemented it in the following way:

  • Run a Chokudai Search from Player2’s perspective (with Player1 always WAITing) during the first 20 ms.
  • Then run the Player1-perspective search while replaying Player2’s predicted actions.

The problem with this method is Player2’s invalid seed actions. When simulating Player1, Player2 only executes its predetermined actions, so Player2 may later plant a seed where Player1 has already planted one.

To solve this, I allowed two trees in one cell. Another solution would be to cancel Player2’s action, but I did not adopt it because doing so would overvalue moves that disturb the opponent.

Also, when simulating Player1, it is not good to plant seeds in an empty cell that Player2 COMPLETEd during the simulation: if Player2 does not actually COMPLETE in the real game, Player1 would be stuck waiting for that COMPLETE.

To avoid this, I introduced the concept of the GhostTree.

A GhostTree is a tree visible only to Player1. It appears when Player2 executes COMPLETE; it does not exist for Player2 and produces no SunPoints, but for Player1 it exists as a GrowState 3 tree, casts shadows, and blocks seeding. As a result, Player1’s plan does not collapse even if Player2 does not actually COMPLETE in the real game.
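A hypothetical bitboard sketch of how GhostTrees could be tracked (the names and layout are my illustration, not the author’s code):

    #include <cstdint>

    uint64_t ghostTrees = 0;  // cells where Player2 COMPLETEd during the simulation

    // Called when the predicted Player2 action is a COMPLETE.
    void onPlayer2Complete(int cell) {
        // The real tree is removed; a ghost remains for Player1's planning.
        ghostTrees |= 1ULL << cell;
    }

    // When simulating Player1:
    //   occupied   = myTrees | oppTrees | ghostTrees;  // seeding is blocked here
    //   shadowSrcs = allSize3Trees | ghostTrees;       // ghosts shadow as GrowState 3
    // When simulating Player2, ghostTrees is ignored entirely, and ghosts never
    // produce SunPoints for either player.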

[Evaluation function]

By freeing the opponent from always WAITing, I could now use symmetrical evaluation values.

  • When simulating Player1: evaluation = [Player1 evaluation value] - [Player2 evaluation value]
  • When simulating Player2: evaluation = [Player2 evaluation value] (Player1 always WAITs, so disturbing Player1 is not rewarded)

The second version of the evaluation function is as follows:

  1. Maximize the actual score
  2. Then, as a tiebreak, maximize the total SunPoint income.

The total SunPoint income is the cumulative SunPoints gained during the simulation, plus the SunPoints to be gained the next day (computed from the current game state). Spent SunPoints are not considered.
The COMPLETE limitation remained the same: only once a day, after day 12.

With the contents so far, I was ranked 10th just after the opening of the Legend League.


【To the Contest finished】

“COMPLETE once a day after day 12” is not optimal, so I improved this first: I removed the limitation and changed the evaluation function as follows.

  1. Maximize A * [Actual Score] + B * [Average SunPoint Income]

The average SunPoint income is simply the total SunPoint income divided by the number of days.
Coefficients A and B are separate parameters for each day (24 days x 2, for a total of 48).
They were tuned as shown in the table below.

[Image: table of the tuned per-day A and B coefficients]
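A minimal sketch of this blended evaluation, assuming A and B hold the 48 tuned per-day coefficients from the table (all names are illustrative, not the author’s code):

    // A[day] weights the actual score, B[day] the average SunPoint income.
    double A[24], B[24];

    double evaluate(int day, double actualScore, double totalSunIncome, int daysSimulated) {
        double avgIncome = totalSunIncome / daysSimulated;
        return A[day] * actualScore + B[day] * avgIncome;
    }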

As a result, the behavior was as follows.

  • Do not COMPLETE until day 10
  • On days 10-11, COMPLETE trees that produce very little SunPoint income.
  • On days 12-15, COMPLETE trees that produce small to medium SunPoint income.
  • From day 16 on, COMPLETE trees almost without considering SunPoint income.

In addition, I made the following improvements.

  • Subtract wasted SunPoints from the SunPoint income; “wasted” means the extra cost, due to the number of trees, incurred when growing a tree.
  • Deduplicate identical game states using Zobrist hashing (see the sketch after this list). This allowed the depth of the Chokudai Search to be increased to 8 days.
  • Allow Player1 to plant a seed later in a cell where Player2 already planted during the simulation, because predicting the opponent’s seed coordinates mostly fails.
  • Only on day 12, allow seeds to be planted in a central green cell even if it is in the shadow of my own tree. If it is planted on day 12, I expect to be able to COMPLETE it before the end of the game.
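A minimal Zobrist hashing sketch for deduplicating states (the 37 cells x 4 sizes x 2 players layout and all names are my assumption, not the author’s code):

    #include <cstdint>
    #include <random>
    #include <unordered_set>

    constexpr int CELLS = 37, SIZES = 4, PLAYERS = 2;
    uint64_t zobristKeys[CELLS][SIZES][PLAYERS];

    void initZobrist() {
        std::mt19937_64 rng(0xC0DE);  // fixed seed: reproducible keys
        for (auto& cell : zobristKeys)
            for (auto& size : cell)
                for (auto& key : size) key = rng();
    }

    // Incremental update: XOR a tree in or out of the state hash.
    uint64_t toggleTree(uint64_t hash, int cell, int size, int player) {
        return hash ^ zobristKeys[cell][size][player];
    }

    // During the search, drop any state whose hash was already seen.
    std::unordered_set<uint64_t> seen;
    bool isDuplicate(uint64_t hash) { return !seen.insert(hash).second; }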

After these improvements, I finally finished in 4th place.

Thank you for this great contest, I’m also looking forward to the next one. :grinning:

26 Likes

Finished 28th (legend)

Simulation

I started by building a simulation of the game (generatePossibleMoves(), simulate(myMove, oppMove)) using bitboards, very much like @Kodle described.
This article explains how to iterate over set bits quickly (a sketch follows the code below): https://lemire.me/blog/2018/02/21/iterating-over-set-bits-quickly/
To add another tip to what @Kodle said, I used this “trick” to get the size of a tree without branching:

    // popcount of a single masked bit is 0 or 1, so this sums to the tree
    // size (0 if the cell is empty) without any branch.
    int getSize(int cell) const {
        return __builtin_popcountl(size1Trees & (1UL << cell))
             + __builtin_popcountl(size2Trees & (1UL << cell)) * 2
             + __builtin_popcountl(size3Trees & (1UL << cell)) * 3;
    }
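And a sketch of the set-bit iteration trick from the linked article (forEachSetBit is my illustrative name):

    #include <cstdint>

    // Visit the index of every set bit without scanning all 37 cells.
    template <typename F>
    void forEachSetBit(uint64_t bits, F&& visit) {
        while (bits) {
            visit(__builtin_ctzll(bits));  // index of the lowest set bit
            bits &= bits - 1;              // clear that bit
        }
    }
    // Usage: forEachSetBit(size3Trees, [](int cell) { /* ... */ });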

MC

I started with a Monte Carlo algorithm, which was perfect for testing the simulation. I selected the move with the highest average score (using the score difference with the opponent, or the number of wins, did worse).
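A minimal flat Monte Carlo sketch of that move selection (the State type and the hooks are hypothetical stand-ins for the bitboard simulation above):

    #include <random>
    #include <vector>

    struct State { /* bitboards etc., game-specific */ };
    std::vector<int> legalMoves(const State&);               // assumed hooks
    State applyMove(State s, int move);
    double randomPlayoutScore(State s, std::mt19937& rng);   // my score at game end

    int pickMove(const State& root, int playoutsPerMove) {
        std::mt19937 rng(42);
        int bestMove = -1;
        double bestAvg = -1e18;
        for (int move : legalMoves(root)) {
            double total = 0;
            for (int i = 0; i < playoutsPerMove; ++i)
                total += randomPlayoutScore(applyMove(root, move), rng);
            double avg = total / playoutsPerMove;
            if (avg > bestAvg) { bestAvg = avg; bestMove = move; }  // highest average wins
        }
        return bestMove;
    }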

MCTS DUCT

I had the feeling that DUCT should be well suited to this game because the two players move simultaneously, but I failed to see how to implement it. That’s something I definitely want to try.

MCTS sequential

I reused my code from UTTT, but it took me a while to adapt it correctly to this game.
Selection and backpropagation were very standard.

Expansion

Almost the same as @Magus, but without “Never WAIT if a SEED is possible”
and with “Never COMPLETE after GROW”.
For SEED I precompute the list of possible seed targets for every cell and tree size, ordered by:

  • “Diagonal” (will never receive shadow from this cell)
  • Farther away
  • Richness

Then I only allow the 3 best possible seed targets per tree (see the sketch below).
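A hypothetical sketch of that precomputation (isDiagonal, hexDistance and richness are assumed game-specific helpers, not the author’s code):

    #include <algorithm>
    #include <array>
    #include <vector>

    bool isDiagonal(int from, int to);   // `to` can never be shadowed from `from`
    int  hexDistance(int from, int to);
    int  richness(int cell);

    // seedTargets[cell][treeSize] = candidate target cells, best first.
    std::array<std::array<std::vector<int>, 4>, 37> seedTargets;

    void precomputeSeedTargets() {
        for (int from = 0; from < 37; ++from)
            for (int size = 1; size <= 3; ++size) {
                auto& targets = seedTargets[from][size];
                for (int to = 0; to < 37; ++to)
                    if (to != from && hexDistance(from, to) <= size)
                        targets.push_back(to);
                std::sort(targets.begin(), targets.end(), [&](int a, int b) {
                    if (isDiagonal(from, a) != isDiagonal(from, b))
                        return isDiagonal(from, a);        // diagonals first
                    int da = hexDistance(from, a), db = hexDistance(from, b);
                    if (da != db) return da > db;          // farther away first
                    return richness(a) > richness(b);      // richer cells first
                });
                // At expansion time, only the first 3 valid targets are used.
            }
    }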

Rollout

I only roll out after both players have made a move, to keep it fair (i.e. at one node in two).
Move pruning is the same as in Expansion, with the addition of “Allow WAIT only if ((day > 16 and day < 22) or day >= 22 and COMPLETE is not possible) or no other action is possible”.

Evaluation

Simply myScore - oppScore.
This might be a good spot for improvement.

Conclusion

I really need to set up cg-brutaltester to test new ideas more quickly. That might be the way to the top 20.
This challenge was a lot of fun, and I liked the fact that multiple strategies could reach Legend (Beam Search, MCTS, heuristics, NN, …).
I’m also very proud of the result of our small startup, considering I was the only one with CodinGame contest experience. Congratulations to everyone on reaching the Gold league :wink:

And congratulations to @reCurse for his amazing work and very well-written PM :slight_smile:

9 Likes

For anyone who’s interested.

6 Likes

"… learn from your peers …"

1 Like

Finished 161st in Legend… 2 days after the end of the contest
…so 421st officially.

Search

I use a Beam Search with a width of 60 nodes. My simulation is not that fast: I am only able to evaluate around 7k nodes per turn. For each of my possible actions (pruned like other players’…), I find the best opponent action. The nodes computed for a depth (only from depth 4 onward) are then pruned: I just keep the best 60. For the first time I used cancel functions instead of copying the game state (sketched below). Better for performance, harder to debug…
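A minimal, hypothetical sketch of that apply/cancel pattern (Delta, Action and all member names are illustrative):

    #include <vector>

    struct Action { /* game-specific */ };
    struct Delta  { int sunBefore, scoreBefore; /* + whatever else changed */ };

    struct GameState {
        int sun = 0, score = 0;
        std::vector<Action> legalActions() const;  // game-specific
        Delta apply(const Action& a);              // mutate in place, return undo info
        void  cancel(const Delta& d);              // restore the previous state
    };

    void search(GameState& s, int depth) {
        if (depth == 0) return;                    // evaluate the leaf here
        for (const Action& a : s.legalActions()) {
            Delta d = s.apply(a);
            search(s, depth - 1);
            s.cancel(d);                           // no GameState copy needed
        }
    }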

Evaluation

Starting on day 12: my score minus the opponent’s score. Then I evaluate each tree, taking into account potential immediate and future shadows (with future shadows discounted). I think this evaluation made my simulation slower, but I couldn’t find better.

TypeScript Brutal Tester

I used Brutal Tester to test my magic numbers and new features, but I had to customize my TypeScript code to make it work. I think this might help TS developers (maybe @Magus you can add it to your doc):

Prerequisites

  • Install Node

Customize TS Code

  • Init a new Node project: npm init
  • Add the following dependencies to your package.json:
  "dependencies": {
    "@types/node": "^14.14.37",
    "typescript": "4.2.4",
    "node": "16.1.0"
  }
  • Create two files, master.ts and control.ts. For me, Control is my code in the arena, Master is my code in the IDE, and my goal is to make Master defeat Control

  • Put the following code in those two files, and copy-paste your bot code into the play function
    WARNING: replace every standard readline() with await readline()

const rl = require("readline");

const intRl = rl.createInterface({
  input: process.stdin,
  output: process.stdout,
});

// Max time (in ms) to wait between two input lines
const READ_TIMEOUT = 10000;
// Max time (in ms) that the node process should stay up
const GAME_TIMEOUT = 10000;

intRl.on("line", (input: string) => {
  lines.push(input);
});

function delay(ms: number) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
const lines: string[] = [];
let wait = 0;
async function readline(): Promise<string> {
  // Explicit undefined check so empty input lines are still returned
  const line = lines.shift();
  if (line !== undefined) {
    wait = 0;
    return line;
  } else {
    // Poll every millisecond for a new line, up to READ_TIMEOUT
    if (wait < READ_TIMEOUT) {
      wait++;
      await delay(1);
      return await readline();
    } else process.exit();
  }
}
async function play() {
// COPY-PASTE YOUR IDE TS CODE HERE, REPLACING EVERY `readline()` WITH `await readline()`
}

play();
setTimeout(function () {
  return process.exit();
}, GAME_TIMEOUT);

Build and start players

In the project directory, build master and control from the command line:

  • tsc master.ts
  • tsc control.ts

Then you can start them with node master.js and node control.js

You can also build and start using npm scripts:

    "build": "tsc master.ts && tsc control.ts",
    "master": "node master.js",
    "control": "node control.js"

And then npm run build, npm run master, npm run control

With Brutal Tester, after the build, this will look like:

java -jar cg-brutaltester.jar -r "java -jar SpringChallenge2021/target/spring-2021-1.0-SNAPSHOT.jar " -p1 "node control.js" -p2 "node master.js" -n 100 -s -t 2

Or

java -jar cg-brutaltester.jar -r "java -jar SpringChallenge2021/target/spring-2021-1.0-SNAPSHOT.jar " -p1 "npm run control" -p2 "npm run master" -n 100 -s -t 2

Limitation

For now, this code is a bit slow, so I couldn’t use it with my beam search (it caused turn timeouts). But it helped while I was in heuristic mode (to improve the evaluation).

Thanks CG for the contest, and thanks for all the PMs that helped me reach Legend after all!

6 Likes

A wonderful PM!
Congratulations!

1 Like

I’m reading this late, but congrats, it’s exceptional! For a year I have been wondering whether it is feasible to apply the AlphaZero framework on CodinGame; I’m happy it is! I plan on doing it myself when I have the time and the hardware.

I don’t understand one part of your story: have you successfully applied your pipeline to UltimateTTT? It’s the first game I want to try with this framework when I get to it, but if you’ve found that it doesn’t work really well, I might just pick another multiplayer game instead.

1 Like

#114 in Gold (#371 overall), so not very good.

Thanks CG for the contest - it was a pleasure to participate :slight_smile:

I had a very simple approach - a 1-level-deep MinMax with a very simple scoring function:

  1. if COMPLETE or GROW is possible: the difference of final scores, assuming only WAITs until the end plus a COMPLETE in the last round
  2. if there is no SEED yet: put a SEED in the place with the best difference of final scores, calculated after growing all trees to max size by round + 3

Additionally, I do not GROW or SEED if the tree cannot reach max size before the end of the game.

That is all.

I first implemented it in Python, but it sometimes timed out during the SEED calculation. Rewritten in C++, it takes around 0.3 ms at most.

Thanks for all the PMs - they are all very informative and real eye-openers. :slight_smile:

And especially kudos to:
@reCurse - congratulations on 1st place. I think more people will start using NNs on CG now that taking 1st place with one has been proven possible. And using UTF-8 to pack more than 100 kB of data into the source (you and @Agade wrote about it) is such a simple and great idea.
@BrainSolver - thank you for the hex bitboard - maybe it was common knowledge, but it was new to me.

2 Likes

Yes, TotoZero works with UTTT; my current bot in the arena uses its results, though it’s an older version and there are some bugs I have not addressed (yet?). It’s a difficult and not-so-straightforward game to use AlphaGo/Zero on though; I would recommend simpler ones like Oware, Breakthrough, etc.

2 Likes

Why do you think it’s not straightforward? Any particular logical reason, or is it from your experience trying to apply the concept to it? Because in my imagination it seemed quite easy as a starting point: very simple input encoding, and a two-player, perfect-information, deterministic, turn-based board game.

I’ve just realized that Oware is in fact Awalé in French. I love that game; I did not know it was on CG! Thx for the discovery, I think it will be my next bot.

You are right, however there are complications.

  • It is extremely difficult for a human to read and understand the state of the game.
    • For this reason, it is difficult to tell whether the bot is playing properly or not, and what to focus on to improve it.
    • For the same reason, it is arguably harder for a NN to evaluate it properly, or at least easily. It took me many different attempts before getting anything better than vanilla MCTS.
  • It is difficult to reason about from a spatial perspective. In this Spring Challenge, it is easy to have intuitions about the importance of neighboring cells at distance 1, 2, etc. But what does that even mean in UTTT?
  • There is an overwhelming advantage for player 1. In my experience, it is much easier to learn from situations of balance than imbalance.
  • It’s just a bad game. :slight_smile:

4 Likes

That’s true, and as you said it’s a real problem for debugging, but I’m not sure I agree that it is a problem for NNs. Maybe it’s just harder for a human to find the right architecture.

I’ve thought about that several times. A 3x3 convolution is very good when it is superimposed on a small grid, but it does not make a lot of sense when it straddles several grids… And you need to see the bigger picture with the big grid that you’re indirectly filling. Maybe a CNN is just not the right architecture for that. Are you using a CNN for it?

I’ve noticed that the first player had an advantage, but I thought it was maybe because I was not in Legend. I agree that imbalance is a big problem for learning, especially with self-play.

Haha :smiley:

Thx for your answers, it’s very insightful!

1 Like

I also have a NN in UTTT now, but getting the right inputs for the NN is quite tricky for this game (I don’t use a CNN). I also had many failed attempts before the bot started to get stronger than simple MCTS. As reCurse said, I’d suggest a simpler game like Oware or Breakthrough. Oware has few inputs and at most 6 moves per ply.

1 Like