CodeBusters Final Ranking

Hello everyone,

For the CodeBusters contest we decided to improve the ranking stability. During the Smash the Code contest we all witnessed some fairly disturbing changes in the leaderboard close to the end of the ranking process, and we needed to do something about it.

So the following will happen for CodeBusters:

  • During the contest, nothing changes: CodinGame uses a basic TrueSkill-like system to rank players. The system is fast, good enough, and provides a decent ranking. Our belief is that when players' strengths are very close, TrueSkill will rank with a -5/+5 rank error margin. The only improvement we made is that the top five players no longer systematically play against a stronger player; they can also play matches against players below them, as is the case for the other players.

  • Once the contest is closed, we wait until all the in-progress matches are done.

  • After that, we launch an additional batch of 200 matches for the players of the Legend league (limited to the first 150 players).

  • For these matches, we will rely on a modified version of TrueSkill: the score will be computed as the average of the TrueSkill scores over the 200 matches rather than relying on the latest score. The rationale is that TrueSkill has a short memory and the last match result can influence the ranking far too much (e.g. if a player is lucky, they can move up by a few positions for a short period of time).

  • Additionally, during these 200 matches, each time a match is launched involving a player from the top 15, 5 sub-matches are launched between this player and their opponent, and only the aggregated result of these sub-matches is fed to TrueSkill. Again, the goal is to provide TrueSkill with reliable results, to lower the "luck" factor and the short-memory issues. It means that players in the top 15 will play at least 1000 matches of their own.

We have done some in-depth testing of this method and we are confident that it will yield better results than the method we used in the past.

Some of these changes were suggested by members of the CodinGame community and we thank them for their involvement.


Sounds good! Have you also considered using the TrueSkill Through Time variant?

ref (kaggle)
PDF (microsoft research)


When the contest finishes and you are preparing the final battles (for the top 150 Legends), will everyone start equally, or will you continue on from the existing leaderboard positions?

Equivalent question: At the end of the contest, do we need to keep resubmitting the same code again and again until the random number generator gives us a better-than-average ranking that we like? Or will that give no benefit and be unnecessary?

@JamesMcG: everyone will continue from their existing leaderboard positions. During our investigations we experimented with several options: only 100 matches, everyone starting from the same position, etc. The result is that when the number of matches is high enough - 200 is high enough :slight_smile: - the starting positions do not matter; the averaging is powerful enough to remove the possible "luck" factor a player could get at the end of the contest.

So in theory, it should be unnecessary to resubmit the same code again and again to start from a slightly higher position.

Unless I am mistaken, "TrueSkill Through Time" does not seem to be a variant but just an analysis of TrueSkill over time. Still worth a look. Thanks.

It's a bit more than that.
TSTT solves 2 issues we have here:

See section 3.2 of the PDF from Microsoft.
The implementation is quite complex, I'm afraid… I've reimplemented vanilla TrueSkill myself but haven't dared to implement TSTT yet…

OK! This is clearer now.

Having played a lot in Silver and Gold in the last few days, I wonder if the ranking algorithm isn't just a big lottery.

Each time I change something I submit my code 3 or 4 times (being in America helps to get results faster), and in Silver I have one case where I got ranked 150/250 and then in the top 10 without any change.

Now in Gold I'm going up and down between 100 and 350 without changing the code. Is that really just good luck and bad luck in the matchmaking, or is there a problem in the ranking stabilisation somewhere?

When the additional matches are added, will it concern a static list of the 15 best players, or the sliding top 15 (which can change according to match results)?

In other words, if I'm ranked 16th when the matches are added, and then ranked 15th after a few matches, will I benefit from the sub-matches?


Or another angle on this… Suppose my AI is comparable in ability to other AIs ranked around 200ish. I submit at approximately the same time as several other AIs that are comparable in ability to AIs in the top 20. On my way up to my rightful ranking of 217, I am matched against these smarter AIs quite a few times as they also move up in ranking. My AI, understandably, loses each of these matches.

Will those losses count against me as losses to less impressive opponents? For example, if I lose to those high-quality AIs while they are temporarily ranked around 640, 620, 580, etc., will that affect my chances of rising to my "correct" ranking?

– danBhentschel

@Neumann: yes you would benefit from the sub-matches. This is the actual ranking that counts.

@JonatanCloutier: what you are experiencing, other players have experienced as well on CodeBusters. A few possible explanations:

  1. The strengths of the players are similar in Gold. This is quite likely. The algorithmic possibilities are numerous but limited, and in the end most players end up doing almost the same thing.

  2. CodeBusters may be what we call at CodinGame a "random" game: the result of a match relies too much on what the other player does, as it greatly influences what your player does. There is a kind of loop as you wait to see what the other player does, and vice versa. The result is a highly unpredictable game. The perfect example of such a game is "Platinum Rift"; a perfect counter-example is "Smash the Code". Not sure if this is the case for CodeBusters.

  3. Too many games are generated with small variance (for example, not enough ghosts). Winning then becomes a matter of luck (am I the first to find the few ghosts available?) rather than having the best algorithm. That may be the case for CodeBusters, as some games are too "easy".

My initial comment about TrueSkill having a -5/+5 rank error margin was based on Smash the Code. For CodeBusters it seems to be quite different, but we can still rely on averaging the TrueSkill score for the Legend league.

As a conclusion, it is quite hard to balance a game; we are always learning at CodinGame with each new contest, and maybe we should add some level of score averaging during the contest itself and for all leagues. Something to think about for the next contest…


@player_one: indeed, it would slow down your progress to the top. This is why we provide a high number of matches (100 per player) and why we generate new matches regularly.

Maybe it would be better to avoid matches between two non-stabilised opponents. The reason is that in many (most?) cases they are of really different skill and got close together just by accidental timing; they would never be matched if, for example, the stronger of the two were already stabilised.

Under TrueSkill, such an uneven match is a penalty for the weaker of the two players (and a slight advantage for the stronger one).
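That penalty is easy to quantify with the usual expected-score formula, shown here Elo-style (TrueSkill's Gaussian update behaves the same way directionally; the 1400/1700 ratings and the K-factor are made-up numbers):

```python
def loss_penalty(my_rating, opp_rating, k=32.0):
    """Rating points lost when losing to an opponent rated opp_rating:
    k * expected score, i.e. the more the system thought you would win,
    the more a loss costs you."""
    expected = 1 / (1 + 10 ** ((opp_rating - my_rating) / 400))
    return k * expected

# Opponent's true strength is 1700, but a fresh, non-stabilised
# submission temporarily sits near our own 1400:
cost_vs_unstable = loss_penalty(1400, 1400)  # opponent looks equal
cost_vs_stable = loss_penalty(1400, 1700)    # opponent at settled rating
```

Losing to the seemingly equal opponent costs `k / 2 = 16` points, versus only about 4.8 points once the opponent has stabilised at its true rating, which is exactly why matching two non-stabilised players penalises the weaker one.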