Unstable TrueSkill remedy - fixed set of seeds and players

Currently, with the same code (in the Code Royale game), I rank anywhere from 2nd to 67th across different “Test in arena” runs, while the average is about 4x. A single run after a code change therefore tells me almost nothing; it has to be repeated several times.
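To make the point about repeated runs concrete, here is a minimal sketch of how several arena ranks could be summarized. The `summarize` helper and the ranks in the usage note are purely hypothetical illustrations, not real results and not part of any Codingame tool.

```python
import statistics

def summarize(ranks):
    """Return (mean, sample standard deviation, standard error) of a list of ranks.

    `ranks` is assumed to hold the final position from each repeated
    "Test in arena" run of the same code (at least two runs).
    """
    mean = statistics.mean(ranks)
    stdev = statistics.stdev(ranks)        # spread between individual runs
    stderr = stdev / len(ranks) ** 0.5     # uncertainty of the mean itself
    return mean, stdev, stderr
```

For example, `summarize([2, 4, 6])` describes three made-up runs. The standard error shrinks only with the square root of the number of runs, which is why one run says so little when the spread is large.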

Would it be possible to store a single “Test in arena” run, with all its players and seeds, as a reference, and run subsequent “Test in arena” runs against this reference set? (It would be a second button: “Test in arena” would remain as is, and a “Test in fixed reference arena” button would appear next to it.)

Right now I can manually set the seed and change the players, but doing so multiple times for every single code change would be cumbersome.

It would also save Codingame resources, as I suppose many users would no longer repeat “Test in arena” so many times.

What do you think of it?

Over the last few years, Codingame has significantly reduced the number of tests you can play in the IDE, because more and more players use tools like CGBenchmark to execute a series of games against fixed opponents. Sometimes it is many games against the same opponent (like 100 games against the 1st of the arena).

This is the reason why you can’t execute more than 20 tests in less than 15 minutes (or something like this, I don’t have the exact numbers in mind).

For the same reasons, you also can’t submit your code to the arena too often.

I have nothing against these limits. That’s understandable.
What I propose is something like a single manual seed for the whole “Test in arena” run, which would replay the same games against the same opponents instead of random ones. It would work like the auto and manual options in a single “Play my code”.

After making a change in my code, I would like to test it in the arena against the same set of games and players, not new random ones. The set of games and players could even be smaller than the current 100, but fixed all the time, so I can see whether my code change has improved something against this fixed set of games and opponents.
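The idea of a fixed reference set can be sketched as follows. This is only a toy model under stated assumptions: `Match`, `make_reference_set`, `play`, and `win_rate` are hypothetical names, and `play` is a stand-in that draws a random outcome instead of running a real game. It is not how Codingame or CGBenchmark work.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Match:
    opponent: str   # name of the opposing player
    seed: int       # game seed, fixed once and then reused

def make_reference_set(opponents, n_matches, master_seed):
    """Draw opponents and seeds once; the same master_seed gives the same set."""
    rng = random.Random(master_seed)
    return [Match(rng.choice(opponents), rng.randrange(2**31))
            for _ in range(n_matches)]

def play(bot_strength, match):
    """Hypothetical stand-in for one game: True when the bot wins.

    The random draw depends only on the match, so two bot versions
    face exactly the same "luck" in the same match.
    """
    rng = random.Random(f"{match.opponent}:{match.seed}")
    return rng.random() < bot_strength

def win_rate(bot_strength, reference):
    """Fraction of reference matches won by a bot of the given strength."""
    wins = sum(play(bot_strength, m) for m in reference)
    return wins / len(reference)
```

Calling `win_rate` for two versions of a bot on the same `reference` list compares them on identical matches, so the difference between the two results reflects the code change rather than a different draw of opponents and seeds.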

What do you think? Would you also find it useful? Is it possible to add such an improvement to Codingame?

I didn’t know about CGBenchmark; it looks to be what I am looking for. I will give it a try if it is still working. Thx for the info.

CGBenchmark is probably what you are looking for, yes.

If you want to test locally against another of your own codes, take a look at brutaltester: https://github.com/dreignier/cg-brutaltester

Thx, I will.