[Community Puzzle] Search Race

CommunityBot · May 21, 2020, 9:51pm

https://www.codingame.com/multiplayer/optimization/search-race

Send your feedback or ask for help here!

Created by @Illedan,validated by @Astrobytes,@struct and @Scarfield.
If you have any issues, feel free to ping them.

Jay_Jayjay · May 21, 2020, 10:42pm

Is there a way to create a custom map for testing?

Illedan · May 21, 2020, 11:08pm

No, but it is a good idea. I’ll try to code something tomorrow to add that!

TBali · May 22, 2020, 12:45pm

Hmm, this one sounds a bit familiar…
Is it exactly same physics/rules as CSB, just solo?
(Sorry, could check myself but I am as lazy as a Haskell expression evaluator…)

Illedan · May 22, 2020, 1:28pm

Yes, it is CSB, but solo. My own motivation for making this is to have a playground to work with simpler NN for CG.

darkhorse64 · May 22, 2020, 5:49pm

Contrary to CSB, the positions are truncated and not rounded to the nearest integer even if the statement says so.

Illedan · May 22, 2020, 8:49pm

I found this not possible as there is no way to enter custom input for Optimization puzzles.

JBM · May 22, 2020, 10:59pm

Please!

Some of us aren’t up to real AI and must resort to mere simulation.
We have to know: truncate or round?

Illedan · May 22, 2020, 11:08pm

Angles are rounded.
Position is truncated
Speed is truncated.

docri · June 27, 2020, 6:00pm

First I 'd like to thank @Illedan for putting this together, very much appreciated! I was looking for a pure driving simulation to work on my predictions for CSB, and this is it. Well done.

Now for a little rant. The turning physics in the game do not behave at all like described. From the description:

It will at max turn 18 degrees from where the current heading are.

And, in the expert rules:

The car rotates to face the target point, with a maximum of 18 degrees.

Furthermore, in the input specification:

angle. Heading angle in degrees between 0 and 360 for the Car.

So one might think the ‘angle’ input is what defines the ‘current heading’. Not so. There are two scenarios:

The desired turning angle is < pi/10
The desired turning angle is >= pi/10

In scenario 1, the current heading is actually defined by the relative positions of the car and the target and no rounding of angles occurs in the computation of the acceleration for the next move. Using the angle reported in the turn’s input will (occasionally) yield wrong predictions in this scenario.

That’s confusing enough, but it gets better. The reported ‘angle’ is not the ‘current heading’ in scenario 2 either! That’s because there is an intermediate rounding hidden in the engine: at the end of each turn, the game engine converts the current angle to degrees and rounds it to the nearest integer degree. Then it gets converted back to radians and mapped to [0, 2pi]. Finally, when the angle is reported to the player, it gets rounded to the nearest integer degree again. But the engine uses the angle from before this final rounding to compute the acceleration for the next turn. That’s a value player code never sees nor can easily derive from the inputs. Sure, knowing this you can keep track, but hey…

To be fair, there is this sentence with a link to the engine code on github in the documentation:

If you’re going to run local simulations, you’ll need to look at the referee.

Maybe this one sentence should replace the entire description of the game physics in the current state of affairs. I have a hunch that the behavior described above is not intended. But maybe it is – I’d like to know.

Obviously, I had too much time on my hands and some fun figuring this out. So thanks again!

Illedan · June 27, 2020, 7:19pm

Thx for the feedback, glad you had fun figuring this out
I will add a little more descriptive information in the statement!

jrke · July 20, 2020, 2:58pm

Is there any time bug cause’ i can use 37-38 ms max?
if yes,so @Illedan can you fix it?

WildSmilodon · July 24, 2020, 7:08pm

Same here, 50 ms works for test cases for me, but when I submit, I must use 30 ms in order to get 100%.

fenrir1 · July 24, 2020, 11:02pm

I think it could be cool to share a bit about how we achieve our results. At the moment it is 10771 for me.

I trained offline and my bot simply recognize the race and output precomputed actions.

To precompute these actions, I train an NN using DPG (Deterministic Policy Gradient, https://arxiv.org/abs/1509.02971) and record the best result seen for a given race while training. Without recording the best result, my end of training NN scores about 10900-11000. I trained my NN using all the races (I did not tried to learn one race at a time).

For DPG, I used 2 NN:

one for Q value that takes the states (full context, POD and all checkpoints and the action angle/thrust) and outputs a single value
one for actions that takes the states (full context, POD and all checkpoints) and outputs 2 values (angle and thrust).

Note: I used my own NN toolbox that I developed for CSB and so to compute dQ/dAction, I approximate the derivative using (q(action + delta) - q(action - delta)) / 2delta.

For me, the mains issues were:

Stable Q training. I tried the slow moving weight averaging used as a target by the original paper but it was not very good in my tests/setup. I end up using ensemble training with multiple Q NN (and associated action network) and use the Q average as the target.
Defining the right reward (and associated gamma)
The amount of exploration

DomiKo · September 28, 2020, 8:41pm

How did you precompute actions for all tests?
Or what did you do for hidden tests?

Illedan · September 29, 2020, 3:23pm

All testst are found here:

KevinDaniel-S · December 1, 2020, 12:34am

How can I change the ms?, I have troubles with a hidden test case. I have been improving my bot, but I cannot pass the test37.

WildSmilodon · December 1, 2020, 1:46pm

Dear Kevin,

usually you measure the execution time of your code yourself. In C++ I use

#pragma GCC optimize("O3,inline,omit-frame-pointer,unroll-loops")

to ensure I get O3 optimized compilation of the code. Next I mesure time, as

TIME_LIMIT = 30
std::clock_t c_start = std::clock();
std::clock_t c_end = std::clock();
float maxTime = TIME_LIMIT * CLOCKS_PER_SEC / 1000.0;

while (c_end - c_start < maxTime) {

code that tries to find what to do (Monte Carlo, or Genetic algorithm, or Neural Network)

c_end = clock();
}

The time limit in miliseconds is stored in the TIME_LIMIT variable.

Hope this helps,
Jure.

LastRick · December 1, 2020, 2:57pm

Really like the puzzle. I haven’t ventured into all this NN stuff everyone is talking about, just using some calculations to vary thrust and apply banking. But just with that, I am able to pass all tests. One small criticism though: I fail one validator (test32) on submission, giving me 98%. But since there are no replays in the submission (like some of the other games and contests) I can’t see how to tackle that missing test. Plus, the numbers in the submission don’t really match the numbering of the tests so I don’t know what an equivalent test would be to focus on. Any help here would be appreciated.

Illedan · December 1, 2020, 6:25pm

I downloaded the contribution and made a new Private one:

Here you can now see the validators too.
32 is this one: