During last contest I experienced random timeouts although my code is used to reply within 4 ms out of 50 ms. So I decided to investigate.
The first lead to explore was to check my code. The timeout seemed to happen at different locations depending on the game but for a given game, it seemed it always happened at the same location. So I printed timestamps to have an idea where it comes from. And for the first game I studied, the timeout came from a basic stream().collect() in order to append strings. This statement took 40 ms to execute which was very strange. I decided to replace it by a loop but it didn’t solve the timeout. I tried to study other games and each time the timeouts came from different basic statements.
Then I started to explore a second lead. The garbage collector. Maybe a garbage collection was triggered from time to time and suspended the player run. In Java it is possible to get the number of GC with this code: “ManagementFactory.getGarbageCollectorMXBeans().get(i).getCollectionCount()”. The result was
very fast to analyse since there were no GCs during the whole game (at least nothing reported by this method). So it seemed that the garbage collection was not the issue.
Edit: actually there were two garbage collections; see my additional comment below.
At that moment it remained only one last lead in my mind. The JIT compilation. In Java the time spent in JIT compilation is available with “ManagementFactory.getCompilationMXBean().getTotalCompilationTime()” (if supported by the JVM). I decided to print timestamps and JIT compilation time in my simulation loop (each simulation is supposed to take almost the same time).
Here is the result of the first round.
iteration / timestamp in µs / JIT compilation time in ms (from the beginning)
1 / 14952 / 228
2 / 16747 / 229
3 / 19466 / 231
4 / 19739 / 231
5 / 19960 / 231
6 / 20201 / 231
On iteration 4, 5, 6, there are no JIT compilations and each simulation lasts around 250 µs. On the other hand there is a JIT compilation of 1 ms and another one of 2 ms on iteration 2 and 3 and these simulations last 1.8 ms and 2.7 ms. So it seems that the JIT compilation really impacts the main thread accordingly. Since
Codingame’s system is single-threaded, the game should be suspended to execute the JIT compilation.
In addition here is the result of round 2 (for a different simulation loop).
0 / 569 / 193
1 / 1722 / 193
2 / 2263 / 193
3 / 2642 / 193
4 / 3064 / 193
5 / 56230 / 194
6 / 56555 / 194
7 / 56784 / 194
8 / 57017 / 194
9 / 57262 / 194
10/ 57484 / 194
On iteration 5, we can see a JIT compilation of 1 ms which impacts the game of 53 ms and causes the timeout (we can also see that the simulation lasts 400 µs before JIT and 250 µs after). This behaviour is observable on all games. Around the round 2 or 3, there is a JIT compilation which impacts the game of several milliseconds (I don’t really know why the reported JIT compilation time does not match the delay on the simulation in this case).
Edit: as mentioned in my additional comment below, this specific JIT compilation actually also matches a GC execution.
Along the full game, there are usually only four or five JIT compilations which happen during the player run. They mainly appear during the first round, then there is the “very impacted” one on round 2 or 3, then there can be one or two other JIT compilation during later rounds which could delay the main thread of 1 ms. Overall the JIT compilation takes between 1 and 2 seconds. Fortunately all other JIT compilations occur when the player is waiting for the referee input (I guess the scheduler takes this opportunity to switch the threads). By the way, funny thing: I saw a game which timed out before reading any input; the “impacted” JIT compilation occured while the player was trying to read the first input…
Now I hope we can find a solution to that problem. Personnaly I never worked on that specific topic and I’m not sure what we could do. My suggestions would be:
- increase the maximum time of the first rounds to avoid timeouts;
- use the JVM option -Xcomp (to compile everything at the first call).
I would be happy to read your comments.