Thanks! Well, I work with neural networks daily (I’m a PhD student in that field), which of course helped me a lot. I also had some experience in reinforcement learning, although that is further from my comfort zone.
Still, I am sure anybody can do it; it will simply take more time since you have to learn things along the way. This is something you could start right now, so that you are more prepared for the next challenge ^^
If I had to give some advice for starting from scratch, it would be:
- use Python, at least for the training part.
I think reCurse uses C++ (edit: actually they use Python, with some C++ to speed up things like the game simulation), but Python is much more friendly and has many libraries that will do most of the work for you: PyTorch (or TensorFlow) for the neural network part, and Gym + Stable-Baselines3 for the RL algorithms. This is what I used for this project. The only issue with Python is simulating the environment, which is quite slow.
- learn about basic neural networks: how they work, how they learn. There are plenty of introductions online, like this one.
- learn about reinforcement learning: understand at least the main concepts (agent, reward, observation, action, environment). Once again maybe try out some tutorials like here and here.
- test a past codingame challenge. I think Olymbits (Summer 2024) is a good start, since it is very easy to cast as an RL problem. And you have reCurse’s PM to help out!
- to get an RL bot into a codingame challenge, I think the roadmap looks something like this:
- reproduce the game as an RL environment (see the first sketch after this list)
- use PPO (for instance) to train an NN on this environment (second sketch below). Ideally we want self-play (the NN plays against itself), but you may start with a fixed opponent, which is simpler to plug into the framework; unfortunately self-play is not well supported by the libraries, so we have to be a bit hacky.
- export the trained NN into a codingame bot (third sketch below). This step involves compressing the weights of the NN into a (very long) string, to make it fit under the 100k character limit of codingame. Then in the final bot you decompress these weights and use the NN to process the game state and output the action to play.
- improve. There are maaaaany things to try for potential improvement: better hyperparameters, better observations, reward shaping, the neural network architecture...
- reCurse also used an MCTS to add a search on top of the NN. I think this is not too important; first focus on getting a strong NN!
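To make the roadmap a bit more concrete, here are three rough sketches, one per step. They are only illustrations under made-up assumptions: the class name MyGameEnv, the file name my_bot, and the 16-float observation / 4 actions are placeholders to replace with the actual game. First, the environment, written against the Gymnasium API (the Gym fork that recent Stable-Baselines3 versions expect; older versions use the original gym package):

```python
import numpy as np
import gymnasium as gym  # older Stable-Baselines3 versions use the original "gym" package
from gymnasium import spaces


class MyGameEnv(gym.Env):
    """Toy stand-in for the real game: a 16-float observation, 4 possible actions."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(16,), dtype=np.float32)
        self.action_space = spaces.Discrete(4)
        self.turn = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.turn = 0
        obs = self.np_random.uniform(-1, 1, 16).astype(np.float32)
        return obs, {}

    def step(self, action):
        # Replace this with one turn of your reimplementation of the game engine,
        # and compute the reward (e.g. +1 for a win, -1 for a loss at game end).
        self.turn += 1
        obs = self.np_random.uniform(-1, 1, 16).astype(np.float32)
        reward = 0.0
        terminated = self.turn >= 200  # game over
        truncated = False
        return obs, reward, terminated, truncated, {}
```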
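Training (step 2) can then be as short as this, with PPO from Stable-Baselines3 and default hyperparameters. For self-play, the usual hack is to keep a frozen copy of an earlier policy inside the environment and let it pick the opponent's moves in step():

```python
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

# MyGameEnv is the class from the previous sketch. Running several copies in
# parallel helps a lot, since the Python simulation is slow.
vec_env = make_vec_env(MyGameEnv, n_envs=8)

model = PPO("MlpPolicy", vec_env, verbose=1)
model.learn(total_timesteps=1_000_000)
model.save("my_bot")
```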
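Finally, a rough sketch of the export (step 3), assuming the SB3/PyTorch model saved above: flatten the weights, cast them to float16, compress and encode them into one big string that you paste into the bot's source. In the submitted bot you reverse the process and reimplement the forward pass yourself (for example with plain numpy). This version dumps the whole policy (value network included) and uses simple base64; to fit under the limit you can strip the parts the bot does not need and use a denser encoding, but the idea stays the same:

```python
import base64
import zlib

import numpy as np
from stable_baselines3 import PPO

model = PPO.load("my_bot")  # the model saved in the previous sketch

# Flatten every parameter tensor, halve the size with float16, compress, encode.
params = [p.detach().cpu().numpy().astype(np.float16)
          for p in model.policy.parameters()]
flat = np.concatenate([p.ravel() for p in params])
weights_str = base64.b64encode(zlib.compress(flat.tobytes(), 9)).decode("ascii")
print(len(weights_str))  # must end up under codingame's 100k character limit

# In the final bot: decode back to a flat float16 array, then slice it into the
# individual weight matrices (whose shapes you know) to run the forward pass.
raw = np.frombuffer(zlib.decompress(base64.b64decode(weights_str)), dtype=np.float16)
```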
This message is getting longer than planned, but hopefully it will motivate some of you to try it out!
Edit: @reCurse you’re the professional here, feel free to correct me if I missed something ^^’