Green Circle - Feedback & Strategies

jacek · June 27, 2022, 9:40am

I ended up around 10th in Legend. I liked this game because I could finally put machine learning to good use.

tl;dr I trained a policy neural network using deep Q-learning. The submitted code doesn’t contain any simulation; it relies only on the provided inputs and the list of available actions, and it chooses the legal action with the highest score. I treat the game as a black box and I still don’t know the rules.
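For illustration, picking the highest-scoring legal action could look roughly like the sketch below. This is not the submitted code: `pick_action`, `all_actions` and the example action strings are hypothetical, and it assumes the network outputs one score per entry of a fixed action list.

```python
from typing import Sequence


def pick_action(scores: Sequence[float],
                all_actions: Sequence[str],
                legal_actions: Sequence[str]) -> str:
    """Return the legal action with the highest network score.

    scores[i] is assumed to be the network output for all_actions[i];
    actions that are not currently legal are simply never considered.
    """
    index = {name: i for i, name in enumerate(all_actions)}
    return max(legal_actions, key=lambda a: scores[index[a]])


# Illustrative action names only; the real game has roughly 200 of them.
all_actions = ["RANDOM", "MOVE 1", "MOVE 2"]
scores = [0.1, 0.9, 0.4]            # pretend network output
legal = ["RANDOM", "MOVE 2"]        # "MOVE 1" scores best but is not legal
print(pick_action(scores, all_actions, legal))  # MOVE 2
```

Restricting the argmax to the legal actions means the bot never needs to know why an action is illegal, which fits the black-box approach.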

At first the game was overwhelming. There were many rules not mentioned in the statement, and there were some serious bugs early on, though they were quickly fixed. While they were being fixed, I decided not to bother writing a simulation for now and downloaded the referee instead. I wondered whether I could get away without any simulation in my code and just train a policy network. For the few months before the contest I had been fiddling with neuroevolution, that is, tweaking NN parameters randomly to get the desired result. I slightly modified the referee to suit my needs. I counted 198 (+RANDOM) possible actions in total (actually 8 more, because due to a bug the referee doesn’t offer the TASK_PRIORITIZATION x 8 actions). That didn’t seem like much. I tested the evolved bot against a random agent, and it soon reached a 100% winrate, so the method looked reasonable. When I submitted the code, it didn’t play too badly, since most bots at the time were random or ‘print 1st action’.
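As a rough idea of what "tweaking NN parameters randomly" can mean, here is a minimal sketch of a simple mutate-and-keep loop. The post doesn’t say which neuroevolution variant was used, so this is an assumption: `evolve`, `sigma` and the toy `fitness` function are hypothetical, and in the contest setting the fitness would be something like the winrate over games played in the modified referee.

```python
import random
from typing import Callable, List, Tuple


def evolve(fitness: Callable[[List[float]], float],
           weights: List[float],
           generations: int = 200,
           sigma: float = 0.1,
           seed: int = 0) -> Tuple[List[float], float]:
    """Hill-climb on the weights: mutate with Gaussian noise, keep if not worse."""
    rng = random.Random(seed)
    best = list(weights)
    best_fit = fitness(best)
    for _ in range(generations):
        # Perturb every parameter a little.
        child = [w + rng.gauss(0.0, sigma) for w in best]
        child_fit = fitness(child)
        # Keep the mutated parameters only if they score at least as well.
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit


# Toy stand-in for "winrate": closer to a hidden target is better.
target = [1.0, -2.0]


def fitness(w: List[float]) -> float:
    return -sum((a - b) ** 2 for a, b in zip(w, target))


best, best_fit = evolve(fitness, [0.0, 0.0])
```

Because a candidate is only accepted when it scores at least as well as the current best, the fitness never decreases from one generation to the next.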

I hadn’t had much success with deep Q-learning before, but I gave it a try anyway. At first it was worse than neuroevolution, but suddenly, after thousands of games, the winrate spiked. It reached 100% against random and kept winning more and more against the evolved bots. The week was all about tuning hyperparameters and training, with bigger and bigger models. On almost the last day I spent my time training the best model with the tuned hyperparameters. And all of it was based on the modified referee; I didn’t write any simulation of my own, and I still don’t know the game rules.
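For readers unfamiliar with deep Q-learning, the network is trained to match a one-step target of the form reward plus discounted best next score. The sketch below shows that standard target; it is not taken from the post, and `td_target`, `gamma` and the choice to take the maximum over legal next actions only are assumptions.

```python
from typing import Sequence


def td_target(reward: float,
              next_q: Sequence[float],
              next_legal: Sequence[int],
              done: bool,
              gamma: float = 0.99) -> float:
    """One-step Q-learning target: r + gamma * max over legal a' of Q(s', a')."""
    # At the end of the game there is no next state to bootstrap from.
    if done or not next_legal:
        return reward
    return reward + gamma * max(next_q[i] for i in next_legal)


# next_q holds the network's scores for every action in the next state;
# only indices 0 and 1 are legal there, so the 3.0 at index 2 is ignored.
print(td_target(1.0, [0.5, 2.0, 3.0], [0, 1], done=False, gamma=0.5))  # 2.0
```

The network’s score for the action actually played is then pushed toward this target, for example with a squared-error loss.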

It was placed at the top of the leaderboard. My bot was chosen as the Gold boss. I liked how people were commenting that it played weird or even stupid, yet it kept winning. It learned from scratch, without any expert knowledge. It feels like when AlphaZero learned chess from scratch and played at a superhuman level, making seemingly dubious moves and still winning most games. Poor souls who tried to use human logic against the boss.

I’m glad there are more contests on CG lately; I certainly enjoyed this one.