Green Circle - Feedback & Strategies

Thanks everyone for playing :rocket: 1758 players!

I had a great time playing the game and creating the game with @[SG]Jerome.

You can share your strategies and ideas in this thread, but please don’t share runnable code as the game will be published as a multiplayer soon!

9 Likes

This time, I have two post-mortems to write: one on creating the game and one on playing it.

On creating the game:
Societe Generale wanted to create a CodinGame challenge to make developers aware of Green IT. The idea was to run a challenge, and then to give the game to the community. When they asked me if I wanted to participate, I couldn’t refuse ^^. There was a game I had wanted to add to the multiplayer section for some time, but I never found the motivation to start coding it. This was a good motivation.
We had 3 months to create the game. So I started by learning Java, and after one month playing with the SDK, we had a good prototype of a working game. This game met the usual CodinGame standards, but didn’t meet my company’s requirements: it was too difficult to fit the Green IT theme onto it.
So, we shelved this game (we still hope to release it later) and started again from scratch. We now only had 2 months to code a game, and we didn’t even know which game.
It took us one week to find a game that could fit the theme (a game in which virtuous actions would help the player and bad actions would give him a lot of penalties). It would be the first deckbuilding game on CodinGame and it seemed easy to code (there were only 8 different cards). And to fit with the Green IT theme, we wanted to limit the computing time for one turn to 20ms.
It was in fact much longer and harder to code than expected. And even harder to test all situations. And we couldn’t even set the time limit that low (the minimum time limit accepted by the SDK is 50ms).
But we managed to release it on time.
Of course, we had a few bugs (I guess we didn’t perform enough tests beforehand), but I believe we were able to fix them quite fast. I hope this didn’t bother you too much and that you enjoyed this game.
If I had to do it again, I would still go for it, but I would follow the advice below.

Advice for the next challenge creators:

  • Do not wait until the last day to write the statement. And make sure to have people who do not know the game at all read it.
  • Make sure to have a lot of testers, and testers who think differently. This will help you detect more bugs.
  • If your game uses randomness, be sure to use SecureRandom.
  • Be sure to have all the game information visible in the viewer. I forgot about this and I had to fix it during the challenge.

I’m currently upgrading the game for the multiplayer release (adding SecureRandom, fixing the statement for silver/gold/legend leagues, and trying to have a better view of the cards played)
I hope you enjoyed playing this game. I know the random part did not please some of you, but a deckbuilding game is about managing your deck in order to reduce the randomness impact. So we couldn’t remove all the randomness.

On playing the game:
Since I spent so much time fixing the game, monitoring the chat/forum/discord and answering questions, I didn’t have a lot of time to code my bot.
I started by submitting the bot I used to test the wood bosses. This got me in Bronze league.
I had to wait until Tuesday to be able to code again. Due to a lack of coding time, I decided to go with a simple heuristic.
Here is the algorithm that got me into Gold:
MOVE :
Never move next to the opponent: this avoids the GIVE phase, so you’ll keep your cards
If I have BONUS cards in hand and no CONTINUOUS_INTEGRATION, I try to get to the CONTINUOUS_INTEGRATION desk.
Otherwise, I move to the next desk that is far enough from the opponent.

GIVE and THROW :
I should have only THROW phases.
If I can release an application, I keep the needed cards on the side.
Then, I throw the cards that are not needed for the release (starting with the BONUS).
If I don’t have anything to throw (in case I put all my cards on the side), I do a RANDOM.

PLAY
If I can release an application, I keep the needed cards on the side and I only play with the other cards in this order:
If I can automate a card, I do it (I prefer to automate the BONUS)
If I have a TRAINING and I have interesting cards to draw, I play it to draw cards
If I have a CODING and I have many interesting cards to play in my hand, I play it to be able to play all my cards
If I have a CODE_REVIEW, I play it to get more BONUS
If I have a REFACTORING and one debt in my hand, I play it to remove the debt
If I have an ARCHITECTURE_STUDY, I play it to draw more cards later.
And that’s all. I do not play any TASK_PRIORITIZATION or DAILY_ROUTINE

RELEASE :
I release the application that will produce the least debt (and I never release if I would get more than 2 debts)
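
The RELEASE rule above can be sketched in a few lines. This is only an illustration in Python: `choose_release` and the `debt_by_app` mapping (application id to the number of debt cards a release would add) are hypothetical names, and computing that debt from the hand is assumed to happen elsewhere.

```python
def choose_release(debt_by_app, max_debt=2):
    """Return the id of the application to release, or None to skip.

    debt_by_app: hypothetical mapping app id -> technical debt incurred by
    releasing that app with the current hand (computed elsewhere).
    """
    if not debt_by_app:
        return None
    # Pick the application that produces the least debt.
    app_id = min(debt_by_app, key=debt_by_app.get)
    # Never release when even the cheapest option costs more than max_debt.
    return app_id if debt_by_app[app_id] <= max_debt else None


if __name__ == "__main__":
    print(choose_release({3: 4, 7: 1, 9: 2}))
    print(choose_release({3: 4}))
```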

Once in Gold League, I analyzed other players’ games. I tried to use the following pattern:

  • DAILY_ROUTINE
  • Get 3 CONTINUOUS_INTEGRATION in a row and automate BONUS
  • Use TASK_PRIORITIZATION to get the right cards to release applications

But I must have a bug somewhere because it didn’t work as expected :frowning:
I’ll try to fix it (or to use a small simulation) once the game is back in the multiplayer section.

21 Likes

Language : Java
League : Legend
Rank : 39 !!!

Let me go back in time…

  • Day 1 : Reading the statement… conclusion : :neutral_face:
  • Day 2 : Reading the statement… conclusion : :sweat_smile:
    … … … Commit 1 : Basic bot → Wood 1
  • Day 3 : Reading the statement… conclusion : :joy:
  • Day 4 : Reading the statement… conclusion : :sob:
    … … … Commit 2 : Basic bot ++ → Silver
  • Day 5 : Reading the statement… conclusion : :head_bandage:
    … … … Thinking of what to code… simulation → looks very complicated and for only one turn… heuristic then
  • Day 6 : Reading the statement… conclusion : :dizzy_face:
    … … … Finally heuristic seems harder → go for one turn simulation
  • Day 7 : Reading the statement + referee… conclusion : :ok_hand:
    … … … Commit 3,4 : Simulation bot → Gold ~50
  • Day 8 : Reading the statement + referee… conclusion : :scream:
    … … … Debug…
  • Day 9 : Reading the statement + referee… conclusion : :ok_hand:
    … … … Commit 5,6 : Simulation bot + hardcode opening → Gold ~20
  • Day 10 : Reading the statement + referee… nah, everything in mind now !
    … … … Commit 7,8,9,10 : Simulation bot + hardcode opening + special case
    … … … → Legend ~10 (with a jump to 6 ! ok they were all trying some stuff…!)
  • Day 11 : … :zzz:
  • Day 12 : Present day, waiting result !

Debriefing time …

Now, I can say that it was really fun and entertaining. Had you asked me 2 days earlier, I would probably have answered differently !
I do not play board games often (I mean as a human lol) but I do like playing them as a coder ! And this one was a really interesting one, even with all those rules and randomness.
The multiple “CG” turns for one “game” turn were unusual (first time I have seen that) and tedious to handle correctly.

The game
In fact, pretty cool, but hard to dive into. I definitely love the remastering, really really nice work :clap:.
Maybe the CONTINUOUS_INTEGRATION card was too powerful, but I guess another strategy would have been used otherwise, so no problem for me… The winning strategy was exactly like in real life: automate everything to compensate for all that technical debt and cross your fingers that all tests will be green :stuck_out_tongue:

My bot
It’s a mix of simulation and heuristics. Here are some technical/strategy details of my implementation :

  • My simulation is a one “game” turn BFS that can start from (almost !) any phase and runs to the release phase. I wrote it like that with one thing in mind : not having to pick the correct output from a list where some “CG” turns can be skipped ! No need to dissect the referee, and it made me more robust against a referee update.
  • Later I found another benefit of that implementation : I could introduce a hardcoded, heuristic-based move at any point. The next “CG” turn would run normally with the simulation again.
  • Some hardcoded openings : I chose MOVE 2 for player 1 and MOVE 5 or 2 for player 2. The objective was to maximize CONTINUOUS_INTEGRATION cards in the early game.
  • I decided not to introduce statistics for CODING and TRAINING cards in the simulation, so I included them only when the outcome was certain (I guess it happened <1% of the time !)
  • To compensate, because those cards are very powerful too, I play them with some heuristics that override the simulation.
  • My evaluation function is based on the outcome of a “game” turn (position, automated cards, debt, etc…)… It was my third, but really the first where I put in a lot of effort. Wow… how hard it is to express what is good or bad. I wish I could put myself in it, that would simplify a lot of things ! Fun fact (or not) : I used an int, and my scoring computes a negative value 80% of the time :joy: !

What could have been done :
Find the energy to compete in the final, but I felt really tiny with all those big names around !

Some stats, because it’s always interesting :

  • 10 arena
  • ~50 play in ide
  • ~100 replay analysis
  • 0 play limit reached
  • 0 local play (i should give it a try to tweak parameters)
  • 1K lines of Java code !
  • ~100 ifs (those for debug included)
  • 2~3h/day ~20% coding !

While writing those stats, and given the theme of the contest, I’m kinda curious for more. It could be interesting to have more information in the contest global/personal summary. I have in mind :

  • nb match : total, min, max, avg/player
  • nb arena : total, min, max, avg/player
  • nb play ide : total, min, max, avg/player
  • nb replay watched : …
  • nb lines of code: …
  • size of code: …

Some questions for those who have the answers

  • Is there a different scoring for a 2-player match when winning as player 2 ? I guess not, because it’s not necessarily a disadvantage…
  • How does the rerun work ? I saw that the last 500 matches are kept. If I lost those 500 games against the same player who beats me all the time (and who likes the arena button !), will the final result be based on ~200 defeats + rerun matches ? I got the answer : the rerun is more than 500 games !
  • When I hit the “send parameters to ide” button, do I play against the version of the player from that replay ? Same question when I choose a player in the ide and come back 2 hours later ?

Many thanks

  • [SG]Jerome and [SG]Sebastien for the really good work, tracking, quick bug fixes and improvement. When is the next one ?
  • CG for your insane platform, chat’s life extension during the contest, your (intentional, I’m sure) joke of a gold opening and legendary (this is the right word) boss… it was really fun !
  • All players that made the challenge, a challenge !

See you in Fall :wink:

14 Likes

Thank you CG for providing to us this completely unexpected contest. It was fun (most of the time). Thanks especially to @_SG_Sebastien.

I spent a lot of time on this contest. I was doing well up until legend opened (I promoted an hour or so after). Unfortunately after this, I tried many things, but everything failed. So my last 2 days were mostly wasteful in terms of the end result. I made it to 43rd legend.

My best bot was a full search of (only) my turn and was very efficient. Nearly all turns could be fully searched using probability calculus and weighted evaluation. This included situations where you would play 6 cards in a row for example (leading to 200k evaluations that were all weighted). In the rare case where I could not make it, I was saved by the fact that my search used iterative deepening. I started with allowing 1 card played, then re-searched with 2 in a row, then 3 in a row and it stopped when fully searched.

This snippet shows how you can calculate all possible hands that can be drawn when drawing X cards from an arbitrary deck with a probability calculated for each set.
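
A minimal sketch of that kind of calculation (not the original snippet), assuming the deck is given as a count per card type and that no reshuffle of the discard pile is needed during the draw. The name `possible_draws` is hypothetical. Each distinct hand gets the multivariate hypergeometric probability: the product of "ways to pick n of this type" divided by "ways to pick k cards from the deck".

```python
from fractions import Fraction
from math import comb


def possible_draws(deck, k):
    """List (hand, probability) for drawing k cards without replacement.

    deck: dict card_type -> number of copies in the draw pile.
    hand: dict card_type -> number of copies drawn (zero counts omitted).
    """
    types = sorted(deck)
    total = sum(deck.values())
    if not 0 <= k <= total:
        raise ValueError("cannot draw more cards than the deck holds")
    denom = comb(total, k)
    results = []

    def rec(i, remaining, hand, ways):
        if i == len(types):
            if remaining == 0:
                results.append((dict(hand), Fraction(ways, denom)))
            return
        t = types[i]
        for n in range(min(deck[t], remaining) + 1):
            if n:
                hand[t] = n
            # comb(deck[t], n) = number of ways to choose n copies of type t
            rec(i + 1, remaining - n, hand, ways * comb(deck[t], n))
            hand.pop(t, None)

    rec(0, k, {}, 1)
    return results


if __name__ == "__main__":
    deck = {"BONUS": 2, "TRAINING": 1, "TECHNICAL_DEBT": 1}
    for hand, prob in possible_draws(deck, 2):
        print(hand, prob)
```

Using exact fractions keeps the probabilities summing to exactly 1; a bot under a time limit would more likely use floats and precomputed tables.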

I have several functions like these to make the full search fast. Because the search was so fast, I could also try to do a second full turn. I tried this with opponent position = -1 and also with the opponent position as a worst case for me. None worked for a second turn unfortunately.

I also had a search for when the score is 4. It picks the move that has the maximum probability of getting your score to 5. It outputs a %chance. It basically used the same sim, only replaced eval with 1 or 0 .

On Saturday and through half of Sunday I coded an MC bot that I planned to upgrade to MCTS on my turn and MC for the rest (till end of game). I could not get this to work better than my first bot (I did get the sim perfect).

In the end I basically wrote 2 complete bots for this contest that are very different, but only the one I submitted on Friday was good.

My main problems were these:

  • Understanding the game rules took me 3-4 days I think. Even after writing my sim I had to change it several times. I can blame it on the statement, but that’s not entirely correct. It is also just a complicated game that seemed counterintuitive to me. I don’t intend this as criticism, but that’s just the way I experienced it.

  • My first (and best) bot used eval. I made it fast and efficient, but I had serious problems with the eval. At first I kept adding parameters (up to 25) and later shrank it down to 10. I hit a dead end with this on Friday. My main problem was not having any good intuition about what is a good play in this game.

  • My second bot used MC till end of game, which worked fine. It beat weak bots, but I needed to use heuristics to make the rollout better. In this case I ran into the same problem I had with my first bot, namely that I need to understand the game to do so. I could sort of see some moves were good or bad, but it’s hard to generalize. After trying some heuristics and failing, I abandoned this effort.

Still though, getting into legend with what is basically a brute force probability based bot seems like an achievement to me :grinning:

See you next contest!

20 Likes

I ended up around ~10 in legend. I liked this game because I could finally put machine learning to good use.

tl;dr I trained a policy neural network using deep Q-learning. The submitted code doesn’t have any simulation; it relies on the provided inputs and available actions. It chooses the legal action with the highest score. I treat the game as a black box and I still don’t know the rules :wink:.

At first the game was overwhelming. There were many rules not mentioned in the statement and there were some serious bugs at first, though they were quickly fixed. While they were getting fixed, I thought I wouldn’t bother with writing a simulation for now and downloaded the referee. I wondered if I could get away without any simulation in the code and train a policy network instead. For the few months before the contest I had been fiddling with neuroevolution, that is, tweaking NN parameters randomly to get the desired result. I slightly modified the referee to my needs. I calculated there are 198 (+RANDOM) possible actions in total (actually 8 more, because due to a bug the referee doesn’t give the TASK_PRIORITIZATION x 8 actions). It didn’t seem like much. I tested the evolved bot against a random agent. Soon it achieved a 100% winrate against it. So the method looked reasonable. When I submitted the code, it didn’t play too badly; most bots at the time were random or ‘print 1st action’.

I didn’t have much success with deep Q-learning before, but I gave it a try anyway. At first it was worse than neuroevolution, but suddenly, after thousands of games, the winrate spiked. It achieved 100% against random and was getting more and more wins against the evolved bots. The week was all about tuning hyperparameters and training, making bigger and bigger models. On almost the last day I spent time training the best model using the tuned hyperparameters. And it was all based on the modified referee; I didn’t write any simulation of my own, and I still don’t know the game rules :upside_down_face:.

It was placed at the top of the leaderboard. My bot was chosen as the gold boss :sweat_smile:. I liked how people were commenting that it was playing weird or even stupid, yet it kept winning. It learned from scratch without any expert knowledge. It feels like when AlphaZero learned chess from scratch and played at a superhuman level, making seemingly dubious moves just to win most games. Poor souls who tried to use human logic against the boss.

I’m glad there are more contests on CG lately, I surely enjoyed this one.

40 Likes

Wow. That is just mindblowing to me. Congratulations on making this work! Are you willing to give an estimate of the number of weights that were used in the final model?

My model is a simple MLP: 125 inputs, one hidden layer (32 nodes at first, 256 nodes in the final version) and 198 outputs. So 125 * 256 + 256 * 198, around 82k weights.
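
The arithmetic can be checked with a tiny helper. This is just an illustrative sketch; `mlp_weight_count` is a hypothetical name, and whether the actual model had bias terms is not stated, so biases are optional here.

```python
def mlp_weight_count(layer_sizes, include_biases=False):
    """Count the weights of a fully connected network.

    layer_sizes: e.g. [125, 256, 198] for inputs, hidden layer, outputs.
    include_biases: assumption, adds one bias per non-input node.
    """
    pairs = list(zip(layer_sizes, layer_sizes[1:]))
    count = sum(n_in * n_out for n_in, n_out in pairs)
    if include_biases:
        count += sum(n_out for _, n_out in pairs)
    return count


if __name__ == "__main__":
    print(mlp_weight_count([125, 256, 198]))  # final model: 32000 + 50688
    print(mlp_weight_count([125, 32, 198]))   # first, smaller model
```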

While I’m at it, the Q-learning hyperparameters:

  • gamma 0.992
  • learning rate 0.001 + 0.8 momentum (simple SGD, not RMSprop or Adam)
  • replay buffer of size 2 million
  • learn 32 positions after every round
  • update target network every 2048 rounds
  • epsilon 0.1; sometimes playing around with parameter noise injection for exploration
  • reward only at the end: 1 for a win, -1 for a loss
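
To show where gamma and epsilon enter, here is a rough sketch of the two standard deep Q-learning pieces they control: epsilon-greedy action selection and the bootstrapped training target. It is not the poster's code. The function names are hypothetical, Q-values are plain lists indexed by action id, and restricting the max to the legal actions of the next state is an assumption.

```python
import random

GAMMA = 0.992    # discount factor from the list above
EPSILON = 0.1    # exploration rate from the list above


def select_action(q_values, legal, rng, epsilon=EPSILON):
    """Epsilon-greedy choice restricted to the legal action ids."""
    if rng.random() < epsilon:
        return rng.choice(legal)
    return max(legal, key=lambda a: q_values[a])


def td_target(reward, done, next_q_target, next_legal, gamma=GAMMA):
    """Target for Q(s, a): reward + gamma * max over a' of Q_target(s', a').

    With a reward given only at the end of the game, `reward` is 0 on
    every non-terminal step and +1 / -1 on the last one.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_target[a] for a in next_legal)


if __name__ == "__main__":
    rng = random.Random(0)
    print(select_action([0.1, 0.9, 0.5], [0, 2], rng, epsilon=0.0))
    print(td_target(0.0, False, [0.2, 0.5, 0.9], [0, 1]))
```

Here `next_q_target` stands for the output of the periodically updated target network, which is what the "update target network every 2048 rounds" setting refers to.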
11 Likes

Alright, thank you very much :slight_smile:

Thanks to CG, sponsors and @_SG_Sebastien for this sudden contest :slightly_smiling_face:

I don’t have much to share: since I didn’t have much time for this contest, I ended up somewhere in mid gold with a heuristic Python bot. I wish the game had been announced a bit earlier, but I believe CG had their own reasons to do it the way they did. It was still fun to hang out in chat and watch the meta change. I really hope that these smaller contests will keep appearing from time to time on CG.

8 Likes

Legend, 4th

My strategy was to simulate my whole turn (all my actions until I discard my hand) and pick the best first action. I did not simulate the opponent.

At the beginning of the contest, I converted the Java referee to C++. This was a bit more work than usual since there were many rules.
Many things could go wrong here, so to be sure I checked that a local game would produce the same outcome as CG IDE with a deterministic AI and a fixed seed.

I started with a Monte Carlo search and a manually tuned evaluation function (score, debt, automated bonuses, etc). When I had to draw cards I would simply ignore it.

Then I switched to an iterative deepening DFS.
For each possible action, the tree is expanded with the resulting state.
When it has to draw cards, it would compute all the possible permutations with repetitions and the associated probabilities.
To simplify, if the player does not draw, I would assume it draws an empty set of cards with a probability of 1.
The tree was an alternation of “actions” Nodes (where it would pick the max child score) and “drawing probabilities” Nodes (where it would compute the sum of proba * child score).
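
The alternation of max nodes and probability-weighted nodes described above can be sketched as follows. This is an illustrative toy in Python, not the actual C++ bot: the node classes and the hand-written leaf scores are hypothetical stand-ins for real game states and the neural network evaluation.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    score: float      # stand-in for the evaluation of an end-of-turn state


@dataclass
class ActionNode:
    children: dict    # action string -> child node


@dataclass
class ChanceNode:
    outcomes: list    # list of (probability, child node), probabilities sum to 1


def evaluate(node):
    """Max over actions, probability-weighted sum over card draws."""
    if isinstance(node, Leaf):
        return node.score
    if isinstance(node, ChanceNode):
        return sum(p * evaluate(child) for p, child in node.outcomes)
    return max(evaluate(child) for child in node.children.values())


def best_action(root):
    """Pick the first action of the best line found from the root."""
    return max(root.children, key=lambda a: evaluate(root.children[a]))


if __name__ == "__main__":
    root = ActionNode({
        "TRAINING": ChanceNode([(0.5, Leaf(8.0)), (0.5, Leaf(0.0))]),
        "WAIT": ChanceNode([(1.0, Leaf(3.0))]),  # no draw: one outcome, p = 1
    })
    print(best_action(root), evaluate(root))
```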

To score the leaf nodes, I used a neural network. Since I was not sure what features were interesting or not, I ended up putting as much as I could.
The training was done by self playing. To keep some diversity the referee forced some random opening moves.

Thank you to SG and CG for the contest!
See you soon in the multi :slight_smile:

24 Likes

Thanks for sharing, jacek1. I would like to implement your AI for training purposes :slight_smile:

2 Likes

BTW, the proper name for that function is “Falling Factorial” !

Thanks for sharing the information !

#51 Legend
Thank you @_SG_Sebastien, @_SG_Jerome and Societe Generale for the new Contest. It is very cool to have more contests throughout the year.

My bot is a full heuristic bot, checking only my turn:

  • For the first 5 turns, I have an opening book for the move, keyed on my position, the opponent position and my cards. It is a very large opening book
  • For the play phase in the first 5 turns, I have a precedence graph: CI 8 first, then Training, Coding, Code Review, etc.
  • For the following turns, the main objective is to release an application, finding the closest move that will release one.

Things that made me reach the legend:

  • Play all cards even if I am going to release an app
  • If I can release more than one app, choose the app that will most disturb the opponent’s 5th app strategy and help my own 5th app
  • Release an app every turn after turn 5, even if it costs 4 TD, especially if I’m player 2
13 Likes

k4ng0u 54th in Legend - C++

When I first read the statement, I was not sure whether to start this contest, as the statement seemed very complex and I didn’t have much time to spend on it. But once you get the hang of the game, it’s actually quite fun to code!

Wood to Silver

Getting out of wood was unusually difficult. While I normally just hack into the input loop, here I had to straight out code my game state structure and store all the entities inside. Then I had some heuristics based on the current state to choose my action:

  • getBestMove: pick a location that would be most useful for the remaining apps
  • getBestCardToPlay: if no application can be released with less than 2 shoddy skills, eval each action in a hardcoded order; the first that matches an eval criterion is chosen. Action order: training/coding, continuous integration, architecture study, refactoring, code review
  • getBestCardToGive (or throw from Bronze): choose the cards that are least useful for the remaining apps

Silver to Gold

With this and some magic number tweaking, my bot got to Silver. Then I noticed that it was really slow at releasing 5 apps and sometimes didn’t even manage to do so after 200 turns. This was because the 5th app can only be released without shoddy skills… (Maybe this information could be moved either to the release phase explanation or next to the release action, rather than the end game section.) So I realised that my approach to the game was wrong from the start: the goal was not to have a lot of skills and low technical debt, but rather to have the right skills to release the 5th application as fast as possible.

Like many, I opted for the (suboptimal) systematic 4 automated BONUS strategy by encouraging my bot to take one of the following starting paths:

  • ideal path: DAILY_ROUTINE => TASK_PRIO ARCHITECTURE_STUDY (only if opponent is on CI) => ARCHITECTURE_STUDY CI => CI => CODE_REVIEW CI => REFACTORING
  • secondary path (if opponent is on DAILY_ROUTINE): CI BONUS => CODE_REVIEW => REFACTORING

The eval would be something like:

  • Gigantic score for DAILY_ROUTINE permanent skill at turn 0
  • Gigantic score - 1 if I could draw a CI with a bonus in hand or vice versa
  • Small bonus score for DAILY_ROUTINE after turn 0, and same score -1 for ARCHITECTURE_STUDY
  • Some other scores for technical debts, applications affinity, and whether I am going to end up close to the opponent

Gold to Legend

When I reached the top 50 in gold league, it was clear that this kind of per-action scoring wouldn’t scale much further, as my if forests were growing out of control and my bot seemed a bit too shortsighted to anticipate good actions.

So I went down the road of simulation to brute-force the possible actions in one of my turns (MOVE => THROW => GIVE => PLAY_CARD * X => RELEASE). The main takeaways are:

  • When playing a MOVE, the card should be added to the hand only at the end of the THROW/GIVE phases (for a while my bot would throw/give cards it hadn’t drawn yet…)
  • Simulating random draws was not working well, so when I need to draw cards, I only do so if the number of cards in the draw pile is less than or equal to the number of cards I need to draw, and I stop drawing when the draw pile is empty
  • TASK_PRIO is only available if changing cards can lead to a perfect release (this was to reduce the branching factor but in the end, I think I just never played TASK_PRIO)
  • RELEASE is only available if automated bonuses > 2 or a max of 2 shoddy skills is required
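
Two of these rules are simple enough to sketch directly. The Python below is a hedged illustration (the bot itself was in C++), with hypothetical function names and cards represented as plain strings.

```python
def simulate_draw(draw_pile, cards_to_draw):
    """Draw only when the outcome is certain.

    If the pile holds no more cards than we need, we know exactly which
    cards we get (all of them), and drawing stops once the pile is empty.
    Otherwise the draw is skipped in the simulation.
    Returns (drawn_cards, remaining_draw_pile).
    """
    if len(draw_pile) <= cards_to_draw:
        return list(draw_pile), []
    return [], list(draw_pile)


def release_allowed(automated_bonuses, shoddy_needed):
    """Pruning rule: keep RELEASE only with > 2 automated bonuses
    or when at most 2 shoddy skills are required."""
    return automated_bonuses > 2 or shoddy_needed <= 2


if __name__ == "__main__":
    print(simulate_draw(["BONUS", "CODING"], 3))
    print(simulate_draw(["BONUS", "CODING", "TRAINING"], 2))
    print(release_allowed(0, 3), release_allowed(3, 5))
```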

That, an eval (quite similar to my silver one but with my end turn state) and some lucky games sent my bot to Legend! If I was to spend more time on the multi, I would probably try to drop the 4 automated bonus strat to favour automating some other interesting skills.

Thanks a lot @_SG_Sebastien, @_SG_Jerome and CG for this surprise contest!

10 Likes

Top 40 in Legend at the end.

The game was not the most fun for me in terms of gameplay but the community around contests and chat interaction is always fun.

Beginning

I knew I would not be able to do any coding on day 2 and day 3 of the contest, but I didn’t want to start from scratch in the middle of it, so on the evening the game was released I decided to make the bot as strong as possible in the shortest amount of time. It was first in bronze for at least a couple of hours, abusing continuous integration of bonuses, using all skills whenever possible and prioritizing low debt.

Search

At some point the game surprised me quite a bit with its depth: the first bots that finished games quickly without any care about debt changed my perspective on the best strategies.
I was pretty sure pure heuristics would be quite painful long term, so I decided to go for a 1 turn search. Guys in the chat made me scared of the number of states if you account for draw skills (I code in Java, not C++, and the language speed gap on CodinGame is quite big). I decided to skip that and just search fully without calculating draw probabilities (I simulated a draw only if I had more drawing power than cards in the draw pile). Obviously that misses a lot of good moves, like moving onto training and drawing into bonus continuous integration, but it was more than enough to find good moves in most situations. I restarted my search at every phase (MOVE, GIVE, THROW etc.), so even if I had a fully planned turn it could change after using training, for example.
I had 3 different eval functions:

  • one for finishing applications (find all applications you can finish and score them this way)
  • one for standard play if I had less than 4 applications
  • one for the end game with 4 applications trying to finish the last one

In a way splitting it like this probably made it easier to eval, but since you can finish applications quickly, the actual end game should probably be considered earlier.
The game felt quite random so pushing through top gold was not an easy task (it’s not even the boss itself but winning consistently against other top players).

Thank you for organizing the contest and fixing all the bugs, I enjoyed it a lot!

10 Likes

First of all, thanks to CG and SG for providing this surprise contest! :slight_smile:
I’m a big fan of deck builders in general, so I was instantly hooked.

Not sure if anyone is interested in what I did, but I’ll share it anyway as it can serve as a “what-not-to-do” :smiley:

I finished somewhere in the top 100 in gold and came close to legend a couple of times, but the @jacek1 boss was too strong for my bot. I feel like I used an overly complicated approach with some serious issues for this game, which probably prevented me from reaching legend.

Approach

1. UCB algorithm with a very lightweight (i.e. roughly 1000 weights) MLP NN trained by reinforcement learning from self-play games. It outputs a single value through a tanh. No policy network.

2. Only 8 inputs to the NN: (score, #automated bonuses, ratio of TechnicalDebt in deck, and avg probability of drawing a card combination from the whole deck that allows releasing an app w/o TechnicalDebt) for both players

3. NN architecture: 8 inputs → fully connected layer of 8x20 → leakyReLU → fully connected layer of 20x20 → leakyReLU → fully connected layer of 20x1 → tanh

4. The avg probabilities are computed from binomial coefficients, which are precomputed. The possible hand draws needed for this are also precomputed.

5. The NN was trained for roughly 400k games. Only states at move phases (i.e. after possible RELEASEs) are used for learning.

6. The search considers both players, but since the exact state of opponent cards (i.e. where they are, discard/played/draw/hand) is unknown, the opponent always has the full deck as draw pile at the start of the search. Then, at the beginning of each tree traversal, the draw pile is shuffled and 4+architectureStudy cards are drawn. For my own draw pile I also shuffle it at the beginning of the traversal (from the root).

7. This means the nodes of my tree are actions, which correspond to different states. After the search the mean value of each node should represent an approximation to the win % when selecting that node.

8. Since the possible actions vary when arriving at a node, I need to check if a certain action is possible, so the number of visits of a node is correlated to the probability of a certain action being possible. This essentially inclines my bot not to take unlikely sequences of actions, thus hindering its ability to high-roll.

9. The tree is expanded and children are selected using the UCB formula until a non-expanded move phase is encountered. The state is then evaluated and the value backpropagated as usual. This forces the tree to become relatively deep relatively quickly, which is not a good idea as it will miss important sequences on the first turn.

10. There is very little pruning of possible actions, which also makes it hard for the bot to find good moves.
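
For readers unfamiliar with the selection step in point 9, here is a minimal sketch assuming the standard UCB1 formula. The post does not say which variant or exploration constant was used, so `c` and the function names are assumptions, and each child is reduced to a `(value_sum, visits)` pair.

```python
import math


def ucb_score(value_sum, visits, parent_visits, c=1.0):
    """UCB1: mean value plus an exploration bonus for rarely visited nodes."""
    if visits == 0:
        return math.inf  # always try unvisited children first
    mean = value_sum / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, c=1.0):
    """children: list of (value_sum, visits) pairs. Returns the chosen index.

    The parent visit count is taken as the sum of child visits, which is
    an assumption made to keep the sketch self-contained.
    """
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: ucb_score(children[i][0], children[i][1], parent_visits, c),
    )


if __name__ == "__main__":
    stats = [(6.0, 10), (0.3, 1)]
    print(select_child(stats), select_child(stats, c=0.0))
```

With `c=0` the selection is purely greedy on the mean value; larger values of `c` push visits toward less explored actions.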

Issues and last-moment ‘improvements’

1. The bot played the opening badly, so I hardcoded the first 2 turns of each player, which was only a small improvement.

2. The computation of avg probabilities mentioned above vastly underestimates the actual probability of releasing, since typically the players don’t draw from the whole deck but only from a subset, and after drawing there is the option to MOVE and pick up a card that is still missing for the release.

3. Therefore, I tried to track the opponent’s cards (numberOfCards in draw and cards that are definitely in discard) based on the difference of states before and after the opponent’s turn. This didn’t help as TRAINING and CODING really mess with it.

Other tries

  • The bot that got me to gold is a vanilla MCTS with the same expansion strategy (of the tree) and completely random rollouts. I was very surprised that it could beat the silver boss, despite there being only 2k rollouts per phase.

I would like to thank all the people with whom I had fruitful conversations in chat! :slight_smile:
Sorry for the long post!

12 Likes

I think you could have tried my way with Java. On 95% of turns my calc time was below 5 ms, often just 0 or 1 ms. I was not really speed limited. But you couldn’t know that beforehand. It was nice for brutaltesting to have a fast bot though.

#53 in Legend. Lost ~15 spots on a last-minute bad submit :laughing:, but still very content with the result, as I had never reached legend nor even top 400 in a contest.

I took a similar approach to Royale:

  • Exhaustive search of all possible moves until the turn switches to the opponent
  • NN for state evaluation, simple MLP with 2 hidden layers
  • For random draw actions, I calculate all possible outcomes and weight their evaluation with their probability.
  • I used an iterative deepening approach for restricting the number of play card moves that can be chained, to stay within the time limits
  • No real optimization, just ported the offline referee I wrote in C++ to the bot. Not sure if optimization was very important for this approach
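
The iterative deepening on the number of chained play-card moves can be sketched like this. It is a rough Python illustration, not the actual bot: `search` is a hypothetical callback that explores the turn with at most `limit` chained cards and returns `(best_move, fully_searched)`, or `None` if it ran out of time.

```python
import time


def iterative_deepening(search, max_chain, budget_s):
    """Re-run `search` with a growing chain limit until the turn is fully
    searched or the time budget is exhausted.

    Returns the best move of the last limit that finished in time.
    """
    deadline = time.perf_counter() + budget_s
    best = None
    for limit in range(1, max_chain + 1):
        result = search(limit, deadline)
        if result is None:   # aborted: keep the answer from the previous limit
            break
        best, fully_searched = result
        if fully_searched:   # nothing more to gain from a larger limit
            break
    return best


def toy_search(limit, deadline):
    """Stand-in search: pretends the turn is fully searched at 3 cards."""
    if time.perf_counter() > deadline:
        return None
    return f"best plan with <= {limit} cards", limit >= 3


if __name__ == "__main__":
    print(iterative_deepening(toy_search, max_chain=6, budget_s=0.04))
```

The point of the design is that a complete answer for a smaller limit is always available when a deeper search has to be cut short.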
12 Likes

This contest made me feel intellectually challenged :smiley:
I failed hard lol, no PM, just wanted to say thanks to @_SG_Sebastien for a tough game and chat for being awesome as always. Also massive props to @jacek1 for getting that DQN approach to work, just awesome!

9 Likes

Rank: mid-low gold, C#

First of all, huge thanks for setting up and hosting this contest. I found it difficult to get a good grasp of the game (I had ignored Task Prioritization for the longest time and I didn’t realize the value of Continuous Integration until I started watching a lot of high-ranking replays) but still had a lot of fun in chat and competing.

I didn’t do anything novel. I started with a full heuristic approach. This barely got me into gold, and that’s when I decided to rewrite. Main ideas that led to increased playing strength:

  • Simulating current turn and selecting the move that leads to a release: I’m sure it’s still buggy, but I got a huge increase in rank once I fixed a bug that granted the card before giving/throwing
  • Simulating next turn by probabilistically generating potential hands based on the draw pile: if any of the resulting states was a position where I could select a move that leads to a release, I would score that; in the end I chose the move that had the highest probability of leading to a release
  • Simple opening book: first 3 moves going after Task Prioritization or Continuous Integration based on play order and opponent’s moves
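
The second idea can be approximated by sampling hands from the draw pile and counting how often a release is possible. This is a hedged sketch in Python rather than the original C#: `release_probability` is a hypothetical name, `can_release` is an assumed predicate supplied by the caller, and reshuffling the discard pile when the draw pile runs short is ignored.

```python
import random


def release_probability(draw_pile, hand_size, can_release, samples=1000, rng=None):
    """Monte Carlo estimate of the chance that next turn's hand allows a release.

    draw_pile: list of card names; hand_size: cards drawn next turn;
    can_release: caller-supplied predicate taking a list of cards.
    """
    rng = rng or random.Random()
    n = min(hand_size, len(draw_pile))
    hits = sum(1 for _ in range(samples) if can_release(rng.sample(draw_pile, n)))
    return hits / samples


if __name__ == "__main__":
    pile = ["BONUS", "CODING", "TRAINING", "TECHNICAL_DEBT", "REFACTORING", "BONUS"]
    has_two_bonus = lambda hand: hand.count("BONUS") >= 2
    print(release_probability(pile, 4, has_two_bonus, rng=random.Random(42)))
```

Running this once per candidate move and keeping the move with the highest estimate mirrors the selection described above.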

In the end I didn’t manage to sort out the bugs in my sim, which is probably what led me to end up in the middle of the gold blob. I also could’ve looked at improving the eval, as it really only factored in releasing.

9 Likes