Fall Challenge 2020 - Feedbacks & Strategies

I ended up mid gold just using random game playouts with both players' actions for 20 moves ahead. The score for a playout was r0 / (r0 + r1), where r0 is my reward and r1 is the opponent's reward. At the end I would take the action with the highest mean score. I didn't include learning actions in the rollouts. I was getting around 6k-10k rollouts and about 150k-200k node states per turn.
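
In code, the selection loop looks roughly like this (a minimal sketch with a hypothetical `GameState` interface and placeholder rules; not the actual bot):

#include <random>
#include <vector>

// Hypothetical game interface, only to show the shape of the selection loop
// (the real simulation would contain the full game rules).
struct GameState {
    std::vector<int> legalActions(int /*player*/) const { return {0, 1, 2}; }
    void apply(int /*myAction*/, int /*oppAction*/) {}
    bool finished() const { return false; }
    double reward(int /*player*/) const { return 1.0; }
};

// Flat Monte Carlo: play both players randomly for `depth` turns, score each
// playout as r0 / (r0 + r1), and pick the first action with the best mean.
int pickAction(const GameState& root, int rollouts, int depth, std::mt19937& rng) {
    const std::vector<int> myActions = root.legalActions(0);
    std::vector<double> sum(myActions.size(), 0.0);
    std::vector<int> count(myActions.size(), 0);

    for (int r = 0; r < rollouts; ++r) {
        std::size_t first = rng() % myActions.size();
        GameState s = root;
        for (int d = 0; d < depth && !s.finished(); ++d) {
            std::vector<int> mine = s.legalActions(0), theirs = s.legalActions(1);
            int a = (d == 0) ? myActions[first] : mine[rng() % mine.size()];
            s.apply(a, theirs[rng() % theirs.size()]);
        }
        double r0 = s.reward(0), r1 = s.reward(1);
        sum[first] += (r0 + r1 > 0) ? r0 / (r0 + r1) : 0.5;
        count[first] += 1;
    }

    std::size_t best = 0;
    double bestMean = -1.0;
    for (std::size_t i = 0; i < myActions.size(); ++i) {
        if (count[i] == 0) continue;
        double mean = sum[i] / count[i];
        if (mean > bestMean) { bestMean = mean; best = i; }
    }
    return myActions[best];
}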

I tried changing the rollout depth to full games but couldn’t get a good sense of whether it improved or not. Seemed like it improved things against better bots but worsened against weaker bots.

Looking back I would move to single player rollouts, with a short opponent rollout to catch traps, and probably add MCTS for single player.

I just learned the first 8 spells. I experimented a lot with other methods but couldn't outperform this one. I fiddled with adding 1 learn action to the start of rollouts, but couldn't accurately measure how much it changed the strength. I also fiddled around with some drafting heuristics at the beginning, but couldn't get anything convincing by the end.

2 Likes

7th place

Thanks for this contest.

My code is a beam search with iterative deepening.

I use a beam size of 1700 nodes. To eval a node I use the following:

I eval the current backpack with the following formula:

1 point for each score point
1 point for each tier 0 ingredient in my inventory
2 points for each tier 1 ingredient in my inventory
3 points for each tier 2 ingredient in my inventory
4 points for each tier 3 ingredient in my inventory
0.01 point for each castable spell

and I add points for potions and spells:

brewing a potion at depth d gives:
(gamma^d) * (potion price)

taking a spell at depth d gives:
(gamma^d) * (gammaSpell)^spellsCnt * (1 - tomeIndex/3 + taxCount/6)
where spellsCnt is the number of spells I have taken from the tome

I tried many values of gamma and gammaSpell, and these worked best for me:
gamma = 0.98, gammaSpell = 0.6

In my spell formula, taxCount matters less because the opponent can take your tax.
That eval becomes so small after the 6th spell that the spell bonus won't affect my choice, and I will only take a spell if I will use it to brew a potion.

My eval only differs once I have brewed my 6th potion.
Then I check whether I would have more potions than the opponent; if yes, my score is:
BIG_NUMBER + my_score
and I don't search any deeper.
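
A rough sketch of how an eval like this could be assembled (my reconstruction from the description above, with hypothetical field names; not Mazelcop's actual code):

#include <cmath>

constexpr double GAMMA = 0.98;
constexpr double GAMMA_SPELL = 0.6;

// Bonus awarded when a potion is brewed at depth d of the search.
double brewBonus(int d, int price) {
    return std::pow(GAMMA, d) * price;
}

// Bonus awarded when a spell is taken at depth d, where spellsCnt is the number
// of spells already taken from the tome in this search branch.
double learnBonus(int d, int spellsCnt, int tomeIndex, int taxCount) {
    return std::pow(GAMMA, d) * std::pow(GAMMA_SPELL, spellsCnt)
         * (1.0 - tomeIndex / 3.0 + taxCount / 6.0);
}

// Hypothetical node data; brewAndLearnBonus accumulates the two terms above.
struct Node {
    int score;
    int inv[4];              // ingredients per tier
    int castableSpells;
    double brewAndLearnBonus;
};

double evalNode(const Node& n) {
    double e = n.score;                                              // 1 point per score point
    e += 1 * n.inv[0] + 2 * n.inv[1] + 3 * n.inv[2] + 4 * n.inv[3];  // inventory value
    e += 0.01 * n.castableSpells;                                    // small bonus per castable spell
    return e + n.brewAndLearnBonus;
}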

Huge thanks to the CG team for a fun contest.
Congratulations to pb4!

Fun fact:
A version with gamma = 0.95 has roughly a 30% winrate versus the version with gamma = 0.98.
A small change, but it matters a lot.

18 Likes

OMG, I forgot about the patience score, maybe it could improve my AI :frowning:

1 Like

Hi Mazelcop. Can you comment on what depth you reached in the beam search and what strategy you used to prune nodes? Also for Magus and the others that used this approach.

Thanks!

2 Likes

Hi all,

Ended up ~200th in Silver, so ~800th all in all.

It was honestly a really interesting contest to me, and I got to learn more about my language of choice (Python3 here), as well as some algorithms and how they would work!

It started really well for me, with probably a bit of luck! For Bronze, I basically learned 8 spells at the beginning of the game, then built an A* with a naive distance (whatever the tier of the missing ingredient, it counts as distance 1), and it started great! Basically, I calculated one path to each possible brew, then computed each brew's scoring ratio (i.e. price / turns to reach) and maximized that ratio.
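
The target selection boils down to something like this (a sketch, written in C++ for consistency with the rest of the thread even though the bot above is Python; `BrewPath` and its fields are made-up names):

#include <algorithm>
#include <vector>

// Hypothetical summary of one A* result: a reachable brew, its price,
// and the number of turns the path takes.
struct BrewPath {
    int brewId;
    int price;
    int turnsToReach;
};

// Pick the brew with the best price / turns-to-reach ratio, as described above.
int pickTargetBrew(const std::vector<BrewPath>& paths) {
    int best = -1;
    double bestRatio = -1.0;
    for (const BrewPath& p : paths) {
        double ratio = static_cast<double>(p.price) / std::max(p.turnsToReach, 1);
        if (ratio > bestRatio) { bestRatio = ratio; best = p.brewId; }
    }
    return best; // -1 if nothing is reachable
}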

Until the opening of the Gold league, I was in the top 200, and then… well, I pretty much got stuck there and let everyone go by!

I still tried a few things to solve my issues: changing the distance to a slightly less naive one (1, 2, 3, 4 based on tier…), trying to go deeper with a “beam search”-like algorithm (which made me think, in the end, that A* is a beam search, isn’t it? Always learning!), and taking the endgame into account (i.e. maximizing score or speed when someone reaches 5 brews).

My main mistake was to use the inventory and A* instead of… whatever would maximize the score. If my A* and evaluation function had been better, it could have worked. But it was inherently flawed due to the distance, as it would eventually prefer longer “cleaner” paths over shorter “smarter” ones (i.e. for 4 yellow ones, why not just do +1 yellow then Rest, 4 times, right?).

All in all, much as with the PAC, I feel like I learned a lot again and came out of it with new things to work on and improve: especially thinking about better evaluation functions for nodes, and trying to minimize node creation time.

Also, on a more personal note, I discussed more with some people from my company and my school, so it was a good way to network a bit as well.

Great contest, still no Gold league for me, but hopefully, I’ll get there by learning and trying!

2 Likes

k4ng0u, 66th in legend (C++)

Thanks Codingame team for hosting this contest that gathered more than 7k codingamers! The designs were really nice and the gameplay was interesting as well. I also appreciated that the game felt less random than the previous ones (thanks to the absence of fog of war?) But due to this, the contest

I started this contest with a simple Javascript bot to get to Bronze. It would basically target a potion and then cast spells to fill the inventory until the potion can be made.

Then I didn’t find any good heuristic so I went down the simulation path. For a week I had to watch my bot make plays beyond my understanding that would turn out to be useful (or not, when the simulation or eval was wrong) a few turns later. To be honest it was both rewarding and frustrating.

MC Simulation to Gold
My simulation was close to the game engine, except that I always applied the first (+3) and second (+1) potion bonus even when they were all consumed.

At first I tried an MC (ignoring the opponent, though his state info was there) at depth 15, with some tuning:

  • On the first 7 turns, learn a free spell (I tried to use a dumb heuristic but it was worse => the gifted tier-0 ingredients were often giving a better start to my opponent and making me lose the game)
  • Only enable learn during the first 2 turns of a simulation
  • After the first 5 turns, if a brew is possible, do it right away and ignore other possible actions
  • Stop the run if the game finishes or if there are no more possible actions

The eval for a game state was as follows, with a decay factor of 0.9 per turn:

playerScore * 100 + sum(nbIngredientsOfATier * (tier + 1))

And on my 6th potion (end game), there is a big bonus/malus depending on whether I win (score + number of tier 1-3 ingredients greater than my opponent's).
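
An illustrative reconstruction of that rollout eval (the struct and field names are mine, not k4ng0u's, and the decay is applied per simulated turn as described above):

#include <cmath>

struct SimState {
    int myScore, oppScore;
    int myInv[4], oppInv[4];   // ingredient counts per tier
    int myPotions;             // potions brewed by me in this rollout
};

double evalRollout(const SimState& s, int turnsSimulated) {
    double e = s.myScore * 100.0;
    for (int tier = 0; tier < 4; ++tier)
        e += s.myInv[tier] * (tier + 1);
    e *= std::pow(0.9, turnsSimulated);          // decay factor of 0.9 per turn

    if (s.myPotions >= 6) {                      // end game: big bonus/malus
        // Tier 1-3 ingredients are each worth one point at the end of the game.
        int myFinal  = s.myScore  + s.myInv[1]  + s.myInv[2]  + s.myInv[3];
        int oppFinal = s.oppScore + s.oppInv[1] + s.oppInv[2] + s.oppInv[3];
        e += (myFinal > oppFinal) ? 1e6 : -1e6;
    }
    return e;
}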

This reached top 150 gold on the last Saturday, but I felt its limits. The main issue was that due to the poor performance of my simulation (~180k states so about 12-20k rollouts per 48ms) it was hard to always find a good action chain.

Cheap Beam Search to Legend
Since my simu was lacking performance, I didn’t even try to go for an MCTS and went for a beam search, as many mentioned in the chat. Because of the lack of time, I reused my simulation and GameState. This might have cost me a little because I was cloning my info, the opponent's info and the corresponding cache structures (needed for a faster MC) every time. But in the end, I could still visit about 30k nodes (far from what others might have), which would go to a depth of 10-30 with a beam width of 400.

The main changes to the simulation were to only learn at depth 0 and to remove the condition on the brew.

The eval was updated (and no decay was applied):

// potions are still the most important but it also values ingredients in the inventory
playerScore + 0.5*(nbTier0 + 2*nbTier1 + 3*nbTier2 + 4*nbTier3)

And of course the big bonus/malus when the game ends still applies.

Eventually, my bot was quite simple and I regret spending so much time trying to tweak my MC. I should have worked on the beam search from the start and tried to include the missing features such as:

  • Take the opponent into account a bit (an easy improvement would be to at least check whether he can brew a potion, or the max number of tier 1+ ingredients he can hold at the end of the turn)
  • Come up with a real spell draft
  • Dedupe states (with a hash mechanism?) to avoid having the beam width full of equivalent states
  • Better performance (a lot of people seemed to be able to compute more than 300k states)

But it managed to get to the Legend league, so mission accomplished :slight_smile:

8 Likes

Based on data downloaded by eulerscheZahl during the rerun. The methodology is similar to what was shown during previous contests (Spring Challenge, Ocean of Code, Bit Runner and others).

Observed winrate between pairs of players during the rerun:

Expected winrate between pairs of players after fitting a score to each player (alternative ranking independent from CG):

Difference between the two figures above

(a strongly colored dot indicates an observed result which was not expected according to the ranking model)

and finally, raw ELO scores:

ELO_SCORE          PLAYER
701                pb4
692                Agade
629                reCurse
611                Mazelcop
604                delineate
604                aropan
598                DomiKo
589                karliso
568                jolindien
559                zasmu
551                kovi
549                ichyo
546                Magus
540                ClosetAI
538                Neumann
533                MichalOp
533                _Royale
529                Rounddice
520                wala
518                YurkovAS
517                bowwowforeach
515                Vadasz
514                eulerscheZahl
507                tanzaku
500                BrunoFelthes
495                Csipcsirip
489                Marrggta
488                dbdr
488                bourgeof
487                ValGrowth
487                Gonny
486                scolen
485                Risen
480                Jirotech
471                Valdemar
470                Zorg1
462                wlesavo
461                Redstrike
451                maras
450                omu
450                KawattaTaido
448                The_Duck
441                RiSuS
434                siman
433                Manwi23
429                takumi152
427                blasterpoard
427                [CG]SaiksyApo
427                Skril
415                viewlagoon
414                bin101
411                Rafbill
408                Samsonov
403                trictrac
394                Fangel
388                Illedan
384                egaetan
382                VincentBab
380                AT274
379                MSz
378                ucarts
376                sensaur
374                aCat
370                Bondo416
369                sumoru
368                k4ng0u
366                y_kawano
363                betrue12
359                NinjaDoggy
354                Saelyos
351                NightLuna
350                iwashi31
342                Shun_PI
341                SlyB
340                MichaelPu
314                Nerchio
311                eki
306                yukine3
300                SeJ
0                    Jeff06
14 Likes

What is the raw ELO score and why does its ranking differ from the final rankings?

#24th place #2nd Java

I’m using a beam stack search. With Java, I could visit between 10k and 20k nodes.

My score is just the points that I have + the inventory value (t0 + t1*2 + t2*3 + t3*4).
If I have a brew, I increase the score by the brew price plus a multiplier that is higher at earlier nodes and at the end of the game. (Maybe a patience score could be better here.)

While the player with the most brews has fewer than 4, I don't take the single best score; I take the most common start node among the 20 best scores…
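
That start-node vote could look roughly like this (a sketch in C++, while the bot above is Java; `Leaf` and its fields are illustrative names):

#include <algorithm>
#include <map>
#include <vector>

// Hypothetical leaf record: final eval plus the root ("start") action that led to it.
struct Leaf {
    double score;
    int rootAction;
};

// Instead of blindly taking the single best leaf, take the root action that appears
// most often among the 20 best-scored leaves.
int pickRootAction(std::vector<Leaf> leaves) {
    std::sort(leaves.begin(), leaves.end(),
              [](const Leaf& a, const Leaf& b) { return a.score > b.score; });
    std::map<int, int> votes;
    std::size_t n = std::min<std::size_t>(20, leaves.size());
    for (std::size_t i = 0; i < n; ++i)
        ++votes[leaves[i].rootAction];
    int best = leaves.empty() ? -1 : leaves[0].rootAction;
    int bestVotes = 0;
    for (const auto& [action, count] : votes)
        if (count > bestVotes) { bestVotes = count; best = action; }
    return best;
}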

I emulate the next 5 opponent turns with my AI, storing the max number of points he can reach at each turn and the brews that he is trying to make. I don't try to make a brew right after him, and I don't finish the game if the opponent can reach a higher score than mine on the same turn…

For the draft phase, I take the first 3 spells for free, and then I take 4 more, unless the opponent stops taking spells…
For these 4 spells, I remove the cast actions from the first turn, so my AI only has the tome spells to evaluate… And I remove 2 points for each tier-0 ingredient that I will pay, because I will potentially give it to my opponent…

Fun fact:
When I implemented the urgency bonus in my sims, it increased the performance…

9 Likes

CG’s ranking is the official ranking. It is based on Trueskill which allows for iterative calculation of an AI’s score. Trueskill has been designed for humans, whose level may evolve over time. It also won’t give the same score if you present it with a “win -> loss” sequence or a “loss -> win” sequence against the same opponent. This is not ideal.

CG’s rerun procedure aims at limiting the drawbacks of Trueskill for an AI competition.

My method aims at evaluating an AI’s ELO score based on all observed games. Old games are as important as new games. The order of the games doesn’t matter. I believe it is more robust, but it is not realistic for CG to use it during a competition. Hence the name “alternate ranking”, just to satisfy my curiosity :slight_smile:

9 Likes

[Python - Gold - 433 / 7 035]
Really cool challenge, loved the artwork! Big thank you to all the staff and the cool people in the chat :smiley:
I was aiming for Silver and got pushed to Gold on Sunday evening, so I'm really happy :smiley: !!

I tried using linear algebra to find which combination of spells could yield the potions:
np.linalg.lstsq(available_spells, target_potion - initial_inv, rcond=None)
The problem was that there was no notion of ordering, and it could also give negative coefficients.

That wasn’t working, so I decided to use BFS like everyone else…

Things that made me improve my ranking:

  • (+600 in low silver) When generating child nodes: try to generate brews before Rest
  • (+200 in mid-high silver) Changed my patience coeff from .75, set at the very start, to .826 (no idea why that number worked so well). Thanks to @Bob for bringing my attention to the patience coeff in the chat on Saturday :smiley:
  • (+200 in mid-high silver) Take into account the remaining ingredients after a brew in the evaluation of the node. Thanks to @eulerscheZahl for mentioning that in the chat on Thursday

Things I did well:

  • Prepare in advance for the contest by training on other challenges
  • Never broke my bot or lost motivation
  • Stuck around in chat for great advice

Things I can do to further improve:

  • Optimize node handling (using numpy maybe?); I only visit ~700 nodes per turn
  • Simulate opponent actions
  • Beam search

I can’t wait till next time =D

8 Likes

I limited my search to depth 35. To prune nodes, my eval is a linear combination of:

  • potions total score
  • potion urgency bonus
  • potions count
  • potions “stolen” from the opponent’s most profitable path,
  • inventory score (inv_0 x 1 + inv_1 x 2 + inv_2 x 3 + inv_3 x 4)

I have no limitation on the depth. With a beam size of 1000, my AI reaches a depth between 10 and 12.

To prune nodes, I just eval every node. I sort the nodes. And I keep only the 1000 best nodes.

I don’t have any code to avoid duplicate states or anything like that.
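
That pruning step is essentially a sort-and-truncate (a minimal sketch; the `Node` struct is just a stand-in for the real search node):

#include <algorithm>
#include <vector>

// Stand-in for the real search node; `score` is whatever eval you use.
struct Node {
    double score;
    // ... game state fields ...
};

// Keep only the `beamWidth` best-scored nodes for the next depth.
void pruneBeam(std::vector<Node>& nodes, std::size_t beamWidth = 1000) {
    if (nodes.size() <= beamWidth) return;
    std::sort(nodes.begin(), nodes.end(),
              [](const Node& a, const Node& b) { return a.score > b.score; });
    nodes.resize(beamWidth);
}

(std::partial_sort or std::nth_element would avoid sorting the tail, but a full sort matches the description above.)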

3 Likes

By scoring each ingredient at a value of tier + 1, most potion prices are equal to their ingredients' score.

For instance:

actionId 50 [-2,0,0,-2] price 10 => ingredients score = 2*1 + 2*4 = 10
actionId 48 [0,-2,-2,0] price 10 => ingredients score = 2*2 + 2*3 = 10

Applying a 1.1 factor to the potion price slightly encourages the bot to brew instead of keeping the ingredients in its inventory.

2 Likes

What did you mean by “score point” in the first bullet point?

1 Like

Can you explain this “to buy a spell in exchange of a score “penalty””? Did you mean cast a spell or buy a potion?

Just my current score.
A score of 20 means 20 points in my evaluation function. Or, if you want the code:

result += me.score * COEF_SCORE;

With constexpr double COEF_SCORE = 1.0;

JRFerguson: I go to depth 10 to 15 depending on whether it is the start or the end of the game, and I don't need to prune, because I always take the best 2 nodes at the current depth, expand and score them, add them to the next depth, and take the best 2 of the next depth… So the pruning is natural: you will not evaluate badly scored nodes at each depth…

1 Like

20th
I basically did the same as most others: a beam search. First at depth 8 for my opponent to estimate when the game will end. Then for myself in the remaining time.
With about 150k single simulation steps and a beam width of 1200 this gets me around 15 turns lookahead.

As you don’t want to read yet another beam search description, I’ll go with my inventory storage in a 4 byte integer.

I use 5 bits for each tier, which gives a number range from -16 to +15 each, and an extra bit per tier as a buffer to prevent overflows. For faster access I store the total as well.


(Bit layout: tier 0 in bits 0-4, tier 1 in bits 6-10, tier 2 in bits 12-16, tier 3 in bits 18-22, and the total starting at bit 24.) The free spots in this layout are the 0s.
Negative numbers are stored as two's complement. Thus applying a spell or brew to the inventory just becomes an addition, followed by an AND operation to remove possible overflow bits.

public static int CreateInventory(int[] delta)
{
    int result = 0;
    for (int i = 0; i < delta.Length; i++)
        result |= (1 << (6 * i)) * (delta[i] & 0x1F); // this also works for negative numbers
    result |= delta.Sum() << 24;
    return result;
}

public static bool IsValidInventory(int inv)
{
    if ((inv & 0b010000_010000_010000_010000) > 0) return false; // negative
    if (inv > (Player.INVENTORY_SIZE + 1) << 24) return false; // full
    return true;
}

private void ApplyDelta(int delta)
{
    Inventory += delta;
    Inventory &= 0b001111_001111_001111_001111_001111;
}
39 Likes

Is it possible to share your code? :slight_smile: