Fall Challenge 2020 - Feedbacks & Strategies

xdcomrade · January 26, 2021, 8:58pm

Python
179 overall
8 gold

First of all: big thanks to the organizers of the event, I really enjoyed it and learned a lot about python.
I know it’s already a while since the end of the contest, but I still would like to share mainly my thoughts and tips, that I got from the contest, which I hope will be useful for the learners, as I myself am.

As many I used BFS at first and then beam search.
BFS - stands for breadth first search, which in short means: to make a list of all possible states after the first turn and then expand every state with the next possible move and add the newly created state to the list. Basically it is just like chess: you have a variety of possible moves (say 10 moves) and the opponent has a certain set of possible moves based on your initial move (for every move you make, he has 10 different moves of his own).
What I noticed, is that Python could not find a long sequence of moves due to the time limit. I think the limit was like 3-4 moves ahead, which is not a lot.
What I also understood, is that 4 moves in a row with “REST” can’t win a game. So this is where the beam search comes in.
Instead of calculating every state of every move you can filter the states based on some kind of a score. In this contest the score was calculated for the inventory and sold brews. A score for the inventory 3-3-2-0 is higher as for the inventory 1-0-0-0. Back to a chess example: after two moves we had 100 possible outcomes (10 first possible move, and 10 moves available for each fist move). Instead of considering all 100 possible outcomes for the further calculation we can rank these outcomes and choose 50 most promising.
This way I managed to get to a depth of 6-7 moves, which was enough to get to the top gold.

The Implementation of the algorithm:
normally a queue is used for a BFS, but a list just did a trick for me.
So I created an empty list, which is called beam_search and would hold the information about different calculated states of the game, and added a node (a list) with an initial state: a current score, a list of moves taken, a list of ready to be used casts and an inventory.
beam_search=[ current_score, [ “list of moves taken” ], [ " a list of ready to be used casts"], [ “inventory” ],]
furthermore I created another list, which is used for a temporary saving of the nodes, and called it second_beam_search.
I pop a node from the beam_search, see, if a brew or a cast can be made. If yes, I add this move to a move list, calculate the new score and the new inventory and add a new node to second_beam_search.
this way a have a list of all possible actions, I can make in my first turn.
second_beam_search = [
[new_score, [ “BREW 17” ], [ " a list of ready to be used casts"], [ “new inventory” ],],
[new_score, [ “CAST 80” ], [ " a list of ready to be used casts except 80"], [ “new inventory” ],],
]
as you probably noticed the first list beam_search is now empty, so I sort the second_beam_search based on the score and copy the first 100 best nodes to the beam_search and the algorithm runs the same.

Just a few important points: I never just copy a current inventory and make changes on it, but instead have a function, which returns me a new list of the inventory depending on the cast or the brew I did in this turn.
a price of a brew is calculated with a patience coefficient: 1.1 * brew.price * PATIENCE_COEF ** (len(current_moves) + 1), what encourages a bot to make a brew sooner.
a price of every piece of inventory is as mentioned before [1,2,3,4].
a repeatable cast for me is just like having several casts, so basically instead of just one cast [-2,+2,0,0] I have 4: [-2,+2,0,0], [-4,+4,0,0], [-6,+6,0,0], [-8,+8,0,0].

what I tried, but failed to accomplish:
integrating the learning to the beam_search. the method I used slows the algorithm, so I get 3-4 moves, which is not good.
excluding the nodes with the same states, but different order or moves. I thought creating a dictionary instead of a list and making the nodes with the keys as a string representation of an inventory (as ‘01030301’ which means [1,3,3,1]) and replacing the node with the same key, if the score is higher, would help, and it seems on paper to work, but in reality the bot just goes directly to the bottom gold.

If you have any suggestions, ideas or questions, I would be happy to hear them out. Especially if I would be able to get to the legendary league.

have fun!