Ocean of Code - Feedback & Strategies

blasterpoard · April 20, 2020, 3:04pm

I finished #15 in legend. My strategy was to focus on parts of the bot one after the other, not caring until my rank until it would be all done. The upside of this approach was that, in the end, every single “part” of my bot worked extraordinarily well. The downside was that when I had to put them all together, I failed.

Tracking

The most boring part; the bot just remembers all the possibilities, and when there were more than 10000, I would remove half of them: I would find pairs of possible “histories” that ended with the sub in the same cell, and I would merge them into one - the visited squares in the new history were intersection of visited squares of the two old histories; mines were union of the old mines (I remembered the square from which the mine was placed, not the squares where it could be). I used 2 bitboards to remember the visited squares and mines; apart from that there was only player’s HP (to be able to prune after torpedoes and triggers) and his position.

Starting position and early game movement

I used the first turn to precalculate the first 33 squares I would visit (unless a mine or a fight happens). I used a genetic algorithm to do this. Every individual in the population consisted of up to 32 directions, stored in a 64-bit variable (2 bits per direction). Evaluation was simply walking from every square on the map, following the directions, and for every step that was valid I added 1 point to the evaluation - this ensured that I couldn’t be found easily at the start of the game. Mutation was simply changing one direction, crossover was taking two subpaths and concatenating them into one. After the GA finished, I chose an individual from the final population and a starting position according to path’s evaluation, number of squares I could reach from the final position (so that I don’t need to surface right after those 33 squares) and number of squares that I could mine when walking along the path.

Movement / Pathfinding

I used a genetic algorithm again, but a different one. This time, the individuals consisted of 225 directions (450 bits); each direction corresponded to a cell of the map, and it was the direction in which the submarine would move next from that square. The evaluation was simply following the directions, starting at my submarine’s current position, and adding bonuses and maluses for each square I visited, with decreasing weights further down the path (decay). These bonuses and maluses were:

a constant (1), to make sure that I would surface as late as possible
a small bonus for being close to the center - there’s usually fewer options for your sub when it’s stuck in the corner
a small bonus for being out of range of any of my mines - to try to claim more space
a large bonus if the path didn’t give my position away - every turn, I precalculated how many cells I could be in (according to my opponent’s tracking) after every path of length <= 6, and used only this information.
a large malus for going to a square that could be in range of opponent’s mine (I calculated expected mine damage for every square, based on my tracking)

To mutate, I walked randomly from a cell on the sub’s path, until I either had no available squares, or merged back to the path. Crossover was following both paths at the start until they diverged, then picking one of them randomly, following it until the paths merged, then repeating the same process until I ran out of squares. Implementing these operations efficiently was the reason why I needed entire maps as individuals. On its own, this pathfinding, along with torpedoing whenever expected damage was above 0.75 and silencing afterwards, was enough to reach legend.

Surfacing, sonar, stuff…

I used sonar whenever the opponent could be in more than 1 sector and I had it charged, and used it on the sector with the most possible opponent positions. Surfacing was based simply on the expected mine damage I mentioned earlier, with the threshold needed for surfacing depending on my HP. I charged actions in order Torpedo > Silence > Mine > Sonar, with giving Mine some additional priority on the early turns, and also preferring Sonar over Mine when the number of possible opponent’s positions was above a threshold. I placed mines whenever I came close to a square that my bot considered good enough (based on number of water cells next to it, and how many of them were already in range of one of my mines), becoming less picky over time. I triggered mines whenever the average damage caused by the explosion would exceed 0.65; having this value too low or high had very bad effects on my bot.

Fighting

I wrote a simple sim that tried every possible combination of my actions (surface, move, torpedo, silence) that made at least some sense - f. e. never torpedoing a cell if the opponent couldn’t be next to it; usually ~2000 combinations. For each one of this combinations, I would find the cells the opponent could think I am in - this would be too time-consuming, so instead of working with all the combinations of cells I could have visited before coming to each cell where my opponent thought I could be, I used an estimation - the “real” visited squares, but I would move the squares I visited since I last used silence by the difference between my actual and “pretended” position - for example, if I’m at [1,1], I visited squares [1,2], [1,3], [2,3], [3,3] since I last used silence from [4,3], and I’m calculating squares I could reach by a combination of actions starting at [2,1], I would add [1,0] to those coordinates, and get [2,2], [2,3], [3,3] (and stop there, because [4,3] was visited before using silence), and the rest of the map would be the same. Then, I made an eval (HPs, numbers of possible positions after all the actions) for every combination of possible opponent position and torpedo that made some sense, and averaged them with some weights (opponent’s torpedos that hit more squares had higher weights), added some additional bonuses and maluses based on cooldowns after the turn and distance, how useful silence was in the part of the map where the fighting took place, how many cells I could still reach after my actions, and on expected mine damage on that cell. Going deeper than 2 turns wasn’t very useful, because once the abilities were used, players only charged torpedo and eval was simple at that point and it was easy to estimate the result from just the HPs and how much information the players had about each other’s positions.

And indeed, this worked great - whenever I tested this against my final version, it won a large majority of games where the two submarines met and fought to death.

… “whenever I tested this against my final version”?

Where it went wrong

I think you can already tell that this bot had a lot of things in it. So many, that the most difficult part became deciding which of these algorithms should I believe in this specific turn. I had a precalculated path from the first turn. I had a genetic algorithm that suggested a different path - this GA had less time to calculate it, but more information, and different priorities. I had an evaluation of how good each one of my possible actions was when it came to fighting the opponent with torpedoes and silence. I also had some heuristics… and it became a mess. The fighting algorithm couldn’t tell whether the sub should try to silence and torpedo towards a possible enemy position, or keep playing calmly, and somehow trying to use differences in evaluations was futile; pathfinding algorithm kept walking into torpedo range at 2 HP, … A few days before the end I realized that by adding more stuff (assumptions about enemy positions, using mines offensively, and more things I had prototypes for), I would probably just make it worse (and I was dangerously close to 100k characters), so I just thoroughly tested what I had already made, and it turned out that a bot that used the old, bugged fighting algo (a predecessor of what I described) worked best against the other players (except kovi, who kept losing to the new bot for some reason, and winning all the games against the submitted one). Maybe I could have somehow made it work, but with all the variance on the leaderboard and even in the benchmarks, I was lost and gave up.

Congratulations to everyone who managed to make sense of this game, and thanks to creators for giving us a game that we didn’t even come close to solving despite playing it for a month.

cegprakash · April 20, 2020, 3:06pm

I joined the fun 3 days before the contest ended, yet reached mid Silver!
Update: I reached gold in the multi

Wood2: Random shooting Just throwing Torpedos at random spot + handling SURFACE got me out off wood2.

Wood1: The SONAR strat. I started storing last known enemy grid through my sonar results and opponents’ SURFACE commands. I used SONAR restricted to grids where I can use TORPEDO. This drastically improved my rank. (I only use SONAR after my move command so that the result is more accurate as I always use Move before Torpedos). I restricted myself never to enter into choke points where there is less room to breathe. I prefer to finish my current grid before moving to the next grid. This gave me enough time to spam SONARS only in grids around me instead of jumping into enemy and get surprised. All these logic and I was still stuck on top 10. So I added some random firing chance even when I don’t know enemy grid and that promoted me.

Bronze: The enemy tracking strat and stay out off his firing range: I started tracking enemy movements. Whenever enemy uses a silence, I cleared the enemy history. Every turn I recalculate all the possible end points of enemy (from enemy’s turn 1 movement). I removed all the SONAR code and started charging SILENCES. As I removed it, I also removed the logic where I prefer to stay in the same grid. It was okay to surprise enemy and fire at a random enemy spot I predicted. I replaced my move command with SILENCE with distance 1 whenever I had cooldown. I finally fixed my spawning inside the pond. Used Torpedos before Move commands. Easiest league so far.

Silver: The combo finder strat: I decided to implement a search algorithm to shuffle my command orders in order to generate nice combos. I was able to run it with an average of 15000 simulations easily. Did optimisations in tracking code and fixed a lot of bugs here. I removed all manhattan calculations and replaced it with reachable distance (because of islands).

Here is my final eval during the contest:

int GameState::getScore(int moveDir) {
	Point currPos = players[myId].pos;
	int score = 0;
	score += (10000 - players[1 - myId].life) * lifeMultiplier;
	score += spacesMultiplier * players[myId].getSpacesFromPosition();
	score += distanceFromEnemyPoints[currPos.x][currPos.y] * stayAwayFromEnemyMultiplier;
	score += 4 - moveDir;
	return score;
}

What I did great: In a short span I wrote a very optimized sim which runs sooooo fast. And a looot of debugging. Bugs Bugs everywhere.

What I could have done Better I forgot to fire torpedos at random positions in my final sim version (which explains my rank drop after switching to sim). Also I forgot to add FINISHER blow scoring in my eval.

What was amazing in this contest: I felt this is a very balanced game with no real meta at least until Silver. Lot of strategies can beat each other and it’s nice to see Miners kill Silencers and Silencers use nice combos for sneak attack. I didn’t sleep on the last day (similar to many CG last days) trying to get into Gold fixing bugs ignoring the scream of my body.

What could have been better: This was one of the games which takes lot of time to code for wood2. Because the input was large (plus the confusing spawn turn where you don’t read any inputs) and 3 possible outcomes: MOVE, SURFACE and TORPEDO as well as combination of those (MOVE+TORPEDO on the same turn). It would have been nice to have been a wood3 league with 3 lives with only MOVE and SURFACE commands with boss using SURFACE command often to kill himself.

MichaelPu · April 20, 2020, 3:09pm

First thing first - Thanks to @Illedan, @eulerscheZahl and @G-rom for this, the timing of the contest was perfect, the rules were simple and the game behaved as expected, without weird bugs.

This was my first CG contest. Ended up 24th, probably since the quarantine gave some extra time for coding
Since I was coding in python, the time limit was difficult, and most of the turns took 10-40 ms for calculations.

The main features in my bot:

Tracker:
Consisted of multiple “guesses”, started with 1 guess.
Each guess had:
A set of possible current enemy locations, with relative weights (starting with all water tiles)
A set of mine dropping positions relative to starting pos
A set of visited positions relative to starting pos
Locations were removed from the set based on damage, torpedo range, mine activation positions, and moving.
Every time the enemy had silenced, the guesses were duplicated, and whenever there were no allowed locations in a guess, or a cell was visited twice, it was deleted. I also combined guesses with the same mine positions whenever there were too many of them, to avoid timeout from flood spammers.
Using sets and dictionaries instead of lists or numpy arrays was crucial for the speed up in lookup and adding/removing information.

I used this tracker both ways (self and enemy). This was handy for finding bugs (for example raising an error whenever the tracker is in conflict with my real location) and for calculating information given to the enemy.

most high ranking players used some kind of mine avoidance, so I exploited it and tweaked probabilities accordingly.

Attack&move evaluator:
Just some sophisticated heuristics, with a minimization of cost, without minmax planning:

Costs for using damaging actions (torpedo & trigger) or surfacing
Cost for damage (to myself and to the opponent)
Cost for information (gained and given away)
Costs for ending up on a tile that could be possibly harmed by enemy mine
Costs for the remaining water area to go to
The costs for using torpedo before and after move were different.
Every cost was multiplied by the corresponding probabilities from the tracker and self tracker, to find an optimal combination of move+action.

Surfacing:
Same idea with costs (of information, danger from enemy mines, and remaining water tiles).
Placing mines:
Basically tried to spread them away from each other, and away from land.
Sonar:
Loaded only when all the other abilities were ready, and needed to load something.
used it only when the probability of finding the enemy on a tile was 50% (with a small tolerance), hence gaining maximal information.

In the last days of the contest, I was mainly optimizing parameters, that was the difference between 45th-ish place to 24th.

GG everyone

k4ng0u · April 20, 2020, 3:43pm

So this was almost my 1 year anniversary for playing codingame actively. I started with code a la mode where I spent a day implementing a dijkstra (on a map with 16 or so possibilities) to a be able to have it as a basic algorithm for my bots. For that I am already greatful for codingame to make me a much better developer.
Regarding this particular contest, I found it particularly hard not only because the challenge lasts longer and people were able to come out with great algofrithms but also because of the multiple possible strategies (mine everywhere? silence after mine to entertain uncertainty, toroedo/mine everywhere to get more information about the opponent?)
In the end I couldn’t invest as much time as I would want on this challenge but the key points of my bot (lower rank in legend league) was to:
_determine opp position as well as possible (moves/silence/sonar/torpedo spots/mine trigger sports, damage…)
_hide your position as much as possible (see previous criteria)
_try to not walk into mines (and if you do, spam silence until you get out)
_torpedo/trigger mines only when it’s worth (try to not reveal yourself)
_only surface when you are in a advantageous position (myLife>oppLife but this could certainly be improved)

Overall it was a great challenge that I feel was harder than usual challenges (smaller timeframes, easier to get out of wood/reach legend). But that would be expected during this confinement period where some ppl have a lot of spare time to kill. Overall it was very enjoyable and I am looking forward to the next one!

AntiSquid · April 20, 2020, 4:27pm

Here’s my postmortem which only contains my solution, not my personal opinion about the game.
https://github.com/SpiritusSancti5/OceanOfCode/blob/master/README.md

As for how i feel about the game, i honestly didn’t like the game itself. It’s not my type of game and it felt like i need to do too much of an initial write-up just to get started. Definitely disliked that aspect.

I guess it was a good way to fill time for the lockdown period so far.

Some issues with contests in general and not specific to this contest:
I don’t mind the strong wood bots, in fact they are much appreciated if they aren’t nerfed once you commit to beating them. My only issue is that you can’t jump back and forth between wood and bronze. This is troubling because at times you might want to only specific game elements. In this challenge it was perhaps not so much of an issue since you could submit a restricted bot on the ladder and test against it, but i don’t consider that a solution and training offline isn’t the same either. The only other option is to make a smurf each time you want to test something in wood, which isn’t officially allowed.

Also i might want to skip to bronze at the beginning of the contest to see full rules and then jump back to wood to implement one skill at a time.

Another issue is with the leaderboard in general again. There amount of games played should scale based on the amount of people in the league. By the end of the contest gold got very big! Just imagine how the contest would be like if all those 7000 registered players would participate.

I think submits should be more localized, something like playing 50 players above + 50 players below. Only 100 within range anyway.

JFB · April 20, 2020, 5:22pm

I finished #89 in Gold (#153 overall) so it looks that I do not written very good bot. What interesting – my bot for some reason – I do not know why - have very good win rate (much bigger that 50%) with Gold Boss. Unfortunately very rarely play with Gold Boss

I made well only one thing – opponent position prediction. I started with simple bot in Python. I have had plan to port to C++, but never done this. So I have to find a way to do prediction fast. My solution was to work with bits – using Python ability to work on arbitrary big numbers.

I model game board using 256 bits (16 * 16). 16 = 15 points of the arena + 1 point for border. So moves was simply shift (1 or 16) and AND (|) with the bit mask of islands and borders. For each path (stored as bits in integer) I have 1 int with bits representing possible submarine positions. On the start I have 1 path with only 1 point (initial position) and bits on all ocean points. After first move path has 2 points – and in int representing submarine position it was less points after shift and & with bit mask. For all possible moves in SILENT I created new path and int with bits for submarine positions. Sometimes I have had more that few thousands such pairs. To minimise probability of timeout – for big number of such pairs – I use only 0 and 1 move in SILENT. When, for given path, int with positions == 0 – I delete it. If it is no predictions (very rarely – only for opponent with a lot of SILENTs and moves longer that 1) – I wait for SURFACE and start with points from the SURFACE sector.

So deal with TORPEDO, TRIGGER, SONAR, SURFACE is very easy – simply & with corresponding bit map. Because on path I store inf about MINE too – when opponent TRIGGER mine I use this info to predict his position too.

I use TORPEDO or MINE only when I am sure that I will score points. It is easy – using combination of bit shifts and & for possible opponent positions I can find bits corresponding to points for all detonations which can reach all possible opponent positions. If such points are in torpedo range or in mine position – I use torpedo or mine.

I predict my position too. I use it for MOVE calculation – I calculate number of my predicted positions after all possible 2 moves (up to 16 possibilites) – and move in direction with the biggest score function based on number of predicted positions and number of cells which I can reach from new position.

I put mines if possible – in place which add max number of new cells in mines explosion range.

When I decide what to do (MINE + MOVE + TORPEDO + TRIGGER) I predict my position once again. If it is 9 or less possible positions – I add SILENT if possible. Only 1 move – in direction of 2nd MOVE in 2 moves calculations I made for MOVE direction selection.

As you see – I predict opponent position only once – but my position up to 17 times per turn

Weak points of the bot (points to improve): I do not avoid opponent mines at all, I do not use my SONAR, I do not use my mines to check opponent position, I do not have special moves for fight. So for bot without such things - #89 in Gold and > 50% win rate with Gold Boss – maybe is not as bad as it looks

Skril · April 20, 2020, 5:25pm

This was my second contest, after Unleash The Geek, and I really enjoyed both of them
I finished #7, and used C++.

The opponent tracker represents half of the code size. The information was stored in 2 separate ways, with for each cell :

2 bit-boards, union of mine positions and remaining free areas for each path ending here,
a list of ‘paths’ (in terms of remaining free area and drop positions).

Upon opponent moves, a first filter was applied on the bit-boards, and the ‘paths’ were then updated for the remaining valid cells.
The list of paths was reduced at the end of each turn, merging (not necessarily identical) paths sharing the same mine drops and the same (or a subset of the same) remaining space (a flood-fill was run after each move or silence update). This divided the amount of paths by several orders of magnitude on some situations, without loss of information.
PATH
I used a simplified version of Edmonds-Karp’s algorithm to assign triggers to valid mines drop positions at the time of trigger (removing subsets when a perfect match between triggers and drop positions was found).

Each possible opponent cell also had a probability attached. On each move, the number of available neighbors from the starting cell was accumulated for the path. The relative amount between cells’ worst cases was used to assign a probability of presence. This lead to an increase in probability when the opponent ran along islands or edges, or passed through corridors : the cell with the smaller count was very often the cell where the opponent really was . This however didn’t work so well against a few players who seemed to try to fake their initial position (mimicking island avoidance as is they were on a different part of the map ).
I also used this accumulation counter to decrease probability for non-0 and non-4 silence moves.
These probabilities were useful when considering torpedo shots and triggers, and also for the computation of a danger map (mine drops associated to paths with high probability contributed more).
I thought about taking into account my mine covered area to add some kind of fake islands when doing the accumulation ; it might have increased the efficiency.

The main loop was the following :

At the beginning of a turn, I evaluated if an instant kill (or draw) was possible,
If not, I ran a MCTS to search for the best move direction (giving bonuses when close to map center, and penalties when above a threshold in danger map, or for scatter loss at depth 1).
If the score was too low I ran a second search considering silence + move (trying to avoid mines or reduce scatter loss), and eventually, if score was too low again, I ran another one adding surface.
Torpedo shots and triggers were then evaluated at each sub position,
If available, a silence was added (if it improved my scatter) when opponent was near, or after surface and torpedo shots,
Then sonar was triggered if at least 1/3 of opponent scatter could be filtered,
At last, a final check could add a last chance torpedo shot (with no associated score threshold) when the opponent could insta-kill us on next turn.

I had plenty of improvements TODOs left, but I instead mostly spent my last days tuning some thresholds and coefficients, which eventually had a substantial impact (for me it wasn’t the log(x)/log(3) mentioned on the chat, but exp(-0.35 * x) ).

Again, many thanks to the authors for this very entertaining contest !

JFB · April 20, 2020, 5:32pm

Strange, for me it was very easy to start. Maybe because of language I chosen: Python. Disadvantage was that I have had problems with speed in higher leagues.

JBM · April 20, 2020, 5:49pm

Evening 1, 1h22m50s: hey, cool topic. Oh, I have to actually implement tracking and torpedoes to get out of the first wood league, with my PC melting down for uncorrelated reasons? Sign me in!

Evening 2, 2h57m50s: reach Wood-1 by completing:

basic tracking: starting from full map, maintain set of possible opponent positions.
torpedoing: shoot any any cell that has a >0 chance of having an opponent. By “any” I mean furthest reachable (intent: deal more damage than I take). More on that later.
movement: nearest unvisited cell, priority to N/S, then W/E. A left-leaning sub.

Evening 3, 2h54m55s: reach Bronze by implementing:

SILENCE 0 S whenever I’ve got enough power for it. I power it whenever my torpedo is fully charged yet I don’t know where to shoot.

Evenings 4 (1h40m47s) and 5 (1h31m35s): Make code clean. Eliminate a few “how on earth could it have worked before?” kind of mistakes.

Evening 6, 2h15m37s: end up mid-Silver by:

moving with the depth-1 tron heuristic
restrict torpedoing to when opponent cover is less than 20 cells and shot won’t harm me. If you followed correctly with the previous heuristic, you’ll notice I could (and would!) torpedo myself a lot before that.

One of those evenings happens to be on the same day as another’s: IIRC I split the stream when I passed a league once.

This sums up nicely to 12h53m34s worth of coding.

Tagging @leojean890 for the next PM.

Thanks a lot @eulerscheZahl and @Illedan for the last-minute work on this!

jolindien · April 20, 2020, 5:58pm

two mcts

YannT · April 20, 2020, 6:05pm

Finished #11 and very happy with this result!
(would’ve been happier if I made top10 but hey, it’s already my best result anyway!)

Some parts of my bot are pretty much the same as what has already been detailed above, some are more esoteric.
It can be split into 3 main components: Tracking, Pathing and Evaluation

1. Tracking

This is probably the least original part, as everybody pretty much had the same features.
I would assume the opponent could start anywhere that’s not an island and then track his path, removing any that wasn’t possible.
The same would be applied to myself for being able to evaluate how stealthy I’m being.

I chose to represent a path as an array of 16 ints (32 bits).
Each of the first 15 int represented a row, in which the first 15 bits where visited tiles in the row, and the next 15 bits where positions a mine was layed, if any (more on this later).
The last int was position.

After each MOVE, TORPEDO or TRIGGER, the paths that are impossible (moving into an island or out of the map, out of range for torpedo, couldn’t have layed a mine there) are removed.
After each SILENCE, I duplicate every existing path and from those create new paths to represent every valid silence possibility.
After each SURFACE, the visited tiles of every possible path are reset (but not the mine component of the paths!) and the out-of-sector paths removed.
After getting SONARed, TORPEDOed or TRIGGERed, the appropriate paths would be removed based on what positions are impossible given the results.

For the opponent, this was done at the beginning of the turn when processing his orders and would typically be very fast.
For myself, this was done when simulating order combinations and would typically take most of the evaluation time (more on this further down).

What went wrong with that:
Since SILENCE makes the number of possible paths dramatically explode, I had to implement a cuttoff at which I would reset my tracking by creating a new set of paths from the opponent’s possible positions.
Doing this however would also make me lose track of his possible mines, obviously not ideal, but thankfully that cutoff was hardly every reached except against chain-silencers.
To avoid timing myself out as well, I would also have a cutoff at which I would simply not consider silencing, based on the assumption that if I’ve got a gazillion paths, I probably don’t need to anyway.
HOWEVER, the GC being what it is, I would still sometimes timeout when the number of paths got close to the cutoffs.

TRIGGER tracking the way I was doing it (with just one bit for marking a position from which a mine is layed) has one obvious shortcoming: it is perfectly valid for someone to lay a mine, go elsewhere, surface, come back and lay another mine from the same position.
This is impossible to handle with my chosen bitboard, so I pretty much simply chose not to based on the fact it didn’t happen very often. I still prevented my tracking from removing paths based on TRIGGERs if there was any doubt to avoid completely losing track of the opponent in those rare cases.

It is also possible for a given mine to be layed from multiple different positions in a given path, making it impossible to know which mine laying position to remove.
In that case, I simply consider that all mines are still there based on the better safe than sorry principle.

2. Pathing

I basically didn’t have any pathing algorithm, instead I though I’d evaluate how good every potential position I finish my turn in is.
To do this I used a BFS of my next 4 MOVE actions (didn’t take SILENCE into account in that depth search), and for each depth I would score with a decaying factor the position based on available space and the probability of a mine being there.

Available space is the maximum result of floodfills from every accessible tile around my current position.
Probability of a mine being there is defined as the sum of the mine laying positions (divided by the number of mines that can be laid from that position) from which a mine that can reach could’ve been laid, divided by the number of possible paths of the opponent.
(This is definitely not proper probabilities, but close enough to get the job done)

All of this with some magic coefficients would allow me to score the value of that position.

What went wrong with that:
Mostly everything. Obviously my bot couldn’t see what was further than 4 tiles away and so would sometimes take a turn that will in 5 turns or more lead it to some dead-end or more often into a minefield and certain death.
Likewise because there was no proper pathing, it would sometimes make a small loop, SURFACE and do a similar loop again, just because it doesn’t see problems that are too far away.
This was probably the worst part of my bot and made me lose many games in silly ways.

3. Evaluation

While my tracking was pretty standard (everybody had those features) and my pathing was clearly below average, I think this part is what made my bot stand out.

Every turn, after processing the opponents orders, I would start by establishing a list of my possible orders that “made sense”, based on some heuristic rules and the state of my cooldowns.
The rules for valid order combinations were:
TRIGGER? > TORPEDO? > SURFACE? > MOVE > SILENCE? > TORPEDO? > SILENCE?
Only MOVE was mandatory, everything else is optional.

For TRIGGER and TORPEDO, I only consider the best possible one from my position at that time, based on the opponent’s possible positions.
“best” is defined as the amount of guaranteed damage that this TRIGGER or TORPEDO will do, which can be 0, 1, 1.5 or 2.
1.5 damage is a guaranteed 1 damage blast that also hits a possible ennemy position, as that is obviously better than a blast that covers all ennemy positions but hits none of them.

For TORPEDO before moving and TRIGGER, I actually disallowed anything that doesn’t hit at least every ennemy position minus one (I allowed for just a little bit of tolerance).
TORPEDO after moving had to hit every ennemy position (damage>=1); it’s stricter because it also usually puts you at a charge disavantage in case of return fire, so better not miss.

This bit of heuristic allowed me to only consider one TRIGGER per turn (since it only happens before moving), and one TORPEDO per MOVE/SILENCE option, dramatically reducing branching.

Next I would just simulate all of those order combinations and evaluate the result as such:

	if (damage >= opponent.life)
		return Double.POSITIVE_INFINITY
	score = (the scoring of the depth search as explained in the previous section)
	if (hasNotSilenced)
		score += (some function based on the number of my positions according to my self-tracking)
	if (hasNotSurfaced)
		score += (some function based on the numer of available tiles)
	score += (some function based on the damage done from trigger and torpedo, if any)
	score -= (some function based on the probability of getting a torpedo in the face)
	return score

Probability of torpedo in the face is defined as the number of paths that the ennemy can shoot at me from, divided by his total number of possible paths (now that’s proper probabilities!)
Or zero if his cooldown is not up, assuming that the opponent always charges his torpedo as a first priority.

Note that removing the bonus based on my number of possible positions when the order combination includes a SILENCE, or the number of available tiles in case of a SURFACE, is my way to assign a cost to those actions.
It turns out to pretty naturally balance itself out: the more detected you get, the less a SILENCE costs. The less space you have, the less a SURFACE costs.
And of course the depth search and torpedo in the face parameter can also make those actions more likely as silencing to a more favorable location will reduce/remove those penalties.

Finally, the first condition make killshots happen very naturally as well.

What went wrong with that:
Not a whole lot to be honest, I feel like this is the part that works. It can be hard to balance out all of the parameters properly, but I think I managed. My pathing was often the death of me, but the eval itself tended to take the right decisions.
One thing I tried to improve towards the end was to allow for more order combinations. For instance, always doing MOVE before SILENCE will usually give you access to the same positions as SILENCE and then MOVE, except in those rare cases where the islands are set up just so.
That didn’t seem to add anything of value and more branching though, so in the end I just scratched that.

So there you go, that’s about all of my 600 LoC explained.
There’s a bunch I could’ve done better, but I’m very happy with the end result!
Thanks to everyone who made this contest happen, I had a lot of fun with it!

BlitzProg · April 20, 2020, 6:30pm

BlitzProg, 399th.

I haven’t much to say original about my AI this challenge - It does what you expect from a mid-gold player. I was too busy so I invested only what was necessary to enter the gold league, then I dropped out, because I have other things to do (another challenge and other very distracting real life events). This is a bit sad tho, because I started to like the challenge near the end. I need more lives to do everything I want!

I allocate 4096 possibilities for me, and 4096 for the opponent. each possibility has a visit map, which is stored in a array of 15 ints (I guess many people had this idea)
when the possibilities for the opponent shorten, i start charging torpedo, else I charge silence + mines. I lay mines behind me and trigger the mines when there is a good chance the opponent is here.
I try to move in a way there are as many tiles as possible left for me to move on (so if there is a path to the left and another to the right but the right path is only a few tiles left before I must surface, I probably won’t go there) - while still trying to obfuscate where I could be. when my possibilities shorten too much, I try a silence.

This is enough to reach top Silver.

I search for silence targets. That is, a possible target that uses silences that are always the same distance (Silver boss that always do a Silence 0 / players that only use Silence 1 (direction). If almongst the possibility there is one and only one where it is all Silence 0 (direction) or Silence 1 (direction), it immediately becomes my silence target. This is almost opportunity free and harshly punishes predictable players.

And with that, I was sitting happily in the gold league once again.

Thanks for the challenge, and thanks for the chat for teaching me how to read STDIN in C++ to relieve my pain of reading many commands in a single line. Yup, I didn’t know how to do that before!

vbplaya · April 20, 2020, 6:43pm

I really enjoyed doing this … and I feel like I’ve learned a lot. Since this was my first time doing a contest, I don’t really feel I got a lot to say on the game itself because the whole thing is new to me and have nothing to compare it to. The only thing I’d like to comment on was the leaderboard. I found it hard to search by schools. Frequently, I tried putting in my school and got no results. I also put in some other school that I saw right there in the leaderboard, and again, got no results. The search just didn’t seem to work. I wanted to see if there were any more participants from my school so I could see how they were doing but I couldn’t. When I did the “Top 100” for schools, it only showed like 37 schools. The other thing I was confused on with the schools was did these people who were in a school group join a group together or where they automatically merged because they had the same school? I am trying to get a couple of my alumni buds to join the next one but I want to make sure we can see how we are doing and not get left out of the mix.

But the contest itself was enjoyable and I am looking forward to the next one.

_BZH_TidyMaze · April 20, 2020, 6:54pm

Hey, here is my quick PM after this contest where I reached #259 (Mid gold).

Warmup

I started using Scala, I implemented a basic tracking with positions to get from wood to bronze, it was enough to get an almost bug-less set of positions where enemy was even if I didn’t consider everything (visited path to reduce even more).
Quickly I discovered JVM was pretty bad for quick response due to warmup. I didn’t want to warmup individually each function so I decided to change language and pick Haskell for many reasons: an eleguant, compiled, pure FP and strongly typed language. It took me ~2 days to recode everything. Not a big deal even if I’m quite new to contest in Haskell. My Scala code was already very FP and I got the same exact results with almost no timeouts.
I used a BFS called for each neighbor, taking the one that could led to furthest cell. It’s enough to avoid traps, even if it sometimes create suboptimal path.
I decided to totally ignore mines (owned by opponent or me), my goal was to reach gold without because I’m lazy and it seemed fun.
I used a mean deviation to decide if enemy candidates were packed enough (I prefer 6 candidates all very close from each other rather than 2 distant by 6 cells …). If it was packed enough I would shoot the center. I thought it was pretty clever.
Spawn at center of map

How to get to gold

I really wanted 0 timeouts so I dug into some black magic for high performance: IOUArray, {-# UNPACK #-} and !bangs (thanks Stilgart and JBM for all your help). Also it is necessary to pick the best data structure for the need (Map, Set, Array etc …). I spent a lot of time replacing everything and pushing in arena without noticing any difference. I spent time coding a bitstream to export a full state to execute it locally and target the slowest function with profiteur.
It was necessary to track visited cells (using a Set) instead of just position to reduce candidates even more. Surprisingly performances have not been too much impacted as long as you use a lot of sonars :-).
Replacing mean deviation by a scan to find the center of a 3x3 square that would lead to at least 1 damage was really useful and simple.
I coded what is described by other as FATALITY but using it even for 1 damage without kill. It combines MOVE, SILENCE and TORPEDO. I allow taking damage if I am sure to deal more damages than I receive.
Use sonar if I don’t know at all where enemy is and candidates are in more than 1 sector. Pick the sector with most candidates.
Use self-tracking, if less than 20 candidates replace MOVE N by SILENCE N 1. Not sure if SILENCE 0 would be better, didn’t try.

What I did wrong

Spending a LOT of time doing nothing, watching replays without really analyzing. Next time I will read Bob advices again.
I still didn’t use a proper eval, rather a sort by very few criteria in tuple (+diffDmg, +dealed, -actionsLength).
I didn’t use a fancy algorithm like MC(TS) or minmax even if time was really long enough.
Ignoring mines is dangerous. Ignoring things is dangerous in general.
Didn’t code a nice spawn (avoid small water areas).

Conclusion

I’m not sure to pick Haskell for a next contest if I need REALLY high performance because it can quickly get very complicated and C++ is still a beast (I will probably try Rust). Anyway the resulting 600 LOC are pretty clean, readable and safe!

It was a very nice contest, thanks! The tracking was a barrier for some newcomers I think, but it was interesting to solve. Overall you can build a simple and efficient strategy and have a lot of fun!

GaborZeller · April 20, 2020, 6:59pm

Thank you for the competition, really glad that this was released during the outbreak.

Language: Typescript / Final place 109

Highlights of the game strategy

Tracking the enemy

Since neither player knows the opponent’s starting location it was crucial to have a system that tracks the opponent’s movements and possible locations.

In order to achieve that at the start of the game I loop through the map cells and create a phantom submarine on each empty water cell. At the beginning of each turn I loop through these phantom submarines and filter them based on my and the opponent's commands.

This got tricky with the introduction of the silence command at the later levels, because that is the only command that increases the number of possbile locations.

Originally when the opponet was using the silence command I created copies of the submarines, but unfortunately this resulted in timeouts, especially with bots that were spamming the silence command. So instead of creating copies, for all the submarines that ended up on the same coordinates after executing the silence command I created a composite of those submarines.

Tracking mine deployment

The mine tracking system went through several iterations and was one of the most difficult parts. The final version looks like this:

Every phantom submarine has a mine tracker which is a dictionary for mine ids and coordinates
Every time a mine is deployed a new key is created on the mine tracker dictionary (the id of the mine), and that object gets populated with all the coordinates where the mine could be deployed.
When a mine is triggered I remove the possible locations from each phantom

This way if a phantom submarine gets eliminated the possbile locations for mines also get eliminated alongside it, and if the opponent executes the silence command, the mineTrackers get merged.

Decision making

Basically the way my AI works is it loops through a list of command sets grouped together and chooses one command from each set if the utility is above a certain threshold.

command set example:

[SURFACE, TRIGGER, MINE]
[TORPEDO]
[MOVE]
[TORPEDO]
[MINE, TRIGGER]
[SILENCE]

If a command was already chosen from a previous set then it skips that calculation.

The utility calculation for each command is a combination of smaller reusable utility functions (e.g. calculateDesireToHideUtility, calculateFireTorpedoAtCoordinatesUtility) .

Looking back at the challenge I wish I did something simpler that would have allowed easier tweaking, but was curious of what could come out of this approach.

Summary

Overall I was more satisfied with the bot than not, I wish I could get with it to the highest Legend league, but that is for the next challenge.

Parts of the code that I am proud of

The code that tracks the possible locations for the opponent and mines came together nicely
The smaller utility functions that were the core building blocks in the decision making

Parts of the code that I am dissatisfied with

Although the building blocks for the utility AI were in place, the actual implementation for decision making had its “issues”.

I still need to study a lot more on how to write the part of the AI that makes the actual decisions

Again, thanks for the competition!

dabino · April 20, 2020, 8:47pm

This was really exciting contest, a great, deep game with many opportunities to improve code and learn something new. Big thanks for the contest’s creators!
I finished #56 in Legend. I am really happy with the result, and finally getting into Legend after two days of trying to get above top 30 Gold, and to fight the Boss, was a huge relief!
My bot is #1 haskell, purely functional, no IOUArray and similar magic, just Vectors, Sets and Maps. It is quite similar to many algorithm features already explained by others, so I will only touch what’s different.
Opponent tracking was very elegant and precise, about 100 LOC, also very efficient. It worked surprisingly well with haskell’s persistent data structures and lazy valuation: creating new “variants” of reality was very cheap, as no new maps/tracks are created; pruning impossible tracks was very easy. I only had to start collapsing possible tracks when I started tracking opponent’s intel on me, no performance problems before.
In-house game simulation was a big factor in my last day attack on Gold boss. I implemented all the rules during the first week and since that I was testing all new features/versions against a set of previous versions. I tested 1000 pairs of games, and it was a good predictor ob bot’s performance during submission, despite the fact that both algorithms were similar.
With my own simulation, I also tried very interesting research on how many moves decide the game. Two bots (bot1 and bot2) with slightly different coefficients were playing games, and when bot1 wins I tried to understand, how many moves should bot2 do differently (using bot1’s coefficients) to revert the game result. Interestingly, 90% of games were decided only by 1 or 2 key moves. Analysing key moves in hundreds of games was insightful to understand, what are key abilities to improve.
I spent a week not touching the bot, but trying to use cut-vertices / biconnected component search algorithm in order to have reliable estimation (something better than flood fill) for the next turn. I just could not use randomised path search algorithm due to performance limitations. As some point I gave up with the algo. But on the last day, faced with timeouts problem, I finally implemented it in the right way, and it allowed me to get to Legend.
Many hours my computer (not me) spent trying to optimise many algorithm parameters with GA search (all the weights and threshold, which were initially assigned by common sense). To my surprise, 24 hours of optimisation with genetic algorithm, 100Ks of games played - gained nothing, ks0 (initial coefficients) were still optimal! “Don’t spend time playing with constants, implement new features” - this was proved by me. Again.

Saelyos · April 20, 2020, 9:15pm

It finished #6. This contest was really cool, lot of depth so well suited for a 1-month contest. I’m also impressed by the balance of the game considering the conditions it has been created on.

The main parts of my bot are:

Weighted tracking

I record all the paths of the opponent and myself by storing a bitboard of the visited cells and a bitboard of the possible mines placement, and merge them by position if their number is too high, by performing a bitwise AND for the visited cells and a bitwise OR for the mines. At the start of each turn I also perform a merge to produce paths that will be used for the computation part. That way I keep track of every path but only do calculations on a number of paths equal to the number of possible positions, which ensure a low execution time not matter how silence has been spammed.

One thing I added near the end was to give each path a probability. This probability is added when 2 paths are merged, and it changes according to the opponent actions and positions: for example, silences are often performed with 0 or with the maximum possible distance (not necessarily 4), so those paths are given a higher probability. I originally planned to download replays and give weights to actions according to those replays, but I didn’t manage to extract something interesting from them, so I ended up giving random weights for only specific cases (silence and placement). It can probably be widely improved by treating more cases into account than I did, but at least it gave a nice boost for my bot.

Heuristics

SONAR, MINE and TRIGGER are performed with heuristics: mines are placed in order to cover the maximum number of cells (at least n new cells, n decreasing over time). They are triggered if they have a good probability to inflict damage. I only use sonar if the opponent doesn’t have a silence loaded, and if it helps cut a good percentage of paths.

Brute Force

For the other actions I try every possibility of doing (or not doing) the actions in the given order: SURFACE -> TORPEDO -> MOVE -> TORPEDO -> SILENCE -> TORPEDO. When I simulate an action, I also simulate it for all my paths to know which ones are going to be cut. The torpedoes are shot at the cell with the highest probability of dealing damage.

Then, for every opponent path I also do a brute force but without silence (SURFACE -> TORPEDO -> MOVE -> TORPEDO), keeps the best damage they expect to deal, and see what damage it will really deal me. The damage I expect to take back is then the average for all those paths (weighted by the paths probability). This makes my close combat decent, clearly not perfect as I still have some bugs to compute the expected damage I will take (there are corner cases when he can shoot at multiple good places; when he has to surface to inflict more damages will he do it ?..).

This computation was done most of the time in less than 2ms, but was sometimes above 50ms (especially at the start of the game for non-critical actions when there still are a lot of paths) so I just do 1 ply first with a damage back approximation, and keep this result if I don’t have the time for 2 plies.

Evaluation

My evaluation function takes into account:

the lives
my cooldowns
the average damage I expect to deal
the average damage I expect to take back
the number of positions where I can be
the probability of being on a mine (only if I have a low number of positions, otherwise the opponent will never trigger anyway)
a flood fill

In the end I don’t have any long term planning at all, the flood fill and trying to maximize the number of paths were just ok to not have something meaningless, but I thing I miss some crucial Elo here… as I do with all the details that are badly implemented, the unoptimized magic numbers, or the features I don’t have… we’re far from mastering this game!

Many thanks to the creators and CG to occupy us on those times, congratulation to the winners, and see you next time!

aCat · April 20, 2020, 9:31pm

I will start with a great prise for the authors of the contest. In my opinion it’s one of the best-designed games on CG.

Deep (pun intended), but without using dirty tricks like “you control N entities”; forcing you to directly approach the imperfect information issue (which could be completely ignored in e.g., LOCM), and simultaneously nearly always giving you something more to code and improve - it’s harder to reach the plateau that requires full code rewriting (e.g., need to recode full engine and sim) to improve - nearly on any stages the changes could be done incrementally.

Secondly, thanks to CG for giving us additional leaderboards of School and Companies. Small thing but a great thing. Surely makes competition more appealing for some people, and supports some platform-institution advertisement that goes both ways. CG4Edu!

Also - nice work with copying code straight to multi - although I miss double XP, this removes the artificial burden of downgrading your bot to pass till bronze and other strange cases (some people not resubmitting their bots, etc.). A step in the right direction - definitely.

Few things about my AI.
I ended #119 (Python till silver, than C#), a little bit disappointed. I started making a bot probably at the end of the second week of the contest - when showing OOC during my Institute open days stream.

I had tracker but only based on possible squares occupied by the opponent (or me in case of selftracking), not the paths. But I used some backtracking of moves to work partially as a path tracking. E.g. when opponent surfaces, and I’m updating the assumption he is on a certain square - I backtrack his orders before this silence and mark some squares as visited - thus removing usually nearly half of silence possibilities.

I tried to use a similar technique with the mine tracking - but this was mess, and the results were so noisy and unreliable that I turned this part off (which was probably a mistake).

I really liked the killermove part: I tested all combinations of silence, torpedo, surface, trigger, move to deal the final damage. It required pruning of some redundant combinations as otherwise I had timeouts.

I struggled with the movement strategy, finally I was choosing move (best node on shallow 3 level search) trying to minimize the chance that opponent will track me plus some checks on available water squares (floodfill) and a number of reachable squares in some horizon (BFS). Otherwise, torpedo/mine/trigger before or after the move. With some thresholds on torpedo/triggers expected damage, and the number of water squares/overlay with other mines for mine placement.

Lastly silence, if the opponent pretty much knows where I am. Either defensive (go as far from the opponent as possible when I had very low hp) or just the opposite, try to stick close.

Finally, I ended up in a situation in which I had no clear, easy feature to add, and tuning parameters is a hell, so I submitted on Sunday and went sleep without touching bot in the last minutes ;-). But - if I ever will try to improve my bot for the multi - I already have some ideas thanks to the all great the postmortems posted

So thank you all for the great challenge and great fun!

Flyingkid · April 21, 2020, 4:32am

Thanks for the great game and posting it to multi so quickly. I was not expecting that to be so fast. Guess that means we can work on it another 16 more days in preparation for the next contest!!

My bot did not make it very far at all. It finished in Wood 1 with my two submissions to the contest both having taken place 22 days ago … . Only implementing a crude flood-fill for pathing. I had started working on improvements to my pathing but I never finished. I guess I have more time now!!

TwoSteps · April 21, 2020, 10:43am

Great game! It’s awesome that the game worked so well considering how fast it was created.

A bit disappointed by my result (top 400). With the time invested, I was looking to make it to top 200. I think the overall level of bots was better than usual (with 4 weeks, I guess it’s normal).

I deliberately ignored mines (and I complain that I didn’t get top 200 ) because I wanted to have a very “aggressive” strategy with a few sonars and torpedoes. But enemy early silence commands countered me hard. That was fun anyway

Thank you for all the kind words and the sharing of your strategies. I hope you’ll enjoy the Spring Challenge 2020 coming soon.

@vbplaya we take into account what’s filled in the player profiles to assign them to their team. For the Spring Challenge 2020, any player will be able to contribute to their school and company. Could you please give me more details in PM about this search issues? (the leaderboard is still available, so we can perhaps reproduce it)