Coders Strike Back: Your ideas!

kazrakel · March 7, 2016, 7:59pm

Great idea!!

Hastings · March 7, 2016, 8:39pm

4th here

My AI was composed of two levels, a heuristic bot designed to make a decent move with very little computation, and a minmax bot that uses the heuristic bot, which plays both sides, combined with a simulator to evaluate positions.

The heuristic bot is somewhat simple. It first checks if it is the furthest along on its team. If is is, it plays offense, otherwise it plays defense.

On offense, the heuristic bot targets the point nextCheckPointPosition - 4 * velocity, and only thrusts if it is approximately facing the point it is targeting. The defense bot uses a similar formula to either block the opposing bot, or beat it to the checkpoint after the one it is currently passing. See if you can work out why this formula approximates correct behaviour!

The minmax bot selects the closest enemy pod as the opponent it will minmax against. For each turn in the future, it considers four moves for itself, and four moves for the selected opponent, giving 16 combinations of moves per turn. The other two pods are controlled by the heuristic bot. These are each simulated recursively for three steps, giving 6500 positions. This is where the two level hierarchy comes in: to evaluate further into the future, the bots are assumed to all be controlled by the heuristic bot for an additional 4 moves. Each position’s score three moves in the future is given by a simple function of the predicted position after an additional 4 heuristic moves, giving a total depth of 7 moves.

Once each of the 6500 positions are evaluated, the bot simply selects the best using the standard minmax formula: maximizing the score, assuming that the opponent will select moves that minimize the score.

Obviously I have made some simplifications and omissions here, largely in the details of the “simple” score function and the details of simulating motion and collision. However, I think I’ve presented the important parts of the algorithm clearly

pb4 · March 7, 2016, 8:56pm

I was looking for good basic heuristics, and was not able to think about this one. I like it a lot !

Psyho · March 7, 2016, 9:19pm

8th place here

The core of my strategy is pretty naive beam search for pathfinding (0/50/100/150/200 + -18/-9/0/+9/+18 first move and 0/200 + -18/+18 for subsequent). Without any big optimizations, I was able to get depth of 60 and width of 1000 with duplicates removal (interchanging left/right rotations). Evaluation is done based on (laps, checkpoint, distance_to_next_checkpoint). WIth 60 steps simulation, I can safely ignore rotations, etc. By default pathfinding ignores position of all of the ships.

I’ve added “no go” zones to my beam search and a different evaluation function for tracking enemy player.

Racer/Blocker (for both players) split is based on EV using smaller beam search, it works quite well during usually very chaotic start. My blocker avoids my racer using “no go” zones in beam search. Racer tries to avoid the enemy blocker if the blocker is far from the next checkpoint. Otherwise it will just mindlessly ram the enemy pod (which is not such a good strategy). Blocker tries to intercept enemy pod based on expected path (computed via beam search) for the enemy racer and tries to maximize velocity during collision. In a way, it works great because it can very accurately tell when is the closest interception moment (assuming enemy bot is focused on racing) and can calculate “snipes” up to 15-20 turns ahead.

Overall, my AI did horrible job at blocking / pushing away enemy blockers (adding to my eval collision angles would help tremendously). I also didn’t simulate the collisions, so quite often my blocker helped enemy pod by pushing them into correct spot. On the flipside, when enemy didn’t have any good blockers (which wasn’t the case in top10), I was able to quickly race through the checkpoints.

And that’s pretty much all. Lots of time wasted on reverse engineering the physics and in the end not implementing it, since I ran out of time in order to have any chance of getting into top3. At the same time, staring at the visualization was fun, so I should complain that much

PierreBizeul · March 7, 2016, 9:38pm

This heuristic is mysterious for me, it seems to work quite well on a drawing, but where does it come from ? and why 4 ?

@closedeyez : I don’t really get how you can go to depth 60, with a width of 1000, how do you select the candidates moves ?

Hastings · March 7, 2016, 9:49pm

I used 4 because it performed better than 3 or 5.

PierreBizeul · March 7, 2016, 10:05pm

Ok, and where the idea of considerin Next_CP -k*velocity came from ? I mean, once said it seems interesting but it’s not the first thought you get, right ?

Psyho · March 7, 2016, 10:25pm

@closedeyez : I don’t really get how you can go to depth 60, with a width of 1000, how do you select the candidates moves ?

Beam Search is basically a BFS where you take only the best WIDTH states at every turn. I already told the evaluation function (it’s the same for selecting intermediate and final states). This is a very standard technique, although not as widely knows as for example A* (which sucks here compared to BS).

Roxa · March 7, 2016, 10:45pm

Oh yes ! I really need your PRO advices to progress !

C’est vraiment une super idée de faire ça !

Merci d’avance

Bleuj · March 7, 2016, 11:06pm

Interesting approach !
My early bot used the same technique as your heuristic bot but with 3.5 coefficient instead of 4 if I remember well (also I released the thrust when not facing the CP). Nice way to estimate a normal behaviour rapidly. I didn’t think of using it to gain depth when I moved into prediction.

_CPC_Herbert · March 7, 2016, 11:22pm

26th here

I also did some Monte Carlo tryouts until running out of time. I had a model with a target point for each pod in game, and I assumed that each pod would accelerate towards that target for at least a certain amount of turns. For runners, this was usually a point near their next checkpoint, for blockers they usually targeted somewhere near the other runner.

For the enemy pods, I predicted these target points either from the intersection of the acceleration vector of the previous turn and the one of the current turn if there was any, or an arbitrary target point 5000units ahead in the direction of the current acceleration. Also, during the turn prediction, if I detected that a blocker was near my runner, I assumed that it would target its (position + speed + acceleration) point.

To choose the best move, I selected a random target point for my pod, in 5000 units around its position, and a random thrust value, and I simulated an acceleration towards it for 10 turns (I did some experiments with more or a random number of turns but it wasn’t more effective). For each turns, I predicted the whole system, including collisions, and the enemy supposed trajectories, and evaluated each trajectory with different evaluation functions depending on the pod strategy.

I wanted to evaluate the moves of my two pods simultaneously but couldn’t manage to get good results. So I evaluated my blocker strategy first, with the evaluation function trying to maximize the distance of the ennemy runner to its next checkpoint, with penalties when it reached it, as well as the proximity of the blocker to it. The runner evaluation was the opposite, trying to minimize the distance to the next CP, with bonuses when passing through it and for proximity to the checkpoint after it.

I guess it wasn’t too bad, but that’s probably because I did it a lot of time…

Regarding the shield usage, I had it initially as a randomized parameter, but I finally moved it just before outputting the result, and activated the shield every time there was a collision going to happen with 100% certitude, except when the collision was on the runner and from behind. In the end it was pretty effective to kick other pods out of my way.

Bleuj · March 7, 2016, 11:34pm

@Psyho_ : Thanks for the clarification. (I tried to reduce Width at some point, but in my DFS… now I feel stupid.)

15th place here, first AI contest

About my ideas, I just had two runners and I used basic prediction of depth 5/6. I tried to both minimize the remaining distance and improve some angles. There are a couple extra things like shield implementation but that’s it. I got 15th, which seems to be a level far worse top 10 and quite similar to the rest of top 30 bots but for some reasons it seems slightly better than people behind me, or I was just lucky.

I can’t wait to modify my bot with beam search or genetic algorithmn and see the result.

dimkadimon · March 8, 2016, 12:34am

I used the same heuristic except with 2.5 factor. This gave me a top 150 solution (at the time) with just a couple lines of code! Annoyingly this heuristic was pretty hard to beat with more complex methods.

My final solution searches over all factors (0 to 4 in steps of 0.4) and all thrusts (0 to 200 in steps of 10) for the next two checkpoints. I activate the shield if I am about to hit an opponent and push him away from his next checkpoint. I finished 262 overall.

R4N4R4M4 · March 8, 2016, 7:12am

Hi,
Same for me too, with next position (enemy pod) instead of next cp(enemy pod) and * 0.6 * distance (my pod, enemy) * velocity (enemy) instead of - 4 * velocity. This heuristic is only for very far targets. for near targets, i’m using next position of target. The choice between attack pod or defense his next next checkpoint is based on the formula distance(my pod, enemy next cp) < 1.2 * distance (enemy pod, enemy next cp) then attack only if 6 * distance(my pod, enemy next cp) < distance (enemy pod, enemy next cp). This give my pod time to be well oriented before thrusting.

Agade · March 8, 2016, 11:11am

I arrived 19th.

Summary

My bot used 2 “runners” who ran the race as fast as they could while pretty much ignoring each others’ and the enemies’ movements.
I had a greedy algorithm that, at every turn, would try the angles from -18 to 18 by steps of 0.25 degrees. And it tried those angles for thrusts 0 and 200. Every move was evaluated and the best one was picked by its perceived value for the future. The key was my evaluation function which gave a good representation of my pod’s ability to get to its target quickly.
The pods would obviously aim for the current checkpoint. However, an important part of the program was to check, with pretty much the same AI, if it could start aiming for the next checkpoint and still pass through the current checkpoint within 10 turns.
There was also a sort-of hardcoded behavior that if my evaluation function said the best move had thrust 0, I wouldn’t aim for the target given by the evaluation function but instead turn the pod towards the target as much as possible, that is to say I would test +18 and -18 degrees and compare the angles.
If a collision is detected with an enemy pod for the next turn, assuming I don’t accelerate and the enemy pod doesn’t accelerate, I shield if the relative velocity norm(v1-v2) is superior to 500.
Computational cost is very low, bot answers in <0.2ms though it went up to ~1ms at the end of the contest because I looked at angles by steps of 0.25 instead of 1. But it didn’t seem to help too much because we don’t need to be that accurate.

What is a scalar product?
Many of you know that if you have two vectors (a,b) and (c,d) you can take the scalar product S=ac+bd. But not all of you understand what this scalar product represents.
What you should understand is that if you have a vector d=(x,y) and you normalize it, that is to say divide by its norm sqrt(x^2+y^2), you get a vector of norm 1 that points in the same direction, let’s call it unit_d. If you then take a vector v and take its scalar product with unit_d, you get v’s component in the unit_d direction.
For example, you all know that for a vector (a,b), a is its “x component”. This is because (1,0) is the unit vector that points in the x direction and (a,b)*(1,0) = 1a+0b=a.

Evaluation function
To evaluate that a pod is going towards a target a really important thing to look at is the projection of its speed in the direction of its target, that is to say the scalar product between its speed v and a unit vector going from it to its target. If the pod’s position is r and the target’s position is t this vector is t-r. However, if you take this as your evaluation function your pod will basically always aim towards the next checkpoint and accelerate at 200 if its angle is within 90 degrees of the correct angle and not thrust at all if it faces the wrong way. If you code this you will find your pod spinning around the target alot.
To fix this we need to take into account the other component of our speed vector, the orthogonal component. If we have our unit vector d_norm=(a,b) which points towards the target, we can make an orthogonal unit vector o=(-b,a). We can then take the scalar product of our speed with this orthogonal vector to get the component of our speed that is pointing at 90 degrees and making us spin around the target. If I call d the distance to my target a very good quantity that represents the problems that the orthogonal speed is causing us is: (orth_speed^2)/(d).
If we call coll_speed the component of the speed that goes towards the target our evaluation function is score=coll_speed-5*(orth_speed^2)/(d). The 5 is a value I found by hand. The minus sign is because this orthogonal speed is taking us away from the target.

Refinements

Don’t take the vector between the pod and its checkpoint as target, take the vector between the pod’s position after the move and the checkpoint. This better represents your future.
When moving towards your current checkpoint and checking whether or not you can start aiming for the next checkpoint, project your speed on the vector between the next checkpoint and the current checkpoint, not the vector between the pod’s current position and the next checkpoint. That is because you want to move your pod keeping in mind that you are going to be at the next checkpoint.

Blockers
On the last day I made a blocker that would intercept the enemy by moving it with my own AI and looking for the highest speed interception within ~15 turns but this wasnt so conclusive. I think a couple of high speed interceptions arn’t the best blocking method, it’s probably better to manage to put yourself between the checkpoint and the enemy and ram/block him over and over. At this point in the contest submits were extremely slow so I wasnt able to tell if having 1 blocker and 1 runner was helping or not, so I switched back to 2 runners for the final submit.

Proof of (orth_speed^2)/(d) component
If my pod is at a distance d to the target and I move by y at 90 degrees, by pythagorus, my distance becomes sqrt(d^2+y^2). We want to taylor expand this function with respecto to y. If I differentiate sqrt(d^2+y^2) I get a first order derivative x/(sqrt(x^2+d^2)) which cancels out when x==0 so we take the second derivative (d^2)/((x^2+d^2)^(3/2)) which evaluates to 1/(d) when x==0. The first term in the taylor expansion of sqrt(d^2+y^2) is therefore y^2/(d). We can therefore say that our distance is getting bigger by (orth_speed^2)/(d)

_CG_Aude · March 8, 2016, 2:23pm

Simply awesome! Can we push it on a blog article, quoting you?

PierreBizeul · March 8, 2016, 6:38pm

@Agade : I’ve tried evaluation functions with v_eff (effective speed/velocity) and v_ortho but well the Taylor’s development was a good idea, have you tried to push to the second order ? And what exactly was you method to decide when to aim the next CP, I’m interested in that.

About v_ortho, I actually had the idea of computing the total derive/divert (dont know the correct word in English) it would represent : derive= v_ortho*(1/(1-0.85)) (i.e v_ortho*(0.85+0.852+0.853…) = 5.6*v_ortho. It means how much you would derive/divert if you aimed the next CP without steering somehow (u point nextCP). then derive/d gives you an interesting deviation angle perhaps, (maximum deviation from the straight line between the two checkpoints, at the first order - it’s asin(derive/d)) and you can take that as an interesting score perhaps. I actually did not give it a try

pb4 · March 8, 2016, 6:59pm

Here is a link to a few pages describing my approach to this Coders Strike Back challenge :

My approach to Coders Strike Back

Magus : I do not go deep into the description of evolutionary algorithms, you can still write this article you mentionned

Agade · March 8, 2016, 8:14pm

Your idea of looking at the total “drift” due to v_ortho is interesting.

I didn’t push the developpement further, the truth is I found v_orth^2 after some time by looking alot at the trajectories (v_ortho^1 just didn’t give nice curves) and I wanted to put 1/d because of how I saw my pod drift too much at short distances. Then I found this “proof” and was satisfied to put v_orth^2/d. The truth is instead of taking the next order you should just take sqrt(d^2+v_ortho^2)-sqrt(d^2), it will give you the entire series, I most likely will test that when the multi AI comes out.

As for how to decide to aim for the next CP I tried to explain in my post but I don’t blame you if you didnt understand. I simulate 10 moves of my AI moving to the next CP. If in this simulation if I pass through the current CP then I know for sure I can run my AI on the next CP instead of the current one. This is actually a very robust idea Magus gave me during the short SF contest.

PierreBizeul · March 8, 2016, 11:20pm

Yeah ok, though it was quite straightforward i wasn’t sure what your pod was doing in the 10 turns. Yeah, overall I’m quite seduced by your algo