Neural Network Resources

To train neural nets, you need an environment for personal use. If there were a way to interact with the environment locally, that would be great. Otherwise you have to write it yourself, which is often not possible.
It would be good if there were a server.exe file for the environment, or something similar.

Hello,
can you help me?
How do I save the weights of a neural network in "codingame"?

Hello Batya_S1,
you can't save from the online IDE.
You can only train locally on your computer… then you can create a list of weights to use online.

1 Like

Thank you jolindien

After 4 years, here I come.
I have a question about the dataset.
I really don't understand the input.

int D = 602+fastRandInt(0, 10000)*sqrt(fastRandDouble());
double angle = fastRandDouble(-PI, PI)*sqrt(fastRandDouble());
double angleV = fastRandDouble(-PI, PI)*sqrt(fastRandDouble());
double normV = fastRandDouble()*fastRandDouble()*1000.0;
double angleNextDir = fastRandDouble(-PI, PI);
double distanceCP1CP2 = fastRandInt(1300, 10000)*sqrt(fastRandDouble());

Could you describe those variables?

I guess that angle is the current pod angle, angleV is the angle of the velocity, normV is the norm of the velocity vector, and distanceCP1CP2 is the distance between the next and next-next checkpoints.

But what is angleNextDir? I can't guess it.

angleNextDir is the angle between (pod->CP1) and (CP1->CP2)

I have copied the header of the data file here and reorganized it / added a few comments:

// Section 1 : distribution of random gamestates used for this database
int D = 602+fastRandInt(0, 10000)*sqrt(fastRandDouble()); // Distance between pod and CP1 (its next CP)
double angle = fastRandDouble(-PI, PI)*sqrt(fastRandDouble()); // Assuming angle 0 is "straight to CP1", this is where the pod looks
double angleV = fastRandDouble(-PI, PI)*sqrt(fastRandDouble()); // Assuming angle 0 is "straight to CP1", this is the angle of the normalized pod velocity
double normV = fastRandDouble()*fastRandDouble()*1000.0; // Current speed of the pod
double angleNextDir = fastRandDouble(-PI, PI); // Assuming angle 0 is "straight to CP1", this is the angle between (pod->CP1) and (CP1->CP2)
double distanceCP1CP2 = fastRandInt(1300, 10000)*sqrt(fastRandDouble()); // Distance between CP1 and CP2

// Section 2 : features calculated from this distribution
double angleScalD = dot(vec2ByAngle(angle), vec2(1.0,0.0));
double angleCrossD = cross(vec2ByAngle(angle), vec2(1.0,0.0));
int vScalD = (int)(normV*dot(vec2ByAngle(angleV), vec2(1.0,0.0)));
int vCrossD = (int)(normV*cross(vec2ByAngle(angleV), vec2(1.0,0.0)));
double nextDirScalD = dot(vec2ByAngle(angleNextDir), vec2(1.0, 0.0));
double nextDirCrossD = cross(vec2ByAngle(angleNextDir), vec2(1.0, 0.0));

// Section 3 : data from section 1 and 2 is normalized this way before being provided to the neural network as input (saved in the database on one line with space separators)
(D-600)/20000.0 
angleScalD
angleCrossD
vScalD/1000.0
vCrossD/1000.0
nextDirScalD
nextDirCrossD
(distanceCP1CP2 - 1200.0) / 10000.0

// Section 4 : same as section 3, but for the output (saved in the database on one line with space separators)
(gameAction.deltaAngle*180/PI+18.0)/36.0
gameAction.thrust/200.0

// Miscellaneous : 
double dot(const vec2 &a, const vec2 &b) {
  return a.x*b.x+a.y*b.y;
}
double cross(const vec2 &vec, const vec2 &axe) {
  return vec.y*axe.x-vec.x*axe.y;
}
6 Likes

Thank you so much :smiley:

Can you tell me what vec2ByAngle does (and what an alternative would be in Python)? And for 'angleV', do we multiply the angle by sqrt(vx) or by sqrt((vx * vx) + (vy * vy))?
Can you please clear that up?

Vec2ByAngle(angle) is {cos(angle), sin(angle)}

As for angleV, this was a value used internally to generate the positions in the database. It does not appear in the input features or in the output values of the neural network. If you wanted to recalculate it for whatever reason, the formula would be atan2(vy, vx) in radians.

There is no relation between angle and angleV, except for the poor, uninformative names, which have changed in my code since :slight_smile:
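
In Python, those helpers could look like this (a minimal sketch; the function names are mine):

import math

def vec2_by_angle(angle):                 # vec2ByAngle(angle) = {cos(angle), sin(angle)}
    return (math.cos(angle), math.sin(angle))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def cross(vec, axe):                      # same convention as the C++ snippet above
    return vec[1] * axe[0] - vec[0] * axe[1]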

To give some reference points to others who would be tempted to go the neural network route for Mad Pod Racing / Coders Strike Back:

  • My very old C# genetic algorithm from back in the day got me into legend at the time
  • Training a neural network runner based on @pb4's public dataset got me to about 650 in legend (Python + TensorFlow 2)
  • Emboldened by this success, I started on the reinforcement learning path (which was significantly more work) and it got me to about 500 in legend

I kept my network small (6x64x64x5) so all I needed was to encode the TensorFlow weights in base64.

At this point it looks like you’d need to also train a “blocker” to progress further!

But how do you guys train your AIs outside of codingame?

To train a network on some existing data like @pb4’s public dataset you don’t need anything too special as you have the inputs and outputs already. At this point you need to realize your neural network will simply learn/approximate the behaviour of the decent “runner” that generated that dataset.

The next step is to write your own neural network library (best to learn the fundamentals) or use an existing one like TensorFlow/Keras or PyTorch (a good option too). You feed it the inputs and outputs and hope the network learns. You'll want to visualize the loss over time at this point.
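
For illustration, here is a minimal supervised-learning sketch with TensorFlow/Keras, assuming the 8 normalized inputs and 2 normalized outputs described above have been extracted into two placeholder files, one sample per line:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Placeholder file names: space-separated values, already normalized
# as described in sections 3 and 4 of the dataset header.
X = np.loadtxt("inputs.txt")    # shape (N, 8)
Y = np.loadtxt("outputs.txt")   # shape (N, 2)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2, activation="sigmoid"),   # both targets are in [0, 1]
])
model.compile(optimizer="adam", loss="mse")
history = model.fit(X, Y, epochs=50, batch_size=256, validation_split=0.1)

# Visualize the loss over time
plt.plot(history.history["loss"], label="train")
plt.plot(history.history["val_loss"], label="validation")
plt.legend()
plt.show()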

You probably want to write some kind of visualization tool so that you can see what your AI does and double check that what your bot is learning makes sense (but I guess that’s technically optional?).

Once your network has learned something, you can extract the weights and somehow make them accessible to your codingame script. This could be in raw text form, e.g. arrays of float numbers, or something more efficient/advanced: base64-encoded (easy), some custom encoding (base85?), or taking advantage of unicode magic (probably harder). Don't worry too much about this step; keep your network small so that a basic encoding will not hit the 100k code limit.
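
As one possibility, a sketch of base64 packing with numpy, assuming a Keras model as in the snippet above:

import base64
import numpy as np

weights = model.get_weights()       # list of kernel/bias numpy arrays, in layer order
blob = b"".join(w.astype(np.float32).tobytes() for w in weights)
encoded = base64.b64encode(blob).decode("ascii")
print(len(encoded), "characters")   # keep an eye on the 100k code size limit

# On the codingame side, decode it back (the layer shapes must be hard-coded too):
shapes = [w.shape for w in weights]
data = np.frombuffer(base64.b64decode(encoded), dtype=np.float32)
restored, offset = [], 0
for shape in shapes:
    size = int(np.prod(shape))
    restored.append(data[offset:offset + size].reshape(shape))
    offset += size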

Once you've got the weights into codingame, you need to implement a simple neural network forward pass. That's the basic algorithm that multiplies your inputs by the weights, adds them up and applies an activation function. Rinse and repeat if you have multiple hidden layers.
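
A sketch of such a forward pass in plain numpy, assuming the kernel/bias list restored from the decoding snippet above, ReLU hidden layers and a sigmoid output (inputs is a placeholder for the 8 normalized features of the current turn):

import numpy as np

def forward(x, restored):
    # restored alternates kernel, bias, kernel, bias, ... as returned by get_weights()
    for i in range(0, len(restored) - 2, 2):
        x = np.maximum(0.0, x @ restored[i] + restored[i + 1])   # ReLU hidden layers
    x = x @ restored[-2] + restored[-1]                          # output layer
    return 1.0 / (1.0 + np.exp(-x))                              # sigmoid

outputs = forward(np.array(inputs, dtype=np.float32), restored)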

Speaking of inputs: if your network learned from an existing dataset, make sure you transform the codingame state/inputs for the turn into whatever format your network learned on. This can involve computing angles, distances, … Triple-check everything, as it's easy to have bugs in that part.

That's it, you now have your network outputs in codingame. Map that to what your AI needs to output to the game (turn/thrust) and you're in business! It probably won't work the first time you try, but don't give up :slight_smile: At some point your runner will move nicely and you will have applied machine learning to a codingame challenge, congratulations!
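
For a network trained on the dataset above, undoing the output normalization from Section 4 could look like this (a sketch; pod_x, pod_y and pod_angle_deg are placeholder names for values read from the game input, and outputs comes from the forward pass above):

import math

delta_angle_deg = outputs[0] * 36.0 - 18.0    # inverse of (deltaAngle*180/PI + 18) / 36
thrust = outputs[1] * 200.0                   # inverse of thrust / 200; clamp to whatever your league allows
target_angle = math.radians(pod_angle_deg + delta_angle_deg)
target_x = pod_x + 10000 * math.cos(target_angle)
target_y = pod_y + 10000 * math.sin(target_angle)
print(int(round(target_x)), int(round(target_y)), int(round(thrust)))   # "X Y THRUST"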

For reinforcement learning, that's a bit more involved, as you basically need to create some kind of simulator for the game that you can run on your local machine.

In most of the recent competitions, a Java version of the referee (the game engine) was provided, so you can reuse it or rewrite it in a different language. For older games you may be able to find third-party implementations online. Check the readme on this repo for some links.

You can train a good runner by building a simplified simulator that only runs a single pod (no bounce, no shield).
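
As a starting point, here is a minimal single-pod step function in Python, a sketch based on the commonly cited CSB physics (rotation capped at 18 degrees per turn, friction 0.85, positions rounded and velocities truncated at the end of the turn, 600-unit checkpoint radius); double-check these constants against the official referee:

import math

class Pod:
    def __init__(self, x, y, angle_deg):
        self.x, self.y = x, y
        self.vx, self.vy = 0.0, 0.0
        self.angle = angle_deg

    def step(self, target_x, target_y, thrust):
        # Rotate toward the target, by at most 18 degrees this turn
        desired = math.degrees(math.atan2(target_y - self.y, target_x - self.x))
        delta = (desired - self.angle + 180.0) % 360.0 - 180.0
        self.angle += max(-18.0, min(18.0, delta))

        # Apply thrust along the facing direction, then move
        self.vx += thrust * math.cos(math.radians(self.angle))
        self.vy += thrust * math.sin(math.radians(self.angle))
        self.x += self.vx
        self.y += self.vy

        # End-of-turn friction and truncation
        self.vx = math.trunc(self.vx * 0.85)
        self.vy = math.trunc(self.vy * 0.85)
        self.x, self.y = round(self.x), round(self.y)

    def reached(self, cp_x, cp_y):
        return (self.x - cp_x) ** 2 + (self.y - cp_y) ** 2 < 600 ** 2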

To train a team with a “blocker” you might need a full fledged simulator with all the pods, possibly using self play. The training is going to be an order of magnitude harder.

Good luck!

6 Likes

Thank you, pb4, for your dataset and all those explanations. Unfortunately, it was almost impossible for me to create the mapping from the game state to the network input that meets the criteria of this dataset.

I am very thankful to FrancoisB, who supplied me with some kind of verification set. I was therefore able to check and debug my mapping. Maybe it will help someone else as well.

For anyone wondering: angles going “to the left” are supposed to be negative and those to the right, positive. Oh, and try to differentiate between deg and rad functions :wink:

x =           [ 3,    3000, 1000, 1000, 413]
y =           [ 5,    4000, 2000, 1000, 4577]
angle =       [ 40,   85,   225,  265,  330]
vx =          [-50,   80,   120, -213, -80]
vy =          [ 80,   100,  120,  155, -90]
nextCPx =     [ 3000, 2000, 3000, 8000, 4023]
nextCPy =     [ 8000, 2000, 2000, 8000, 7582]
nextnextCPx = [ 8000, 9000, 5000, 5000, 8123]
nextnextCPy = [ 2000, 1000, 2000, 5000, 3523]

//maps to 

(D-600)/20000               = [ 0.3969134397041162   0.08180339887498948  0.07                 0.46497474683058326     0.20485168191861008]
angleScalD                  = [ 0.8707763031101143  -0.9300008585858587  -0.7071067811865475  -0.7660444431189782      0.3457185604113709]     
angleCrossD                 = [-0.4916793975161887  -0.3675573465863057  -0.7071067811865476  -0.6427876096865391     -0.9383382529701586]     
vScalD/1000                 = [ 0.05735940292011353 -0.12521980673998823  0.12000000000000001 -0.041012193308819826   -0.1190645081677152]     
vCrossD/1000                = [ 0.07489925831841109  0.026832815729997458 0.12                 0.2602152954766495     -0.017990077675765652]   
nextDirScalD                = [-0.4946314746513534  -0.31622776601683805  1.0                 -1.0                     0.09608170209599205]    
nextDirCrossD               = [-0.8691028157152798   0.9486832980505138   0.0                 -1.2246467991473532e-16 -0.9953734507823367]     
(distanceCP1CP2-1200)/10000 = [ 0.6610249675906654   0.5871067811865475   0.08                 0.30426406871192846     0.4569357069899557]     
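
A minimal Python sketch of this mapping, following the dataset header comments (the key point being that every angle is measured relative to the pod→CP1 direction):

import math

def map_to_network_input(x, y, angle_deg, vx, vy, cp1x, cp1y, cp2x, cp2y):
    # Unit vector from the pod to CP1; every angle below is measured relative to it
    dx, dy = cp1x - x, cp1y - y
    d = math.hypot(dx, dy)
    ax, ay = dx / d, dy / d

    def scal(px, py):   # dot(vec, axis)
        return px * ax + py * ay

    def cross(px, py):  # cross(vec, axis) = vec.y*axis.x - vec.x*axis.y
        return py * ax - px * ay

    # Pod facing direction (the game gives the absolute angle in degrees)
    fx, fy = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))

    # Direction CP1 -> CP2
    ex, ey = cp2x - cp1x, cp2y - cp1y
    d12 = math.hypot(ex, ey)

    return [
        (d - 600.0) / 20000.0,
        scal(fx, fy),                 # angleScalD
        cross(fx, fy),                # angleCrossD
        scal(vx, vy) / 1000.0,        # vScalD/1000 (the header casts vScalD to int first)
        cross(vx, vy) / 1000.0,       # vCrossD/1000
        scal(ex, ey) / d12,           # nextDirScalD
        cross(ex, ey) / d12,          # nextDirCrossD
        (d12 - 1200.0) / 10000.0,
    ]

Running it on the first column above (x=3, y=5, angle=40, …) should reproduce the listed values.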

All credit goes to FrancoisB and pb4, of course!

3 Likes

If you use a neural network with 2 outputs (angle and thrust), what are your typical losses on this dataset? I'm trying something similar for Search Race, and so far I've failed to get meaningful results with supervised learning. Randomly tweaking weights via neuroevolution has yielded better results so far.

Hey! Sorry to bother you after so much time, but I was wondering whether you could share what inputs you used for your NN. You mentioned you have 80 inputs. I’m assuming you preprocessed the inputs (e.g. normalized, relativized, etc.) in order to preserve translational and rotational invariance and facilitate training.

Thank you!

How do you calculate angleScalD?
For the example of an angle of 40:
radians = (40 * pi) / 180 = 0.698132

angleScalD = dot(vec2ByAngle(angle), vec2(1.0, 0.0))
angleScalD = dot(vec2ByAngle(0.698132), vec2(1.0, 0.0))
angleScalD = (cos(0.698132) * 1.0) + (sin(0.698132) * 0.0)
angleScalD = 0.766044

but according to your verification set the result is:
0.8707763031101143

I'm trying reinforcement learning (clipped PPO) for Mars Lander EP3.
The model can land from halfway now. Still training for a successful landing from the starting point.

Thanks a lot for sharing! I'll probably need to "somehow make them accessible to your codingame script" later.

3 Likes

finally a working policy!!!
lots to do to “somehow make them accessible to your codingame script.” :)))

Update: my 4-layer x 200-node NN is too big.
Just one 200x200 matrix is nearly 200k as a base64 string; not possible to use here…

Any suggestions?

2 Likes

Do you really need that many nodes for Mars Lander? How many weights / floats do you have in total? If lowered precision isn't an issue, I can do ~15.5 bits of float precision per character.

Thanks for your help!

Could you please share more details about "15.5 bits of float precision per character"? I'm kind of a newbie regarding this part.

I haven't tried a smaller NN yet; the 4 layers of 200 nodes are my first working network. Maybe I will try to train a smaller one.
In terms of parameters: input 13x1, then 200x13, 200x200, 200x200, 200x200, 5x200 to get the output.
The weights are numpy arrays in np.float32 (one 200x200 weight matrix becomes a base64 string of length 213336).
Is it possible to fit the whole thing under the 100k limit?

Thanks again~