Guys, I need some help. I already solved Mars Lander with a hardcoded strategy based on if/then/else, but I wanted to practice stochastic algorithms, and as Jeff06 pointed out, Mars Lander may be a good candidate for this.
I'm trying Simulated Annealing at the moment, as it looked easier than GA to start with (fewer parameters to tune, easier to "figure out"). I'll definitely try GA and MC/MCTS later.
However, I just can't get it to work, and I'm crashing far more shuttles than ESA.
Here's what I've done:
1) simulate the "game engine": DONE (took some time due to rounding issues as mentioned above, but always correct now, including crash and landing prediction)
2) implement the SA algorithm, based on wp and this: DONE
3) tune the SA: deciding on the initial temperature and the annealing schedule, defining what a "solution" is, generating the first solution, generating a neighbour from a solution, and writing the "energy" function (score, fitness, whatever you call it): DONE, but most probably wrong in one or several of these
4) have code fast enough to run enough simulations in 100 ms (or 1 s for the first turn): ONGOING. At the moment, in Java, I can "apply" the game engine to a game state roughly 500k times in 100 ms. The number of playouts then varies with the chosen depth of a solution (for example, 10k with depth 50). I'm not spending time here at the moment; I think it's enough and that I should focus more on 3.
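To make the budget in 4) concrete, here is a minimal sketch of the arithmetic. The class and method names are made up for illustration, and it assumes every playout runs the full depth (no early exit on crash or landing).

```java
final class PlayoutBudget {
    // Full playouts affordable per turn, assuming each playout costs
    // exactly `depth` engine steps (hypothetical simplification).
    static int playouts(int engineStepsPerTurn, int depth) {
        return engineStepsPerTurn / depth;
    }

    public static void main(String[] args) {
        int engineSteps = 500_000; // roughly what I measure in 100 ms
        for (int depth : new int[] {10, 50, 150}) {
            System.out.println("depth " + depth + ": " + playouts(engineSteps, depth) + " playouts");
        }
    }
}
```

So going from depth 50 to depth 150 cuts the number of candidate solutions evaluated per turn to about a third.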
And here's what I get as results: crashes, crashes and crashes. I'm not even trying to optimize the fuel or work on complicated levels.
If I set the depth too high, like 150, I don't find even one solution in the first second. They all end in a crash, so the evaluation of the solution is always the worst possible one.
If I set it too low, like 10, the shuttle goes too fast and doesn't anticipate the coming crash early enough.
Something in between, like 50, doesn't work any better.
So here are my questions:
1) Is it correct to try SA on this problem? I think it is, but if you think otherwise please tell me why.
2) What depth should a solution have? If I understood some comments correctly, some people take only the remaining fuel at landing into account to compute the score of a solution. That means solutions need enough depth to at least allow a landing, so 150 looks like a minimum?
3) For the score/fitness, at the moment I'm using a mix of the distance to the landing target (middle of the landing zone) and the speed of the shuttle. When I'm far away from the target, I focus on reducing the distance; when I'm close enough, I focus more on controlling the speed, in order to allow a valid landing. I'm certainly not going to optimize anything like that, but I should easily find solutions. At least this is what I thought, but CG contradicts me.
4) Any advice to share on the initial temperature and its annealing schedule? For now, with a depth of 50 for instance, I start at 100 and remove 0.1% at each iteration, so I end close enough to 0, which is what we want.
5) Any advice on how to generate the first solution? For now it's a completely random chain of actions (within the boundaries of rotation / thrust changes allowed by the game from the previous state).
6) Any advice on how to generate a neighbour? For now I'm picking one round at random, then modifying it randomly.
7) Any advice on the score/fitness function? Does my distance / speed mix make sense?
8) Did I completely misunderstand SA?
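For reference, the schedule from 4) can be sketched like this. It is only a sketch: the names are hypothetical, and the acceptance test is the textbook Metropolis criterion (lower energy is better), assumed here because it isn't spelled out above.

```java
final class CoolingSketch {
    // Geometric cooling: each iteration removes `coolingRate` of the current temperature.
    static double temperature(double initial, double coolingRate, int iteration) {
        return initial * Math.pow(1.0 - coolingRate, iteration);
    }

    // Textbook Metropolis rule, assuming lower energy is better:
    // improvements are always accepted, worse candidates with probability exp(-delta / T).
    static double acceptanceProbability(double currentEnergy, double candidateEnergy, double temperature) {
        if (candidateEnergy <= currentEnergy) {
            return 1.0;
        }
        return Math.exp((currentEnergy - candidateEnergy) / temperature);
    }

    public static void main(String[] args) {
        // Start at 100 and remove 0.1% per iteration, as in question 4).
        for (int iteration : new int[] {0, 1_000, 5_000, 10_000}) {
            System.out.printf("iteration %d: T = %.4f%n", iteration, temperature(100.0, 0.001, iteration));
        }
    }
}
```

With 10k iterations the temperature ends at roughly 0.0045, so "close enough to 0" holds. Whether a start of 100 is sensible depends on the scale of the energy values, since only the ratio of the energy difference to the temperature enters the acceptance probability.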
I think that my biggest problem lies in 6, or maybe 5. Feedback would be much appreciated.
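To show what 5) and 6) amount to, here is a minimal sketch. It assumes a solution is stored as one pair of deltas per turn (rotation change, thrust change) rather than absolute values, so that replacing a single turn never violates the game's per-turn limits of ±15° rotation and ±1 thrust. That encoding and all the names are assumptions for illustration; clamping to the absolute ranges is left to the simulation.

```java
import java.util.Random;

final class NeighbourSketch {
    static final int MAX_ROTATION_STEP = 15; // degrees per turn, from the game rules
    static final int MAX_THRUST_STEP = 1;    // thrust levels per turn, from the game rules

    // One gene per turn: {rotation delta, thrust delta}, both within the per-turn limits.
    static int[] randomGene(Random rng) {
        return new int[] {
            rng.nextInt(2 * MAX_ROTATION_STEP + 1) - MAX_ROTATION_STEP,
            rng.nextInt(2 * MAX_THRUST_STEP + 1) - MAX_THRUST_STEP
        };
    }

    // First solution: a completely random chain of actions of the given depth.
    static int[][] randomSolution(int depth, Random rng) {
        int[][] solution = new int[depth][];
        for (int turn = 0; turn < depth; turn++) {
            solution[turn] = randomGene(rng);
        }
        return solution;
    }

    // Neighbour: copy the solution, pick one turn at random and re-randomize it.
    static int[][] neighbour(int[][] solution, Random rng) {
        int[][] copy = new int[solution.length][];
        for (int turn = 0; turn < solution.length; turn++) {
            copy[turn] = solution[turn].clone();
        }
        copy[rng.nextInt(copy.length)] = randomGene(rng);
        return copy;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int[][] current = randomSolution(50, rng);
        int[][] candidate = neighbour(current, rng);
        System.out.println("turns in candidate: " + candidate.length);
    }
}
```

Note that with deltas, changing one early turn shifts the absolute rotation and thrust of every later turn, so a "small" edit in the encoding is not necessarily a small change in the trajectory.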
I can share replays, or pseudo-code, if that helps diagnose the issue.
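As a starting point, the distance / speed mix from 3) looks roughly like the sketch below. The threshold and penalty constants are placeholders rather than tuned values, the names are made up, and the speed limits are the game's safe-landing limits (20 m/s horizontal, 40 m/s vertical).

```java
final class EnergySketch {
    static final double CLOSE_ENOUGH = 500.0;  // metres; placeholder threshold
    static final double FAR_PENALTY = 1_000.0; // placeholder; keeps "far" states worse than "close" ones
    static final double MAX_H_SPEED = 20.0;    // m/s, safe-landing limit from the game rules
    static final double MAX_V_SPEED = 40.0;    // m/s, safe-landing limit from the game rules

    // Energy of a final state; lower is better.
    static double energy(double x, double y, double hSpeed, double vSpeed,
                         double targetX, double targetY) {
        double distance = Math.hypot(targetX - x, targetY - y);
        if (distance > CLOSE_ENOUGH) {
            // Far from the target: only the distance matters.
            return FAR_PENALTY + distance;
        }
        // Close to the target: penalize speed above the safe-landing limits.
        double excessH = Math.max(0.0, Math.abs(hSpeed) - MAX_H_SPEED);
        double excessV = Math.max(0.0, Math.abs(vSpeed) - MAX_V_SPEED);
        return Math.min(FAR_PENALTY, excessH + excessV);
    }

    public static void main(String[] args) {
        System.out.println(energy(0, 2000, 0, 0, 2500, 100));      // far from the target
        System.out.println(energy(2500, 100, 0, -50, 2500, 100));  // on target, too fast
    }
}
```

One thing this makes visible: if every playout of a given depth ends far away or in a crash scored as "worst possible", the energy is flat and SA has no gradient to follow.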