Very educational puzzle, clear description, nice YouTube links - so congrats, I liked this!!!
The puzzle itself is not ‘very hard’ at all; we just need to follow the instructions in the statement carefully. Not getting lost among so many different variable names and array indexes was the main challenge. The ability to play around with the model in part 2 of the puzzle was also nice. I found some good parameters after 5 minutes of trial and error.
Interestingly, a deeper NN or more nodes per layer did not always improve the results, so in the end, tuning Eta and maxing out the number of training iterations within the timeout threshold proved the way to go for me.
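For anyone who wants to repeat that kind of experiment offline, here is a rough sketch of an eta sweep. It is not the puzzle's code: it assumes a tiny XOR data set as a stand-in, one hidden layer with sigmoid activations, and a fixed iteration count in place of the real timeout. The names `SAMPLES`, `train` and `mean_squared_error` are made up for illustration.

```python
import math
import random

# XOR as a stand-in training set (assumption: the puzzle's own data differs).
SAMPLES = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
           ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(x):
    # Clamp the input so math.exp can never overflow.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hidden, w_out, inputs):
    # Every weight row stores its bias as the last element.
    hidden = [sigmoid(w[0] * inputs[0] + w[1] * inputs[1] + w[2])
              for w in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, output


def mean_squared_error(w_hidden, w_out):
    return sum((forward(w_hidden, w_out, x)[1] - t) ** 2
               for x, t in SAMPLES) / len(SAMPLES)


def train(eta, hidden_nodes=3, iterations=2000, seed=1):
    """Train with plain online backpropagation; return the final error."""
    rng = random.Random(seed)  # fixed seed so runs are comparable
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)]
                for _ in range(hidden_nodes)]
    w_out = [rng.uniform(-1, 1) for _ in range(hidden_nodes + 1)]
    for _ in range(iterations):
        for inputs, target in SAMPLES:
            hidden, output = forward(w_hidden, w_out, inputs)
            # Error terms for squared error with sigmoid activations.
            delta_out = (output - target) * output * (1.0 - output)
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(hidden_nodes)]
            # Gradient step scaled by the learning rate eta.
            for j in range(hidden_nodes):
                w_out[j] -= eta * delta_out * hidden[j]
                w_hidden[j][0] -= eta * delta_hidden[j] * inputs[0]
                w_hidden[j][1] -= eta * delta_hidden[j] * inputs[1]
                w_hidden[j][2] -= eta * delta_hidden[j]
            w_out[-1] -= eta * delta_out
    return mean_squared_error(w_hidden, w_out)


# Same network, same seed, same iteration budget: only eta changes.
etas = [0.01, 0.1, 0.5, 1.0]
results = {eta: train(eta) for eta in etas}
best_eta = min(results, key=results.get)
for eta in etas:
    print(f"eta={eta}: error={results[eta]:.4f}")
print("best eta:", best_eta)
```

Changing `hidden_nodes` or `iterations` in the call to `train` gives the same kind of comparison for network size and training length. The numbers it prints only say something about this toy setup, not about the puzzle's validators.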
Are there any rules of thumb for choosing the number of hidden layers, the nodes per layer, or eta? Are more training runs always better, or can overfitting to noise kick in?