Well, it took me three months and fifteen submissions. My C++ solution got down to around 120 ms on the last test case (it turned out to be somewhat similar to Agade's).
Over those three months, I read through this entire thread multiple times. There were, of course, some discouraging posts up there amidst the ones that were actually helpful.
This puzzle wasn't easy for me, but I'm happy that it wasn't: the process of getting to the end was exhilarating.
If you're stuck, persist; and if you haven't started the puzzle yet, do give it a try.