I have recently been doing some tests to see why I cant seem to get as many simulations in some multiplayer games as other people and I noticed a huge bottle neck. My programs are greatly slowed down by nonlinear functions, specifically trigonometric functions. Using Trigonometry is quite essential in lots of the multiplayer challenges and just as important is computation time. But these two things just don’t seem to work together for me.
So I was just wondering how people are getting around this, if anyone has some tricks or advice they would like to share I would greatly appreciate it
Ps: I am using Java and just call the standard libraries (Math.cos, Math.sin)
You’ll never get as many simulations as a C++ code. But this aside, trigonometric functions are slow (even in C++). You can try to use a cache for this functions or a precalculated table but i don’t know anyone using this on CodinGame.
The main thing with trig functions (and stuff like square roots and, to a lesser extent, division) is to only use them when there is no other way around it. The plain truth is that they take more steps to calculate than other functions and so will always be slower than add, subtract, multiply, and compare. Barring the precalculated table approach that Magus mentioned, of course. It’s not the language or the computer, it’s just math.
Vector math can help you avoid some of the trig. There’s also stuff you can do when you don’t need an exact answer, e.g. comparing squares when looking at distances rather than using square root. Otherwise it’s a matter of asking ‘Do I really need to know this thing I’m trying to calculate, and can I find it another way?’
And tables don’t always make things faster. Quite often they bloat the code and hit the CPU caches hard.
Thanks for the quick replies!
@Magus that makes me a bit sad, do you have an idea of just how big the performance difference between Java and C++ is? And so do you generally just take the hit and use the functions or do you try your darndest to work out a way not to use them?
EDIT: I just ran some tests, C++ was almost 4 times faster than Java when calculating the cosine of a random angle…
@MrAnderson I suppose you are right, just try work around them. But in regards to vector math, I assume you would then not represent your vectors in polar form and rather as changes in X and Y coordinates?
@rOut well there goes my last trick I was going to try XD meh, I think I will investigate that route anyway, just for the sake of being thorough (and learning)
Having experimented recently with lookup tables for sin() and cos() in CSB, here is the performance gain I saw :
 old code with no lookup table, valgrind says that 31% of my instructions are used for sin() and cos() calculations.
 new code with lookup table : on the CG server, I gained roughly 20% simulations compared to the old code
Interesting, is the code compiled with O3, inline and fastmath pragmas? From my tests, it wasn’t beneficial at all.
@Magus has it kind of correct. Memoization is probably the only addition I’d add rather than make use of a precalculated table.
Now, I’m off to go back to doing evil things in C with pointers that don’t actually work–because doing evil things in C with pointers never properly works… (doing things in C with pointers is fine… once you move into black magic/voodoo territory, you start to risk releasing the magic smoke).
Thankfully Codingame is set up to allow me to practice black voodoo and never release the magic smoke! Hooray!
So I did some tests and my results were the following:
I wrote a piece of code that repeats a function call for 100 milliseconds and repeats that for one minute (so that I can get an average number of function calls for 100 milliseconds).
I also created an array of 7000 doubles that maps evenly distributed angles to Math.cos calls.
The accuracy of the lookup function is 99.68974% and the performance differences are as follows:
Library function Math.cos:
An average of 638 981 function calls in 100 milliseconds.
lookup function:
An average of 1 118 509 function calls in 100 milliseconds.
That is a 175.05% increase in function calls per 100 milliseconds!
Now, this is great news but not too great. It seems to be much more efficient but this was on my computer and not on the CG server. Thus results may vary in an actual challenge (I will do those tests too  if anyone is interested) not to mention that this does take quite a bit of memory and my local IDE (eclipse) was complaining that the array was too big (when I tried to map 10 000 values) and that you can only store about 65 000 bytes of constants. I have no idea if that applies to the CG servers or not. Furthermore the accuracy of the lookup function isn’t that great. If you used it in CSB, your pod could end up fair distance away from your simulated position (sometimes up to 1000 units when simulating at a depth of four moves).
So, now I leave you all to decide if you think it is worth it to use a lookup table or not
 Beware that this type of optimization should be tested in real conditions too. (ie: within the main code !).
As written by rOut :
And tables don’t always make things faster. Quite often they bloat the code and hit the CPU caches hard.

Also, with 1 000 000 function calls in 100ms, your trig functions enter the realm where you’ll start to notice the time it takes to call the getTimeOfDay() function. Just as an experiment, you should try to replace “fixed time > count the number of iterations” by “fixed iterations > count the time” : you might be surprised

In CSB, you communicate with the CG server by giving a target point. If you calculate the position of the target point with the lookup table, there is no loss of precision in your simulation !

You and @rOut are 100% correct, I have yet to test it in an actual application and when I do I will let you know

Correct again! I used a for loop and this time the lookup was about 5 times faster than the library function! I did a lot less testing that before on this but it still looks quite solid.

One more time, spot on! I forgot about that
Hi.
You can try this library :
Its “sin” and “cos” functions can be 2 to 3 times faster than Math ones if called intensively,
while being about as accurate, but it also has faster “sinQuick” and “cosQuick” functions
if you can bear their 1.6e3 accuracy.
You can also fork it and reduce lookup tables size, or remove one or two derivative terms,
to make things even faster by sacrificing some accuracy.
Thanks!