 # Bender - Episode 3 - Puzzle discussion

#1 #2

Anyone score 100% on this one ? I get all tests right in the IDE but fail at the validator for n^3. I tried several formulas for testing which complexity is a better match but always fail at validator for n^3. Anyone had this problem ?

#3

I’m not sure of the right way to solve this as I did some tweaking in the IDE before submitting (then it validated 100% on first try) but my issues were mostly with `n log n` and `n^2 log n` while `n^3` and others were easy to match. Which formula are you using?

#4

For each complexity I store all time/correspondingFunction(item number), then I calculate the average of these ratios then the average gap between ratios and average ratio. Finally I pick the complexity which has the lowest average gap. I believe that this is a good indicator of how the data is close to be O(something), returning the complexity which has the most “consistent” factor. I also tried formulas involving variance, standard deviation, or extrem points of the function but I don’t get a better result

#5

This is not that far from what I did before I found a formula to get the order of growth. Again, I’m not sure of it and I don’t want to gives too many hints, but it gives more understandable values. #6

Thanks, I will look into that. I was trying to use correlation coefficient but O(1) makes it impossible I believe (standard deviation is 0)

Edit: even with a trick to dectect O(1) in an other way, correlation coefficient doesn’t always return expected results. That’s weird : how can a function be “more likely” to be O(log(n)) if correlation with O(n) is better ?

#7

I tried time/correspondingFunction(num) ratio approach also and can’t find a decision method that works. I’ve tried a variety of least squares polynomial fitting and residual variance checks but they fail because the errors are not comparable between the various functions. I’ve tried to scale the ratio means to 1 and still no dice. I’m sure something close to that will work but haven’t hit on the right combination yet. I’ve tried to google what the mathematicians have said about how to solve this but haven’t found the right paper yet.

#8

Just got my 100% For each complexity, I too compute the time / fn(num) ratios, to get an estimate of the constant factor C. At first, (again, for each complexity) I tried to use the average of these ratios as the C, and then performed least squares to get the best complexity match. But that didn’t work… Then, I did the same, except instead of using the average ratio I used several ratios ranging from the smallest to the biggest one, and kept the best match. That worked perfectly, though later I realized that just using the smallest ratio for each complexity is enough (I don’t know why this happens though).

#9

Got my 100% after stripping 5% highest and 5% lowest deviant results

#10

I had the same problem on O(n^3). It seems that the validation test is not really in O(n^3), but rather in O(n^2.8), which is a bit strange (perhaps the noise in time is not multiplicative?). The problem is trivial for these O(n^x) cases, as you expect a correlation of (log(n), log(t)) very close to 1, and a regression slope of x. Interestingly, when x is intermediate between integer values, it indicates one of these tricky log n complexities – this is probably some kind of cheat and I am not sure it always works.

#11

Technically, even O(n^2.1) is O(n^3) and not O(n^2). Maybe the “most probable value” is meant as the “most likely amongst those which are possible”.

#12

It makes a lot of sense: to answer the question properly, one needs a proper statistical model with some assumptions about the distribution of residuals, and compare all models based on e.g. their AIC (assuming that the right model is always among the proposed set) or their BIC (in which case we get the closer model). The main problem here is to figure out how the noise was generated in the data.

#13

I used the exact same trick and validated with 100%.
I think there’s a cleaner way to do this, I’ll code it if I ever get the time.

Basically, the idea is to use the least squares fitting method on 3 models :

• A*(x^B)
• A*(x^B)*log(x)
• A*(B^x)

The least squares fitting method will give you values for A and B.
You can then compute the sum of the squared residues of those three models and compare them: the model with the lowest squared residues will most probably be the one that represents the trend of the input.
You can then use B to get the order of growth.

#14

The method of least squares works 100% here
I studied econometrics and statistics, we studied that method and trained to use it

#16

I’m also having problems with n^3. My program passes all the tests, but fails for the n^3 “validator”. Is there any way to see the various debug messages printed by my program, to know what I need to change ?

#17

yes, there is such way
every time you start a new task you are given a template with some comments - if you read all of them carefully you will find how to make a debug output which will not mess with regular output expected by validator
for Java it is System.err.println(“message”), for Pascal it is Writeln(stderr, “message”), etc.

#18

I know that, and I already use it a lot. It’s just that I would like to know what those messages are with the tests used by the validator. In most problems, it is possible to view an animation showing what is happening in all the tests of the validator. But for this problem it is impossible, which makes it quite hard to debug.

I’ve made lots of tweaks to my program and now all of the test of the IDE are succeed without any doubt, but with the test of the validator for n^3, my program thinks it’s O(n^2 log n) and I don’t see how I can change this.

#19

as i said earlier, use least squares method https://en.wikipedia.org/wiki/Non-linear_least_squares#Geometrical_interpretation, https://en.wikipedia.org/wiki/Ordinary_least_squares#Estimation

i also have a problem during validation in genome sequencing problem, so i think i understood what are you talking about

#20

I know that there are other methods that may be more appropriate, but I don’t see why mine succeed for all the tests of the IDE and not all of the validator.

I also already use something that is quite close to the least squares. Statisctics are not my favorite part of the mathematics, but I’m a math teacher, so roughly know how it works (even though I never looked at the method in particular).

I just find frustrating not knowing what is the problem with my program.

I assume that the main problem is that I’m assuming that if something is O(n^2) for example, it is roughly like kn^2 for some k. But in fact, it is bounded by kn^2 which is not exactly the same thing.

I prefer finding the problem in my program than just using some well known method. It is just less fun.

#21