[Community Puzzle] Benford's law

CommunityBot · July 16, 2020, 1:59pm

https://www.codingame.com/training/easy/benfords-law

Send your feedback or ask for help here!

Created by @Alcalyn, validated by @Deltaspace, @M_C and @_CG_XorMode.
If you have any issues, feel free to ping them.

Wei-1 · July 17, 2020, 2:30pm

Many cases in the validators are not covered by the test cases, such as a currency sign in front of the amount or an amount starting with 0.
These cases should be in the tests as well.

E_pur_si_muove · August 11, 2020, 9:34am

This puzzle has the most disgusting validation set. It does not focus at all on the main task, which is checking whether the transactions are fraudulent. I hated it.

BandianConde · August 29, 2020, 9:48pm

Hello,
My algorithm works well on all the test cases but does not pass the first validator, and I don’t see why.
Thanks in advance

Djoums · August 30, 2020, 4:32am

The currency can be displayed before the numbers; if your test only checks for + or -, it won’t work.

BandianConde · August 30, 2020, 6:24am

My algorithm removes all characters that are not between 1 and 9, so it also removes the currency at the beginning.

ItsAFeature · August 30, 2020, 7:44am

The percentages of the numbers in the first validator are:
1: 31.1%, 2: 18.5%, 3: 11.3%, 4: 9.8%, 5: 6.9%, 6: 6.9%, 7: 5.1%, 8: 5.3%, 9: 5.1%
Put them in your program and see if it prints “false” like it should.

BandianConde · August 30, 2020, 8:20am

It prints False, but I do not get the same percentages as you.

ItsAFeature · August 30, 2020, 8:57am

Those were the percentages for the first validator, not the testcase.
Now you know that the bug in your program is not in the evaluation of the percentages, but in the part where you compute them.

The percentages for the first testcase should be:
1: 29.2, 2: 17.5, 3: 12.9, 4: 8.4, 5: 9.2, 6: 7.1, 7: 6.2, 8: 5.4, 9: 4.1

Alcalyn · August 31, 2020, 7:25am

Hello, I’m the creator of this puzzle.

I can improve it and add test cases, but from your replies I can’t tell what is wrong or ambiguous.

Please provide your source code, and say what you expect and what you get instead. That would help me improve the puzzle.

Otherwise, if you have trouble with the validators, you can check the validation test cases here to debug your program: https://www.codingame.com/contribute/view/5174ac7fc58805b480a2d13951728e63dda1

But I know it would be better to add some test cases, so let me know.

ItsAFeature · September 1, 2020, 7:10am

Hello @Alcalyn,

I solved your puzzle without any problems, got 100% on the testcases, submitted and got 100% on the validators. But I took a look at the contribution and noticed some things that could be improved, in my opinion:

1. You have no testcase, but three validators, with lines that have currency signs at the beginning.
2. As in 1., you also have no testcase, but three validators, with spaces before the numbers.
3. You could add two testcase-validator pairs where it’s necessary to count carefully to get the right answer. That would prevent a solution with counting mistakes from getting 100% on the testcases but not on the validators (example: right now you can pass all testcases even if you ignore all numbers with a 0 at the beginning).
   I would suggest one testcase and validator with “true” as output and one pair with “false”.
   An example for the percentages of the “true”-testcase could be: 1: 29%, 2: 20%, 3: 2%, 4: 10%, 5: 8%, 6: 6%, 7: 9%, 8: 8%, 9: 8%.
   An example for the “false”-testcase: 1: 20.5%, 2: 27%, 3: 3%, 4: 0%, 5: 17.5%, 6: 0%, 7: 15%, 8: 3%, 9: 14%

I hope my suggestions will help you improve your puzzle. If you have any questions about my ideas, feel free to ask.

Husoski · September 11, 2020, 6:14pm

No validation is possible, since no currency formats are given and no action to be taken for invalid input lines is given. Therefore no validation is needed. If there’s a valid base-ten number in the input, you’ll find its leading digit by picking the leftmost nonzero digit.
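
For what it's worth, the leftmost-nonzero-digit rule can be sketched in a few lines of Python. This is only an illustration, not the puzzle's reference solution: the helper names `leading_digit` and `digit_percentages` and the sample lines are made up, and it assumes currency symbols never contain digits.

```python
import re
from collections import Counter

def leading_digit(transaction):
    """Return the leftmost nonzero digit of a line as an int, or None if there is none."""
    match = re.search(r"[1-9]", transaction)
    return int(match.group()) if match else None

def digit_percentages(transactions):
    """Map each digit 1..9 to the percentage of lines whose leading digit it is.

    Assumes at least one line contains a nonzero digit.
    """
    digits = [d for d in map(leading_digit, transactions) if d is not None]
    counts = Counter(digits)
    return {d: 100.0 * counts[d] / len(digits) for d in range(1, 10)}

# Hypothetical input lines; the real puzzle data may use other formats.
samples = ["$ 0.42", "-0,07 €", "+1234.50", "  987 USD"]
print([leading_digit(s) for s in samples])
print(digit_percentages(samples))
```

Printing the result of `digit_percentages` for a failing case makes it easy to compare against the percentages posted earlier in this thread.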

Husoski · September 11, 2020, 7:02pm

The lack of a data format for the transactions seems to be confusing people.

What confused me at first was the requirement to report “true” for a false account!

Also, the 10% fixed absolute error margin is not realistic. An input with only 1-4 as leading digits will pass, if n is large enough (n>=40 is sufficient). A smaller data set like (1, 1, 1, 1, 2, 2, 3, 3, 4, 5) is easy to construct. Any repetition of that pattern will pass.
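
The ten-value example can be checked with a short sketch. It rests on two assumptions: the expected percentages come from Benford's formula 100 * log10(1 + 1/d), and a digit counts as out of range only when it deviates by strictly more than 10 percentage points. The puzzle statement may round or compare differently, and `is_flagged` is a made-up name.

```python
import math
from collections import Counter

def benford_percent(d):
    """Expected percentage of leading digit d under Benford's law."""
    return 100.0 * math.log10(1 + 1 / d)

def is_flagged(leading_digits, margin=10.0):
    """True if any digit 1..9 deviates from Benford by more than `margin` points.

    Assumption: strict comparison against a fixed absolute margin.
    """
    n = len(leading_digits)
    counts = Counter(leading_digits)
    return any(
        abs(100.0 * counts[d] / n - benford_percent(d)) > margin
        for d in range(1, 10)
    )

pattern = [1, 1, 1, 1, 2, 2, 3, 3, 4, 5]
print(is_flagged(pattern))      # leading digits 6 to 9 never appear
print(is_flagged(pattern * 5))  # repeating the pattern keeps the same percentages
```

Under these assumptions neither call is flagged, even though four of the nine digits never occur.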

A realistic program would probably pick bounds proportional to E/sqrt(n), where E is the expected percentage. With the right constant of proportionality, that will be roughly a fixed number of standard deviations away from the expected percentage E.
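
A rough sketch of that idea, taking the bound literally as c * E / sqrt(n). The constant `c = 3.0` and the name `is_flagged_scaled` are assumptions chosen for illustration; neither comes from the puzzle, and a real check would need a properly justified constant.

```python
import math
from collections import Counter

def benford_percent(d):
    """Expected percentage of leading digit d under Benford's law."""
    return 100.0 * math.log10(1 + 1 / d)

def is_flagged_scaled(leading_digits, c=3.0):
    """True if any digit deviates from its expected percentage E by more than c * E / sqrt(n).

    `c` is a hypothetical constant of proportionality.
    """
    n = len(leading_digits)
    counts = Counter(leading_digits)
    for d in range(1, 10):
        expected = benford_percent(d)
        bound = c * expected / math.sqrt(n)
        observed = 100.0 * counts[d] / n
        if abs(observed - expected) > bound:
            return True
    return False

# The repeating pattern that slips through a fixed 10-point margin.
pattern = [1, 1, 1, 1, 2, 2, 3, 3, 4, 5] * 10

# A hypothetical sample of 1000 leading digits close to Benford's proportions.
near_benford = [
    d
    for d, k in zip(range(1, 10), [301, 176, 125, 97, 79, 67, 58, 51, 46])
    for _ in range(k)
]

print(is_flagged_scaled(pattern))
print(is_flagged_scaled(near_benford))
```

With this bound, a digit that never appears is flagged as soon as sqrt(n) exceeds c, so the repeated pattern no longer passes, while the sample close to Benford's proportions still does.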

Alcalyn · September 13, 2020, 10:57am

Thanks @ItsAFeature for your suggestions! You’re right, there was no test case with a currency or space before the amount, so I added one.

I didn’t have this problem because I just matched /[1-9]/, so I didn’t realize it was an issue.
Also, thanks to peer review, I specified all possible cases in the statement. So this new test case will reject all quick readers (I also miss some rules when solving other puzzles ^^).

@Husoski “the 10% fixed absolute error margin is not realistic”: you’re right. I just wanted to make this law known without involving complex mathematics, so I used this naive rule (+/- 10%) to keep the focus on the law. That’s why I didn’t use standard deviation or E/sqrt(n) like you suggested.

ElaineMarley · July 19, 2022, 7:26pm

Now I need to go and check my real account not to be suspicious.