[Community Puzzle] 25x25 Sudoku

Coding Games and Programming Challenges to Code Better

Send your feedback or ask for help here!

Created by @yoch,validated by @cedricdd,@Timinator and @Lanfeust.
If you have any issues, feel free to ping them.

Is this even solvable in c? i feel like execution delays are insanely restrictive is it just on that particular language? iā€™ve tryed a lot optimizing tricks bit masking linkchaining and list checking paired with row/column/block counters to auto out singles and nothing seems to works, im runing out of ideas? all other ideas i have litteraly take to me to c# or c++ā€¦

C is one of the fatest languages, and I can see that some people made it in python, so you definitively can make it in C. I didnā€™t make it so I canā€™t help you about ideas.

I solved it in Python with the mixed integer linear programming module from Scipy, so itā€™s technically Cython not pure Python, but solved the worst case in less than 500ms. You can certainly do better in C. The other Python solutions here are using a very popular algorithm. If you search for sudoku solving algorithms youā€™ll find it.

ā€˜technically Cā€™ in Python means a benefit in time of about 10x versus really in C, since time limit for C in puzzles is about ten times lower than for Python, right?

1 Like

From my experiments, itā€™s more like 20x, but the MILP approach is terribly inefficient. I used it because Iā€™m practicing it, not to avoid the time limit. The pure Python solutions using better algorithms still finish well below the time limit.

Anyway, itā€™s certainly possible to solve this in C because thereā€™s one published solution here. Itā€™s not easy, itā€™s a Very Hard puzzle after all, but itā€™s definitely possible.

1 Like

I am trying to pass this puzzle using Timinator and 5DN1Lā€™s playground (in C++), and I am getting some timeouts.

After benchmarking a bit, the code is consistently slower on codingame by a factor of 3:

  • Test1: ~60ms on computer vs ~180ms on ide. ā†’ 626 cycles.
  • Test2: ~80ms on computer vs ~240ms on ide. ā†’ 1393 cycles.
  • Test3: ~350ms on computer vs TIMEOUT on ide. ā†’ 13899 cycles.
  • Test4: ~380ms on computer vs TIMEOUT on ide. ā†’ 18660 cycles.
  • Test5: ~550ms on computer vs TIMEOUT on ide. ā†’ 30315 cycles.

Timeouts make sense since from test 3 onwards multiplying by 3 gets me over the 1s barrier.
I compiled without flags (g++ file.cpp).

I donā€™t have any insight into these timing differences. Just to clarify, I have a couple of Algorithm X questions:

  1. Have you already done the easier Sudoku puzzles with your C++ translation?

  2. Are you breaking out of the search after Algorithm X finds the first solution.

I have a general comment that might help anybody trying to solve this puzzle. If you are running into timing issues, the least painful path might be to do some logical, problem-space reduction before doing any search with backtracking.

Algorithm X Playground Pg 60: Sudoku Challenge

Using the problem-space reduction techniques discussed on the website that I reference in the playground, I am able to reduce the size of the problem for Algorithm X and my slowest times are less than 30% of the Python time limit.

To solve this puzzle in Python without any problem-space reduction was painful for me. I had to play around with the DLX column ordering until I matched the authorā€™s setup which I measured by the number of calls to solve(). This might not be a problem for a Python person because my solver now defaults to this order. In hindsight, I should have made a bigger deal out of this when I was reviewing the puzzle, but I didnā€™t. So, the bottom line is that column order can make a big difference on this puzzle for us Python folk and it might for you too if you are using Algorithm X.

2 Likes

C++ on CG is compiled in debug mode (= slow). Add #pragma gcc optimize ā€œ-O3ā€ at the very top of yor code for release mode. I am unsure whether speed is really your issue. I have solved this one with Python without any of the optimisations suggested by Timinator

3 Likes

Thanks for your answers.

I solved the 16x16 version with an average solve time of 50ms.

Yes, as there is no yield equivalent in C++ (as far as I know), I return the first valid solution found.

Makes sense. I didnā€™t realize there were many more info regarding solving the sudokus on the playground, I will look into it. ^^

Do you know if they use some -fsanitize level of debug ?
Indeed using #pragma GCC optimize(ā€œO3ā€) speeds everything up, I only timeout on test5 with it. My goal will be to pass without it as it feels like a cheat code.

Currently I am leaking all over the place, I guess it doesnā€™t help my case.

1 Like

I managed to pass it in debug mode, with test 5 taking 225ms.
I doubt this will surprise anyone, but it was indeed a me problem.

For anyone following the AlgorithmX playground and coming in this thread, here are some steps I took to achieve this (still C++), going from ~550ms to ~100ms on my computer on test case 5:

  1. I put every attribute of the DLXCell class as public. At first I wanted to have a ā€œcleanā€ class with private attributes, getters, and setters to handle them. But welp, going public brought me down to ~475 ms.
  2. I moved every DLXCell instances from the HEAP to the STACK. To be honest, I thought this was gonna be gamechanger but it was quite a let down with no big gain in execution time. At least I donā€™t have memory leaks anymoreā€¦ I use a macro toggle to switch from heap to stack allocation and Iā€™d say stack goes ~10ms faster.
  3. I got rid of the O set (list of optional requirements). I originally kept it in the code because I guess Iā€™ll have to implement it anyway later in the playground. Turns out checking if an element is in a set hundreds of thousands of times takes time, even if the set is empty (who would have guessed). This got me to my current ~100ms.

There is still a lot of room for optimization, for now I see mainly:

  1. Storing the actions and requirements as ints/longs instead of tuples. For now I store them as a string, ie ā€œplace value 0 0 4ā€ instead of {ā€œplace valueā€, row, col, value}, which does the same but with a lot less memory usage and slightly more parsing to do on the solution. I figure using 4(or 8) bytes with the 1st byte being a key to differentiate between several requirements and the rest adjusted to our needs will greatly reduce memory consumption.
  2. Using the problem-space reduction techniques discussed by Timinator above. I managed to mess it up when I implementing them, so for now I donā€™t use them. I get the same amount of reduction with ā€œLone singlesā€ but with ā€œHidden singlesā€ I somehow only deduce 5-10 more cellsā€¦
2 Likes