[Community Puzzle] 25x25 Sudoku

CommunityBot · June 23, 2023, 4:00pm

Coding Games and Programming Challenges to Code Better

Send your feedback or ask for help here!

Created by @yoch,validated by @cedricdd,@Timinator and @Lanfeust.
If you have any issues, feel free to ping them.

T_19 · February 24, 2024, 2:43pm

Is this even solvable in c? i feel like execution delays are insanely restrictive is it just on that particular language? i’ve tryed a lot optimizing tricks bit masking linkchaining and list checking paired with row/column/block counters to auto out singles and nothing seems to works, im runing out of ideas? all other ideas i have litteraly take to me to c# or c++…

Remi · February 24, 2024, 3:46pm

C is one of the fatest languages, and I can see that some people made it in python, so you definitively can make it in C. I didn’t make it so I can’t help you about ideas.

pjwerneck · February 24, 2024, 6:45pm

I solved it in Python with the mixed integer linear programming module from Scipy, so it’s technically Cython not pure Python, but solved the worst case in less than 500ms. You can certainly do better in C. The other Python solutions here are using a very popular algorithm. If you search for sudoku solving algorithms you’ll find it.

jddingemanse · February 25, 2024, 6:49am

‘technically C’ in Python means a benefit in time of about 10x versus really in C, since time limit for C in puzzles is about ten times lower than for Python, right?

pjwerneck · February 25, 2024, 12:35pm

From my experiments, it’s more like 20x, but the MILP approach is terribly inefficient. I used it because I’m practicing it, not to avoid the time limit. The pure Python solutions using better algorithms still finish well below the time limit.

Anyway, it’s certainly possible to solve this in C because there’s one published solution here. It’s not easy, it’s a Very Hard puzzle after all, but it’s definitely possible.

yhuberla · February 2, 2025, 4:30pm

I am trying to pass this puzzle using Timinator and 5DN1L’s playground (in C++), and I am getting some timeouts.

After benchmarking a bit, the code is consistently slower on codingame by a factor of 3:

Test1: ~60ms on computer vs ~180ms on ide. → 626 cycles.
Test2: ~80ms on computer vs ~240ms on ide. → 1393 cycles.
Test3: ~350ms on computer vs TIMEOUT on ide. → 13899 cycles.
Test4: ~380ms on computer vs TIMEOUT on ide. → 18660 cycles.
Test5: ~550ms on computer vs TIMEOUT on ide. → 30315 cycles.

Timeouts make sense since from test 3 onwards multiplying by 3 gets me over the 1s barrier.
I compiled without flags (g++ file.cpp).

Timinator · February 2, 2025, 6:48pm

I don’t have any insight into these timing differences. Just to clarify, I have a couple of Algorithm X questions:

Have you already done the easier Sudoku puzzles with your C++ translation?
Are you breaking out of the search after Algorithm X finds the first solution.

I have a general comment that might help anybody trying to solve this puzzle. If you are running into timing issues, the least painful path might be to do some logical, problem-space reduction before doing any search with backtracking.

Algorithm X Playground Pg 60: Sudoku Challenge

Using the problem-space reduction techniques discussed on the website that I reference in the playground, I am able to reduce the size of the problem for Algorithm X and my slowest times are less than 30% of the Python time limit.

To solve this puzzle in Python without any problem-space reduction was painful for me. I had to play around with the DLX column ordering until I matched the author’s setup which I measured by the number of calls to solve(). This might not be a problem for a Python person because my solver now defaults to this order. In hindsight, I should have made a bigger deal out of this when I was reviewing the puzzle, but I didn’t. So, the bottom line is that column order can make a big difference on this puzzle for us Python folk and it might for you too if you are using Algorithm X.

darkhorse64 · February 2, 2025, 7:58pm

C++ on CG is compiled in debug mode (= slow). Add #pragma gcc optimize “-O3” at the very top of yor code for release mode. I am unsure whether speed is really your issue. I have solved this one with Python without any of the optimisations suggested by Timinator

yhuberla · February 3, 2025, 9:12am

Thanks for your answers.

I solved the 16x16 version with an average solve time of 50ms.

Yes, as there is no yield equivalent in C++ (as far as I know), I return the first valid solution found.

Makes sense. I didn’t realize there were many more info regarding solving the sudokus on the playground, I will look into it. ^^

Do you know if they use some -fsanitize level of debug ?
Indeed using #pragma GCC optimize(“O3”) speeds everything up, I only timeout on test5 with it. My goal will be to pass without it as it feels like a cheat code.

Currently I am leaking all over the place, I guess it doesn’t help my case.

yhuberla · February 4, 2025, 4:28pm

I managed to pass it in debug mode, with test 5 taking 225ms.
I doubt this will surprise anyone, but it was indeed a me problem.

For anyone following the AlgorithmX playground and coming in this thread, here are some steps I took to achieve this (still C++), going from ~550ms to ~100ms on my computer on test case 5:

I put every attribute of the DLXCell class as public. At first I wanted to have a “clean” class with private attributes, getters, and setters to handle them. But welp, going public brought me down to ~475 ms.
I moved every DLXCell instances from the HEAP to the STACK. To be honest, I thought this was gonna be gamechanger but it was quite a let down with no big gain in execution time. At least I don’t have memory leaks anymore… I use a macro toggle to switch from heap to stack allocation and I’d say stack goes ~10ms faster.
I got rid of the O set (list of optional requirements). I originally kept it in the code because I guess I’ll have to implement it anyway later in the playground. Turns out checking if an element is in a set hundreds of thousands of times takes time, even if the set is empty (who would have guessed). This got me to my current ~100ms.

There is still a lot of room for optimization, for now I see mainly:

Storing the actions and requirements as ints/longs instead of tuples. For now I store them as a string, ie “place value 0 0 4” instead of {“place value”, row, col, value}, which does the same but with a lot less memory usage and slightly more parsing to do on the solution. I figure using 4(or 8) bytes with the 1st byte being a key to differentiate between several requirements and the rest adjusted to our needs will greatly reduce memory consumption.
Using the problem-space reduction techniques discussed by Timinator above. I managed to mess it up when I implementing them, so for now I don’t use them. I get the same amount of reduction with “Lone singles” but with “Hidden singles” I somehow only deduce 5-10 more cells…